getData returns null array php curl

Hi,
We would very much appreciate some input. We just created a sandbox account with an API key. We are only making GET requests. The returned arrays are null, even for a simple test with the company number hard-coded.

With the sandbox API we get no data, only a service error:
Array ( [errors] => Array ( [0] => Array ( [type] => ch:service [error] => company-profile-not-found ) ) )

There is no auth error, nor does our code return errors.

Please find enclosed our code.

<?php
$api_key = 'validKey';
$company_number = 'Valid number hard-coded';
$ch = curl_init();

// curl_setopt($ch, CURLOPT_URL, "url api-sandbox.company-information.service/company/".$company_number);
curl_setopt($ch, CURLOPT_URL, "url api sandbox information.service/search/officers");
// curl_setopt($ch, CURLOPT_URL, "url /company?q=".$company_number);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Authorization: Basic " . base64_encode($api_key . ":"),
    "Content-Type: application/json",
));
$response = curl_exec($ch);
// Check for errors
if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
} else {
    // Decode the JSON response
    $company_details = json_decode($response, true);
    print_r($company_details);
}

curl_close($ch);

?>

We then tested a 2nd API key, status=live. For the live key it returns: Array ( )
For the live key test we used:

curl_setopt($ch, CURLOPT_URL, "url api dot companies house /company?q=".$company_number."filing-history");

First - unless you have a particular need (e.g. to upload your own data to sandbox and then test that) I would suggest avoiding the sandbox. There have been multiple posts suggesting that it may not work quite the same as the main API but also that it doesn’t necessarily have data there (unless you’ve put it there yourself).

Second - given you’re using PHP/Curl anyway, if you haven’t already I would check each endpoint using the command line curl tool if you can. This makes things really easy to check / experiment with - and you can easily see exactly what you send / get back if you need.

I note that some of your URLs look off also. Difficult to say with the format you’ve put them in though. If you’re really stuck post actual examples, but don’t post your API key.

If you’re after the company profile the format is e.g.:

https://api.company-information.service.gov.uk/company/NF004299

And the filing history list:

https://api.company-information.service.gov.uk/company/NF004299/filing-history

The search endpoints take a parameter e.g. “?q=” - so:

https://api.company-information.service.gov.uk/search/companies?q={thing to search for}

OR

https://api.company-information.service.gov.uk/search/officers?q={thing to search for}

Don’t forget that if you’re building the URL as a string, any parameter value needs URL encoding (in PHP, rawurlencode, though a neater way might be to use http_build_query).
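
As a rough illustration of the encoding point, here is a minimal PHP sketch that builds a search URL with http_build_query. The function name buildSearchUrl is made up for this example, and it only sets the "q" parameter mentioned above.

```php
<?php
// Hypothetical helper: builds a search URL with a correctly encoded query string.
// $endpoint is the last path segment, e.g. "companies" or "officers".
function buildSearchUrl(string $endpoint, string $query): string
{
    $base = 'https://api.company-information.service.gov.uk/search/';
    // PHP_QUERY_RFC3986 encodes spaces as %20 (the same style as rawurlencode).
    $params = http_build_query(['q' => $query], '', '&', PHP_QUERY_RFC3986);
    return $base . rawurlencode($endpoint) . '?' . $params;
}

echo buildSearchUrl('officers', 'Smith & Jones'), "\n";
// https://api.company-information.service.gov.uk/search/officers?q=Smith%20%26%20Jones
```

Without the encoding, the "&" in the search term would be read as the start of a second parameter.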

Another possible gotcha: note that company “numbers” should always be 8-character strings. If a number has fewer than 8 digits it should be left-padded with zeros.
e.g. UK company 5656883 has company number “05656883”.
For companies with a prefix (and, for a few, a postfix as well) the numeric part should again be zero-padded so that the total length (prefix + rest) is 8 characters. Example (a particularly odd one): “SP0001WS”
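
A minimal PHP sketch of that padding rule follows. The helper name normaliseCompanyNumber is made up, and the assumption that a prefix or postfix is at most two letters is mine, based only on the examples in this thread.

```php
<?php
// Hypothetical helper (not part of any Companies House library).
// Assumes an optional prefix and postfix of up to two letters each.
function normaliseCompanyNumber(string $raw): string
{
    $value = strtoupper(preg_replace('/\s+/', '', $raw));
    if (!preg_match('/^([A-Z]{0,2})(\d+)([A-Z]{0,2})$/', $value, $m)) {
        throw new InvalidArgumentException("Unrecognised company number: $raw");
    }
    [, $prefix, $digits, $suffix] = $m;
    // The digits fill whatever is left of the 8 characters.
    $width = 8 - strlen($prefix) - strlen($suffix);
    if (strlen($digits) > $width) {
        throw new InvalidArgumentException("Company number too long: $raw");
    }
    return $prefix . str_pad($digits, $width, '0', STR_PAD_LEFT) . $suffix;
}

echo normaliseCompanyNumber('5656883'), "\n"; // 05656883
echo normaliseCompanyNumber('nf4299'), "\n";  // NF004299
```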

For what it’s worth you don’t need to set a Content-Type header here. You’re not sending any JSON, are you…? (None of the public data API endpoints involve you sending JSON).

If you wanted to be explicit you could suggest you want to receive it using the “Accept:” header, but Companies House is currently going to respond with JSON regardless! (Note - the Accept header is used if you get to requesting the actual file associated with a filing though, if more than one data format is available).

Examples (live system). Note that with curl on the command line you provide your (plaintext) API key as part of a user:password string with the -u switch: the API key is the user part and the password is empty. With curl you can also switch on verbose mode (-v) and see all the traffic to and from the server if needed.

curl -u YOUR_APIKEY: "https://api.company-information.service.gov.uk/company/NF004299"
{"accounts":{"last_accounts":{"made_up_to":"2010-03-31","period_end_on":"2010-03-31","type":"full"},"next_accounts":{"overdue":false,"period_end_on":"2011-03-31"},"next_made_up_to":"2011-03-31","overdue":false},"can_file":false,"company_name":"PDV CONSULTANTS LIMITED","company_number":"NF004299","company_status":"active","date_of_creation":"2008-11-18","etag":"c05ae969e2038b1c81ffe73c11025025d2c6acca","external_registration_number":"488683","foreign_company_details":{"accounting_requirement":{"foreign_account_type":"accounting-requirements-of-originating-country-apply","terms_of_account_publication":"accounting-publication-date-does-not-need-to-be-supplied-by-company"},"is_a_credit_financial_institution":false,"originating_registry":{"country":"NEW ZEALAND"},"registration_number":"488683"},"has_been_liquidated":false,"has_charges":false,"has_insolvency_history":false,"jurisdiction":"united-kingdom","links":{"self":"/company/NF004299","filing_history":"/company/NF004299/filing-history","officers":"/company/NF004299/officers","uk_establishments":"/company/NF004299/uk-establishments"},"previous_company_names":[{"ceased_on":"2022-03-07","effective_from":"2008-11-18","name":"PLATTS DRIEVAP ENGINEERING LIMITED"}],"registered_office_address":{"address_line_1":"Pricewaterhousecoopers 3rd Level Pricewaterhousecoopers Centre","address_line_2":"Cnr Bryce And Anglesea Streets","country":"New Zealand","locality":"Hamilton"},"registered_office_is_in_dispute":false,"type":"oversea-company","undeliverable_registered_office_address":false,"has_super_secure_pscs":false}

curl -u YOUR_APIKEY: "https://api.company-information.service.gov.uk/company/05656883"
{"accounts":{"accounting_reference_date":{"day":"31","month":"12"},"last_accounts":{"made_up_to":"2023-12-31","period_end_on":"2023-12-31","period_start_on":"2023-01-01","type":"micro-entity"},"next_accounts":{"due_on":"2025-09-30","overdue":false,"period_end_on":"2024-12-31","period_start_on":"2024-01-01"},"next_due":"2025-09-30","next_made_up_to":"2024-12-31","overdue":false},"can_file":true,"company_name":"BLOCK 4 PORTERS WOOD MANAGEMENT COMPANY LIMITED","company_number":"05656883","company_status":"active","confirmation_statement":{"last_made_up_to":"2024-12-16","next_due":"2025-12-30","next_made_up_to":"2025-12-16","overdue":false},"date_of_creation":"2005-12-16","etag":"fd54c4079c5bcced3f4cc873f80aed3b30fdea11","has_been_liquidated":false,"has_charges":false,"has_insolvency_history":false,"jurisdiction":"england-wales","last_full_members_list_date":"2015-12-16","links":{"persons_with_significant_control":"/company/05656883/persons-with-significant-control","self":"/company/05656883","filing_history":"/company/05656883/filing-history","officers":"/company/05656883/officers"},"registered_office_address":{"address_line_1":"1 Doolittle Yard","address_line_2":"Froghall Road","country":"England","locality":"Ampthill","postal_code":"MK45 2NW","region":"Bedfordshire"},"registered_office_is_in_dispute":false,"sic_codes":["68320"],"type":"ltd","undeliverable_registered_office_address":false,"has_super_secure_pscs":false}

Non-existent company:

curl -u YOUR_APIKEY: "https://api.company-information.service.gov.uk/company/99999999"
{
    "errors": [
        {
            "type": "ch:service",
            "error": "company-profile-not-found"
        }
    ]
}
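
For comparison, the same kind of request can be sketched in PHP. This is only an illustration under assumptions: the function names are made up, and CURLOPT_USERPWD is used as the PHP counterpart of the -u switch (API key as user, empty password). It also checks the HTTP status, so an error body such as the one above is reported rather than silently decoded.

```php
<?php
// Sketch only: function names are illustrative, not from any official client.

// Turn an HTTP status and body into a decoded array, or throw with a useful message.
function decodeApiResponse(int $status, string $body): array
{
    $data = json_decode($body, true);
    if ($status !== 200) {
        // Error bodies look like {"errors":[{"type":...,"error":...}]}.
        $error = $data['errors'][0]['error'] ?? 'unknown-error';
        throw new RuntimeException("HTTP $status: $error");
    }
    if (!is_array($data)) {
        throw new RuntimeException('Response was not valid JSON');
    }
    return $data;
}

function fetchCompanyProfile(string $apiKey, string $companyNumber): array
{
    $url = 'https://api.company-information.service.gov.uk/company/'
         . rawurlencode($companyNumber);
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // API key is the user name; the password is empty (same as curl -u KEY:).
    curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ':');
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl error: ' . curl_error($ch));
    }
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return decodeApiResponse($status, $body);
}
```

Calling fetchCompanyProfile('YOUR_APIKEY', '05656883') should then return the profile as an array, while a non-existent number should raise an exception carrying the error code.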

Good luck.

Thanks very much for the detailed responses. We’ll post our findings in case they help anyone else, or come back with additional questions. (Re. the URLs: the forum blocked them in a new post, even gov dot uk ones, hence the shorthand.)

I noted some issues when attempting to post a response to you (hence splitting a post into 3) - I was getting an http 403 and the preview tool said “Drafts not available”.

I am not 100% sure why this was (not seen it before) but one of the lines of example code you’d posted seemed to consistently cause issues or at least when I removed that text I could post.

I think there may be some character encoding issues - I think it was some of the quote characters which might have been an issue (possibly copying, then pasting, then reencoding and maybe in a particular string of characters)? Anyway, that’s an aside.

We have used PHP successfully to access all parts of the public data API so (aside from some quirks, many noted in this forum) it works.

Good luck.

I’m not from Companies House and have no involvement with UK government. I’m just a long-time user of the API / Companies House services (we were using the XML Gateway before it).

(Most Companies House users have this shown in their profile e.g. see MArk Williams)

I just reply to posts here as a “pay it forward” for assistance from others back in the day. (Companies House seem not to be over-resourced - or at least keeping documentation up to date, patching minor bugs and dealing with queries is not something they’re given to do with a high priority…)

Thank you so much for your help - very much appreciated. Their documentation definitely has room for improvement.

Adding user-pass solved the getData issue. We can now load data from CH. Our main issue is solved.

Unfortunately the Document API caused issues. We remove the CH authorisation and follow the redirect to AWS. The PDF loads with its data, so we can view it manually. But automating “search for a string” with a PDF parser shows “object: protected”.

We noted another post 544/9 suggesting they could not download the file. That was not our issue. We could download and view correct data, but we are blocked from parsing the pdf. However, as it mentioned we should treat data as json, we also tested this and confirm it returns NULL.

We were hoping to extract just 2 pieces of data: Shareholder Funds/Equity and number of staff. Do you have any suggestions?

If you can download a readable file using the Documents API then I’d say the API is fine - what you have there are other issues e.g. “We expected to find certain data in files / files which we can perform certain operations on, but we can’t!”

Are you looking for particular data in accounts filings? From your mention of shareholder funds / number of staff it sounds like it. (That info is not available as JSON via the API, you have to look in the filings).

If so then probably the easiest route is to download the iXBRL data rather than the PDF data *. There is quite a lot of information on this forum on:

a) how to request iXBRL rather than PDF. Short - use the http Accept header to request the appropriate mime type, having first checked the mime types listed in the document metadata to ensure the requested format is available e.g. see:

and b) what is in there / how to parse it. See e.g. links in this summary:

* Note: this depends on firms filing this data in the appropriate format. I am not sure, but I believe it is still possible for them to submit filings on paper / just as a (raster) PDF rather than as formatted accounts data. Plus historic filings past a certain date will not have formatted information available - it will be scans of (sometimes handwritten) forms! See Companies House note on this from 2017 here:

Examples:
Company number 09540283 has accounts filings in PDF and iXBRL. Getting the particular filing information (just for illustration - you could e.g. get this information by requesting the filing history list, possibly passing in the parameter to filter this to only return accounts categories etc):


curl -u APIKEY_HERE: https://api.company-information.service.gov.uk/company/09540283/filing-history/MzQzOTU1MTQ3MmFkaXF6a2N4
{"transaction_id":"MzQzOTU1MTQ3MmFkaXF6a2N4","barcode":"XDDTAN3U","type":"AA","date":"2024-10-15","category":"accounts","description":"accounts-with-accounts-type-total-exemption-full","description_values":{"made_up_date":"2024-04-30"},"pages":8,"action_date":"2024-04-30","links":{"self":"/company/09540283/filing-history/MzQzOTU1MTQ3MmFkaXF6a2N4","document_metadata":"https://document-api.company-information.service.gov.uk/document/clSoJ2xbeFSFypK5EcCgR26SpkVe8H4k1nlRrgqpD1Q"}}

Requesting the document metadata using the link above:

curl -u APIKEY_HERE: https://document-api.company-information.service.gov.uk/document/clSoJ2xbeFSFypK5EcCgR26SpkVe8H4k1nlRrgqpD1Q
{"company_number":"09540283","barcode":"XDDTAN3U","significant_date":"2024-04-30T00:00:00Z","significant_date_type":"made-up-date","category":"accounts","pages":8,"filename":"09540283_aa_2024-10-15","created_at":"2024-10-15T11:32:51.898330388Z","etag":"","links":{"self":"https://document-api.company-information.service.gov.uk/document/clSoJ2xbeFSFypK5EcCgR26SpkVe8H4k1nlRrgqpD1Q","document":"https://document-api.company-information.service.gov.uk/document/clSoJ2xbeFSFypK5EcCgR26SpkVe8H4k1nlRrgqpD1Q/content"},"resources":{"application/pdf":{"content_length":108473},"application/xhtml+xml":{"content_length":76922}}}

To get the iXBRL you’d request the file as you did before, but set the http Accept header to the application/xhtml+xml mime type.
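
In PHP that could look roughly like the sketch below. Both function names are made up for this example. The first picks a format from the "resources" section of the metadata, the second requests the content with the matching Accept header and follows the redirect to wherever the file is stored.

```php
<?php
// Sketch only: both function names are made up for this example.

// Return the first preferred MIME type listed under "resources" in the
// document metadata, or null if none of them is available.
function pickDocumentFormat(array $metadata, array $preferred): ?string
{
    $available = array_keys($metadata['resources'] ?? []);
    foreach ($preferred as $mime) {
        if (in_array($mime, $available, true)) {
            return $mime;
        }
    }
    return null;
}

// Download the document content in the requested format.
function fetchDocumentContent(string $apiKey, string $contentUrl, string $mime): string
{
    $ch = curl_init($contentUrl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERPWD        => $apiKey . ':',
        CURLOPT_HTTPHEADER     => ['Accept: ' . $mime],
        // The content endpoint redirects to where the file is stored.
        // By default curl only sends the credentials to the original host.
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl error: ' . curl_error($ch));
    }
    return $body;
}
```

With the metadata above decoded into $metadata, pickDocumentFormat($metadata, ['application/xhtml+xml', 'application/pdf']) would pick the iXBRL version, and the URL to pass to fetchDocumentContent is $metadata['links']['document'].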

On the PDF issue - I have no idea what PDF parser you’re using, nor have you mentioned which PDF, so I can’t help with that. (Aside from that we don’t do any PDF parsing ourselves currently either).

I’ve just checked a downloaded document which does have textual data (a Confirmation Statement - these are now usually machine-generated) and the PDF doesn’t seem to have much restricted (via Security) e.g. content can be copied etc.

If you search this forum there may be information on parsing of PDFs which could help you.

Thank you so much for all your efforts. If we can reciprocate in any way do let us know (not on this api - but anything else coding-related). We’ll try your suggested method(s).

Re. PDF parsers in general, typically they automate extraction from PDFs. For what we need it didn’t seem worthwhile running OCR, even though we have code for that. A quick and dirty method is to exclude micro company accounts within CH and consider the rest. We were looking at financials just to segment out the smallest firms. The main goal was to confirm status=‘active’, get the SIC code and basic data like this, which we now have. I think under CompanyOverview there should also be a website, to help differentiate two similar names, plus a reception phone number.

Thanks to your help we’ve now managed to extract what we need from the source filings. We really appreciate your help. I wrote to Companies House suggesting they update their documentation.

For others reading, please note: to extract the data, take the td tag and id rather than the inline, then strip out as needed.
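
For anyone who would rather read the inline tags directly, here is a minimal PHP sketch of that alternative. Everything in it is an assumption for illustration: the sample fragment is made up, the concept names in the "name" attributes vary between filings and taxonomies, and the function name is hypothetical. It ignores attributes such as sign and scale, which real filings may use.

```php
<?php
// Made-up iXBRL fragment for illustration; real filings are much larger and
// the concept names (the "name" attributes) vary with the taxonomy used.
$sample = <<<'XHTML'
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
<body><table><tr>
<td id="equity"><ix:nonFraction name="core:Equity" contextRef="c1" unitRef="GBP" decimals="0">12,345</ix:nonFraction></td>
<td id="staff"><ix:nonFraction name="core:AverageNumberEmployeesDuringPeriod" contextRef="c1" unitRef="pure" decimals="0">7</ix:nonFraction></td>
</tr></table></body></html>
XHTML;

// Hypothetical helper: collect numeric facts keyed by concept name (prefix removed).
// If a concept appears more than once, the last occurrence wins.
function extractNumericFacts(string $xhtml): array
{
    $doc = new DOMDocument();
    if (!@$doc->loadXML($xhtml)) {
        throw new RuntimeException('Document is not well-formed XML');
    }
    $xpath = new DOMXPath($doc);
    $facts = [];
    // local-name() avoids depending on which inline XBRL namespace the file uses.
    foreach ($xpath->query('//*[local-name()="nonFraction"]') as $node) {
        $name  = $node->getAttribute('name');
        $pos   = strrpos($name, ':');
        $local = $pos === false ? $name : substr($name, $pos + 1);
        // Strip thousands separators before converting to a number.
        $facts[$local] = (float) str_replace(',', '', trim($node->textContent));
    }
    return $facts;
}

print_r(extractNumericFacts($sample));
```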