Generally: caution is advised. Companies House holds a very large dataset, gathered over a long time (in some cases over 100 years).
Further - by law Companies House essentially has to record what members of the public supply, and even where validation might seem reasonable they don’t always do a great deal of it.
All that means some fields can contain all kinds of unexpected data - in general, “beware what you don’t know”.
As @ash notes - where the API returns one of their “enum constants” the data should be as advertised. But that doesn’t include fields like address fields, or people’s names etc.
The best way to interrogate the data is probably just to try using it and see what you get.
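As a rough illustration of "try it and see", here is a minimal Python sketch that tallies the values of one field across a batch of records you have already fetched. The sample records are made up, and `tally_field` is just a helper name I've invented, not anything from the API.

```python
from collections import Counter


def tally_field(records, field):
    """Count how often each value of `field` occurs across a list of dicts."""
    counts = Counter()
    for record in records:
        # Use a visible marker for absent keys so gaps show up in the tally.
        value = record.get(field, "<missing>")
        counts[value] += 1
    return counts


# Hypothetical sample records; real ones would come from API responses.
sample = [
    {"company_status": "active", "company_type": "ltd"},
    {"company_status": "dissolved", "company_type": "ltd"},
    {"company_status": "active", "company_type": "plc"},
    {"company_type": "ltd"},
]

print(tally_field(sample, "company_status").most_common())
```

This only works for simple (hashable) values such as strings; nested objects like addresses would need flattening first. Anything in the tally you didn't expect is a prompt to go and look at those records.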
However you can also find bulk data at:
… albeit this is in a different format to what you can get from the API and you don’t get all the information you do in the API. Documentation for that is in the specs (for some unknown reason this box refuses to accept words or URLs containing part of the word dev*eloper so I can’t post a link - just search!)
There are other bulk datasets, though for officers you have to apply to Companies House directly.
Questions:
Check if company is UK registered? You may have two questions / categories to decide here.
First: is this a “company”? The API records entities other than “companies” - you’ll find a small zoo of “societies” and “associations”, financial institutions, charities (see e.g. the “enum constants” and the company_type)…
Then: “UK registered”. Some data for foreign-registered entities is also recorded where they have UK branches. That’s reflected in the company “number” - you have “FC…” foreign companies and their related “BR…” UK branch(es). There are also some EU-related entities.
I am not sure how jurisdiction relates to Crown dependencies, BFPOs, or overseas territories however.
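The “FC…” / “BR…” prefixes mentioned above can be picked out with a few lines of Python. This is only a sketch: it knows about those two prefixes and nothing else, the function name is my own, and the example numbers are invented.

```python
def number_prefix_kind(company_number):
    """Rough classification using only the 'FC' and 'BR' prefixes."""
    prefix = company_number.strip().upper()[:2]
    if prefix == "FC":
        return "foreign company"
    if prefix == "BR":
        return "UK branch of a foreign company"
    # Everything else is left undecided; other prefixes exist but are
    # not handled in this sketch.
    return "other"


# Made-up numbers, for illustration only.
for number in ["FC012345", "br012345", "01234567"]:
    print(number, "->", number_prefix_kind(number))
```

Treat "other" as "needs a closer look" rather than "UK company" - you would still want to check company_type and jurisdiction.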
For the question about addresses - the address fields are some of those which don’t seem to be parsed/validated, so I believe that’s just down to us users to do.
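If it helps, this is the sort of check I mean, as a hedged Python sketch. It assumes the address object has `country` and `postal_code` keys (check the Company Profile documentation for the actual field names), the list of country spellings is my own guess at what turns up, and the postcode pattern is deliberately loose rather than an official validator.

```python
import re

# Assumed spellings; free-text data will contain others.
UK_COUNTRY_NAMES = {
    "united kingdom", "uk", "great britain",
    "england", "wales", "scotland", "northern ireland",
}

# Loose shape check for UK postcodes, not a full validation.
ROUGH_UK_POSTCODE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


def looks_like_uk_address(address):
    """Heuristic: recognised UK country name, else a UK-shaped postcode."""
    country = (address.get("country") or "").strip().lower()
    if country in UK_COUNTRY_NAMES:
        return True
    # Country missing or unrecognised: fall back to the postcode shape.
    postcode = (address.get("postal_code") or "").strip().upper()
    return bool(ROUGH_UK_POSTCODE.match(postcode))


print(looks_like_uk_address({"country": "England", "postal_code": "CF14 3UZ"}))
print(looks_like_uk_address({"country": "France", "postal_code": "75001"}))
```

A False result here means "could not confirm", not "definitely overseas" - a mistyped country with no postcode would also fail.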
As for whether a UK company could have a non-UK registered address? I am not certain every registered address will be in the UK (again - on us to check) but the current rules suggest it should be, for a private limited company at least:
An example of things the other way round would be the “FC…” - “BR…” pairing (though the “BR” would presumably always have a UK address, by definition).
Limited companies - see the Company Profile documentation again.
… and the “enumeration constants”. This possibly comes down to how you want to slice and dice things. Companies House records things like “ltd” (private limited company) and “plc” (public limited company), and differentiates things like “Private Limited Company by guarantee without share capital use of ‘Limited’ exemption”, “Private company limited by guarantee without share capital” etc. That’s without getting into other types like partnerships.
Active: see the company_status and company_status_detail fields. Again - this partly comes down to “what do you class as active?”.
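One way to keep the “what do you class as active?” decision explicit is to pass the accepted statuses in as a parameter. A minimal Python sketch follows; the function name is mine, the default treats only the value "active" as active, and any other status strings you add should be checked against the enum constants rather than taken from here.

```python
def is_active(profile, active_statuses=frozenset({"active"})):
    """True if the profile's company_status is in the caller's chosen set."""
    status = (profile.get("company_status") or "").strip().lower()
    return status in active_statuses


profile = {"company_status": "active"}
print(is_active(profile))

# A wider definition, if that suits your use case. The extra value is an
# assumption on my part; confirm it against the enum constants.
wider = frozenset({"active", "voluntary-arrangement"})
print(is_active({"company_status": "voluntary-arrangement"}, wider))
```

A missing or empty company_status counts as not active here, which seems the safer default for a bulk filter.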
As for insolvency - that’s another area. I am not certain, but I seem to recall something from Companies House posted here regarding how those values changed as e.g. a company became solvent / was restarted etc. See:
Ultimately the final arbiter of these things is the set of filings the company has actually made… which may not be very helpful for a “mass” query! Fortunately most companies are recent, and most are pretty simple affairs.
Good luck