WRITTEN ON March 30th, 2009 BY William Heath AND STORED IN Data nitwittery, Foundation of Trust, Transformational Government, We told you so..., What do we want?

Personal data is valuable. Government has twigged that, which is why it’s trying to grab so much: it’s the oil of the new economy. But just how much personal data does government hold?

The JRRT Database State report mapped out the top 46 or so government databases. But I had no idea how much data lies on them.

Eleanor Laing (Shadow Minister) has been asking some PQs. I’m not sure I can understand or believe the results. For DWP, for example, Jonathan Shaw replies:

Customer Management System: this system holds 480,000 customer records and 1,300 data fields.

Pensions Transformation Programme: this system holds 6.5 million cases and 15,500 data fields.

Customer Information System: this system holds 92 million person related records and 9,800 data fields.

Tell Us Once system: this system is currently in feasibility and testing within 14 local authority areas. It only collects the information that a customer is already required to report following a birth or a death. To date the service has collected and stored information from 8,440 people in up to 110 data fields depending on their circumstances.

Income Support Computer system: 8 million cases of which 5 million are currently active and 700 data fields.

Data Matching Service: this information is not available and to obtain the overall number of records of all types within this system would be at disproportionate cost.

Does this mean that the DWP’s CIS, which is to be the basis of the Benighted ID Scheme, is geared up to store 9,800 x 92m fields of valuable personal data? That is roughly 900 BILLION fields. I wonder how many of them are used? I wonder what proportion of those which are used are accurate? I wonder if anyone has a clue?
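For anyone who wants to check the multiplication, here is a rough sketch in Python using only the figures quoted in the DWP answer. It assumes, perhaps wrongly, that every record could hold a value in every field, so each product is an upper bound, not a count of what is actually stored; the answer may equally mean the total number of fields across a system's tables. The function name `max_field_values` is made up for illustration. On that assumption the CIS comes out at about 902 billion.

```python
# Figures as quoted in the DWP answer: (records or cases, data fields) per system.
# "Tell Us Once" uses the "up to 110" figure, so its product is a ceiling.
dwp_systems = {
    "Customer Management System": (480_000, 1_300),
    "Pensions Transformation Programme": (6_500_000, 15_500),
    "Customer Information System": (92_000_000, 9_800),
    "Tell Us Once": (8_440, 110),
    "Income Support Computer System": (8_000_000, 700),
}


def max_field_values(records, fields):
    """Upper bound on stored values, ASSUMING every record can fill every field."""
    return records * fields


for name, (records, fields) in dwp_systems.items():
    total = max_field_values(records, fields)
    print(f"{name}: {total:,} (about {total / 1e9:.1f} billion)")
```

The Data Matching Service is left out because no figures were given for it.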

There’s a DIUS answer about MIAP here

But the most bizarre of all is HMRC. My mate Stephen Timms is a good egg and I love him. He reports the number of records as:

PAYE 58m;
Self Assessment 15.5m
Tax Credits System 11.1m
Citizen Identification Framework 77m
Primary Tracing Framework 77m individuals and 7.7m employers
Secondary Tracing Framework 80m individuals
Employments Framework 282m employments and 7.7m employers
National Insurance Recording System 71m
Child Benefit System 11m

So how many fields do these databases hold, dear servants at HMRC?

There are many data categories of different kinds within each of the identified systems. A count of them all could be produced only at disproportionate cost.

It’s official. HMRC collects so much personal data that they can’t afford or can’t be bothered to provide Parliament with the number of fields of data they collect. They weren’t asked here for a list of the fields – just how many there were.

Verily the database state is out of control.

Wibbi:
– the state were up front about what data it collected
– it collected less data (starting, perhaps, with an amount small enough to count)
– we looked after more of it ourselves.

2 Responses to “PQs start to reveal scale of the data problem”

 
Guy Herbert wrote on March 31st, 2009 11:01 am :

Thank you, William (and Eleanor Laing) for elucidating something I have been trying for years to get through the heads of activists, journalists and politicians alike: broad abstract categories of information are not the same as “items of data”, and counting the former as an indicator of the threat is bound to lead to miscomprehension.

‘Information about expenses’ is not the same as ‘Virgin Media statements’ is not the same as ‘line-item on particular Virgin Media statement’. It is the interpretation placed by others on the fine detail, regardless of its subjective meaning, or the importance in the context of the larger picture, that causes the trouble.

Iain Henderson wrote on April 5th, 2009 12:47 am :

In the private sector the assessment of data content * data quality * use is pretty well understood and regularly undertaken at detailed level (usually as a feed into a business case for improvement activity).

The UK public sector just does not seem to have these disciplines in place; databases are allowed to grow ad hoc and without regular audit.