IAS Gyan

Daily News Analysis

How much is too much when collecting data for planning?

11th September, 2020 Editorial

Context: Technology is useful in delivering development outcomes, but it has to be deployed carefully.


  • On Independence Day, the Prime Minister announced the launch of the National Digital Health Mission (NDHM), under which “every Indian will get a Health ID card.”
  • The NDHM seeks to create an ecosystem under which health records will be digitized.
  • The government has clarified that this would be voluntary, data will be stored locally, and only anonymous data will be shared upwards.

How can technology transform data collection and analysis?

  • This is digital data and it is important to recognise privacy issues.
  • In the U.S., technology deployed under the electronic health records initiative has dramatically transformed healthcare at all levels.
  • Technology can’t work without data, which can be at the population level or at the individual health level. At the population level, it can be used effectively to control the outbreak of a pandemic. At the individual level, it may have benefits at several levels.
  • In India, millions of people don’t have access to quality healthcare. The silver lining is that the government has allowed for digital health, which allows a lot more people access.

How much is too much when collecting data for planning? What problems do we face when it comes to demographic and health data?

  • In terms of sampling, frequency, accuracy and range, we face problems of sample size, sampling errors, and non-sampling errors.
  • Cost is a big factor in addressing these questions.
  • The use of technology is the cheapest and easiest way forward, both in collecting data and in monitoring its quality.
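
To make the link between sample size and sampling error concrete, here is a minimal sketch. It assumes a simple random sample and the normal approximation for a proportion; the function names are illustrative and do not come from the editorial. It covers sampling error only, and says nothing about non-sampling errors such as misreporting.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for an estimated proportion p.

    Assumes a simple random sample of size n; z = 1.96 corresponds
    to roughly 95% confidence under the normal approximation.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


def required_sample_size(p: float, target_moe: float, z: float = 1.96) -> int:
    """Smallest n whose margin of error is at most target_moe."""
    return math.ceil(z * z * p * (1.0 - p) / (target_moe ** 2))


if __name__ == "__main__":
    # p = 0.5 is the worst case (largest variance) for a proportion.
    print(round(margin_of_error(0.5, 1000), 4))
    print(required_sample_size(0.5, 0.03))
```

Under these assumptions the error shrinks with the square root of the sample size, so halving it requires about four times as many respondents. That is why cost dominates the question of how much data to collect.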

What can technology do for collecting data, keeping it safe, and using it for the public good, which was impossible a decade ago?

  • Technology is the only solution. It is an excellent solution on the collection side as we can expand the sample [size] at low cost.
  • Computing devices that can use ‘store and forward’ type architecture [can overcome] connectivity challenges.
  • But the quality of the data is where the biggest advances will be felt because we can control it through many methods such as pattern matching and looking for trends, all real-time.
  • On the privacy and security side there is an issue of hacking.
  • There are several ways of making technology more robust — create levels of anonymity, make it much harder for somebody to figure out who you are.
  • AI algorithms can protect data and prevent theft.
  • But [that] has to be complemented with a legal framework that acts as a deterrent for anybody who is caught stealing or misusing personal health data.
  • It must be independent of any political machinery. So, possibly, create an independent commission, like the Election Commission.
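
The “store and forward” idea and the quality checks at the point of capture mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a description of any real system: the class name, the `send` callback, the field names (`age`, `sex`) and the validation limits are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Record = Dict[str, object]


class StoreAndForwardQueue:
    """Keeps records on the device until the network accepts them."""

    def __init__(self, send: Callable[[Record], bool]) -> None:
        # send() is a hypothetical uploader: it returns True on success
        # and False when there is no connectivity.
        self._send = send
        self._pending: Deque[Record] = deque()

    def capture(self, record: Record) -> None:
        """Store a record locally, whether or not we are online."""
        self._pending.append(record)

    def flush(self) -> int:
        """Forward stored records in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # still offline: keep the record for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


def quality_issues(record: Record) -> List[str]:
    """Very simple range and code checks applied at the source."""
    issues: List[str] = []
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        issues.append("age out of range")
    if record.get("sex") not in {"F", "M", "O"}:
        issues.append("unknown sex code")
    return issues
```

A field device would call `capture()` for every interview and `flush()` whenever a connection appears. Records flagged by `quality_issues()` could be shown to the enumerator immediately instead of being discovered weeks later.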

The Personal Data Protection Bill is currently before Parliament but it has been criticized for the sweeping powers it gives to state agencies. Are we still a long way from a reassuring legal framework?

  • Some legal frameworks exist for protecting Census and other survey data, but the problem is that the organizations that handle these are not strictly independent regulatory authorities.
  • For example, the Census Commissioner of India has the power under the Indian Census Act to say no, but an independent commission is needed.

How is private data used for analysis and policymaking?

  • In policymaking, the use of data is increasing but it is not extensive. Take, for instance, the response to the pandemic: all of a sudden, we are using electronic health records at both the population level and the individual level.
  • A much greater amount of information sharing has happened. All of a sudden, hospitals are talking to each other. And they’re all sharing information not only on individual patients but on incidents.
  • With anonymous data, people have less of an issue but in some cases it may not be anonymous — for instance, in contact tracing.
  • When you go beyond health, we have other interesting issues in this public health crisis, policy issues — pensions, loans that were given to businesses.

The decennial Census has been around for more than 130 years now. This time it is being disrupted by the pandemic. Can’t we have a real-time capture of demographic data?

  • That is a distinct possibility. Some countries, such as Germany, have already gone into that.
  • To get there, we have to improve the system of our birth and death and marriage registration and registration of other statistics.
  • This has improved, but only in urban areas, and not to the level where the data are usable and we no longer have to go to the Census for the same data.
  • There are other types of data, which are also collected in the Census regarding culture, language, economy, but yes, we don’t need that every 10 years. That could be collected at longer intervals.

Is it possible and desirable to collect at least basic demographic data on a real-time basis rather than wait for 10 years?

  • Totally, and it is not going to be very expensive.
  • Technology costs have come down so dramatically, and we have such a large penetration of mobile phones in India.
  • Individuals will have to be willing to report and that’s really where India’s challenge might lie. For this, it is important that they see a personal benefit.
  • We can introduce a lot of intelligent algorithms and traceable technologies that detect the quality at the source.

At what level of specificity and granularity in data could that balance be optimally achieved? How much is too much and how much is too little when it comes to data?

  • That’s a real problem. In a census as well as in the big demographic surveys, health surveys, the smallest area level is the village.
  • More sensitive data, and data whose quality is in question, are shared only at the level of broader units such as the State or district.
  • Technology could make data capture and analysis possible in smaller units, but one has to be very judicious in doing that. For that, we need an independent agency outside of the government to take these decisions.
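
One common way to be judicious about small units is to withhold any count that falls below a minimum size. The sketch below assumes a hypothetical threshold `k` and made-up village labels; it is not a rule drawn from the Census or from the editorial.

```python
from typing import Dict, Optional


def suppress_small_cells(counts: Dict[str, int], k: int = 10) -> Dict[str, Optional[int]]:
    """Replace any count below k with None so it is not published.

    The threshold k is an assumption for illustration; a regulator
    would set it according to how sensitive the variable is.
    """
    return {area: (n if n >= k else None) for area, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts of a sensitive condition by village.
    village_counts = {"Village A": 42, "Village B": 3, "Village C": 17}
    print(suppress_small_cells(village_counts))
```

A suppressed count can still be recovered by subtraction if the district total is published alongside the other villages, so in practice this step would need to be combined with further safeguards.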

Consent and anonymisation are the key words often used to reassure sceptics. Is it all good as long as we adhere to these two principles?

  • Both are very important, and are elementary.
  • We must have a stronger, autonomous regulatory framework. Technology can do a world of good in delivering health, education and other development outcomes, but one has to be very careful and judicious in its deployment, and there should be an independent and robust regulatory mechanism to oversee that process.

Reference: https://www.thehindu.com/opinion/op-ed/how-much-is-too-much-when-collecting-data-for-planning/article32576730.ece?homepage=true