NON-PERSONAL DATA GOVERNANCE FRAMEWORK – INDIAN PROPOSAL – PART I

By Arya Tripathy on 06 August, 2020

In September 2019, the Government constituted the Committee of Experts to study various issues relating to Non-Personal Data (NPD) and make recommendations for its regulation (NPD Committee)[1]. The NPD Committee released its report on July 12, 2020 (Report)[2], which is open for public consultation till August 13, 2020. The Report is structured into 7 key chapters that delve into the rationale for regulating NPD, its scope, the key players in the NPD ecosystem, the legal basis for ownership over NPD, the contours of undertaking data business, and the need and technology architecture for data sharing, and it proposes a new regulatory framework for NPD governance. The new set of regulations will likely deal with anonymisation standards, data sharing protocols, regulation of data businesses and markets, etc. The Report also contemplates creating a new regulator – the Non-Personal Data Regulatory Authority (NPDA).

This post analyses a select few recommendations of the Report and evaluates the impact of the proposed NPD governance framework. We will continue our analysis in subsequent posts.

1. Genesis: The Justice B.N. Srikrishna Committee Report, while laying out the norms for the Personal Data Protection law (PDP Bill), suggested regulation of community data (i.e., a body of data sourced from multiple individuals) for group privacy rights, as an extension of a robust data protection framework[3]. It observed that individual control over aggregated data sets is impractical, and that a suitable law should facilitate collective protection of privacy on the basis of certain principles. Alongside, such protection should take into account the intellectual property ownership of the entity processing the data. Thus, the Srikrishna Committee Report recommended that the government consider promulgating a law that accords specific protection to “community” and “corporate” data.

Taking a cue from this, the NPD Committee was constituted. It was also vested with an additional mandate – to recommend a regulatory framework for NPD. A perusal of the Report reveals that the objective has been to propose a framework that focuses on “data imperialism”, where NPD is akin to a common or sovereign resource subject to socio-economic welfare ideals, and which may perhaps dilute how community and corporate data is protected. The proposed NPD regulations will apply to the government and to all public as well as private entities. Further, NPD relating to Indians is proposed to be governed and regulated as per the NPD regulations, irrespective of where it is transferred or processed.

2. Why regulate NPD? The Committee’s primary case for regulating the NPD ecosystem is to harness the economic value of data in order to bolster innovation and create a level playing field in the data market. The underlying idea is to regulate NPD in such a manner that benefits accrue to the Indian community and its businesses. The Report suggests that existing methods of realizing data value, such as direct monetization (selling processed data), using data to determine new avenues of investment, and corporate restructuring, strengthen the view that data is an asset that supports organizational growth. This asset, when it arises in or relates to India, should be treated as a common resource that must be available for social, public and economic value creation for all Indians.

It appears that the Committee shuns the idea of a few players owning and utilizing enormous volumes of data, which could create entry barriers for new indigenous players, stifle innovation, and lock away the untapped social and public value of data. In this vein, the Report argues that it is apt to set out regulations around the NPD ecosystem, data analysis, sharing, and distribution of gains. This, in turn, will enable innovation, promote start-ups and MSMEs, facilitate development and better delivery of public services, and build in checks and balances against monopoly by a few “first-movers” in the data market.

While arguing the case for NPD regulation from a welfare state vantage point, the Report makes only a passing reference to the existing legal matrix around intellectual property. It states that NPD regulations must safeguard intellectual property rights and incentivise new business creation. However, it fails to adequately acknowledge that data forms a fundamental building block of an organization’s confidential and proprietary information, such as trade secrets, know-how, technology, software, programmes, protocols, marketing strategy, etc. These are globally safeguarded through statutory or contractual rights. Without adequate principle-based suggestions on how NPD regulations will balance conflicting interests and fair market competition, the Committee’s case for regulation of NPD is, at best, a half-hearted attempt to assess the entire ecosystem.

3. What is NPD? Generally, NPD is data stripped of its identifying features. It is not the same as pseudonymised data, where the data is retained in a de-identified form and can be recombined with identifiers as and when required. NPD will refer to anonymised data, i.e., personal data which has undergone an irreversible process and can no longer be linked to a particular person. The Report classifies NPD into 3 categories: (i) public NPD, (ii) community NPD, and (iii) private NPD.

  • Public NPD means NPD that is collected or generated by the government or its agencies, or collected or generated in the course of any publicly funded work; for example, anonymised land records, pollution levels, public health information. However, any public NPD that is protected by confidentiality is outside the purview. This exclusion could cover data that is protected from disclosure under Section 8 of the Right to Information Act, 2005, such as information concerning the sovereignty of India, cabinet papers, or information protected under the Official Secrets Act, 1923.
  • Community NPD refers to NPD about any thing or phenomenon that originates from or relates to a community of natural persons. Community is defined as any group of people with common interests and purposes, who are involved in social and/or economic interactions, including a virtual community. Thus, community will include actual groups such as religious, geographic, regional and linguistic ones, and virtual ones like chat groups, social networks, etc. Illustrations of community NPD would include datasets collected by utility providers, municipal corporations, and private companies like intermediaries, telecom, e-commerce, aggregators, etc. The Report further states that only “raw/factual data” which has not undergone any processing will be treated as community data. This essentially means that data collected at the primary point and then anonymised will qualify as community NPD. Where the collected and anonymised data has been further processed, say analysed or structured, it will fall outside the scope of community NPD.
  • Private NPD includes NPD collected or produced by persons or entities, other than government. These NPD sets will originate or relate to privately-owned assets and processes. Essentially, it will mean any NPD set that has been processed further by private persons and has moved beyond “raw/factual” state such as derived, analysed, structured, observed data sets, proprietary knowledge, data insights.

In our view, the classification is overlapping, ambivalent and establishes a faulty premise for regulation of NPD.

Firstly, there is no clear distinction between community and private NPD. The premise that the category turns on whether NPD is in a raw or derived state is fallacious. For NPD to be in an anonymised raw state, the collecting entity is already utilizing technology processes and incurring costs for collection and anonymisation. Alongside, and as we discuss in detail in our subsequent post, the Report proposes that NPD sets for data sharing will have to be adapted to certain standardized formats, in which case it is incorrect to state that only raw data can be treated as community NPD.

Secondly, it is unrealistic that an entity after collection of personal data would directly move to anonymisation, without processing it for providing goods or services and extracting derivations. Identifiable data tends to form the initial value chain for data-centric businesses providing them with realistic insights. Thus, the classification premise is contrary to the realities of a data processing cycle.

Thirdly, the classification could motivate entities to retain data in identified or identifiable format longer than is required, just to escape the rigours of NPD regulations altogether, and specifically the proposal on data sharing (as will be dealt with in our subsequent post). Should that be the case, large volumes of identified data could remain exposed and vulnerable, creating an imminent threat to individuals’ privacy.

Fourthly, the Report is silent on which category a collated NPD pool will fall into if it comprises public, community and private NPD. It is not uncommon for anonymised datasets to be sourced from different projects, businesses, and individuals, in which case the distinction proposed by the Committee is likely to fail.

The NPD regulations must dive deep and delineate the scope of NPD with much more clarity taking into account the origin of data, processing stage, and economic realities.

4. Sensitivity of NPD: NPD, in most circumstances, will owe its genesis to personal data. The Report acknowledges this and proposes categorisation of NPD based on its degree of sensitivity. Drawing an analogy with sensitive and critical personal data as contemplated under the PDP Bill, the Report emphasizes that it is important to borrow the concept of sensitivity for NPD from the following perspectives – (i) national security or strategic interests, (ii) collective harm to a group’s privacy[4], (iii) sensitivity or confidentiality for businesses, and (iv) risk of re-identification, as anonymisation techniques are not absolute. The Report thus suggests that NPD should be classified as sensitive and critical NPD. Sensitive NPD will refer to anonymised data derived from sensitive personal data, and critical NPD will mean anonymised data extracted from critical personal data.

PDP Bill defines sensitive personal data widely to include any personal data which may reveal, be related to, or constitute financial, health, official identifier, sex life, sexual orientation, biometric, genetic, transgender status, intersex status, caste or tribe, religious or political belief/affiliation, or any other category that may be notified by the government as sensitive data. The above scope is expansive, and most kinds of personal data are covered within its ambit[5]. Critical personal data is not outlined under PDP Bill, and the government has discretion to notify such kinds of data as it deems fit[6]. In light of these definitions, it can be assumed that vast volumes of NPD can qualify as sensitive and critical NPD. Where NPD regulations lay out specific requirements for access, storage, deletion and distribution of these NPD categories, organizations may have to conduct holistic data inventorization, implement segregation structures at the time of collection and build in data classification processes for their anonymised data sets in order to comply with such legal mandate. Additionally, the Report seems to have overlooked the possibility of NPD that emerges from a combination of personal, sensitive and critical personal data, and how such hybrid NPD would be ranked in terms of sensitivity.

How this classification will affect businesses is yet to be seen. At this stage, it can be observed that the Report seems to ignore the fact that where NPD is de-anonymised or re-identified, it no longer remains the subject matter of NPD regulations. It would be governed as personal data and regulated as per the PDP Bill. To this extent, classification on the basis of the sensitivity of the underlying data may be redundant and serve no actual purpose.

5. Consent for Anonymised Data: Large sets of anonymised data can be combined to yield de-anonymised data about communities and individuals. The Report underscores the risk such de-anonymisation poses and the consequent harm that it may cause. In this vein, the Report proposes a noble, yet flawed, guiding principle. It states that anonymised NPD should be treated as NPD of the data principal, and that this will ensure that any harm resulting from re-identification of NPD is actionable by the individual. It further states that “specific” and withdrawable consent from an individual should form the basis for anonymisation of data into NPD and its subsequent usage.

While we admire the Committee’s sentiment in upholding individual and community expectations of privacy, a consent-based framework for anonymisation and usage of NPD may not be fruitful. Consent for anonymising personal data to create NPD is already provided for under the PDP Bill. The PDP Bill defines “processing” as an operation or set of operations performed on personal data, including collection, recording, organisation, structuring, storage, adaptation, alteration, retrieval, use, alignment, combination, indexing, disclosure, dissemination, erasure, etc[7]. Essentially, it covers all kinds of dealing with personal data, including anonymisation. For any kind of processing, unless specifically exempted under the PDP Bill[8], free, informed, specific, clear and withdrawable consent of the data principal is required. This will require organizations to obtain consent even for anonymisation, and hence, to this extent, the Report restates what is already captured in the PDP Bill.

What is troublesome is the Report’s recommendation that organizations must also obtain “specific” consent for using NPD. Implementing this rigour could be extremely problematic and stifle realisation of NPD’s economic potential. If consent were to be specific, it would not suffice to merely state that NPD will be used subsequently. Applying specific consent principles as used in the context of personal data, organizations may be required to flesh out the subsequent use of NPD in some detail. If that is the case, it is unfathomable how organizations can capture all current and future uses of anonymised NPD while obtaining consent for anonymisation. This is because potential uses and ramifications of aggregated NPD sets are constantly evolving. Should this recommendation find a place in the NPD regulations, data-intensive processing businesses are likely to be affected. As it is, there has been severe resistance to consent as the basis for processing data owing to consent fatigue, compliance cost, inadequacy of consent as a basis, and the lack of flexibility that it affords to data processing businesses. Applying the concept of consent-based processing to NPD could be counter-productive and uncalled for.

Conclusion: The Report is the first move by a country to regulate processing of NPD, and is not limited to free flow of data across borders. Considering the sui generis nature of the Report and its recommendations, it cannot be overemphasized that the proposals need to be well thought through and articulated. As it stands, the Report, in suggesting principles for an NPD governance framework, has chosen to remain silent on several fundamental aspects. In our next post, we will continue to analyse other recommendations in the Report and their potential impact.

[1] Ministry of Electronics & Information Technology Office Memorandum No. 24(4)/2019-CLES dated September 13, 2019 accessible at https://www.meity.gov.in/writereaddata/files/constitution_of_committee_of_experts_to_deliberate_on_data_governance_framework.pdf (last accessed on August 2, 2020)

[2] Report by the Committee of experts on Non-Personal Data Governance Framework available at https://static.mygov.in/rest/s3fs-public/mygov_159453381955063671.pdf (last accessed on August 2, 2020)

[3] Report titled “A Free and Fair Digital Economy – Protecting Privacy, Empowering Indians” by the Committee of Experts under the Chairmanship of Justice B.N. Srikrishna available at https://www.meity.gov.in/writereaddata/files/Data_Protection_Committee_Report.pdf (last accessed on August 2, 2020)

[4] Harm is defined under Clause 2(20) of PDP Bill to include bodily or mental injury, loss, distortion, theft of identity, financial or property loss, loss of reputation, humiliation, loss of employment, discrimination, etc. While the Report does not explain what harm is, it is likely that the NPD Regulations will follow a similar scope for defining collective or individual harm.

[5] Clause 2(36) and Clause 15 of PDP Bill

[6] Clause 33 of PDP Bill

[7] Clause 2(31) of PDP Bill

[8] Chapter III read along with Chapter VII of PDP Bill provide the situations for processing on other grounds than consent. Further, the government can exempt any organization from consent requirements in exercise of powers under Clause 35 of the PDP Bill.

The views expressed here do not constitute legal counsel, are aimed at knowledge sharing and awareness advocacy, and are views of the contributing author.
