By Arya Tripathy on 01 May, 2020
“A genuine leader is not a searcher for consensus but a molder of consensus”
Martin Luther King, Jr.
There is global consensus that the COVID-19 transmission chain can be interrupted with containment measures. Containment efforts require rapid identification and quarantine of potential carriers or “contacts” who may have contracted the virus through a confirmed patient. Manual tracing requires the patient to jog their memory, recollect, and identify all locations, times, durations and contacts. This process is arduous, ineffective, and inaccurate. This is where “contact tracing” through technology becomes critical. Regardless of whether countries have a codified data-protection and privacy law, most have either launched or are contemplating the use of contact tracing applications. They are also keenly analysing various technology frameworks proposed by technology companies and academia for adaptation in COVID-19 tracing applications. While governments are fiercely urging people to use contact tracing technology as an essential tool to contain the virus, there is little or no discussion around its immense potential for invasive surveillance and dilution of users’ privacy.
This post provides an overview of some of the existing tracing tools across the world, with a view to comparatively analysing the Indian government’s Aarogya Setu application, its privacy and data protection implications, and the way forward.
1. What is contact tracing? The World Health Organization (WHO) defines contact tracing as the identification and follow-up of persons who may have contracted a contagious disease from an infected person in order to help the contacts get relevant care and treatment. It comprises 3 basic steps:
- contact identification => identifying everyone who has come in contact with the infected subject;
- contact listing => informing every contact about their status, implications, actions that they must take, and guidelines for care; and
- contact follow-up => following up with all contacts to monitor symptoms.
Digital contact tracing tools currently collect and process important pieces of personal information like bluetooth transmission, geo-location, proximity between two devices, cellular ID, mobile data, battery usage, and health data. This information helps in monitoring a person’s movement and social distancing, identifying exposed cases, and implementing preventive care measures. The data transmission chain involves different stakeholders including users of the application, operating system providers (“OS”), network carriers providing internet connectivity, third-party servers that store information, and the government. Quite naturally, there are significant concerns around what data is being processed, how and why, and whether privacy is traded off only to the extent necessary for containment of COVID-19. In a way, there is a thin line between contact tracing and state surveillance.
2. Concerns around contact tracing technology: Privacy advocates have voiced concerns that resonate with the general public. For instance, a joint statement was issued by 100 civil rights and digital policy organizations, emphasizing the need for governments to “respect” human rights while deploying digital surveillance technologies to fight the pandemic. Some of the key concerns are:
- contact tracing could form the basis for expanded and indefinite digital surveillance;
- governments could use disproportionate means for containment needs in complete disregard of fair processing principles, such as data minimization and limited data retention;
- there is a lack of transparency and accountability in data processing, which will diminish the chances of any future challenge to state action; for example, many countries claim that only anonymous data is being processed, but there is no evidence to substantiate the claim;
- there is an increased possibility of “function creep” i.e., where information is collected for one purpose, but used for another without any basis; for example, COVID-19 contact trace information, although primarily collected for containment, can end up being used for commercial purposes, profiling, or intelligence and security purposes;
- network and data protection infrastructure may be absent or poorly deployed, making data sets susceptible to breach and resultant harm to users; and
- states may deny access to judicial remedies by users for breach and harm.
3. Comparative study of different countries: Nonetheless, governments are moving rapidly in developing and deploying contact tracing tools. Based on a comparative study, it appears that states have been using two key technologies – bluetooth signalling and geo-location tracking. Bluetooth signalling uses bluetooth signal transmission between devices for determining distance, time and duration of contact. This enables proximity tracing against a COVID-19 positive patient, which can be used to contain further spread by providing prevention guidelines to potential contacts. Geo-location tracking goes a step further. It tracks the actual movement of the user at certain time intervals. This helps proximity tracing, but also allows actual surveillance. It also appears that for data processing, countries are following either a centralised or a decentralised data collection and storage strategy. The underlying technology and usage conditions for the contact tracing apps are fundamental blocks for analysing the privacy implications. The table below provides an overview of select government apps and technology frameworks, including India’s Aarogya Setu. We have not analysed China and South Korea as a lot has been written and discussed about them. While analysing, we have not factored in state-specific data protection and privacy laws. Rather, we rely on well-accepted fair processing principles – fair and lawful processing, data minimisation, limited retention of data, processing for specific purposes, necessity, transparency and accountability. Based on these core tenets, we have rated each of the analysed contact tracing apps as high, medium or low risk.
|#||Application or framework||Country / region||Core technology||Data storage||How it works?||How is data and privacy protected?||Risk to privacy|
|(i)||HaMagen||Israel||Geo-location cross-referencing of user’s data with confirmed patient’s data||Decentralised storage and centralised tracing through government server||HaMagen uses geo-location data of contacts to cross-reference with that of a COVID-19 patient, and not just bluetooth signals. This is how it works: While installing, HaMagen requires authorization to access location as well as internet data. It collects location information upon activation, which is stored on the user’s device only. The location data is only used for tallying with the tracking information collated on the routes of a confirmed COVID-19 patient. This helps in identifying any overlap in time and place. For this tallying, HaMagen downloads a file with an anonymous list of locations, dates and times of visit of COVID-19 patients onto the user’s phone. The file is downloaded from the ministry’s cloud on an hourly basis. Thereafter, it cross-references the COVID-19 patient’s GPS location with that of the user on the device. The cross-referencing does not take place in the ministry’s cloud, and is not shared with third parties without consent. Further, location data is not transmitted to the cloud. If there is a possibility of infection, the user receives a notification with details of the location and times of potential exposure. Such a user must quarantine and, if they experience symptoms, must connect with healthcare providers. Location, wireless connectivity, track of proximity and cross-referred data are retained on HaMagen for 2 weeks.||HaMagen poses a higher risk to globally accepted privacy norms for the following reasons: It is likely that geo-location cross-referencing disregards the data minimization principle. Proximity tracing and self-care advisory to contacts can be issued using bluetooth data alone, and there may not be an urgency in processing geo-location or internet data. Thus, HaMagen uses a larger pool of personal information to directly monitor movement of users, and this increases surveillance risk. There is no consent mechanism and government’s access rights are indirectly inbuilt. There is no clarity on whether data stored on the device is encrypted. While the ministry claims that anonymised data about COVID-19 patients is downloaded, there is always a possibility that a user can deanonymize it by combining the anonymous information with their own location data. There is increased suspicion that Shabak, Israel’s security agency, and the health ministry are using HaMagen data for surveillance, which, if true, flouts the fundamental norm of purpose limitation.||High|
|(ii)||Aarogya Setu||India||Bluetooth signalling and geo-location data tracking||Decentralised storage and centralised tracing through government server||Aarogya Setu uses bluetooth signalling as well as geo-location data to identify COVID-19 positive or symptomatic cases, and to contact trace. An overview of how it works is below: To install and use, the user must enable bluetooth & GPS services on the device. During installation, Aarogya collects name, phone, age, sex, profession, countries visited in the last 30 days, location and internet data. The information is transmitted to and stored on a centralised government server. The information is then hashed with a unique digital ID (DiD) (strictly speaking, hashing is a one-way transformation rather than reversible encryption), which is pushed onto the user’s device for encryption. All stored and transmitted data on phone and server are encrypted. Aarogya collects information continually through various mechanisms. It collects location data every 15 minutes. Further, it collects location data when the user undertakes self-assessment through the app. Furthermore, when a user is in proximity of another user, the DiD is transmitted to the other device, and time plus location are captured on each device. Collected information is stored on the phone and not transmitted to the server. Information is uploaded to the server in 3 scenarios, without any consent from the user, by correlating the DiD with personal information initially stored on the server – (i) confirmed COVID-19 infection, (ii) symptomatic through self-assessment using the app, or (iii) self-declaration of infection. The data uploaded is used for contact tracing, quarantine, care, location sanitization, identification of clusters, generation of heat maps and containment. Initial personal data collected is stored as long as one uses Aarogya or as required under law. Other data on the phone is stored for 45 days, if not uploaded to the server. If uploaded, it will be retained for 60 days on the server for confirmed cases, and 45 days for others. The user is mandated to keep the device in their possession and not allow use by anyone else. If not, the app can identify wrongly and the government shall not be liable for such error. The government shall not be liable for any claims regarding use of the app, inaccurate identification, and unauthorised access. The government can use aggregated/anonymous data for containment and statistics. There is no third-party transfer, except to healthcare workers, and this is without consent of the user.||Aarogya Setu stands out as an application that poses significant risk to a user’s privacy for the following reasons: Similar to HaMagen, the data minimization principle is disregarded as Aarogya collects information such as profession, travelled countries, geo-location and internet data. It is unfathomable how profession will facilitate contact tracing and containment. There is no consent mechanism for uploading information to the server, or sharing with third parties, which essentially means that the government can access data, including in a situation where Aarogya wrongly identifies the user as symptomatic. There is no accountability and transparency, as the government has disclaimed all forms of liability. While 21 days is argued as the ideal timeframe for identifying potential COVID-19 cases, Aarogya retains initial identified information indefinitely, and other data sets are retained for longer durations. The rationale for such long durations is unclear. There is some evidence that hashing is susceptible to hacks and hence, the degree of protection is not commensurate with the kind of information collected and the retention period. Aarogya gives the government an opportunity to track physical movements every 15 minutes, which appears unreasonable, disproportionate and unnecessary. Thus, there is a high risk of unauthorised surveillance and an even heightened risk of function creep. Further, it is important to note that government processing is not regulated under the Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data) Rules, 2011. This creates a situation where the government can exercise unfettered power while dealing with personal information collected without any limitation, unless someone challenges such processing as a disproportionate breach of the right to privacy.||High|
|(iii)||TraceTogether (TT)||Singapore||Bluetooth signalling model||Decentralised storage and centralised tracing through government server||This is how TT works: For downloading and installing TT, mobile number and bluetooth permission are essential. The user does not need internet connectivity after downloading TT, except for ID retrieval as explained below. User consent is required for (i) storing the mobile number in the secured TT registry and (ii) receiving information on contacts and risk. Keyed-in mobile numbers are substituted by a random permanent ID, and details are stored on the TT registry. They are not shared with any other user. Data allowed to be accessed shall be solely for the purpose of contact tracing. Participating devices exchange proximity information using bluetooth signals whenever TT detects another TT enabled device. Bluetooth relative signal strength (poor or strong) readings between devices are key for ascertaining proximity and duration of contact. When close to another TT enabled device, the devices exchange over bluetooth a temporary ID generated by encrypting the permanent ID with a private key held by the health ministry. This temporary ID can only be decrypted by the health ministry. Every time a COVID-19 contact is detected, TT sends a notification to indicate that signals are transmitted, and this comes with a time stamp. All user data is encrypted and stored on the user’s device, and data will not be accessed unless the user has been in close contact with a confirmed COVID-19 case. As per TT’s privacy statement, if the user is COVID-19 positive, the user has the option to give the government access to TT data. However, the FAQs state that users have a legal obligation to assist in the containment plan and must provide information such as location timelines, logs and data stored in other apps. Data is stored for 21 days on a rolling basis.||Several features make TT less regressive of user privacy and more responsive to the core processing principles: Purpose is identified. It follows principles of data minimisation as geo-location and internet data are not collected. Further, retention is for 21 days, which is based on scientific evidence that 21 days is important for mapping symptoms and cure. Based on how TT works, and specifically, the FAQs around consent-based data access, it can be argued that consent is not meaningful and “free”, because users have a legal obligation to disclose for containment measures. However, this is limited to those who are COVID-19 confirmed, and does not extend to suspected cases. Data collected is stored locally on the user’s phone. This decentralised data logging minimizes risks of unauthorised access and use. Data is stored in encrypted format. Mobile numbers are not revealed to TT users. They are substituted by a random permanent ID. Further, the mobile number and its corresponding user ID are stored in a secured server.||Medium|
|(iv)||Apple-Google API and contact tracing framework||Proposed for world-wide use||Bluetooth signalling model||Decentralised storage and decentralised tracing through different applications using the framework and API||Apple and Google have announced the launch of a “comprehensive solution” that includes APIs and OS-level technology for contact tracing. In the first phase, APIs allowing inter-operability between different OSs and applications will be released by May 2020. While the details are yet to be released, it appears that the framework will use bluetooth signalling for tracing. Some of the key features released include: A one-time tracing key will be generated when contact tracing is enabled on the device. Additionally, a daily tracing key will be generated every 24 hours. These will be stored on the device and not transmitted onward to servers. A proximity identifier will be derived from daily tracing keys and transmitted through bluetooth on tracing enabled devices. When a person tests COVID-19 positive, diagnosis keys and associated day numbers are uploaded to the diagnosis server. The server aggregates diagnosis keys from COVID-19 positive cases, and anonymised/aggregated information will be sent to all users who are using contact tracing. The key schedule is fixed and defined by OS components.||It is premature to comment on how this technology framework will work and what the privacy implications will be. A lot will depend on how implementing apps handle consent and data access, but certain publicly available aspects create an impression that the risks will be medium to low: Apple and Google claim that the API will prevent applications from using static information pools like name, as this could be used for tracking location. Since the key schedule is fixed and generated by the OS, the privacy risks shall be reduced. It will not be computationally feasible for an attacker to find a collision on a rolling proximity ID, preventing a wide range of replay and impersonation attacks. No metadata will be processed.||Medium|
|(v)||COVIDSafe||Australia||Bluetooth signalling model||Decentralised storage and centralised tracing through government server||Similar to TT, COVIDSafe uses bluetooth signalling and works in the following manner: For using COVIDSafe, the user is required to provide name, age range, mobile number, and postal code. An SMS is sent to complete installation. This information is used to generate an encrypted reference code specific to the user. Bluetooth access is required for COVIDSafe to work. Location data or internet connectivity is not used. When two COVIDSafe devices are in range, the encrypted reference codes are transmitted and stored on the devices. The date, time and distance are also generated, encrypted and stored on the devices. The information on the device is not accessible to anyone or transmitted to the government server. Only when a person is diagnosed positive will the user’s consent be obtained for uploading information from the device to the government server, where the government will decrypt it and use it further for contact tracing. Confirmed patient details will not be shared. Data is retained for 21 days, after which it is automatically deleted. Data will only be used for contact tracing and not otherwise.||COVIDSafe was reportedly launched after a detailed privacy impact assessment. It follows a necessity-based approach and, similar to TT, adheres to well-known fair processing principles like data minimisation, limited retention, decentralised data storage, and encryption. Some additional privacy-centric checks are below: All data is encrypted and inaccessible, except with a unique decryption key that the government will use only when encrypted data is uploaded. Unlike TT, which generates a pseudonym ID, COVIDSafe uses an encrypted reference code, which further protects stored data. Consent is required for uploading the contact information, and the chances of obtaining “free” consent in the true sense are higher.||Low|
|(vi)||PEPP-PT NTK proximity tracing system||Proposed for European Union||Bluetooth signalling model||Decentralised storage and centralised tracing through specific application related server||PEPP-PT NTK proximity tracing system is a framework curated for European Union countries, factoring in requirements under the General Data Protection Regulation and EU Privacy Directive. PEPP-PT NTK operates on bluetooth signalling and will work in the following manner: Any app adopting PEPP-PT NTK will work in the background. It will not collect any personally identifiable or location data. Temporary IDs for each bluetooth device during tracing will be generated pseudo-randomly and will change periodically. These temporary IDs do not associate multiple signals with the same device and cannot be used to identify the user. The temporary ID will be transmitted via bluetooth signal to other devices. The signal will also include information about bluetooth output power that will help in estimating distance and duration. Thus, only 3 pieces of information will be collected: temporary ID, distance and duration. These will be encrypted and stored on the device. If a user tests COVID-19 positive, a healthcare professional will provide a code, which when entered by the user will allow data access. This access will be with consent. The data will then be transmitted to the connected server, where it must be retained for a specific duration only. Once data is obtained, prevention notifications will be sent to other users depending on their past physical proximity to, and duration of contact with, the positive user.||The proposed framework tops our analysis as the least intrusive of one’s privacy, but a lot will depend on how it is actually adopted and put to use. Apart from the privacy protection features used in TT and COVIDSafe, the noteworthy feature that safeguards the user’s privacy is that there will be minimal collection of data, mostly without personal identifiers. Further, tracing will happen through temporary IDs, and the chances of identification and surveillance of any kind are almost negligible.||Low|
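
The bluetooth-based entries in the table share a common pattern: a device broadcasts short-lived identifiers derived from a secret, logs the identifiers it hears, and later checks that log against identifiers linked to confirmed cases. The Python sketch below illustrates that pattern under stated assumptions. The HMAC-SHA256 derivation, the 16-byte identifier length, the 144 intervals per day and all function names are hypothetical choices made for illustration; they are not the actual scheme of TraceTogether, COVIDSafe, PEPP-PT NTK or the Apple-Google framework.

```python
import hashlib
import hmac
import os

ID_LENGTH = 16           # bytes per broadcast identifier (assumed for illustration)
INTERVALS_PER_DAY = 144  # one identifier per 10 minutes (assumed for illustration)


def new_daily_key() -> bytes:
    """Generate a random per-day secret that stays on the device."""
    return os.urandom(32)


def rolling_id(daily_key: bytes, interval: int) -> bytes:
    """Derive the short-lived identifier broadcast during one interval."""
    message = b"rolling-id" + interval.to_bytes(4, "big")
    return hmac.new(daily_key, message, hashlib.sha256).digest()[:ID_LENGTH]


def ids_for_day(daily_key: bytes) -> set:
    """Recompute every identifier a device would have broadcast that day."""
    return {rolling_id(daily_key, i) for i in range(INTERVALS_PER_DAY)}


def exposure_matches(heard_ids: set, published_keys: list) -> set:
    """Match locally logged identifiers against keys of confirmed cases.

    The comparison runs on the user's device; the local log is never uploaded.
    """
    matches = set()
    for key in published_keys:
        matches |= heard_ids & ids_for_day(key)
    return matches
```

In this sketch only the daily keys of confirmed cases would ever leave a device. Someone who merely overhears the broadcast identifiers cannot link them to one another without the key, which is the property the low and medium risk ratings above rely on.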
Based on the comparison above, it can be observed that the approach underlying contact tracing technology varies per jurisdiction. The common theme is that there is a need to use contact tracing technology in a public health emergency, and to that extent an individual’s privacy can be curtailed. However, the question remains – to what extent and at what cost? For contact tracing tools to work, it is important that a majority of a given demographic downloads and uses them. Approximately 75 million users have downloaded Aarogya Setu and the Indian government is considering installing the application as a default app on new smartphones. From the user’s perspective, there is a dilemma. While many are using contact tracing applications as preventive tools, there is apprehension that the data will be misused in complete disregard of users’ privacy. In our view, the bluetooth signalling method combined with consent-based access facilitates containment measures. On the other hand, geo-location tools allow movement tracking, which adds to the existing trust deficit. When scrutinised on parameters of proportionality and necessity of the state’s containment measures i.e., whether deployed tools are proportionate to the containment purpose and absolutely essential, it appears that bluetooth signalling has a better chance of withstanding the scrutiny. However, geo-location tracing may be ruled disproportionate as it permits ongoing surveillance. Even when these privacy arguments are kept at the periphery, it is important to acknowledge that the technology has limitations and one cannot rely entirely on the assessment findings and notifications prompted by these apps. These apps rely on time duration based on bluetooth signal or geo-location. For instance, many trigger notifications if bluetooth signals are transmitted for 15 minutes. There is no scientific basis for relying on duration of exposure as a benchmark, and it is possible that someone contracts the infection within a few seconds of being around a confirmed or symptomatic person. Thus, there is a possibility of false notification and inaccurate identification of contacts and symptomatic cases. In any event, reliance on a tracing application’s symptom assessment is not foolproof. This raises a fundamental question on the efficiency of tracing tools, and whether the privacy-utility trade-off is proportionate.
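
The duration rule mentioned above can be sketched as follows. This is a minimal illustration, assuming a hypothetical log of (contact ID, minutes) sightings and a 15-minute cut-off; the function and variable names are invented for the example and do not reflect any specific app. It shows why a brief contact with a confirmed case produces no notification under such a rule.

```python
from collections import defaultdict

THRESHOLD_MINUTES = 15  # cut-off used by many apps, per the text above


def contacts_to_notify(sightings, confirmed_ids, threshold=THRESHOLD_MINUTES):
    """Return the confirmed contact IDs whose total logged time meets the threshold.

    sightings: iterable of (contact_id, minutes) pairs from a hypothetical device log.
    confirmed_ids: set of IDs belonging to confirmed COVID-19 cases.
    """
    totals = defaultdict(float)
    for contact_id, minutes in sightings:
        if contact_id in confirmed_ids:
            totals[contact_id] += minutes
    return {cid for cid, total in totals.items() if total >= threshold}


# Hypothetical log: "a" and "b" are confirmed cases, "c" is not.
log = [("a", 10), ("a", 6), ("b", 2), ("c", 30)]
flagged = contacts_to_notify(log, confirmed_ids={"a", "b"})
```

Here `flagged` contains only `"a"`. The two-minute encounter with confirmed case `"b"` falls below the cut-off and goes unreported, even though infection within that window cannot be ruled out.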
4. Conclusion: What can governments do to encourage people to download tracing apps and contribute to containment efforts? One approach is to make it mandatory for the entire population, as has been done by China. But this approach, apart from breaching fundamental human rights, can strain the existing information technology infrastructure in developing countries like India. A possible course could be to bring in checks and balances. For India, this is even more crucial as informational privacy is a fundamental right and any government action that curtails or suspends the right must be as per the constitutional mandate. In the absence of a dedicated personal data protection law (which definitely is much needed now!), this is what the Indian government can do to garner popular consensus for use of Aarogya Setu:
- conduct a privacy impact assessment comparing bluetooth signalling and geo-location methods to identify unique requirements for India’s containment strategy;
- revisit the liability disclaimer and remain accountable, transparent and subject to judicial scrutiny in future;
- limit the data retention period to 21 days for all kinds of data collected, unless there is a scientific basis for longer retention;
- maintain data processing audit logs to substantiate the claims that government processing is for limited purpose and duration, and to rule out any scope for function creep;
- bring in a user consent mechanism for access and transfer of data to the government server or any third party; if consent is not feasible, access must be minimal and only to the extent of contact tracing;
- put in place a data sharing agreement and protocol that allows the public to know more about how and why data is being transferred to a third party, including healthcare providers;
- undertake and ensure that the app and its data will be purged at the end of the pandemic; and
- not to forget – the weakest link in any technology is its human users, and therefore, awareness and training are of paramount importance.
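
The 21-day retention limit suggested in the list above amounts to a rolling purge of older records. A minimal sketch, assuming a hypothetical in-memory list of records that each carry a `seen_at` timestamp; the record layout and function name are illustrative only and are not taken from Aarogya Setu or any other app.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=21)  # retention window recommended in the list above


def purge_expired(records, now, retention=RETENTION):
    """Keep only the records whose timestamp falls within the retention window."""
    cutoff = now - retention
    return [record for record in records if record["seen_at"] >= cutoff]


# Hypothetical contact log evaluated on 1 May 2020.
records = [
    {"contact": "x", "seen_at": datetime(2020, 4, 20)},  # 11 days old, kept
    {"contact": "y", "seen_at": datetime(2020, 4, 1)},   # 30 days old, dropped
]
kept = purge_expired(records, now=datetime(2020, 5, 1))
```

Running such a purge on a schedule, and logging each run, would also give the audit trail that the recommendations above call for.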
 To learn more about HaMagen, access https://govextra.gov.il/ministry-of-health/hamagen-app/download-en/ (last accessed on April 29, 2020)
 Israel’s moves have been drastic and much debated. The government passed an emergency bill overnight allowing Shabak to conduct cellular monitoring surveillance on COVID-19 patients and it is suspected that HaMagen facilitates surveillance measures. For more information on Shabak’s monitoring of cellular data, access https://techcrunch.com/2020/03/18/israel-passes-emergency-law-to-use-mobile-data-for-covid-19-contact-tracing/ (last accessed on April 29, 2020)
 It also requires consent on location data for Android phones, although location data is not processed for contact tracing. The requirement stems from Google’s requirement to seek consent whenever bluetooth permission is obtained as bluetooth id can be combined with other information to determine location.
 To learn more about the proposed framework, access https://www.apple.com/in/newsroom/2020/04/apple-and-google-partner-on-covid-19-contact-tracing-technology/ (last accessed on April 29, 2020)