The Government of India has launched an app called Aarogya Setu to fight COVID-19. The app enables people to assess their risk of catching the infection. To learn more about the app and discuss its consequences, BOOM's Govindraj Ethiraj spoke to Lalitesh Katragadda, an iSPIRT Fellow and Founder of Indihood, and Raman Jit Singh Chima, Senior International Counsel at Access Now.
'The app allows people to register themselves and answer questions about themselves when they choose to,' says Lalitesh Katragadda. He adds that it also tracks the other people an individual comes in contact with, using Bluetooth and GPS. He says that a de-identified ID is used so that the actual user information is not shared.
On the other hand, Raman Jit Singh Chima said, 'The one reality in this space about the larger response including tech interactions is that we need to be cautious about what is effective as sometimes the additions that happen in the tech space can be more problematic.'
Watch this episode of COVID-19 From The Frontlines to know more about this app. Here is the transcript for this -
The COVID-19 virus continues to attack; it continues to grow, with more than 7500 cases in India now. It is spreading in places like Mumbai, where I am; the city has become an epicentre of sorts, and obviously the concern is growing. There are many responses that are being worked on both in India and overseas. We have been talking about the work being done at the frontiers with people from the medical profession, researchers, scientists and so on. But let us also look at another aspect of controlling and containing this disease, or the spread of it, using technology. The Government of India has launched an app, Aarogya Setu, that sits on your smartphone if you download it, then shares data about you via Bluetooth and GPS to a central server, which in turn allows you to know if there is someone who is COVID affected near you and, I am assuming, vice versa as well. So, an app like this is very useful and most people would welcome it at a time like this. But it also raises questions about the way the data is gathered, collected, and stored and what happens in the longer term. So that is the question we are really asking today, even as we try and understand both the front end and back end of this interesting application, impressively put together in a reasonably short period of time. My guest to do that is Lalitesh Katragadda. He is the founder of Indihood and an iSPIRT Fellow. iSPIRT is an organisation of IT tech companies and he has worked on this app. I am also joined by Raman Jit Singh Chima, Senior International Counsel at the non-profit Access Now, co-founder of the Internet Freedom Foundation, and former India Policy Lead at Google. Lalitesh is also ex-Google and I do not know if both of you worked there at the same time, but you have both come from the same place.
Govindraj Ethiraj: Lalitesh, tell us about this app and how does it work and how is it designed. I know you are a maps expert—you have done a lot of work in mapping, building the mapping architecture at Google. So that is your specialization and obviously some of that has found its way into this app as well.
Lalitesh Katragadda: The app is very straightforward; it is not a very sophisticated app. There is a level of sophistication that will develop over time. What the app does is, it does three things. One is it allows people to register themselves and answer questions, when they choose to, about how they are feeling, whether they were in touch with someone known to them who was infected, or whether they have travelled internationally. For this we take the very complicated chart that ICMR has created for assessing whether you are at risk or not, and using very simple chat-like questions the app determines that and tells you immediately whether you are fine, whether you are at risk and need to take some precautions, or whether you are at a high degree of risk and need to get tested immediately. That is one thing it does. The other thing it does in the background, as soon as you register, is it starts tracking two pieces of information. One of them is whoever you come in touch with who is also using the Aarogya Setu app. It looks at Bluetooth interaction; basically it listens to the other person's Bluetooth and the other app listens to your Bluetooth. Neither of you knows who the other is, and there is a de-identified ID behind this (the actual user information is not shared). And the other thing that the app does is, every 30 minutes it measures your latitude-longitude. Both are kept on the phone. The only information that goes to the server is, whenever you take a self-assessment, specifically when the assessment shows you are unwell or at risk, that information goes to the server along with the latitude-longitude of where you took the test, so that if testing or some intervention is required, the authorities potentially know where you are. So those are the two things happening. What is the use of all this?
The use of all this (I will talk about the algorithm separately) is, like you mentioned, if it so happens that one of the people you came across and had significant interaction with turns out to be infected, where you were close enough to them and spent a bunch of time with them (both of them matter), it will then assess that you are at risk. It will allow you, much earlier than any PCR test might otherwise show, to quarantine yourself and keep your family safe; or, if you were close enough to warrant testing, it will advise you to test and the health authorities will consequently help you get tested much earlier. If you are like me, staying up late nights and reading research papers about this infection, there are two things that are material about what we are discussing. It is all about the biology. One is that this is contagious when you are asymptomatic; a lot of people who are infected never become symptomatic, but they are still contagious. That is one. The other one is that when you get symptomatic or when you get infected, the sooner you get treatment the better. Because when the inversion happens and you have this seizure of the lungs (some people are calling it the cytokine storm; some people are calling it haemoglobin reaction), we still do not know the biology of the disease because this is moving so fast. But whatever the reason is, the sooner you get the treatment the more likely you are to recover and get better. So this is all a race against time, and the whole idea of the app is not that we will immediately see benefits when we install; the idea is that if a large number of people install, we all use it for the next 15, 30, or 40 days (however long this crisis takes); the peaking of this will happen in the next 4-6 weeks. We do not know when. Dr Devi Shetty is saying this will peak towards end of April or middle of May. I am inclined to believe a medical expert.
So, whenever it peaks, this will allow us to trace and quarantine people much earlier and contain the spread earlier than we would otherwise. That is what it is doing, from both a back-end and a front-end point of view.
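As an illustration of the proximity logic described above, where both closeness and time spent matter, here is a minimal Python sketch. The interview does not disclose the app's actual algorithm; the `Contact` record, the `risk_level` function and the two thresholds are hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the interview gives no numbers.
CLOSE_METERS = 2.0
LONG_MINUTES = 15.0


@dataclass
class Contact:
    peer_id: str       # de-identified ID heard over Bluetooth
    distance_m: float  # estimated distance to the other phone
    minutes: float     # how long the two phones stayed in range


def risk_level(contacts, infected_ids):
    """Classify risk as 'low', 'moderate' or 'high' from on-phone contact logs.

    Only contacts whose de-identified ID was later reported as infected count.
    Both factors matter: a contact that was close AND long is 'high';
    a contact that was only close or only long is 'moderate'.
    """
    level = "low"
    for contact in contacts:
        if contact.peer_id not in infected_ids:
            continue
        close = contact.distance_m <= CLOSE_METERS
        long_enough = contact.minutes >= LONG_MINUTES
        if close and long_enough:
            return "high"
        if close or long_enough:
            level = "moderate"
    return level


log = [Contact("a", 1.0, 30.0), Contact("b", 5.0, 2.0)]
print(risk_level(log, {"a"}))
```

The three labels mirror the three outcomes mentioned in the interview: fine, at risk and needing precautions, or at high risk and needing a test.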
Govindraj Ethiraj: As an architecture was there a specific intent to combine the medical part with the geo location and the location part? Earlier I thought this was more about knowing whether there is an infected person around you or in your vicinity. But this is as much about following and trying to track the disease which is slightly more complex phenomenon than just knowing where someone is...
Lalitesh Katragadda: There are three parts. Let me first answer your last observation. The moment somebody has the disease, unless they are taking the phone and running away out of quarantine, they are not going to be in the field. So, detecting someone near you who has the disease is unlikely; what is more likely is that you came in contact with that person before it was discovered that they have the disease. That is the reason we are doing the Bluetooth. Now the question is, why are we doing GPS? The reason we are doing GPS is that if we have sufficient data from multiple people who were later diagnosed as having the disease, and they were all using the Aarogya Setu app, then it allows us to very rapidly identify hotspots. Whether this infection happened in a coffee store or near a kirana store or some other place where people were working, it will become evident much more rapidly, within hours of the disease being detected in the people, rather than after the days of trace work that health authorities and surveillance workers are doing now. We have the capacity to do it today when a few thousand are detected, but the capacity will disappear if something like what is happening in Europe happens here (God forbid); it may not happen, but if it does happen we will not have the state capacity to figure out where all the disease is coming from and at that point ... We are trying to contain that and solve the problem by reducing the effort levels to track this, which is why the GPS location.
Govindraj Ethiraj: Raman, what are your first thoughts on this app? Its applicability, the way it is positioned and then rolled out?
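The hotspot idea described above can be sketched as a simple grid count over the location histories of people who were later diagnosed. This is only an illustrative Python sketch under assumptions: the function name `find_hotspots`, the cell size and the `min_people` threshold are hypothetical and not taken from the app.

```python
import math
from collections import defaultdict


def find_hotspots(visits, cell_deg=0.002, min_people=3):
    """Return grid cells visited by at least `min_people` distinct diagnosed people.

    visits: iterable of (person_id, latitude, longitude) tuples.
    cell_deg: cell size in degrees; 0.002 degrees is roughly 220 m north-south.
    The result maps (row, col) cell indices to the number of distinct people.
    """
    people_by_cell = defaultdict(set)
    for person_id, lat, lon in visits:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        people_by_cell[cell].add(person_id)
    return {
        cell: len(people)
        for cell, people in people_by_cell.items()
        if len(people) >= min_people
    }


sample = [
    ("p1", 19.0761, 72.8775),
    ("p2", 19.0762, 72.8776),
    ("p3", 19.0763, 72.8777),
    ("p3", 19.0763, 72.8777),  # repeat visit by the same person counts once
    ("p4", 19.2000, 72.9000),
]
print(find_hotspots(sample))
```

Counting distinct people rather than raw location pings keeps one person who lingers in a place from looking like a cluster.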
Raman Jit Singh Chima: I think the important thing to recognise during this crisis, I think everyone is trying to do …..response. Whether you are a technologist, government or state government, or federal government, you are trying to take urgent action. I think the one reality in this case about the larger response to this including the tech-interventions is we sometimes need to be little bit cautious about what is effective and sometimes the adage that works in the tech space—move fast, break things and then maybe patch them later, launch and iterate—can sometimes be more problematic when it comes to the public health space. More specifically there are a couple of different things—remember that we had contact tracing app very specifically, that was what the Singapore example was, the TraceTogether app—which earlier this week has completely open sourced, you can review and see the code base—but for a very specific purpose, for users to know if people nearby have self-declared to be infected and being cautious about that. We already have learnings: a small percentage of Singapore's population use that app. In India's case what we are doing right now, is something, which is somewhat unprecedented, perhaps more equivalent to Chinese, the People's Republic of China's intervention in terms of COVID tracing, where there is ratification from mobile devices, it is sharing that with the central backend, which is also collecting location data, and not using it for a user to just know what is happening, but the Government or rather the public health authorities to look at potential hotspot tracking or location tracking and there is much more data that they collect, perhaps than in many other places. {Why I am cautious about this} is that, one it does have implications. 
For example, even currently, it is not just about the data stored on your phone, it is about the data stored centrally at the Government's end, whether it is the NITI Aayog or some other agency (that is a good question to also ask)
Govindraj Ethiraj: I think it is the National Informatics Centre
Raman Jit Singh Chima: Who is keeping that, who has access to that, who will be hosting it later? Right now it is hosted on AWS (Amazon Web Services) and (I am assuming) if they are porting it later, then who is actually in control of that, and what has been kept there? Right now there is no clarity on what data is kept after the pandemic finishes. But let us talk about during the pandemic time itself. There have been a lot of useful previous examples. For example, 'tracking Ebola', from South Africa, using a data-driven approach (call data records, massive geolocation of cell phones), where in fact the postscript, the after-action analysis, was that it was not very helpful. In fact, it may have been counterproductive and may have made things more confused; therefore, we must be more cautious here. Contact tracing is an experimental step in terms of the app-based model; it is being tried for the first time; I sometimes think it is worthwhile to look at what public health people are saying. One fact, which Lalitesh himself stated, is that this is effective only if 50% or more of India's total population uses it. So, one must see how that happens. And if that sort of urgent step is taken where basically a majority of our population is being asked to install this, we need to be cautious about what is in there. Would it leak data in other ways, would it be subject to malware, what happens to the data afterwards? More importantly, even today, it is said that it will let you know about your possibility of being infected. Is that a human-based intervention coming from the ICMR [run] state health agencies? Is it an algorithm coded between NITI Aayog, NIC and volunteers? So these are important questions there. Essentially, what I am saying is, independent of this app, you need to be a little bit cautious of tech solutions when it comes to public health.
While it can be a useful assistant, I am still not sure that it can do all sorts of different things we plan to do, and most troubling in India's case it is trying to do a lot in one single app or one single dashboard. And that makes me generally wary.
Govindraj Ethiraj: Lalitesh, are we trying to do too much in one app?
Lalitesh Katragadda: Well, we are trying to do whatever is necessary. Too much or not, I think time will tell. I think the more important thing I am focused on is whether this is going to be effective. That I completely agree with Raman—digital for the sake of digital is pointless unless you make it effective. See, there are a few checks and balances that are in the system. One is, the information you are storing on the phone is only pulled out when it is determined that you are infected using a virology test. And in rare cases when it is determined that you are at very high risk because of [deep] proximity. And that is the color you already have or have reported that you have been in deep proximity with somebody. So that percentage, if you take the entire population of people who are registered, that suppose tomorrow we have 400-500 million people registered—presently we have 6000 and even if the number goes to 100 thousand—and their cohorts. Presently, we are seeing a cohort ratio of about 3-4. Then at that point, we will have half a million records downloaded on the server, the rest of them will remain on the phone. We are not downloading everybody's information and we cannot. You do not know this, but the team inside, the volunteer team has been going at it day in and day out—battling it out figuring what is the privacy edge we can walk and what we cannot walk. I think we have spent more than 40% of the effort battling privacy and less fighting the code and the app because if we downloaded all the data to the server and ran this, we would be able to do a much easier job. The algorithm is very very complex because we are minimizing the amount of data that we are downloading. And the other thing is—to Raman's point—I am old enough to have a lot of mistakes in data science. If you write enough algorithms, you will realize that most of your algorithms do not work the first time. 
So none of this is rolled out; we have written the algorithms, but we have not rolled them out. As and when the data comes in, the data is anemic, quite honestly, so as and when the data comes in and if our algorithms have confidence in the data, even before informing the users. One of the reasons we have not taken the Singapore approach, one of the reasons why we are very wary of the Apple and Google approach, is that it directly informs users and tells them that they are at risk. Especially in a country like India, that can cause mass panic. So, we are not doing that. If we detect that there is a potential set of people who might have been in proximity, we are going to run that information by the health authorities. Possibly even, in the first few days when we get enough information, we will do on-ground testing to see if this algorithm is doing anything useful or not. If it is not doing anything useful, then we will discard the data. There is no value, the only value will be.
Govindraj Ethiraj: And what about the longevity of the data. Do you know or do we know what is going to happen to this data July-August onwards?
Lalitesh Katragadda: There are two constraints. One of the constraints is—I do not know if the latest terms of service has gone up on the web or not, I apologise, the engineers are literally not sleeping, so I do not know where it is—but the latest privacy policy drafted by our legal counsel says four things. One is, it says that this data can be used only by the necessary health authorities and the authorities delegated by them—potential surveillance officers and so on—only for battling COVID 19, not even for other infectious disease. And there is no etcetera for other purposes because we did not want this to become a runaway train. And the other three constraints are: if you are never shown to be at risk through self-assessment or coming in close contact, we throw away that data on a running basis in 30 days. Even on your phone it is wiped out within 30 days. If you are determined to be at risk, that window is expanded to 45 days. If you are virologically proved to be infected—where the virology test says you are infected, after you are cured, we keep the data for 60 days for post analysis, because analysis of the data is useful. And then all the non-anonymous person data is thrown away. And to make sure that we are talking about anonymization and not shallow anonymization because anonymized data can be de-anonymized, right? So, what we are doing is we are mixing up data of multiple people, in the space of 50-100 people, and bucketing it in large geo fences like 100-200-meter grids in dense urban areas and even larger grids in rural areas. So, there is actually data of 50-100 people in each of these buckets, and then anonymizing it and keeping that for research. Rest of the data is getting wiped out.
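The three retention windows described above (30, 45 and 60 days) can be written down as a small rule. The following Python sketch only restates the policy as described in the interview; the `Status` names and the `purge_date` function are hypothetical, and the assumption that the 60-day clock starts on the date of cure follows the phrase "after you are cured".

```python
from datetime import date, timedelta
from enum import Enum


class Status(Enum):
    NOT_AT_RISK = "not_at_risk"  # never flagged by self-assessment or contact
    AT_RISK = "at_risk"          # determined to be at risk
    INFECTED = "infected"        # confirmed by a virology test


# Windows as stated in the interview.
RETENTION_DAYS = {
    Status.NOT_AT_RISK: 30,
    Status.AT_RISK: 45,
    Status.INFECTED: 60,
}


def purge_date(status, recorded_on, cured_on=None):
    """Return the date on which a record should be deleted, or None.

    For infected users the 60-day window is counted from the date of cure,
    so no purge date exists until `cured_on` is known.
    """
    days = timedelta(days=RETENTION_DAYS[status])
    if status is Status.INFECTED:
        return None if cured_on is None else cured_on + days
    return recorded_on + days


print(purge_date(Status.AT_RISK, date(2020, 4, 1)))
```

The anonymised, bucketed research data mentioned above sits outside this rule, since it is described as being kept after the personal records are wiped.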
Govindraj Ethiraj: One key takeaway, Lalitesh, from what you have said is, the data is not going to be there in perpetuity on this database or by extension with the Government. So, it is either a 30, 45 or a 60-day window—let us say 60 days or 90 days after which the data is gone.
Lalitesh Katragadda: Yes, and there is one more thing, and I think this nuance is being lost. Most of the data never makes it to the government.
Govindraj Ethiraj: Right, you said that there is only the fringe that makes it, the rest of it on this thing. What is another app that could compare it with, Lalitesh? For people to understand what this is like?
Lalitesh Katragadda: Raman might know...I am trying to recall what other app
Govindraj Ethiraj: Where the data is primarily sitting on the phone and the exchange is only happening with the server
Lalitesh Katragadda: Most of the apps I know like Google Maps and Facebook and so on—all the data flows to the server site/side. I do not know of many apps, of this scale, where the data sits on the client side.
Govindraj Ethiraj: Raman, two points. Lalitesh's assertion that most of the data is going to be sitting on your phone and not getting transferred to the server. And the second, what I think sounds more important is, there is a window, sunset, for all this data and at the end of which it all disappears. How do you see it?
Raman Jit Singh Chima: The point to, of course remember is, based on the current design, which is why some of more boring questions that lawyers, and people tend to ask when it comes to government is: who is running this? Who is accountable for this and how does it work? The reality of course with the government or say, with the CEO in a company is that they can set their own rules, unless they are controlled by something else—like you can say the Board in the case of the company, in the case of the Government, the Parliament or something else. But why mention this is, it is an important point. There has already been tension, right, as we have seen. Between the State Governments and the Centre. Even within the central agencies—sometimes about the powers, what is happening. Is the National Disaster Management Authority in charge of this, is it the NITI Aayog, is it NIC, is it governed by the Epidemic Diseases Act or the National Disaster Management Authority Act is the important thing, I will say, lesser than the engineers who might be working round the clock, the Government needs to answer. The second other thing is some of the data that is being collected—and again I put that caveat, I am a lawyer who spent a lot of time with product engineers and others. I note that when you still try to collect large amounts of location data for predictive or later trend tracking, that is something I would be cautious about. Because what you are doing that is not something you want to keep right afterwards, but you want to potentially use for other things. Even the Singapore TraceTogether app has had that criticism, where people have said that it is very clear that the app will not continue after the pandemic, that you can make a data deletion request. This for example, in Aarogya Sethu, right now, is not fully clear if you can do so. And even with the Singapore app you can make the data deletion request. 
But there have been points where even the Singaporeans have raised concerns saying, that well, what if the Government put a retention order on it, how do I challenge it? Just imagine they are trying to do everything genuinely, but some officer for some reason decides to mess up...How do you escalate that? That is the concern. But also, to what Lalith has just said, as he mentioned, it is hard to find a parallel about what is going on here. The amount of data collection that is happening, what is being shared, what is not—that is where, as a lawyer who has worked with product people, I would sometimes be cautious. That is where you say you need to have an open conversation; you need to ask other people to have an open review and understand what the processes are. More importantly, if something goes wrong with this—my worry is not even the civil liberties consequence, but the public trust consequence. In a large diverse country like ours, if something goes wrong with this, you will see so much skepticism and worry from the general population across different states or working with the state governments or working with the ICMR, that is something we should be very wary about. And that is the biggest concern here—that if something goes wrong, it might make the situation worse. So, therefore, one has to be a little bit more and careful and perhaps consult more and more importantly, there is a global sense that the governments are slightly jumping ahead too much and focusing on these app-based solutions and perhaps not focusing on other elements, other key public health interventions. But of course, these are just initial comments, I would say one needs just right now, is open up this process a bit more, get some of these questions answered not just by the engineers but by the government itself. And maybe that is the first step that needs to be taken.
Govindraj Ethiraj: Raman, if I have understood you correctly, you are ok with the concept, you are ok with what seems to be the execution at this point but as long as the process going forward is made sufficiently open and there are opportunities for people like you to jump in and ask these questions and get them answered...
Raman Jit Singh Chima: I was going to say that, the reality is there is so much going on in this app that I cannot.....something, I immediately support with because it is contact tracing, it is predictive, it is health information added and other steps there. Secondly, noting that it requires 50% of the population to use it and therefore requires a lot of government resource and political energy to get everyone behind it, I would question if that is the most important thing to do right now, when we are having other data concerns with the government itself, right? For example, the release of COVID data and whether it is accurate. So, my take would be that the app has concerns. From a public health policy perspective, if this is what you want to double down on right now, I would be wary and note that there are other alternative approaches as well. An engineer would come and say, "Look, there are 100 different ways you build something", which is why I would say that you need to be sure that this is what you need to build, because that is going to take time and energy on behalf of the government, which at this point of time is perhaps the most precious commodity of them all.
Govindraj Ethiraj: Raman, it is possible that the government could work on many fronts, right? What Lalitesh is doing here is in some ways independent. He is in any case independent of the government and represents a bunch of interested people who are committed to do something and giving their time and effort and that ends there. And there is for other people to take over and say, is the right legal technical architecture being created so as to protect citizen's rights and to ring fence it from leaking out or being misused in any way.
Raman Jit Singh Chima: But that is exactly the concern I had. As you know, with any product that is created, it is about continuously running it, not just the initial engineering, creation, and deployment from the production network to general consumer usage. It is constant maintenance. For example, even today the shift from AWS to NIC is being considered. And that is one reason who is really investing in it. How has that happened? How are people...not just accountable but how do we actually know who pulls that....sometimes legal consequence, this person's data must be kept in the system because very clearly they are infected, we need to issue an order under the Epidemic Diseases Act, which by the way only state governments can do, for example, to look at a particular person. That is the concern that is there. And there are consequences to the 'build something and other people will come to run it' approach. I would say we should learn more from history, and more importantly I was struck by the fact that many people who looked at the use of data solutions for the Ebola crisis have all said to be very cautious in investing resources into things like this more generally. In the case of the Singapore app, its use case is very clear: basically for the populace to know what is going on. And as Lalitesh has mentioned, that is not very appropriate to India; it may lead to panic. In that case it is very clear. In the case of other apps in China and elsewhere, they tried to do many things, and of course the difference between us and China is that ultimately we are a bit limited in what people can do in a democracy and we do worry about the long-term consequences of what we can do. We must respond to a pandemic, but what continues afterwards should not be a basic violation of our idea of democracy.
Govindraj Ethiraj: We are running out of time, and there will be some evolving questions that I will be coming back to you both with. Lalitesh, can I give you the last word to respond on these two points. One is who owns the data today and likely tomorrow? And from the point of data deletion, how do I as a citizen ensure that once things are ok or I am better, I want to get out of the system in a way I understand it across this country and it is easy to do.
Lalitesh Katragadda: I think the data deletion policies are very clear. If you uninstall the app, the data is gone, because the data is sitting on your client. The other point that is being raised, that this app is useful only if 50% of the population installs it, is slightly untrue. The app is immediately useful to you the day you install it because you can report your symptoms and get help. This app is also useful for figuring out, completely anonymously, where the hotspots in the city might be, the locations where the infection may be spreading, even if only 10% of the people install it, because that is what it takes statistically. For contact tracing, you need to get to 40-50%, I fully agree. But contact tracing is again sitting on your client and the data is as safe as the phone is with you. And having said that, there is a fear of big brother and the fear of the government changing things later, and that is a question I am not qualified to answer. But to answer one specific question, this data is currently under the control of NIC; even though the servers are not in NIC, the data is controlled entirely by NIC. This is a Ministry of Electronics and Information Technology (MeitY) app...
Govindraj Ethiraj: This app reports to someone...is it the Ministry of Health?
Lalitesh Katragadda: Yes, the stakeholders are the Ministry of Health...[unclear]...all of these are the stakeholders we are corresponding with. But the entity responsible for maintaining the app and making sure that it works the way it is supposed to is NIC, inside MeitY.
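A short worked calculation may help explain why contact tracing is said above to need far higher adoption than hotspot detection. A Bluetooth contact is only logged when both people have the app. Assuming, purely for illustration, that app users are spread evenly and independently through the population (an assumption not made in the interview), the share of contacts that can be logged is the adoption rate squared. The function name below is hypothetical.

```python
def detectable_contact_fraction(adoption):
    """Fraction of person-to-person contacts in which both sides run the app.

    Assumes adoption is independent between the two people in a contact,
    which is a simplification: real adoption clusters by region and age.
    """
    if not 0.0 <= adoption <= 1.0:
        raise ValueError("adoption must be between 0 and 1")
    return adoption * adoption


for rate in (0.10, 0.40, 0.50):
    print(f"{rate:.0%} adoption -> {detectable_contact_fraction(rate):.0%} of contacts")
```

Under this assumption, 10% adoption captures only about 1% of contacts while 50% adoption captures about 25%, which is consistent with the different thresholds quoted for hotspot detection and contact tracing.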