60 crore emails and other personal data stolen and sold: Decoding Cyberabad data theft
In March, a Hyderabad-based social worker approached the Cyberabad police, one of the three police commissionerates in Hyderabad, with a tipoff that personal data was being sold by someone on JustDial, an internet technology company that provides area-wise contact details of various service providers. The police hatched a plan and directed the social worker to pretend to buy data from the seller. When the police gave the green signal for the entrapment, little did they know they would stumble upon a large-scale crime involving the theft of confidential information.
Once the transaction took place, the police swung into action. Over the next few weeks, they analysed the data and traced the seller. Their investigation led to the arrest of Vinay Bhardwaj, a Faridabad-based web-designer-turned-data-thief. Vinay was found to be involved in procuring, holding, and selling personal and confidential data of more than 60 crore individuals. The data in his possession included personal information of customers of various organisations across 24 states and eight metros. The police also revealed that the accused used data scraping tools to access the details. The police found that much of the stolen data belonged to major companies like Facebook, Big Basket, PhonePe, Tech Mahindra, Club Mahindra, and Policy Bazaar. Data from various banks, such as State Bank of India, Axis Bank, and Bank of Baroda, was also found in the database in Vinay’s possession.
Though the police claim that the data belongs to around 66.9 crore individuals, some experts say that the numbers look inflated and the real figure can be ascertained only after the database is fully analysed.
According to the police, Vinay had established a business and collected databases from two people and then used social media platforms to sell the data to fraudsters for profit. He was also operating through a website called InspireWebz. The main customers who bought the data include advertising and marketing agencies. PhonePe, however, told Moneycontrol that there had been no leak and that its data is safe.
However, Vinay is just a cog in the wheel of a very lucrative business. Speaking to TNM, Kalmeshwar Shingenavar, DCP Crimes, Cyberabad police, explained, “Data is the new gold for any business. Everyone wants to get their hands on data.”
Once the police realised the extent of the data theft, they wanted to get to the root of the issue. That led the investigation to the companies whose data was found in Vinay’s possession. Based on the analysis of the data, the Cyberabad police issued notices to a total of 21 organisations and asked them to be present at the Cyberabad commissionerate on given dates. The companies will be shown the recovered data to verify if it is indeed stolen from their database. Then the police will work together with the organisations to trace where the data theft occurred.
TNM has learnt that many of the companies summoned by the Cyberabad police weren’t even aware of a possible data leak from their servers. While some denied any data theft, they were shocked to find that the data collected was in the same format as their database.
What kind of personal data was stolen?
The data included various kinds of personal information that customers enter when downloading an app or signing up for a service from banks or other organisations. Vinay had in his possession data pertaining to students enrolled in Byju’s and Vedantu; information related to 1.84 lakh cab users across metros; data of salaried employees in six cities and the state of Gujarat; and customer details such as address, mobile number, date of birth, and email ID from websites like Amazon, Netflix, YouTube, Paytm, PhonePe, and Big Basket. The data also included sensitive information such as declared income, first swipe amount, second swipe amount, bank code, educational background, and marital status.
Speaking to TNM, L Srikanth, a public interest technologist, said that the numbers projected and the claim that data from so many companies was leaked may not be entirely true. “It is often cheaper to create data. In India, ‘data factory shops’ that employ people to generate counterfeit data are not uncommon. In the current case, while it is possible that some of the data may be genuine, a lot of it is bound to be fake as well. Those who buy this data also know that if it works it works.”
“Hence the police’s claim that data of around 60 crore people was leaked may not be entirely correct. The integrity of the data needs to be analysed. Often it can so happen that the same data has been entered multiple times. Duplicity also leads to inflated numbers,” Srikanth added.
However, the police believe it is unlikely that fraudsters would spend time generating data in the same format as that of the companies named. According to them, the data must therefore have leaked at some point in the chain.
How do data thefts happen?
According to the police, there are three main ways in which data is stolen or leaked from companies:
- When the company’s database is compromised. This happens when hackers break into the database to steal data or to demand a ransom in return for the data collected. For example, e-grocery startup Big Basket admitted in November 2020 that it had faced a security breach that compromised the data of almost 20 million users.
- When an employee of the organisation who has access to the data sells it without the knowledge of the employer.
- When data leaks from third-party vendors hired by the organisation. The vendors could be hired to carry out verification processes or other tasks for which the company’s data is shared with them. For example, a company called Matrix Pvt Ltd was engaged for tele and field verification by six banks. The police suspect that a field verification agent from Matrix stole the data and sold it.
According to the police, though it is mandatory for companies to inform customers when their data is shared with a third-party vendor, this requirement is conveniently bypassed.
Cyberabad Police Commissioner Stephen Ravindra had outlined how stolen data can be misused. Addressing the media about another similar case of data theft in which seven people were arrested last month, he had said, “The data of defence and government employees can be used for espionage and impersonation and … may jeopardise national security. Data related to PAN cards can be used to commit serious offences.”
The police believe that a little care can go a long way in protecting one’s data. “Download only trustable apps and do not give access to unnecessary permissions requested by apps,” explained a source in the investigating team. The police say that investment, trading, loan, and matrimonial apps and websites are usual targets of data theft.
What next?
The next step for the Cyberabad police is tracing the origin of the data leak. But Srikanth believes that full traceability is difficult in cases of data theft.
“Unlike a money trail where everything can be dated back and traced, in the case of a data leak even if they do manage to find when the leak happened, it would be extremely difficult to find how it happened. Sensitisation and understanding the importance of data governance is the only outcome from this.”
Even as the investigation continues, the biggest challenge for the police is that they have managed to reach only the fourth level in the chain that began with Vinay Bhardwaj. With each level cracked, they realise that the chain is very long and the investigation time-consuming and exhausting.
According to the police, most companies that collect data do so for business purposes. While they do have the expertise to detect data theft and secure their customers’ information, this is often secondary to their business goals.
These incidents of data theft gain significance as India still does not have a data protection law. The Digital Personal Data Protection Bill, 2022 has not yet been presented in Parliament. Many experts believe that even if a strong data protection law is passed, enforcement will remain a challenge. “It is best to evolve best practices and processes to prevent such data leaks. Data minimisation – where very little or only the most necessary data is collected should be the way forward. Today, the government regulates the bulk of the data. When it comes to data minimisation, the government itself is the biggest violator. Expecting companies to comply will take a long while,” said Srikanth.
TNM has written to State Bank of India about the data leak; this article will be updated if we receive a response.