
On January 20, startling news hit the world, and the AI community in particular: a new AI model could reportedly perform as well as the leading models at a fraction of the cost and time required to build them. The news sent the AI community and the industry’s leaders scrambling to learn about a model that seemed to take over the AI space in the US. A record-breaking number of downloads drew even more attention to it. US President Donald Trump called it a “wake-up call” for competing US companies. The news also hit giants like NVIDIA, which lost about $600 billion in market value within a day, the largest single-day loss for any company in US stock market history. In short, DeepSeek arrived breaking records, and that did not sit well with the AI giants.
Following DeepSeek’s release, everyone wanted to try the model and compare it with leading models such as ChatGPT, Gemini, and Claude. The race was on to determine which model was better. For my own comparison, I asked ChatGPT itself. It favoured itself as the better model, but its responses were otherwise fairly objective, crediting DeepSeek for its strengths in reasoning-heavy tasks and programming. Researchers also had to run their own tests to determine whether a model could really do what DeepSeek claimed without any foul play. Alongside these routine tests, security researchers raised concerns about data privacy, because DeepSeek collects massive amounts of user data that is stored on servers in China.
Security researchers at Wiz, a New York security firm, discovered a database that was exposed to the internet and contained a large volume of user data, reportedly more than a million log entries. The publicly accessible database was reportedly easy to find: it was open to the internet and required no authentication, presenting a trove of data on a platter. It was hosted at oauth2callback.deepseek.com:9000 and dev.deepseek.com:9000. The data included software keys, chat logs, backend details, and other sensitive information. Because the database was so easy to discover, it is widely believed that Wiz was not the only party to find it. The leak was made more severe by the level of unauthorized access it gave anyone who found it. According to Wiz, “The exposure allowed for full database control and privilege escalation without authentication or any form of defense to the outside”.
The implications of this exposure have been nothing short of severe, given the amount of data exposed and the trust and enthusiasm placed in the model, which was reflected in the number of downloads within 24 hours. The fallout has been commensurate, with several countries banning the model shortly after learning of the leak. Earlier concerns about data residency, and about user data being processed by an AI company based in China, did not help matters. User data stored and processed in China is subject to the authority of the Chinese government, which could expose sensitive information to a rival government. The Italian Data Protection Authority, known locally as the Garante, ordered the company to cease operations in Italy on January 30. The US issued strong warnings to users about the risks of using the model and strongly advised government employees to stay away from it altogether.
This incident has led to a global call for proper data privacy vetting of AI models, since they handle the personal data and other information of millions of users. Regulators and security experts are concerned about the safety of users’ data, and it is an area every AI company has to take seriously. For DeepSeek, it is unclear what happens next. Will it be able to redeem itself after this disappointing and disgraceful incident?