In a turn of events, Microsoft AI researchers unintentionally exposed tens of terabytes of sensitive data while publishing an open-source training data storage bucket on GitHub. This accidental exposure has raised significant concerns regarding data security within one of the world鈥檚 tech giants.
听
Discovery by Cloud Security Startup Wiz
听
Cloud security start-up Wiz uncovered this security lapse during their ongoing investigation into cloud-hosted data exposures. According to reports, Wiz stumbled upon a GitHub storage file belonging to Microsoft鈥檚 AI research division. What they found was deeply concerning.
The GitHub file was intended to provide open-source code and AI models for image recognition, and users were directed to download these models from an Azure Storage URL. However, an oversight led to the misconfiguration of this URL, granting permissions not only to the intended data but to the entire storage account, revealing a host of sensitive information.
听
The Extent of the Data Leak
听
Among the exposed data, a staggering 38 terabytes of sensitive information came to light. This included personal backups from the computers of two Microsoft employees, laying bare personal data and potentially compromising their privacy.
Furthermore, the data breach exposed passwords to various Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from hundreds of employees.
听
Configuration Errors Amplify the Risk
听
The gravity of the situation was exacerbated by configuration errors. The exposed URL was not set to 鈥渞ead-only鈥 permissions but rather to 鈥渇ull control,鈥 opening the door for malicious activities.
Those aware of the issue could potentially delete, replace, or inject malicious content into the exposed data, further heightening the risks associated with this incident.
听
More from News
- From Workouts To Managing Jetlag: The British Tech Scale-Up That Just Hit One Million Users Globally Appoints New CEO
- Hackers Tricked Instagram鈥檚 AI To Leak Your Log In Details 鈥 How Can Users Stay Protected?
- New Research Reveals The UK鈥檚 Top 10 鈥淔uture-Ready鈥 Cities
- New Research Shows How Elections Are Impacting The Job Market 鈥 Here鈥檚 How
- Is London Becoming The World鈥檚 Next AI Capital?
- Google鈥檚 AI Can鈥檛 Even Spell 鈥淕oogle鈥 鈥 So Why Is It Replacing Search?
- Will AI Labels Actually Save YouTube From AI Slop?
- The Rise Of 鈥淣ew Brand鈥 Cybercrime Groups And The Business Of Ransomware
听
Shared Access Signature Token (SAS) Oversight
听
Notably, the storage account itself was not directly exposed. Instead, the oversight came from an overly permissive shared access signature (SAS) token embedded within the URL. SAS tokens are a standard mechanism used by Azure to create shareable links that grant access to an Azure Storage account鈥檚 data.
This lax oversight of SAS tokens raises concerns about the security protocols in place within Microsoft鈥檚 AI development teams.
听
Microsoft Responds and Expands GitHub鈥檚 Security Measures
听
Upon discovering the extent of the data exposure, Wiz promptly notified Microsoft on June 22, and Microsoft took immediate action by revoking the SAS token on June 24. Subsequently, the tech giant conducted an investigation into the potential organisational impact, which concluded on August 16.
Microsoft responded to the incident in a blog post, saying that 鈥渘o customer data was exposed, and no other internal services were put at risk.鈥 As a direct response to the incident, Microsoft announced the expansion of GitHub鈥檚 secret spanning service. This enhancement will monitor all public open-source code changes, focusing on the inadvertent exposure of credentials and other secrets, including SAS tokens that may have overly permissive expirations or privileges.
听
The Broader Implications
听
This incident serves as a reminder of the challenges tech companies face as they harness the power of AI and handle vast amounts of data. As data scientists and engineers rush to develop cutting-edge AI solutions, extensive security checks and safeguards are more important than ever. The incident at Microsoft shows the growing difficulty in monitoring and preventing data breaches as data manipulation, sharing, and collaboration become integral aspects of AI development.
In conclusion, the accidental data exposure by Microsoft鈥檚 AI research division raises critical questions about data security in the AI era and serves as a cautionary tale for organisations worldwide. It underscores the importance of robust security measures and vigilant oversight in an age where data is not just an asset but also a potential liability.