Microsoft’s AI research division inadvertently exposed 38 terabytes of sensitive data, including secret keys, passwords, and internal Microsoft Teams messages. The mishap occurred when the company published a link to an Azure storage bucket of open-source training data in a public GitHub repository, putting internal employee data within reach of anyone who knew how to find it.

The Exposure

While attempting to share open-source code and AI models for image recognition, Microsoft accidentally granted access to a much broader swath of data stored on Azure Storage. According to Wiz, the cloud security startup that discovered the oversight, the URL was misconfigured to grant “full control” rather than “read-only” permissions. As a result, anyone with the link could potentially manipulate or delete existing files, or inject malicious content into Microsoft’s storage account.
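To make the distinction concrete, here is a minimal sketch using Azure’s Python SDK (azure-storage-blob); the account, key, and container names are hypothetical. The first token is scoped to read and list operations on a single container and expires within an hour; the second mirrors the dangerous pattern, an account-level token with write and delete rights and a far-future expiry.

```python
from datetime import datetime, timedelta
from azure.storage.blob import (
    AccountSasPermissions,
    ContainerSasPermissions,
    ResourceTypes,
    generate_account_sas,
    generate_container_sas,
)

# Hypothetical names, for illustration only.
ACCOUNT_NAME = "exampleresearch"
ACCOUNT_KEY = "<storage-account-key>"
CONTAINER = "ai-models"

# Narrowly scoped: read/list on one container, expiring in one hour.
read_only_sas = generate_container_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)

# The dangerous pattern: an account-level token with write and delete
# rights over every container, valid for decades. Anyone holding the
# URL can modify or destroy data until the signing key is rotated.
full_control_sas = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(
        read=True, write=True, delete=True, list=True
    ),
    expiry=datetime(2051, 1, 1),
)

share_url = (
    f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}?{read_only_sas}"
)
```

Because the permissions and lifetime are baked into the signed URL itself, nothing downstream enforces least privilege; whoever generates the token decides how much is exposed.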

A Closer Look at the Leaked Data

The exposed data included personal backups from two Microsoft employees’ computers and more than 30,000 internal Teams messages involving hundreds of employees, alongside passwords to Microsoft services and other secret keys. All of it was accessible because of an overly permissive shared access signature (SAS) token embedded in the Azure Storage URL.
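A SAS token’s scope is visible in the URL’s query string: the sp parameter lists the granted rights (r for read, w for write, d for delete, l for list, and so on) and se carries the expiry. The sketch below, using a fabricated URL in place of the real one, shows how a link can be inspected before it is shared.

```python
from urllib.parse import parse_qs, urlparse

# A fabricated SAS URL for illustration; the leaked URL is not reproduced here.
sas_url = (
    "https://exampleresearch.blob.core.windows.net/ai-models"
    "?sv=2021-08-06&ss=b&srt=sco&sp=rwdl&se=2051-10-06T00:00:00Z&sig=REDACTED"
)

params = parse_qs(urlparse(sas_url).query)
permissions = params.get("sp", [""])[0]  # signed permissions, e.g. "rwdl"
expiry = params.get("se", [""])[0]       # signed expiry timestamp

# Anything beyond read ("r") and list ("l") deserves scrutiny before
# the link is published anywhere.
extra = set(permissions) - {"r", "l"}
if extra:
    print(f"Token grants more than read access (sp={permissions}), "
          f"valid until {expiry}")
```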

Swift Action by Microsoft

After Wiz notified Microsoft of the security lapse on June 22, the tech giant took just two days to revoke the problematic SAS token. Microsoft has since expanded GitHub’s secret scanning service to monitor for SAS tokens with overly permissive privileges or unusually long expirations. The company emphasized that no customer data was compromised and no other internal services were affected.
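Revocation here is blunt by design: an account SAS token is signed with one of the storage account’s two access keys, so the only way to invalidate it is to rotate the signing key, which kills every token derived from that key at once. A minimal sketch using Azure’s management SDK (azure-mgmt-storage), with hypothetical subscription and resource names:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

# Hypothetical subscription and resource names.
client = StorageManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
)

# Regenerating key1 immediately invalidates every SAS token that was
# signed with it, including any that leaked.
client.storage_accounts.regenerate_key(
    resource_group_name="example-rg",
    account_name="exampleresearch",
    regenerate_key={"key_name": "key1"},
)
```

Any pipeline still relying on the old key, or on tokens it signed, must be issued fresh credentials afterward, which is part of why narrowly scoped, short-lived tokens are preferable in the first place.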

Lessons Learned

Wiz CTO Ami Luttwak pointed out that as companies like Microsoft race to build AI solutions, they need to impose additional security checks and safeguards. Given the vast amounts of data that development teams must handle and share, this incident is a pointed reminder for tech companies to harden how that data is published and to scan for credentials before they leave the building.
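One practical safeguard in that spirit, similar in intent to the secret scanning Microsoft expanded, is to sweep a repository for SAS-style URLs before anything is published. The sketch below is a rough heuristic, not a substitute for a real secret scanner; the pattern is deliberately simple and will miss obfuscated tokens.

```python
import pathlib
import re

# Rough heuristic: an Azure blob endpoint whose query string carries a
# SAS signature parameter ("sig=").
SAS_PATTERN = re.compile(
    r"https://[a-z0-9]+\.blob\.core\.windows\.net/\S*?\bsig=[^\s\"'&]+"
)

for path in pathlib.Path(".").rglob("*"):
    if not path.is_file():
        continue
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        continue
    for match in SAS_PATTERN.finditer(text):
        print(f"{path}: possible SAS URL -> {match.group(0)[:60]}...")
```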

Wider Implications

This security lapse at Microsoft is not an isolated event. In July 2022, JUMPSEC Labs highlighted similar risks posed by misconfigured Azure storage accounts. And just two weeks before this disclosure, Microsoft revealed that its systems had been compromised by hackers based in China. As technology evolves, it is increasingly essential for organizations to remain vigilant about data security.