Millions of files with potentially sensitive information exposed online, researchers say

A survey by Censys found 314,000 distinct internet-connected devices and web servers with open directory listings.

Thousands of computers and other internet-connected devices are exposing millions of files with potentially sensitive data across the internet, either inadvertently or on purpose, leaving the data discoverable and potentially exploitable in any number of ways, an analysis published Wednesday found.

Researchers with Censys, a service that indexes devices connected to the internet and the services they’re running, recently indexed nearly 314,000 distinct internet-connected devices and web servers with open directory listings and at least one file. The scanner then took note of file names, paths, file sizes and last-modification timestamps, creating what the company calls “one of the most comprehensive databases of all open directories on the internet.”

The analysis found hundreds of devices containing database backups, for instance, as well as devices “serving millions of files with common spreadsheet file extensions.” An examination of the spreadsheet filenames turned up more than 9,000 whose names suggest financial data, along with thousands of other files that could contain authentication and credential data, network packet captures, and more.
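Censys has not published the rules it used to sort filenames, so the following is only a minimal sketch of how triage by name alone might work. The extension sets, keyword lists and function names are all hypothetical and chosen for illustration; a real survey would need far larger lists and would still produce false positives (the keyword "tax" also matches "syntax", for example).

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension and keyword lists, for illustration only.
SPREADSHEET_EXTS = {".xls", ".xlsx", ".csv", ".ods"}
BACKUP_EXTS = {".sql", ".bak", ".dump"}
CAPTURE_EXTS = {".pcap", ".pcapng"}
FINANCE_WORDS = ("invoice", "payroll", "budget", "tax", "bank")
CREDENTIAL_WORDS = ("password", "passwd", "credential", "secret")


def classify(path: str) -> set[str]:
    """Return category labels suggested by the file name alone.

    The file contents are never opened, mirroring the article's point that
    the researchers looked only at names, paths, sizes and timestamps.
    """
    name = PurePosixPath(path).name.lower()
    ext = PurePosixPath(name).suffix
    labels = set()
    if ext in SPREADSHEET_EXTS:
        labels.add("spreadsheet")
        # Only spreadsheets are checked for finance-related keywords.
        if any(word in name for word in FINANCE_WORDS):
            labels.add("financial")
    if ext in BACKUP_EXTS:
        labels.add("database_backup")
    if ext in CAPTURE_EXTS:
        labels.add("packet_capture")
    if any(word in name for word in CREDENTIAL_WORDS):
        labels.add("credentials")
    return labels


def summarize(paths) -> Counter:
    """Count how many paths fall into each category."""
    counts = Counter()
    for path in paths:
        counts.update(classify(path))
    return counts
```

Calling `summarize()` on a list of indexed paths gives a per-category tally of the kind the analysis reports, though the real figures depend entirely on the rules chosen.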

The Censys researchers noted that they did not view the contents of the files, and did only enough to document the current state of the problem.


“From our perspective, this data indicates that there is a potential goldmine of database-related information exposed on the internet that could be used by malicious parties to exploit weaknesses, compromise sensitive information, and launch targeted attacks,” the researchers said.

Files being exposed online in this manner is an established and well-documented phenomenon, the researchers noted. An analysis of the last-modified timestamps shows “that most of the data was created or modified in 2023, illustrating that this old problem is still going strong even as organizations become more security-conscious.”

The exposed files are available via open directory listings, which are folders on web servers that list and link to all files on a given system. The directories are typically not openly accessible, but sometimes they end up open anyway, whether on purpose for administrative or performance reasons or inadvertently due to configuration errors. The practice of finding open directories is a hobby for some, but data gleaned from the exposures could lead to serious damage or more serious cyberattacks.
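For readers wondering what such a listing looks like to a program, here is a rough sketch of how an auto-generated index page might be recognized. It assumes the page title begins with "Index of", which is what the Apache and nginx auto-index features produce by default; other servers format listings differently, and this is not a description of how Censys detects them. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect the page title and every link target from an HTML page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_listing(html: str):
    """Return the listed entries, or None if the page is not a listing.

    Assumption: a listing is any page whose title starts with "Index of".
    """
    parser = ListingParser()
    parser.feed(html)
    if not parser.title.strip().lower().startswith("index of"):
        return None
    # Drop column-sort links ("?C=N;O=D"), parent-directory links and
    # absolute paths, keeping only entries inside the listed folder.
    return [h for h in parser.links if not h.startswith(("?", "/", ".."))]
```

Entries ending in a slash are subfolders, so a crawler built on this idea would follow those links in turn to map out everything a server exposes.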

“For defenders, open-directories can inadvertently expose sensitive information like development artifacts, backups and other sensitive information,” said Silas Cutler, a security researcher with Stairwell and a member of the Ransomware Task Force.

Data exposures via misconfigurations can have major consequences. Health insurance data associated with roughly 56,000 Washington, D.C. residents — including prominent officials and members of Congress, and their families — was downloaded and posted on a cybercriminal forum in March. The attackers in that case told CyberScoop the data was essentially sitting in the open, and a subsequent analysis confirmed a misconfiguration was to blame.


Open directories also aid researchers and others trying to fight crime and state-aligned hacking threats. In March, an anonymous security researcher found sensitive personal data for more than 550,000 users of a website for the buying and selling of guns after the hackers in that case left the data on an open server, according to TechCrunch.

“In a Forrest Gump way, open web directories are like a box of chocolates, sometimes it’s a repository of Linux images, sometimes it’s a nation state threat actor that made a mistake,” Cutler said, pointing to a report he published last month at Stairwell detailing data found this way from an exposed server being used to deploy the Akira ransomware variant.

Written by AJ Vicens

AJ covers nation-state threats and cybercrime. He was previously a reporter at Mother Jones. Get in touch via Signal/WhatsApp: (810-206-9411).
