When testing web applications, we often find that some sensitive files are accessible without authentication. These are primarily non-html files such as text, pdf, doc, xls, ppt, etc. This happens mainly because the web application is not designed to deliver such files in a secure manner. Instead, the web server is configured to serve these requests directly. Hence, when a client requests a non-html file, the web server simply retrieves the file from the specified location and returns it.
We have observed in many applications that some common methods of handling non-html content lead to such insecurities. These methods include:
Lack of validation of session credentials before serving the request
Storing files in a public folder on the web server
Creating files with predictable names
Caching of files on the client machine
Let's see how we can address these issues. As a best practice, we can make the filename unpredictable. This prevents an adversary from guessing filenames and obtaining unauthorized access to the files. But even if the filename is unpredictable, an adversary can still reach the file through the browser history, where an authorized user's previous requests are stored, or through the browser cache, where the entire file itself may have been cached. This is possible because the web application does not check whether the request has come from an authorized user. So the use of unpredictable filenames alone doesn't really secure the files.
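As a rough illustration of the unpredictable-filename practice, here is a minimal Python sketch. The function name `unpredictable_filename` is hypothetical, and the token length of 32 random bytes is an assumption rather than a recommendation from this article. As noted above, this only makes guessing harder and does not replace an authorization check.

```python
import secrets


def unpredictable_filename(extension: str) -> str:
    """Return a hard-to-guess filename with the given extension (hypothetical helper)."""
    # token_urlsafe(32) encodes 32 bytes from the OS's cryptographic random source
    token = secrets.token_urlsafe(32)
    # Accept both "pdf" and ".pdf" as input
    return f"{token}.{extension.lstrip('.')}"
```

Calling `unpredictable_filename("pdf")` yields a different name on every call, so a name seen for one file reveals nothing about the name of another.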
The most secure manner of handling non-html content is to:
serve the request only after session validation
create the file dynamically and stream the data to the browser
set an appropriate Content-Type header (e.g. if the file is a text file, then send Content-Type: text/plain), as the browser renders the content based on this header
prevent the caching of the files by setting the “no-store” and “no-cache” directives in the Cache-Control response header
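The four measures above can be sketched together in a framework-independent way. The following Python example is only an illustration under stated assumptions: `ACTIVE_SESSIONS` and `DOCUMENTS` are hypothetical in-memory stand-ins for a real session store and document storage, and `serve_document` and `stream_chunks` are made-up names, not part of any library. A real application would also check that the authenticated user is allowed to see the specific document requested.

```python
import io
from typing import Dict, Iterator, Optional, Tuple

# Hypothetical stand-ins for a real session store and document storage
ACTIVE_SESSIONS = {"session-abc": "alice"}
DOCUMENTS = {"report": ("text/plain", b"quarterly figures\n")}

# Directives that tell the browser not to keep a copy of the response
NO_CACHE_HEADERS = {"Cache-Control": "no-store, no-cache"}

Response = Tuple[int, Dict[str, str], Iterator[bytes]]


def stream_chunks(data: bytes, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the content piece by piece instead of exposing a file on disk."""
    buffer = io.BytesIO(data)
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        yield chunk


def serve_document(session_id: Optional[str], doc_id: str) -> Response:
    """Return (status code, response headers, body iterator) for a request."""
    # 1. Serve the request only after session validation
    if session_id not in ACTIVE_SESSIONS:
        return 401, dict(NO_CACHE_HEADERS), iter(())
    if doc_id not in DOCUMENTS:
        return 404, dict(NO_CACHE_HEADERS), iter(())
    content_type, data = DOCUMENTS[doc_id]
    # 3. Set the content type and 4. prevent caching
    headers = {"Content-Type": content_type, **NO_CACHE_HEADERS}
    # 2. Stream the data to the browser
    return 200, headers, stream_chunks(data)
```

Because the document is looked up by an identifier inside the application rather than by a path under the web server's public folder, a request without a valid session never reaches the content.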
We also discussed this topic in this quiz entry in August 2004.