Cached Pages

By Paladion

October 16, 2004

How can an application ensure that its pages are not cached or left on the client after a user has logged out?

  1. Set pragma: no-cache
  2. Set page expire = -1
  3. Set cache-control: no-cache, no-store
  4. Set cache-control: must-revalidate

The best answer to the quiz is “3. Set cache-control: no-cache, no-store”.

Browser caches are a common source of information leakage. Authenticated pages get cached by the user’s browser, and an attacker with access to the machine can view them in the browser cache (in IE, that’s the Temporary Internet Files folder). As we shall see, this is a difficult problem to solve in practice, because not all browsers obey the rules uniformly.

The pragma: no-cache statement is perhaps the most commonly misused directive for preventing caching. The HTTP specification defines it only for requests and leaves its meaning in a response unspecified, so browsers are not required to honor it, and their handling of it varies. The pragma statement can be set in the HTTP header, or as a meta tag in the HTML page, as

<meta http-equiv='Pragma' content='no-cache'>

Either way, this does not ensure the page is not cached.

Setting a page to expire in the past (option 2) is a neat trick to ensure that the browser does not serve that page from the cache. But the browser still stores the page in the cache folder; on each visit it simply fetches a new copy from the server and updates its local copy. Depending on cache usage, expired pages are removed only after some time (minutes, hours or even days later). Until then, an attacker can view those files directly in the browser cache.
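To make the expire-in-the-past trick concrete, here is a minimal Python sketch that builds such a header. The function name `expired_header` and the one-day offset are assumptions chosen for illustration; any date in the past has the same effect.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expired_header(now=None):
    """Return an Expires header dated one day before `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    past = now - timedelta(days=1)
    # usegmt=True renders the zone as "GMT", the form HTTP dates use.
    return ("Expires", format_datetime(past, usegmt=True))


name, value = expired_header()
print(f"{name}: {value}")
```

Note that this header only affects whether the browser reuses the page, not whether the page is written to disk.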

HTTP 1.1 introduced cache-control directives to fine-tune the interaction between the application and the caches. Option 4, cache-control: must-revalidate, is an example. It requires the browser to check with the server before serving a stale page from its cache. This directive still allows the browser to keep and use a local copy, so long as it checks with the server before using it.
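The revalidation round trip can be sketched from the server's side. This is a simplified illustration, not a full HTTP implementation: the function `respond`, the truncated SHA-256 ETag, and the added max-age=0 (so that the copy is treated as stale immediately) are all assumptions made for the example.

```python
import hashlib


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a page that must be revalidated."""
    # Hypothetical validator: a short hash of the page content.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "Cache-Control": "max-age=0, must-revalidate",
        "ETag": etag,
    }
    if if_none_match == etag:
        # The browser's copy is still current: send no body and
        # let the browser reuse the page it has stored locally.
        return 304, headers, b""
    return 200, headers, body


page = b"<html><body>Account summary</body></html>"
status, headers, _ = respond(page)                 # first visit: full page
status_again, _, _ = respond(page, headers["ETag"])  # revisit: revalidation
```

The 304 answer on the second call is the point: the server confirms the copy, and the copy itself stays in the browser cache where an attacker can find it.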

The cache-control: no-cache, no-store twin directives are also part of the HTTP 1.1 specification. Strictly speaking, no-cache tells the browser not to reuse a cached copy without first checking with the server, and no-store tells it not to store the page at all, which together are just what we wanted. This works in most cases. Two caveats, though: first, for non-HTML content (such as PDF files), IE does not obey these instructions (Firefox and others do); second, old browsers that support only HTTP 1.0 also do not obey them.
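As a minimal sketch of option 3, the following Python WSGI application sends the two directives with an authenticated page. The names `no_cache_headers` and `app` are hypothetical, and the extra Pragma and Expires headers are an assumption added as a best-effort hint for HTTP 1.0 browsers; as discussed above, they do not guarantee anything on their own.

```python
def no_cache_headers(include_http10_fallback=True):
    """Return headers asking the browser not to reuse or store a page."""
    headers = [("Cache-Control", "no-cache, no-store")]
    if include_http10_fallback:
        # Best-effort hints for clients that ignore Cache-Control.
        headers.append(("Pragma", "no-cache"))
        headers.append(("Expires", "-1"))
    return headers


def app(environ, start_response):
    """Tiny WSGI application serving one page with the headers above."""
    body = b"<html><body>Account summary</body></html>"
    headers = [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ]
    headers += no_cache_headers()
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should serve the page so the response headers can be inspected in the browser.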
