When is Index.dat not Evidence of Browsing

It is easy to fall into familiar habits as a human being: we see patterns in what we do and expect those patterns to persist. However, when these patterns can be the difference between a person keeping or losing their job, we need to make sure we are being as vigilant as possible.

During the course of creating a forensics CTF which would be made available to 28,000 14-to-18-year-olds, an image was taken of a Windows 7 machine. The learning objective of this challenge was to show program execution. In order to make the image more authentic, a controlled amount of user activity was scripted and carried out by the author.

The challenge was created, tested and then made available to the target audience. During the event we received a message from one of the players telling us they had found “evidence of inappropriate browsing”. While the player reporting this was doing so in a ‘tongue-in-cheek’ way, we took this very seriously.

While I was confident the author had not acted inappropriately, proof was needed to show she hadn’t.

The following screenshot was all of the evidence that was provided:

We know that the various Index.dat files track user browsing activity, so looking at the screenshot, we can see why this was the player's first assumption.

The player reported that they had run Autopsy against the image and then looked through the URLs that Autopsy reported. This URL was listed alongside legitimate browsing activity because the search function used was a regular expression looking for anything matching that structure.
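To illustrate why a pattern-based search can mix an artefact like this in with real browsing history, here is a minimal Python sketch. The pattern and both sample strings are my own assumptions for illustration: this is not the expression Autopsy uses, and the second sample is made up rather than taken from the image.

```python
import re

# Hypothetical, deliberately loose pattern: a scheme-like prefix, a colon,
# an optional "//", then something shaped like a dotted domain name.
URL_LIKE = re.compile(
    r"\b[a-z][a-z0-9+.-]*:(?://)?[a-z0-9-]+(?:\.[a-z0-9-]+)+",
    re.IGNORECASE,
)

def url_like_hits(text):
    """Return every substring of text that looks URL-shaped to the pattern."""
    return [match.group(0) for match in URL_LIKE.finditer(text)]

# Made-up samples: one ordinary web address, one cache-style entry.
samples = [
    "http://www.example.com/index.html",
    "ietld:example.co.uk",
]

for sample in samples:
    print(sample, "->", url_like_hits(sample))
```

Both samples produce a hit, which is the point: a structural match says nothing about whether a user ever visited the address.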

My initial observation concerned ‘ietld’. This seemed an odd thing to be at the beginning of a URL, and it became the first focus of the investigation. Additionally, we could see that Autopsy was reporting ‘Internet Explorer’ as being related to this artefact.

IETLD

The first step was to see if there was a quick win for this. Had it been seen before? Did Microsoft have a knowledge base article on this?

As many people in the forensics world will know, the results were not as helpful as we had hoped, with Yahoo Answers providing:

While this is factually accurate, it is not overly helpful.

This forensics site shed the most light on the situation. As the screenshot below shows, it was still not a complete answer, but at the very least we could see that the artefact had been seen before. It was the first confirmation that this file was expected behaviour and not indicative of user browsing.

This find was a relief, but was not a good enough answer for a company that prides itself on going that extra mile when it comes to all things security. From here I was determined to find out why this file exists and what it is used for.

Domain

Before following the rabbit hole, I decided to prove that browsing to this domain was not possible, as I had a pretty good idea that the TLD portion of IETLD was a thing we like to refer to as a clue.

The next logical step was to see if the domain had an active IP address. It did not.

Looking back through historical DNS records, I could see no evidence of an IP address ever being associated with this domain. I already knew that the domain hadn’t been visited by the content author, but this was the final nail in that coffin.
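The live lookup described in this section can be sketched in a few lines of Python. This is a minimal sketch rather than the exact check I ran: the function name and the injectable `resolver` parameter are my own, and it only answers "does this name resolve right now?". It cannot tell you anything about historical DNS records.

```python
import socket

def has_ip_address(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname currently resolves to at least one address.

    resolver is injectable (an assumption of this sketch) so the logic can
    be exercised without touching the network.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # The resolver could not find any address for this name.
        return False
```

Calling `has_ip_address("suspicious-domain.example")` with a real hostname performs an ordinary DNS lookup from the analysis machine, so it should be run from a system where that lookup is acceptable.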

Index.dat

When you begin to research index.dat using the evidence provided above, you can see why someone not well versed in the nuances of Windows would jump to the conclusion that this is browsing activity. There are many articles, forum posts and Q&A sites that indicate anything in index.dat is evidence of browsing history. Fortunately we know this is not the case, and the location of this index.dat makes it different to its namesakes.

This file was located in %APPDATA%\Microsoft\Windows\IETldCache (%APPDATA% already expands to the user’s AppData\Roaming folder). Again, notice the final folder name: it indicates that this is not the browser history that we are looking at.

Additionally this file was filled with a list of domains, none of which would have been visited by the author.
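The location check is simple enough to automate when triaging a list of index.dat hits. The sketch below is only an illustration of that idea: the function name is my own, and the example paths in the usage note are hypothetical rather than taken from the image.

```python
from pathlib import PureWindowsPath

def is_ietld_cache(path):
    """Return True if path is an index.dat sitting directly in IETldCache."""
    p = PureWindowsPath(path)
    return (
        p.name.lower() == "index.dat"
        and p.parent.name.lower() == "ietldcache"
    )
```

For example, a path ending in `AppData\Roaming\Microsoft\Windows\IETldCache\index.dat` returns True, while an index.dat under a History folder returns False and still needs to be examined as potential browsing history.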

Further Research

Carrying out more research, including looking up the various domain names we found in this file, we began to notice that these domains appeared on a list called the Public Suffix List. This list was originally started by Mozilla in an attempt to stop TLD-level cookies.

Public Suffix List

From – https://publicsuffix.org/

A “public suffix” is one under which Internet users can (or historically could) directly register names. Some examples of public suffixes are .com, .co.uk and pvt.k12.ma.us. The Public Suffix List is a list of all known public suffixes.

The Public Suffix List is an initiative of Mozilla, but is maintained as a community resource. It is available for use in any software, but was originally created to meet the needs of browser manufacturers. It allows browsers to, for example:

  • Avoid privacy-damaging “supercookies” being set for high-level domain name suffixes
  • Highlight the most important part of a domain name in the user interface
  • Accurately sort history entries by site

We maintain a fuller (although not exhaustive) list of what people are using it for. If you are using it for something else, you are encouraged to tell us, because it helps us to assess the potential impact of changes. For that, you can use the psl-discuss mailing list, where we consider issues related to the maintenance, format and semantics of the list. Note: please do not use this mailing list to request amendments to the PSL’s data.

It is in the interest of Internet registries to see that their section of the list is up to date. If it is not, their customers may have trouble setting cookies, or data about their sites may display sub-optimally. So we encourage them to maintain their section of the list by submitting amendments.

History of Public Suffix List

The Public Suffix List was originally a Mozilla project before being open/crowd sourced. As such, we can also find relevant references on the Mozilla Wiki page.

Mozilla Wiki

From – https://wiki.mozilla.org/Public_Suffix_List

Purpose(s)

Previously, browsers used an algorithm which basically only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for co.uk which will be passed onto every website registered under co.uk.

Clearly, this was a security risk as it allowed websites other than the one setting the cookie to read it, and therefore potentially extract sensitive information.

Since there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered. This is the aim of the effective TLD list.

As well as being used to prevent cookies from being set where they shouldn’t be, the list can also potentially be used for other applications where the registry controlled and privately controlled parts of a domain name need to be known, for example when grouping by top-level domains.
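To make the cookie problem described above concrete, here is a minimal Python sketch comparing the older "no dots" rule with a list-based check. The suffix set is a tiny hand-picked subset and the function names are my own. The real Public Suffix List has thousands of entries plus wildcard and exception rules, all of which this sketch ignores.

```python
# Tiny illustrative subset; not the real Public Suffix List.
PUBLIC_SUFFIXES = {"com", "org", "uk", "co.uk"}

def naive_cookie_allowed(cookie_domain):
    """Older rule: refuse a cookie only when the name has no dots."""
    return "." in cookie_domain.strip(".")

def public_suffix(hostname):
    """Return the longest listed suffix of hostname.

    Falls back to the last label when nothing in the list matches.
    """
    labels = hostname.lower().strip(".").split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in PUBLIC_SUFFIXES:
            return candidate
    return labels[-1]

def cookie_allowed(cookie_domain):
    """List-based rule: refuse a cookie set on a public suffix itself."""
    name = cookie_domain.lower().strip(".")
    return public_suffix(name) != name

for domain in ["com", "co.uk", "example.co.uk"]:
    print(domain, naive_cookie_allowed(domain), cookie_allowed(domain))
```

The interesting case is `co.uk`: the naive rule lets the cookie through because the name contains a dot, while the list-based rule refuses it. A browser needs a local copy of the list to make that decision.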

TLD vs ccTLD

There is some confusion about how to name the various parts of a URL or domain name. This is relevant when looking at ‘country code top level domains’ or ccTLDs. Traditionally the “letters after the last dot” were considered to be the ‘top level domain’, with the word before that dot being called the ‘root domain’ and finally anything before the root domain being called a ‘child domain’ or, more commonly with internet-based systems, the ‘sub domain’.

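This logic begins to get confusing when we look at domain names ending ‘.co.uk’, for example. While ‘.uk’ is technically the ccTLD and ‘.co’ is the ‘second level domain’, ‘co.uk’ is commonly treated as though it were the TLD, because that is the level at which names are registered. This is what the Mozilla Wiki above calls an effective TLD.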

Browsers are able to natively detect ccTLDs as these are heavily documented and are regulated under ISO standards. Non-standard TLDs are not regulated in the same way.

Conclusion

The domains contained within this particular index.dat are not evidence of browsing; they are simply evidence of Internet Explorer. The entries come from a publicly available list, the Public Suffix List, which records the suffixes under which domain names can be registered.


1 Response to When is Index.dat not Evidence of Browsing

  1. Onur says:

    I found that the index.dat file in the IETldCache folder under the user account is constantly updated EVEN THOUGH you use a third-party browser like Firefox or Chrome. It is not purely controlled by IE. I think Windows caches and logs TLDs here when a request is made to a new domain, even if you never use Internet Explorer.
