Forensics Investigation

What is Keybase?

I was first introduced to Keybase a few years ago. It was explained to me as a place to validate your identity for the purposes of sharing public keys for email encryption: showing that a Twitter account is controlled by you, or that a GitHub repo is truly yours.

It is a good way to view the ‘web of trust’ around a person, especially if this is a person you would not expect to meet face to face.

I personally never paid much attention to it, not because of anything bad around Keybase, just because it didn’t solve any problems that I was facing.

Forensics on a Website?

At first glance, or if you used Keybase historically, you may think that this is simply a website offering a web-of-trust style service. However, Keybase was brought to my attention by a friend, who asked if I had ever carried out an investigation into it or researched it.

I was very confused, so I went back to see what had changed. I saw that there is now a downloadable app which offers encrypted chat, file transfer and groups. This piqued my interest, as this is the criteria a bad guy would look for if they wanted to discuss bad things!

Is Keybase Bad?

Just to get this out of the way first, at no point in this post am I attempting to say Keybase is anything other than a legitimate company/app offering a service. Anyone abusing this service is the same as those who abuse other communication mediums. I am carrying out this investigation purely as I feel it could benefit a forensicator that hasn’t come across this before.

Lab, Set-up and Basic Scope

I will be using virtual machines for this investigation. One is Windows 10 (1703), the other is Ubuntu 16.04. I will be creating 3 accounts on Keybase:

  1. Windows 10 App
  2. Linux App
  3. Chrome Browser

My primary focus will be on the Windows machine, but I will be cross checking with Linux to see if similar artefacts exist.

I will create a ‘team’ and also ‘follow’ my accounts. I will look to see if I can find proof that the accounts are connected from host based artefacts alone. I will also look to see if I can capture files that were transferred privately or via the team and any other information that might be available.

The images I will use will be from or something I create in Paint (I am a terrible artist!!)

I intend to look at browser extensions, but that may wait for another blog post.

I won’t be looking at mobile apps, I do not presently have the set up to do so. I hope that someone reading this will pick up that research. I would be very interested to see how we could tie a user to a device.

Signing Up

Initial sign up is very easy from the app. You only need to provide a username, password and a name for your device. All of these are public. You do not need a valid email address and testing has shown you can use throw away services such as Mailinator.


Now for the part you are really here for. Let's start going through the artefacts and what they mean.

Firstly, the application itself is in the Local AppData folder (I will refer to this as the ‘Local’ root folder in future file paths)


There is also a Roaming AppData folder created (I will refer to this as the ‘Remote’ root folder in future file paths)


There is no sign of this program in the Program Files directory.


On both Linux and Windows there is an ‘Avatars’ folder which contains the profile picture of anyone who has appeared in the app. Unfortunately the app auto-suggests people to you, so the presence of an avatar alone would be difficult to use as proof of a connection. It does, however, indicate how active a person has been. 10-20 avatars were recorded within a couple of clicks of creating the account.





The files are named ‘avatar<9-digits>.avatar’ on both operating systems, and after testing known profile images I can say that the 9 digits do not match between machines. I compared the MD5s of two copies of the same profile image: while the hashes matched, as did the image, the filenames did not.

Within a few clicks I had over 1,500 avatars in this directory.
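Since the 9-digit filenames differ while the underlying images match, hashing is the quickest way to tie avatars together across machines. A minimal sketch of that comparison (the folder path and filenames here are illustrative, not taken from a real case):

```python
import hashlib
from pathlib import Path

def hash_avatars(folder):
    """Group avatar cache files by MD5 so identical images saved
    under different 9-digit filenames can be tied together."""
    groups = {}
    for f in sorted(Path(folder).glob("avatar*.avatar")):
        digest = hashlib.md5(f.read_bytes()).hexdigest()
        groups.setdefault(digest, []).append(f.name)
    # any digest with more than one filename is the same image
    # cached under two different names
    return groups
```

Running this over the Avatars folders from two seized machines and intersecting the digests would show which profiles both machines have displayed.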

Viewed Images & Avatars

The following folder contains a cache of image files that were viewed within Keybase. This includes Avatars and any images sent or received by the machine being investigated. It does not store all files, for example documents, that are transferred. Instead it is my belief this is simply caching any image that was viewable in the app.


This location is probably the most important to criminal investigations, as the files in here show what images were shared. These files do not have a file extension, but are in their native format. In order to filter avatars from shared images I would suggest sorting by size, as avatars are typically no larger than 60 KB, whereas images are more in the region of 100+ KB.
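That size heuristic, combined with the files being stored in their native format, lends itself to a simple triage script. A sketch, assuming the 60 KB threshold observed above and the standard magic bytes for common web image formats:

```python
from pathlib import Path

# magic bytes for the common web image formats
MAGIC = {b"\xff\xd8\xff": "jpeg", b"\x89PNG": "png", b"GIF8": "gif"}
AVATAR_MAX = 60 * 1024  # observed upper bound for avatar files

def triage_cache(folder):
    """Sniff each extension-less cache file's native format and use
    the size heuristic to separate likely avatars from shared images."""
    rows = []
    for f in sorted(Path(folder).iterdir()):
        head = f.read_bytes()[:4]
        fmt = next((v for k, v in MAGIC.items() if head.startswith(k)), "unknown")
        guess = "avatar?" if f.stat().st_size <= AVATAR_MAX else "shared image?"
        rows.append((f.name, fmt, guess))
    return rows
```

The size cut-off is a heuristic, not a rule, so anything flagged ‘avatar?’ should still be eyeballed before being excluded.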

Location of last sent file

It is possible to see the location of the last file that was sent over Keybase chat. There is no obvious difference from this artefact alone as to whether it was sent in a team chat or a 1:1 chat.

The following file does not have a file extension, but can be read using a normal text editor.


This could be useful if you suspect external media, encrypted media, or network locations may be in use. The artefact should read


Team Creation

If a team is created on the machine being investigated it will be recorded in the keybase.service.log file.


and will be easily identified by the log entry:

YYYY-MM-DDTHH:MM:SS.mmmmmmZ - [DEBU keybase teams.go:63] 1053 + TeamCreate(TeamName) [tags:TM=hXqL-F0Xwsfw]

The ‘1053’ appears to be an incrementing hex value for the log entry. The ‘.go’ value and the makeup of the ‘tags’ field require further investigation.
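Entries of this shape can be pulled out programmatically. A hedged sketch based solely on the sample line above (the log format is undocumented and may change between Keybase versions):

```python
import re

# Pattern based on the sample TeamCreate entry; treat it as a
# starting point, not a guaranteed format.
TEAM_CREATE = re.compile(
    r"^(?P<ts>\S+) - \[DEBU keybase teams\.go:\d+\] "
    r"(?P<seq>[0-9a-f]+) \+ TeamCreate\((?P<team>[^)]+)\)"
)

def find_team_creations(lines):
    """Pull (timestamp, team name) pairs out of keybase.service.log lines."""
    return [(m["ts"], m["team"]) for line in lines
            if (m := TEAM_CREATE.match(line))]
```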

Added to a Team

I believe it is important to know that a person can be added to a team without their explicit consent. I simply clicked on the individual I had previously spoken to (in my case my test account) and added them. I then got a notification to tell me I had been added.

When a person is added to a team the same keybase.service.log file will record the following entry

YYYY-MM-DDTHH:MM:SS.mmmmmmZ - [DEBU keybase team_handler.go:248] 3e64 teamHandler.newlyAddedToTeam: team.newly_added_to_team unmarshaled: [{Id:107282d36081bcbb018874d93e097824 Name:TeamName}] [tags:GRGIBM=WT3UeztTfELk]

Incorrect Team Name

When trying to join a new team, the name must be exact: you do not appear to be able to search for teams in a directory, but instead must know the team name. There are several log entries relating to a team not existing, such as this one:

<date/time> - [DEBU keybase teams.go:4180] 5b34 - TeamAcceptInviteOrRequestAccess -> Root team does not exist (error 2614) libkb.AppStatusError [time=238.3055ms] [tags:TM=IL6Xi8j8UVec]

Unfortunately, if the attempt was unsuccessful, the team name the user searched for does not appear in this log.

Requesting to Join a Team

When a user requests to join a team the following log entry will appear in keybase.service.log

<date/time> - [DEBU keybase teams.go:4180] 5b48 - TeamAcceptInviteOrRequestAccess -> ok [time=258.2293ms] [tags:TM=YW_eGpfUPiAX]

There does not appear to be any correlation, in this log file, between the invitation and acceptance to a group. The name of the group is only disclosed once an invitation has been accepted.

Team Chat Members to Team Name

I found that “Chat-Trace” was very useful in linking team members to team names. I have not done extensive research on this artefact to guarantee fidelity, however I was able to use the term to identify the users of the public team chat I joined.

<date/time> - [DEBU keybase user.go:333] 7d64 + Store user UserName [tags:chat-trace=M9L0hrvapdnb,platform=windows,LU=SwGnJf0KDRR7,CHTUNBOX=5ObRJV-ynRzY,CHTLOC=Sk6db9l6bidX,apptype=desktop,user-agent=windows:Keybase CLI (go1.13):5.1.1,CHTLOCS=KKNAAQHUSra6]
<date/time> - [DEBU keybase user.go:355] 7d67 - Store user UserName -> OK [tags:CHTUNBOX=5ObRJV-ynRzY,CHTLOC=Sk6db9l6bidX,apptype=desktop,user-agent=windows:Keybase CLI (go1.13):5.1.1,CHTLOCS=KKNAAQHUSra6,chat-trace=M9L0hrvapdnb,platform=windows,LU=SwGnJf0KDRR7]

Once you have a chat-trace you can use that to pivot either to or from the team name

<date/time> - [DEBU keybase teams.go:432] 7445 ++Chat: + TeamsNameInfoSource: DecryptionKey(TeamName,67c99659bdc24920b56ccec3a42dd424,false,1,false,<nil>) [tags:chat-trace=M9L0hrvapdnb,platform=windows,CHTUNBOX=5ZTfEkNGaDqv,CHTLOC=Sk6db9l6bidX,apptype=desktop,user-agent=windows:Keybase CLI (go1.13):5.1.1,CHTLOCS=KKNAAQHUSra6]

Once you have filtered the log by chat-trace you can then filter again by “keybase user.go” to get a list of users. This list appears to contain online users only; further testing is required to confirm. In my test the channel info reported 11k users, but the log showed 84 unique usernames.
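The double-filter described above can be scripted. A sketch based on the sample entries (the ‘Store user’ wording is taken from the log lines shown earlier, not from any documented format):

```python
import re

# matches "keybase user.go:<line>] <seq> +/- Store user <name>"
STORE_USER = re.compile(r"keybase user\.go:\d+\] \S+ [+-] Store user (\S+)")

def users_for_trace(lines, trace):
    """Given a chat-trace value, collect the usernames from the
    'Store user' entries carrying the same chat-trace tag."""
    tag = f"chat-trace={trace}"
    users = set()
    for line in lines:
        if tag in line:
            m = STORE_USER.search(line)
            if m:
                users.add(m.group(1))
    return users
```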

Leaving a Team

When a user leaves a team the following log entry will be seen

<date/time> - [DEBU keybase teams.go:420] 1e235 + TeamLeave(TeamName) [tags:TM=aByOmRdtM1XJ]


The Keybase app leaves a lot of useful information behind for forensicators to use in their investigations. While it is not currently possible to capture the chat history, we can see usernames, team names and, importantly, what images were shared in these groups.

Posted in Keybase, Linux Forensics, Windows Forensics

When is Index.dat not Evidence of Browsing

It is easy to fall into familiar habits as human beings: we see patterns in what we do and expect those patterns to persist. However, when these patterns can be the difference between a person keeping or losing their job, we need to make sure we are being as vigilant as possible.

During the course of creating a forensics CTF which would be made available to 28,000 14-18 year olds, an image was taken of a Windows 7 machine. The learning objective of this challenge was to show program execution. In order to make the image more authentic a controlled amount of user activity was scripted and carried out by the author.

The challenge was created, tested and then made available to the target audience. During the event we received a message from one of the players telling us they had found “evidence of inappropriate browsing”. While the player reporting this was doing so in a ‘tongue-in-cheek’ way, we took this very seriously.

While I was confident the author had not acted inappropriately, proof was needed to show she hadn’t.

The following screenshot was all of the evidence that was provided:

We know that the various Index.dat files track user browsing activity, so looking at the screenshot, we can see why this was the player's first assumption.

The player reported that they had run Autopsy against the image and then looked through the URLs that Autopsy reported. This URL was listed alongside legitimate browsing activity, as the search function used was a regular expression looking for anything matching that structure.

My initial observation was the ‘ietld’ string; this seemed an odd thing to be at the beginning of a URL and was the first focus of the investigation. Additionally, we could see that Autopsy reported ‘Internet Explorer’ as being related to this artefact.


The first step was to see if there was a quick win for this. Had it been seen before? Did Microsoft have a knowledge base article on this?

As many people in the forensics world will be familiar with, the results were not as helpful as we had hoped, with Yahoo Answers providing

While this is factually accurate, it is not overly helpful.

This forensics site shed the most light on the situation. As you can see from the screenshot below, it was still not a complete answer, but at the very least it showed the artefact had been seen before, and it was the first confirmation that this file was expected behaviour and not indicative of user browsing.

This find was a relief, but was not a good enough answer for a company that prides itself on going that extra mile when it comes to all things security. From here I was determined to find out why this file exists and what it is used for.


Before following the rabbit hole, I decided to prove that browsing was not possible to this domain, as I had a pretty good idea that the TLD portion of IETLD was a thing we like to refer to as a clue.

The next logical step was to see if the domain had an IP address active on it. It did not.

Looking back through historical DNS records I could see no evidence of an IP address ever being associated with this domain. I already knew that the domain hadn't been visited by the content author, but this was the final nail in that coffin.


When you begin to research index.dat using the evidence provided above, you can see why someone not well versed in the nuances of Windows would jump to the conclusion that this is browsing activity. There are many articles, forum posts and Q&A sites that indicate anything in index.dat is evidence of browsing history. Fortunately we know this is not the case, and the location of this index.dat makes it different to its namesakes.

This file was located in %USERPROFILE%\AppData\Roaming\Microsoft\Windows\IETldCache. Again, notice the final folder name: this indicates that we are not looking at browser history.

Additionally this file was filled with a list of domains, none of which would have been visited by the author.
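A quick way to recover that list of domains without writing a full index.dat parser is to carve for domain-shaped strings in the raw bytes. A crude triage sketch, not a structural parser:

```python
import re
from pathlib import Path

# crude: any run of dot-separated lowercase labels looks like a domain
DOMAIN = re.compile(rb"(?:[a-z0-9-]+\.)+[a-z]{2,}")

def carve_domains(path):
    """List the domain-like strings embedded in a binary file such
    as the IETldCache index.dat, for quick triage."""
    data = Path(path).read_bytes()
    return sorted({m.group().decode() for m in DOMAIN.finditer(data)})
```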

Further Research

Carrying out more research, including looking up the various domain names we found in this file, we began to notice that these domains appeared on a list called the Public Suffix List. This list was originally created by Mozilla in an attempt to stop TLD-level cookies.

Public Suffix List

From –

A “public suffix” is one under which Internet users can (or historically could) directly register names. Some examples of public suffixes are .com, .co.uk and pvt.k12.ma.us. The Public Suffix List is a list of all known public suffixes.

The Public Suffix List is an initiative of Mozilla, but is maintained as a community resource. It is available for use in any software, but was originally created to meet the needs of browser manufacturers. It allows browsers to, for example:

  • Avoid privacy-damaging “supercookies” being set for high-level domain name suffixes
  • Highlight the most important part of a domain name in the user interface
  • Accurately sort history entries by site

We maintain a fuller (although not exhaustive) list of what people are using it for. If you are using it for something else, you are encouraged to tell us, because it helps us to assess the potential impact of changes. For that, you can use the psl-discuss mailing list, where we consider issues related to the maintenance, format and semantics of the list. Note: please do not use this mailing list to request amendments to the PSL’s data.

It is in the interest of Internet registries to see that their section of the list is up to date. If it is not, their customers may have trouble setting cookies, or data about their sites may display sub-optimally. So we encourage them to maintain their section of the list by submitting amendments.

History of Public Suffix List

The Public Suffix List was originally a Mozilla project before being open/crowd sourced. As such we can also find relevant references on the Mozilla Wiki page

Mozilla Wiki

From –


Previously, browsers used an algorithm which basically only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for co.uk which will be passed onto every website registered under co.uk.

Clearly, this was a security risk as it allowed websites other than the one setting the cookie to read it, and therefore potentially extract sensitive information.

Since there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered. This is the aim of the effective TLD list.

As well as being used to prevent cookies from being set where they shouldn’t be, the list can also potentially be used for other applications where the registry controlled and privately controlled parts of a domain name need to be known, for example when grouping by top-level domains.

TLD vs ccTLD

There is some confusion about how to name the various parts of a URL or domain name. This is relevant when looking at ‘country code top level domains’, or ccTLDs. Traditionally the “letters after the last dot” were considered to be the ‘top level domain’, with the word before that dot being called the ‘root domain’, and finally anything before the root domain being called a ‘child domain’ or, more commonly with internet-based systems, the ‘sub domain’.

This logic begins to get confusing when we look at domain names ending ‘co.uk’, for example. While ‘.uk’ is technically the ccTLD and ‘.co’ is the ‘second level domain’, it is generally accepted that ‘co.uk’ is the ccTLD.

Browsers are able to natively detect ccTLDs as these are heavily documented and are regulated under ISO standards. Non-standard TLDs are not regulated in the same way.
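The “highest level at which a domain may be registered” logic from the quoted text boils down to longest-suffix matching against the list. A toy sketch using a hand-picked handful of suffixes (the real Public Suffix List is far larger and is published at publicsuffix.org):

```python
# A toy subset of the Public Suffix List; the real list is much
# larger and should be fetched from publicsuffix.org.
SUFFIXES = {"com", "org", "uk", "co.uk", "gov.uk"}

def registrable_domain(hostname):
    """Return public suffix + one label, i.e. the highest level at
    which this name could actually have been registered."""
    labels = hostname.lower().split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            # the name itself is a public suffix: nothing registrable
            return None if i == 0 else ".".join(labels[i - 1:])
    return None
```

This is why a cookie scoped to ‘co.uk’ can be rejected (no registrable domain) while one scoped to ‘example.co.uk’ is fine.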


The domains contained within this particular index.dat are not evidence of browsing; they are simply evidence of Internet Explorer. The list is part of a publicly available resource that allows registries to declare the levels at which domains can be registered.

Posted in Browser Forensics, Internet Explorer

HTTP Methods

In this post we are going to look at different types of HTTP/1.1 methods. We will leave HTTP/2 methods for another day.

This will be a summary of each method, it is possible to go into great detail with some of these points, but that would get tiresome to read (and write).

What is an HTTP ‘Method’?

With HTTP communication there is often a lot of information being sent backwards and forwards. Some of this is completing previous requests, some is new and occasionally some is erroneous. The client and server both need to quickly be able to see what is happening and how to deal with this communication.

The HTTP method is at the very start of the HTTP communication

As we can see on the above screenshot the very first word on the top line is GET. GET is an HTTP method. This is the same for any new communication as it allows the server to process the request in an appropriate way as efficiently as possible.

GET
This is the most common request from a user’s perspective. When you loaded this page your browser issues a GET request to the host “” with a request for the specific resource. If you are viewing this from the main page the request would look similar to above requesting a forward slash. If you were clicking a link you would be requesting ‘/2019/04/09/http-methods’ as the resource.

Separate GET requests are generated for additional resources on that page, even down to the favicon which shows up in the tab on most common browsers.

Results of a GET request can be held in the browser cache and will appear in the browser history. The GET request itself should not be transmitting any data, only the header.
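To make the header-only nature of a GET concrete, here is a minimal request built by hand (example.com and the resource path are placeholders, not taken from real traffic):

```python
def build_get(host, resource="/"):
    """A minimal HTTP/1.1 GET as it appears on the wire: the method
    is the very first word, then the resource and protocol version,
    followed by headers and a blank line with no body."""
    return (
        f"GET {resource} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Connection: close\r\n"
        f"\r\n"
    )

request = build_get("example.com", "/2019/04/09/http-methods")
```

Everything a server needs is in those header lines; there is no payload after the blank line.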

Side note – this used to display next to the URL, but browser creators removed it to stop people using padlocks to trick people into thinking the site was secure when it wasn't.

POST
A POST request is used if you are sending data to the server. An example of this could be that you complete a simple form. This data is then sent from the client to the server, where it is dealt with depending on the configuration of the server.

This process is slightly more secure than a GET, as the data can be sent inside the request body. With a GET this data would be sent in the URL, which would mean it was recorded in the server logs and in the browser history.

POSTs can also be used to exfiltrate data by an attacker. While this is a very noisy method, it could be used in a ‘smash and grab’ attack, where the attacker has no interest in being stealthy, but instead just wants to be fast.

HEAD
A HEAD request works in the same basic way as a GET request; the difference is in the response: a HEAD request asks for only the response headers and no data.

This is seen quite often with bots. Consider that a search engine may index your entire page on an hourly basis; this requires a lot of time and energy on the part of the bot. If it instead sends a HEAD request and checks details like Content-Length, Content-MD5, ETag or Last-Modified, there is far less work required for sites that haven't been updated.
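The bot's decision described above amounts to comparing validators between the cached copy and the fresh HEAD response. A minimal sketch (the choice of header fields follows the list above):

```python
def needs_refetch(cached_headers, head_headers):
    """Compare the validators from a cached copy with those returned
    by a fresh HEAD request; only re-crawl if one has changed."""
    validators = ("ETag", "Last-Modified", "Content-Length", "Content-MD5")
    return any(cached_headers.get(v) != head_headers.get(v) for v in validators)
```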

PUT
PUT is a way of creating or replacing a resource on the server such that repeating the same request causes no further change. For example, if you wanted to create a family tree on a website you could PUT the date of birth on the server; sending the same PUT again would leave the data unchanged. Even after a family member passes away, their birth date remains the same.

This idea of repeated identical requests having the same effect on the server as a single one is referred to as ‘idempotent’, and Mozilla have a good glossary entry on it here.

DELETE
More commonly expected to be seen with an API than with normal user activity, this HTTP method simply deletes the resource identified in the request. Like PUT this is idempotent: after the server actions the first delete request, subsequent identical requests will simply return a 404 Not Found response.

CONNECT
CONNECT is most likely to be seen connecting to proxy servers, in order to establish a tunnel (and authenticate) before the user's encrypted browsing session begins. This allows non-SSL-breakout proxy servers to monitor web activity. It can also be used for non-encrypted traffic in the same way.

OPTIONS
The OPTIONS method is a way of asking which HTTP methods are allowed by that site. This is now mostly seen in pre-flight CORS checks, which I won't be covering here.

It is important to realise that OPTIONS should not contain any data and the response should be quite short, simply stating which HTTP methods are accepted. This would be a good candidate for command and control traffic.

TRACE
This is the HTTP version of ping or traceroute, in the sense that it creates an ‘application layer loopback’ to the recipient server. It has a ‘Max-Forwards’ field that can be decremented at each forward. The recipient simply responds with the same method and 200 as the response code.

This would be unusual to see in a normal user environment, this type of activity should really only be seen in a dev or web-dev environment, or for testing proxy chains for infinite loops.

PATCH
Another API heavy HTTP method, this could be used after a PUT has created a resource on the server. You don’t want to replace the resource, but you do need to update it. As such you can send the amendments using the PATCH method. This is very unlikely to be seen in a user browsing context.

Why do I care?

This is a case of ‘knowing normal’. If you know why a POST might appear unusual versus a GET, then when you suddenly see a spike in OPTIONS, or a domain controller suddenly starts sending out lots of encrypted traffic surrounded by CONNECTs, you can begin to investigate with the knowledge that it is unexpected behaviour.

It is important to realise that, in the immortal words of Phil Hagen, there is no RFC police. These methods are generally agreed-upon principles, and there is nothing stopping a developer from using them in an unexpected way, whether that be via an API or a new browser.

Understanding the different methods also allows us to build our knowledge of HTTP and how the internet works from a browsing perspective.

Posted in Network Analytics

Wireshark – More Basics

I have been approached recently about explaining some of the fundamentals of how Wireshark can be used.

Let’s have a look at some traffic that I captured for a challenge I created recently.

Here we can see an example of HTTP traffic that has already been captured. There are some things we can immediately pick up on from this view alone. We can see that we are not looking at HTTPS traffic, either this is a non-encrypted site or it has been decrypted by some other method.

The IP addresses in use are RFC1918 (not routable on the internet), meaning this was either internal to internal traffic, maybe a company intranet server on a small network. Or maybe we are looking at traffic from behind a NAT device. All of this information becomes important when you are doing this in anger, but for now it is simply for consideration.

Finally, we can also see 3 ‘GET requests’ in this traffic. This shows that the client requested something from the server (quick note, server simply means the machine dealing with the request, don’t get confused with Microsoft terminology). We can also see a couple of the responses with ‘200 OK’ meaning something that was requested was also served back to the client.

The problem we have here is that there are multiple streams of information, there are multiple requests all with their own responses. What if we want to single one of these out?

Follow Stream


In older versions of Wireshark you could only follow the TCP stream; this meant that if the traffic was encoded in any way, you would not be able to see what the user would see in their browser after it was decoded.

Above you can see the ‘right click context menu’ that shows how to follow a stream. If you were to follow the TCP stream of encoded traffic, this is what you would see:

The top part is the header, which ends with the ‘Date:’ field. Everything after the blank line is encoded text that you aren't able to decode from here (one, it's just plain hard; two, there are non-ASCII characters, which are represented by dots).

If we follow the HTTP stream however

We will see the decoded stream:

While this particular case may not be easy to read, it is typical of what you might encounter. As a network analyst you may need to work with the malware-reversing team to fully understand some of the data you are looking at. JavaScript is often obfuscated or minified, which means the developer either doesn't want someone looking at their code or is trying to reduce the size of the data being transferred.

Why do I care?

If you are lucky enough to be working with full packet capture, you need to know how to use one of the most commonly used analyst tools. Wireshark has its limitations, but for looking at a small subset of traffic and wanting to know exactly what happened, it is excellent.

As an example, we can see in this traffic stream above the contents of the web page without having to visit it. We can see that the page that was served did not have any malicious scripts or file downloads on it. It simply had a header with a flag in it (this is a real flag, so I hid it 🙂 )


We have covered the basics around following a HTTP stream and a TCP stream and why each is different in the context of HTTP traffic.

This may seem simple to a lot of people, but if you are new to the industry, this could be the thing stopping you from winning that competition, or causing you to fail a technical test. Network analysis is important; there is a lot of cool information you can see on the wire!

Posted in Network Analytics, Network Forensics, Wireshark

Decrypting Traffic in Wireshark

If you have a HTTPS session captured and are looking at unlocking the secrets that lie within, you are probably looking at Wireshark with eternal optimism, hoping that somehow the magical blue fin will answer all of your problems…

Sadly that’s not quite the case…. but it will help.

(To help me structure this post I am going to use a CTF challenge as a walkthrough. It was originally a DEFCON CTF, then was later picked up by, if you want to play along at home click here)

Encrypted Traffic in a PCAP? I’m outta here!!

Hold your horses, there is a lot of useful information in an encrypted PCAP that may help you to find a weakness, or even all the information you need. In this instance we can see that the network traffic is using a certificate that has had the private key published online.

People don’t publish private keys online!

……. ummm …… yes they do. A friend of mine, Kev ‘TheHermit’ Breen created a Pastebin scraper (PasteHunter) that uses Yara rules to check pastes for interesting stuff then indexes them. He did a presentation at CyberThreat 2018 giving a summary of (redacted) results, amongst them, private keys. It is also possible to find some using Google searches, however most people have become wise to this method (normally the hard way).

So you’re saying this is easy?

Well… no. 99.999…% of the time you will need to get the private key in a legitimate way. You can't simply google for Microsoft's private key. The exception is typically in a contrived situation, like a CTF, which is what we are discussing!

However the point of this post is to show how to do this when someone gives you the private key file.

Back to the CTF

This CTF gives you a clue to use Google and tries to lead you to an old GitHub page that has this key listed as ‘expired’.

The fun thing about CTFs is that there is no single way to solve them. So with some creative thinking and lots of searching I found that the certificate has been around the houses a few times:

Anyway, we are getting off topic! I suspect this is an old challenge and hasn’t been updated when the certificate was replaced on the original Github page.

The bit we are interested in is the Private Key, everything else will just break Wireshark. So we grab the following:


You need to include the hyphens at the beginning and end too.

Now that we have this bit, save it as a .pem file (server.pem, maybe?); the name isn't important, only the file extension.
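Pulling the key block out of a larger paste can also be scripted. A small sketch that keeps the hyphens intact, as required (the server.pem name is just the suggestion above):

```python
import re

# hyphen lines included, exactly as Wireshark expects them in the file
PEM_BLOCK = re.compile(
    r"-----BEGIN (?:RSA )?PRIVATE KEY-----.*?-----END (?:RSA )?PRIVATE KEY-----",
    re.DOTALL,
)

def save_private_key(paste_text, out_path="server.pem"):
    """Extract just the private-key block from a larger paste and
    write it out as the .pem file Wireshark will load."""
    m = PEM_BLOCK.search(paste_text)
    if m is None:
        raise ValueError("no private key block found")
    with open(out_path, "w") as fh:
        fh.write(m.group() + "\n")
    return m.group()
```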

Using the .pem file in Wireshark

Right, we have stuff we need. Stuff is important.

There are a couple of ways of doing this, I am going to use the menus on the main Wireshark window. This is done in version Wireshark 2.6.4. I doubt they will move the bits I am talking about… but they may go full-Microsoft on us at some point.

Go to Edit > Preferences

In the preferences screen that pops up, you want to go to the left side and look for “Protocols”, expand this out and find “SSL” (I typically press ‘T’ then it’s at the top of the screen).

On this screen you want to click on the RSA Keys List button. You should also specify a debug file, this will create a text file that will help you should something not work. Have a look at a working version (after following this guide) so you know what it should look like.

Add the server IP address, the port (in this case it’s 4433 instead of the default 443), protocol TCP and the location of the key file. Leave the password blank.

OK your way back to the main screen.

Normally you would now ‘Follow SSL stream’, however that doesn’t work here, possibly because Wireshark doesn’t know what to do with the data (it’s not web browsing, hence no ‘site details’ as per my previous post).

If we now look through the packets we can see that packet 13 sticks out: it has a lot of flags set and is a malformed packet. When we investigate further we see this…

If you look to the right, you can see why Wireshark declared this malformed, all of the fields have been manipulated to print out a message.

Why do I care?

The CTF was used as a mechanism to demonstrate how to decrypt data in Wireshark. So you don’t need to care about the challenge, but knowing how to add a private key is very important. This is the type of task IT staff would assume the security people can do, but if you have never tried it, this allows you to play.

Looking at encrypted traffic could provide the case you're working on with that critical piece of evidence the bad guy thought they had hidden.

This also shows that network forensics is not going anywhere, HTTPS is a GOOD thing and should be embraced. We can put technical steps in place to allow us to keep using HTTPS and HSTS while still maintaining the level of detection we have always had.

Posted in Cryptography, Encrypted Traffic, Network Analytics, Network Forensics

Identifying Sites in Encrypted Traffic

There is some misinformation around that encrypted traffic is useless and that you should go back to NetFlow and statistical analysis only. I disagree. I will be doing a few posts showing clear-text information leakage we can use to our advantage.

Let’s start with a biggy:

What site was visited?

Imagine you need to prove a user went to a specific website. Providing the site isn’t on the HSTS pre-load list within the browser, you can see this. (We will visit HSTS and the pre-load another time, but for this instance we will assume malware, or nefarious activity which wouldn’t be included in this list).

I am going to pick on the truly evil wikihow website (just because I used them in a HTTP-PCAP-CTF a few years back before they moved to HTTPS and now I am proud of them 🙂 )

So a user is suspected of faking their new job and visiting wikihow to see how to do stuff. We check the packet capture and run a filter looking for GET requests to We don’t see any…. then the IT dept tells you SSL breakout broke a while back, and the CEO dictated it was turned off as it was stopping him streaming…. work…. stuff.

Now we have an issue. We filter this user’s machine, pulling the packets only for the time frame this person was suspected of the activity (cutting many corners for ease here, just go with it). We find lots of SSL handshakes and have a closer look…

Easy right?

Kidding 🙂 The answer is actually under “Extension: server_name”, but there is an easier way. Follow stream!

The orange coloured part is outbound from my PC, the blue part is the response. You can see here “” in the packets.

We can also see something interesting in the response. This is a shared certificate, and all of the sites listed share it. I am not going to cover certificates, as that would mean talking about key exchanges, which hurts my head. Just accept that certificates can be shared and the data is still secure… ok? Cool. If you want to know more, there are lots of really interesting sites on the subject.

Why do I care?

This information can help with an investigation while you are waiting for someone to bring you the private key, or if no keys are available. You can at least check all of the requested sites. This would be the encrypted equivalent of looking at all GET requests (kind of… all GETs would also show resources within sites, this won’t… but you get what I mean).

If you want to look for all ‘Client Hello’ requests in a PCAP use the following Display Filter

ssl.handshake.type == 1
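(On newer Wireshark builds the `ssl` fields have been renamed, so the equivalent filter is `tls.handshake.type == 1`.) The server_name extension itself is simple enough to walk by hand. Here is a minimal sketch of parsing the SNI out of a ClientHello extensions block, per RFC 6066 — the sample bytes below are hand-built for illustration, not taken from a real capture:

```python
import struct
from typing import Optional

def extract_sni(extensions: bytes) -> Optional[str]:
    """Walk a TLS ClientHello extensions block and return the server_name, if any."""
    i = 0
    while i + 4 <= len(extensions):
        ext_type, ext_len = struct.unpack_from(">HH", extensions, i)
        i += 4
        if ext_type == 0:  # server_name (RFC 6066)
            # 2-byte ServerNameList length, 1-byte name type, 2-byte name length
            name_type, name_len = struct.unpack_from(">2xBH", extensions, i)
            if name_type == 0:  # host_name
                start = i + 5
                return extensions[start:start + name_len].decode("ascii")
        i += ext_len
    return None

# Hand-built sample: an unrelated extension followed by a server_name extension
host = b"wikihow.com"
sample = (
    struct.pack(">HH", 0x000B, 2) + b"\x01\x00"    # ec_point_formats (skipped)
    + struct.pack(">HH", 0, len(host) + 5)          # server_name extension header
    + struct.pack(">H", len(host) + 3)              # ServerNameList length
    + struct.pack(">BH", 0, len(host)) + host       # host_name entry
)
print(extract_sni(sample))  # wikihow.com
```

The loop simply skips any extension that isn’t server_name, which is why the unrelated extension at the front of the sample doesn’t trip it up.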

Red Team Recon

This is a nice easy way for pentesters to recon a site with normal user behaviour. Looking at the response to the Origin API (see last paragraph) I can now see lots of sub-domains to play with:

What information does your certificate leak about your company?

Side note

This can be interesting for seeing what your machine is doing. While I was running this capture I also unintentionally captured a request going to I have the Origin client installed, but it wasn’t running at the time. Now I know that Origin has a service that runs in the background doing something…

Posted in Encrypted Traffic, Network Analytics, Network Forensics

SMB2 Protocol Negotiation

This is one of the few times when looking at SMBv2 you will need to use SMBv1 commands. The initial negotiation request will always be sent out as SMBv1. It makes sense when you think about it: SMB does not have ‘backwards compatibility’; instead it relies on negotiating down to the lowest common denominator.

To find the initial request, use the following SMBv1 display filter

smb.cmd == 0x72

If the server responds using the SMB2 protocol, a second negotiation is sent, this time over SMB2.

To see all SMB2 negotiation requests and responses you will need the following filter

smb2.cmd == 0
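Telling the two apart on the wire is easy: SMBv1 headers start with 0xFF ‘SMB’ and carry a one-byte command at offset 4, while SMBv2 headers start with 0xFE ‘SMB’ and carry a two-byte little-endian command at offset 12. A quick sketch — the sample byte strings below are synthetic, not pulled from a capture:

```python
def classify_negotiate(data: bytes) -> str:
    """Classify a raw SMB header (NetBIOS session header already stripped)."""
    if data[:4] == b"\xffSMB" and data[4] == 0x72:
        return "SMBv1 negotiate"   # matches smb.cmd == 0x72
    if data[:4] == b"\xfeSMB" and int.from_bytes(data[12:14], "little") == 0:
        return "SMBv2 negotiate"   # matches smb2.cmd == 0
    return "something else"

# Synthetic headers, zero-padded out to their minimum header sizes
smb1 = b"\xffSMB" + bytes([0x72]) + b"\x00" * 27
smb2 = b"\xfeSMB" + (64).to_bytes(2, "little") + b"\x00" * 58
print(classify_negotiate(smb1))  # SMBv1 negotiate
print(classify_negotiate(smb2))  # SMBv2 negotiate
```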

During the negotiation you are able to see what capabilities the server has, what the client has, and any negotiated authentication/encryption technique. You can also see the time that is set on the server, as well as its timezone.

Why do I care?

There is a lot of useful information in here to help with server identification and potentially geographical location. Looking at the capabilities of the server can help with OS identification; is it a Windows box, a NAS, etc.?

Using the SMBv1 filter you are able to see the first communication between the two devices, aiding in timeline building.

Posted in Network Analytics, SMB

SMB2 – File/Directory Metadata

Using SMB it is possible to retrieve data that is typically only expected when carrying out host-based forensics. The MACB (Modification, Access, Change and Birth) data is sent across regardless of whether a file is actually accessed or not.

With SMBv1 this was a bit of a pain to find; the Wireshark filter required was

smb.cmd == 0x32 && smb.trans2.cmd == 0x0005 && smb.qpi_loi == 1004

With SMBv2 it is simpler:


This filter can have a value added after it; however, in its current state it is the equivalent of having “exists” on the end.

The output looks different to SMBv1 as you would expect, but the data is the same.



With SMBv2, the simple addition of a column provides us with the path detail that was removed from the SMBv1 command.
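The timestamps SMB carries for this metadata are Windows FILETIME values: 100-nanosecond intervals since 1601-01-01 UTC. If you want to sanity-check a raw value pulled from the hex rather than trusting the dissector, the conversion is tiny (stdlib-only sketch):

```python
from datetime import datetime, timedelta, timezone

def filetime_to_datetime(ft: int) -> datetime:
    """Convert a Windows FILETIME (100-ns ticks since 1601-01-01 UTC) to a datetime."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ft // 10)

# 116444736000000000 is the well-known FILETIME value for the Unix epoch
print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00+00:00
```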

Why do I care?

As you can see from the screenshots, this shows what files were accessed. If you look closely at the second screenshot you can see that a file named “~$resource-to-share.xlsx” was referenced. When you look at this you can see it talks about a file being created on the share. This tells us two things.

  1. We can see when files are uploaded to an SMB share
  2. The Excel file was opened on the local machine as that is the temporary file Office creates to allow auto recovery on crash

The above is from a Windows Server 2016 VM I have in my home lab, but I have also tested this on my NAS and got the same type of results.

Posted in Network Analytics, Network Forensics, SMB

SMB Tree Connect/Response Details

If you want to play along at home, the sample PCAP I will be using for SMB2+ is here, the SMB v1 PCAP is not something I can give away sadly.

Tree Connect Request/Response

When the SMB protocol connects to a resource it needs to know exactly what is there. This is where the OS retrieves the share name. If the share name ends with a ‘$’ (like IPC$ or C$) the share is hidden. Typically the system creates hidden shares, but users can also create them. Hidden means that if you were to browse the root of the resource (\\servername\ ) you would not see the hidden shares listed.

Tip. If you are monitoring SMB and see \\servername\exfil$…. might be worth looking at!
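That tip is easy to automate when triaging a pile of UNC paths pulled from Tree Connect requests. A sketch — the ‘expected’ set below is my own assumption of common default administrative shares, so tune it to your environment:

```python
# Hidden shares Windows typically creates itself; anything else ending in '$' is interesting
EXPECTED_HIDDEN = {"IPC$", "ADMIN$", "PRINT$"} | {f"{d}$" for d in "CDEF"}

def suspicious_hidden_shares(unc_paths):
    """Return UNC paths pointing at hidden shares that are not standard admin shares."""
    hits = []
    for path in unc_paths:
        share = path.rstrip("\\").split("\\")[-1]
        if share.endswith("$") and share.upper() not in EXPECTED_HIDDEN:
            hits.append(path)
    return hits

paths = [r"\\server\C$", r"\\server\IPC$", r"\\server\exfil$", r"\\server\public"]
print(suspicious_hidden_shares(paths))  # flags \\server\exfil$ only
```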


SMB v1 looks like this:

SMB v2 on the other hand looks like this:

So what’s the difference?

As you can see there are some cosmetic changes; the ‘andx’ part has been dropped. The biggest difference for me is the addition of the ‘SessionID’ details in v2, which now provides the requesting username* and the requesting client.

In the hex, the flags have been moved and v2 has fewer flags. We can still see the path in the details pane.

*It is worth noting that the username is the one used to connect to that share, not the one which is logged on locally. This can be entered when the connection to the share is initially created, or is prompted for when the user clicks on the link. Bear this in mind during investigations.


Posted in Network Analytics, Network Forensics, SMB

SMBv2+ SYNC Header Explained

SMB2 Header

The SMB2 header will either be ASYNC or SYNC; you need to look this up from the flags. SYNC is the most common header, as it can be in the form of a request or a response, whereas an ASYNC header will be used for responses to requests processed asynchronously by the server.


Credit Charge

The Credit Charge field appears to be either 0 or 1, and will only be 0 on protocol negotiation (initial communication); after that it will be set to 1.

Credits appear to be a way for the client to control the requests being sent in that session; providing the client has credits remaining, it can continue to communicate. You would expect to see ‘Credits Granted’ = 1 if the server is answering a request.

NT Status

This will only be seen in responses and will give the status, or error, of the request. ‘STATUS_SUCCESS’ is as it sounds, and will typically be expected from a successful tree connection.

Channel Sequence

Channel Sequence is explained by Microsoft as “In a request, this field is interpreted in different ways depending on the SMB2 dialect.”

…which is helpful…


Reserved

This is reserved… no seriously, don’t use it. SMB’s version of the ‘evil bit’?


Command

The command section will have one of the following commands within it:


Flags

The Flags field will be populated thus:

Chain Offset

Microsoft refer to this field as “NextCommand” and say it must be the offset from the beginning of this header to the next one, which is used when requests are compounded. However, every instance I have seen has this set to 0x00000000.

Message ID

This field is very useful as it identifies individual messages (conversations) within the SMB protocol. If you follow stream on an SMB conversation in Wireshark you will see a large number of messages, which can get confusing; this way you can match the outcome to each individual request.

Use the following filter in Wireshark

smb2.msg_id == ##

Where ## equals the message ID you want to track

Process ID

The default value is 0x0000feff.

When it is not set to the default (or 0x00000000) it can be used to manage broken connections. For example, if the server sends a pending response, the client can decide to send a cancel request to the server to stop that particular message from consuming resources.

Tree ID

Another useful field; within the Tree Connect Response this field will contain an identifier for a share. This means if you are investigating a lot of SMB2 traffic across many shares, this will help you keep track of a specific one.

smb2.tid == <hex value>

Where <hex value> is the hexadecimal value displayed in this field

Session ID

The Session ID will help track a specific user session. Similar in usefulness to the Tree ID, it will allow you to see if a new username was used. It can also show failed logon attempts, as the value will change with each new authentication attempt.


Signature

This field will be set to 0 if the signing flag is set to not signed.

This appears to be used for verification and encryption. It is most likely to be seen in environments that utilise SMB encryption.
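To tie the fields above together: the whole 64-byte SYNC header unpacks cleanly with a single struct format. A minimal sketch — the sample header is synthetic, and command 3 is TREE_CONNECT per MS-SMB2:

```python
import struct

# MS-SMB2 SYNC header: fixed 64 bytes, little-endian
SMB2_SYNC = struct.Struct("<4sHHIHHIIQIIQ16s")
FIELDS = ("protocol_id", "structure_size", "credit_charge", "status",
          "command", "credits", "flags", "next_command", "message_id",
          "process_id",  # 'Reserved' in MS-SMB2, default 0xfeff
          "tree_id", "session_id", "signature")

def parse_sync_header(data: bytes) -> dict:
    """Unpack the fixed 64-byte SMB2 SYNC header into named fields."""
    return dict(zip(FIELDS, SMB2_SYNC.unpack_from(data)))

# Synthetic TREE_CONNECT response: flags bit 0 set = server-to-client response
sample = SMB2_SYNC.pack(b"\xfeSMB", 64, 1, 0, 3, 1, 0x1, 0,
                        4, 0xFEFF, 5, 0x1122, b"\x00" * 16)
hdr = parse_sync_header(sample)
print(hdr["command"], hex(hdr["tree_id"]), hdr["message_id"])  # 3 0x5 4
```

With this, filters like `smb2.msg_id == ##` and `smb2.tid == <hex value>` map directly onto the `message_id` and `tree_id` fields.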


We have covered the SMB v2 SYNC header. This was mostly based on Tree Connect and Response, as that was the original post!

Why do I care?

If you need to look at SMB on any modern system there are some good artefacts in the SMBv2 header.

I hope this helps, feel free to comment.

Posted in Network Analytics, Network Forensics, SMB