I have been creating network and computer security ‘Capture the Flag’, or ‘CTF’, challenges for a number of years now. My latest job had me doing this full-time for events that would attract several thousand players. During this time my team and I have learnt many important lessons on what to do and more importantly what not to do.
I want to share some of these lessons here as I see more and more CTFs being created. Note. My specialty is in forensics and the blue team, I have working knowledge of most disciplines, but you will notice most of my examples will be around the blue side of the house. They should all translate quite happily to the red side of the fence.
Why do you want to make a CTF?
Answering this will help guide your decision making. Bear in mind that these are not mutually exclusive and may bleed into one another. Some of the most common reasons include:
- Business reason – This may be a trade show, or a way to get your company name out there. This reason obviously needs the most care and attention as your reputation is on the line
- Teaching – You may want to show off some new tools or techniques, or maybe you feel there is a skills gap that needs to be addressed. This can be separate or joined with the Business element
- Conference – There are many security events around the world, and whether your CTF is on-site only or for people who aren’t able to physically make it, this type of event allows everyone to feel that they are part of the action
- Fun – You just want to make something because you love the topic and you want to share that with the community. This might be a solo undertaking or may be something that you get a peer group together for
I have tried to put these in order of ‘required structure’. What I mean by that is; if you are doing this as a business it should follow the same pattern as any project. There needs to be requirements, deliverables and all of the usual planning that would be put into a large piece of work. If you are doing this for fun, you should still have these, but they may simply be in your head. You will be given a lot more leeway by the people playing your CTF if they know it was a labour of love and is not trying to be a representation of what your company is offering.
Think about who you are going to be presenting this CTF to. If you are teaching people, then you will need to have entry challenges that are easier to solve. Whereas if you are looking professionals at a conference, you will most likely want to pull out the big guns and have some really difficult challenges. Look at the demographic in terms of discipline too. Are you presenting this to pen-testers, forensicators, developers? We will discuss discipline of challenges later in this post, but it is important to know who will be playing your CTF, or at least who you are targeting it at.
Most CTFs have a progressive difficulty. For example, with forensics your first question might be “what is the hash value of the forensics image”. This is typically generated by the capture tool and stored as a text file with the image. It is a simple ice breaker that allows people to ensure they understand the flag format for submission, and proves the file they are using is intact and correct. Bear in mind that if your target audience is entry level with regards technical skill, your entry challenges may need to be even simpler.
I have previously made trivia questions that prompt the user to think about their environment. For example, “Which command allows you to list files on a Linux workstation?”. Or “In Volatility v2 which plugin will show you the operating system the memory file was taken from?” Difficulty is best measured in 4 categories; easy, medium, hard and extreme.
Easy will be the ice-breaker and trivia type questions. Extreme will be something that only someone with a very in-depth knowledge of the subject would be able to do. An example might be having to carry out several different techniques in sequence in order to find the answer. Hard and Medium will sit in between. Think about the spread of difficulty in terms of weighting. If you are teaching you will probably want 35% easy 35% medium 25% hard 5% extreme (or 35/35/25/5). If you are looking at a conference or highly technical you may change to 10/25/40/25. Be honest about your own ability too, making an extreme challenge that is fun, engaging and realistic is not always easy. This is your CTF, you are making the rules!
As you can tell by now, my expertise means I am a lot more comfortable creating forensic challenges (network/host/memory) than I am making something like cryptography or malware reversing. You need to be honest about what you are able to do and play to those strengths. If you have a multi-skilled team like I had, then you will able to have a diverse discipline set.
Think about why you are making the CTF, if you are at a conference that is focused on pen-testing, or coding, then forensics challenges will most likely not go down too well. Conversely having a lot of challenges based on a single language also may not be ideal. Target audience is important here. You can have a mix of disciplines, but seek validation of any challenges you make that are not in your discipline. For example, if someone solves a challenge you wanted to be ‘extreme’ using strace, then you will look a little bit silly.
This is often overlooked, or lost, when creating challenges. Even if you are creating the CTF for fun, you are still teaching, or reinforcing, a skill. Ask yourself what it is you are teaching, and what the real-world application would be.
Something my team and I discovered is that we were pushing out too many challenges that had very limited real-world application. Examples of this were steganography challenges, mostly using the same tool but with different ways of hiding the password. Another was putting challenges on Twitter which were either simple XOR, rotation cipher or Base64/32/85 encoded. While these are interesting for an ice breaker, they were being over used and detracting from the overall experience.
Even at the high end; I got sick of hearing “CBC bit flip” whenever it came to difficult challenges. Looking again at relevance, do your challenges represent the real world? A perfect example of forensics would be to have everything based on Linux. While Linux forensics is an important skill and should be in a CTF, it should be put into the correct context. Perhaps the Linux image is from a web server that was compromised. Linux desktops are quite rare when held against Windows and Mac.
Narrative & Easter Eggs
A narrative isn’t strictly necessary, but it can be the difference between an OK CTF and a great CTF. If you have a general underlying story then it allows the player to play along in their heads. Quite often forensic challenges will be around stealing company data, you can enrich this with Easter eggs; have some emails, documents, web-browsing etc that plays into the narrative.
SANS DFIR team do an excellent job of this. When playing their ‘capstone’ events you can see the huge amount of time and effort they put into generating the evidence. While this is above and beyond what the average CTF will contain, remember that they are able to re-use this data for years with new artefacts being found each presentation. Putting a little extra effort into the challenges early will make for a more enjoyable experience later.
I would often post amusing (I thought they were funny at least) messages to Pastebin which were never referenced in the challenge questions, afterwards I would have people telling me they found, and it would become a conversation piece over beers. Having extra data in a forensics challenge also raises the difficulty. If I say “which docx file was opened on this date” and there are only 3 files, why bother looking up the data in the intended way when you can simply brute force it?
Traditionally a flag would be formatted as “flag:text_here” or some derivative of this. Using the word “flag” is not the best idea, as people can search, grep or otherwise look for that string instead of actually completing the challenge. You can have the player manually append the word “flag” in the submission field if need be.
Challenges that I have made recently now include dates from the evidence. For example, “What time/date did xx happen?”.
If you use this method don’t be afraid to over explain the format expected. In my previous example I would need to say “format is yyyy/mm/dd HH:MM:SS and in UTC”. Often new CTF creators make the assumption people will know what the flag should look like. This simply isn’t the case!
If you do use the traditional flag format, I would also recommend adding some fake flags to stop people finding alternative ways to find them. I have put 25,000 lines of random flag strings to stop people using forensic tools to search for the word “flag”. The truly evil part of that is that I didn’t use the word flag in any of the answers. Just be careful not to troll your players too hard with fake flags, only use them to discourage trivial challenge bypass!
Hard != Esoteric
Esoteric is a word I have come to use a lot when planning CTF challenges.
1. intended for or likely to be understood by only a small number of people with a specialized knowledge or interest.
I have played many CTFs which had challenges involving a simple concept with an esoteric element added to it, to then claim it as a difficulty level. While this is a possible way of making something harder, it should not be relied upon.
A silly example would be “Guess the password on this zip file. Hint: The password is my date of birth, followed by my parent’s anniversary”. There is no way I would expect any of you to know that. I don’t even know what my parent’s anniversary was! (they divorced <redacted> years ago).
Other examples could be using a vulnerable PHP version, then tweaking the vulnerable piece of code manually. This then means you have changed it from a simple out of the box challenge that could be solved with metasploit/burp, to a challenge that needs a custom exploit based on a patch that was implemented and never published. Or a Vigenère cipher where you expect them to brute force the key. This only works if the key is easily guessed after getting the word ‘flag’ at the start.
This point deserves far more discussion that what I have written here, a general rule of thumb is to remember something I have said to my team many times “The challenge should be hard because the subject is hard, not because you’re being a dick”.
Evidence & Scope
Do you have a really cool forensic challenge, something that’s really exciting, revolutionary? All you need to do is download this 500GB file.
It’s probably not going to get the attention it deserves!
A trick I started to use on Forensic challenges was to use a tool like Kape to copy all of the important data to a USB stick, then copy extra ‘fluff’ data across, this was typically the contents of program files, and user appdata. I would then capture the USB stick which was 8GB in size. Since the unallocated space was empty this compressed down into a couple of hundred MBs.
With memory images reduce the RAM on the VM to something more manageable. You may have a bad day due to slow responsiveness, but it’s better than 8GB of memory just to capture a PID or two.
With regards to scope, make sure you have permission from the service provider to carry out the CTF, especially if you are hosting web applications. Azure and AWS have allowed CTFs to be carried out using their infrastructure. But I would recommend contacting them, or you could find your CTF being cut short.
We all love writing documentation, right?… right?
You may think you don’t need to document your challenges because you can remember them, and even if you don’t you can re-solve them. No. This is a bad mindset to have. When you are getting 50 questions and complaints that a challenge isn’t working, you don’t want to be solving your own challenge, trying to remember the arguments to a tool, or trying to remember which offset the important thing as on. Write. It. Down!
By having good, simple walkthroughs you can test your challenges easily. You can get them validated easily! And you can re-use aspects of them in the future.
Testing & Validation
This proves to be one of the most difficult parts of creating a CTF; getting someone to test it. As we all know, testing your own work is never a good idea. You need external validation.
This is typically harder for a business to do than an individual (unless the business is a security consultancy or similar), as they would need to employ a small QA team to go over the challenges. With an individual, you can ask friends, or peers to help out. But in my experience, the uptake on actual testing is very low, especially if the discipline or difficulty is outside what your peer group is comfortable with.
Shouting out on social media, like Twitter, can really help. Asking for volunteers to test your challenges. Validation is more around confirming the difficulty. Does someone with the target demographic skill set agree with the difficulty? Do your peers agree? Who is right? No one and everyone! The stats at the end of a CTF will often be the real truth-teller. Did everyone solve your hard challenge? This happened to my team with a crypto challenge; it took ages to build and minutes to solve because there was a tool that had been developed for a very similar challenge used on a different CTF.
Do all of your challenges work? If you have documented the challenges then you can get people to carry out functional test. These are, as the name implies, a simple test to make sure the challenge works and can be solved in the way you intended. The testers will have your walkthrough and will be following it step by step. They can then feed back to you if they thought your route to solve made sense and if they think it would be reasonable to expect a player to take that route.
Where and how are you going to present your CTF?
There are engineering considerations to take into account when choosing a platform. Having a really popular CTF may seem like a great problem to have, but when people aren’t able to play, they may get frustrated quickly. Think load balancers and how to host these systems. Should you be looking at hosting on a cloud service provider rather than the old trusty 386DX you have lying around in the cupboard?
The current leader in open source CTF platforms seems to be CTFd. This platform can be cloned from GitHub, or installed via Docker allowing for quick set ups. It also has the ability to work alongside external platforms such as Major League Cyber.
There are other open source platforms out there and companies are now starting to pop-up with CTF Platform-as-a-Service. But these are still few and far between, as this is a growth (sub) industry, I expect to see a lot more managed CTF platforms going forward.
When choosing a platform look at if they have case sensitivity in the submission field. This type of detail is often overlooked when creating flags. Does the service offer any sort of post-CTF stats? Can you look at how many people participated? How many challenges were solved? etc etc.
Players will write about your CTF, they will spoil, or burn your challenges. Accept this as part of the process and encourage it. I have seen companies claim that doing this was an infringement of intellectual property. I am no legal expert, but I doubt that would hold up in court!
By having people write up your challenges you are getting free and unfiltered feedback and your CTF will get even more publicity. Meaning if you run a second CTF you will have more players. Also, these write-ups will create more footfall to your website (if you have one!).
Regardless of whether you run the CTF for profit generation or for run, you should run a retrospective analysis on how the CTF went. As this can be a blog post in itself I will advise you to read up on it here, or any other project management blog.
There is a lot to take in from the post, and not as many pictures as normal. I hope that you find this helpful when creating your own CTF. Keep creating and keep pushing what we know.