CTFForensics - OpenSecurityTraining.info

PCAP data
How we get it
Direct capture from the NIC on a machine
 tcpdump
 wireshark
 Netwitness
 etc.
Network coverage – an aside
Network coverage is how much of the
traffic on the network your sensor
network can see. You can have different
types of monitoring on different parts of
the network, but the main idea is to avoid
blind spots. This applies to PCAP, flow,
logs, and everything else.
Network coverage – an aside
Since different segments of the network
carry different traffic, where you decide to
place your sensors will determine what you
can see.
What would you see on the outside of
the border firewall that you wouldn’t see
inside? What kinds of things do you WANT
to see?
Network coverage – an aside
Things to think about
 NAT – solve with placement of sensors
 VPN – solve with placement of sensors
or VLAN specific configuration
 Multiple border gateways – solve using
channel bonding/aggregation
Network coverage – an aside
On the outside of your firewall, you see
the attacks that didn’t get through in
addition to the things that did. On the
inside of your firewall you see things that
actually got through. The outside tells
you who’s attacking and how. The inside
tells you what attacks worked.
Network coverage – an aside
In addition to the amount of the network
that’s covered, we can also think about
WHEN the network is being covered.
Sometimes you’ll want PCAP data for a
couple of hours, but couldn’t handle 24/7.
When might that be? Could you perhaps
trigger full PCAP for a time based on some
event? Absolutely!
PCAP data
Hands on
Now that we know where, why, and how
to collect PCAP data, let’s go do some
PCAP data
Doing analysis - Wireshark
Wireshark is your good old fashioned,
run of the mill, go-to, protocol analyzing,
packet capturing, file carving buddy.
Learn to love it.
PCAP data
Doing analysis - Wireshark
What we’ll be doing today
 Learning the layout of the interface
 Capturing PCAP data
 Looking at the structure of packets
 Filtering packets to find interesting things
 Following a TCP session
 Carving files
 Reading emails
PCAP data
Doing analysis - Wireshark
Sources for pcaps
 http://wiki.wireshark.org/SampleCaptures
 http://packetlife.net/captures/
 http://www.pcapr.net
 http://www.icir.org/enterprisetracing/download.html
 Your own machine
PCAP data
Doing analysis - Wireshark
So that’s Wireshark. Pretty nice, huh?
When it comes to finding out exactly how
your machine got pwned (aka owned,
pwnt, etc.), it’s pretty effective.
Also, the functionality of Wireshark can
be extended by coding up plugins and
decoders, and anything else you want.
It’s open source!
PCAP data
Doing analysis - Wireshark
But what if we don’t have time to do all
that poking about and sifting through
packets? Is there a better way to look
through a big pile of PCAP data?
I thought you’d never ask…
PCAP data
Doing analysis - Netwitness
What we’ll be doing today
 Learning the interface
 Importing some PCAP data
 Doing (almost) everything we just did in
Wireshark in less time than it took us before
 Catching things that we might have missed
PCAP data
Doing analysis - Netwitness
Netwitness is a tool for getting a quick
picture of what someone was doing on the
network, especially if you’re going after less
advanced threats, like insider threats or the
average criminal.
Currently there’s a freeware version and a
paid version. Give it a try next time you get
stuck during an investigation. Often you can
catch certain clues via the session based view
that you wouldn’t simply by digging through
PCAP data
Doing analysis – Other tools
In addition to sitting down and doing
deep dive analysis on PCAP data by hand,
we can also run it through automated
processes (sometimes even at line
speed!) to do all sorts of other stuff. This
is how firewalls and IDS work, after all.
Depending on the audience, this is
where we discuss our organization’s
custom tools 
PCAP data
Generating flow and alert data
Useful when someone hands you a big
wad of PCAP and you have no other data
 Can be done when you’ve got data from
before you fielded your flow monitoring or
alert generating apps (IDS, firewall, etc.)
 Makes analysis of large data sets easier
since it’s faster to look at coarse grained
 We’ll cover this when appropriate.
When you have PCAP you can see pretty
much everything.
It’s very heavy weight whenever you start
dealing with enterprise level networks.
It’s the only way you’ll see what’s being
said on the network, but it’s not as good
as flow or log/alert data for figuring out
what’s important to look at.
Day 1
Agenda and motivation
Intro to forensic data types
Working with PCAP data
 What it looks like
 How to interpret it
 How to get it
Working with flow data
 What it looks like
 How to interpret it
 How to get it
Day 2
PCAP and flow recap
Working with logs and alerts
 What they look like
 How to interpret them
 Getting them all in one place
 SIEMs and their familiars
Fielding a monitoring solution
Flow data
Things to keep in mind
This is easy data to get, so make sure you
Better used to figure out where to look, than
to figure out exactly what happened.
Even when you’re not on an investigation,
you should collect flow data to do baselining.
Visualization helps a lot.
Flow data
What is flow data?
There’s some variation, but generally a
record contains the following:
 Source and dest ip
 Source and dest port
 Protocol
 Start time + (duration | end time)
 # of packets
 # of bytes
 Directionality? Depends on format.
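To make the record layout concrete, here is a minimal sketch that rolls packet summaries up into unidirectional flow records keyed by the 5-tuple. The Packet and FlowRecord types and their field names are made up for illustration; they do not match any particular flow format.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical packet summary: timestamp, endpoints, protocol, length in bytes.
Packet = namedtuple("Packet", "ts src dst sport dport proto length")


@dataclass
class FlowRecord:
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    start: float
    end: float
    packets: int = 0
    bytes: int = 0


def build_flows(packets):
    """Group packets into unidirectional flows keyed by the 5-tuple."""
    flows = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        rec = flows.get(key)
        if rec is None:
            rec = flows[key] = FlowRecord(*key, start=p.ts, end=p.ts)
        rec.start = min(rec.start, p.ts)
        rec.end = max(rec.end, p.ts)
        rec.packets += 1
        rec.bytes += p.length
    return list(flows.values())
```

Note that the payload never makes it into the record, which is why flow data is so much lighter than PCAP.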
Flow data
Netflow v5 protocol
Source: caida.org/tools/utilities/flowscan/arch.xml
Flow data
Command line output
Flow data
Some types of flow records are unidirectional
(SiLK, rw tools), and others are bidirectional
(argus, ratools, original flow data).
Unidirectional flow data has a separate record
for both sides of the conversation. This is how
Cisco NetFlow v5, v9, and IPFIX records are
Bidirectional flow data combines both sides into
one record, usually having extra fields for “# of
sender packets”, “# of destination bytes”, and
other things that would get muddled by
combining two unidirectional flows.
Flow data
Depending on what you need, you can
convert between bidirectional and
unidirectional using whatever tool is
appropriate to your data set.
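As a rough illustration of what such a conversion does, the sketch below pairs each unidirectional record with its mirror image and keeps separate counters for each side. The dictionary keys are invented for this example, and it assumes the first record seen for a conversation belongs to the initiator, which a real tool would decide from timestamps or flags.

```python
def to_bidirectional(uni_flows):
    """Pair each unidirectional record with its reverse (swapped endpoints)."""
    merged = {}
    for f in uni_flows:
        fwd = (f["src"], f["sport"], f["dst"], f["dport"], f["proto"])
        rev = (f["dst"], f["dport"], f["src"], f["sport"], f["proto"])
        if rev in merged:
            # The other side of a conversation we have already seen.
            rec = merged[rev]
            rec["dst_pkts"] += f["pkts"]
            rec["dst_bytes"] += f["bytes"]
        else:
            rec = merged.setdefault(fwd, {
                "src": f["src"], "sport": f["sport"],
                "dst": f["dst"], "dport": f["dport"], "proto": f["proto"],
                "src_pkts": 0, "src_bytes": 0, "dst_pkts": 0, "dst_bytes": 0,
            })
            rec["src_pkts"] += f["pkts"]
            rec["src_bytes"] += f["bytes"]
    return list(merged.values())
```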
Flow data
Cutoff and Aging
Until conversations end, their flow data sits in the
router/switch/etc. memory, taking up space (DoS?).
So if we’ve got lots of very long lived flows or flows
that didn’t end cleanly (no FIN/ACK) we need to free up
that memory and write the flows.
For long flows, we have a configurable time (say 30
minutes) after which we write a record and start a
new one. Figuring out how long the flow actually was
will require massaging your data.
For broken flows, another cutoff time (maybe 15
seconds?) will clear them out.
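The two cutoffs can be sketched as a sweep over the flow cache. The timeout values are the example figures from the text; the cache layout and function name are assumptions made for illustration, not how any particular router does it.

```python
ACTIVE_TIMEOUT = 30 * 60   # seconds; the "say 30 minutes" cutoff for long flows
INACTIVE_TIMEOUT = 15      # seconds; the "maybe 15 seconds" cutoff for broken flows


def expire_flows(cache, now):
    """Remove and return flows that hit either timeout.

    cache maps a flow key to a dict with 'start' and 'last_seen' times.
    """
    exported = []
    for key in list(cache):
        flow = cache[key]
        if now - flow["last_seen"] >= INACTIVE_TIMEOUT:
            reason = "inactive"
        elif now - flow["start"] >= ACTIVE_TIMEOUT:
            reason = "active"
        else:
            continue
        exported.append((key, reason, cache.pop(key)))
    return exported
```

A flow exported for the "active" reason keeps going on the wire, so the next packet opens a fresh record. That is why a long conversation shows up as several records that have to be stitched back together.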
Flow data
When there’s too much traffic for your
switch, NIC, or whatever to handle,
sampling is used to throttle the workload.
Instead of every packet being recorded in a
flow (sample rate = 1 out of 1), we take 1
out of N packets, make flow records, and
then scale the appropriate values by N.
We will miss flows due to this, but for very
large throughputs it’s necessary. Also, N is
not always constant over time.
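A small sketch of the 1-out-of-N idea, assuming random sampling with a fixed N (real devices may sample deterministically or vary N). The function name and the returned keys are made up for this example.

```python
import random


def sample_and_scale(packet_sizes, n, seed=0):
    """Keep roughly 1 in n packets at random, then scale the totals by n."""
    rng = random.Random(seed)
    kept = [size for size in packet_sizes if rng.randrange(n) == 0]
    return {"est_packets": len(kept) * n, "est_bytes": sum(kept) * n}
```

With n = 1 the totals are exact. With larger n they are only estimates, and a flow consisting of a few packets can easily have none of them sampled, which is how flows get missed.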
Flow data
And then there are different formats…
Cisco NetFlow v5 and v9 are very common. v5 will
only do IPv4, though.
IPFIX is a lot like v9 plus some interesting fields.
It’s an open protocol put out by the IETF.
sFlow is hardware accelerated, forces sampling, and is
mainly an HP thing.
And there are others, but we’ll focus on v5/v9 and
Flow data
There isn’t a current standard for how to
store flow data on disk, so different
software suites will store it differently to
suit their search and compression
capabilities. Choose your software suite
based on what formats it can consume,
and be prepared to perform a conversion
if you switch.
Flow data
Switches and routers
 Flow data is gathered by the network
hardware, and then sent over the network to
one or more listeners.
 To set up collection and forwarding, look up
instructions particular to your device and the
revision of its OS (typically Cisco IOS).
 Remember, this is going over the network, so it
can be intercepted, falsified, or blocked by
attackers, outages, and misconfigurations!
Flow data
Machines on the network
 Creates flow data based on what network
traffic that machine can see.
 Can either generate flow data and forward it to
another collector, store it locally, or both.
 Also possible to collect flow data from other
machines or network hardware.
 Eventually your flow data will have to end up
somewhere. You want that somewhere to be
handy to your analysts.
Flow data
Analyzing with argus
Argus is another popular tool which is much
easier to deploy, so we’ll be using it to do
some sleuthing.
 Become familiar with a few of the tools
 Locate a scanning machine
 Detect beaconing
 Find activities by a compromised machine
 Find routing misconfigurations
Flow data
Capturing with SiLK
YAF – yet another flowmeter
 Produces IPFIX data from files or network
 Can write to disk or push out over network
 Lightweight, easy to install
 Works well with SiLK tools
Flow data
Capturing – consolidating in SiLK
 Part of the SiLK toolset
 Designed to receive input from multiple
sensors and build a consolidated repository for
 Just one of the pieces of a full sensor network.
Flow data
Analyzing with SiLK
SiLK tools
 Produced by CERT NetSA
 Relatively easy to use
 We’ve already been using them and have done
a decent amount of writing on how to use them
(check my transfer folder)
Flow data
SiLK tools - conclusion
 Free, very powerful, extensible, pretty easy to
 Command line tools are great for things that
we have running as daemons, but for
visualizing flow data we can find a better
interface. With the right tools, we can add
better visualization.
Flow data
Open source
 Afterglow + graphviz: cheap, but too much
work to set up
 Scrutinizer: quick and easy, consumes pretty
much any flow data, free version is limited to
24 hours of data
 Lynxeon: belongs in the SIEM category,
visualization tool is worth a mention though,
60 day trial
Flow data
TONS more
Source: plixer.com, vizworld.com, networkuptime.com
Flow data
Continuing research
Flowcon, Centaur Jam, etc.
 Come join us!
 Share your tools!
Statistical anomaly/group detection
 Complicated math
 New-ish technology, but worth a look if you’ve
got a pile of netflow data that you’re sitting on.
Most granular data we can collect
Takes a lot of resources to gather
Great for finding out how machines got
Bad for figuring out what’s going on
Can be converted into flow and alert data
with the right tools
Info about conversations on the network
 Cheap and easy to collect
 Quick to analyze with the right tools
 Different analysis suites, formats
CTF Forensics
Jim Irving
• Forensics in a CTF Context
• Network forensics tools
• Host based forensics tools
Working in a CTF environment
Unlike a typical forensic investigation, a CTF will
always be down and dirty with as close to 0
rules as you can get.
You also care a lot more about speed than you
do accuracy. There isn’t a court case going on
Working in a CTF environment
Forensics generally requires that you know an
awful lot about the underlying systems, but in a
CTF there are tons of systems that you don’t
have the time to learn.
For that reason we’ll be focusing on tools that
you can use where the learning has already
been done for you.
Possible scenarios
• Each team has a server VM that has to be
• Teams get a VM or disk image of a
compromised machine to be analyzed
• Points are awarded for a series of web
• Crazy mess, like half the team has to play
Team Fortress or something…
For reals forensic challenges
If you’re given VMs or disk images to analyze, then you’ll
be doing a lot of host based analysis.
Invariably, I cannot teach you what you need to perform a
comprehensive analysis in 8 hours.
So we’re going to focus on the things that at least help
you figure out where to start googling. When you do
google, you’ll probably be looking for free tools or scripts
to perform a particular task. When competing with more
intelligent opponents, rapidly acquire and use tools to
level the playing field.
Forensics as a support class
In challenges where you have machines that must be
protected while you attack others, forensics allows you to see
what’s happening.
You’ll have access to network traffic, system logs, and
whatever else you can get visibility to within the system.
The purpose of using your forensic tools in this context will be
• See where attacks are coming from so you can set up
• See what kinds of attacks are being used so you can see
what works and figure out how to neutralize it (stealing
ideas is a legit strategy)
How the class is divided
• Look at several network tools and practice
deploying them quickly
• Look at freely available host based forensics
and practice common techniques
Disk dumping
Memory analysis (Volatility)
File carving/recovery
Phase 1: Network stuff
Something to think about when dealing with network
traffic is where it is coming from and who can see it.
If you have a VM to protect, you will at least be able to
see the traffic going into and out of it. In this situation
you want to be getting full packet capture to that
machine. If possible it would be nice to get that data
passively so that if you get owned the attacker won’t be
able to see what you’re using for defense. Also you won’t
have to set up tools on the machine you’re protecting.
Phase 1: Network stuff
If you’re all attacking the same thing, like web
challenges, you probably won’t have a good way
to monitor data going into and out of the victim
server. If you can, that might actually be
considered cheating. If not, you should
definitely do it.
Setting up your sensors
Assuming we’re defending a VM, and that we’re
passively monitoring traffic, we’re going to set up
the following tools:
• Wireshark – for keeping a packet level eye on
certain types of data
• Snort – for alerting us when certain things
• Argus – for quickly sifting through connection
• And maybe some other things
Setting up your sensors
We’re going to start with the SIFT Workstation
because it’s already got most of the tools on it.
Once you get it downloaded, install argus like so:
sudo apt-get install argus-client
And the password is forensics
Setting up your sensors
Now we need to clean up snort. Open up
/etc/snort/snort.conf (you’ll need to sudo).
Now find the section that says “preprocessor
http_inspect_server” and make the following
preprocessor http_inspect_server: server default \
profile all ports { 80 8080 8180 } oversize_dir_length 500 \
server_flow_depth 0 \
client_flow_depth 0 \
post_depth 0
Additionally, any port that you think might have
http going over it should be added to the brackets.
Setting up your sensors
Once you’ve changed the preprocessor block,
scroll down to the bottom of the file and
comment out all the “include something.rules”
lines EXCEPT local.rules.
Now save and close.
Now to practice…
While you are here, before leaving this page,
turn on your VMs and make absolutely sure that
you can do all of the following.
• Start wireshark and see packets
• snort -A console -c /etc/snort/snort.conf
• argus -d -e 'localhost' -w argus.log
• Anything else you think you might use, like
tcpdump, netcat, telnet, FIREFOX  serious
Uses for wireshark
Wireshark is going to serve as your real time,
packet level visibility tool. You want it to let you
know when very specific types of traffic occur
and allow you to quickly inspect them. To this
end you’ll probably want to have more than one
instance of it running, tiled across the screen,
with some very specific capture filters in place.
If(derp what’s a capture filter?)
Uses for wireshark
Let’s practice this now. We’re going to set up
the following capture filters AND TEST THEM:
• DNS traffic out of the server
• HTTP requests into the server
• HTTP responses from the server with error
• FTP traffic into or out of the server
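For reference, capture filters use the BPF syntax that tcpdump uses. The sketch below builds candidate filter strings for three of the cases above. It assumes the services sit on their default ports (53, 80, 21), covers only the FTP control channel, and uses a made-up function name; the server address is whatever your VM has.

```python
def capture_filters(server_ip):
    """Return BPF capture filter strings for a few of the cases above."""
    return {
        # DNS traffic leaving the server (UDP or TCP port 53)
        "dns_out": f"src host {server_ip} and port 53",
        # HTTP requests arriving at the server
        "http_in": f"dst host {server_ip} and tcp dst port 80",
        # FTP control traffic in either direction
        "ftp": f"host {server_ip} and tcp port 21",
    }


if __name__ == "__main__":
    for name, expr in capture_filters("10.0.0.5").items():
        print(f"{name}: {expr}")
```

The error-response case is left out on purpose. Capture filters match on headers and byte offsets, so picking out HTTP status codes is usually easier with a display filter such as http.response.code >= 400 applied on top of a broader capture filter.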
Uses for wireshark
SAVE THESE FILTERS!!! You will forget them when
the time comes and it is far better to have them
ready to go and a snapshot in place.
Can you think of any more filters that might be
Once you’ve got a good set of filters that prove
useful in the field, please consider making them
publicly available.
Snort is going to be particularly useful when
defending a VM. From your passive monitoring
box it allows you to make very specific filters for
alerts. From your vulnerable server it gives you
something with which to filter and drop packets.
That means you’ll need to learn to write some
(Start on page 130. Careful, it’s not always right.)
The rules that you write will go into
Whenever you write a new rule you’ll need to save
the file, and restart snort by re-running the
command that started it.
We’re going to write a few rules to get the hang of
the syntax and then try to make actual useful rules
that you can modify and use in a competition.
Snort rule syntax
Snort rules, at their most basic, look like this:
action protocol src_ip src_port direction dst_ip
dst_port (rule options)
Here’s one with the information filled in:
alert tcp any any -> any 80 (msg: "text goes here";
sid: 10001; rev:1;)
Inside the rule options, the msg, sid, and rev are
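If you end up writing many rules under time pressure, a tiny helper that fills in the header-plus-options layout can cut down on typos. This is only a sketch: build_rule is a hypothetical name, and it handles just the msg, sid, and rev options shown in the example.

```python
def build_rule(action, proto, src_ip, src_port, dst_ip, dst_port, msg, sid, rev=1):
    """Assemble a rule in the header-plus-options layout."""
    header = f"{action} {proto} {src_ip} {src_port} -> {dst_ip} {dst_port}"
    options = f'(msg:"{msg}"; sid:{sid}; rev:{rev};)'
    return f"{header} {options}"


if __name__ == "__main__":
    print(build_rule("alert", "tcp", "any", "any", "any", 80,
                     "text goes here", 10001))
```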
Snort rule syntax
Unfortunately, the syntax for snort rules is really
really expansive, the documentation has a lot of
deprecated stuff, and it’s really hard to find a
good quick tutorial. So here’s a link to the
shortest, closest to right one that I’ve found. It’s
linked here since we certainly can’t reprint it and
it’s way too big to recreate. Bookmark this on
your analysis machine.
Snort rule syntax
Now let’s try writing some rules using the
documentation that we have.
• Alert on http get to the server with “../” in the
• Drop on tcp traffic to port 22 on the server
(it’s like iptables, but completely different)
• Alert on http response from server containing
the following byte string “E3 80 04 32 54”
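For the byte string case, Snort's content option takes raw bytes as hex between pipe characters. The sketch below formats that option and shows what a bare content match boils down to, a substring search over the payload. Both function names are made up for this example.

```python
def hex_content(byte_string):
    """Turn 'E3 80 04 32 54' into a content option using |..| hex notation."""
    raw = bytes.fromhex(byte_string)              # also validates the hex digits
    spaced = " ".join(f"{b:02X}" for b in raw)
    return f'content:"|{spaced}|";'


def payload_matches(payload, byte_string):
    """Plain substring test, which is what a bare content match amounts to."""
    return bytes.fromhex(byte_string) in payload
```

The option string would go inside the parentheses of a rule alongside msg, sid, and rev.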
Quick snort review
Snort on your sensor for seeing very specific
attacks. Write rules that match the type of traffic
that the attackers will need to hit your services.
Snort on your server to drop packets that look
scary as an impromptu IPS. Can also be used to
deny scouting information by intercepting
outbound error messages (if you perhaps can’t
turn off debugging output).
Using argus in a hurry
Argus is a netflow collector/analyzer. Typically
netflow is only useful when you have a lot of it, but
there are a few things that can make it useful even
in a quick setting.
Netflow is a record of all the conversations that go
across a network. It doesn’t contain all the
contents of the packets. It’s helpful when you want
to see the forest and not concentrate on the trees,
so to speak.
Using argus in a hurry
Whenever you start your sensor box, start argus in
daemon mode and have it write out to a file (argus
-d -w /somefile.log). This is going to keep writing
to a file, and you’re going to analyze that file using
ra reads the output of argus and will print its
contents in different ways depending on what
parameters you pass it, and there are a lot of
Using argus in a hurry
ra commands look like this:
ra (bunch of options) - (filter expression)
The options tell ra how to display the data,
modify the data, and a little bit of how to filter
the data (like by time), and the filter expression
part tells ra what data you want (or more
specifically what you don’t want). That’s a
hyphen surrounded by spaces in between them.
It’s necessary.
ra options
Here are some of the important options to use:
• r = which file to read, this is your log file
• t = time range to look in, will search all if omitted
• s = which fields to print, will print a default set
So the first half of a call to ra that looks through the
last 15 minutes and prints only the source address, direction,
and destination address looks like this:
ra -r /somefile.log -t -15m -s saddr dir daddr -
ra filter expressions
You can omit the filter expression and ra will just
print everything. There are a lot of these but here
are the really useful ones
• tcp = only tcp traffic
• src port ## or dst port ## or just port ## = only
traffic on the specified port
• host (hostname) = only traffic involving the
specified host, can be prepended with src or dst
These can be combined using the keywords “and”
and “or” to make compound statements. You can
also prepend anything with “not” to negate it.
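Putting the options and the filter expression together, a small helper can assemble the command line so the lone hyphen separator never gets forgotten. This is a sketch only: ra_command is a hypothetical name, it just builds the string (it does not run ra), and it uses only the -r, -t, and -s options described above.

```python
import shlex


def ra_command(log_file, fields, filter_expr="", time_range=None):
    """Build an `ra <options> - <filter>` command line as a string."""
    cmd = ["ra", "-r", log_file]
    if time_range:
        cmd += ["-t", time_range]
    cmd += ["-s", *fields, "-"]        # the lone hyphen separates options from filter
    cmd += filter_expr.split()
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    print(ra_command("/somefile.log", ["saddr", "dir", "daddr"],
                     "tcp and dst port 80", time_range="-15m"))
```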
A lot of the time the output of ra isn’t going to help
much, so what you do is right after the ra
command, put the following:
| rasort -m (keyword) | less
Where keyword is typically bytes or pkts or
something. For more keywords just do “man
rasort”. Same thing goes for “man ra”.
Playing with argus
So now that you’ve got a little idea of how to
use argus we’ll try a few.
• List all hosts that are talking and sort them by
sent packets
• List all hosts that were talking to the
vulnerable server in the last 5 minutes.
• List all the hosts that the top host from the
previous example talked to in the past hour.
Using argus to greatest effect
Argus is mostly going to help you see who’s
talking to you and sort them by how and when
they’re doing that talking. When you detect
that you’re getting attacked, check who’s been
talking to you a lot recently. Check every so
often for high numbers of low packet flows for
evidence of scanning. Also every so often
search for everything but what you expect to
see so that you’ll see what you’re not expecting.
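The "high numbers of low packet flows" check can be sketched as a simple count per source. This assumes the flow records have already been parsed out of the ra output into tuples; the function name and both thresholds are arbitrary illustrations, so tune them to your traffic.

```python
from collections import Counter


def likely_scanners(flows, max_pkts=3, min_flows=50):
    """Count low-packet flows per source and return the noisy sources.

    flows is an iterable of (src_addr, dst_addr, dst_port, packet_count).
    """
    small = Counter(src for src, _dst, _port, pkts in flows if pkts <= max_pkts)
    return [(src, n) for src, n in small.most_common() if n >= min_flows]
```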
Network summing up
So you’ve got snort running all the time and someone on
defense writing filters specific to whatever service that
you’re running. That’s going to tell you when you’re
getting attacked. You’ve also got wireshark up and
running so that when you see an attack you can quickly
look at the packets and the tcp stream and figure out
what’s going on and improve the filters. Finally you’ve
got argus running so that once you calm down from the
attacks you can see who all is attacking you and maybe
even who else they’re talking to. In addition to that,
you’re checking it every so often to see how and when
you’re getting scanned. There you go. That’s how you
play defense.
Phase 2: Host based forensics
We’re going to assume that you know how to
look for rootkits, so you can find hidden
processes and hidden things in the registry.
What we’ll concentrate on is the forensic
investigative techniques and not necessarily the
defensive techniques.
First things first…
The first thing you need to do in most situations
is take images of memory or disk. Here’s what
you’ll most likely see:
• A virtual machine
• A disk image of a virtual machine
• A busted usb key
• A machine accessible only across the net
Imaging a virtual machine - memory
I’m going to focus on VMware here, FYI. Taking
images of VMs is a particularly entertaining
thing to do and there are a couple of tricks.
To get a physical memory image, snapshot the
machine and the *.vmem that gets created is
the best memory image you can get. Better
than you can even get by running apps on the
Imaging a virtual machine -disk
Disk imaging however is not as easy. I have found no reliable means to
go from .vmdk to a raw image, so you want to do the following.
1. Make an iso containing MoonSols community edition
(http://www.moonsols.com/windows-memory-toolkit/), dd.exe
(http://www.chrysocome.net/dd), and a statically linked version
of dcfldd (http://dcfldd.sourceforge.net/).
2. Make a directory on your analysis machine and share it with the
VM (as a web share for Windows VMs).
3. For Windows, run “dd.exe --list” and pick which device looks
right, then run “dd.exe if=(thing from --list) of=(path to
a file on shared folder)”. For Linux, run “mount” and see
what’s mounted on /, and then run “dcfldd if=(/dev/whatever)
of=(path to file on share) bs=512”.
4. Profit.
Imaging a disk image
It’s a disk image, dummy.
But seriously, if you want a particular type of
memory image, or to boot from it, create a new
VM using the image as the disk, and then
snapshot the VM once it’s running and use the
.vmem as the memory image.
Imaging a usb key
Plug it into a linux machine, figure out which
device it is, and then run the same dcfldd
command from before, but with the if
parameter set to the usb device. dcfldd will
continue on errors and so will be effective even
for damaged drives.
Imaging across a network
On your analysis machine install netcat (nc), gzip,
and dcfldd. Then run the following:
nc -l -p (some port) | gzip -dfc | dcfldd
of=(some filename)
And then on your source machine install all of those
and run the following:
dcfldd if=(drive to image) | gzip -cf | nc (ip
of analysis machine) (that port) -q 30
This will pump the disk image across the network,
gzipped, and then decompress it on the other end and
write it to a file.
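If it helps to see what the two pipelines do to the data, here is a minimal in-memory sketch of the same compress, send, decompress flow using Python's zlib in gzip mode. The function names are made up, and the "network" is just a byte buffer; with real sockets you would pass file-like wrappers instead.

```python
import io
import zlib

CHUNK = 64 * 1024


def send_compressed(src, sink):
    """Read src in chunks and write a gzip stream to sink (the sender side)."""
    comp = zlib.compressobj(wbits=31)          # wbits=31 selects the gzip container
    for chunk in iter(lambda: src.read(CHUNK), b""):
        sink.write(comp.compress(chunk))
    sink.write(comp.flush())


def receive_decompressed(src, sink):
    """Read a gzip stream from src and write raw bytes to sink (the receiver side)."""
    decomp = zlib.decompressobj(wbits=31)
    for chunk in iter(lambda: src.read(CHUNK), b""):
        sink.write(decomp.decompress(chunk))
    sink.write(decomp.flush())


if __name__ == "__main__":
    image = bytes(1024 * 1024)                 # stand-in for a mostly empty disk
    wire, restored = io.BytesIO(), io.BytesIO()
    send_compressed(io.BytesIO(image), wire)
    wire.seek(0)
    receive_decompressed(wire, restored)
    print(len(image), "bytes in,", len(wire.getvalue()), "bytes on the wire")
```

Working in chunks matters because a disk image will not fit in memory, which is the same reason the shell version streams through pipes.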
Things to do with memory images
You’ll learn a lot more about this from your rootkit
class, but I’ll put some of this in brief for quick and
dirty analysis.
On your analysis machine, install python and
download Volatility (use tortoiseSVN to pull it from
tility-1.4_rc1). Make sure python is on your PATH,
and then go drop your memory image wherever
Volatility is sitting.
Volatility quick and dirty
Once you’ve got all that set up, here are things to do.
Prepend all of these commands with “python volatility”
and append them with “-f (path to image file)”
1. Run both pslist and psscan3
2. Find any process that’s in one and not the other.
3. Run filescan, connscan, regkeys, sockscan, or anything
else and look very closely at anything that’s reported
for the mismatched processes
4. Run memdump -p (pid of process) -o (offset of
process) --output-file=(some filename)
5. Run strings on that output file and start googlin’.
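Step 2 is a set comparison once you have the PIDs from each listing. The sketch below assumes you have already pulled the PID columns out of the two outputs yourself; hidden_pids is a hypothetical helper, not part of Volatility.

```python
def hidden_pids(pslist_pids, psscan_pids):
    """Return PIDs that show up in only one of the two listings."""
    listed, scanned = set(pslist_pids), set(psscan_pids)
    return {
        # Found by scanning memory but missing from the process list:
        # could be hidden (unlinked) or simply already exited.
        "only_in_psscan": sorted(scanned - listed),
        "only_in_pslist": sorted(listed - scanned),
    }


if __name__ == "__main__":
    print(hidden_pids([4, 500, 620], [4, 500, 620, 1337]))
```

A PID that appears only in the scan is not automatically malicious, since exited processes linger in memory too, but it tells you where to point the tools in step 3.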
Finding deleted files
For this class we’re going to focus on finding deleted files
in disk dumps. I’m not going to discuss trying to find files
hidden by rootkits, since rootkits ought not to be running
on your analysis machine. To accomplish this, we’ll be
focusing on using the SANS SIFT 2.0 VM. It’s got
Sleuthkit, PTK, and a bunch of other things already
installed on it.
There’s also a ton of video tutorials for doing this sort of
thing on www.sans.org. You should certainly take a look
at them since we’ll be doing it very very quickly here.
Finding deleted files
1. Obtain your disk image (not memory image, they don’t
work well for this) as previously directed.
2. Make it available on your analysis machine somehow.
3. Open PTK or Autopsy from the VM.
4. Open a new case, fill out data, whatever.
It more or less really is that easy. In fact, the methodology for
getting timelines of file activity. Now we’re going to do that
with as many sample disk images as I can bring. This is our
easy win, folks.
Host based summation
So when you get a forensics challenge, you’re
going to do the following things:
1. Get images of memory and disk.
2. Run volatility on the memory image.
3. Load the disk image into PTK and look for
files, make a timeline, and do anything else
that looks shiny in the UI.
4. Profit more.
Wrap up
So that’s the gist of it. You have now used several
tools, have a working analysis machine, and have
commands that will get you some of the things
you’ll need.
The toolkit that you have is popular and videos of it
being used are all over youtube. If I didn’t cover it,
you shouldn’t have any trouble finding someone
who has.
Go forth, and conquer.
