Botnet Behavior and
Detection Strategies
Brad Wilder
 What are botnets?
 What security threats are associated with botnets?
 How prevalent are botnets?
 What strategies can we use to detect and contain botnets?
Page  2
What Are Botnets?
Basic Definitions
 Botnets are networks of malware-infected machines that can be
controlled by a remote adversary. They consist of:
– Bots: malware-infected machines
– Botmaster (aka bot herder): the attacker who controls the network
– Command and Control (C&C) channel: the communication channel over which
the botmaster communicates with and issues commands to the bots
– Bot client: the particular malware on which the bot is based
 The malware may be a virus, worm, Trojan horse, spyware, rootkit, or any
other malicious/unwelcome software
Page  3
What Are Botnets?
Basic Architecture
Page  4
Botnet Security Threats
Information Infiltration
 Intellectual property and personal information theft
– Trade secrets
– Military intelligence
– Banking credentials
– Usernames and passwords
– Information on personal preferences and habits
 Key logging
 Phishing/man-in-the-browser
 Forms the basis for massive copyright infringement and identity theft
Page  5
Botnet Security Threats
Information Infiltration (cont’d)
 Stone-Gross et al. hijacked the Torpig botnet, used for spam and phishing
attacks, for 10 days in early 2009, during which time they collected:
– ~70GB of data from more than 180,000 victims
– 8310 financial account credentials at 410 different institutions
– ~300,000 username-password pairs from 52,540 different infected machines
 28% of the victims reused the same credentials at other Web sites,
exposing another 368,501 Web site accounts
 Mariposa botnet, taken down in Spain in March
– Sensitive information from 800,000 users, including half of the Fortune 1000
companies and more than 40 of the world’s major banks
Page  6
Botnet Security Threats
 Distributed Denial of Service attacks
– Denial of Service attacks involve flooding a server with so much traffic that it
crashes due to the unexpected load
– Botnets distribute the workload among many bots
– Can be used to take down critical infrastructure
– Also used in extortion plots
– DDoS attacks are far more difficult to stop than DoS attacks, since blocking one
IP address does not stop the others
 Torpig had an aggregate bandwidth of 17Gbps without factoring in
corporate networks, which accounted for 22% of the total
Page  7
Botnet Security Threats
 95% of all spam is thought to originate from botnets
 Spam represents 90% of all email traffic
 160 billion spam messages per day!
 Spam is not just irritating; it causes noticeable effects for the end user
– Slows connection speed
– Can steal contact information from your email inbox
– Is a conduit for spreading infections
 Spam is virtually free to send, but costs time for the recipient to sift through, and
even more if a malware payload is delivered successfully
Page  8
Botnet Security Threats
Cyber Attack Sophistication vs. Cyber Criminal Sophistication
[Figure: chart of attack sophistication vs. attacker sophistication. Labeled
techniques: password guessing, self-replicating code, password cracking,
exploiting known vulnerabilities, disabling audits, back doors, burglaries,
hijacking sessions, network mgmt. diagnostics, denial of service, automated
probes/scans, www attacks, attack tools, packet spoofing, “stealth” / advanced
scanning techniques, cross site scripting]
Source: CERT
How Prevalent Are Botnets?
Size Estimation Is Difficult
 Botnet footprint: aggregate number of bots under the botnet’s control
 Botnet live population: number of bots simultaneously under the botnet’s
control
 There is no clear way to measure the size of a botnet
– Analyzing DNS traffic, looking for bots locating a C&C server, or querying DNS
blacklists to see if they have been flagged
– Redirecting C&C traffic into sinkholes
– Infiltration of the botnet C&C server
 Most methods rely on counting bot IDs or IP addresses
Page  10
How Prevalent Are Botnets?
Size Estimation Is Difficult (cont’d)
 Bot IDs can be changed at the whim of the botmaster, and may be inflated
to make the botnet appear larger
 IP addresses do not represent a one-to-one relationship with machines
– One of the shortcomings of IPv4
– DHCP: dynamic allocation of IP addresses; the same machine may appear under
several IP addresses over time, which inflates the size estimate
– NAT: allows multiple machines on the same private network to share one
public IP address, which deflates the size estimate
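The two effects above can be seen in a minimal sketch. The record layout (a bot ID paired with a source IP), the `Contact` type, and the sample addresses are all assumptions made for illustration; real sinkhole logs will look different.

```python
from collections import namedtuple

# Hypothetical log record: one C&C contact seen by an observer.
Contact = namedtuple("Contact", ["bot_id", "source_ip"])

def count_estimates(contacts):
    """Return (unique_ips, unique_bot_ids) for a list of contacts."""
    ips = {c.source_ip for c in contacts}
    ids = {c.bot_id for c in contacts}
    return len(ips), len(ids)

# DHCP churn: one bot appears under three addresses over time.
dhcp_log = [
    Contact("bot-a", "198.51.100.10"),
    Contact("bot-a", "198.51.100.27"),
    Contact("bot-a", "198.51.100.63"),
]

# NAT: three bots share one public address.
nat_log = [
    Contact("bot-b", "203.0.113.5"),
    Contact("bot-c", "203.0.113.5"),
    Contact("bot-d", "203.0.113.5"),
]

print(count_estimates(dhcp_log))  # IP count is larger than the bot count
print(count_estimates(nat_log))   # IP count is smaller than the bot count
```

Counting by bot ID avoids both distortions only when the IDs themselves can be trusted, which the first bullet on this slide warns is not always the case.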
Page  11
How Prevalent Are Botnets?
Sizes May Be Overrepresented
 Sizes are often erroneously reported
 Mariposa botnet was widely reported to have claimed more than 12 million
machines
– Original quotation indicates 12 million IP addresses
– Still must have compromised hundreds of thousands and possibly millions of
machines
– What is most surprising is the botmasters’ utter lack of proficiency
Page  12
How Prevalent Are Botnets?
Torpig Case Study
 Stone-Gross et al. showed that IP address information may give a basis
for estimating the size of a botnet
– Over 10 days, they observed 1.2 million IP addresses
– Determined later that the botnet had a footprint of 182,800 bots
– Estimated an average live population of ~49,000 bots, based on the rate of new
IP addresses used
– Found that the IP address count represented about an order of magnitude
overrepresentation of the botnet footprint
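As a quick check of the figures above, the ratio of observed IP addresses to the footprint can be computed directly. The three numbers come from this slide; the variable names are only illustrative.

```python
observed_ips = 1_200_000   # unique IP addresses seen over 10 days
footprint = 182_800        # bots in the footprint
live_population = 49_000   # approximate average live population

# How many IP addresses were seen per actual bot.
ips_per_bot = observed_ips / footprint

# What share of the footprint was online at a typical moment.
live_fraction = live_population / footprint

print(f"IP addresses per bot: {ips_per_bot:.1f}")
print(f"Live share of footprint: {live_fraction:.0%}")
```

The ratio works out to roughly 6.6 IP addresses per bot, which is the overcount that the slide rounds to "about an order of magnitude".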
Page  13
How Prevalent Are Botnets?
Total Number Of Bots In Operation
 The difficulty in estimating the size of a single botnet further compounds
the difficulty of quantifying the entire botnet problem
 There may be significant overlap among botnets, leading to
 Current estimates diverge widely
– Very conservative estimates put the total number in the hundreds of thousands
– More convincing estimates put the total number in the millions or tens of
millions, spread across perhaps thousands of botnets
Page  14
What Can Be Done
Prevention Strategies
 Difference between prevention and detection
– Prevention involves stopping the spread of malware
– Detection is a reactive approach
– Only 46% of computer users always update their AV software
– 30-60% of users have little to no knowledge about basic security issues
– Almost half of users that open spam do so intentionally
 Zero-day viruses
– 20% of malware is not detected in the best of cases
Page  15
What Can Be Done
Basic Detection Strategies
 Blacklisting domain names or IPs that exhibit problematic behavior
– Use honeypots: software traps into which malware can be lured
– Spam boxes can be used to study spam behavior
 Early (naive) attempts were overly simplistic
– Listening on particular ports: these are often just a suggestion
– Examining packet contents: doesn’t work if the transmission is encrypted, or if
the bot commands are not known ahead of time
Page  16
What Can Be Done
IRC-Based Detection Strategies
 Traffic analysis seems to be the most promising method for detecting
botnet C&C activity
 Strayer et al. showed how a pipeline of successive filters could be used to
distill network traffic
 They started with a base pool of over 9 million traffic flows collected over
a 4-month period (>600GB of TCP/IP header information alone), and added to
this 42 botnet C&C flows they generated with a bot under their control
over the course of hours
Page  17
What Can Be Done
IRC-Based Detection Strategies (cont’d)
Page  18
What Can Be Done
IRC-Based Detection Strategies (cont’d)
Page  19
What Can Be Done
IRC-Based Detection Strategies (cont’d)
 Classifier stage is used to group flows into classes of communication
– Interactive
– Bulk data transfers
– Streaming
– Transactional
 Even though this seemed promising, the researchers omitted this stage in
their implementation because there were too many false negatives
Page  20
What Can Be Done
IRC-Based Detection Strategies (cont’d)
 Correlator stage attempts to take flows that occur very close to each other
according to some metric and correlate them
 Want to find flows that are the product of similar applications, that
demonstrate a causal relationship with one another, and that follow the
multicast model of communication
 Associate with each flow a vector quantifying certain metrics
– Based on temporal qualities in their implementation
 Group flows pairwise and plot each pair against the distance between the
two flows’ vectors
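The pairwise grouping can be sketched as follows. This is not the researchers' implementation: the two features per flow, the sample values, the Euclidean metric, and the threshold are all assumptions chosen only to show the idea of comparing flow vectors by distance.

```python
from itertools import combinations
from math import dist

# Hypothetical per-flow vectors of temporal features,
# e.g. (mean gap between packets in seconds, bursts per minute).
flows = {
    "flow1": (0.50, 12.0),
    "flow2": (0.52, 11.5),
    "flow3": (4.00, 1.0),
}

def pairwise_distances(vectors):
    """Return [(distance, name_a, name_b)] sorted by increasing distance."""
    pairs = [
        (dist(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    ]
    return sorted(pairs)

def correlated_pairs(vectors, threshold):
    """Keep only the pairs whose vectors lie within the threshold."""
    return [(a, b) for d, a, b in pairwise_distances(vectors) if d <= threshold]

for d, a, b in pairwise_distances(flows):
    print(f"{a} vs {b}: distance {d:.2f}")

print(correlated_pairs(flows, threshold=1.0))
```

In practice the features would need to be scaled to comparable ranges before measuring distance; otherwise the feature with the largest numeric range dominates the result.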
Page  21
What Can Be Done
IRC-Based Detection Strategies (cont’d)
Page  22
What Can Be Done
IRC-Based Detection Strategies (cont’d)
 Topological Analyzer stage attempts to find a common node that would
indicate the C&C channel
 The researchers were able to identify 9 out of 10 bots, and find the C&C
 Confirmed their original hypothesis that IRC-based C&C flows are highly
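One simple way to look for a common node is to count how many correlated flows each host takes part in. The sketch below assumes each flow is reduced to a (source, destination) pair; the addresses and the function name are hypothetical, and the researchers' actual analysis may differ.

```python
from collections import Counter

def most_common_endpoint(flows):
    """Return (host, flow_count) for the host that appears in the most flows.

    flows is an iterable of (src, dst) pairs. Returns None if it is empty.
    """
    counts = Counter()
    for src, dst in flows:
        # The set ensures a host is counted once per flow.
        counts.update({src, dst})
    if not counts:
        return None
    return counts.most_common(1)[0]

# Hypothetical correlated flows: three internal hosts talk to one outside host.
correlated = [
    ("10.0.0.11", "192.0.2.50"),
    ("10.0.0.12", "192.0.2.50"),
    ("10.0.0.13", "192.0.2.50"),
    ("10.0.0.11", "10.0.0.12"),
]

print(most_common_endpoint(correlated))
```

A host shared by most of the correlated flows is a candidate rendezvous point, though a popular legitimate server can look the same, so the result is a lead rather than proof.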
Page  23
What Can Be Done
Fast-Flux And Domain Flux
 Fast-flux: bots query a given domain looking for the C&C server, but the
domain is mapped onto a set of constantly changing IP addresses
– Researchers have combated this since there is a single point of entry
– Based on DNS traffic similarity
– The method was strongly affected by experimental parameters, and dependent
on blacklists
 Domain flux: the domain names themselves change; each bot contains a
domain generation algorithm
– Stone-Gross et al. countered this when hijacking Torpig
– The arms race is stacked against us; it is not scalable
– This technique is used by Conficker to generate 50,000 domains a day
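To make domain flux concrete, here is a toy domain generation algorithm. It is not the algorithm used by Torpig, Conficker, or any real bot; the seed, the hash construction, and the reserved `.example` suffix are assumptions for illustration. The point is that the output depends only on shared inputs such as the date, so a defender who has recovered the algorithm can compute the same list ahead of time.

```python
import hashlib
from datetime import date

def toy_dga(seed, day, count):
    """Return `count` pseudo-random domain names for the given day."""
    domains = []
    for i in range(count):
        # Same seed, date, and index always hash to the same name.
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        domains.append(digest[:12] + ".example")
    return domains

# A defender who knows the algorithm can list tomorrow's names today.
names = toy_dga(seed="demo", day=date(2009, 1, 25), count=5)
for name in names:
    print(name)
```

The scalability problem on this slide follows directly: at tens of thousands of candidate names per day, registering or blocking every one of them in advance becomes impractical.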
Page  24
What Can Be Done
Current And Future Trends
 P2P botnets: distributed architecture makes them much more resilient
– Probably best countered with traffic flow analysis, but this is an area of intense
research
 Smaller size
 More bandwidth
 Decentralized C&C channels
 Advanced customized encryption
 IP disguising
Page  25
 What are botnets?
 What security threats are associated with botnets?
 How prevalent are botnets?
 What strategies can we use to detect and contain botnets?
Page  26
Thank You
Page  28

similar documents