Chapter 3
Virus Definition
 Recall definition from Chapter 2…
 Self-replicating: yes
 Population growth: positive
 Parasitic: yes 
 When executed, tries to replicate
itself into other executable code
o So, it relies in some way on other code
 Does not propagate via a network
 Three parts to a virus
 Infection mechanism --- how it spreads
o Multipartite virus uses multiple means
 Trigger --- decides when/how to deliver payload
 Payload --- what it does other than spread
o Either intentional or accidental
Virus Pseudocode
 Without infection mechanism…
o It’s not a virus, it’s a logic bomb
 But trigger and payload are optional
 Generic virus pseudocode
def virus():
    infect()
    if trigger() is true:
        payload()
Infection Pseudocode
 Targets must be “local”
 Don’t select already infected targets
o Can be a double edged sword
def infect():
    repeat k times:
        target = select_target()
        if no target:
            return
        infect_code(target)
Virus Classification
 Possible to classify in many ways
 Here, we classify in 2 ways:
 Target
o What/where does the virus infect?
 Concealment
o What does it do to remain undetected?
Classification by Target
 Briefly consider 3 cases
 Boot-sector infectors
 Executable file infectors
 Data file infectors
o Macro viruses
Boot Sequence
 Generic boot sequence
Power on
ROM-based instructions run
o Self-test, device detection, initialization
o Boot device identified, boot block read from it
o Control transferred to the loaded code --- this step is known as primary boot
Boot Sequence Continued
Code loaded in primary boot step
loads larger, fancier program
o This is secondary boot
Secondary boot loads/runs OS kernel
Boot Sector Infector
 Why infect boot sector?
 A boot-sector infector (BSI)
o Infects by copying itself to boot block
 May copy boot block elsewhere
o Could be tricky, require lots of code
o So a fixed “safe” location chosen
o Different viruses may use same “safe” location (e.g., Stoned and Michelangelo)
Boot Sector Infector
 BSIs once popular, not so much now
 Why?
o Machines don’t reboot so often
o Much harder to infect, due to better
Multiple Infections
File Infectors
 OS views some files as executable
o Like “exe” and similar
 Files that can be run by a command-line "shell" also considered executable
o Batch files, shell scripts, …
 File infector --- infects executable file
o Exe files and shell scripts both count as executable
o Binary executable is most common target
File Infectors
 Two main issues…
1. Where to put the virus within file?
2. How to execute the virus when infected file is run?
 Consider these two (interrelated) questions in next few slides
Beginning of File
 Older exe formats (e.g., .COM) treat entire file as chunk of code and data
o Entire file loaded into memory
o Execution starts by jumping to the beginning of the loaded file
 Can put virus at start of such a file
o That is, prepend the virus code
Prepended Virus
End of File
 Append a virus (even easier?)
 Then how does virus get executed?
 Some possibilities…
 Replace first line(s) with a jump to
viral code --- save overwritten code
 Later, transfer control back to code
o How to do this?
End of File
 How to transfer control back to code?
o Run saved instructions in saved location
o Restore the infected code back to its original state and run it
 Many exe file formats specify start location in file header
o If so, virus can change start location to point to its own code and jump to the original start location when done
Appended Virus
Overwritten into File
 Virus places itself atop original code
 Can avoid changes in file size
 Easy for virus to get control
 But… overwriting code will break the
original code
o Making virus easier to discover
 Is it possible to overwrite without breaking the code?
Overwritten into File
 Smart ways to overwrite?
 Overwrite repeated data
o May be trickier to execute virus
 Save overwritten data (like BSI)
 Use over-allocated space in a file
 Compress code to make space
 For these to work, virus must be
Merged with File
 Could try to merge virus with target
 I.e., intermixing virus/target code
 Difficult
o So, it’s “rarely seen”
 But, supposedly, Zmist does this
o So, apparently it is possible
o That’s impressive…
Not in File
 Companion virus --- separate from, but naturally executed before target
 No modification to infected code
 May take advantage of process used
by OS or shell to search for exe files
 Like a Trojan horse but it’s a virus…
o …since it’s self-replicating
Companion Virus
 Virus is earlier in the search path
o Same name as the target file, almost…
 E.g., MS-DOS searches for “foo” by
1. Look for foo.com
2. Look for foo.exe
3. Look for foo.bat
 If the target file is foo.exe, companion virus is in file foo.com
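The search-order idea can be modeled from the defender's side. The sketch below assumes the usual MS-DOS order (.com, then .exe, then .bat) and works on a plain list of file names; `resolve` and `shadowed` are hypothetical helper names, not part of any real tool.

```python
# Assumed MS-DOS extension search order: .com first, then .exe, then .bat.
SEARCH_ORDER = (".com", ".exe", ".bat")


def resolve(command, directory_listing):
    """Return the file name that would run for `command`, or None."""
    names = {name.lower() for name in directory_listing}
    for ext in SEARCH_ORDER:
        candidate = command.lower() + ext
        if candidate in names:
            return candidate
    return None


def shadowed(directory_listing):
    """Return (winner, hidden) pairs where an earlier extension hides a file."""
    names = {name.lower() for name in directory_listing}
    pairs = []
    for name in sorted(names):
        base, dot, ext = name.rpartition(".")
        if not dot or "." + ext not in SEARCH_ORDER:
            continue  # not one of the executable extensions modeled here
        winner = resolve(base, names)
        if winner != name:
            pairs.append((winner, name))
    return pairs


if __name__ == "__main__":
    listing = ["foo.exe", "foo.com", "bar.bat"]
    print(resolve("foo", listing))
    print(shadowed(listing))
```

A pair reported by `shadowed` is only a hint that a file is being masked by a same-named file earlier in the search order; legitimate software can also ship such pairs.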
Companion Virus
 Windows registry associates file types with applications
 Can modify registry so that
companion virus runs instead of exe
o Then companion can transfer control to
the corresponding exe
 In effect, all exes infected at once!
Companion Virus
 ELF file format used on recent Unix systems
 Has "interpreter" specified in each exe file header
o Points to run-time linker
 Companion virus can replace the run-time linker
o As above, effect is that all exe files infected at once
Companion Virus
 Companion viruses possible in GUI
 App’s icon can be overwritten with
the icon for the companion virus
 When a user clicks on “app” icon…
o Companion virus runs instead
Macro Virus
 Some apps allow data files to have macros embedded in them
 Macros are short snippets of “code” interpreted by the application
 Such languages often provide enough functionality to write a virus
Macro Virus
 Macros often run automatically when file is loaded
o Easy to write compared to low-level code
 First proof of concept in 1989
 Hit “mainstream” in 1995
o Virus known as Concept
o Targeted Microsoft Word (of course)
o Installed in “global macros”
o Infected all edited documents
Macro Virus: Concept
 Targeted Word docs
 AutoOpen macro --- runs automatically when file opened
o How you get the virus from infected file
 FileSaveAs --- runs when “File → Save As” selected from menu
o So the virus can infect other docs
Macro Virus: Concept
Classification by Concealment Strategy
 Most viruses try to hide
o Why?
 So, how do they hide?
o Encryption
o Polymorphism
o Etc., etc.
 Yet another way to classify viruses…
No Concealment
 Do nothing to hide
 This is easiest for virus writer…
o …but also easiest to detect, analyze
 Why encrypt?
 Virus body is “hidden” from view
o In particular, the signature is hidden
 Distinguish between strong encryption and obfuscation
 Viruses usually only obfuscated
o Very weak encryption
Encrypted Virus
 How to encrypt?
o Let me count the ways…
Simple encryption
o Rotate, increment, negate, etc.
Static encryption key
o E.g., XOR fixed byte to all bytes
Variable encryption key
o Like static, but key changes
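Why a single-byte XOR counts as obfuscation rather than encryption can be seen on harmless data. The sketch below is an illustration only: `xor_bytes` and `adjacent_xor` are hypothetical helper names and the "body" is an arbitrary byte string. Changing the key changes every stored byte, so a signature on the raw bytes fails, but the XOR of neighbouring bytes does not depend on the key at all.

```python
def xor_bytes(data, key):
    """XOR every byte of `data` with one fixed key byte (its own inverse)."""
    return bytes(b ^ key for b in data)


def adjacent_xor(data):
    """XOR each byte with its successor; the key cancels out of the result."""
    return bytes(a ^ b for a, b in zip(data, data[1:]))


if __name__ == "__main__":
    body = b"EXAMPLE-BODY-BYTES"      # stand-in for any byte string
    first = xor_bytes(body, 0x5A)     # one key
    second = xor_bytes(body, 0xC3)    # a different key, same plaintext

    print(first != second)                                # stored bytes differ
    print(xor_bytes(first, 0x5A) == body)                 # same routine decodes
    print(adjacent_xor(first) == adjacent_xor(second))    # key-independent
```

The last comparison is the weakness: a scanner can match on the key-independent form without ever recovering the key.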
Encryption (Continued)
Substitution cipher
o Permute the bytes
o Could be via lookup table
o Could even have multiple ciphertexts
decrypt to same plaintext
Strong encryption
o DES, AES, RC4, etc.
o Might use crypto libraries
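The lookup-table substitution, including the case where several ciphertexts decrypt to the same plaintext, might look like the sketch below. It is a toy under stated assumptions: plaintext is limited to 7-bit values so that each value can be given exactly two ciphertext bytes, and `make_tables`, `encode` and `decode` are hypothetical names.

```python
import random


def make_tables(seed):
    """Build tables in which every 7-bit plaintext value has two ciphertexts."""
    rng = random.Random(seed)
    cipher_bytes = list(range(256))
    rng.shuffle(cipher_bytes)           # a random permutation of byte values
    decode_table = [0] * 256            # ciphertext byte -> plaintext value
    choices = {}                        # plaintext value -> ciphertext bytes
    for index, cipher in enumerate(cipher_bytes):
        plain = index % 128
        decode_table[cipher] = plain
        choices.setdefault(plain, []).append(cipher)
    return choices, decode_table


def encode(data, choices, rng):
    """Pick one of the available ciphertext bytes for each plaintext byte."""
    return bytes(rng.choice(choices[b]) for b in data)


def decode(data, decode_table):
    """Decoding is a single table lookup per byte."""
    return bytes(decode_table[c] for c in data)


if __name__ == "__main__":
    choices, table = make_tables(seed=1)
    message = b"hello"
    ciphertext = encode(message, choices, random.Random(2))
    print(decode(ciphertext, table) == message)
```

Because decoding is only a table lookup, this is still obfuscation in the sense used above, not strong encryption.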
 Tries to hide the infection
o Not just hide the virus signature
 Examples of stealth techniques
o Change timestamp and/or other file info to pre-infection values
o Intercept I/O calls to hide presence (in MS-DOS, via user-accessible interrupts)
o Hijack secondary boot loader
 Stealth viruses “overlap” rootkits
 Rootkit --- installed on compromised
machine so attacker can use it
o Stealth is critical to rootkit success
 Some malware uses rootkits
o For example, Ryknos Trojan hid itself
using a rootkit designed for DRM
Reverse Stealth Virus
 What is “reverse stealth”?
 Make everything look infected!
 Why is this malicious?
o Damage may be done by AV software
trying to disinfect
 Oligomorphic or semi-polymorphic
 Code is encrypted
 Decryptor code is morphed
o But not too many different decryptors
 For example
o Whale had 30 different decryptors
o Memorial had 96 decryptors
 How to detect?
 Like oligomorphic, but lots more
 Essentially, an infinite number
 For example
o Tremor has almost 6 billion decryptors
 So, AV software cannot have a signature for each decryptor
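The contrast between a few dozen decryptors and billions of them can be made concrete. In the sketch below the signature bytes and names are made up for illustration, `scan` and `database_size` are hypothetical helpers, and the 16-byte signature length is an assumption chosen only to make the arithmetic visible.

```python
def scan(data, signatures):
    """Return the names of all known signatures that occur in `data`."""
    return [name for name, pattern in signatures.items() if pattern in data]


def database_size(decryptor_count, signature_length):
    """Bytes needed to store one fixed-length signature per decryptor."""
    return decryptor_count * signature_length


if __name__ == "__main__":
    # Made-up byte patterns standing in for two known decryptor variants.
    signatures = {
        "decryptor_a": b"\x90\x31\xc0",
        "decryptor_b": b"\x31\xdb\x90",
    }
    print(scan(b"\x00\x90\x31\xc0\x00", signatures))

    # 96 decryptors at an assumed 16 bytes each versus roughly 6 billion.
    print(database_size(96, 16))
    print(database_size(6_000_000_000, 16))
```

With 96 variants the table is trivially small, so one signature per decryptor is practical. At roughly 6 billion it is not, which is why a different detection approach is needed for polymorphic code.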
 Two problems for polymorphic writer…
 How to generate decryptors?
o Use a mutation engine
o Engine is part of encrypted virus
 How to detect previous infections?
o Data “hiding”: timestamp, file size, file
system features, external storage, …
o “Inoculate” system by faking infection?
Mutation Engine
1. Equivalent instruction substitution
o One or more instructions
2. Instruction reordering
3. Register swap
4. Reorder data
5. Spaghetti code
6. Insert junk code
7. Run-time code modification/generation
Mutation Engine
8. Subroutine permutation
9. DIY virtual machine
10. Concurrency --- threads 
11. Inlining/outlining
12. “Threaded” code --- not threads
o Jump directly from one subroutine to another, without returning
13. Subroutine interleaving
Mutation Engine
 Many, many other possibilities
 Possible overlap with optimizing
o Seems more like de-optimizing…
Equivalent Instructions
 All of these lines set register r1 to 0
clear r1
xor r1,r1
and 0,r1
move 0,r1
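The equivalence can be checked with a toy interpreter. The sketch below assumes the slide's syntax means "operation source,destination" and models only the four operations listed; `execute` is a hypothetical helper, not a real assembler or emulator.

```python
def execute(instruction, regs):
    """Run one toy instruction and return the updated register dictionary."""
    op, _, rest = instruction.partition(" ")
    operands = [item.strip() for item in rest.split(",")]
    dest = operands[-1]                      # destination is the last operand

    def value(operand):
        # An operand is either a register name or an integer literal.
        return regs[operand] if operand in regs else int(operand)

    if op == "clear":
        regs[dest] = 0
    elif op == "xor":
        regs[dest] = value(operands[0]) ^ regs[dest]
    elif op == "and":
        regs[dest] = value(operands[0]) & regs[dest]
    elif op == "move":
        regs[dest] = value(operands[0])
    else:
        raise ValueError("unknown operation: " + op)
    return regs


if __name__ == "__main__":
    for line in ("clear r1", "xor r1,r1", "and 0,r1", "move 0,r1"):
        print(line, "->", execute(line, {"r1": 255})["r1"])
```

All four lines leave r1 at 0 for any starting value, yet each would assemble to different bytes, which is what makes the substitution useful for changing a byte-level signature.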
Concurrency Example
Sequential version:
r1 = 12
r2 = 34
r3 = r1 + r2
Concurrent version, main thread:
start thread T
r1 = 12
wait for signal
r3 = r1 + r2
Concurrent version, thread T:
r2 = 34
send signal
exit thread T
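The two-thread version can be run directly. The sketch below uses Python's `threading.Event` as the signal; the register dictionary and the function name `compute` are illustrative choices, not part of the original example.

```python
import threading


def compute():
    """Split r3 = r1 + r2 across two threads, as in the example above."""
    regs = {}
    signal = threading.Event()

    def thread_t():
        regs["r2"] = 34
        signal.set()                 # "send signal"

    worker = threading.Thread(target=thread_t)
    worker.start()                   # "start thread T"
    regs["r1"] = 12
    signal.wait()                    # "wait for signal"
    regs["r3"] = regs["r1"] + regs["r2"]
    worker.join()
    return regs["r3"]


if __name__ == "__main__":
    print(compute())
```

The result is the same as the sequential code, but a reader following only one thread sees r2 used without ever seeing it assigned.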
 Aside: Concurrency may be a very effective anti-reversing technique
o Use multiple threads
o Intentional deadlock
o “Junk” threads
 Described in master’s project:
 Improved software activation using
 Mutation also can be used for good
o Makes reverse engineering attacks more difficult
o Makes software more “diverse”
 Apply polymorphism to virus body
o Aka, “body polymorphic”
 No encryption/decryption needed
 Body must change a lot
o Goal is to have no common signature
 Mutation code must be mutated too!
o Otherwise, a signature will exist
o Different from polymorphic (why?)
 Two types of metamorphic generators
o Both types difficult to produce
o Apply generator offline
o Easy to make old malware into “new”
Malware “carries its own generator”
o Necessary if self-propagating
o A much more difficult problem
Metamorphism: Apparition
 Apparition --- metamorphic virus
 Delivered in source code (Pascal)
 If compiler is present…
o Insert junk code and compile
o Very lame approach
 Real metamorphism must be done in
assembly or (better yet) machine code
Metamorphism: Simile
 Simile --- metamorphic virus
 Simile’s metamorphic generator
o 12,000 lines of assembly
o Translate Simile to intermediate form
o Then remove all old transformations
o Obtains a base form of virus
o Apply new set of transformations
o Generate new (morphed) machine code
Metamorphism: MetaPHOR
 Metamorphic Permutating High-Obfuscating Reassembler
o That is, MetaPHOR
 Described in How I Made Metaphor and What I’ve Learnt by The Mental Driller
 Complex expander/shrinker strategy
 Almost impossible to analyze
Metamorphism: MWOR
 Metamorphic Worm, i.e., MWOR
 Experimental metamorphic malware designed by former master’s student
 Modeled on MetaPHOR, but…
o Easier to understand
o Better for experiments and testing
o A useful research tool
 How to detect?
 The bottom line…
 Metamorphics difficult to detect
o Machine learning works well on hacker
malware, but can be defeated
 Metamorphics also difficult to write
o Most “metamorphic” generators aren’t
 Current state of the art?
o “Undetectable” metamorphic viruses
Strong Encryption
 What is strong encryption?
 Use a real cipher
 For this to be useful, must not store
key with code
o Why not?
 But must decrypt the virus
 How to get the key to the code?
Strong Encryption: Key
Store key on the web
o Then must go fetch the key
o But then how to get the key?
Binary virus --- 2 parts
o Low probability that both parts arrive
“Environmental” key generation
o Key based on machine-specific info
o Key derived at runtime
o Harder to analyze
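From the analyst's point of view, environmental key generation means the key is never stored with the code. The sketch below illustrates only that property: `derive_key` is a hypothetical helper, SHA-256 is an assumed choice of hash, and the "environment" is a plain string passed in by the caller rather than anything read from a real machine.

```python
import hashlib


def derive_key(environment_value, salt=b""):
    """Derive a 32-byte key from a string describing the environment."""
    material = salt + environment_value.encode("utf-8")
    return hashlib.sha256(material).digest()


if __name__ == "__main__":
    key_on_target = derive_key("host-a")
    key_in_lab = derive_key("host-b")

    # The same environment always yields the same key...
    print(key_on_target == derive_key("host-a"))
    # ...but a different environment yields an unrelated one.
    print(key_on_target == key_in_lab)
```

Someone holding only the code must guess or reproduce the original environment to recover the key, which is why this option is listed as harder to analyze.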
Virus Kits
 Many malware construction kits
o See VX Heavens
 Many kits claim to be metamorphic
o Or polymorphic, or encrypted, or …
o You should be very skeptical of claims
o Some have a nice GUI
 Success is failure?
o The more successful, the more likely it
has been studied and can be detected
