ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection

WHAT IS THE PROBLEM?
JavaScript allows authors to run any code when a user visits a web page
• JS-based malware attacks account for the majority of successful mass-scale exploitation
• Malware is easy to hide: self-generating code produces more code to run
• JS serves important functionality for many sites
• In-browser solutions have not been fully accepted because of the performance hit
• Browsers use offline scanning to check URLs, but there are too many sites and malware typically comes and goes quickly

CHALLENGES

Performance
Detection is not fast enough to be used in a browser

Accuracy
A false positive rate of 5% is acceptable for static analysis tools but is over 100x what is acceptable for in-browser detection

Obfuscated malware
JavaScript code is frequently obfuscated, so purely static detection is generally ineffective
• Ex. eval and document.write generate code at runtime that is difficult to pattern-match

Malware transience
Offline-only scanning is not effective because web malware “infects fast and dies young”
• Nearly 20% of malicious URLs were gone after 1 day

SOLUTION : ZOZZLE

Performance
AST-based detection is fast and scalable
• Fast classification: throughput of over 1 MB of JavaScript code per second

Accuracy
AST-based detection uses hierarchical (context-sensitive) features, which are more precise than text-based ones
• Low false positive rate: 0.0003% (fewer than 1 in a quarter million)
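The accuracy figures quoted on the slides can be sanity-checked with a few lines of arithmetic. The sketch below only converts the two percentages that appear in the deck (5% and 0.0003%) into "one false positive per N samples". The helper name `one_in_n` is made up for illustration.

```python
from fractions import Fraction

def one_in_n(rate_percent: str) -> Fraction:
    """Convert a percentage (given as a string) to 'one false positive per N samples'."""
    rate = Fraction(rate_percent) / 100
    return 1 / rate

# Figures taken from the slides; exact fractions avoid float rounding.
static_n = one_in_n("5")        # typical static analysis tool
zozzle_n = one_in_n("0.0003")   # rate reported for Zozzle

# How many times lower the reported Zozzle rate is than 5%.
ratio = Fraction("5") / Fraction("0.0003")

print(f"5%      -> 1 in {float(static_n):,.0f}")
print(f"0.0003% -> 1 in {float(zozzle_n):,.0f}")
print(f"ratio   -> about {float(ratio):,.0f}x")
```

A rate of 0.0003% works out to roughly one false positive in 333,000 samples, which is consistent with the "fewer than 1 in a quarter million" claim.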


De-obfuscation
Uses the browser's JavaScript engine to expose obfuscation and get the final, expanded version of the JavaScript code
WHAT IS ZOZZLE?
A highly precise, mostly static detector for malware written in JavaScript, suitable for in-browser deployment
• 3 steps:
  1. JavaScript context collection and labeling as benign or malicious
  2. Feature extraction and training of a naïve Bayesian classifier
  3. Applying the classifier to a new JavaScript context to determine if it is benign or malicious
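Steps 2 and 3 can be illustrated with a minimal naïve Bayes sketch. This is not Zozzle's implementation: the Bernoulli (present/absent) model, the Laplace smoothing, the function names `train` and `classify`, and the four tiny training samples are all assumptions chosen to keep the example small. Features are assumed to be (context, text) pairs.

```python
import math
from collections import Counter

def train(samples, alpha=1.0):
    """Train a Bernoulli naive Bayes model.

    samples: list of (feature_set, label) pairs, label is 'benign' or 'malicious'.
    alpha:   Laplace smoothing constant (an assumption, not from the slides).
    """
    label_counts = Counter(label for _, label in samples)
    feature_counts = {label: Counter() for label in label_counts}
    vocab = set()
    for feats, label in samples:
        feature_counts[label].update(set(feats))
        vocab |= set(feats)

    total = sum(label_counts.values())
    model = {"vocab": vocab, "prior": {}, "likelihood": {}}
    for label, n in label_counts.items():
        model["prior"][label] = math.log(n / total)
        # Smoothed estimate of P(feature present | label); never exactly 0 or 1.
        model["likelihood"][label] = {
            f: (feature_counts[label][f] + alpha) / (n + 2 * alpha) for f in vocab
        }
    return model

def classify(model, feats):
    """Return the label with the highest log-posterior for a feature set."""
    feats = set(feats)
    scores = {}
    for label, log_prior in model["prior"].items():
        score = log_prior
        for f in model["vocab"]:
            p = model["likelihood"][label][f]
            score += math.log(p if f in feats else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled contexts, each reduced to a set of (context, text) features.
samples = [
    ({("loop", "unescape"), ("call", "eval")}, "malicious"),
    ({("loop", "unescape"), ("string", "%u9090")}, "malicious"),
    ({("call", "getElementById"), ("function", "onclick")}, "benign"),
    ({("call", "getElementById"), ("call", "eval")}, "benign"),
]
model = train(samples)
verdict_a = classify(model, {("loop", "unescape")})
verdict_b = classify(model, {("call", "getElementById")})
print(verdict_a, verdict_b)
```

Working in log space avoids underflow when many features are multiplied together, which matters once the vocabulary grows beyond a toy example.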

ZOZZLE: HOW IT WORKS
The JavaScript runtime engine exposes attempts to obscure malware
• JS code is unfolded up to the point just before it is executed
• Intercept calls to compile() in the JavaScript engine
• compile() is invoked when eval is called and whenever new code is included with an <iframe> or <script> tag
• Observe JS code at each level of its unpacking, just before it is executed by the engine
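The idea of observing code at each level of unpacking can be sketched with a small Python analogy. This is not the browser hook itself: `traced_exec` is a hypothetical stand-in for the engine's compile() entry point, and the two "layers" are a made-up example of code that generates more code.

```python
# Every source string handed to the hook is recorded here, outermost layer first.
collected = []

def traced_exec(source, env):
    """Stand-in for an intercepted compile(): record the source, then run it."""
    collected.append(source)
    exec(compile(source, "<unpacked>", "exec"), env)

# Innermost payload: the code that finally does the work.
layer2 = "result = 6 * 7"
# Outer wrapper: code whose only job is to generate and run the payload,
# playing the role of an eval() call in obfuscated JavaScript.
layer1 = f"traced_exec({layer2!r}, globals())"

env = {"traced_exec": traced_exec}
traced_exec(layer1, env)

for depth, source in enumerate(collected, start=1):
    print(f"layer {depth}: {source}")
```

Because the hook sits at the point where source text is compiled, the wrapper cannot hide the payload: each layer is seen in plain form just before it runs, and each recorded string could then be passed to a classifier.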

HOW IT WORKS CONT.
A static classifier, trained on context-sensitive AST (abstract syntax tree) features and a collection of labeled malware samples, analyzes JS
• The Nozzle runtime detector dynamically crawls millions of URLs and collects sample malware by observing the behavior of running JS code
• Tries to avoid transience and cloaking by scanning a wide range of URLs
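To make "context-sensitive AST features" concrete, here is a minimal sketch of feature extraction over a toy tree. The node layout (dicts with `type`, `text`, `children`), the set `CONTEXT_TYPES`, and the rule that a feature is a (nearest enclosing context, text) pair are assumptions for illustration, not a description of the actual tool.

```python
# Node types assumed to open a new context for their descendants (hypothetical choice).
CONTEXT_TYPES = {"function", "loop", "try", "if"}

def extract_features(node, context="script"):
    """Collect (context, text) pairs from a toy AST.

    node: dict with a 'type', an optional 'text', and optional 'children'.
    """
    features = set()
    text = node.get("text")
    if text is not None:
        features.add((context, text))
    # Descendants inherit the nearest enclosing context-opening node type.
    child_context = node["type"] if node["type"] in CONTEXT_TYPES else context
    for child in node.get("children", []):
        features |= extract_features(child, child_context)
    return features

# Toy tree: unescape() is called once inside a loop and once at top level.
toy_ast = {
    "type": "script",
    "children": [
        {
            "type": "loop",
            "children": [
                {"type": "call", "text": "unescape"},
                {"type": "string", "text": "%u9090"},
            ],
        },
        {"type": "call", "text": "unescape"},
    ],
}

features = extract_features(toy_ast)
for feature in sorted(features):
    print(feature)
```

The same text (`unescape`) produces two different features depending on where it appears, which is what separates this approach from flat text matching: a classifier can weight a call inside a loop differently from the same call at top level.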

BENIGN VS. MALICIOUS SAMPLES
