DGA Domain Classification
Building a Better Botnet DGA Mousetrap
Separating Rats, Mice and Cheese in DNS Data
Hewlett Packard Enterprise TippingPoint
Miranda Mowbray & Prasad Rao
Hewlett Packard Labs
Classify domain names as benign or malicious
Classify malicious domains according to DGA family
Minimize false positive classifications of malicious
Not a Goal
Gather domains by provenance
Determine groups with matching
Classifiers for Benign / Malicious
Classifiers for family or origin
Unknown domains one at a time
Determine coarse and fine syntactical features
Classify if Benign / Malicious
Classify family or origin
Top level domain
Number of levels in domain
Possible Subsets = 2
Handful in practice
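As an illustration of grouping by the coarse features named above (top level domain and number of levels), here is a minimal sketch. The function names `coarse_features` and `group_by_coarse` are hypothetical, and treating these two features as the grouping key is an assumption, not something the slides confirm.

```python
from collections import defaultdict


def coarse_features(domain):
    """Return (top level domain, number of levels) for a domain name."""
    # Assumption: a trailing root dot is ignored and case does not matter.
    labels = domain.lower().rstrip(".").split(".")
    return labels[-1], len(labels)


def group_by_coarse(domains):
    """Bucket domains that share the same coarse feature tuple."""
    groups = defaultdict(list)
    for d in domains:
        groups[coarse_features(d)].append(d)
    return dict(groups)
```

In this sketch, domains that land in the same bucket are the ones that would be compared on finer features.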
Quantify everything about a string
Coarse and fine syntax the same for elements within a lobe
Characters by Position
doubles 'aa' - 'zz'
Dots, Dashes, & Underscores
RFC 1034 Violations
length > 254
label length > 63
labels not [a-z].*
labels not .*[a-z0-9]
empty labels '..'
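The violation checks listed above could be turned into Boolean features roughly as follows. This is a sketch only: the thresholds and patterns are copied from the slide, while the function name `rfc1034_violations` and the flag names are hypothetical.

```python
import re

# Patterns from the slide: a label should start with a letter
# and end with a letter or digit.
STARTS_WITH_LETTER = re.compile(r"^[a-z]")
ENDS_WITH_ALNUM = re.compile(r"[a-z0-9]$")


def rfc1034_violations(domain):
    """Return one Boolean flag per violation type listed on the slide."""
    name = domain.lower()
    # Assumption: a single trailing root dot is not counted as an empty label.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".")
    nonempty = [label for label in labels if label]
    return {
        "too_long": len(name) > 254,
        "label_too_long": any(len(label) > 63 for label in nonempty),
        "bad_label_start": any(
            not STARTS_WITH_LETTER.search(label) for label in nonempty
        ),
        "bad_label_end": any(
            not ENDS_WITH_ALNUM.search(label) for label in nonempty
        ),
        "empty_label": any(label == "" for label in labels),
    }
```

Empty labels are reported separately and skipped by the other checks, so a name such as 'a..b' raises only the empty-label flag.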
Counts of characters
Counts of character pairs: 'aa', 'ab', ... 'zz'
Counts of character triples: 'aaa', 'aab', ... 'zzz'
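A minimal sketch of the character, pair, and triple counts follows. The helper name `ngram_counts` is hypothetical, and restricting the counts to the letters a-z is an assumption based on the 'aa' to 'zz' ranges shown on the slide.

```python
from collections import Counter
from string import ascii_lowercase

LETTERS = set(ascii_lowercase)


def ngram_counts(domain, n):
    """Count runs of n consecutive letters (a-z) in a domain name."""
    s = domain.lower()
    grams = (s[i:i + n] for i in range(len(s) - n + 1))
    # Skip any n-gram containing a dot, dash, digit, or other non-letter.
    return Counter(g for g in grams if all(c in LETTERS for c in g))
```

Calling it with n = 1, 2, and 3 gives the three families of counts, which can then be laid out as fixed-length vectors over the alphabet.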
Separate linguistic / non-linguistic elements
Catch bias in DGA PRNG
Boolean slots for whether a given character occurs within a given position indexed from beginning and end of domain
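The positional slots could be built along these lines. This is a sketch under assumptions: the alphabet, the number of positions `k`, and the name `positional_slots` are all hypothetical choices, not taken from the slides.

```python
import string

# Assumed alphabet: letters, digits, dash, underscore, and dot.
ALPHABET = string.ascii_lowercase + string.digits + "-_."


def positional_slots(domain, k=3):
    """Boolean slot per (side, position, character) for the first and last k positions."""
    s = domain.lower()
    features = {}
    for pos in range(k):
        for ch in ALPHABET:
            in_range = pos < len(s)
            # Position counted from the beginning of the domain.
            features[("start", pos, ch)] = in_range and s[pos] == ch
            # Position counted from the end of the domain.
            features[("end", pos, ch)] = in_range and s[-1 - pos] == ch
    return features
```

Each domain yields the same set of 2 * k * len(ALPHABET) slots, so the result can be flattened into a fixed-width Boolean vector.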
Classifying fixed substrings
Banjori, Bankpatch, Caphaw, Web Services
Counts of words
Max count of non-overlapping words
Max percentage of characters comprised of words
Length of the longest word
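The three word features above can be computed with a small dynamic program over the string. The sketch below assumes a caller-supplied set of dictionary words (`words`) and a minimum word length (`min_len`), both hypothetical; it reports coverage as a fraction between 0 and 1 rather than a percentage.

```python
def word_features(label, words, min_len=3):
    """Word-based features of one string, given a set of dictionary words."""
    s = label.lower()
    n = len(s)
    # Every (start, end) span of s that is a dictionary word.
    spans = [
        (i, j)
        for i in range(n)
        for j in range(i + min_len, n + 1)
        if s[i:j] in words
    ]
    starts_by_end = {}
    for i, j in spans:
        starts_by_end.setdefault(j, []).append(i)

    # best_count[j] and best_cover[j] are the best values using only s[:j].
    best_count = [0] * (n + 1)
    best_cover = [0] * (n + 1)
    for j in range(1, n + 1):
        best_count[j] = best_count[j - 1]
        best_cover[j] = best_cover[j - 1]
        for i in starts_by_end.get(j, []):
            best_count[j] = max(best_count[j], best_count[i] + 1)
            best_cover[j] = max(best_cover[j], best_cover[i] + (j - i))

    return {
        "max_word_count": best_count[n],
        "max_word_coverage": best_cover[n] / n if n else 0.0,
        "longest_word": max((j - i for i, j in spans), default=0),
    }
```

The count and the coverage are maximized independently, so they may come from different segmentations of the same string.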
Classifying Benign vs Malicious
Matsnu, Rovnix, Suppobox
Syntactical rules help
Unbalanced data hurts
Not a standalone solution
Results worse on real data
Especially word-based DGA false positives (FPs)
Some features are good for classifiers
Linguistic or not (bigrams)
Hash words to compress dictionary to reduce FP
Build classifiers for infected hosts
Determine which hosts are infected with which malware
Length of prefix