Cognitive Bias in Information Security
Great work done by Chris Sanders (@chrissanders88): "Defeating Cognitive Bias and Developing Analytic Technique"
Orientation: Building Mental Models
Why do we have bias?
Adaptive Bias: The theory that humans favor decisions that lower the cost of being wrong over decisions that lower the number of times they are wrong.
Costly information hypothesis: It is cheaper to learn from others than to figure things out for ourselves.
Availability Bias
Events that have happened recently and are "available" in memory seem related.
In incident response:
- Conclusions from recent incidents are favored
- Conclusions described by other responders are favored
Packet: 1.1.1.1 > 2.2.2.2, dstport 25
Policy: Two-factor authentication
So, is United/NYSE/WSJ related to our email outage?
Bias suggests yes, but the evidence showed the spear-phishing theory to be a better model match.
Factors affecting orientation:
- Quality of current mental models
- Efficacy of previous observations
- Bias
Thinking About Thinking
How do we limit cognitive bias?
Inter-OODA Feedback
United/NYSE/WSJ are down, our email is down. Are these events connected?
- Each success at a lower level is feedback to the upper level.
- Successful incident response can lead to policy change.
- At higher levels, OODA speed is less important than accurate mental models.
We need to get data, analyze it, and make a decision.
- Collect more data to have more facts to operate on (increase new information)
- Put data into context, make it convenient to access (more analysis)
- Integrate previous incident data with new data (previous experiences)
Question-based Incident Response Model
What do the outages at United Airlines, the New York Stock Exchange, and the Wall Street Journal all have in common?
WHAT (artifacts)
Cultural traditions and genetic heritage have ingrained biases.
Use tools that account for cognitive bias by automating the parts of analysis where humans tend to apply bias.
Basic search capability is a critical component of verifying assumptions. Coupled with analytics, it can be even more powerful.
Meaningless anomalies: Anything wrong here?
By applying basic statistical analytics to search results, we create a hybrid in which the human analyst provides a starting point and the analytics provide clarity and validation of assumptions.
The OODA Loop was invented by USAF Colonel John Boyd in the 1960s.
It was initially applied specifically to air combat; Boyd then lectured for many years on using it as a framework for winning in abstract contests.
Do you have proper incident response reporting?
- Statistics
- Indicator extraction
- Categorization
- Attribution
- Follow-up interviews
- Sweeps
How do you keep mental models fresh for responding to major attacks?
Performing incident response every day on common crimeware attacks will help ensure your tech and procedures are relevant, fresh, and effective.
Without regular exercise, your models will be out of date in the face of a serious incident.
Real investigation:
IDS Alert for POWELIKS crimeware on 10.151.212.238 and another at the same time on 10.207.240.162
Mental model (orientation): These are related incidents and are part of the same campaign. Verify.
Action Items:
Office of Personnel Management Hack:
22 million records stolen, CN implicated.
Run two searches, one for A and one for B, then compare the result counts for each field-value (such as dstip) combination and show the intersection.
Are key decision makers aware of your findings to maintain and improve political capacity?
Bad
Iffy
Good
Don't leave this feedback loop open.
Scoring is simple: (A + B) - abs(A - B). It favors the highest counts that are similar between A and B.
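A minimal Python sketch of the two-search comparison and the scoring above, assuming each search has already been reduced to a count per field value. The function names and the dstip counts are hypothetical, and this is not ELSA's own code. Note that (A + B) - abs(A - B) always equals twice the smaller of the two counts, which is why it rewards values that are high in both searches.

```python
from collections import Counter


def similarity_score(a: int, b: int) -> int:
    """Score from the slide: (A + B) - abs(A - B), i.e. 2 * min(A, B)."""
    return (a + b) - abs(a - b)


def score_intersection(counts_a, counts_b):
    """Score every field value seen in BOTH searches, highest score first."""
    shared = counts_a.keys() & counts_b.keys()
    scored = [
        (value, similarity_score(counts_a[value], counts_b[value]))
        for value in shared
    ]
    # Sort by descending score, then by value so the order is deterministic.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical dstip counts returned by search A and search B.
counts_a = Counter({"198.51.100.7": 40, "203.0.113.9": 3, "192.0.2.1": 12})
counts_b = Counter({"198.51.100.7": 38, "203.0.113.9": 90, "192.0.2.55": 5})

ranked = score_intersection(counts_a, counts_b)
for dstip, score in ranked:
    print(dstip, score)
```

A destination with 40 and 38 hits outranks one with 3 and 90 hits, even though the second has more total events.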
The OODA Loop is well known and has been used as a framework for business, athletics, and many other forms of competition in which there is an adversary.
Analytics can be misleading, so we need to get a second opinion.
Time frame: YEARS! Initial breach in 2013.
Most recognized aspect: Go from data to action as quickly as possible
We can use multiple analytical methods on the same data to see if they agree. We can then follow up by inspecting the final conclusion with search.
(Answer: not much)
So why do we find ourselves automatically asking this question?
Read about attacks on others in your industry, be ready for what they encountered.
- Are you collecting the right data?
- Do you have a way to orient the data when it comes in?
- Do you have the right mental (threat) models?
- Do you have the tech and political capacity to act on decisions?
OODA is present in incident response at many levels
- Event/Packet
- Incident
- Campaign
- Policy/strategy
OPM did not shift its mental models in the face of an abrupt change. By focusing on patching, it had no means of updating those models.
Try a Different Analytical Approach
Jenks natural breaks classification (K-Means) groups our domain counts
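One way to sketch Jenks natural breaks in Python is the dynamic-programming (Fisher-Jenks) formulation below: sort the counts, then pick the contiguous split that minimizes the total within-group squared deviation. This is an illustration under that assumption, not necessarily how ELSA implements it; the function name and the sample counts are hypothetical.

```python
def jenks_groups(values, num_groups):
    """Split values into num_groups contiguous classes that minimize the
    total within-class sum of squared deviations (Fisher-Jenks via DP)."""
    data = sorted(values)
    n = len(data)
    if not 1 <= num_groups <= n:
        raise ValueError("num_groups must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(data):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting data[:j] into k classes.
    cost = [[inf] * (n + 1) for _ in range(num_groups + 1)]
    split = [[0] * (n + 1) for _ in range(num_groups + 1)]
    cost[0][0] = 0.0
    for k in range(1, num_groups + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    split[k][j] = i

    # Walk the split table backwards to recover the classes.
    groups = []
    j = n
    for k in range(num_groups, 0, -1):
        i = split[k][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]


# Hypothetical per-domain request counts with one very large outlier.
domain_counts = [50, 2, 900, 3, 48, 4, 52]
print(jenks_groups(domain_counts, 3))
```

The single huge count lands in its own class, which leaves the remaining groups of similar counts easier to see.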
http://www.nextgov.com/cybersecurity/2015/06/timeline-what-we-know-about-opm-breach/115603/
Bad guys in big groups with similar counts
Google
Hypothesis: Spear Phishing
Boyd's Example of Building Mental Models
Gödel’s Incompleteness Theorems
Take the motor
Take the skis
Timeline:
Credentials used to spam, causing MX blacklist
User falls victim, creds stolen
Recon pulls company directory
Phishing emails are sent
Boyd inferred that any logical model of reality is incomplete (and possibly inconsistent) and must be continuously refined/adapted in the face of new observations.
New data is always required to continually validate conclusions.
Take the treads
Take the handlebars
The speed at which we can develop and test this hypothesis constitutes the speed of our OODA loop.
OODA is often oversimplified.
What do you get if you combine these pieces?
Heisenberg’s Uncertainty Principle
Even as we get more precise observations about a particular domain, we’re likely to experience more uncertainty about another.
As we find out more about what is happening, it raises more questions.
If you get a quick mental model fit, you can skip the decide step entirely and go directly to action.
It is about shifting mental models to better acclimate to new realities.
Back to the hypothetical email outage, what pieces might we have?
2nd Law of Thermodynamics
Lack of new information about the environment creates a “closed system” which leads to high entropy in the system (chaos).
Information starvation will lead to confusion.
Event data showing:
- User login to email
- Search engine indexing the company directory
- Company is unable to send email
- SMTP event rate above average
Deeper with OODA
New ELSA transform:
cluster(<field>, [ <num groups> ])
Example:
sig_msg:POWELIK groupby:srcip | subsearch(method:POST) | cluster(dstip) | sum(cluster)
Search and analytics speed up the OODA loop by decreasing the time it takes to build mental models and test hypotheses.
We need to derive a mental model to describe the situation.
OODA Requirements to Get and Use Data
Observation:
Collect this info
OODA is about constantly reevaluating current understandings to ensure the most effective mental models are constructed.
Action:
Ability to test the hypothesis
Orientation:
- Know which web page is the company directory
- Know email servers and have data describing failure
- Analytics to find anomalous increase
Mental models are contained in the orientation phase, which is why it is the most important.
Decision:
Mental model (hypothesis) of spear phishing
It draws on fundamental laws of math and science.
Analytics point out patterns, groupings, and features of interest in search data.
Each cycle refines the data until it conclusively supports or invalidates the hypothesis.
Search results show actual packet and event data, decorated with WHOIS/GeoIP, threat intel, etc.
Same data minus Google:
Search-Analytic Cycle
Use these to find interesting data, then zoom in and verify with another search.
Presentation Goals:
Start with seed IDS event to search, use multiple analytics on results:
- Anomalous spike (linear regression)
- Interesting groupings by count (Jenks natural breaks)
- Similar peaks/valleys (sample correlation)
- Interesting patterns (FFT)
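As a rough illustration of the first item in the list above, the Python sketch below fits a least-squares line to an event-rate series and flags points that sit far above it. The threshold of two standard deviations, the function names, and the sample SMTP rates are all assumptions made for the example, not values from the talk.

```python
from statistics import mean, pstdev


def linear_fit(ys):
    """Least-squares slope and intercept for ys against x = 0, 1, 2, ..."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def find_spikes(ys, threshold=2.0):
    """Return indexes whose value exceeds the fitted line by more than
    `threshold` standard deviations of the residuals."""
    slope, intercept = linear_fit(ys)
    residuals = [y - (slope * x + intercept) for x, y in enumerate(ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # perfectly linear series, nothing stands out
    return [x for x, r in enumerate(residuals) if r > threshold * spread]


# Hypothetical SMTP events per interval; interval 7 is the spam burst.
smtp_rate = [100, 104, 98, 101, 103, 99, 102, 400, 100, 97]
print(find_spikes(smtp_rate))
```

A flagged interval is only a starting point: the next step is a search scoped to that interval to see what the events actually are.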
It looks like it correlates, but does it? We can take the time series data and check it with sample correlation:
domain:ip-addr.es domain:mevtutorial.in 0.925
domain:ip-addr.es domain:motomiles.com 0.924
domain:ip-addr.es domain:mundofomix.com 0.923
domain:ip-addr.es domain:mobilecomputingtoday.com 0.922
domain:ip-addr.es domain:loverocksusa.com 0.920
domain:ip-addr.es domain:maoribooks.com 0.921
Yes, comparing peaks and valleys together shows strong correlation.
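A check of this kind can be sketched in Python as the Pearson sample correlation of two equal-length count series. The series below are made up for illustration and are not the data behind the figures above.

```python
from math import sqrt


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical hourly hit counts for two domains.
domain_a = [1, 5, 2, 8, 3]
domain_b = [2, 10, 4, 16, 6]       # same peaks and valleys, twice the volume
domain_c = [-x for x in domain_a]  # exact mirror image

print(round(sample_correlation(domain_a, domain_b), 3))
print(round(sample_correlation(domain_a, domain_c), 3))
```

Values near 1 mean the peaks and valleys line up regardless of absolute volume, which is what makes the measure useful for pairing a low-volume domain with a high-volume one.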
- Understand how we think
- Learn how this affects how we acquire and analyze data
- Use this knowledge to improve our incident response capabilities
Can spot periodicity in overall event volume with Fast Fourier Transforms
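To show the idea without extra libraries, the Python sketch below uses a naive discrete Fourier transform, which yields the same coefficients an FFT would, only more slowly. It reports the period of the strongest frequency in an event-count series. The function name and the synthetic hourly data are assumptions for illustration.

```python
import cmath
import math


def dominant_period(samples):
    """Return the period (in samples) of the strongest frequency, or None
    if the series is flat. Naive O(n^2) DFT; an FFT gives the same result."""
    n = len(samples)
    avg = sum(samples) / n
    centered = [s - avg for s in samples]  # drop the constant (DC) component
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k if best_k else None


# Synthetic event volume: four days of hourly counts with a daily cycle.
events = [100 + 50 * math.sin(2 * math.pi * t / 24) for t in range(96)]
print(dominant_period(events))
```

On real data the peak is rarely this clean, so the result is best treated as a hint about where to look rather than a conclusion.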
Overall event rates also show a pattern:
A few ELSA Updates:
- GitHub is going great, THANKS to all contributors!
- Opallios, Inc. contributing improvements to graph lib, ingest methods, and aggregation functions
- Upcoming release of new search engine with ELSA plugin
About Me:
Senior Researcher at FireEye
Co-Founder of Threat Analytics Platform
Author of open-source Enterprise Log Search and Archive (ELSA) and StreamDB projects.
Minor Airplane Enthusiast
Security Event Data in the OODA Loop Model
Martin Holste, FireEye, Inc.