Infrastructure & Metadata
Absolut OK ( 9634 )
Telekom Srbije ( 186366 )
Amres ( 3269 )
SBB ( 131260 )
PTT ( 6776 )
Verat ( 5346 )
Beotel ( 16298 )
EXE Net ( 6760 )
126.96.36.199 SBB TELEPARK, BELGRADE, RS
188.8.131.52 – EQUINIX, LÄRCHENSTR. 110, 65933 FRANKFURT – DE-CIX PREMIUM ENABLED SITE
184.108.40.206 – EQUINIX, I.T.E.N.O.S. KPN, LEVEL3, TELEHOUSE , KLEYERSTRASSE 79-90, FRANKFURT. – DE-CIX
220.127.116.11 TELECITYGROUP, DUBLIN, IR
TUCKERTON NJ, TAT 14 LANDING POINT
EQUINIX, 44470 CHILUM PLACE, ASHBURN, VA
18.104.22.168 FACEBOOK DATA CENTER, FOREST CITY, NORTH CAROLINA, US
Hacking Team Metadata Analysis
HEAT-MAP OF INTERNAL COMMUNICATION
POTENTIAL ORGANISATIONAL STRUCTURE
BASED ON THE LEVEL AND DIRECTION OF COMMUNICATION
NUMBER OF SENT EMAILS PER HT EMPLOYEE IN TIME (2014)
EXTERNAL CONTACTS WITH MORE THAN 50 EMAILS EXCHANGED WITH HT EMPLOYEES ( 2014-2015 )
EXTERNAL CONTACTS GROUPED BY THE DOMAIN NAME BASED ON THE D.VINCENZETTI EMAILS
NUMBER OF EMAILS EXCHANGED (>30) BETWEEN HT EMPLOYEES AND EXTERNAL CONTACTS ( 2014-2015 )
TIMELINE OF SELECTED COMPANIES EMAIL COMMUNICATION
WITH HT EMPLOYEES (2014)
PATTERN RECOGNITION :
SUM OF MR.D SENT AND RECEIVED EMAILS PER HOUR DURING THE DAY (2014)
PATTERN RECOGNITION :
SUM OF MR.D SENT AND RECEIVED EMAILS PER WEEK DAYS AND MONTHS (2014)
ANOMALY DETECTION : NUMBER OF MR.D SENT EMAILS PER HOUR (2014)
PATTERN RECOGNITION AND ANOMALY DETECTION : HEAT-MAP OF MR.D SENT EMAILS PER HOUR (2014)
TIMELINE OF EMAILS WITH SUBJECTS FROM AMAZON.IT
MAP OF HT EMPLOYEES FLIGHTS BASED ON CWT EMAILS SUBJECT LINES
INDIVIDUAL HT EMPLOYEES FLIGHTS MAP
BASED ON CWT EMAILS SUBJECT LINES
TIMELINE OF INDIVIDUAL HT EMPLOYEES FLIGHTS TO DIFFERENT COUNTRIES BASED ON CWT EMAILS SUBJECT LINES
TIMELINE OF HT EMPLOYEES IP LOCATIONS
BASED ON EMAILS RECEIVED BY MR.D (2014-2015)
MAP OF HT EMPLOYEES IP LOCATIONS
BASED ON EMAILS RECEIVED BY MR.D (2014-2015)
Pattern of Life
SOCIAL NETWORK ANALYSIS OF HACKING TEAM EMAIL DATABASE ( PERIOD 2013-2015 )
SOCIAL NETWORK ANALYSIS
OF NODES WITH +100 EXCHANGED EMAILS
NSA : STELLAR WIND
VLADAN JOLER - SHARE LAB
In our previous research we explained how metadata is being collected and accessed by numerous actors: government agencies, Internet service providers, Internet companies such as Google or Facebook, data dealers or producers of mobile phone applications. We explained the invisible infrastructure behind data flow, but we never had a chance to investigate what these actors can really do when they have access to a vast amount of metadata about you. This data investigation is exactly about that.
On July 5, 2015, Hacking Team, an Italian-based company and one of the world’s biggest cyber weapon manufacturers and dealers, faced a leak of its internal email database. The company’s Twitter account was compromised by an unknown individual, who published an announcement of the data breach and provided links to over 400 gigabytes of data, including internal emails, invoices, and source code.
HACKING TEAM LEAK
IN SOME KIND OF REVERSE ENGINEERING PROCESS WE EXPLORED THE POSSIBILITY OF USING THEIR OWN METHODOLOGY FOR AN INDEPENDENT DATA INVESTIGATION OF THE HACKING TEAM, ONE OF THE “CORPORATE ENEMIES OF THE INTERNET”.
The first step we took in exploring this pile of data was to perform a Social Network Analysis, a strategy for investigating social structures based on network and graph theories. It characterises networked structures in terms of nodes (individual actors, people) and ties or edges (relationships or interactions) that connect them.
By filtering out the nodes with fewer than 100 exchanged emails, we isolate the internal Hacking Team communication and get a closer look at the company’s internal structure, based solely on that communication.
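The filtering step can be sketched in a few lines of Python. This is only an illustration, not the tooling used for the investigation: the input format (a sender plus a list of recipients per message), the function names and the reading of the threshold as "total emails exchanged per node" are all our assumptions.

```python
from collections import Counter

def build_edge_weights(messages):
    """Count emails exchanged between each unordered pair of addresses.

    `messages` is an iterable of (sender, [recipients]) tuples
    (an assumed input format, not the format of the leaked data).
    """
    weights = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if sender != recipient:
                weights[frozenset((sender, recipient))] += 1
    return weights

def filter_nodes(weights, min_emails=100):
    """Drop every node whose total exchanged emails is below min_emails."""
    totals = Counter()
    for pair, count in weights.items():
        for node in pair:
            totals[node] += count
    keep = {node for node, total in totals.items() if total >= min_emails}
    # Keep an edge only if both of its endpoints survived the filter.
    return {pair: count for pair, count in weights.items() if pair <= keep}
```

The surviving edges, with their weights, are what a graph tool would then lay out as the internal communication network.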
In this example we see the same data set as the one presented above, but this time in the form of a heat map.
Using the insights from both visualisation methods, we are able to draw a communication chart that might be a credible representation of the organisational structure.
If we add another interesting piece of information retrieved from metadata, the time component, we are able to track the activity of every individual employee over time, based on the number of messages each of them sent.
The external contacts are probably even more interesting, and more relevant both for investigative data journalism and for our effort to understand the nature of the organisation that we are investigating.
In our data set, that means around 4600 different individuals who exchanged emails with Hacking Team employees over the course of two years.
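As a rough sketch of how the time component can be used, the snippet below counts sent messages per sender and calendar month from the From and Date headers. The monthly bucket size and the (sender, date header) input format are assumptions made for the example.

```python
from collections import Counter
from email.utils import parsedate_to_datetime

def sent_per_month(headers):
    """Count sent emails per (sender, "YYYY-MM") bucket.

    `headers` is an iterable of (sender, date_header) pairs, where
    date_header is the raw RFC 2822 Date header of the message.
    """
    counts = Counter()
    for sender, date_header in headers:
        sent = parsedate_to_datetime(date_header)
        counts[(sender, sent.strftime("%Y-%m"))] += 1
    return counts
```

Plotting these counts per employee gives an activity timeline of the kind described above.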
We grouped the emails by domain, and after some research about the companies behind the domain names, we classified them by the type of service they officially provide.
If we add the Hacking Team employees on the other axis, we can see who in the team communicated with external contacts, and how frequent and strong that communication was.
We can explore the relation between selected companies and Hacking Team over time. We can also track how different actors take over the communication at different times.
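Grouping by domain and crossing the result with the employees can be sketched as follows. The `internal_domain` parameter and the (sender, recipient) input pairs are assumptions for illustration; classifying each domain by the type of service behind it remains manual research.

```python
from collections import Counter, defaultdict

def domain_of(address):
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()

def employee_domain_matrix(pairs, internal_domain):
    """Count emails exchanged between each employee and each external domain.

    `pairs` is an iterable of (sender, recipient) addresses. Messages
    that are purely internal or purely external are ignored.
    """
    matrix = defaultdict(Counter)
    for sender, recipient in pairs:
        sender_internal = domain_of(sender) == internal_domain
        recipient_internal = domain_of(recipient) == internal_domain
        if sender_internal and not recipient_internal:
            matrix[sender.lower()][domain_of(recipient)] += 1
        elif recipient_internal and not sender_internal:
            matrix[recipient.lower()][domain_of(sender)] += 1
    return matrix
```

Each row of the resulting matrix is one employee, each column one external domain, and the cell value is the strength of the communication.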
THE UNIQUE WAY WE INTERACT WITH THE TECHNOLOGY WE USE, THE UNIQUE SET OF CONTACTS WE HAVE OR OUR UNIQUE BEHAVIORAL PATTERNS DEFINE OUR METADATA SIGNATURE, OUR FINGERPRINT. IN THE EYES OF THE ALGORITHMIC ANALYSIS EVERY SINGLE PERSON IS UNIQUE.
Sent emails represent the behaviour of the person that we are examining and received emails represent the overall behavioural pattern of his social or professional environment.
Pattern-of-life analysis is a method of surveillance specifically used for documenting or understanding a subject’s habits. It is a computerised data collection and analysis method used to establish the subject’s past behaviour, determine its current behaviour, and predict its future behaviour.
What is even more important to our analysis are the anomalies in his behaviour. Anomalies can point to many things. People change their behaviour when they are depressed, sick or working under pressure, when there are deadlines or important events, when they are traveling or when they fall in love, for example.
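One very simple way to look for such anomalies is to build an hourly profile of a person's sent emails and flag the messages sent at hours when that person is almost never active. The sketch below is a naive heuristic chosen for illustration, not the method behind the charts; the `max_share` threshold is an arbitrary assumption.

```python
from collections import Counter
from datetime import datetime

def hourly_profile(timestamps):
    """Return a list of 24 counts: emails sent in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

def unusual_emails(timestamps, max_share=0.02):
    """Return the timestamps that fall in rarely used hours.

    An hour is "rare" when it holds less than max_share of all emails.
    """
    timestamps = list(timestamps)
    profile = hourly_profile(timestamps)
    total = sum(profile)
    if total == 0:
        return []
    rare_hours = {hour for hour, n in enumerate(profile) if n / total < max_share}
    return [ts for ts in timestamps if ts.hour in rare_hours]
```

The hourly profile is the pattern; the flagged timestamps are candidate anomalies that still need a human explanation.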
By extracting the emails sent by Amazon to Hacking Team employees, we were able to get an insight into their purchases.
Every time an airplane ticket is booked, the agency sends the prospective passenger an email whose subject line contains the passenger’s name and the airport codes. By extracting that information from the subject and cross-referencing it with the date the email was sent, we are able to get approximate information about the journeys of HT employees.
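The extraction can be sketched as below. We do not reproduce the agency's actual subject line format here: the example only assumes that airport codes appear as three upper-case letters, and it checks every candidate against a list of known codes that the caller supplies. The function names are hypothetical.

```python
import re
from email.utils import parsedate_to_datetime

# Any stand-alone run of three capital letters is a candidate airport code.
IATA_TOKEN = re.compile(r"\b[A-Z]{3}\b")

def airports_in_subject(subject, known_codes):
    """Return the known airport codes found in a subject line, in order."""
    return [code for code in IATA_TOKEN.findall(subject) if code in known_codes]

def approximate_journey(subject, date_header, known_codes):
    """Combine the codes in the subject with the date the email was sent.

    Returns None when fewer than two airports are found. The email date
    is the booking notification date, not necessarily the flight date.
    """
    codes = airports_in_subject(subject, known_codes)
    if len(codes) < 2:
        return None
    return {"route": codes, "email_date": parsedate_to_datetime(date_header).date()}
```

Because the date comes from the email and not from the ticket, the result is only an approximation of when the journey took place.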
If we group the flights by passenger’s name, we realize that each of the most frequent flyers is based in a certain place, and covers a certain region/market, such as SE Asia, Middle East, South America etc.
If we regroup the same set of data, by location, we can see at which point in time and where two or more Hacking Team employees have met or have traveled together. This implies potential business meetings, sales of surveillance tools, establishing new relations with international customers and government agencies around the globe.
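Regrouping by location amounts to indexing the same trips by place and date instead of by passenger. A minimal sketch, assuming each trip has already been reduced to an (employee, place, day) record:

```python
from collections import defaultdict

def co_located(trips):
    """Find places and days shared by two or more employees.

    `trips` is an iterable of (employee, place, day) records.
    Returns {(place, day): [sorted employee names]}.
    """
    by_place_day = defaultdict(set)
    for employee, place, day in trips:
        by_place_day[(place, day)].add(employee)
    return {
        key: sorted(people)
        for key, people in by_place_day.items()
        if len(people) >= 2
    }
```

A shared place and day only suggests a possible meeting or joint trip; it does not prove one.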
The email header hides an even more precise piece of location information. In some cases, the email headers reveal the IP address of the sender. The IP address can then be geolocated, using publicly available tools, to the level of a city or an individual router.
On a world map, the distribution of their locations looks like this.
If we zoom in to the EU level, it looks like this.
And at the city level, like this.
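The header inspection can be sketched with Python's standard library. Which headers actually carry the sender's address varies between mail systems; reading `X-Originating-IP` and `Received` here is an assumption for the example, and the geolocation lookup itself is left out.

```python
import ipaddress
import re
from email import message_from_string

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def candidate_sender_ips(raw_message):
    """Return public IPv4 addresses found in headers that may identify the sender.

    Private and reserved addresses are skipped, since they cannot be
    geolocated. Anything that merely looks like an address (for example
    a four-part version number) can still slip through.
    """
    msg = message_from_string(raw_message)
    values = (msg.get_all("X-Originating-IP") or []) + (msg.get_all("Received") or [])
    found = []
    for value in values:
        for token in IPV4.findall(value):
            try:
                ip = ipaddress.ip_address(token)
            except ValueError:
                continue  # e.g. 999.1.1.1 is not a valid address
            if ip.is_global and token not in found:
                found.append(token)
    return found
```

Each relay prepends its own Received header, so the entries lowest in the message are usually the closest to the sender.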
WHO HAS ACCESS?
PREDICTING FUTURE BEHAVIOUR
SHARE Lab is a lab for exploring different technical aspects of the intersections between technology and society.
We are using various network topology, data mining and data visualization methods to create a unique Internet Privacy and Transparency Atlas, which is a set of visual representations and methodologies created to map, uncover, visualize and independently monitor different aspects of Internet privacy and transparency.
Our research can be described as a form of data-driven investigation, a process based on analyzing and filtering large data sets for the purpose of creating a story. In most cases, our method does not rely on data available online as a result of open data initiatives or in official documents – it’s rather based on active remote sensing, acquisition of information without making physical contact with objects by using different open network analysis tools or data produced by applications or hardware.