Internet Research For Journalists and Researchers - A Visualisation

A conceptual description of the four main data sources available to journalists and researchers: the hidden web, the searchable web, people and breaking news. Illustrated with some examples of the best research tools available.

Colin Meek

on 30 November 2012


Transcript of Internet Research For Journalists and Researchers - A Visualisation

The Hidden Internet
The Searchable Web
People
Breaking News
Advanced Searching and using operators
Image Tracking Tracking Breaking News (Twitter Mining) Database Content (Instagram) Google Images

The ‘hidden’ or ‘invisible’ web is the vast amount of data that many normal search engines don’t or can’t index. Most of the data is held on databases. Some of these databases have pay-walls, but others are simply constructed in a way that prevents normal searching. The hidden web is in constant flux. Search engine algorithms differ, so different parts of the web are visible to different search engines. New technologies mean that data becomes available in new formats that may or may not be visible. Information can also be hidden from search engines because it is in the wrong format; sites can repel search engine spiders; and pages within large sites may never be indexed. The vast majority of web data is actually hidden from search engines. Some estimates put the figure at between 90 and 95% of total web content. It may be much higher.

The Searchable Web is the ‘surface web’: all of the information that can be found using standard search engines. It consists of information in the form of markup languages for displaying web pages, but it also includes data in other formats that site owners have made available online, such as PDF, Excel or Word documents. Data is also available in many other forms including, for example, images and KML, which expresses geographic data on maps. You can use advanced tools for faster and more accurate search.

There are a myriad of sources for finding people and for finding out about people. Search engines can be useful, but information about people is often held on databases – on professional registers for example. Specialist tools such as 192.com are also important, as are people-focused search tools such as zoominfo.com.
But social media sites are obviously critical sources, as so many people use at least one of the big three on a daily basis: Facebook, LinkedIn and Twitter. Social media has transformed the way many people use and interact with the internet. People are no longer defined by where they work and the websites controlled by their bosses. And, equally, companies and organisations must use social media to maintain a profile. But the best way to access publicly available information on social networks efficiently is not obvious.
Advanced search techniques can take you much further, faster.

Breaking News, or 'Real Time Media', is the stuff that's going on now on social media sites. The traditional news media can’t compete with the rapidly changing content published on social media by people either commenting on breaking stories or actually caught up in news events. Hurricane Sandy, recent elections, the riots in England in 2011, the revolution in Egypt – all demonstrate that the internet is a vital source of information on breaking stories. Brilliant tools exist to monitor, analyse and geolocate comments and posts – as they happen.

Internet Research For Journalists and Researchers - A Visualisation
Introduction
The green circles represent major sources of information on the web.
The blue rectangles and brackets describe examples of tools that can be used to tap into the web sources.

Geolocation
Advanced Twitter
RSS
Simple Solutions...
Tweetdeck to: monitor and filter Twitter topics and lists; communicate with users
Geolocate Foursquare users
Use Twitter Search operators to pin-point specific users and topics
Geolocate specific tweets
Combine operators for powerful search
Filter to find only the most useful content
Standard search tactics
Analyse connections and Twitter networks
Connections between hand-picked Twitter users
Connections between people in a list
Search and monitor Twitter lists
Monitor trends, hashtags and topics
Hone in on locations...
...and Twitter users
Lets you monitor news and other content automatically
Simple solutions include Netvibes but other tools give you more flexibility...
Topsy lets you create RSS feeds or alerts for Twitter searches
Search for feeds and subscribe
Filter feeds so you quickly get only relevant stories
Create feeds for news searches
Use search engine advanced operators for faster, more precise and flexible search
Combine operators for precision
Focus your results on specific sites and specific documents
Use other powerful tools to tap into Twitter conversations and users
Tweetarchivist lets you download Twitter archives, analyse Twitter conversations by topic and locate linked web content
Use neoformix to analyse tweets from one user
Think outside the box... use image searching as a research tool
Use filters to obtain only focused content
Use advanced searching to hone results
Reverse image search to check on provenance
Use reverse image search instantly with browser extensions
Use reverse image search to check pictures found on social media
Search and monitor Instagram images by tag
Create galleries and monitor them automatically
Create rolling feeds of images
Focus in on users by topic
Use Tweeted Times... to create a 'magazine' from links recommended by Twitter users you follow
Limit your search to just one site
Filter results by filetype
Remember - search engine results vary enormously and are not the same...
This shows the 100 top results for the same search term in 2 engines. Only 20% of the pages were found by both.
When you use search engines, exploit all of their functionality. Filter the results to focus on what really matters.
Try other search engines
Many search tools offer new and innovative tools...
'Alerts' services let you monitor search terms for automatic email or RSS notification
Maps and streetview for further information about people and places

Hidden Internet Search Tools
Interrogate social networks using advanced search commands
It is usually easier and much more effective to search social networks such as LinkedIn and Facebook using advanced search operators...
This search in Facebook returned just 1 result

Most of the content in databases is hidden from normal search.
Here are a few of the millions of databases that exist - illustrating the value and diversity of Hidden Web content.
The key to tapping into the Hidden Web is to search effectively for the database.
BAILII - a free database of all UK and Irish case law.
An example of a professional register.
One of many WHO databases for access to health information.
The world famous PubMed database gateway to medical literature. You can set up an RSS feed for your search.
The world's most authoritative database on toxic substances
One of many provided by the FDA
A database containing planning applications from across the UK
The database of UK legislation
The National Archives Portal. Access to critical data available nowhere else.
Access to the US Library of Congress

"Dissatisfied users of general-purpose search engines may mistakenly conclude that, because they could not find what they needed, it does not exist in the web information world."
Devine and Egger-Sider, Going Beyond Google

Tools that are particularly good at finding sources of hidden web content
A subject directory constructed by academic librarians. With alerts and RSS services.
Scientific information search engine
"The Open Directory Project is the largest, most comprehensive human-edited directory of the Web."
IPL.org is another database administered by librarians.
Complete Planet is a searchable gateway to 70,000 accessible databases

People Search Tools
But this search in Google returned 7000+ results
This is the search restricted to 'posts'...
...and just one of the results
A search on Facebook for 'Gary Speed' the day he died
Hundreds of results with a search using Google
A similar search on the Occupy movement
The same technique can be used on many social networks including LinkedIn
Turn Google searches into RSS feeds

Web Monitoring
Understand the limitations of search engines and the problems related to over-reliance on just one tool...
"Personalization algorithms typically look at what you click first. On the Internet, code is the new gatekeeper. It’s making value decisions, but it doesn’t have any value system built in. 'It may be showing us what we like, but it’s not showing us what matters'."
Eli Pariser, 2011
Beware of results 'tracking' and the 'filter bubble' concept, which means we get personalised results whether we want them or not:
Yahoo axis search allows you to search without leaving a page and you can 'sync' searches between your tablet, phone and desktop. With an option for image search
Duck Duck Go is one of the best alternatives to Google
Unfiltered results and no tracking
You can also use simple news aggregation sources to monitor web and social media content automatically. These magazines obtain content for you on specified subject areas...
www.pulse.me
sulia.com
www.zite.com (tablet only)
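
The transcript's advice to limit a search to one site, filter results by filetype and combine operators for precision can be sketched in Python. This is only an illustration and is not taken from the presentation: the `site:` and `filetype:` spellings follow common Google usage, and the function names `build_query` and `search_url`, the example search terms and the `gov.uk` domain are hypothetical choices made for the example.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None, exact_phrase=None):
    """Combine plain search terms with optional operators into one query string.

    The operator spellings (site:, filetype:) are assumed from common
    Google usage; other search engines may spell them differently.
    """
    parts = []
    if exact_phrase:
        # Quotation marks ask the engine for an exact phrase match.
        parts.append(f'"{exact_phrase}"')
    parts.extend(terms)
    if site:
        # Limit the search to just one site or domain.
        parts.append(f"site:{site}")
    if filetype:
        # Filter results by document type, e.g. pdf or xls.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Turn a query string into a URL that can be opened in a browser."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Hypothetical example: PDF documents about planning applications
    # published on UK government sites.
    query = build_query(["planning", "applications"], site="gov.uk", filetype="pdf")
    print(query)
    print(search_url(query))
```

Keeping the query builder separate from the URL builder means the same combined query can be pasted into a different search engine, which fits the advice above not to rely on just one tool.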