
Data Warehouse Presentation


Gh Karim

on 19 July 2013


Transcript of Data Warehouse Presentation

Let us start

Group members:
Ganga Muppidi
Ghaida AlRwdi
Samir Al-Othary
Sheraz Ahmed

How has Big Data emerged?
Who is dealing with Big Data now?
Well-known enterprises
Examples/Case studies:
How is it important?
What are the sources of Big Data?
Why is Big Data important? (Benefits)
FACTS
Definitions

What is Big Data?
According to the McKinsey Global Institute, Big Data "refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze".
Intel adopts the definition: "Big Data as a new generation of technologies and architecture, designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery and/or analysis".

There are some facts that McKinsey has highlighted. For example:
• There are 30 billion pieces of information shared on Facebook each month.

• A disk drive that can store all of the world's music costs only $600 today.

• About 15 of 17 industry sectors in the United States have more data per company, on average, than the US Library of Congress.

According to IBM:

• Every day, about 2.5 quintillion (2.5 × 10^18) bytes of data are created.
• 90% of the data in the world today has been created in the last two years alone.

• Lower risk by detecting fraud and monitoring cyber security in real time, by collecting information from new types of sources such as social media, emails and sensors all together.

• Enable businesses to gain a full understanding of their customers: for example, what they buy, why they buy, what they prefer, why they switch and even what they will buy next.

According to Gary Spakes, there are four ways Big Data can benefit businesses:
• Detect, prevent and remediate financial fraud.
• Calculate risk on large portfolios.
• Execute high-value marketing campaigns.
• Improve delinquent collections.

Those are the famous ones, but there are many more.

T-Mobile USA has integrated Big Data across multiple IT systems to combine customer transaction and interaction data in order to better predict customer defections. By leveraging social media data (Big Data) along with transaction data from CRM and billing systems, T-Mobile USA has been able to “cut customer defections in half in a single quarter”.

McLaren’s Formula One racing team uses real-time car sensor data during races, identifies issues with its racing cars using predictive analytics and takes corrective action proactively, before it is too late.

Data Quality
Big Data and Data Mining
Big Data Applications
Dealing with Big Data
Relation between data mining and big data

Data quality is becoming an important topic in the big data discussion (Ghandour, M., 2006).

DQS Process
Need with DQS
Introducing Data Quality

Data Quality
What Is Data?
It is a stored representation of objects and events that have meaning and importance in the user’s environment.
Data can be structured or unstructured.

What Is Data Quality?
Data quality is not defined in absolute terms. It depends upon whether data is appropriate for the purpose for which it is intended.

DQS (Data Quality Services) provides the following features to resolve data quality issues:

Data Cleansing
The modification, removal, or enrichment of data that is incorrect or incomplete, using both computer-assisted and interactive processes.
Data Matching
The identification of semantic duplicates in a rules-based process that enables you to determine what constitutes a match and perform de-duplication.
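
A rules-based match of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the DQS interface: the record fields (name, email), the matching rule (same normalized name and same e-mail address) and the helper names normalize, match_key and deduplicate are all assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(value: str) -> str:
    """Lower-case the text, drop punctuation and collapse repeated spaces."""
    value = re.sub(r"[^\w\s]", "", value.lower())
    return re.sub(r"\s+", " ", value).strip()


def match_key(record: dict) -> tuple:
    # Hypothetical matching rule: two records are semantic duplicates
    # when their normalized name and their e-mail address are equal.
    return (normalize(record["name"]), record["email"].strip().lower())


def deduplicate(records: list[dict]) -> list[dict]:
    """Group records by the matching rule and keep the first of each group."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group[0] for group in groups.values()]


# Made-up sample data: the first two rows describe the same customer.
customers = [
    {"name": "Acme Corp.", "email": "sales@acme.example"},
    {"name": "ACME  corp", "email": "Sales@Acme.example "},
    {"name": "Globex", "email": "info@globex.example"},
]

unique_customers = deduplicate(customers)
for customer in unique_customers:
    print(customer["name"], customer["email"])
```

Keeping the first record of each group is the simplest survivorship choice. A real data quality tool would normally let the user review the groups and decide which record survives.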

Data Profiling and Notifications in DQS
Profiling is a powerful tool in a DQS data quality solution.

• DQS identifies potentially incorrect data, and provides you with an assessment of the likelihood that the data is in fact incorrect.
• DQS provides you with a semantic understanding of the data so you can decide its appropriateness.
• DQS enables you to resolve issues involving incompleteness, lack of conformity, inconsistency, inaccuracy, invalidity, and data duplication.

Data quality is all about the complete set of interactions between people and the information they need to perform their work effectively and efficiently, whether creating data or applying it. Data quality is “consistently meeting all knowledge workers and end-customers expectations through information and information services, to accomplish knowledge worker objectives and customer objectives”.

There are two main steps to building reliable business information:
• Implement a data correction process and technology solution that helps you parse, standardize, verify and correct key pieces of data.
• Implement a process improvement and technology solution that transforms the culture and improves business processes so that defective data is eliminated or significantly reduced.

Big data is a popular term used to describe the exponential growth, availability and use of information, both structured and unstructured. Much has been written on the big data trend and how it can serve as the basis for innovation, differentiation and growth.
The hopeful vision for big data is that organizations will be able to harness relevant data and use it to make the best decisions.

• Recalculate entire risk portfolios in minutes and understand future possibilities to mitigate risk.
• Quickly identify customers who matter the most.
• Send tailored recommendations to mobile devices at just the right time, while customers are in the right location to take advantage of offers.
• Analyse data from social media to detect new market trends and changes in demand.
• Use data mining to detect fraudulent behaviour.
• Determine root causes of failures, issues and defects by investigating user sessions, network logs and machine sensors.

Big Data deals with the flow of information and the ability to analyse this data in real time (or near real time). Data mining is more the traditional search of the haystack for needles.

Data mining is an analytic process designed to explore data (usually large amounts of data, typically business or market related, also known as "big data") in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction, and predictive data mining is the most common type of data mining and the one with the most direct business applications. The process of data mining consists of three stages: (1) initial exploration, (2) model building or pattern identification with validation/verification, and (3) deployment (i.e., the application of the model to new data in order to generate predictions).

Definitions
According to Gartner's Merv Adrian, Big Data "exceeds the reach of commonly used hardware environment and software tools to capture, manage, and process it within a tolerable elapsed time for its user population".
Our group definition:
"Big Data refers to complex and large sets of data that today's systems and traditional data processing applications cannot deal with in terms of processing, storing, sharing, transferring and analysing"

How is it important?
Like data warehousing, it can help predict the future of the business and improve decision making:
• Healthcare: 20% decrease in patient mortality by analyzing streaming patient data
• Telco: 92% decrease in processing time by analyzing networking and call data
• Utilities: 99% improved accuracy in placing power generation resources by analyzing 2.8 petabytes of untapped data

Big Data and Data Warehousing
By Ghaida AlRwdi
By Sheraz Ahmed
By Ganga Muppidi

Big data refers to data sets so large they become awkward to capture, store, search, analyze and visualize using conventional tools. Much of this data is in the unstructured form of documents, videos, or text that is difficult to fit into traditional databases.

It also contains “multiple versions of the truth” in the form of data organized for different purposes at different times, or similar data obtained from different sources.

As a result, existing storage management technologies and processes cannot make all this information available in a neatly organized warehouse accessible when the business needs it.

Technology initiatives include ensuring that the tools needed to navigate big data are usable by the intended audience, and that the architecture and supporting network, technology and software infrastructures are capable of supporting big data.

Organizations should also develop detailed metrics to assess their big data management programs, including the time required to turn data into insight, to integrate new and existing information sources and to manage the data, as well as the value derived from the data.

• Information directly related to the creation, extraction or capture of value.
• Supporting information that helps define a strategy to create, extract or capture value.
• Information required for business operations but not necessarily related to value creation, extraction or capture.
• Historical information that once supported value, regulatory or other functions but is now kept only because it might be useful in the future.

Policing and Big Data
So will Big Data replace the data warehouse?
Ralph Kimball
Problems with Big Data:
ShotSpotter
Big Data and Your Data Warehouse - TDWI.org
Big Data and Data Warehouse

Barry Devlin states that “For more than 25 years, data warehousing has been the accepted architecture for providing information to support decision makers” and “The primary driver for data warehousing was to reconcile data from multiple operational systems and to provide a single, easily-understood source of consistent information to decision makers.”

According to Philip Russom - Business Intelligence and Data: “Just a handful of years ago, Big Data was a problem in terms of scaling up IT systems and discovering the business value. Thanks to advances in vendor platforms and user practices, most enterprises today consider Big Data an opportunity – not a problem – because they can mine it and analyze it for valuable business insight.”

Oakland in California: like many other American cities, it is today covered with hundreds of hidden microphones and sensors, part of a system known as ShotSpotter, which not only alerts the police to the sound of gunshots but also triangulates their location. On verifying that the noises are actual gunshots, a human operator then informs the police.

Instead of detecting gunshots, new and smarter systems can focus on detecting the sounds that have preceded gunshots in the past. This is where the techniques and ideologies of big data make another appearance, promising that a greater, deeper analysis of data about past crimes, combined with sophisticated algorithms, can predict – and prevent – future ones.

PRIVACY

Any ????

“The enterprise data warehouse must absolutely stay relevant to the business. As the value and the visibility of big data analytics grows, the data warehouse must encompass the new culture, skills, techniques, and systems required for big data analytics.”

By Samir Al-Othary

Thanks for lending us your ears
Joining Everything Together

Will data warehousing survive the advent of big data? - O'Reilly.com
How Facebook could get you arrested - The Guardian

Microsoft PhotoDNA & Facebook
In 2011 Facebook began using PhotoDNA, a Microsoft service that allows it to scan every uploaded picture and compare it with child-abuse images from the FBI's National Crime Information Centre. Since then it has expanded its analysis beyond pictures as well. In mid-2012 Reuters reported on how Facebook, armed with its predictive algorithms, apprehended a middle-aged man chatting about sex with a 13-year-old girl, arranging to meet her the day after. The police contacted the teen, took over her computer, and caught the man.

Source of information
Too overwhelming

The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics

Barry Devlin: "The ultimate reason is that data warehousing seeks to ensure that enterprise-wide decision making is consistent and trusted"

Philip Russom: "So you will need to modify your tried-and-true best practices for EDW data integration, data quality, and data modeling in order to take advantage of Big Data."

Adrian, Merv (2011), It’s going mainstream and it’s your next opportunity. [Online]. Teradata Magazine, 1:11. Last accessed 8 April 2013 at: http://www.teradatamagazine.com/uploadedFiles/TDMO/v11n01/Articles/PDFs/Big-Data.pdf

McKinsey Global Institute (2011), Big Data: The Next Frontier for Innovation, Competition, and Productivity. [Online]. Last accessed 8 April 2013 at: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

Intel (2013), The Big Facts about Big Data. [Online]. Last accessed 8 April 2013 at: http://www.intel.co.uk/content/www/uk/en/big-data/big-facts-about-big-data.html

IBM (2013), Big Data at the Speed of Business. [Online]. Last accessed 8 April 2013 at: http://www-01.ibm.com/software/data/bigdata/

Spakes, Gary, Four ways big data can benefit your business. [Online]. Last accessed 8 April 2013 at: http://www.sas.com/news/feature/big-data-benefits.html

Dr Kotadia, Harish (2012), 4 Excellent Big Data Case Studies. [Online]. Last accessed 8 April 2013 at: http://hkotadia.com/archives/5021

Franks, Bill (2012), Taming The Big Data. New Jersey, USA: Wiley & SAS.

Ghandour, M. (2006, 22 Dec). http://www.xing.com/net/tqm/does-tqm-work-6539/data-quality-is-important-to-every-organization-3145676

MSDN.com (2013). http://msdn.microsoft.com/en-us/library/ff877917.aspx

http://www.cognizant.com/insights/perspectives/dealing-with-big-data




