Transcript of Data Virtualization
AADAMA Salah-Eddine
Lecturer: Mr. SOUSSI Khalid

Plan

Definition

The term data store is used to refer to any source of data.
- When data virtualization is applied, an abstraction layer is provided.
- Because of that abstraction layer, applications don't need to know where all the data is physically stored.
It feels as if one large database is accessed.
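The idea of an abstraction layer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `VirtualizationLayer` class, the `customers` and `orders` tables, and the choice of an in-memory SQLite database and a CSV string as the two data stores. It is not how any particular data virtualization product works; it only shows an application reading two differently stored tables through one interface.

```python
import csv
import io
import sqlite3

# Store 1 (hypothetical): a relational database holding customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])

# Store 2 (hypothetical): a CSV file holding orders.
ORDERS_CSV = "customer_id,amount\n1,30\n1,12\n2,5\n"


class VirtualizationLayer:
    """Maps logical table names to functions that fetch rows as dicts."""

    def __init__(self):
        self._sources = {}

    def register(self, table, fetch):
        self._sources[table] = fetch

    def rows(self, table):
        # The application only names the table; the layer knows where it lives.
        return self._sources[table]()


def fetch_customers():
    cur = db.execute("SELECT id, name FROM customers")
    return [{"id": r[0], "name": r[1]} for r in cur]


def fetch_orders():
    reader = csv.DictReader(io.StringIO(ORDERS_CSV))
    return [
        {"customer_id": int(r["customer_id"]), "amount": int(r["amount"])}
        for r in reader
    ]


layer = VirtualizationLayer()
layer.register("customers", fetch_customers)
layer.register("orders", fetch_orders)


def total_per_customer(layer):
    """Application code: combines both tables without knowing where they are stored."""
    names = {c["id"]: c["name"] for c in layer.rows("customers")}
    totals = {}
    for order in layer.rows("orders"):
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0) + order["amount"]
    return totals
```

Calling `total_per_customer(layer)` combines rows from the database and the CSV data, yet the function itself contains no SQL and no file handling.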
Data consumer refers to any application that retrieves, enters, or manipulates data.
Data virtualization is the technology that offers data consumers a unified, abstracted, and encapsulated view for querying and manipulating data stored in a heterogeneous set of data stores.
- The concepts of data consumer and data store are key to the definition of data virtualization.
- The data virtualization layer presents the data stores to the data consumers as one integrated set of data.
- In addition, different languages can be used to access the same set of data.

DV in BI Systems

The average time needed to add a new data source to an existing business intelligence system was 8.4 weeks in 2009, 7.4 weeks in 2010, and 7.8 weeks in 2011.
33% of the organizations needed more than 3 months to add a new data source.
Developing a complex report or dashboard with about 20 dimensions, 12 measures, and 6 user access rules took on average 6.3 weeks in 2009, 6.6 weeks in 2010, and 7 weeks in 2011. Development time is therefore not improving over the years.
30% of the respondents indicated they needed 3 months or more for such a development exercise.

The architectures of most business intelligence systems are based on a chain of data stores.

Disadvantages of the classic systems:
Duplication of data
Non-shared meta data specifications
Decrease of Data Quality

The Implementation of Data Virtualization

Strategy, Planning & Governance
In this stage, we develop the framework for accomplishing our overall data virtualization objectives.

Solution Architecture
In this stage, we identify our future-state data virtualization architecture and establish our plan to get there.

Configuration & Setup
In the Configuration and Setup stage, we establish our overall data virtualization environment.

Design & Development
Here, we move from organization-wide data virtualization planning and setup activities to the detailed project-level software engineering.

Deployment
In the Deployment stage, we move new data virtualization solutions into production.

Operation
In the Operation stage, attention shifts to how to ensure reliable, high-performance operations that achieve SLA objectives.

Improvement
In the Continuous Improvement stage, our attention shifts to optimization of our data virtualization investments.

Advantages

Unified data access
Centralized data transformation
Centralized data cleansing
Centralized data integration
Consistent reporting results
Data store independency
Efficient distributed data access
Simplified table structures
Database language and API translation
Minimal data store interference

Unified Data Access
A data virtualization server can offer one unified API and database language to access all the different storage formats used by the data stores.

Centralized Data Integration
A data virtualization server centralizes integration code, and all data consumers share that integration code.

Consistent Reporting Results
If all the integration solutions are handled by a data virtualization server, it's likely that results will be consistent.

Efficient Distributed Data Access
Developers should not have to be concerned with how to access distributed data efficiently. This task is taken over by the data virtualization server.

Centralized Data Cleansing
A data virtualization server shows only the correct values to the data consumers.
This solution is better than replicating the data cleansing logic in all the data consumers.

Remark: If multiple data consumers use the same data virtualization server, they share the same meta data specifications.

Simplified Table Structures
Data virtualization can present a simpler and more appropriate table structure, simplifying application development and maintenance.
Every data consumer can benefit from those simplified table structures.

Centralized Data Transformation
Particular data values in a data store might have formats that aren't suitable for some data consumers.
A data virtualization server can implement this transformation once, and all the data consumers will use it.

Minimal Data Store Interference
Most data virtualization products support a caching mechanism.
The data consumer accesses the data in the cache instead of the data in the data store, thus minimizing interference on the source data store.

This all means that data virtualization simplifies application development, because it reduces the amount of code required for accessing the necessary data in the right way and form.

Integrating data stores without a data virtualization server leads to replication of data integration logic.

Data Store Independency
Data virtualization can hide differences between the database languages that data stores implement, making it possible to replace the current database server with another.
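Replacing one data store with another can be sketched as follows. The two store classes, the `price_of` method, and the sample prices are all hypothetical; the shared `price_of` interface merely stands in for what a data virtualization server would provide. The point of the sketch is that the data consumer (`invoice_total`) stays unchanged when the store behind it is swapped.

```python
import sqlite3


class DictStore:
    """Hypothetical current store: prices kept in a plain dictionary."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, product):
        return self._prices[product]


class SqliteStore:
    """Hypothetical replacement store: the same prices in a SQL database."""

    def __init__(self, prices):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE prices (product TEXT PRIMARY KEY, price REAL)"
        )
        self._db.executemany("INSERT INTO prices VALUES (?, ?)", prices.items())

    def price_of(self, product):
        row = self._db.execute(
            "SELECT price FROM prices WHERE product = ?", (product,)
        ).fetchone()
        return row[0]


def invoice_total(store, basket):
    """Data consumer: depends only on price_of(), not on the store technology."""
    return sum(store.price_of(product) * qty for product, qty in basket.items())


PRICES = {"pen": 1.5, "book": 10.0}
BASKET = {"pen": 2, "book": 1}

total_before = invoice_total(DictStore(PRICES), BASKET)
total_after = invoice_total(SqliteStore(PRICES), BASKET)
```

Both calls go through the same consumer code; only the object passed in as `store` differs.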
Data virtualization makes data consumers independent of a particular data store technology.

Database Language and API Translation
A data virtualization server is able to translate the language supported by the data store into one convenient for the data consumer.

Uses of Data Virtualization

Virtual Data Mart
Self-Service Reporting and Analytics
Operational Reporting and Analytics
Collaborative Development

Use 1:
Data Mart Virtualization
- With a data virtualization server, the existence of a data mart can be simulated.
- In the data virtualization server, virtual tables are defined with the same structure as the tables of the data mart.
Virtual data marts are an alternative to physical data marts.

Use 2:
Self-Service Reporting and Analytics
Business users are given access to any data store possible.
Risk: are users manipulating the data in a correct way?
A data virtualization server allows the IT department to meet the users halfway.
With data virtualization, the IT department can react more quickly when modifications are needed.
--> Data virtualization turns self-service reporting into managed self-service reporting.

Use 3:
Virtual Sandboxing
A virtual sandbox refers to a stand-alone and somewhat isolated environment set up for analysts and data scientists to study data and find answers to unclear or poorly defined questions.
With a data virtualization server, a virtual sandbox can be set up in a fraction of the time.
In addition, if other types of data are needed, only virtual tables have to be defined.
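As a rough analogy, SQL views behave like virtual tables: they are definitions, not copies of data. The sketch below uses Python's `sqlite3` module with a hypothetical `sales` table and two hypothetical views. A real data virtualization server would define virtual tables across several data stores, which a single SQLite database does not show; the sketch only illustrates that serving a new need means adding a definition.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Source data store (hypothetical): raw sales records.
db.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "pen", 10), ("north", "book", 25), ("south", "pen", 7)],
)

# First virtual table for the sandbox: totals per region.
db.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)

# The analyst later needs the data in another shape: only a new definition
# is added, nothing is copied or reloaded.
db.execute(
    "CREATE VIEW sales_by_product AS "
    "SELECT product, SUM(amount) AS total FROM sales GROUP BY product"
)

by_region = dict(db.execute("SELECT region, total FROM sales_by_region"))
by_product = dict(db.execute("SELECT product, total FROM sales_by_product"))
```

Because the views are evaluated when queried, both results always reflect the current contents of `sales`.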
--> To summarize, a virtual sandbox is more agile than a physical sandbox.

Use 4:
Interactive Prototyping
Data virtualization can be used as a prototyping instrument that offers a more interactive and dynamic environment in which users can participate.
Valuable time can be saved.
A prototype can more easily be developed using data virtualization.
If something is not to the users' liking, the specifications can be changed right away.

Applications
Applications are any information resources that serve the business, such as those created by IT in support of the business.

Mashups
Mashups are composite applications created by combining many different resources, including physical databases, abstracted databases, and APIs (on-premises or Web-delivered).

Abstracted data gives business intelligence, applications, and mashups access to the information enterprises need, in the structure they require.

Data Uses

Conclusion
Data virtualization technology is powerful and mature. But to gain the full benefit, organizations need to implement data virtualization using proven best practices.

“The value of data virtualization comes from successful implementation and continuing adoption. There are many places to start, but start nonetheless.”
Bob Eve

Webography & Bibliography

Thanks for your attention