
Dominik Schönhofer

on 22 October 2012


Transcript of 2012-10-22_Proseminar-Bioinformatics_Schönhofer-Dominik_Biological-Databases

Biological Databases
Dominik Schönhofer
2012.10.22

What are Databases?

1.1 What is a Database?
- an organized collection of information
- information stored in computer-readable form
- in bioinformatics, mostly variants of SQL or XML are in use

A "database" is built out of two parts: the database (DB) and the database management system (DBMS).

The DBMS:
- its function is to store information efficiently in terms of performance, availability and security
- has to control all read and write access to the database
- provides a database language for defining, manipulating or querying the database

1.2 Organization of DBs
Two types of DBs are in common use:
- hierarchical structure (XML):
-> multi-level clustering
-> evolutionary relationships
- relational database (SQL):
-> tables
-> set-theoretic operators
- flat files

1.3 Databases in biology, or "what makes a database a biological one?"
- the contents:
-> genes and genomes
-> RNA
-> protein sequences
-> metabolic pathways

1.4 A simple example for using a database to handle biological information

2 Database contents
- most DBs are created for a circumscribed subject
- DBs can contain overlapping material
- primary and secondary DBs
-> primary: gather the data
-> secondary: recombine or reformat data from primary DBs

3 Creation and annotation
- data are supplied by professionals (e.g. a gene sequence)
- an entry can also include:
-> reference information
-> interpretative information
-> links

4 Quality control
Errors are difficult to extirpate...
- 'get it right the first time'
-> let authors create their entries!
- remove errors when they are detected
-> have entries examined by professionals
-> correct errors in the master copy
- if other DBs have assimilated errors
-> Knowbots, or offer 'health checks'

5.1 Database access
- users mostly access a DB via a browser application (front end)
- access to most databases in molecular biology is free of charge
- often users can only read the data
- extended modification rights are needed if external users are to curate data

5.2 The front end
- nowadays mostly a graphical interface
- interacts with the DBMS to access the DB
- allows interaction between user and DB
-> the user can insert, query and view data
- different front ends can access the same DB
- how information is interpreted depends on the front end

6 Links
"The utility of a database depends on the quality of its links as well as on its contents" (Lesk, 2008, p. 157)
- internal links, to navigate around the DB
-> also allow selected data to be analyzed
- external links, to connect to other DBs

7.1 Database interoperability
Some questions need to consult multiple DBs at once. How to deal with that?
- several databases can be merged into a single one
- methods can be developed that allow dissection and distribution of queries and recombination of responses
http://nar.oxfordjournals.org (lists 1380 DBs in 2012)

7.2 Database interoperability
ENTREZ: an example of a meta database
- the name is the French second person plural for "Come in!"
- an integrated search and retrieval system
- a single query and user interface
- searches more than 25 DBs hosted at NCBI, like PubMed, GenBank, OMIM, ...
- several services like email notifications or illustration of retrieved data

8.1 Data mining
- knowledge discovery: description or explanation of regularities in data
- successful forecasting

8.2 Statistical techniques
- classification algorithms
- clustering algorithms
- principal component analysis
-> Hidden Markov Models for detecting homologous sequences

8.3 Artificial neural networks
e.g. prediction of secondary structure

8.4 Support vector machines
- used for classification
- large margin classifier
- used in supervised machine learning

Thanks for your attention!

References
- Lesk A (2008) Introduction to Bioinformatics, 3rd ed. Oxford: Oxford University Press
- Zvelebil M et al. (2008) Understanding Bioinformatics. New York: Garland Science, Taylor & Francis Group, LLC
- Westhead D, Parish J (2002) Bioinformatics. Guildford: Biddles Ltd
- http://de.wikipedia.org/wiki/Datenbank (18 Oct. 2012)
- http://en.wikipedia.org/wiki/Database (18 Oct. 2012)
- http://de.wikipedia.org/wiki/Relationale_Datenbank (18 Oct. 2012)
- http://de.wikipedia.org/wiki/Front-End (18 Oct. 2012)
- http://en.wikipedia.org/wiki/Entrez (18 Oct. 2012)
- http://de.wikipedia.org/wiki/Data_Mining (18 Oct. 2012)
- http://mips.helmholtz-muenchen.de/cider (18 Oct. 2012)
- http://mips.helmholtz-muenchen.de/HSC (18 Oct. 2012)
- http://dev.mysql.com/doc/refman/5.6/en/ (20 Oct. 2012)
- https://kb.askmonty.org/en/ (20 Oct. 2012)

Image sources
[1] http://en.wikipedia.org/wiki/File:UPlogo1.png (18 Oct. 2012)
[2] http://en.wikipedia.org/wiki/File:KEGG_database_logo.gif (18 Oct. 2012)
[3] http://static.ensembl.org/i/e-ensembl.png (18 Oct. 2012)
[4] http://www.mirbase.org/images/mirbase-logo-blue-web.png (18 Oct. 2012)
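
Supplementary sketch for 1.1 and 1.4: the slides name a database language "for defining, manipulating or querying the database" and announce a simple example with biological information. The following is a minimal illustration of those three roles, not the example shown in the talk. It assumes Python's built-in sqlite3 module as the DBMS; the table name `genes`, its columns and all rows are made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'genes' table."""
    conn = sqlite3.connect(":memory:")
    # Defining: describe the structure of the table.
    conn.execute(
        "CREATE TABLE genes ("
        "symbol TEXT PRIMARY KEY, organism TEXT, length_bp INTEGER)"
    )
    # Manipulating: insert a few made-up example rows.
    rows = [
        ("geneA", "Homo sapiens", 1200),
        ("geneB", "Homo sapiens", 450),
        ("geneC", "Mus musculus", 980),
    ]
    conn.executemany("INSERT INTO genes VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def genes_longer_than(conn, organism, min_length):
    """Querying: symbols of one organism's genes above a length threshold."""
    cursor = conn.execute(
        "SELECT symbol FROM genes "
        "WHERE organism = ? AND length_bp > ? ORDER BY symbol",
        (organism, min_length),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    demo = build_demo_db()
    print(genes_longer_than(demo, "Homo sapiens", 500))
```

The user code never touches the stored data directly; every read and write goes through the DBMS, which matches the access-control role listed in 1.1.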
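
Supplementary sketch for 1.2: the hierarchical (XML) organization with multi-level clustering can be pictured as a small tree. The element name `taxon`, the attribute `name` and the tiny taxonomy below are assumptions chosen for illustration; the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# A made-up, three-level hierarchy: class -> order -> species.
TAXONOMY_XML = """
<taxon name="Mammalia">
  <taxon name="Primates">
    <taxon name="Homo sapiens"/>
  </taxon>
  <taxon name="Rodentia">
    <taxon name="Mus musculus"/>
    <taxon name="Rattus norvegicus"/>
  </taxon>
</taxon>
"""


def lineage(node, species, path=()):
    """Return the names from the root down to `species`, or None if absent."""
    path = path + (node.get("name"),)
    if node.get("name") == species:
        return list(path)
    for child in node.findall("taxon"):
        found = lineage(child, species, path)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    root = ET.fromstring(TAXONOMY_XML.strip())
    print(lineage(root, "Mus musculus"))
```

In a relational layout the same information would typically be stored as rows that point to a parent row, and the lineage would be recovered by repeated lookups rather than by walking nested elements.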
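
Supplementary sketch for 7.1: the second approach on that slide, distributing a query and recombining the responses, might look roughly like this. Everything here is hypothetical: the two in-memory dictionaries stand in for separate remote databases, and the function names are invented for illustration.

```python
def federated_search(sources, query):
    """Distribute `query` to every source and recombine the responses.

    `sources` maps a source name to a search function returning
    (record_id, annotation) pairs. The result maps each record ID to the
    annotations found for it, keyed by the name of the source.
    """
    merged = {}
    for name, search in sources.items():
        for record_id, annotation in search(query):
            merged.setdefault(record_id, {})[name] = annotation
    return merged


# Two made-up "databases" that overlap in one record (P1).
SEQUENCE_DB = {"P1": "MKTAYIAK", "P2": "MSDNE"}
PATHWAY_DB = {"P1": "glycolysis", "P3": "citrate cycle"}


def search_sequences(query):
    """Return sequence records whose ID starts with `query`."""
    return [(rid, seq) for rid, seq in SEQUENCE_DB.items() if rid.startswith(query)]


def search_pathways(query):
    """Return pathway records whose ID starts with `query`."""
    return [(rid, path) for rid, path in PATHWAY_DB.items() if rid.startswith(query)]


if __name__ == "__main__":
    sources = {"sequence": search_sequences, "pathway": search_pathways}
    print(federated_search(sources, "P"))
```

The sketch sidesteps the hard part of real interoperability, which is that different DBs rarely share record identifiers; here a common ID is simply assumed.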
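
Supplementary sketch for 7.2: the slides present ENTREZ through its browser front end. NCBI also offers programmatic access to Entrez via the E-utilities, which the slides do not cover. The sketch below only builds a search URL and sends nothing over the network. It assumes the E-utilities `esearch.fcgi` endpoint with its `db`, `term` and `retmax` parameters; the helper name `esearch_url` is hypothetical, and the NCBI documentation should be checked before relying on any of this.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (an assumption to verify against
# the current NCBI documentation).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an Entrez search URL for one database and one query term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Same query text, two different Entrez databases.
    print(esearch_url("pubmed", "biological databases", retmax=5))
    print(esearch_url("omim", "biological databases", retmax=5))
```

Only the `db` value changes between the two calls, which reflects the "single query and user interface" point made on the slide.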