Workshop Semantic Web

An introduction to ontology, semiology, semantics and linked open data

Marco Brattinga

on 10 December 2012

Transcript of Workshop Semantic Web

Semantic web
Ontology, Semiotics, Semantic web, RDF & OWL, Applications

Ontology
Aristotle (384-322 BC), Metaphysics, Taxonomy, Partition maps, Perspectives
Kant (1724-1804), Constructivism
Medieval (1200-1600), Controlled (Latin)
Descartes (1596-1650), Epistemology
Boole (1815-1864)
Gödel (1906-1978), Incompleteness theorem
Russell (1872-1970), Russell's paradox
Frege (1848-1925), Predicate logic
Syntax, Semiotics, "Accidents", Compositional hierarchy, Proposition logic, Semantics, Pragmatics, Speech act, Social facts, Definitions, Pure reasoning
World wide web (web 1.0), Social networks (web 2.0), Semantic web (web 3.0)
Open world assumption, Anybody can say anything about anything
Semantic tagging (won't work), Semantic (data) tagging, Open linked data, Instance before class, Context
Semantic web stack, RDF, RDFS, URI & URL, OWL, SKOS, Reasoning, Triple store, SPARQL
Ontology, Big data, Domain ontology, Upper ontology, FOAF, Gene ontology, Dublin Core, Ontology mismatch
Linked Data Overheid, Metalex, Europeana, Kadaster Glossary

Aristotle focused on real things.
His ontology is about what a thing is all the time (in essence, essentially).

(A dog is always an animal)

In his view, accidents are about how a thing is at some time (accidentally).

Short haired, asleep, brown

It is not so much about what "is", but what "should" be known and what "can" be known.

There are no classification hierarchies in reality, unless we put them there.

We cannot know reality, we can only know concepts that resemble reality.

It is a closed system: we construct reality from these concepts.

S is the set of all sets which do not have themselves as a member.

Is S a member of itself?

"This statement cannot be proven"

Every system that is supposed to know all truths is either:

1. Incomplete: doesn't answer every question;
2. Incapable of proving its own consistency.

The structural relationship between signs.

The syntax describes which signs are possible, and how they can be combined.

The syntax formulates how we use language.

XML describes the syntax of a message.

The relationship between signs and the thing to which they refer, their meaning.

Semantics describes the meaning of signs.
Semantics formulates what we mean when we use language.

OWL and SKOS describe (part of) the semantics of a message.

a combination of one or more words
a term with one meaning in a specific context
the explanation of a concept from a specific perspective
an environment that dictates the meaning of terms
a viewpoint from which terms can be understood

A definition creates a relationship between the term defined and other terms that are already understood from a specific perspective.

"A bicycle is a transportation device with two wheels"

is a definition in which the concept bicycle is explained via its relation with the concepts transportation device and wheels.

When we say something, we:

(1) Utter some words (the data transported);
(2) Mean something with those words (the information shared);
(3) Do something (the act performed);
... and in the act, create something new.

Peter promises John to write a paper;
Clare orders a book;
Dick warns Hans not to take the train;
The office grants a permit to Igmar.

Social facts are facts that are considered true because people act as if they are true. Speech acts create social facts.

The Web is a system of interlinked documents accessed via the internet. Web browsers use HTTP to communicate with web servers. You can use hyperlinks to navigate between web documents. People can easily access any of these web documents. This is the largest source of publicly accessible documents ever.

Connecting people

Social networks, Wikis, Tags, Mobility, Blogs

The world wide web, not only for the people, but also by the people.
The largest communication platform ever.

Connecting documents

Web pages are written in HTML
HTML describes the structure of the data
HTML describes the syntax, not the semantics

All web pages use some language and vocabulary to express certain data

Syntax is how the data is structured
Semantics is the meaning of the data

If computers can understand the meaning behind this data...

Incomprehensible data would become understandable information.

So that computers could "learn" what we are interested in and help us to find what we want

Today's web is about documents

The semantic web is about things

It can recognize people, events, products, permits, organizations, buildings.

And understand the relationships between them.

Closed world: what we don't know, doesn't exist
Open world: what we don't know, might exist

Closed world:
if a person is not a man, she's a woman

Open world:
a person could be something else than a man or woman (I simply don't know, right now)

Closed world:
For every thing there is one slot
The slot defines what the thing in the slot is.

Open world:
A thing is just a thing
The tags define what a thing is.

Object orientation:
An instance is the instantiation of a class.
Its properties are inherited from its class.

The class comes first, then the instance.

Semantics:
An instance is of a certain class, if it shares its properties (classification).
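
Classification by shared properties, as described above, might be sketched roughly as follows. This is only an illustration: the class names, the property names and the idea of describing a class as a plain set of required properties are assumptions made for the example, not something taken from the slides or from OWL itself.

```python
# Hypothetical class descriptions: each class lists the properties an
# instance must share to be classified under it (illustrative names only).
CLASS_DEFINITIONS = {
    "Animal": {"breathes"},
    "Dog": {"breathes", "barks"},
    "Cat": {"breathes", "meows"},
}


def classify(instance_properties):
    """Return every class whose defining properties the instance shares."""
    properties = set(instance_properties)
    return sorted(
        name
        for name, required in CLASS_DEFINITIONS.items()
        if required <= properties  # subset test: all required properties present
    )


# The instance exists first, as a bag of properties; its classes follow from them.
max_the_dog = {"breathes", "barks", "brown", "short haired"}
print(classify(max_the_dog))
```

Nothing in the instance says "I am a Dog"; the classes are derived afterwards from the properties it happens to have.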

The instance comes first, then the class.

Accessible for everyone
Data is linked with other data
Not documents, but bits of information

All things (real or conceptual) are identified by URLs
Browsing a URL gives you information about the thing
The information contains links to other things
The information contains links to other things Big data: a collection of data so large and complex, that normal (data-driven) methods of capture, searching, retrieving an analysis fail. "The idea is to tag HTML pages with semantic annotations, using standard ontologies, so that computers can find and interpret information
on regular web pages" 1. Unreliable
2. Lot of work
3. Mistakes
4. Different opinions
5. Different usage Correct interpretation by a computer relies on the provided tags. But what if somebody doesn't want something to be found? Or just make a mistake?
Different opions exist: what someone might find a
perfect tag, could be completely wrong for another person. This could also be effected by usage: a
hammer could "mean" a part of a music
instrument, or "mean" a
carpenter's tool. 1. Reliable
2. Automated work
3. No mistakes
4. Multiple perspectives
5. Usage defines meaning By tagging structured data, we could process the data by computers using predefined ontologies and rules.
Within these boundaries: no mistakes are made. Multiple perspectives can be applied, each with
their own set of tags. Meaning is defined
by linking usage with a specific
perspective. From the semantic web:
Every "thing" (or "resource") has a unique URL

URI: Uniform Resource Identifier
URL: Uniform Resource Locator

A URI identifies a resource, it identifies:

A Resource Description Framework describing data as triples of

< Subject - Predicate - Object >

A triple describes something about the subject

A subject is always a resource (identified by a URI)
A predicate is always a resource
An object is a resource or a literal (text, number, date)

RDFS is a schema for giving meaning to resources

Taxonomy: a hierarchical classification
Reality is organised via universals that are hierarchically related

Partition maps: a way to order reality (also known as Venn diagrams)

Partonomy: a compositional hierarchy

Classification: a fish is an animal
Composition: a computer case is part of a personal computer

Perspectives: there is no single classification

A shark is a carnivore AND a fish
A dolphin is a carnivore AND a mammal
A venus flytrap is a carnivore AND a plant
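
The three examples above could be written down as two independent classifications of the same things. This is a small sketch only; the perspective names "diet" and "kind" are labels invented for the example.

```python
# Two perspectives on the same things, using the examples above.
# The perspective names "diet" and "kind" are assumptions for illustration.
PERSPECTIVES = {
    "diet": {"shark": "carnivore", "dolphin": "carnivore", "venus flytrap": "carnivore"},
    "kind": {"shark": "fish", "dolphin": "mammal", "venus flytrap": "plant"},
}


def classes_of(thing):
    """Return the class of `thing` under every perspective that knows it."""
    return {
        perspective: classes[thing]
        for perspective, classes in PERSPECTIVES.items()
        if thing in classes
    }


def members(perspective, cls):
    """Return all things classified as `cls` within one perspective."""
    return sorted(t for t, c in PERSPECTIVES[perspective].items() if c == cls)
```

Neither perspective is "the" classification: each one groups the same three things differently.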

Common perspective versus theoretical perspective:

Dolphins live in the sea, so some people think they are fish.

A killer whale is not a whale, but a dolphin.

Cantor (1845 - 1918), Set theory

Semantic web = Semantics + Web

The web is a discovery from the 20th century.

To understand semantics, it takes a journey from ancient times to the 19th century.

Incompleteness theorem
Constructivism
Metaphysics

A universal is what particulars have in common.
A particular is a concrete "thing" in reality.

Dogness is what all dogs have in common.
Dogs are animals, cats are animals.
Animals are living beings.

An example (particular) of a dog is my neighbour's dog "Max".

Syntax

<BUSINESS-CARD>
<STREET>Aristotlestreet 1</STREET>
</BUSINESS-CARD>

Pragmatics

The relationship between signs and the effect they have on the people, organizations and systems that use them.

DEMO or BPMN describes (part of) the pragmatics of a message.

The pragmatics describe what the result should or will be when signs are used.
The pragmatics formulate what happens when we use language.

Semantics

Triangle of meaning

A sign (or term) symbolizes the notion a person has of a referent. Its true meaning is only known by that person.

A term can stand for multiple referents.

To understand each other, speakers have to agree about the notions they both have of the terms they use.

Definitions

Term:
Perspective:

[Diagram: RDF graph with the nodes Garfield, Simba, Dog, Cat, Berlin Zoo and Animal, connected by rdf:type, rdfs:subClassOf, zoo:hosts and rdfs:range]
[Diagram: RDFS schema with the nodes Animal, zoo:hosts, Zoo, rdfs:Class and rdf:Property, connected by rdf:type, rdfs:range and rdfs:domain]

Operational data is usually stored in an RDBMS
RDF data is usually stored in a triple store

RDBMS:
(1) Fixed, predefined schema (closed world assumption)
(2) Referential integrity (mistakes are prevented)
(3) Operational data (optimized for transaction volume)

Triple store:
(1) No predefined schema (open world assumption)
(2) Infer relations (new information is revealed)
(3) Operational knowledge (optimized for data volume)

Some triple stores are actually quad stores, to deal with the context of a triple, also known as the graph of a triple.

Data in an RDBMS is retrieved using SQL
RDF in a triple store is retrieved using SPARQL

Retrieve the names and email addresses of all persons

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
  ?person rdf:type foaf:Person.
  ?person foaf:name ?name.
  ?person foaf:mbox ?email.
}

Retrieve the names and (if available) email addresses of all persons

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
  ?person rdf:type foaf:Person.
  ?person foaf:name ?name.
  OPTIONAL { ?person foaf:mbox ?email }
}

Retrieve the surnames or family names of all persons

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?lastname
WHERE {
  ?person rdf:type foaf:Person.
  { ?person foaf:surname ?lastname }
  UNION
  { ?person foaf:familyName ?lastname }
}

OWL: The Web Ontology Language

OWL is a semantic extension on top of RDFS

Three flavours:
OWL Lite: For the most simple classification tasks
OWL DL: Maximal expressiveness while retaining decidability
OWL Full: Full expressive power, but undecidable

OWL is based on description logics: knowledge representation with a logic background, or, to put it differently:

OWL is a mixture of RDFS, Aristotelian classification, set theory, proposition logic, predicate logic and the world wide web.

Description logics

The Terminology box
statements about the concepts and their relationships. The T-Box is about universals; it contains the ontology.

The Assertion box
statements about individuals and their relationship to concepts and other individuals. The A-Box is about particulars; it is the fact-base in terms of the T-Box ontology.

Rules are statements that constrain or dictate change in the fact-base. Rules are not part of the fact-base, but form a special "box" within the T-Box. Rules are formulated in terms of the T-Box ontology.

Description logics

The extension of a class is a set containing all possible individuals of that class.

The intension of a class is its ontological definition.

Garfield rdf:type Cat
Simba rdf:type Cat
Odie rdf:type Dog

Cat rdfs:subClassOf Animal
Dog rdfs:subClassOf Animal
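
To make the idea of an extension concrete, the triples above can be loaded into a plain Python set and the rdfs:subClassOf relation followed transitively. This is a minimal sketch, not a real triple store or reasoner: the function names are invented for the example, and only the single inference "an instance of a subclass is an instance of the superclass" is covered.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# The example triples, as (subject, predicate, object) tuples.
triples = {
    ("Garfield", RDF_TYPE, "Cat"),
    ("Simba", RDF_TYPE, "Cat"),
    ("Odie", RDF_TYPE, "Dog"),
    ("Cat", SUBCLASS_OF, "Animal"),
    ("Dog", SUBCLASS_OF, "Animal"),
}


def superclasses(cls, graph):
    """All classes reachable from `cls` via rdfs:subClassOf (transitive)."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def extension(cls, graph):
    """Individuals typed as `cls`, directly or through a subclass."""
    members = set()
    for s, p, o in graph:
        if p == RDF_TYPE and (o == cls or cls in superclasses(o, graph)):
            members.add(s)
    return members
```

No triple states that Garfield is an Animal; `extension("Animal", triples)` still contains Garfield, Simba and Odie because the fact is inferred from the subclass statements.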

Pet-of-John rdfs:subClassOf Cat
Pet-of-John rdfs:subClassOf Dog

[Diagram: Odie, Garfield, Simba]

Speech act

A promise
An order
A warning
A permit

Social facts

Social fact: "Sesame Street is a children's program"
Real fact: "The last episode of Sesame Street was at 18:00 GMT"

Social facts are the creation of speech acts

An agreement (between John and Peter)
"John agrees with Peter to buy his car"
social fact created by the act

Instance before class

Open:
Data:

The internet is the largest source of documents

The open linked web will be the biggest collection of data

Big data

Semantics enable computers to learn the meaning of data, and organise big data in manageable chunks.

Reasoning

Formal semantics make it possible to reason with data:
.. to handle big data sets (intelligence, profiling)
.. to infer facts (decision support)
.. to validate correctness

But understand:
- Making a good data model is hard
- Making an ontology is harder
- Knowing how to do things is hard
- Knowing why to do things is harder

Simple Knowledge Organization System

If the goal is understanding between people, instead of reasoning, OWL is a bit too much. SKOS is designed to create controlled vocabularies, on top of RDFS.

To understand RDF data, you need an ontology
To share RDF data, you need a common ontology

Ontology mismatch

Understanding linked data requires linking different ontologies together. This is a challenge:

1. Ontologies specific for a domain, no generalisation;
2. Different levels of abstraction;
3. Reproduce silo problems;
4. No guarantee of quality;
5. Ontologies overlap.

To link different ontologies, an upper ontology is needed. It defines the most generic concepts that can be reused by different ontologies. These are the axioms for every ontology.

Reason: no one "true" ontology

By design, an ontology is a view on the world. The type of application drives the way the ontology is defined.

If you're interested in the stock of wine bottles in a wine cellar, your ontology would have a notion of individual bottles.

Instance of Class "Wine bottle" = single bottle in stock

If you're interested in what kind of wine tastes good with a certain kind of food, your ontology would only have a notion of vintage and vineyard.

Instance of Class "Wine bottle" = single item on a wine list

Bad taxonomies

Excellent example of a bad taxonomy

1. Those that belong to the emperor
2. Embalmed ones
3. Trained animals
4. Suckling pigs
5. Mermaids (or Sirens)
6. Fabulous animals
7. Stray dogs
8. Those included in the present classification
9. Those that tremble as if they were mad
10. Innumerable ones
11. Those painted on walls
12. Others
13. Those that have just broken the flower vase
14. Those that, at a distance, resemble flies

Context

Context: anything that can be "accidental" to a subject

Aristotle defined:
1. Quantity
2. Quality
3. Relations
4. Habit
5. Point in time
6. Place
7. Orientation
8. Action
9. Passion

Context might also be:
1. At what time
2. At what place
3. Source of information
4. Modality
5. Intended use
6. Intended audience

"Celestial Emporium of Benevolent Knowledge's Taxonomy"

- An object, place, situation;
- An idea, concept, notion;
- Statements about such things.

A URI should be unique:
any time, any place, any one

RDF graph: a collection of RDF triples

An RDF graph is identified by a URL
An RDF graph is also a resource

A triple with an RDF graph as its subject is...
... a statement about an RDF graph!

An RDF graph denotes the context of a set of triples.
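
One way to picture a graph as the context of a set of triples is to store each triple with a fourth element naming its graph. The sketch below is illustrative only: the graph URLs, the `ex:` names and the `ex:source` property are hypothetical, and a real quad store would offer far more than these two lookups.

```python
# Quads: (subject, predicate, object, graph). All identifiers are hypothetical.
ZOO_GRAPH = "http://example.org/graph/zoo"
META_GRAPH = "http://example.org/graph/meta"

quads = [
    ("ex:Garfield", "rdf:type", "ex:Cat", ZOO_GRAPH),
    ("ex:Odie", "rdf:type", "ex:Dog", ZOO_GRAPH),
    # The graph is itself a resource, so it can be the subject of a triple.
    (ZOO_GRAPH, "ex:source", "ex:BerlinZoo", META_GRAPH),
]


def triples_in(graph, store):
    """Return the triples whose context is the given graph."""
    return [(s, p, o) for s, p, o, g in store if g == graph]


def statements_about(graph, store):
    """Return (predicate, object) pairs that describe the graph itself."""
    return [(p, o) for s, p, o, _g in store if s == graph]
```

Here `triples_in(ZOO_GRAPH, quads)` gives the facts inside the graph, while `statements_about(ZOO_GRAPH, quads)` gives what has been said about the graph as a whole.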

Statements about graphs give provenance information:
when, how, why, where, who, what

CEN Metalex
Kadaster glossary

Thank you

http://www.linkedin.com/in/marcobrattinga