Extending the ETL Process Through a Working Data Interchange Graph and Understanding Different Realities of Data

Brian Ballsun-Stanton

on 30 October 2012

Transcript of Extending the ETL Process Through a Working Data Interchange Graph and Understanding Different Realities of Data

A tool to move archaeological data collection and dissemination into this century.

A repository and a mobile tool: the Federated Archaeological Information Management System.

An extension and mirror of tDAR, building on the AHAD database.

As a Repository...
A repository for (all) Australian Archaeological Data, which needs well-formed data to be useful.

Useful, well-formed data...
In order for the data to be useful, it must suit your project. Deploying a "mega-ontology" like CIDOC doesn't fit anyone's needs.

As a mobile tool...
It's a way of recording useful, well-formed archaeological observations in the field.

We propose "a18n": archaeolocialization

A contradiction in terms?

The Observation.
What is the atomic unit of Archaeological Data?

Forced into tabular form!
Archaeological Data is hobbled by poor data models!

Data must be a function of the archaeologist's skilled observation, not the collection methodology.

Technology warps data!

Humans have different understandings of the "reality" of data!
Humans warp data!

3 Different "Realities" of Data:
Data as:
Subjective Observation
Objective Measurement
Human Communication

A basis for federation
An Archaeological Data Interchange Graph

An indivisible set of attributes which, separately, cannot exist and which serve to identify an archaeological ... thing.

When all you have is Excel, everything looks like a spreadsheet!

For people who cannot make data models bow to their whims, their data must conform to the model, not vice versa.

Confusion and poor models abound. Modellers may be providing a model for something entirely different, but called by the same name: "Data".

My research has shown 3 philosophies.

Models created by one philosophy may not "make sense" to others, as they seem to record unimportant or actively detrimental things.

The format needs to impose granularity restrictions and constrained vocabularies. It needs to provide a common language that does not restrict research.

The Observation is King

A method of passing observations is needed. Observations are non-relational structured data.

An observation must be a thing-in-itself in order to provide for trivial observation passing.

Observations should expose no redundancies nor sparsely populated fields. Only things that matter should be passed.

A Methodology of Federation

ArchaeoML has it almost right: right approach to item entities, wrong approach towards containers.

Observations are Messages

ProtoBufs (or equivalent) as Observation Containers

Inspired by OCHRE/ArchaeoML

We need to implement it internally to pass observations between devices and between Field Collection and Repository Archiving.
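
To make the idea of an observation as a stand-alone message concrete, here is a minimal sketch in Python. It is not the FAIMS format: the class name, field names and sample values are all hypothetical, and JSON is used only as a stand-in for ProtoBufs (or an equivalent container) so that the example runs with the standard library alone.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Observation:
    """A stand-alone observation. All field names here are hypothetical."""
    observation_id: str
    observer: str
    # Only attributes that were actually recorded are stored, so the
    # message carries no sparsely populated fields.
    attributes: dict = field(default_factory=dict)
    # Associations travel with the observation as plain references
    # instead of being implied by a parent container.
    associations: list = field(default_factory=list)

    def to_message(self) -> str:
        """Serialize to a self-contained message (JSON as a stand-in)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_message(cls, message: str) -> "Observation":
        """Rebuild the observation on the receiving device or repository."""
        return cls(**json.loads(message))


# Hypothetical example: one observation passed from a field device.
obs = Observation(
    observation_id="obs-001",
    observer="field-tablet-1",
    attributes={"material": "ceramic", "colour": "red"},
    associations=["obs-000"],
)
wire = obs.to_message()
received = Observation.from_message(wire)
```

Because the message needs no surrounding table or hierarchy to be understood, the same round trip could serve between devices and between field collection and the repository.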

The anarchy of the competing thesauri masks a fundamental similarity.

From a Recording Point of View:
Implementing a18n through an iteratively built but controlled vocabulary with synonyms dramatically reduces the "data mapping problem."
Attribute Domains are chosen from a controlled list with synonyms.
The list is trivially extensible, but curated to merge duplicates.
Observations are not grouped within hierarchies, they are able to stand alone.
They carry their associations with them as aspects of the observation, not items within a parent set imposed for technical reasons or through a committee.
The archaeologist can choose which attributes to use and what they are called locally, but will be able to pass them to any participating system.
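
A controlled list with synonyms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FAIMS implementation: the class, its methods and the sample terms ("material", "fabric", "ware" and so on) are hypothetical.

```python
class ControlledVocabulary:
    """A curated attribute list in which local names are synonyms."""

    def __init__(self):
        # Maps every known name (lower-cased) to its canonical term.
        self._canonical = {}

    def add_term(self, canonical, synonyms=()):
        """Extend the list with a term and the local names used for it."""
        for name in (canonical, *synonyms):
            self._canonical[name.lower()] = canonical

    def merge(self, duplicate, canonical):
        """Curate the list: fold a duplicate term into an existing one."""
        for name, target in list(self._canonical.items()):
            if target == duplicate:
                self._canonical[name] = canonical

    def resolve(self, local_name):
        """Return the canonical term; raises KeyError for unknown names."""
        return self._canonical[local_name.lower()]

    def translate(self, attributes):
        """Rename locally named attributes to their canonical terms."""
        return {self.resolve(k): v for k, v in attributes.items()}


# Hypothetical vocabulary and one project's local attribute names.
vocab = ControlledVocabulary()
vocab.add_term("material", synonyms=["fabric", "ware"])
vocab.add_term("colour", synonyms=["color", "hue"])

local = {"Fabric": "ceramic", "hue": "red"}
shared = vocab.translate(local)

# A later duplicate entry is merged by the curator.
vocab.add_term("matter")
vocab.merge("matter", "material")
```

Each project keeps its own names, while any participating system sees the same canonical attributes.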
We are implementing these principles in the FAIMS Archaeological Observation Format in our Mobile Tool and in our Repository.
This format can serve as the basis of tightly-coupled federation in archaeological projects across the world.

Archaeological Message Format Requirements:
Items are Stand-Alone Observations
Attributes have Synonyms
No imposed Technical Hierarchy

"a18n" is an extension of i18n, the internationalization libraries programs use to translate between languages: we translate between the specialized jargon different archaeologists use. Most observations contain the same core data mapped to different names and organized in different ways.

With Thanks To: Arts eResearch