
Christian Bonanno

on 23 June 2014

Transcript of CODE

Court Open Data

Jurgen Cassar
Daniel Desira
Christian Bonanno
Aim of the project
To make closed court cases more accessible and easier to use
Make data available in various formats
Current System

Information only available through website
Searching for a case requires legal knowledge
Lack of structure
Not all court cases are searchable
No linked data
Our System allows for
Creating a database from online text
Extracting data from PDF documents
Creating datasets out of raw HTML
Applications to be built using our datasets
Most use the system daily or weekly
Need search engine improvement
There is a need for better output
Thank You
Keyword Module
Relies on the Maltese POS tagger web service
Makes use of SOAPpy
Returns a list of the most frequently used common nouns
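The counting step of the keyword module could look roughly like the sketch below. It assumes the tagger response has already been turned into (word, tag) pairs and that common nouns carry the tag "NN"; both the pair format and the tag name are assumptions for illustration, not the actual output of the Maltese POS tagger web service. The SOAP call itself is left out.

```python
from collections import Counter

# Hypothetical tag for common nouns; the real tagger's mnemonic may differ.
COMMON_NOUN_TAG = "NN"


def top_common_nouns(tagged_words, limit=5):
    """Return the most frequent common nouns from (word, tag) pairs."""
    counts = Counter(
        word.lower() for word, tag in tagged_words if tag == COMMON_NOUN_TAG
    )
    return [word for word, _ in counts.most_common(limit)]


# Made-up sample standing in for a parsed web service response.
sample = [
    ("qorti", "NN"), ("iddecidiet", "VB"), ("appell", "NN"),
    ("qorti", "NN"), ("Malta", "NNP"), ("appell", "NN"), ("qorti", "NN"),
]
keywords = top_common_nouns(sample, limit=2)
```

Lower-casing before counting keeps "Qorti" and "qorti" from being treated as different keywords.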
Resulting System
Targeting small concepts which then can be expanded to a complete system
Most resources are targeted at clients for law students or law professionals
Integrated with keyword module
Provides for scalability
Designed with node-relationship in mind
Python micro-framework
View functions (more like controllers)
Myriads of extensions available
Transforms HTML to Data, Downloads PDF Documents
Daily updates
Handles exceptions
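A minimal sketch of the "HTML to data" step, using only the Python standard library. The page layout (a list of links ending in .pdf), the class name CaseLinkParser and the record fields are all hypothetical; the real court website and scraper may look quite different. Downloading the PDF itself is omitted.

```python
from html.parser import HTMLParser


class CaseLinkParser(HTMLParser):
    """Collect PDF links and their link text from a listing page."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._href = None   # set while inside an <a> that points to a PDF
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(".pdf"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.records.append(
                {"title": "".join(self._text).strip(), "pdf_url": self._href}
            )
            self._href = None


def parse_listing(html):
    """Return (records, error); error is None when parsing succeeded."""
    parser = CaseLinkParser()
    try:
        parser.feed(html)
        parser.close()
    except Exception as exc:  # report the failure instead of stopping the run
        return [], str(exc)
    return parser.records, None


sample_html = (
    '<ul><li><a href="/docs/case1.pdf">Case 1</a></li>'
    '<li><a href="/about">About</a></li></ul>'
)
records, error = parse_listing(sample_html)
```

Returning the error alongside the records lets a daily job carry on with the next page and log what went wrong.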
Maltese POS tagger
Web service that transforms a paragraph of text into words, each followed by its linguistic (part-of-speech) mnemonic
Currently 90-95% accurate
METANET4U claims they are still improving it
Distribution of work
Everyone is assigned a task, either code or documentation-wise
Components built so as to "talk with ease"
Tools for communication
Facebook group
Mainly good old meetings
Intermediary Database
Stores data parsed from the online source, together with the PDF link
Used as temporary storage
Marks errors
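As an illustration of temporary storage that also marks errors, here is a small sketch using SQLite. The choice of SQLite, the table name scraped_case and its columns are assumptions; the slides do not say which database or schema the project uses.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) intermediary store and create its table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scraped_case (
               id INTEGER PRIMARY KEY,
               title TEXT NOT NULL,
               pdf_url TEXT NOT NULL UNIQUE,
               error TEXT              -- NULL means no error recorded
           )"""
    )
    return conn


def save_case(conn, title, pdf_url):
    """Store one scraped record; re-scraping the same URL is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO scraped_case (title, pdf_url) VALUES (?, ?)",
        (title, pdf_url),
    )
    conn.commit()


def mark_error(conn, pdf_url, message):
    """Flag a record whose PDF could not be parsed."""
    conn.execute(
        "UPDATE scraped_case SET error = ? WHERE pdf_url = ?",
        (message, pdf_url),
    )
    conn.commit()


def parsed_ratio(conn):
    """Fraction of stored records without an error mark."""
    total, failed = conn.execute(
        "SELECT COUNT(*), COUNT(error) FROM scraped_case"
    ).fetchone()
    return (total - failed) / total if total else 0.0


conn = open_store()
save_case(conn, "Case 1", "/docs/case1.pdf")
save_case(conn, "Case 2", "/docs/case2.pdf")
mark_error(conn, "/docs/case2.pdf", "unexpected layout")
ratio = parsed_ratio(conn)
```

Keeping the error message next to each record makes it possible to report how many files were parsed and to retry only the failed ones.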
PDF Parsing
Extraction of information from PDF Documents
Separation of Data into small entities
Link through relations
same_as relationship

Testing and Evaluation
Survey - 50 students
From the intermediary database, results show that more than approximately 85% of files were parsed
Unparsed files mainly consist of files written in English
Some have a substantially different structure
Ability to also parse files written in English
Complex queries through the API
Data quality
Offers a function to join two judges with very similar names
Searching for one judge then returns the results of both judges
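The judge-joining idea can be sketched as follows. This is not the project's implementation: the in-memory JudgeIndex class, the 0.85 similarity threshold and the use of difflib for name comparison are assumptions chosen for illustration, and the judge names are made up. The same_as link here mirrors the same_as relationship mentioned earlier.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.85):
    """True when two names are nearly identical (threshold is arbitrary)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


class JudgeIndex:
    """Toy index: judge name -> case ids, plus same_as links between names."""

    def __init__(self):
        self.cases = {}
        self.same_as = {}

    def add_case(self, judge, case_id):
        self.cases.setdefault(judge, []).append(case_id)
        self.same_as.setdefault(judge, set())

    def link(self, a, b):
        """Record that two name spellings refer to the same judge."""
        self.same_as.setdefault(a, set()).add(b)
        self.same_as.setdefault(b, set()).add(a)

    def search(self, judge):
        """Return case ids for the judge and every linked spelling."""
        names = {judge} | self.same_as.get(judge, set())
        return sorted(c for n in names for c in self.cases.get(n, []))


index = JudgeIndex()
index.add_case("Maria Example", "case-1")
index.add_case("Maria Exampel", "case-2")   # likely a misspelling
index.add_case("John Sample", "case-3")

if similar("Maria Example", "Maria Exampel"):
    index.link("Maria Example", "Maria Exampel")

results = index.search("Maria Example")
```

In practice a person would probably confirm each suggested link before it is stored, since two different judges can also have similar names.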
Law companies
Statistical Data
Local financial companies
Foreign companies