Loading presentation...

Present Remotely

Send the link below via email or IM

Copy

Present to your audience

Start remote presentation

  • Invited audience members will follow you as you navigate and present
  • People invited to a presentation do not need a Prezi account
  • This link expires 10 minutes after you close the presentation
  • A maximum of 30 users can follow your presentation
  • Learn more about this feature in our knowledge base article

Do you really want to delete this prezi?

Neither you, nor the coeditors you shared it with will be able to recover it again.

DeleteCancel

Make your likes visible on Facebook?

Connect your Facebook account to Prezi and let your likes appear on your timeline.
You can change this under Settings & Account at any time.

No, thanks

20131106 TG&RVB Presentation

No description
by

Rik Van Bruggen

on 8 November 2013

Comments (0)

Please log in to add your comment.

Report abuse

Transcript of 20131106 TG&RVB Presentation

Case Study Presentation:
Implementing Graph Database within Business Intelligence to map student data

Tereza Gregorova
Business Intelligence Graduate Developer

Student records and degree path
Student
Department
Programme
Modules
Faculty
Subject
Pathway
Institution
Academic
Route
Award
Research Grant
admit_into
access
takes
form
offer
under
part of
differs
teaches
has
has_grant
goes
leads_to
Scottish and English Higher Education Systems differ
Information about students stored in a Relational database
ER diagram
Simple question
Select all modules that a student with Student_ID=11002233 takes
Select STUDENT.STUDENT_ID, FIRST_NAME, SURNAME, MODULE, MODULE_CODE, MOD_NAME
From STUDENT
Join STUDENT_PROGRAME_ROUTE
On STUDENT_PROGRAME_ROUTE.STUDENT_ID=STUDENT.STUDENT_ID
Join STUDENT_MODULE_TAKING
On
STUDENT_MODULE_TAKING.STU_PROG_ROUTE_CODE=
STUDENT_PROGRAME_ROUTE.STU_PROG_ROUTE_CODE
Join MODULE
On MODULE.MODULE_CODE=STUDENT_MODULE_TAKING.MODULE_CODE
Where STUDENT.STUDENT_ID=”110002233”
Graph Database?
A database with an explicit graph structure using nodes, edges and properties to represent and store data

User works with nodes, edges (relationships) and properties in a flexible graph network
Why Neo4j
An open source graph database

Uses the structure of the graph to store, manipulate and represent the complex data

Basic operation is traversal of graph

Uses Cypher to query the graph database
Cypher
Select all modules that a student with Student_ID=11002233 takes
Simple question
Start a=node:Student(Student_ID="11002233") Match a-[r:takes]->b Return a, r, b;
Student
takes
Module
Finding answer using SQL
Finding answer using Cypher
Length of Undergraduate Honours Degree

Flexibility in the first 2 years of study

Following different pathways of study
Loading data from Relational Database to Graph Database
Complex questions
Select the academics who have achieved less than 70 points in Academic Rating and all students that they have taught and whose marks were lower than 13
Loading data from Graph Database to BI tool QlikView
Loading relationships

Connecting QlikView with Neo4j by using JDBC connector
CUSTOM CONNECT TO "Provider=JDBCConnector.dll;jdbc:neo4j://localhost:7474;";
Loading data
SQL START n=node(*)
RETURN ID(n) as NodeID, n.Student_Code? as StudentID, n.Student_Name? as StudentName, n.Module_Name? as ModuleName,
n.Academic_name? as AcademicName, n.Grant_name? as GrantName, n.Department_School_Name? as DepartmentSchoolName,
n.Programme_Name? as ProgrammeName, n.Grant_Funding_Body? as GrantFundingBody, n.Subject_Name? as SubjectName,
n.Grant_Amount? as GrantAmount, n.Pathway_Name? as PathwayName, n.Grant_Amount_Used? as GrantAmountUsed, n.Award_Name? as AwardName,
n.Grant_Used_by? as GrantUsedby, n.Route_Name? as RouteName, n.Grant_date? as GrantDate,n.Institution_Name? as InstitutionName,
n.Faculty_name? as FacultyName;
SQL START r=relationship(*)
RETURN ID(r) as RelID, r.Grade? as RelGrade, r.Semester? as RelSemester, r.Year? as RelYear,r.Property? as RelProperty, type(r) as RelLabel;
Building the graph
SQL START n=node(*)
MATCH n-[r]->m
RETURN ID(r) as RelID, ID(n) as StartNodeID, ID(m) as EndNodeID, type(r) as Label;
QlikView data model
Analysis in QlikView
Show me all students that take module "Academic Project Planning" and show me also academics who teach this module
Analysis in QlikView
Analysis in QlikView
Show me all modules that "Becky Blue" teaches, also show me names of all grants that "Becky Blue" has
Show me all students that access programme "BSc (Science Faculty)"
COURSE_TO_DPT:
SQL
SELECT
R.ROU_NAME,
R.ROU_CODE,
R.ROU_CODE as CAP_ROUC,
RE.ROE_ROUC,
RE.ROE_PWYC,
P.PSD_PWYC,
P.PSD_SUBC,
P.PSD_PERC,
P.PSD_DPTC,
D.DPT_CODE,
D.DPT_SNAM,
D.DPT_NAME,
D.DPT_TYPE,
S.SUB_CODE,
S.SUB_UDF1,
S.SUB_NAME

FROM
INS_ROU R,
INS_ROE RE,
INS_PSD P,
INS_DPT D,
INS_SUB S
WHERE

R.ROU_CODE=RE.ROE_ROUC (+)
AND RE.ROE_PWYC=P.PSD_PWYC (+)
AND P.PSD_DPTC=D.DPT_CODE(+)
AND P.PSD_SUBC = S.SUB_CODE(+);

Loading data from relational database
start n=node:Student("Student_Name:*") match n-[r:takes]->m<-[k:teaches]-o where (o.Academic_Rating<"70" and r.Grade<"13")
return n.Student_Name,r.Grade, m.Module_Name, o.Academic_name,o.Academic_Rating;

Student
Module
Academic
takes
teach
n-[r:takes]->m<-[k:teaches]-o
n.Student_Name, r.Grade, m.Module_Name, o.Academic_name,o.Academic_Rating
Properties
Show the academics who have achieved less than 70 points in Academic Rating and all students that they have taught and whose marks were lower than 13.
start n=node:Student("Student_Name:*") match n-[r:takes]->m<-[k:teaches]-o-[s:has_grant]->p where (o.Academic_Rating<"70" and r.Grade<"13") return n.Student_Name, m.Module_Name, o.Academic_name,o.Academic_Rating,p.Grant_name,p.Grant_Amount_Used,r.Grade;

Student
Department
Programme
Modules
Faculty
Subject
Pathway
Institution
Academic
Route
Award
Research
Grant
admit_into
access
takes
form
offer
under
part of
differs
teaches
has
has_grant
goes
leads_to
Cypher vs SQL
Select all modules that a student with Student_ID=11002233 takes
Cypher
Start a=node:Student(Student_ID="11002233") Match a-[r:takes]->b Return a, r, b;
Student
Module
takes
SQL
Select STUDENT.STUDENT_ID, FIRST_NAME, SURNAME, MODULE, MODULE_CODE, MOD_NAME
From STUDENT
Join STUDENT_PROGRAME_ROUTE
On STUDENT_PROGRAME_ROUTE.STUDENT_ID=STUDENT.STUDENT_ID
Join STUDENT_MODULE_TAKING
On
STUDENT_MODULE_TAKING.STU_PROG_ROUTE_CODE=STUDENT_PROGRAME_ROUTE.STU_PROG_ROUTE_CODE
Join MODULE
On MODULE.MODULE_CODE=STUDENT_MODULE_TAKING.MODULE_CODE
Where STUDENT.STUDENT_ID=”110002233”
Designed for querying complex, interconnected data

Queries for the shortest path

Pattern matching
From Relational Database to Graph Database
Data model
What was proven?
Graph database Neo4j can be implemented with QlikView

Neo4j can be used in educational institution as an extra layer for mapping relationships between data

Great potential for using graph database in BI for analyzing data
Thank you!
Tereza Gregorova
Email: tg27@st-andrews.ac.uk
Select the academics who have achieved less than 70 points in Academic Rating and all students that they have taught and whose marks were lower than 13. Also show whether these academics have any grants.
Student
Module
takes
teach
Academic
Grant
has_grant
n-[r:takes]->m<-[k:teaches]-o-[s:has_grant]->p
n.Student_Name,r.Grade, m.Module_Name, o.Academic_name,o.Academic_Rating,p.Grant_name,p.Grant_Amount_Used;
Properties
Select the academics who have achieved less than 70 points in Academic Rating and all students that they have taught and whose marks were lower than 13. Also show whether these academics have any grants.
Count the number of books that were borrowed by students whose marks were lower than 13 and who were taught by the academics who have achieved less than 70 points in Academic Rating and have some grant.
start n=node:Student("Student_Name:*")
match j<-[t:borrow]-n-[r:takes]->m<-[k:teaches]-o-[s:has_grant]->p
where o.Academic_Rating<"70" and r.Grade<"13"
return
n.Student_Name, m.Module_Name,r.Grade,count(j.Book_Name), o.Academic_name, o.Academic_Rating,p.Grant_name;
j<-[t:borrow]-n-[r:takes]->m<-[k:teaches]-o-[s:has_grant]->p
Book
Student
Module
Academic
Grant
has_grant
teaches
takes
borrow
Properties
n.Student_Name, m.Module_Name,r.Grade,count(j.Book_Name), o.Academic_name, o.Academic_Rating, p.Grant_name;
Count the number of books that were borrowed by students whose marks were lower than 13 and who were taught by the academics who have achieved less than 70 points in Academic Rating and have some grant.
Library
Book
visit
store
borrow
Data in Graph Database Neo4j
Case
Study

Tereza Gregorova
More info:
rik@neotechnology.com
@rvanbruggen


Graph Databases
NOT ONLY SQL

Trying to solve some everlasting challenges in relational database systems
Volume challenges
Complexity challenges
What are
Graphs
good for?

Graph Databases & BI
Background, Concepts & Case Study

Euler conceived Graph Theory in the 18th century

7 Bridges of Konigsberg
Graph Concepts
Abstract representation + Computational algorithms
Counting the Edges!
Deal with
Semi-Structure
Deal with Complexity
Labeled
Property
Graph
Graph Database Concepts
Graph Databases & Business Intelligence?
Help address common challenges in the input & output chains
Modeling, Storing,
Querying, Analysing,
Interacting ...
with a
Graphs can act as
Sources and/or Targets
Graphs can provide a more flexible, up-to-date query infrastructure for
certain use cases
Tereza Gregorova
Email: tg27@st-andrews.ac.uk

Rik Van Bruggen
Email: rik@neotechnology.com
Full transcript