Transcript of Untitled Prezi
This presentation gives an overview of the changes and improvements recently made to the Bio4j project, coinciding with the project's change of name.
What to do next?
BioGraphika's internet presence
The following matters should be addressed:
BioGraphika + other projects such as BG7, MG7, etc...
Some changes in the underlying structure
New Blueprints 2.0 API for traversals
Bye bye reference node!
Consequently, auxiliary relationships have been removed and replaced by indices
Around 180 new indices have been added (most of them refer to UniProt external cross-references)
Most of these indices have not yet been incorporated into the Neo4j version, since implementing them would have been quite time-consuming.
Most property names have been updated so that they can all work with Titan indices
Utility classes such as NodeRetriever have been updated and improved so that they cover most of the possible traversal needs.
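To illustrate the move away from the reference node, here is a minimal sketch in plain Python. It is not the Bio4j/BioGraphika code: `TinyGraph`, the `PROTEIN_AUX` label, the `protein_accession_index` name and the `accession` property are all hypothetical names made up for this example. It only contrasts scanning auxiliary relationships from a reference node with a direct index lookup.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical in-memory graph, used only for illustration."""

    def __init__(self):
        self.nodes = {}                      # node id -> property dict
        self.out_edges = defaultdict(list)   # node id -> [(label, target id)]
        self.index = defaultdict(dict)       # index name -> {key: node id}
        self._next_id = 0

    def add_node(self, **props):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = props
        return node_id

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))

    def index_node(self, index_name, key, node_id):
        self.index[index_name][key] = node_id


def find_via_reference_node(graph, reference_id, label, prop, value):
    """Old style: scan the auxiliary relationships leaving the reference node."""
    for edge_label, target in graph.out_edges[reference_id]:
        if edge_label == label and graph.nodes[target].get(prop) == value:
            return target
    return None


def find_via_index(graph, index_name, key):
    """New style: a single index lookup, no reference node involved."""
    return graph.index[index_name].get(key)


graph = TinyGraph()
reference = graph.add_node(kind="reference")
protein = graph.add_node(kind="protein", accession="P12345")

# Old layout: the protein hangs from the reference node.
graph.add_edge(reference, "PROTEIN_AUX", protein)
# New layout: the protein is registered in an index instead.
graph.index_node("protein_accession_index", "P12345", protein)

old_hit = find_via_reference_node(graph, reference, "PROTEIN_AUX", "accession", "P12345")
new_hit = find_via_index(graph, "protein_accession_index", "P12345")
```

Both lookups return the same node here; the index version simply does not need the extra node and relationships to exist in the graph.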
A new Blueprints 2.2.0 compliant API has been developed!
I had to manually change a few hundred classes for this...
Both node/relationship types and property names are unique now
In theory, any DB implementation of BioGraphika could use this API without any problem
The only concern is the indices, which have not yet been adapted to this layer of abstraction, at least at the importing process level
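The idea of one API shared by several DB implementations can be sketched as follows. This is not the Blueprints API, just a plain-Python illustration under assumed names: `GraphBackend`, `DictBackend`, `ListBackend` and `count_of_type` are hypothetical. Indices are deliberately left out of the interface, mirroring the concern above.

```python
from abc import ABC, abstractmethod


class GraphBackend(ABC):
    """Hypothetical minimal interface that every backend would implement."""

    @abstractmethod
    def add_vertex(self, node_type, properties):
        """Store a vertex of the given type with the given properties."""

    @abstractmethod
    def vertices_of_type(self, node_type):
        """Return the property dicts of all vertices of the given type."""


class DictBackend(GraphBackend):
    """Stores vertices grouped by node type."""

    def __init__(self):
        self._by_type = {}

    def add_vertex(self, node_type, properties):
        self._by_type.setdefault(node_type, []).append(dict(properties))

    def vertices_of_type(self, node_type):
        return list(self._by_type.get(node_type, []))


class ListBackend(GraphBackend):
    """Stores vertices in one flat list and filters on read."""

    def __init__(self):
        self._rows = []

    def add_vertex(self, node_type, properties):
        self._rows.append((node_type, dict(properties)))

    def vertices_of_type(self, node_type):
        return [props for kind, props in self._rows if kind == node_type]


def count_of_type(backend, node_type):
    """Client code: depends only on the interface, not on the backend."""
    return len(backend.vertices_of_type(node_type))


backends = [DictBackend(), ListBackend()]
for backend in backends:
    backend.add_vertex("protein", {"accession": "P12345"})
    backend.add_vertex("protein", {"accession": "Q67890"})
    backend.add_vertex("enzyme", {"id": "1.1.1.1"})

counts = [count_of_type(backend, "protein") for backend in backends]
```

The same client function works unchanged against both storage layouts, which is the property the shared API is meant to provide.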
We can now use BioGraphika/Bio4j with a Titan backend!
Well.... the underlying DB technology is actually Berkeley DB.... :)
All programs needed for the importing process have been implemented.
Utility classes such as NodeRetriever or Bio4jmanager have also been implemented for this option.
Multi-valued indexed properties can be handled much better with this technology than with Neo4j
Full-text search, though, remains an issue when using Titan as the backend
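As a rough picture of what a multi-valued indexed property means, here is a small plain-Python sketch. It does not use Titan or Neo4j; `MultiValueIndex` and the sample node ids and gene names are hypothetical. Each node contributes several values to the index, and an exact lookup on any one of them finds the node.

```python
from collections import defaultdict


class MultiValueIndex:
    """Hypothetical index in which one node can be stored under many values."""

    def __init__(self):
        self._by_value = defaultdict(set)   # value -> set of node ids

    def put(self, node_id, values):
        """Register the node under every value of its multi-valued property."""
        for value in values:
            self._by_value[value].add(node_id)

    def get(self, value):
        """Exact-match lookup; returns a copy of the matching node ids."""
        return set(self._by_value.get(value, ()))


gene_names = MultiValueIndex()
gene_names.put("protein-1", ["TP53", "P53"])
gene_names.put("protein-2", ["BRCA1", "RNF53"])

hits_for_p53 = gene_names.get("P53")
hits_for_unknown = gene_names.get("NOPE")
```

Note that this is exact matching only; it says nothing about full-text search, which is a separate problem.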
A few bugs were found in the process...
Some examples are:
There were a few extra Feature relationships created for no reason
Duplicated Submissions were created in some scenarios
Protein unpublished observation relationships were missing
Enzyme nodes didn't have their respective node type assigned
Dataset nodes were indexed by Keyword names instead....
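Bugs like the missing node type or the duplicated Submissions could be caught by simple checks run after an import. The following is only a sketch over plain dictionaries, not the checks actually used in the project; the `nodeType` and `title` property names and the sample data are assumptions made for the example.

```python
from collections import Counter


def nodes_missing_type(nodes, type_property="nodeType"):
    """Return the nodes whose node type property is absent or empty."""
    return [node for node in nodes if not node.get(type_property)]


def duplicated_values(nodes, node_type, key_property, type_property="nodeType"):
    """Return key values that occur more than once among nodes of one type."""
    counts = Counter(
        node[key_property]
        for node in nodes
        if node.get(type_property) == node_type and key_property in node
    )
    return sorted(value for value, count in counts.items() if count > 1)


# Hypothetical sample of imported nodes.
imported = [
    {"nodeType": "submission", "title": "Submission A"},
    {"nodeType": "submission", "title": "Submission A"},   # duplicate
    {"nodeType": "submission", "title": "Submission B"},
    {"id": "1.1.1.1"},                                     # enzyme without type
]

untyped = nodes_missing_type(imported)
duplicates = duplicated_values(imported, "submission", "title")
```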
New website, blog? Twitter/LinkedIn/user list... ?
Getting new domains and such logistics
BioGraphika + AWS
Releases, where? how? Bio4j overlapping, Titan?
Adapting them to the new version should be easy for the main programs. Some utility programs could be troublesome to adapt to the Titan implementation.
Will there be a "default" implementation? Titan? None?
There are other issues besides these, but I think this is enough for now to get us started
Project, repos, packages renaming...
More on projects/repos organization
New GitHub organization? Different repos for each implementation? Only one big JAR file with everything? (~75 MB)
Besides many other things I probably didn't think about....