Transcript of SafeGenome
Top 10 things you probably want to keep private ?
A human being
... well ... requires some privacy
what else ?
bank account number
IQ test result
what you look like just after waking up
Personal tax assessment .... ?
Welcome to the 21st century
deeper inside ....
... and yes, these are your muscle fibres under a microscope...
This is a single cell of your body
These days you have to look
Two nucleotides form a "base-pair"
The whole human genome is about 3,000 million bp long
A single base pair ("bp") has one of 4 values:
A (= adenine)
T (= thymine)
G (= guanine)
C (= cytosine)
At 2 bits per base pair, this equals about 715 megabytes of precious data about you
And please watch this little guy here
This is a mitochondrion. It works as
a battery for the cell. Each cell has
a couple of such batteries.
And it contains its private software
also encoded in DNA. Only 4 kilobytes
but EXTREMELY precious.
715 megabytes = one CD of software that runs your whole body.
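The 715 megabytes and 4 kilobytes figures can be checked with a little arithmetic. This is only a sketch: it assumes each base pair is packed into 2 bits (enough for 4 values), counts megabytes and kilobytes in binary units, and takes 16,569 bp as the length of human mitochondrial DNA, which is a commonly cited figure and not a number from the slides.

```python
BITS_PER_BASE = 2  # 4 possible values (A, T, G, C) fit in 2 bits


def packed_size_bytes(base_pairs: int) -> float:
    """Size in bytes when every base pair is stored in 2 bits."""
    return base_pairs * BITS_PER_BASE / 8


HUMAN_GENOME_BP = 3_000_000_000  # "3,000 million bp" from the slide
MITO_GENOME_BP = 16_569          # assumption: commonly cited mitochondrial DNA length

genome_mib = packed_size_bytes(HUMAN_GENOME_BP) / 2**20
mito_kib = packed_size_bytes(MITO_GENOME_BP) / 2**10

print(f"whole genome: {genome_mib:.0f} MiB")   # whole genome: 715 MiB
print(f"mitochondrion: {mito_kib:.1f} KiB")    # mitochondrion: 4.0 KiB
```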
It started in the middle of the 19th century ....
... in the 1970s and 1980s we were discovering the technology....
The first complete scan of the human genome took 13 years and the joint effort of 6 countries (20 scientific institutions, total cost $3 billion).
In 2008 a major breakthrough happened:
Next Generation Sequencing
This is an example of contemporary sequencing hardware
Illumina HiSeq 4000 ... pretty expensive toy, but ....
"Human Genome Project" was like first landing on the Moon ....
... it is able to do the complete human genome scan in one day
DNA technically contains a complete definition of your body...
... and every day brings some new company offering new useful information to be readable from DNA
your look now
your look 30 years later
the size of your penis (well ...)
your mental potential
all your weaknesses, future diseases
how all this info can be utilized by....
...the company where you have just applied for a job...?
...a lovely person, just met,
that you have some warm
feelings about .. ?
... an insurance company where you are trying to buy a life insurance policy ...?
... the government ... ?
... your political opponents ... ?
OK, let's try to envision the future structure of genomic services market....
DNA sequencing providers
Genome analysis providers
you give some liquid from your body
they do the sequencing
output is 715 MB of precious data
they keep your genomic data in digital form (cloud)
they allow you to be ANONYMOUS while you purchase genome analysis services
they offer analyzing genomes
intense growth of services as genomic science provides a better understanding of the structure and meaning of genomes
they never know identities of persons these genomes belong to
Current situation on the market ?
Pretty wild and
"ad hoc" so far ...
ANALYSIS SERVICES LAYER
DIGITAL GENOME BANKS LAYER
SEQUENCING SERVICES LAYER
It was in 2015 that the cost of whole-genome sequencing first dropped below $1000.
This sudden drop is what humankind had been awaiting for decades .... now the whole business around DNA has started to develop exponentially !
the technology is still too weak to read all this from your DNA but sooner or later it will be possible and once it happens .....
They all follow similar business model:
Current business model of genomic services available online
Step 1: create an online account on our server and share all personal stuff with us (so we keep your balls forever).
Step 2: send us your saliva sample
Step 3: We will scan your DNA and
present you nice analysis online
Please observe that if the privacy of your "classical" sensitive data becomes compromised .....
.... you always have some "recovery path":
Privacy failure recovery
They found your gmail password ?
Change the password
They found your address ?
They got your phone number ?
Change phone number
Your credit card is stolen ?
Block the old one and get new one from your bank
With your DNA things are more difficult because .... you can never change your DNA !
Just one mistake (like trusting the wrong company)....
All you can read from DNA
Understanding the danger
Sure, but it is 2016 now and things got slightly more complicated ...
DNA market today
Shape of the future
Scientists call reading this information
The final outcome of DNA sequencing can be easily represented as a text file
It looks like this
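As a rough illustration of such a text file, here is a minimal sketch that reads sequence text in FASTA form: a header line starting with ">" followed by lines of bases. FASTA is assumed here only as one common format. The record name and the short fragment are purely illustrative, and `parse_fasta` is a hypothetical helper, not part of any library.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect each '>' header and the sequence lines that follow it."""
    records: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# Illustrative input only; a real whole-genome file holds billions of letters.
sample = """>illustrative_fragment
ATTGACTGG
TCCACTGGTA
"""

genome = parse_fasta(sample.splitlines())
print(genome)  # {'illustrative_fragment': 'ATTGACTGGTCCACTGGTA'}
```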
Right, and there is the "no security recovery" issue....
... and your DNA may be "public" for the rest of your life.
It is worth mentioning that since the year 2000 another DNA sequencing technology has also been in use...
This is what a microarray looks like....
DNA isolated from a bio sample (saliva, blood etc) is spread over the microarray.
Each "dot" of the microarray performs a different biochemical reaction
A micro-scanner then reads the results
So whenever someone starts talking about any DNA scan, you should first ask .....
... are we talking about microarray-based testing ?
...microarray based scanning
... about a full-scan ?
...which means this
If you have a full scan in hand, making any microarray based scan means duplicating your effort, because the information is already there in the (large) text file with all these ATTGACTGGTCCACTGGTA...
If you don't have a full scan in hand and what you really need is some quite "typical" test, examples:
genetic diseases profile
haplotype test (= finding your ethnic profile)
...microarray scan may be the way to go because it is:
What really happens in every "dot" is a check of whether a specific DNA sub-sequence was actually present in the original DNA sample. The biochemical trick behind this is called "DNA hybridization".
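The idea of one "dot" answering one presence question can be imitated in software. The sketch below is a strong simplification and rests on these assumptions: each dot carries a single short probe, a probe "binds" only on an exact match, and the sample is given as the text of one strand, so the probe is tested against that strand and against its reverse complement. Real hybridization also tolerates partial matches, which is ignored here. The dot names and probes are made up for illustration.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(sample: str, probe: str) -> bool:
    """True if the probe text occurs on either strand of the sample (exact match)."""
    return probe in sample or reverse_complement(probe) in sample


def scan_microarray(sample: str, probes: dict) -> dict:
    """One yes/no answer per dot, as a microarray scanner would report."""
    return {dot: hybridizes(sample, probe) for dot, probe in probes.items()}


sample = "ATTGACTGGTCCACTGGTA"   # illustrative fragment
probes = {                        # hypothetical dots and probes
    "dot_1": "GACTGG",            # present on the given strand
    "dot_2": "CCAGTG",            # present on the opposite strand
    "dot_3": "AAAAAA",            # absent
}

print(scan_microarray(sample, probes))
# {'dot_1': True, 'dot_2': True, 'dot_3': False}
```

With a full scan in hand, the same questions can be answered from the text file alone, without running a microarray.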
Let's briefly review some things we currently do by analyzing DNA...
Genetic ancestry testing (which is finding your ethnicity in a purely quantitative way)
DNA identification (= fingerprinting)
This is especially useful in the context of criminal investigations ("Who is the killer ?")
Identification of genetic mutations that cause known diseases
(for example finding that you have the mutation causing cystic fibrosis)
Identification of gene variants that may influence your future health condition (like finding that you are breast-cancer prone).
It is only a question of having proper software able to extract this information from the full scan. However this may be actually quite non-trivial.
Java and Scala developer (12 years of professional experience), Akka enthusiast
Scala academic teacher for 5 years
has already launched one successful startup (remote access platform) - eDesk (a solution very similar to TeamViewer)
worked as a contractor for BioDiscovery - a California based genomic company
passionate blogger: http://scalaakka.blogspot.com/
Wojciech Klaudiusz Zaborowski
Java, Scala and Smalltalk developer (21 years of professional experience), graph databases and cryptography expert
works as Scala contractor in London
OK, so now that we understand the shape of the future... why not make the future happen right now ?
Someone has to !
The mission we want to accomplish is quite easy to define...
We just want to create the first commercial digital genome bank (!)
This mission involves providing (quite technical) answers to the following 4 questions....
Q1: DNA storage
How to organize genome storage to make it:
Sequencing services layer
Digital genome banks layer
Analysis services layer
Q3: Bank user interface
How to build a user interface to:
bring value, even if the user is not purchasing external DNA analysis ?
make purchasing DNA analysis as easy as buying stuff on Amazon ?
inspire people, so the idea of "DNA hacking" becomes widely recognized as "cool and trendy" ?
Q2: Analysis layer interoperability
Q4: Sequencing layer interoperability
How should integrated sequencing be organized (so the customer is not forced to purchase an independent sequencing service) ?
How should DNA data import be supported (if the customer actually wants to bring already existing scan results to the bank) ?
How to set up an open bank-analyzer connection protocol so that:
it will become a recognized and accepted de facto standard
it will allow easy and safe "shopping" around DNA (= purchasing DNA analysis) without compromising DNA privacy ?
DNA analysis providers will become interested in supporting the standard
there will be no risk of company lock-in, so the end user will always be able to change the genome bank they are using
the architecture will promote a fair competition on analysis and banks markets
Yes, just bring this layer to actual existence !
So it is NOT about creating just another cool product.
It is much more about creating a new product
that others will follow.
that will introduce and enforce an ultimate separation of concerns among DNA services market.
This separation is key to the market growth.
Phase 1: proof of concept
Phase 2: deployment of the real service
Phase 3: managed growth
Phase 1 tasks list
Formally specify Analysis Layer Interoperation Protocol (ALIP)
anonymous sharing mode
authorized sharing mode
Implement bank backend as Scala+Akka+Cassandra app (cloud, possibly Amazon)
Implement frontend (webapp)
Implement external analysis sample service (free, for ALIP practical tests)
Heavy beta testing
(based on public human genomes available in NCBI and Ensembl)
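ALIP is not specified yet, so the following is only a guess at how the anonymous sharing mode could behave, not the protocol itself. The sketch assumes that the bank replaces the customer identity with a random one-time order token, keeps the token-to-account mapping to itself, and passes only the token and the genome data to the analyzer. All class, function and field names are hypothetical, and the analyzer is a toy that just computes the share of G and C bases.

```python
import secrets
from typing import Callable, Dict, List

# Hypothetical analyzer interface: (order_token, genome) -> report
Analyzer = Callable[[str, str], dict]


class GenomeBank:
    def __init__(self) -> None:
        self._genomes: Dict[str, str] = {}  # account id -> genome text
        self._orders: Dict[str, str] = {}   # order token -> account id (stays in the bank)

    def store(self, account_id: str, genome: str) -> None:
        self._genomes[account_id] = genome

    def purchase_analysis(self, account_id: str, analyzer: Analyzer) -> dict:
        token = secrets.token_hex(16)       # random, carries no account information
        self._orders[token] = account_id
        # The analyzer receives the token and the genome, never the account id.
        return analyzer(token, self._genomes[account_id])


seen_by_analyzer: List[str] = []            # records what the analyzer was given


def gc_content_analyzer(order_token: str, genome: str) -> dict:
    seen_by_analyzer.append(order_token)
    gc = sum(base in "GC" for base in genome) / len(genome)
    return {"order": order_token, "gc_content": gc}


bank = GenomeBank()
bank.store("alice@example.com", "ATTGACTGGTCCACTGGTA")
report = bank.purchase_analysis("alice@example.com", gc_content_analyzer)
print(round(report["gc_content"], 3))
```

In this sketch the analyzer still sees the genome data itself; only the link between the genome and the account is withheld.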
Proof-of-concept main app features
Safe genome storage (strong encryption)
Multiple genomes stored in a single account
Genome storage optimization based on storing a genome as a set of differences
ALIP operable in anonymous sharing mode
Genome upload and download in most popular formats (FASTA, FASTQ)
Built-in genome browser
Built-in basic genome analysis
based on open-source libraries
selection of algorithms - to be defined (!)
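The "set of differences" storage idea from the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the differences are taken against some shared reference sequence, both sequences have the same length, and only single-letter substitutions are handled (insertions and deletions would need a richer format). The function names and the short sequences are hypothetical.

```python
from typing import Dict


def diff_against_reference(reference: str, genome: str) -> Dict[int, str]:
    """Keep only the positions where the genome differs from the reference."""
    if len(reference) != len(genome):
        raise ValueError("this sketch handles substitutions only")
    return {i: g for i, (r, g) in enumerate(zip(reference, genome)) if r != g}


def restore(reference: str, diffs: Dict[int, str]) -> str:
    """Rebuild the full genome from the reference plus the stored differences."""
    bases = list(reference)
    for position, base in diffs.items():
        bases[position] = base
    return "".join(bases)


reference = "ATTGACTGGTCCACTGGTA"   # illustrative reference fragment
genome = "ATTCACTGGTTCACTGGTA"      # same fragment with two substitutions

diffs = diff_against_reference(reference, genome)
print(diffs)                                  # {3: 'C', 10: 'T'}
print(restore(reference, diffs) == genome)    # True
```

The saving comes from the fact that only the differing positions need to be stored per genome, while the reference is stored once.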
POC as seen by the end user
Can create and delete account
Can do (fake) Bitcoin/Paypal payment
Can upload genomes to his account
Can browse uploaded genomes
Can view details of selected genome (genome browser)
Can run integrated analysis on selected genome (to be defined)
Can simulate purchase of external DNA analysis using ALIP
Phase 2 tasks list
Establish a limited company in Switzerland
Deploy app on selected Swiss based data center
Integrate with real payment services
credit card payment
Establish Terms of Service (formal document, must comply with existing law)
Establish service pricing policy
Integrate on the organizational level with at least one sequencing provider, so we can offer bank+sequencing as integrated service
Official launch / marketing campaign
Phase 3 - directions
Stimulate the DNA analysis market by establishing business relations with companies that offer DNA analysis services and encouraging them to support the ALIP standard
Marketing around genome services "shopping" idea using anonymous vs authorized ALIP mode
Offer ancestry discovery (calculation of a haplogroup) as a free feature for members
Offer discovery of typical genetic mutations as a free feature for members