Research Storage at UNSW for Questnet - working

by

Luc Betbeder

on 30 April 2014

Transcript of Research Storage at UNSW for Questnet - working

Building a long-term
Research Data Storage
Service for UNSW.

Luc Betbeder
Current State
Strategic Investments
Building the Service
speaking for the whole team...
Large amounts of different kinds of research data are being stored
current state
The grey person icons represent different kinds of research activity taking place on campus.
Digital Microscopy using Aperio ScanScope.

Creates super high resolution image files.
Slides service.
Average file size: 3 GB
Directory size: 3.5 TB
and many fit-for-purpose local specialist systems are being used for analysis and computation...
Computer cluster: Leonardi - 3000 nodes
e.g. models of diesel spray and combustion in engines

Storage capacity: no long-term storage...

1 x RAID5 (3 x 2TB) + 1 HS (2TB) :
/home (3TB) + /share/apps (750GB)

1 x RAID50 (5 x 5 x 2TB) + 1 HS (2TB) :
/share/scratch (37TB)
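
The usable sizes above are roughly what standard RAID 5 arithmetic predicts (one disk's worth of parity per group). A minimal sketch, assuming "5 x 5 x 2TB" means five RAID 5 groups of five 2 TB disks striped together; the function names are illustrative only, and the exact reported sizes will also depend on unit conventions and file-system overhead that the slide does not state:

```python
def raid5_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable capacity of one RAID 5 group: one disk's worth goes to parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_tb


def raid50_usable_tb(groups: int, disks_per_group: int, disk_tb: float) -> float:
    """RAID 50 stripes across several RAID 5 groups; parity is paid per group."""
    return groups * raid5_usable_tb(disks_per_group, disk_tb)


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


home_array_tb = raid5_usable_tb(3, 2.0)          # array behind /home and /share/apps
scratch_array_tb = raid50_usable_tb(5, 5, 2.0)   # array behind /share/scratch

print(f"RAID 5 array:  {home_array_tb:.1f} TB usable")
print(f"RAID 50 array: {scratch_array_tb:.1f} TB usable "
      f"({tb_to_tib(scratch_array_tb):.1f} TiB)")
```

The hot spare (HS) in each array adds no capacity, so it is left out of the calculation.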

UNSW IT hosted and Faculty IT moderated
Disk: NetApp hardware used
Model and service work well
Cost is a barrier for smaller groups: $1K/TB

But is being used:
Sci - 380 TB
Med - 116 TB
Eng - 58 TB
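
To give a sense of scale, the faculty figures above can be totalled. A small sketch; applying $1K/TB as a flat rate to the total is an assumption made purely for illustration, since the slide does not say whether the charge is one-off or recurring:

```python
# Usage figures as listed on the slide, in TB.
usage_tb = {"Sci": 380, "Med": 116, "Eng": 58}

# Assumption: a flat rate of $1K per TB, as quoted on the slide.
RATE_PER_TB = 1_000

total_tb = sum(usage_tb.values())
shares = {faculty: round(100 * tb / total_tb, 1) for faculty, tb in usage_tb.items()}

print(f"Total in use: {total_tb} TB")
for faculty, pct in shares.items():
    print(f"  {faculty}: {usage_tb[faculty]} TB ({pct}%)")
print(f"Illustrative cost at the flat rate: ${total_tb * RATE_PER_TB:,}")
```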
Existing Research Storage Service
Researchers are forced to delete files and/or store them on external USB hard drives.
Strategic Investments
USB drives in a cupboard in Mechanical Engineering
From IT: Vishal Sehgal, Amany Nuseibeh, Chris Will,
Sergey Sashin, Seri Charoensri, Berhard Semtner,
Dusan Munizaba, Jim Leeper, Denise Black... (and yes comms too... Greg Sawyer and team). And the key business stakeholders: Barbara Chmielewski, Prof Mark Hoffman, Greg Leslie, Grainne Moran, Maude Frances and all the wonderful participants in the Advisory Groups and Pilot sites. Thank you.
2012
Strategic Investment Planning (3 yr plan)

> process:
Align IT investment to UNSW "Business Domains"
(i.e. Research, Academic, Other)

> method:
In-domain prioritising > estimating > voting

> outcome:
Multi-year / multi-stream investment for research storage (to support "Research Practice", meet policy obligations, reduce risk and provide "an excellent research environment, with cutting-edge facilities and equipment").
Long-Term Storage
A long-term "accessible archive" with metadata capability to support research practice at UNSW.

> No direct charge to researchers or project.

> Principles: large, functional, cheap, extensible, supportable, aligned, secure.

NOT a store for computation or analysis.

Interface / Portal
Devices and Bookings
Enhanced Metadata
Active Storage
A portal for accessing and using the UNSW long-term store and other data storage services (e.g. RDSI node(s), vendor/cloud).

Linked to other UNSW systems (library, research projects, data warehouse, authentication etc.)

It is policy-driven. You create a data plan to use the store.


We would like to link our research device booking system to the portal.

And eventually to provide tools to enable direct ingestion of research device data, device metadata and booking system metadata into store.
Over time we want to provide the Store with additional metadata, integration, search and collaboration functionality.

This may include federated search (searching across stores) and improving the ingest metadata capabilities of the store.
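
Federated search, as mentioned above, can be pictured as sending one query to several stores and merging the hits. A minimal sketch only: the store names, the record fields and the in-memory "stores" are all hypothetical stand-ins, not the interface of the UNSW store or any RDSI node:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]
SearchFn = Callable[[str], List[Record]]


def make_store(records: List[Record]) -> SearchFn:
    """Build a toy in-memory store that matches on the record title."""
    def search(query: str) -> List[Record]:
        q = query.lower()
        return [r for r in records if q in r["title"].lower()]
    return search


def federated_search(query: str, stores: Dict[str, SearchFn]) -> List[Record]:
    """Query every store in parallel and tag each hit with its origin."""
    def run(item):
        name, search = item
        return [dict(hit, store=name) for hit in search(query)]

    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(run, stores.items()))  # keeps store order
    return [hit for batch in batches for hit in batch]


# Hypothetical stores and records, for illustration only.
stores = {
    "unsw-long-term": make_store([
        {"title": "Wetlands survey 2012"},
        {"title": "Diesel spray model"},
    ]),
    "rdsi-node": make_store([
        {"title": "Murray wetlands imagery"},
    ]),
}

for hit in federated_search("wetlands", stores):
    print(hit["store"], "->", hit["title"])
```

Tagging each hit with its store keeps the origin visible after the results are merged.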
A self-service capability for researchers and research projects to access "active storage" (for compute) via the portal.

Active storage costs (unlike long-term storage) would be charged back to the projects.
2013
Starting the work.

> strategy and architecture:
Roadmap / vision document created and aligned with other streams of work taking place at UNSW and elsewhere (primarily Library and RDSI/Intersect).

> governance:
Boards and Business Advisory Groups established.



Installation and configuration
Testing: July-Aug
Piloting: Sept-Dec

Long-Term Storage
Interface / Portal
Devices (Bookings)
Enhanced Metadata
Active Storage
Analysis
Scope
Architecture and Solution Design
Build: Aug-Dec
Release: 2014

Analysis
Scope
Architecture and Solution Design

Long-Term Storage
Interface / Portal
Devices (Bookings)
Enhanced Metadata
Active Storage
2014-2015
Run the service.
Expand capacity / capability.

> space allocation
Mix of existing targeted high-risk and high-value projects (100) and all new projects (1000).

> support
Mix of project-supported on-boarding for the messy existing projects and self-service data-plan-driven-through-the-portal.



Automation/Orchestration.
Smarter Data
Smart Devices
Interface
MetaData
Store
> system tests then pilots

> ingest via Web / Script / NFS

> leveraging local IT support
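
The NFS ingest route listed above can be pictured as a verified copy onto a mounted path, with the metadata written alongside the file. A minimal sketch under stated assumptions: it uses only plain file operations (not the LiveArc or Mediaflux API), and the `.meta.json` sidecar layout and the metadata fields are invented for illustration:

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source: Path, store_dir: Path, metadata: dict) -> Path:
    """Copy a file into the store directory, verify it, and write a sidecar."""
    store_dir.mkdir(parents=True, exist_ok=True)
    target = store_dir / source.name
    shutil.copy2(source, target)

    checksum = sha256_of(source)
    if sha256_of(target) != checksum:
        target.unlink()
        raise OSError(f"checksum mismatch for {source}")

    # Hypothetical sidecar format: user metadata plus checksum and size.
    sidecar = target.with_name(target.name + ".meta.json")
    record = {**metadata, "sha256": checksum, "size_bytes": target.stat().st_size}
    sidecar.write_text(json.dumps(record, indent=2))
    return target


# Demonstration in a temporary directory standing in for the NFS mount.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "slide_scan.dat"
    sample.write_bytes(b"example data")
    stored = ingest(sample, Path(tmp) / "store", {"project": "demo-project"})
    print("stored at", stored.name)
```

Verifying the checksum after the copy is a design choice for the sketch: it catches a truncated or corrupted transfer before the source file is deleted.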

UNSW Storage Service
Local System and Store
Local Support
2013 Pilot Phase
is being built...
NFS (Test)
2012...
> procurement process.
2014 Service Established
UNSW Long Term Research Data Storage Service
LiveArc Web
LiveArc Script
> on-boarding of high-risk and high-value projects.

> new projects fill in a data plan and use the interface

> allocations and provisioning behind-the-scenes

> links to RDSI Node (Intersect)

> support functions established

MetaData
Store
UNSW Storage Service
Local System and Store
NFS (Test)
LiveArc Web
LiveArc Script
Local Support
Second
Copy
eResearch
Support
Leverage existing Systems and Services at UNSW
Library
Support
IT Support
Data Warehouse
IT Systems
Authentication, Monitoring, Backup, Security, Networks....
Data Feed from INFOed (Research Management System), HR, Student, Org data...
Interface
Library Systems
ROS
ResData
Create Research Data Plan
Request UNSW and RDSI Storage
Automagic Provisioning
Manual Provisioning
RDSI Node
Vendor Cloud
Allocation Model and Process not visible to end-user.
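
The flow above (create a data plan, request storage, then automatic or manual provisioning behind the scenes) could be modelled roughly as follows. This is a sketch only: the `DataPlan` fields, the 5 TB threshold and the routing rule are all hypothetical, since the slides do not say what decides between the two provisioning paths:

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    UNSW_STORE = "UNSW long-term store"
    RDSI_NODE = "RDSI node"
    VENDOR_CLOUD = "vendor cloud"


@dataclass(frozen=True)
class DataPlan:
    """Hypothetical minimal data plan; real plans would carry far more detail."""
    project_id: str
    requested_tb: float
    target: Target


# Assumption for illustration: small requests to the local store are automatic.
AUTO_LIMIT_TB = 5.0


def provisioning_route(plan: DataPlan) -> str:
    """Return 'automatic' or 'manual' under the invented rule described above."""
    if plan.requested_tb <= 0:
        raise ValueError("requested_tb must be positive")
    if plan.target is Target.UNSW_STORE and plan.requested_tb <= AUTO_LIMIT_TB:
        return "automatic"
    return "manual"


plans = [
    DataPlan("proj-001", 2.0, Target.UNSW_STORE),
    DataPlan("proj-002", 40.0, Target.UNSW_STORE),
    DataPlan("proj-003", 1.0, Target.RDSI_NODE),
]
for plan in plans:
    print(plan.project_id, plan.target.value, "->", provisioning_route(plan))
```

The researcher only fills in the plan; the routing decision stays out of sight, in line with the note that the allocation model is not visible to the end user.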
How big is our data problem?
Evaluation process based on these design principles.
> Very strong vendor responses.
Object-Based Storage Devices (OSD)
Hierarchical Storage Management (HSM)
Expanding current Storage System
Cloud
TAPE:
Oracle SL3000 with 700 Slots / 4 Drives
DISK:
SGI IS5500 storage array.
0.5 PB
1.0 PB
HSM... Disk and Tape
via SGI
SGI LiveArc = Arcitecta MediaFlux
Ingest with MetaData
Search
Version
Script / API
Web Client (Java)
+Growing (to 3PB)
+Protecting (second copy)
+Staging (dev, UAT)
> Grow the store (3PB)

> Second Copy options (3PB)

> Pre-Prod environments

2013 Extend Phase
UNSW Storage Service - Questions for 2013
Primary Store
DEV
UAT
PROD
Environments
Second Copy
On Prem?
Cloudy?
3PB?
Performance?
2 or 3 Env?

VMs good enough?
or
Reporting for Governance.
Support:
On-boarding.
Improving use of Meta-data tools.
Scripting support.
Using RDSI or other stores
Challenges and Next Steps:

Linking to UNSW Storage at Intersect
Accessing RDSI storage at Intersect

Work on Automation and Connectivity
including Authentication.

Putting public Vendor cloud or UNSW private cloud options into the Interface. "DIY-user-pays-compute"
luc@unsw.edu.au
Four Pilot Groups
Mark Wainwright Analytical Centre (Research Division)

Climate Change Research Centre (SCI)

Australian Wetlands & Rivers Centre (SCI)

Leonardi HPC (Mechanical Engineering)
Pilot Groups
via Interface
Search
Manage Access / Groups
Tools
Training
Is the timing and price right to use a vendor cloud as the "Second-Copy" for the store?


Building the Service
http://www.arcitecta.com/Products/Desktop
experiments...