In Support of Large Scale Scientific Research

Talk on ESnet's support of research within the DOE Lab complex given at the Quilt's Member Meeting
by

Steve Cotter

on 26 January 2014

Transcript of In Support of Large Scale Scientific Research

In Support of Large Scale Scientific Research
Steve Cotter, Dept Head
Lawrence Berkeley National Lab

Quilt Members Meeting
Portland, OR
Sept 21, 2011
Science is changing rapidly
Science is becoming more distributed
Collaborations are global
Scientific instruments are larger and more expensive, so fewer are being built

Science is becoming more data intensive
Materials science
Biomedicine
Genomics
High Energy Physics

Data transfer and data sharing have become critical to scientific collaborations – in fact, scientific productivity is often determined by the ability to transfer, stream, and share data
ESnet developed OSCARS (On-Demand Secure Circuits and Advance Reservation System)
Elephant flows
&
Mouse flows

Optical Transport Network
Provide 10/40/100G point-to-point wavelength services for production network & non-disruptive testbed purposes
100G Routed/Virtual Circuit Network
Supports 40 DOE labs/sites, with over 140 peerings with other networks (R&E and commercial)
Fully instrumented with perfSONAR network performance and monitoring software
ESnet-developed layer 2/3 on-demand or schedulable virtual circuit service for guaranteed network performance (latency/throughput) or traffic engineering/failover purposes (a minimal reservation sketch follows this slide)
Dark Fiber Network
13,000+ miles of long-haul & metro fiber available for the Advanced Networking Initiative (ANI) Testbed & disruptive research
Can be used by commercial entities as well as R&E community
ANI build connects four primary sites (three supercomputing centers and an international exchange point)
Build tested and complete before the end of December 2011 (hopefully by the Supercomputing Conference in mid-November)

Entire optical transport backbone build complete by July 2012

Transition production network services to entire 100G footprint before November 2012
ESnet5
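
To make the OSCARS virtual circuit service described above concrete, here is a minimal Python sketch of what a bandwidth reservation request could look like. The OscarsClient class, endpoint URL, site identifiers, and field names are illustrative assumptions, not the actual OSCARS API; they only show the shape of such a request (endpoints, guaranteed bandwidth, time window).

    # Minimal sketch of an OSCARS-style virtual circuit reservation.
    # The client class, URL, site names, and field names are hypothetical;
    # they illustrate the kind of request a schedulable layer 2/3 circuit
    # service needs, not the real OSCARS interface.
    from datetime import datetime, timedelta

    class OscarsClient:
        """Stand-in for a circuit reservation client (illustrative only)."""

        def __init__(self, url):
            self.url = url

        def reserve_circuit(self, src, dst, bandwidth_mbps, start, end, description=""):
            # A real client would authenticate and submit this to the
            # reservation service; here we just build the request payload.
            return {
                "source": src,
                "destination": dst,
                "bandwidthMbps": bandwidth_mbps,
                "startTime": start.isoformat(),
                "endTime": end.isoformat(),
                "description": description,
            }

    if __name__ == "__main__":
        client = OscarsClient("https://oscars.example.net/api")  # hypothetical endpoint
        start = datetime(2011, 11, 14, 8, 0)
        print(client.reserve_circuit(
            src="anl-diskpt1",        # hypothetical site identifiers
            dst="nersc-diskpt1",
            bandwidth_mbps=9000,      # guaranteed bandwidth for an elephant flow
            start=start,
            end=start + timedelta(hours=6),
            description="Bulk data transfer ahead of an SC11 demo",
        ))
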
Serving original intended purpose
Provide the structure for a meeting of the minds between ESnet and science programs
Inform ESnet of science programs’ current and future needs
Several additional benefits
Used as a basis for understanding DOE science program needs (e.g. research proposals, strategic planning)
Platform for building relationships with science constituencies
Adopted by other national science facilities after the model proved successful
Network Requirements Workshops
Major changes coming to the field in the next couple of years
Cost of sequencing to plummet
Data volumes to increase by more than 10x
Scientific discovery will be gated on data analysis, not sequencing throughput
Need for data transfer automation, multi-site analysis pipelines, cloud access
Performance is well below what is needed

This is not just the DOE genomics program – NIH, National Cancer Institute, others wrestling with same issues
ESnet invited to sit on workshop planning call with NIH, NCI, others
300 GB per genome, 20,000 genomes per study (a back-of-the-envelope calculation follows this slide)

We are in the early part of an exponential curve
Experimental Science - Genomics
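
As a back-of-the-envelope illustration of the scale on this slide (using the 300 GB per genome and 20,000 genomes per study figures above, with decimal units):

    # Rough scale for the genomics numbers on this slide.
    genome_size_gb = 300          # 300 GB per genome (from the slide)
    genomes_per_study = 20_000    # 20k genomes per study (from the slide)

    study_pb = genome_size_gb * genomes_per_study / 1e6   # GB -> PB, decimal units
    print(f"One study today: ~{study_pb:.0f} PB")         # ~6 PB

    # The slide projects data volumes growing by more than 10x.
    print(f"After a 10x increase: ~{study_pb * 10:.0f} PB per study")
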
Major expansion in data output from CCD-based instruments
Data volume increase of 100x in the near future for some collaborations
Automation provides significant sample throughput increase
Significant changes to science process
Scientists have historically carried data home on portable media
Network-based data transfer will be necessary for the first time in many scientists’ careers
Multiple facilities, at least hundreds of scientists
Multi-site workflows needed (e.g. stream data to a supercomputer center for near-real-time analysis to vet the experiment configuration, transfer bulk data to a supercomputer center for post-experiment analysis; a minimal workflow sketch follows this slide)
Experimental Science - Light Sources
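
A minimal sketch of the two-stage workflow described above, as it might be scripted at a light source site: push a small preview of each run for near-real-time vetting, then move the full dataset afterward. The hostnames, paths, and the copy_to helper are placeholders; a production setup would use a high-performance transfer tool running on a Data Transfer Node rather than a local copy.

    # Sketch of a two-stage light source workflow (illustrative placeholders only).
    import shutil
    from pathlib import Path

    def copy_to(src: Path, dst: Path) -> None:
        """Placeholder for a real high-performance transfer from a DTN."""
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)

    def near_real_time_feedback(frames, scratch: Path) -> None:
        # Stage 1: push a small sample of frames as they arrive so analysts
        # can vet the experiment configuration while beam time remains.
        for frame in frames[:10]:
            copy_to(frame, scratch / "preview" / frame.name)

    def post_experiment_bulk_transfer(frames, archive: Path) -> None:
        # Stage 2: move the full dataset for post-experiment analysis.
        for frame in frames:
            copy_to(frame, archive / frame.name)

    if __name__ == "__main__":
        frames = sorted(Path("/data/beamline/run42").glob("*.h5"))   # hypothetical path
        near_real_time_feedback(frames, Path("/hpc/scratch/run42"))  # hypothetical mount
        post_experiment_bulk_transfer(frames, Path("/hpc/archive/run42"))
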
Worldwide data distribution framework
Sites in US, UK, Germany, Japan, Australia, others
2 PB to 10 PB to replicate for IPCC AR5 analysis (a transfer-time estimate follows this slide)
Many end sites having problems
Wide area networks are in good shape
Site infrastructure lacking in some places, resulting in low end-to-end data transfer performance
ESnet playing a significant role in solving problems
Helping to debug multi-continent performance issues
Influencing sites and other networks through leadership in test and measurement, performance architecture best practice
Taking on a proactive role, enabled by workshop case studies and contacts
Simulation Science - Climate
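
To put the replication volume above in perspective, a small worked example using the upper end of the 2-10 PB range and assuming ideal, uninterrupted throughput with no protocol overhead:

    # Time to replicate an AR5-scale dataset at various sustained rates.
    dataset_pb = 10                      # upper end of the 2-10 PB range on the slide
    bits = dataset_pb * 1e15 * 8         # decimal petabytes -> bits

    for gbps in (1, 10, 100):
        days = bits / (gbps * 1e9) / 86_400
        print(f"{dataset_pb} PB at {gbps} Gbps: ~{days:.0f} days")
    # ~926 days at 1 Gbps, ~93 days at 10 Gbps, ~9 days at 100 Gbps
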
Significant increase in data volume in the coming years
Data intensity increasing across DOE science portfolio
New facilities, facility upgrades (NSLS-II, LSST, LHC upgrades, Exascale, etc)
Higher capacity infrastructure needed
100G infrastructure will allow cost-effective expansion of network capabilities
Rich service set of ESnet4 still needed and should be expanded
One possible future:
Data-intensive science promises rapid scientific progress, increased rate of innovation, shorter time to delivery of solutions to society
This future is not guaranteed – end site and system capabilities must be matched to network capabilities to use the network effectively
Looking at the Future
Commonalities exist between different classes of scientific enterprise
Pain points are similar, scale is different
Data transfer and multi-site workflows are common themes
Data scale ranges from hundreds of gigabytes to tens of petabytes

Data transfer and distribution are common pain points
Many collaborations that cannot effectively transfer data identify this as a problem
Collaborations that can effectively move data consider the capability to be oxygen
However, there is a threshold of sophistication below which data transfer is hard
Outreach effort needed
Looking at the Current State
There is significant potential for improving the productivity of a large number of collaborations by enabling data mobility
Assistance is necessary, as most collaborations are unable to easily bootstrap the necessary skills and tools
Common themes make that assistance less difficult than it might otherwise be
The tools and technologies exist – the problem is one of “how”

Greater community efforts need to get underway to help disseminate best practices for network architecture, performance tuning, etc. (a buffer-sizing example follows below)
www.fasterdata.es.net
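
One concrete instance of the tuning guidance referenced above is sizing TCP buffers to the bandwidth-delay product (BDP) of a path: a host whose buffer limits are far below the BDP cannot fill a long, fast path no matter how clean the wide-area network is. The rates and round-trip times below are illustrative assumptions, not specific fasterdata.es.net recommendations.

    # Bandwidth-delay product: the data a single TCP stream must keep "in
    # flight" (and therefore buffer) to fill a path of a given rate and RTT.
    def bdp_megabytes(rate_gbps: float, rtt_ms: float) -> float:
        return rate_gbps * 1e9 / 8 * (rtt_ms / 1e3) / 1e6

    # Illustrative paths (rates and RTTs are assumptions, not measurements).
    paths = [("regional", 1, 20), ("cross-country", 10, 80), ("intercontinental", 100, 150)]
    for label, rate, rtt in paths:
        print(f"{label}: {rate} Gbps x {rtt} ms RTT -> ~{bdp_megabytes(rate, rtt):.1f} MB of buffer")
    # regional ~2.5 MB, cross-country ~100 MB, intercontinental ~1875 MB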

ESnet is taking a holistic view of the end-to-end problem
Cutting-edge network
Science DMZ
Guaranteed performance
Outreach
Full transparency
Going Forward
Dedicated infrastructure for science applications
LAN devices are not part of the troubleshooting mix
Dedicated devices are easier to configure properly and maintain
Data Transfer Node for high-performance data movement (a transfer sketch follows this slide)
Test and measurement deployed on same network infrastructure as science resources

Note also that networks without a monolithic firewall need not add one just because it appears in the Science DMZ picture – it is still very beneficial to move large-scale data transfers close to the site perimeter

Note well – this is not a research project! It is based on hard-won experience
Science DMZ
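
As a sketch of what the Data Transfer Node mentioned above does in practice, the snippet below drives a GridFTP-style bulk transfer client (globus-url-copy) with parallel streams from Python. The hostnames, paths, and stream count are hypothetical, and the flags shown should be checked against the documentation of the client actually installed on the DTN.

    # Illustrative driver for a bulk transfer between two DTNs.
    # Hostnames, paths, and parameter values are placeholders; verify the
    # client's options locally before relying on them.
    import subprocess

    def bulk_transfer(src_url: str, dst_url: str, streams: int = 4) -> int:
        cmd = [
            "globus-url-copy",
            "-p", str(streams),   # parallel TCP streams
            "-fast",              # reuse data channels between transfers
            src_url,
            dst_url,
        ]
        return subprocess.call(cmd)

    if __name__ == "__main__":
        rc = bulk_transfer(
            "gsiftp://dtn1.site-a.example.org/data/run42/file0001.h5",
            "gsiftp://dtn1.site-b.example.org/archive/run42/file0001.h5",
            streams=8,
        )
        print("transfer exit code:", rc)
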
Goal: Enable science to achieve the benefits of network services

Consult on several key elements, based on experience and past successes
Dedicated infrastructure for data transfer
Network architecture that supports high performance data movement
Deployment of interoperable test and measurement hosts (perfSONAR)

Multiple approaches:
Collection and publication of best practice (e.g. http://fasterdata.es.net/)
Site visits
Conference tutorials and demonstrations (e.g. ANI at SC11)
Community building (climate, biology)
Consultation with individual sites and collaborations
ESnet Outreach Efforts
Goals:
Instrument the 100G ANI for real-time power measurement
Power Distribution Units, temperature/humidity sensors
Build tools to collect and visualize live network energy consumption (a small aggregation sketch follows this slide)
Flexible metadata to create customized views
Power consumed per path, per POP, per layer
Create open datasets for network energy-efficiency research
IEEE’s EEE, IETF’s eMon, GreenTouch etc.
Juniper, Broadcom, Bell Labs, Level 3, BBN and others.
Catalyze adoption of theoretical research/experiments by industry
Energy proportionality will require redesign of network equipment
Establish metrics based on quantified improvements against baseline

Jointly sponsored all-day workshop with GreenTouch at SC11 (Sat 11/12)
Network and data center efficiency
World's First Power Baseline for 100G Network
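
A minimal sketch of the collection side of such a tool: rolling one polling cycle of power readings up per point of presence and per network layer. The hub names, device names, and wattages below are made up for illustration; a real collector would poll PDUs and environmental sensors rather than use a hard-coded list.

    # Aggregate live power readings per POP and per network layer (sketch).
    from collections import defaultdict

    # (pop, layer, device, watts) - a stand-in for one polling cycle of PDU data.
    readings = [
        ("CHIC", "optical", "transponder-1", 740.0),
        ("CHIC", "router",  "core-rtr-1",   3100.0),
        ("SUNN", "optical", "transponder-1", 725.0),
        ("SUNN", "router",  "core-rtr-1",   2950.0),
    ]

    per_pop = defaultdict(float)
    per_layer = defaultdict(float)
    for pop, layer, _device, watts in readings:
        per_pop[pop] += watts
        per_layer[layer] += watts

    for pop, watts in sorted(per_pop.items()):
        print(f"POP {pop}: {watts / 1000:.2f} kW")
    for layer, watts in sorted(per_layer.items()):
        print(f"Layer {layer}: {watts / 1000:.2f} kW")
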
Site enhancements:
Improved site health functionality
Flows available for smaller organizations
Site attributes
New features:
Suggestion box / 'ideation' tool
Monthly bandwidth stats
Shibboleth / OpenID integration
LookingGlass
Mobile support
Graphing enhancements:
Track across all graphs simultaneously
Zoom into graphs
New visualizations:
Energy consumption
Network topology
ANI and SC demos
First version of OSCARS circuit viz
Network path of interest
Peering
PerfSONAR data
My.ES.net Portal - Roadmap (now-Feb 2012)
Energy monitoring experiments by Alcatel-Lucent
20 Gbps disk-to-disk experiments by FNAL
40G experiments
Loaner OC-768 SONET cards from Infinera installed
Loaner Bay Microsystems 40G InfiniBand to SONET gateway installed
Long distance 40G RDMA testing has begun
Loaner 40GE cards from Infinera scheduled to arrive this month
16 projects have been given access to the testbed so far:
7 via direct DOE/ASCR funding
9 via testbed proposal review process
5 from Industry; 6 from DOE labs; 5 from Universities
10 DOE-funded, 3 NSF-funded

Wide range of projects:
6: High-speed middleware
2: OpenFlow
2: Other network control plane
1: 100Gbps end host hardware
1: Network flow classification
2: Wide Area RDMA
2: TCP congestion control
1: Security
1: Energy efficiency
Current ANI Testbed Research
Testbed Accomplishments
Advanced Networking Initiative