Eureka! Cracking The ‘Omics Code With StorNext at SIB (Swiss Institute Of Bioinformatics)
The SIB Swiss Institute of Bioinformatics is at the forefront of the next great revolution in life sciences—applying computational methodologies to genomics, proteomics and other bioinformatic sciences. SIB’s work is increasingly focused on applied genomics to improve quality of life.
“SIB recently worked on an algorithm for a prenatal diagnostic test for conditions like Down syndrome,” explains Professor Ioannis Xenarios, Director at the Vital-IT Group. “With a simple blood draw from the mother at 11 weeks, we can sequence the genetic material of the fetus in utero. It’s less invasive—and much less risky than traditional amniocentesis. And it shows how genomics is becoming more relevant in our everyday lives.”
More Than 30TB Per Week Creates Unique Data Management Challenges
SIB operates six sequencing centers and supports about 300 research teams, which together generate up to 30TB of data a week.
“Over the last few years, sequencing has become much faster,” explains Roberto Fabbretti, Senior Scientist and IT Manager at Vital-IT. “That means we are doing more projects than ever and our data is growing very rapidly.”
Long-Lived Research Demands Data Stewardship
“For research into areas like cancer and immunotherapy, we capture large amounts of sequenced data for each patient,” says Xenarios. “If that person comes back on a week-to-week or month-to-month basis, all the data from previous tests needs to be made quickly and accurately available to researchers in a short amount of time. To scale our bioinformatics efforts to support tens of thousands of patients, we need to look for cost-effective ways to preserve genomic data for 20, 30, or 40 years of time—effectively creating a view of a patient from before birth to death.”
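To give a rough sense of what those retention horizons imply, the short calculation below projects cumulative raw data volume from the 30TB-per-week figure quoted earlier. It is only a back-of-the-envelope sketch: it assumes the ingest rate stays constant at that upper bound, uses decimal units (1PB = 1,000TB), and ignores compression, deletion, and extra protection copies. None of those assumptions come from SIB, and the function name is illustrative.

```python
# Back-of-the-envelope projection of cumulative raw data volume.
# Assumptions (not from SIB): constant ingest at the quoted upper bound,
# 52 weeks per year, decimal units (1 PB = 1000 TB), no compression,
# no deletion, and a single copy of the data.

TB_PER_WEEK = 30       # upper bound quoted in the article
WEEKS_PER_YEAR = 52
TB_PER_PB = 1000


def cumulative_pb(years, tb_per_week=TB_PER_WEEK):
    """Return the cumulative volume in PB after `years` of constant ingest."""
    return years * WEEKS_PER_YEAR * tb_per_week / TB_PER_PB


for years in (20, 30, 40):
    print(f"{years} years: about {cumulative_pb(years):.1f} PB")
```

Under these assumptions the totals run to tens of petabytes, which is why the cost of the archive tier matters so much for long-lived research data.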
High-Performance Storage at Petascale
Vital-IT today supports its research infrastructure with StorNext scale-out storage from Quantum. Researchers get high-speed sequencing and analysis through four separate StorNext systems that together provide nearly 1PB of primary storage and 4PB of economical tape archive. StorNext supports high-performance processing using IP over InfiniBand, keeps active data on primary storage for analysis, and automatically moves files to an AEL tape archive as they age. More than 600 users access the data, either at one of the data centers or remotely through a CIFS interface.
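The age-based movement described above can be pictured with a small, purely illustrative model. This is not StorNext's policy engine or API. The `FileRecord` class, the function names, and the 90-day threshold are all hypothetical, chosen only to show the idea of keeping recently used files on primary storage and moving older ones to tape while leaving their paths unchanged.

```python
# Illustrative model of age-based tiering. This is NOT StorNext code:
# FileRecord, select_for_archive, apply_policy, and the 90-day threshold
# are hypothetical names and values used only for the sketch.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_access: datetime
    tier: str = "primary"  # either "primary" or "tape"


def select_for_archive(files, now, max_age):
    """Return primary-tier files whose last access is older than max_age."""
    return [
        f for f in files
        if f.tier == "primary" and now - f.last_access > max_age
    ]


def apply_policy(files, now, max_age=timedelta(days=90)):
    """Move aged files to the tape tier; the path stays the same."""
    moved = select_for_archive(files, now, max_age)
    for f in moved:
        f.tier = "tape"
    return moved


now = datetime(2024, 1, 1)
catalog = [
    FileRecord("/projects/a/run1.fastq", now - timedelta(days=200)),
    FileRecord("/projects/a/run2.fastq", now - timedelta(days=5)),
]
for f in apply_policy(catalog, now):
    print(f"archived {f.path}")
```

In this toy model only the tier label changes, so a file keeps the path users already know after it has been archived.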
Self-Service Access Keeps Genomics Data Ready for Research
“The data that our researchers capture and analyze provides important answers today, but it also has the potential to be useful months or years later when new analytic applications can extract information from the same raw sequences,” Fabbretti says. “StorNext allows us to provide cost-effective long-term archiving for all our projects, regardless of how long a project is planned to last.”
Archived files still appear where researchers expect them in the file system, so researchers can open them directly without IT support.
“If you provide researchers with the right set of tools, they push the envelope,” says Xenarios. “StorNext tiered storage helps us take data in fast, quickly move it to archive, and keep it ready so bioinformaticians can continue their work.”
Automated Protection for Some of the Most Valuable Data Sets on Earth
“StorNext not only helps us make sure we capture data fast—it also makes archiving an automated, cost-effective process to help us fulfill our role as a data steward,” says Fabbretti. “We always make two copies of the files on tape, keeping one available in the archive and the other vaulted to provide an additional layer of protection against any kind of hardware failure or damage to a site.”
“We are dealing with some of the most valuable data sets on earth,” Fabbretti explains. “StorNext gives us a multi-petabyte archive capability, long-term data protection, and the ability to easily roll back file versions—it’s a critically important part of that strategy.”
Scalability Keeps SIB Ready for What Comes Next
“StorNext has supported our growth for over six years. We know we can easily add more disk and capacity when we need it. In fact, we’ve gone beyond genomics to store and protect general medical research data sets. It is important to us that StorNext can easily include additional tiers like cloud or object storage in our storage workflow when it is time to expand.”