Monday, May 16, 2011
(Held in conjunction with the International Parallel and Distributed Processing Symposium)
Advance Program for HiCOMB 2011
In this talk I will discuss the current state and future trends in data-intensive, genomics-driven biological research and how large-scale computing will enable the development of predictive theory from this wealth of data. Sequencing centers will soon be producing data volumes comparable to those in high energy physics and astronomy (multiple petabytes per year), thus requiring new scalable approaches to analysis. I'll discuss three such areas. The first is the development and implementation of the RAST (Rapid Annotation via Subsystem Technology) system for automated genome annotation, which has been used by thousands of researchers to annotate and analyze over ten thousand genomes, more than any other system. I'll also address how this system can scale in the future to handle hundreds of thousands of genomes. The second is the ModelSEED, which is the first system to support automated high-throughput construction of metabolic models directly from genomes. The ModelSEED is a key capability in our long-term goal to predict phenotypes from genotypes. Our group is currently building computational models for all "complete" bacterial and archaeal genomes (over one thousand models). The third is the analysis and assembly of environmental metagenomes. Next-generation sequencing technology will make it possible to generate over a billion DNA reads per environmental sample. From these reads we can reconstruct many aspects of the community, including the distribution of organisms, the metabolic potential of the community, and the complete genomes of the most abundant taxa.
Rick Stevens is Associate Laboratory Director responsible for Computing, Environment and Life Sciences research at Argonne National Laboratory and is a professor of computer science at the University of Chicago. He also holds a senior fellow appointment. Recently, Prof. Stevens has been co-leading the DOE laboratory planning effort for exascale computing research, aiming to develop computer systems one thousand times faster than current supercomputers and apply these systems to fundamental problems in science, including genomic analysis, whole cell modeling, climate models, and problems in fundamental physics and energy technology development. He has authored or co-authored more than 120 papers and is a fellow of the American Association for the Advancement of Science. His research groups have won many national awards over the years, including an R&D 100 award for the Access Grid. He sits on many government, university, and industry advisory boards.
High-performance computing is fast becoming an integral part of research and application in bioinformatics and computational biology. The large size of biological data sets, the inherent complexity of biological problems, and the need to deal with error-prone data all result in large run-time and memory requirements. The goal of this workshop is to provide a forum for discussion of the latest research in developing high-performance computing solutions to data-intensive and compute-intensive problems arising from molecular biology and related life sciences areas. We are especially interested in parallel algorithms, memory-efficient algorithms, large-scale data mining techniques, algorithms on multicores and GPUs, and the design of high-performance software for biological applications. The workshop will feature contributed papers as well as invited talks from reputed researchers in the field.
Topics of interest include but are not limited to:
Papers reporting on original research (both theoretical and experimental) in all areas of bioinformatics and computational biology are sought. Surveys of important recent results and directions are also welcome. To submit a paper, upload a PDF copy of the paper here. The paper should not exceed 12 single-spaced pages (US Letter or A4 size) in 11pt font or larger. All papers will be reviewed. IEEE CS Press will publish the IPDPS symposium and workshop abstracts as a printed volume. The complete symposium and workshop proceedings will also be published by IEEE CS Press on CD-ROM and made available in the IEEE Digital Library.
Workshop Paper Due: December 27, 2010
Author Notification: January 20, 2011
Camera-ready Paper Due: February 21, 2011
Dept. of Electrical & Computer Engg. and
Lawrence H. Baker Center for Bioinformatics
& Biological Statistics
Iowa State University
3227 Coover Hall
Ames, IA 50011, USA
David A. Bader
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332 USA
School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164
For up-to-date information about this workshop, please visit