20th IEEE International Workshop on High Performance Computational Biology
May 17, 2021
Virtual Workshop

   In conjunction with the IEEE International Parallel and Distributed Processing Symposium

Announcements:


Confirmed Keynote and Invited Speakers

Keynote Speaker:

Kathy Yelick
Robert S. Pepper Distinguished Professor of EECS
Associate Dean of Research, Division of Computing, Data Science, and Society
University of California Berkeley
Senior Advisor on Computing
Lawrence Berkeley National Laboratory
Berkeley, CA, USA

Title: Genomic Analysis at Scale: Mapping Irregular Computations to Advanced Architectures

Abstract:
    Genomic data sets are growing dramatically as the cost of sequencing continues to decline and community databases are built to store and share this data with the research community. Some data analysis problems require large-scale computational platforms to meet both the memory and computational requirements of these data sets. These applications differ from the scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. The tools in common use often run only on shared memory machines, and on distributed memory they involve irregular communication patterns such as asynchronous updates to shared data structures. The ExaBiome project at Berkeley Lab is developing high performance tools for analyzing microbial data. I will give an overview of several high-performance genomic analysis problems, including alignment, profiling, clustering, and assembly, and describe some of the challenges and opportunities of mapping these to current petascale and future exascale architectures. I will also describe some of the common computational patterns or “motifs” that inform parallelization strategies and can be useful in understanding architectural requirements, algorithmic approaches, and benchmarking of current and future systems.

Biography:
Dr. Kathy Yelick is the Robert S. Pepper Distinguished Professor of Electrical Engineering and Computer Science, and the Associate Dean for Research in the Division of Computing, Data Science and Society at UC Berkeley. She is also the Senior Advisor on Computing at LBNL. Her research is in high performance computing, programming languages, compilers, parallel algorithms, and automatic performance tuning. She currently leads the ExaBiome project on scalable tools for analyzing microbial data and co-leads the Berkeley Benchmarking and Optimization (Bebop) group. Dr. Yelick served as the Associate Lab Director for Computing Sciences at LBNL from 2010 through 2019, and prior to that led NERSC. Dr. Yelick is a member of the National Academy of Engineering and the American Academy of Arts and Sciences. She is a Fellow of both the ACM and AAAS.

Invited Speakers:

    Sriram Sankararaman
    Assistant Professor, Departments of Computer Science, Human Genetics, and Computational Medicine
    University of California, Los Angeles, CA

    Title: Probabilistic models for large-scale human genomic data
    Abstract: The quest to understand the genetic basis of complex traits and diseases has been revolutionized by the collection of phenotypic and genetic data across hundreds of thousands of individuals. However, analyses of these large-scale datasets present substantial statistical and computational challenges. I will describe how we bring together statistical and computational insights to design accurate and highly scalable algorithms to uncover the genetic basis of complex traits. By applying these methods to data from about half a million individuals from the UK Biobank, we obtain novel insights into how heritable traits are, how genetic effects are distributed across the genome, and the relative contributions of additive, dominance and gene-environment interaction effects to trait variation.



    Fabio Vandin
    Professor, Department of Information Engineering
    University of Padova
    Padova, Italy

    Title: Fast Approximations of Frequent k-Mers and Applications
    Abstract:
    The extraction of k-mers is a fundamental step in many complex analyses of large sequencing datasets, including reads classification in genomics and the characterization of RNA-seq datasets. The extraction of all k-mers and their frequencies is extremely demanding in terms of running time and memory, owing to the size of the data and to the exponential number of k-mers to be considered. However, in several applications, only frequent k-mers, which are k-mers appearing in a relatively high proportion of the data, are required by the analysis. In this talk I am going to present two efficient algorithms, SAKEIMA and SPRISS, to approximate frequent k-mers and their frequencies in next-generation sequencing data. Both algorithms employ simple yet powerful reads sampling schemes based on tools from statistical learning theory. Our extensive experimental evaluation demonstrates the efficiency and accuracy of our algorithms in approximating frequent k-mers, and shows that they can be used in various scenarios, such as the comparison of metagenomic datasets and the identification of discriminative k-mers, to extract insights in a fraction of the time required by the analysis of the whole dataset.


    Anil Vullikanti
    Professor of Computer Science
    Biocomplexity Institute
    University of Virginia
    Charlottesville, VA, USA

    Title: Designing interventions in networked models of epidemic spread
    Abstract:
    The spread of epidemics is a very complex process, and stochastic diffusion models on networks have been found useful, especially when modeling their spread in large and heterogeneous populations, where individual and community level behaviors need to be represented. A common problem in such models is to understand how to control the spread of an epidemic by interventions such as vaccination (which can be modeled as node removal) and social distancing (which can be modeled as edge removal). The decision space of interventions is very complex, in general, as they need to be made over time, taking into account resource constraints and behavioral changes. A broad range of interventions have been proposed, and we will summarize these briefly from a network science perspective, along with the underlying assumptions, their performance, and the computational challenges, which arise even in the simplest of settings. We will also discuss some of the current techniques for these problems, which provide the first steps towards solutions with approximation guarantees, including reducing the spectral radius, using submodularity, and stochastic optimization.
HiCOMB 2021 Call For Papers

The size and complexity of genomic and biomedical big data continue to grow at a furious pace, and the analysis of these complex, noisy data sets demands efficient algorithms and high performance computing architectures. Hence, high-performance computing (HPC) has become an integral part of research and development in bioinformatics, computational biology, and medical and health informatics. The goal of the HiCOMB workshop is to showcase novel HPC research and technologies to solve data- and compute-intensive problems arising from all areas of computational life sciences. The workshop will feature contributed papers as well as invited talks from reputed researchers in the field.

For peer-reviewed papers, we invite authors to submit original and previously unpublished work that is at the intersection of the "pillars" of modern-day computational life sciences and HPC. More specifically, we encourage submissions from all areas of biology that can benefit from HPC, and from all areas of HPC that need new development to address the class of computational problems that originate from biology.

Areas of interest within computational life sciences include (but are not limited to):

Areas of interest within HPC include (but are not limited to):

Submission guidelines

To submit a paper, please upload a PDF file through the Linklings HiCOMB 2021 submission link:
https://ssl.linklings.net/conferences/ipdps/?page=Submit&id=HiCOMBWorkshopFullSubmission&site=ipdps2021

IPDPS workshops can have submissions in three categories: regular papers (up to 10 pages), short papers (up to 4 pages), and extended abstracts (1 page). Submitted manuscripts may not exceed ten (10) single-spaced double-column pages using a 10-point font on 8.5x11 inch pages (IEEE conference style), including figures, tables, and references (see IPDPS Call for Papers for more details). All papers will be reviewed by three or more referees. This year, the authors of the accepted papers will be given a choice on whether to have the paper appear in the IPDPSW Proceedings (which will be digitally indexed and archived as part of the IEEE Xplore Digital Library). If the authors choose not to make it part of the proceedings, then the paper will not be considered archival. In either case, all accepted papers will be posted online on the workshop website, and all accepted papers (archived or not) will need to have an oral presentation at the workshop by one of the authors of the paper.


Important Dates

Workshop submission deadline (for all categories): February 5, 2021 by 11:59pm AoE (extended from January 29, 2021)
Author notification: February 28, 2021
Final camera-ready papers deadline: March 12, 2021
Workshop: May 17, 2021

Program Committee

Mohammed Alser, ETH Zurich
Ariful Azad, Indiana University
Mukul Bansal, University of Connecticut
Aydin Buluc, Lawrence Berkeley National Laboratory; University of California, Berkeley
Somali Chaterji, Purdue University
Rayan Chikhi, Institut Pasteur
Ercument Cicek, Bilkent University
Saliya Ekanayake, Virginia Tech
Attila Gursoy, Koc University
Daisuke Kihara, Purdue University
Penporn Koanantakool, Google
Georgios Kollias, IBM T.J. Watson Research Center
Johannes Langguth, Simula Research Laboratory
Erin Molloy, University of California, Los Angeles
Gaurav Pandey, Icahn School of Medicine at Mount Sinai
Enzo Rucci, National University of La Plata
Matt Ruffalo, Carnegie Mellon University
Bertil Schmidt, Johannes Gutenberg University Mainz
Michela Taufer, University of Tennessee Knoxville
Sharma Thankachan, University of Central Florida
Fabio Vandin, University of Padova
Jaroslaw Zola, University at Buffalo

Program Chair

General Chairs

Steering Committee Members


HiCOMB Archive

19th International Workshop on High Performance Computational Biology - HiCOMB 2020
18th International Workshop on High Performance Computational Biology - HiCOMB 2019
17th International Workshop on High Performance Computational Biology - HiCOMB 2018
16th International Workshop on High Performance Computational Biology - HiCOMB 2017
15th International Workshop on High Performance Computational Biology - HiCOMB 2016
14th International Workshop on High Performance Computational Biology - HiCOMB 2015
13th International Workshop on High Performance Computational Biology - HiCOMB 2014
12th International Workshop on High Performance Computational Biology - HiCOMB 2013
11th International Workshop on High Performance Computational Biology - HiCOMB 2012
10th International Workshop on High Performance Computational Biology - HiCOMB 2011
9th International Workshop on High Performance Computational Biology - HiCOMB 2010
8th International Workshop on High Performance Computational Biology - HiCOMB 2009
7th International Workshop on High Performance Computational Biology - HiCOMB 2008
6th International Workshop on High Performance Computational Biology - HiCOMB 2007
5th International Workshop on High Performance Computational Biology - HiCOMB 2006
4th International Workshop on High Performance Computational Biology - HiCOMB 2005
3rd International Workshop on High Performance Computational Biology - HiCOMB 2004
2nd International Workshop on High Performance Computational Biology - HiCOMB 2003
1st International Workshop on High Performance Computational Biology - HiCOMB 2002