Computer Science Research Seminar

Date: 3-4PM Thursday, September 25th, 2008
Location: Adams 018

Title: Exploiting Heterogeneous Data Sets for Functional Annotation

Dr. Darrin Lewis (Hofstra Alumnus)


Machine learning offers powerful tools to experts practicing in varied domains of study. Amongst practitioners, there is a vital need to learn from heterogeneous data sets. This need is fueled by the increasing amount of data being generated by different processes that potentially inform different aspects of a learning problem. As an example, we consider the computational biology problem of protein functional annotation. Numerous wet lab and computational experiments have provided a wide variety of data pertaining to the same set of proteins. Each type of measurement (e.g., DNA sequence, three-dimensional structure, subcellular location, protein domain content, interaction networks) offers a different set of discriminative features for classification. The practitioner should have the freedom to use all of these data to inform a classifier and should expect the learning algorithm to exploit all the data for maximum benefit.

Bio: Dr. Lewis conducts research in the areas of machine learning and computational biology. Computational biology differs from the traditional practice of biology in its use of high-throughput experiments for data gathering and in the algorithmic manipulation and analysis of the results. Dr. Lewis has developed and applied several novel techniques and has published in some of the most respected journals and conferences in his field.

Dr. Lewis earned his Ph.D. at Columbia University under Dr. William Stafford Noble and Dr. Tony Jebara. Prior to that, he earned an M.S. in Computer Science at Hofstra University under Dr. Robert Bumcrot and Dr. Jerome Epstein. Dr. Lewis has held research positions at Bell Laboratories and Siemens Research.