
Metagenomics with Compressive Sensing

This article is part of our series on Faculty Researchers, which highlights mathematics faculty research contributions across the curricular areas of the mathematics department.

You have probably heard buzzwords like big data, artificial intelligence, and machine learning a lot recently. Have you ever wondered how scientists, researchers, and companies actually use big data or machine learning to understand their area of interest? In fact, mathematics and mathematical algorithms underlie many, if not all, of the computational approaches used to understand so-called big data. Assistant Professor David Koslicki is among the researchers developing the techniques necessary to understand the world around us through data. In particular, Koslicki develops computational techniques that help biologists better understand sequencing data: DNA, proteins, genes, genomes, and the like.

One project, recently funded by the National Science Foundation (NSF), aims to determine which bacteria and other microbes are present in an environmental sample by looking at the genomes of the microbes. It turns out that most microbes can’t be cultured: we don’t know the exact conditions required to grow enough of them to look at them under a microscope. Instead, we can extract DNA from the microbes and use it to figure out which microbes are present and what they are doing. Sequencing this DNA produces gigabytes to terabytes of information. So how is a biologist supposed to understand such a large set of data? Koslicki uses an approach based on ideas from a mathematical field called compressed sensing (also known as compressive sensing). In particular, Koslicki defines a model, based on linear algebra, in which the question “what microbes are present” is reduced to solving a particular system of equations. Of course, such an approach is only useful if the results can be returned in a timely fashion. This is especially important in the medical field: if you’re sick with a bacterial infection, the doctor wants to know as quickly as possible which bacteria are causing it. Hence, Koslicki has developed a set of algorithms that can quickly solve the relevant system of equations without using too many computational resources.
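To give a feel for how “what microbes are present” can become a system of equations, here is a minimal Python sketch. It is an illustration under stated assumptions, not Koslicki's actual model or algorithms, which the article does not spell out. It assumes each reference genome is summarized by the frequencies of its short DNA substrings (k-mers), that the sample's k-mer frequencies are a weighted mix of those reference vectors, and that only a few weights are nonzero. The toy genomes, the function names, the penalty value, and the simple projected gradient solver are all hypothetical choices made for this example.

```python
from itertools import product


def kmer_frequencies(sequence, k):
    """Return the normalized k-mer frequency vector of a DNA string."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip k-mers with letters other than A, C, G, T
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[kmer] / total for kmer in kmers]


def sparse_nonnegative_solve(A, y, penalty=1e-4, steps=2000):
    """Approximately minimize ||A x - y||^2 + penalty * sum(x) with x >= 0.

    Uses plain projected gradient descent. The penalty term nudges the
    solution toward having few nonzero entries (a sparse answer).
    """
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so this step size is small enough for the iteration to converge.
    frobenius_sq = sum(a * a for row in A for a in row)
    step = 1.0 / (2.0 * frobenius_sq)
    x = [0.0] * cols
    for _ in range(steps):
        residual = [
            sum(A[i][j] * x[j] for j in range(cols)) - y[i]
            for i in range(rows)
        ]
        gradient = [
            2.0 * sum(A[i][j] * residual[i] for i in range(rows)) + penalty
            for j in range(cols)
        ]
        # Gradient step, then project back onto x >= 0.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(cols)]
    return x


# Hypothetical toy "reference genomes" (real genomes are millions of letters).
references = {
    "microbe_1": "ACACACACACACACACACAC",
    "microbe_2": "GGGTGGGTGGGTGGGTGGGT",
    "microbe_3": "ATTATTATTATTATTATTAT",
}
k = 2
names = list(references)
columns = [kmer_frequencies(references[name], k) for name in names]

# Matrix A: one row per k-mer, one column per reference genome.
A = [[column[i] for column in columns] for i in range(4 ** k)]

# Simulated sample: 70% microbe_1, 30% microbe_3, no microbe_2.
true_abundances = [0.7, 0.0, 0.3]
y = [
    sum(A[i][j] * true_abundances[j] for j in range(len(names)))
    for i in range(len(A))
]

abundances = sparse_nonnegative_solve(A, y)
for name, value in zip(names, abundances):
    print(f"{name}: {value:.3f}")
```

The toy genomes were chosen so that they share no 2-mers, which makes this an unusually easy instance; with realistic data the reference vectors overlap heavily and the matrix has vastly more columns, which is why fast, memory-efficient solvers matter.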

“I've enjoyed working on metagenomic problems, and computational biology in general, as this is a great area in which newly developed mathematics can make significant contributions to pressing real-world issues.”

This and related work by Koslicki has been featured in venues ranging from mathematical journals to widely read biology journals such as Nature, as well as in popular press and newspaper articles.