New data science tool greatly speeds up molecular analysis of our environment
- Date:
- September 20, 2024
- Source:
- University of California - Riverside
- Summary:
- A research team has developed a computational workflow for analyzing large data sets in the field of metabolomics, the study of small molecules found within cells, biofluids, tissues, and entire ecosystems.
- Share:
A research team led by scientists at the University of California, Riverside, has developed a computational workflow for analyzing large data sets in the field of metabolomics, the study of small molecules found within cells, biofluids, tissues, and entire ecosystems.
Most recently, the team applied this new computational tool to analyze pollutants in seawater in Southern California. The team swiftly captured the chemical profiles of coastal environments and highlighted potential sources of pollution.
"We are interested in understanding how such pollutants get introduced in the ecosystem," said Daniel Petras, an assistant professor of biochemistry at UC Riverside, who led the research team. "Figuring out which molecules in the ocean are important for environmental health is not straightforward because of the ocean's sheer chemical diversity. The protocol we developed greatly speeds up this process. More efficient sorting of the data means we can understand problems related to ocean pollution faster."
Petras and his colleagues report in the journal Nature Protocols that their protocol is designed not only for experienced researchers but also for educational purposes, making it an ideal resource for students and early-career scientists. This computational workflow is accompanied by an accessible web application with a graphical user interface that makes metabolomics data analysis accessible for non-experts and enables them to gain statistical insights into their data within minutes.
"This tool is accessible to a broad range of researchers, from absolute beginners to experts, and is tailored for use in conjunction with the molecular networking software my group is developing," said coauthor Mingxun Wang, an assistant professor of computer science and engineering at UCR. "For beginners, the guidelines and code we provide make it easier to understand common data processing and analysis steps. For experts, it accelerates reproducible data analysis, enabling them to share their statistical data analysis workflows and results."
Petras explained the research paper is unique, serving as a large educational resource organized through a virtual research group called Virtual Multiomics Lab, or VMOL. With more than 50 scientists participating from around the world, VMOL is a community-driven, open-access community. It aims to simplify and democratize the chemical analysis process, making it accessible to researchers worldwide, regardless of their background or resources.
"I'm incredibly proud to see how this project evolved into something impactful, involving experts and students from across the globe," said Abzer Pakkir Shah, a doctoral student in Petras' group and the first author of the paper. "By removing physical and economic barriers, VMOL provides training in computational mass spectrometry and data science and aims to launch virtual research projects as a new form of collaborative science."
All software the team developed is free and publicly available. The software development was initiated during a summer school for non-targeted metabolomics in 2022 at the University of Tübingen, where the team also launched VMOL.
Petras expects the protocol will be especially useful to environmental researchers as well as scientists working in the biomedical field and researchers doing clinical studies in microbiome science.
"The versatility of our protocol extends to a wide range of fields and sample types, including combinatorial chemistry, doping analysis, and trace contamination of food, pharmaceuticals, and other industrial products," he said.
Petras received his master's degree in biotechnology from the University of Applied Science Darmstadt and his doctoral degree in biochemistry from the Technical University Berlin. He did postdoctoral research at UC San Diego, where he focused on the development of large-scale environmental metabolomics methods. In 2021, he launched the Functional Metabolomics Lab at the University of Tübingen. In January 2024 he joined UCR, where his lab focuses on the development and application of mass spectrometry-based methods to visualize and assess chemical exchange within microbial communities.
Story Source:
Materials provided by University of California - Riverside. Original written by Iqbal Pittalwala. Note: Content may be edited for style and length.
Journal Reference:
- Abzer K. Pakkir Shah, Axel Walter, Filip Ottosson, Francesco Russo, Marcelo Navarro-Diaz, Judith Boldt, Jarmo-Charles J. Kalinski, Eftychia Eva Kontou, James Elofson, Alexandros Polyzois, Carolina González-Marín, Shane Farrell, Marie R. Aggerbeck, Thapanee Pruksatrakul, Nathan Chan, Yunshu Wang, Magdalena Pöchhacker, Corinna Brungs, Beatriz Cámara, Andrés Mauricio Caraballo-Rodríguez, Andres Cumsille, Fernanda de Oliveira, Kai Dührkop, Yasin El Abiead, Christian Geibel, Lana G. Graves, Martin Hansen, Steffen Heuckeroth, Simon Knoblauch, Anastasiia Kostenko, Mirte C. M. Kuijpers, Kevin Mildau, Stilianos Papadopoulos Lambidis, Paulo Wender Portal Gomes, Tilman Schramm, Karoline Steuer-Lodd, Paolo Stincone, Sibgha Tayyab, Giovanni Andrea Vitale, Berenike C. Wagner, Shipei Xing, Marquis T. Yazzie, Simone Zuffa, Martinus de Kruijff, Christine Beemelmanns, Hannes Link, Christoph Mayer, Justin J. J. van der Hooft, Tito Damiani, Tomáš Pluskal, Pieter Dorrestein, Jan Stanstrup, Robin Schmid, Mingxun Wang, Allegra Aron, Madeleine Ernst, Daniel Petras. Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nature Protocols, 2024; DOI: 10.1038/s41596-024-01046-3
Cite This Page: