
How artificial intelligence could automate genomics research

Date:
December 2, 2024
Source:
University of California - San Diego
Summary:
New research suggests that large language models like GPT-4 could streamline gene set enrichment, an approach for determining what genes do and how they interact. The results bring science one step closer to automating one of the most widely used methods in genomics research.
FULL STORY

Researchers at the University of California San Diego School of Medicine have demonstrated that large language models (LLMs), such as GPT-4, could help automate functional genomics research, which seeks to determine what genes do and how they interact. The most frequently used approach in functional genomics, called gene set enrichment, aims to determine the function of experimentally identified gene sets by comparing them to existing genomics databases. However, more interesting and novel biology is often beyond the scope of established databases. Using artificial intelligence (AI) to analyze gene sets could save scientists many hours of intensive labor and bring science one step closer to automating one of the most widely used methods for understanding how genes work together to influence biology.
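
For readers unfamiliar with the database comparison described above, the short Python sketch below illustrates one common form of gene set enrichment, an over-representation test based on the hypergeometric distribution. It is an illustration only, not the method used in the study: the function name, the placeholder gene identifiers (G1, G2, ...), the term names, and the background size of 20 genes are all assumptions made for the example.

```python
from math import comb


def enrichment_p_value(query, annotated, background_size):
    """Probability of an overlap at least as large as the observed one,
    assuming the query genes were drawn at random from the background.

    Assumes every gene in `query` and `annotated` belongs to the background.
    """
    n = len(query)                # size of the experimental gene set
    K = len(annotated)            # genes the database assigns to this term
    k = len(query & annotated)    # observed overlap
    total = comb(background_size, n)
    # Upper tail of the hypergeometric distribution: overlap of k or more.
    tail = sum(
        comb(K, i) * comb(background_size - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Hypothetical toy "database": term name -> annotated genes (placeholders).
database = {
    "hypothetical term A": {"G1", "G2", "G3", "G4"},
    "hypothetical term B": {"G5", "G6", "G7", "G8", "G9", "G10"},
}
query = {"G1", "G2", "G3", "G11"}   # placeholder experimental gene set
background_size = 20                # assumed number of genes in the background

# Rank database terms from most to least enriched (smallest p-value first).
ranked = sorted(
    (enrichment_p_value(query, genes, background_size), term)
    for term, genes in database.items()
)
for p, term in ranked:
    print(f"{term}: p = {p:.4f}")
```

A test like this can only score terms that already exist in the database, which is the limitation the researchers point to: a gene set reflecting novel biology may not match any established term well.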

Testing five different LLMs, the researchers found that GPT-4 was the most successful, achieving a 73% accuracy rate in identifying common functions of curated gene sets from a commonly used genomics database. When asked to analyze random gene sets, GPT-4 refused to provide a name in 87% of cases, demonstrating the potential of GPT-4 to analyze gene sets with minimal hallucination. GPT-4 was also capable of providing detailed narratives to support its naming process.

While further research is needed to fully explore the potential of LLMs in automating functional genomics, the study highlights the need for continued investment in the development of LLMs and their applications in genomics and precision medicine. To support this, the researchers created a web portal to help other researchers incorporate LLMs into their functional genomics workflows. More broadly, the findings also demonstrate the power of AI to revolutionize the scientific process by synthesizing complex information to generate new, testable hypotheses in a fraction of the time.

The study, published in Nature Methods, was led by Trey Ideker, Ph.D., a professor at UC San Diego School of Medicine and UC San Diego Jacobs School of Engineering, Dexter Pratt, Ph.D., a software architect in Ideker's group, and Clara Hu, a biomedical sciences doctoral candidate in Ideker's group. The study was funded, in part, by the National Institutes of Health.


Story Source:

Materials provided by University of California - San Diego. Original written by Miles Martin. Note: Content may be edited for style and length.


Journal Reference:

  1. Mengzhou Hu, Sahar Alkhairy, Ingoo Lee, Rudolf T. Pillich, Dylan Fong, Kevin Smith, Robin Bachelder, Trey Ideker, Dexter Pratt. Evaluation of large language models for discovery of gene set function. Nature Methods, 2024; DOI: 10.1038/s41592-024-02525-x

Cite This Page:

University of California - San Diego. "How artificial intelligence could automate genomics research." ScienceDaily. ScienceDaily, 2 December 2024. <www.sciencedaily.com/releases/2024/12/241202123918.htm>.
University of California - San Diego. (2024, December 2). How artificial intelligence could automate genomics research. ScienceDaily. Retrieved December 2, 2024 from www.sciencedaily.com/releases/2024/12/241202123918.htm
University of California - San Diego. "How artificial intelligence could automate genomics research." ScienceDaily. www.sciencedaily.com/releases/2024/12/241202123918.htm (accessed December 2, 2024).
