A Novel AI Predicts Inner Cell Functions

More potent AI models and the amassing of vast amounts of cell data in recent years are beginning to turn biology into a more predictive science.

Researchers at Columbia University Vagelos College of Physicians and Surgeons have developed a new artificial intelligence technique that allows them to precisely anticipate gene activity in any human cell, effectively exposing the inner workings of the cell. The technology, which was detailed in the most recent issue of Nature, has the potential to revolutionize how scientists investigate everything from genetic disorders to cancer.

Predictive generalizable computational models allow us to uncover biological processes in a fast and accurate way. These methods can effectively conduct large-scale computational experiments, boosting and guiding traditional experimental approaches.

Raul Rabadan

Conventional biology research techniques are effective in illuminating how cells function or respond to perturbations. However, they are unable to forecast how cells will function or respond to changes, such as a mutation that causes cancer.

Having the ability to accurately predict a cell’s activities would transform our understanding of fundamental biological processes.

It would turn biology from a science that describes seemingly random processes into one that can predict the underlying systems that govern cell behavior.

Raul Rabadan

More potent AI models and the amassing of vast amounts of cell data in recent years are beginning to turn biology into a more predictive science. Researchers’ innovative work in applying AI to predict protein structures earned them the 2024 Nobel Prize in Chemistry. However, it has proven more challenging to employ AI techniques to forecast the actions of genes and proteins within cells.

In the current study, Rabadan and his colleagues used artificial intelligence to predict which genes are active within particular cells. This gene expression data can tell researchers the identity of a cell and how it functions.

Previous models have been trained on data in particular cell types, usually cancer cell lines or something else that has little resemblance to normal cells.

Raul Rabadan

The general methodology is similar to that of ChatGPT and other well-known “foundation” models. These systems find fundamental rules, or the grammar of language, using a set of training data. They then apply the principles they have inferred to novel contexts.

Here it’s exactly the same thing: we learn the grammar in many different cellular states, and then we go into a particular condition (it can be a diseased or normal cell type) and we can try to see how well we predict patterns from this information.

Raul Rabadan

To train and test the new model, Xi Fu and Rabadan quickly assembled a group of partners, including co-first authors Alejandro Buendia, who is currently a PhD candidate at Stanford and was once in the Rabadan lab, and Shentong Mo of Carnegie Mellon.

The algorithm was able to predict gene expression in cell types it had never seen before after being trained on data from over 1.3 million human cells, producing results that closely matched experimental data.

The researchers demonstrated the system’s strength when they asked it to reveal the biology of diseased cells, in this case an inherited form of pediatric leukemia.

These kids inherit a gene that is mutated, and it was unclear exactly what it is these mutations are doing.

Raul Rabadan

Using AI, the researchers predicted that the mutations interfere with the interplay of two distinct transcription factors that control leukemic cell fate. Lab tests validated the AI’s prediction. Knowing the impact of these mutations reveals particular pathways that drive this illness.

The new computational techniques might also let researchers begin investigating the function of the genome’s “dark matter,” a term borrowed from cosmology for the great bulk of the genome that lacks known protein-encoding genes.

Rabadan is already working with researchers from Columbia and other universities to investigate the grammar of control in healthy cells, how cells change during the development of cancer, and a variety of malignancies, including brain and blood cancers.

Rabadan views the work as part of a significant trend, following other recent developments in artificial intelligence for biology.

It’s really a new era in biology that is extremely exciting; transforming biology into a predictive science.

Raul Rabadan

Source: Columbia University – News

Journal Reference: Fu, Xi, et al. “A Foundation Model of Transcription across Human Cell Types.” Nature, 2025, pp. 1-9, DOI: https://doi.org/10.1038/s41586-024-08391-z.


Editor's Desk
