Uncovering the Statistical Trends of Protein Evolution with AlphaFold Database

CLC number:

R857.3; O414.2

Fund Project:

This work was supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (22KJD14005) and the Early Career Scheme (22302723) of the Research Grants Council of Hong Kong.

    Abstract:

    AlphaFold, developed by DeepMind, has achieved an unprecedented breakthrough in protein structure prediction and has had a transformative impact on life-science research. Large-scale structure prediction has made it possible to build the AlphaFold database, which contains over 200 million proteins and covers the complete proteomes of dozens of species. This review outlines recent progress in using statistical-physics methods to study protein evolution in the "post-AlphaFold era". Traditional research on protein evolution usually concentrates on the sequences or structures of proteins within a single family (a microscopic view). With the vast number of structures predicted by AlphaFold, researchers can widen the scope to large collections of proteins, or even compare all the proteins of different species directly, and extract statistical trends from them (a macroscopic view). By comparing proteins of similar chain length across more than 40 model organisms in the AlphaFold database, researchers have identified statistical trends in the molecular evolution of proteins. As organismal complexity increases, the constituent proteins show larger radii of gyration, higher flexibility and modularity, and stronger segregation of hydrophobic and hydrophilic residues in both space and sequence. Statistical-physics analysis also indicates that higher organismal complexity correlates with higher functional specialization of the constituent proteins. These findings connect molecular evolution to organismal evolution and contribute to the understanding of the origin and evolution of life and of biological complexity.
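The abstract names the radius of gyration as one of the structural quantities compared between organisms. As a rough illustration only, the sketch below computes it for a list of 3D coordinates, assuming the unweighted definition over one point per residue (for example C-alpha atoms). The review may use a different atom selection or mass weighting; the function name `radius_of_gyration` and the toy coordinates are hypothetical and not taken from the source.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of 3D points.

    coords: sequence of (x, y, z) tuples, e.g. one C-alpha position
    per residue (an assumption; the source does not specify this).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("coords must not be empty")

    # Geometric centre of the points.
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n

    # Mean squared distance of the points from that centre.
    msd = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
        for x, y, z in coords
    ) / n

    return math.sqrt(msd)


if __name__ == "__main__":
    # Toy example: two points 2 units apart on the x axis.
    toy = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(radius_of_gyration(toy))
```

Because this quantity grows with chain length, a comparison between organisms only makes sense for proteins of similar length, which is the restriction the abstract describes.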

Cite this article

Citation format
XIA Chenliang (夏辰亮), TANG Qianyuan (唐乾元). Uncovering the Statistical Trends of Protein Evolution with AlphaFold Database [J]. Journal of Integration Technology (集成技术), 2024, 13(2): 74-88

History
  • Received: 2023-09-12
  • Last revised: 2023-09-12
  • Accepted: 2023-11-23
  • Published online: 2023-11-23