Structure paper
| Title | Discovery of CRISPR-Cas12a clades using a large language model. |
|---|---|
| Journal, issue, pages | Nat Commun, Vol. 16, Issue 1, Page 7877, Year 2025 |
| Publish date | Aug 23, 2025 |
| Authors | Yuanyuan Feng / Junchao Shi / Zhanwei Li / Yongqian Li / Jiaxi Yang / Shisheng Huang / Jinfang Zheng / Wei Han / Yunbo Qiao / Jun Zhang / Qi Liu / Yao Yang / Chunyi Hu / Lina Wu / Xiaokang Zhang / Jin Tang / Xingxu Huang / Peixiang Ma |
| PubMed Abstract | CRISPR-Cas systems revolutionize life science. Metagenomes contain millions of unknown Cas proteins. Traditional mining relies on protein sequence alignments. In this work, we employ an evolutionary scale language model (ESM) to learn the information beyond sequences. Trained with CRISPR-Cas data, ESM accurately identifies Cas proteins without alignment. Limited experimental data restricts feature prediction, but integrating with machine learning enables trans-cleavage activity prediction of uncharacterized Cas12a. We discover 7 undocumented Cas12a subtypes with unique CRISPR loci. Structural analyses reveal 8 subtypes of Cas1, Cas2, and Cas4. Cas12a subtypes display distinct 3D-folds. CryoEM analyses unveil unique RNA interactions with the uncharacterized Cas12a. These proteins show distinct double-strand and single-strand DNA cleavage preferences and broad PAM recognition. Finally, we establish a specific detection strategy for the oncogene SNP without traditional Cas12a PAM. This study highlights the potential of language models in exploring undocumented Cas protein function via gene cluster classification. |
| External links | Nat Commun / PubMed:40849498 / PubMed Central |
| Methods | EM (single particle) |
| Resolution | 2.9 Å |
| Structure data | EMDB-37219, PDB-8kgf: |
| Chemicals | ChemComp-MG: |
| Source | Anaeroglobus (bacteria) |
| Keywords | DNA BINDING PROTEIN/RNA / CRISPR/Cas12a / DNA binding protein / DNA BINDING PROTEIN-RNA complex |