Basic information
| Entry | EMD-37219 |
|---|---|
| Title | Structure of AmCas12a with crRNA |
| Keywords | CRISPR/Cas12a / DNA binding protein / DNA BINDING PROTEIN-RNA complex |
| Biological species | Anaeroglobus (bacteria) |
| Method | single particle reconstruction / cryo EM / Resolution: 2.9 Å |
| Authors | Feng Y / Zhang X / Shi J / Ma P / Tang J / Huang X |
| Funding support | China, 2 items |
| Citation | Journal: Nat Commun / Year: 2025 / Title: Discovery of CRISPR-Cas12a clades using a large language model. / Authors: Yuanyuan Feng / Junchao Shi / Zhanwei Li / Yongqian Li / Jiaxi Yang / Shisheng Huang / Jinfang Zheng / Wei Han / Yunbo Qiao / Jun Zhang / Qi Liu / Yao Yang / Chunyi Hu / Lina Wu / Xiaokang Zhang / Jin Tang / Xingxu Huang / Peixiang Ma / Abstract: CRISPR-Cas systems revolutionize life science. Metagenomes contain millions of unknown Cas proteins. Traditional mining relies on protein sequence alignments. In this work, we employ an evolutionary scale language model (ESM) to learn the information beyond sequences. Trained with CRISPR-Cas data, ESM accurately identifies Cas proteins without alignment. Limited experimental data restricts feature prediction, but integrating with machine learning enables trans-cleavage activity prediction of uncharacterized Cas12a. We discover 7 undocumented Cas12a subtypes with unique CRISPR loci. Structural analyses reveal 8 subtypes of Cas1, Cas2, and Cas4. Cas12a subtypes display distinct 3D-folds. CryoEM analyses unveil unique RNA interactions with the uncharacterized Cas12a. These proteins show distinct double-strand and single-strand DNA cleavage preferences and broad PAM recognition. Finally, we establish a specific detection strategy for the oncogene SNP without traditional Cas12a PAM. This study highlights the potential of language models in exploring undocumented Cas protein function via gene cluster classification. |
Downloads & links
-EMDB archive
| Item | File | Size | Description |
|---|---|---|---|
| Map data | emd_37219.map.gz | 117 MB | EMDB map data format |
| Header (meta data) | emd-37219-v30.xml, emd-37219.xml | 21.8 KB each | EMDB header |
| FSC (resolution estimation) | emd_37219_fsc.xml | 10.7 KB | FSC data file |
| Images | emd_37219.png | 73.1 KB | |
| Masks | emd_37219_msk_1.map | 125 MB | Mask map |
| Filedesc metadata | emd-37219.cif.gz | 7.5 KB | |
| Others | emd_37219_half_map_1.map.gz, emd_37219_half_map_2.map.gz | 5.3 MB each | |
| Archive directory | http://ftp.pdbj.org/pub/emdb/structures/EMD-37219 (HTTPS), ftp://ftp.pdbj.org/pub/emdb/structures/EMD-37219 (FTP) | | |
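The FSC file listed in this archive supports the resolution estimate. As a rough sketch of how a resolution can be read off a Fourier shell correlation curve, the Python below finds the spatial frequency where the curve first drops below a threshold and inverts it. The 0.143 threshold is a commonly used criterion and is an assumption here, since the entry does not say which criterion produced 2.9 Å. The function name and the demo curve are hypothetical; the values are made up for illustration and are not taken from emd_37219_fsc.xml.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return 1/frequency (in Å) where the FSC curve first falls below threshold.

    freqs: spatial frequencies in 1/Å, in ascending order
    fsc:   correlation values at those frequencies
    """
    for i in range(1, len(freqs)):
        upper, lower = fsc[i - 1], fsc[i]
        if upper >= threshold > lower:
            # Linear interpolation between the two bracketing points.
            t = (upper - threshold) / (upper - lower)
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None  # the curve never crosses the threshold


# Hypothetical curve, for illustration only (not the deposited data).
demo_freqs = [0.10, 0.20, 0.30, 0.40, 0.50]
demo_fsc = [1.00, 0.90, 0.50, 0.10, 0.00]
demo_resolution = resolution_at_threshold(demo_freqs, demo_fsc)
print(demo_resolution)
```

Linear interpolation is the simplest choice; a real analysis may smooth the curve or apply mask corrections first.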
Links
| EMDB pages | EMDB (EBI/PDBe) / EMDataResource |
|---|
Map
| File | emd_37219.map.gz / Format: CCP4 / Size: 125 MB / Type: IMAGE STORED AS FLOATING POINT NUMBER (4 BYTES) |
|---|---|
| Projections & slices | Images are generated by Spider. |
| Voxel size | X=Y=Z: 0.74 Å |
| Symmetry | Space group: 1 |
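The map metadata allow a quick consistency check. The Python sketch below estimates the box edge from the uncompressed file size and the 4-byte voxel type, then derives the physical extent and the Nyquist limit from the 0.74 Å voxel size. It assumes a cubic box, that "125 MB" means 125 MiB, and that the CCP4 header is negligible; the entry states none of these.

```python
BYTES_PER_VOXEL = 4                # "floating point number (4 bytes)"
VOXEL_SIZE_A = 0.74                # Å, from the Voxel size row
FILE_SIZE_BYTES = 125 * 1024**2    # assumption: "125 MB" is 125 MiB

# Number of voxels, ignoring the small CCP4 header (assumption).
n_voxels = FILE_SIZE_BYTES // BYTES_PER_VOXEL

# Edge length in voxels, assuming a cubic box.
box_edge = round(n_voxels ** (1 / 3))

# Physical edge length and the finest resolvable spacing (two voxels).
box_extent_a = box_edge * VOXEL_SIZE_A
nyquist_a = 2 * VOXEL_SIZE_A

print(box_edge, round(box_extent_a, 1), round(nyquist_a, 2))
```

Under these assumptions the box is 320 voxels (about 236.8 Å) per edge, and the 1.48 Å Nyquist limit is finer than the reported 2.9 Å resolution.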
-Supplemental data
-Mask #1
| File | emd_37219_msk_1.map |
|---|---|
Sample components
-Entire : Structure of AmCas12a with crRNA
| Entire | Name: Structure of AmCas12a with crRNA |
|---|---|
-Supramolecule #1: Structure of AmCas12a with crRNA
| Supramolecule | Name: Structure of AmCas12a with crRNA / type: complex / ID: 1 / Parent: 0 / Macromolecule list: #1-#2 |
|---|---|
| Source (natural) | Organism: Anaeroglobus (bacteria) |
-Macromolecule #1: CRISPR-associated endonuclease Cas12a
| Macromolecule | Name: CRISPR-associated endonuclease Cas12a / type: protein_or_peptide / ID: 1 / Number of copies: 1 / Enantiomer: LEVO |
|---|---|
| Source (natural) | Organism: Anaeroglobus (bacteria) |
| Molecular weight | Theoretical: 157.312344 kDa |
| Recombinant expression | Organism: |
| Sequence | String: MGPKKKRKVA AADYKDDDDK SRLEPGEKPY KCPECGKSFS QSGALTRHQR THTRMRTMVT FENFTKQYQV SKTLRFELIP QGKTLENMK RDGIISVDRQ RNDDYQKAKG ILDKLYKYIL DSTMETAVID WEELAIAIEE FRKSKDKKTY EKVQSKVRTA L LEHVKKQK VGTEDLFKGM FSSKIITGEV LAAFPEIRLS DEENLILEKF KDFTTYFTGF FENRKNVFTD EALSTSFTYR LV NDNFIKF FDNCTVMKNV VNISPDMAKS LETCVSDLGI FPGVSLEEVF SVSFYNRLLT QTGIDQFNQL LGGISGKEGE YKK QGLNEI INLAMQQSPE VKEVLKNKAH RFTPLFKQIL SDRSTMSFIP DAFADDDEVL SAVDAYRKYL LEKNIGDRAF QLIS DIEEY SPELMRIGGK YVSVLSQLLF NSWSEIRDGV KAYKESLITG KKTKKELENI DKGIKYGVTL QEIKEALPKK DIYEE VKKY AMSVVKDYHA GLAEPLPEKI ETDDERASIK HIMDSMLGLY RFLEYFSHDS IEDTDPVFGE CLDTILDDMN ETVPLY NKV RNFSTRKVYS TEKFKLNFNN SSLANGWDKN KEQANGAVLL KKAGEYFLGI FNSKNKPKLV SDGGGGTGYE KMIYKQF PD FKKMLPKCTI SRKETKAHFQ KSDEDFTLLT DKFEKSLVIT KKIYDLGTQT VNGKKKFQVD YPRLTGDMEG YRAALKEW I DFGKKFIQAY ASTAIYDTSL FRNSSDYPDL PSFYKDVDNI CYKLTFECIP DAVINDCIDD GSLYLFKLHN KDFSAGSIG KPNLHTLYWK AIFEEENLSD VVVKLNGQAE LFYRPKSLTR PVVHEAGEVI INKTTSTGLP VPDDVYVELS KFVRNGKKGN LTDKAKNWL DKVTVRKTPH AIIKDRRFTV DKFFFHVPIT LNYKADSSPY RFNDFVRQYV KDCSDVNIIG IDRGERNLIY A VVIDGKGN IIEQRSFNTV GTYNYQEKLE QKEKERQTAR QDWATVTKIK DLKKGYLSAV VHELSKMIVK YKAIVVLENL NV GFKRMRG GIAERSVYQQ FEKALIDKLN YLVFKDEEQS GYGGVLNAYQ LTDKFESFSK MGQQTGFLFY VPAAYTSKID PLT GFINPF SWKHVKNRED RRNFLNLFSK LYYDVNTHDF VLAYHHSNKD SKYTIKGNWE IADWDILIQE NKEVFGKTGT PYCV GKRIV YMDDSTTGHN RMCAYYPHTE LKKLLSEYGI EYTSGQDLLK IIQEFDDDKL VKGLFYIIKA ALQMRNSNSE TGEDY ISSP IEGRPGICFD SRAEADTLPH NADANGAFHI AMKGLLLTER IRNDDKLAIS NEEWLNYIQE MRGASLVPRG SHHHHH HHH HH |
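The sequence string is split into blocks by irregular whitespace, so it needs normalizing before any analysis. The Python sketch below strips the whitespace and searches for a few well-known engineered motifs: an SV40-type nuclear localization signal, a FLAG epitope, a thrombin cleavage site and a polyhistidine run. The motif table and function names are hypothetical helpers. The entry itself does not annotate these features, so the labels are an interpretation rather than part of the record. Only the first and last blocks of the sequence are used.

```python
import re

# Standard tag sequences; the entry itself does not annotate them (assumption).
KNOWN_MOTIFS = {
    "SV40-type NLS": "PKKKRKV",
    "FLAG epitope": "DYKDDDDK",
    "thrombin site": "LVPRGS",
}


def normalize(seq: str) -> str:
    """Remove all whitespace and upper-case the one-letter sequence."""
    return re.sub(r"\s+", "", seq).upper()


def find_motifs(seq: str) -> dict:
    """Map motif name to the 0-based index of its first occurrence."""
    clean = normalize(seq)
    hits = {}
    for name, motif in KNOWN_MOTIFS.items():
        pos = clean.find(motif)
        if pos != -1:
            hits[name] = pos
    his_run = re.search(r"H{6,}", clean)  # six or more consecutive histidines
    if his_run:
        hits["polyhistidine"] = his_run.start()
    return hits


# First and last blocks of the sequence string in the entry.
n_term = "MGPKKKRKVA AADYKDDDDK"
c_term = "MRGASLVPRG SHHHHH HHH HH"
n_hits = find_motifs(n_term)
c_hits = find_motifs(c_term)
print(n_hits, c_hits)
```

Indices are relative to each excerpt; running `find_motifs` on the full string would give positions in the complete construct.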
-Macromolecule #2: RNA (44-MER)
| Macromolecule | Name: RNA (44-MER) / type: rna / ID: 2 / Number of copies: 1 |
|---|---|
| Source (natural) | Organism: Anaeroglobus (bacteria) |
| Molecular weight | Theoretical: 14.125415 kDa |
| Sequence | String: UAAUUUCUAC UAAGUGUAGA UGACAUAUGG AAAACGAACU AUGU |
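As a small check that the sequence string matches the "44-MER" name, the Python sketch below strips the block spacing and counts the length and base composition. The variable names are illustrative only.

```python
from collections import Counter

# Sequence string as given in the entry, including block spacing.
CRRNA_RAW = "UAAUUUCUAC UAAGUGUAGA UGACAUAUGG AAAACGAACU AUGU"

crrna = "".join(CRRNA_RAW.split())   # drop the spaces between blocks
length = len(crrna)
composition = Counter(crrna)

print(length, dict(composition))
```

The string is 44 nucleotides long, matching the macromolecule name, and contains only A, C, G and U.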
-Macromolecule #3: MAGNESIUM ION
| Macromolecule | Name: MAGNESIUM ION / type: ligand / ID: 3 / Number of copies: 2 / Formula: MG |
|---|---|
| Molecular weight | Theoretical: 24.305 Da |
-Experimental details
-Structure determination
| Method | cryo EM |
|---|---|
| Processing | single particle reconstruction |
| Aggregation state | particle |
Sample preparation
| Buffer | pH: 7.5 |
|---|---|
| Grid | Material: NICKEL/TITANIUM / Mesh: 300 / Pretreatment - Type: GLOW DISCHARGE / Pretreatment - Time: 30 sec. |
| Vitrification | Cryogen name: ETHANE / Chamber humidity: 100 % |
Electron microscopy
| Microscope | FEI TITAN KRIOS |
|---|---|
| Image recording | Film or detector model: FEI FALCON IV (4k x 4k) / Average electron dose: 46.63 e/Å² |
| Electron beam | Acceleration voltage: 300 kV / Electron source: FIELD EMISSION GUN |
| Electron optics | Illumination mode: SPOT SCAN / Imaging mode: BRIGHT FIELD / Nominal defocus max: 1.8 µm / Nominal defocus min: 0.6 µm |
| Experimental equipment | Model: Titan Krios |
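The acceleration voltage fixes the electron wavelength, which enters contrast transfer function calculations along with the defocus range. The Python sketch below evaluates the standard relativistic de Broglie expression, λ = h / sqrt(2·m·e·V·(1 + e·V / (2·m·c²))), for 300 kV. The function name is a hypothetical helper and the constants are CODATA values; nothing here comes from the entry beyond the voltage.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
M_E = 9.1093837015e-31    # electron rest mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
C = 299792458.0           # speed of light, m/s


def electron_wavelength_angstrom(voltage_v: float) -> float:
    """Relativistic electron wavelength in Å for an accelerating voltage in volts."""
    energy = Q_E * voltage_v  # kinetic energy in joules
    momentum = math.sqrt(2 * M_E * energy * (1 + energy / (2 * M_E * C**2)))
    return H / momentum * 1e10  # metres to Å


wavelength_300kv = electron_wavelength_angstrom(300e3)
print(round(wavelength_300kv, 4))
```

At 300 kV this gives roughly 0.0197 Å, far below the 0.74 Å voxel size.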
Image processing
-Atomic model building 1
| Initial model | Chain - Source name: AlphaFold / Chain - Initial model type: in silico model |
|---|---|
| Output model | PDB-8kgf |