Journal: J Struct Biol X / Year: 2025
Title: Protein identification using Cryo-EM and artificial intelligence guides improved sample purification.
Authors: Kenneth D Carr / Dane Evan D Zambrano / Connor Weidle / Alex Goodson / Helen E Eisenach / Harley Pyles / Alexis Courbet / Neil P King / Andrew J Borst
Abstract: Protein purification is essential in protein biochemistry, structural biology, and protein design, enabling the determination of protein structures, the study of biological mechanisms, and the characterization of both natural and de novo designed proteins. However, standard purification strategies often encounter challenges, such as unintended co-purification of contaminants alongside the target protein. This issue is particularly problematic for self-assembling protein nanomaterials, where unexpected geometries may reflect novel assembly states, cross-contamination, or native proteins originating from the expression host. Here, we used an automated structure-to-sequence pipeline to first identify an unknown co-purifying protein found in several purified designed protein samples. By integrating cryo-electron microscopy (Cryo-EM), ModelAngelo's sequence-agnostic model-building, and Protein BLAST, we identified the contaminant as dihydrolipoamide succinyltransferase (DLST). This identification was validated through comparisons with DLST structures in the Protein Data Bank, AlphaFold 3 predictions based on the DLST sequence from our E. coli expression vector, and traditional biochemical methods. The identification informed subsequent modifications to our purification protocol, which successfully excluded DLST from future preparations.
To explore the potential broader utility of this approach, we benchmarked four computational methods for DLST identification across varying resolution ranges. This study demonstrates the successful application of a structure-to-sequence protein identification workflow, integrating Cryo-EM, ModelAngelo, Protein BLAST, and AlphaFold 3 predictions, to identify and ultimately help guide the removal of DLST from sample purification efforts. It highlights the potential of combining Cryo-EM with AI-driven tools for accurate protein identification and addressing purification challenges across diverse contexts in protein science.
Name: Dihydrolipoamide Succinyltransferase / type: complex / ID: 1 / Parent: 0 / Macromolecule list: #1
Details: This protein was observed as a contaminant in a sample of a two-component nanoparticle assembly.
Source (natural)
Organism: Escherichia coli (E. coli) / Strain: BL21(DE3)
Molecular weight
Theoretical: 1.056 MDa
Macromolecule #1: Dihydrolipoyllysine-residue succinyltransferase component of 2-ox...
Macromolecule
Name: Dihydrolipoyllysine-residue succinyltransferase component of 2-oxoglutarate dehydrogenase complex / type: protein_or_peptide / ID: 1 / Number of copies: 24 / Enantiomer: LEVO / EC number: dihydrolipoyllysine-residue succinyltransferase
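As a quick consistency check on the figures above, the theoretical mass of 1.056 MDa and the 24 copies together imply a per-chain mass of about 44 kDa. The short Python sketch below shows that arithmetic. It assumes the theoretical mass is made up only of the 24 copies of macromolecule #1 (the only macromolecule listed); the function and variable names are illustrative and not part of the deposition.

```python
def mass_per_copy_kda(total_mda: float, copies: int) -> float:
    """Split an assembly mass given in MDa evenly across copies; result in kDa."""
    return total_mda * 1000.0 / copies

# Values taken from the entry: 1.056 MDa theoretical mass, 24 copies.
per_chain = mass_per_copy_kda(1.056, 24)
print(f"{per_chain:.1f} kDa per chain")
```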
Model: Quantifoil R2/2 / Material: COPPER / Mesh: 300 / Support film - Material: CARBON / Support film - topology: HOLEY / Support film - Film thickness: 40 / Pretreatment - Type: GLOW DISCHARGE
Vitrification
Cryogen name: ETHANE / Chamber humidity: 100 % / Chamber temperature: 295.15 K / Instrument: FEI VITROBOT MARK IV
Details: Wait time: 7.5 seconds / Blot time: 0.5 seconds / Blot force: 0.
Details
This sample was heterogeneous and contained both DLST and the designed nanoparticle assembly.
Electron microscopy
Microscope
TFS KRIOS
Image recording
Film or detector model: GATAN K3 BIOQUANTUM (6k x 4k) / Digitization - Dimensions - Width: 5760 pixel / Digitization - Dimensions - Height: 4092 pixel / Number grids imaged: 1 / Number real images: 4264 / Average exposure time: 5.0 sec. / Average electron dose: 47.0 e/Å²
Details: 6211 movies were collected; the best 4264 were used for particle picking and further processing.
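Two derived quantities follow from the recording parameters above: the fraction of collected movies that was kept, and an approximate dose rate. The Python sketch below is only an illustration. It assumes the average dose and the average exposure time both describe the same single exposure, which the entry does not state explicitly, and all names are hypothetical.

```python
# Values copied from the image-recording record above.
movies_collected = 6211
movies_used = 4264
avg_dose_e_per_A2 = 47.0   # electrons per square angstrom per exposure
avg_exposure_s = 5.0       # seconds per exposure

# Fraction of collected movies kept for particle picking.
retained_fraction = movies_used / movies_collected

# Approximate dose rate, assuming dose and exposure time refer to the same exposure.
dose_rate = avg_dose_e_per_A2 / avg_exposure_s

print(f"Movies retained: {retained_fraction:.1%}")
print(f"Dose rate: {dose_rate:.1f} e/A^2/s")
```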
Electron beam
Acceleration voltage: 300 kV / Electron source: FIELD EMISSION GUN
Chain - Source name: Other / Chain - Initial model type: in silico model / Details: ModelAngelo
Details
The final model was built into the density using the UniProt sequence in ModelAngelo. The model was further refined against the density using ISOLDE in ChimeraX, Coot, and Phenix. Waters were built for one chain, and that chain and its water network were then replicated in ChimeraX using symmetry.
Refinement
Space: REAL / Protocol: OTHER / Overall B value: 69.9 / Target criteria: Cross-correlation coefficient
Output model
PDB-9dz8: Catalytic domain of Dihydrolipoamide Succinyltransferase