
You can download PDB data from the PDBj FTP site over ftp or rsync. The site also provides PDBMLadd files, which PDBj generates by adding information to the original PDBML files, and RDF files. See also the PDBj download page.

Overview of the directory structure

The following table lists the main root paths, their contents and their URLs.

| Contents | FTP URL |
| --- | --- |
| Document root | ftp://ftp.pdbj.org/ |
| PDB Archive root (common in each PDB site) | ftp://ftp.pdbj.org/pub/pdb/data |
| Derived data (common in each PDB site) | ftp://ftp.pdbj.org/pub/pdb/derived_data |
| Documents such as newsletter (common in each PDB site) | ftp://ftp.pdbj.org/pub/pdb/doc |
| EMDB Archive root (common in each PDB site) | ftp://ftp.pdbj.org/pub/emdb |
| Validation reports (common in each PDB site) | ftp://ftp.pdbj.org/pub/pdb/validation_reports |

Details

[RDF]

pdb

This directory contains PDB data in RDF format, derived from PDBx/mmCIF format files. See also the wwPDB/RDF page.
e.g. PDBID 100d
ftp://ftp.pdbj.org/RDF/pdb/100d.rdf.gz

chem_comp

This directory contains an RDF format file for each chemical group, derived from its PDBx/mmCIF format file. See also the wwPDB/RDF page.
e.g. ATP
ftp://ftp.pdbj.org/RDF/chem_comp/ATP.rdf.gz

sifts

This directory includes RDF format file of SIFTS data.

  • pdb_chain_cath_uniprot.ttl.gz
  • pdb_chain_enzyme.ttl.gz
  • pdb_chain_go.ttl.gz
  • pdb_chain_interpro.ttl.gz
  • pdb_chain_pfam.ttl.gz
  • pdb_chain_scop_uniprot.ttl.gz
  • pdb_chain_taxonomy.ttl.gz
  • pdb_chain_uniprot.ttl.gz
  • pdb_pubmed.ttl.gz
  • uniprot_pdb.ttl.gz

[XML]

This directory contains the original PDBML files and the PDBMLadd files, which PDBj generates by adding information to the original PDBML files.

all

This directory contains PDBML format files with all data. Its contents are identical to those in pub/pdb/data/structures/all/XML
e.g. PDBID 100d
ftp://ftp.pdbj.org/XML/all/100d.xml.gz

all-extatom

This directory contains PDBML format files with atom coordinate data only. Its contents are identical to those in pub/pdb/data/structures/all/XML-extatom
e.g. PDBID 100d
ftp://ftp.pdbj.org/XML/all-extatom/100d-extatom.xml.gz

all-noatom

This directory contains PDBML format files with all data except atom coordinates. Its contents are identical to those in pub/pdb/data/structures/all/XML-noatom
e.g. PDBID 100d
ftp://ftp.pdbj.org/XML/all-noatom/100d-noatom.xml.gz

pdbmlplus

This directory contains PDBMLadd format files, which PDBj generates by adding information to the original PDBML files. It also contains the PDBMLadd schema file (ftp://ftp.pdbj.org/XML/pdbmlplus/PDBMLadd_v40.xsd).
e.g. PDBID 100d
ftp://ftp.pdbj.org/XML/pdbmlplus/pdbml_add/100d-add.xml.gz (compressed)
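
The naming rules of the four XML subdirectories can be summarized in a small helper. This is only a sketch built from the 100d examples above: the function name `pdbml_url` is hypothetical, and it assumes that every entry follows the same lower-case naming pattern as the examples.

```python
# Base URL of the [XML] directory, as given in the examples above.
XML_BASE = "ftp://ftp.pdbj.org/XML"

# Subdirectory -> file-name suffix, copied from the 100d examples.
_SUFFIX = {
    "all": "",
    "all-extatom": "-extatom",
    "all-noatom": "-noatom",
}


def pdbml_url(pdb_id: str, variant: str = "all") -> str:
    """Build the URL of a PDBML or PDBMLadd file (hypothetical helper)."""
    pdb_id = pdb_id.lower()  # assumption: IDs are lower case, as in the examples
    if variant == "pdbmlplus":
        # PDBMLadd files sit one level deeper, in pdbml_add/.
        return f"{XML_BASE}/pdbmlplus/pdbml_add/{pdb_id}-add.xml.gz"
    if variant not in _SUFFIX:
        raise ValueError(f"unknown XML variant: {variant!r}")
    return f"{XML_BASE}/{variant}/{pdb_id}{_SUFFIX[variant]}.xml.gz"
```

For example, `pdbml_url("100d", "all-noatom")` reproduces the all-noatom example URL shown above.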

[chem_comp]

mmcif

mmcif/[0-9A-Z]{1,3}.cif.gz

This directory contains an mmCIF format file for each chemical component entry.

e.g. ATP
ftp://ftp.pdbj.org/chem_comp/mmcif/ATP.cif.gz

PDBML

PDBML/[0-9A-Z]{1,3}.xml.gz

This directory contains a PDBML (XML) format file for each chemical component entry, derived from its PDBx/mmCIF format file.

e.g. ATP
ftp://ftp.pdbj.org/chem_comp/PDBML/ATP.xml.gz

RDF

RDF/[0-9A-Z]{1,3}.rdf.gz

This directory contains an RDF format file for each chemical component entry, derived from its PDBx/mmCIF format file.

e.g. ATP
ftp://ftp.pdbj.org/chem_comp/RDF/ATP.rdf.gz
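
As an illustration of the three chem_comp layouts, the sketch below validates a component ID against the `[0-9A-Z]{1,3}` pattern and builds the matching URL. The name `chem_comp_url` is hypothetical, and upper-casing the ID is an assumption based on the ATP examples.

```python
import re

# Component IDs: one to three digits or upper-case letters, per the patterns above.
_COMP_ID = re.compile(r"[0-9A-Z]{1,3}")

# Subdirectory -> file extension, taken from the ATP examples.
_EXTENSION = {"mmcif": "cif", "PDBML": "xml", "RDF": "rdf"}


def chem_comp_url(comp_id: str, fmt: str = "mmcif") -> str:
    """Build the URL of a chemical component file (hypothetical helper)."""
    comp_id = comp_id.upper()  # assumption: file names use upper-case IDs
    if not _COMP_ID.fullmatch(comp_id):
        raise ValueError(f"not a valid component ID: {comp_id!r}")
    if fmt not in _EXTENSION:
        raise ValueError(f"unknown chem_comp format: {fmt!r}")
    return f"ftp://ftp.pdbj.org/chem_comp/{fmt}/{comp_id}.{_EXTENSION[fmt]}.gz"
```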

[doc]

newsletters/en

PDF files of the English edition of the News Letter published by PDBj. They are also available over HTTP from the News Letter web page.

newsletters/jp

PDF files of the Japanese edition of the News Letter published by PDBj. They are also available over HTTP from the News Letter web page.

[edmap]

This directory contains the entry list and the CCP4 format files of the EDMap service.

  • edmap.list: all entries that have an electron density map
  • ccp4/([0-9a-z]{2})/[1-9]{1}[0-9a-z]{3}.ccp4.gz: electron density map data in CCP4 format
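
The ccp4 path pattern above can be used to check or take apart a path before requesting it. The following is a minimal sketch: the function name `parse_edmap_path` is hypothetical, and the dots in the pattern are assumed to be literal dots.

```python
import re
from typing import Optional, Tuple

# Same pattern as in the list above, with the dots escaped (assumed literal).
_EDMAP_CCP4 = re.compile(r"ccp4/([0-9a-z]{2})/([1-9][0-9a-z]{3})\.ccp4\.gz")


def parse_edmap_path(path: str) -> Optional[Tuple[str, str]]:
    """Return (subdirectory, PDB ID) if the path fits the pattern, else None."""
    match = _EDMAP_CCP4.fullmatch(path)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A path such as `ccp4/00/100d.ccp4.gz` fits the pattern syntactically; whether a map exists for a given entry has to be checked against edmap.list.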

[emnavi]

Data used by EM Navigator.

[family]

[mine]

This directory is obsolete. Please use the "mine2" directory described below.

[mine2]

Data related to PDBj Mine, such as database dump files.

  • mine2.dump: The dump file containing the whole PDBj Mine database.
  • split/mine2.{aa,ab...}: The PDBj Mine database dump split into files of 100 MB each.
  • weekly/mine2_update_yyyymmdd.sql.gz: Incremental update of the PDBj Mine database for yyyy/mm/dd.
  • sifts: This directory contains SQL and shell scripts for integrating PDBe's SIFTS into the PDBj Mine2 database.
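
If the split files are downloaded instead of the single dump, they have to be joined again. The sketch below assumes that the parts are plain byte slices of mine2.dump and that concatenating them in alphabetical order of their suffix (aa, ab, ...) restores the original; neither point is confirmed by this page. The function name `join_split_dump` is hypothetical.

```python
import re
import shutil
from pathlib import Path
from typing import List

# Matches part names such as mine2.aa, mine2.ab (two-letter suffix assumed).
_PART_NAME = re.compile(r"mine2\.[a-z]{2}")


def join_split_dump(split_dir, output) -> List[str]:
    """Concatenate mine2.aa, mine2.ab, ... into one file; return the part names used."""
    parts = sorted(
        p for p in Path(split_dir).iterdir() if _PART_NAME.fullmatch(p.name)
    )
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream, so large parts are not held in memory
    return [p.name for p in parts]
```

Write the output outside the split directory so that it cannot be mistaken for a part.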

[mmcif]

This directory contains PDBx/mmCIF format files. Its contents are identical to those in pub/pdb/data/structures/all/mmCIF
e.g. PDBID 100d
ftp://ftp.pdbj.org/mmcif/100d.cif.gz

[model]

The main contents of the current PDB include only experimentally determined structures; theoretical model structures are stored in this directory.
e.g. PDBID 163d
ftp://ftp.pdbj.org/model/pdb163d.ent.gz

[nmr]

NMR restraint files, equal to the contents in pub/pdb/data/structures/all/nmr_restraints/ .
e.g. PDBID 108d
ftp://ftp.pdbj.org/nmr/108d.mr.gz

[nmr_chemical_shifts]

Chemical shift information files of structures solved by NMR method, equal to the contents in pub/pdb/data/structures/all/nmr_chemical_shifts/ .
e.g. PDBID 2l7e
ftp://ftp.pdbj.org/nmr_chemical_shifts/2l7e_cs.str.gz

[nmr_v2]

NMR restraint version 2 files, equal to the contents in pub/pdb/data/structures/all/nmr_restraints_v2/ .
e.g. PDBID 108d
ftp://ftp.pdbj.org/nmr_v2/108d_mr.str.gz

[pdb]

This directory includes gzip compressed PDB format files, equal to the contents in pub/pdb/data/structures/all/pdb .
e.g. PDBID 100d
ftp://ftp.pdbj.org/pdb/pdb100d.ent.gz

[pdb_h]

This directory includes gzip compressed PDB format files without atom coordinate information.
e.g. PDBID 100d
ftp://ftp.pdbj.org/pdb_h/pdb100d.ent.gz

[pdb_nc]

This directory contains uncompressed PDB format files.
e.g. PDBID 100d
ftp://ftp.pdbj.org/pdb_nc/pdb100d.ent

[prd]

[prdcc]

[structure_factors]

Structure factor files, equal to the contents in pub/pdb/data/structures/all/structure_factors/ .
e.g. PDBID 100d
ftp://ftp.pdbj.org/structure_factors/r100dsf.ent.gz
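
The per-entry directories from [mmcif] to [structure_factors] each use their own file-name pattern. As a rough sketch, the patterns can be collected in one lookup table. The templates are copied from the examples above; the function name `entry_url` is hypothetical, and it assumes that every entry follows the pattern of the example shown for its directory.

```python
# Directory -> file-name template, copied from the "e.g." lines above.
_TEMPLATES = {
    "mmcif": "{id}.cif.gz",
    "model": "pdb{id}.ent.gz",
    "nmr": "{id}.mr.gz",
    "nmr_chemical_shifts": "{id}_cs.str.gz",
    "nmr_v2": "{id}_mr.str.gz",
    "pdb": "pdb{id}.ent.gz",
    "pdb_h": "pdb{id}.ent.gz",
    "pdb_nc": "pdb{id}.ent",  # the only uncompressed variant
    "structure_factors": "r{id}sf.ent.gz",
}


def entry_url(directory: str, pdb_id: str) -> str:
    """Build the URL of one entry file in a top-level directory (hypothetical helper)."""
    if directory not in _TEMPLATES:
        raise ValueError(f"unknown directory: {directory!r}")
    file_name = _TEMPLATES[directory].format(id=pdb_id.lower())
    return f"ftp://ftp.pdbj.org/{directory}/{file_name}"
```

Not every entry has a file in every directory; for instance, NMR files exist only for entries solved by NMR.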

[pub/emdb]

This directory is the root of the EMDB Archive. Its contents are common to every PDB site (wwPDB, RCSB PDB, PDBe and PDBj).

[pub/pdb]

This directory is the root of the PDB Archive. Its contents are common to every PDB site (wwPDB, RCSB PDB, PDBe and PDBj).

data

biounit

monomers

status

structures

Structural data such as atom coordinates.

structures/all

This directory contains all types of PDB data. The files of each format are stored together in a single directory. Each file is gzip compressed.

structures/divided

This directory contains the same PDB data as the "all" directory. The files of each format are stored in subdirectories named after the middle two characters of the PDB ID. Each file is gzip compressed.
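
For example, entry 100d falls into subdirectory "00" and entry 1abc into "ab". A minimal sketch of that rule follows; the function name `divided_path` is hypothetical, and the format directory and file name are passed in by the caller because this page does not list them for the divided tree.

```python
def divided_path(pdb_id: str, format_dir: str, file_name: str) -> str:
    """Return the path of a file under structures/divided (hypothetical helper)."""
    pdb_id = pdb_id.lower()
    if len(pdb_id) != 4:
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    middle = pdb_id[1:3]  # the middle two characters of the ID
    return f"pub/pdb/data/structures/divided/{format_dir}/{middle}/{file_name}"
```

Assuming the divided tree reuses the directory and file names of structures/all, `divided_path("100d", "mmCIF", "100d.cif.gz")` gives `pub/pdb/data/structures/divided/mmCIF/00/100d.cif.gz`.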

structures/models

structures/obsolete

Obsolete entry data.

compatible/pdb_bundle

Large structures (more than 62 chains and/or more than 99999 ATOM lines) cannot be represented in the legacy PDB file format, so their data are available in the PDB archive as single PDBx/mmCIF files representing the entire structure.
This directory contains TAR files (pdb-bundle files), each holding a collection of best effort/minimal files, related files and a chain ID mapping file.
Files are organized in directories following the 2-character hash code format.
For example, one TAR file for entry 1ABC will be found in directory /pdb/compatible/pdb_bundle/ab/1abc/.
The TAR file is compressed using gzip compression.

e.g. 1abc-pdb-bundle.tar.gz
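
The size limits and the bundle location described above can be expressed as follows. This is a sketch only: both function names are hypothetical, and the path simply combines the directory rule and the file-name example given for entry 1ABC.

```python
MAX_CHAINS = 62          # limit stated above for the legacy PDB format
MAX_ATOM_LINES = 99999   # limit stated above for the legacy PDB format


def exceeds_legacy_limits(n_chains: int, n_atom_lines: int) -> bool:
    """True if a structure is too large for a single legacy PDB format file."""
    return n_chains > MAX_CHAINS or n_atom_lines > MAX_ATOM_LINES


def pdb_bundle_path(pdb_id: str) -> str:
    """Return the path of the pdb-bundle TAR file of an entry (hypothetical helper)."""
    pdb_id = pdb_id.lower()
    if len(pdb_id) != 4:
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    hash_code = pdb_id[1:3]  # 2-character hash code: the middle two characters
    return f"/pdb/compatible/pdb_bundle/{hash_code}/{pdb_id}/{pdb_id}-pdb-bundle.tar.gz"
```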

derived_data

doc

software

validation_reports

This folder includes validation reports for all X-ray crystal structures. Files are organized in directories following the 2-character hash code format.
For example, files for entry 1ABC will be found in directory /pdb/validation_reports/ab/1abc/.
All files are compressed using gzip compression.

Five files will be included in the directory for each entry:

  • Standard wwPDB validation report as a PDF file (e.g., 1abc_validation.pdf.gz)
  • Validation report containing all outliers as a PDF file (e.g., 1abc_full_validation.pdf.gz)
  • Summary quality graphic as a PNG image (e.g., 1abc_multipercentile_validation.png.gz)
  • Summary quality graphic as an SVG image (e.g., 1abc_multipercentile_validation.svg.gz)
  • Data file containing all diagnostics in machine-readable XML format (e.g., 1abc_validation.xml.gz)
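
The five file names listed above follow directly from the entry ID. The sketch below generates them for any entry; the function name `validation_report_paths` is hypothetical, and the suffixes are copied from the 1abc examples.

```python
from typing import List

# File-name suffixes of the five report files, in the order listed above.
_REPORT_SUFFIXES = (
    "_validation.pdf.gz",
    "_full_validation.pdf.gz",
    "_multipercentile_validation.png.gz",
    "_multipercentile_validation.svg.gz",
    "_validation.xml.gz",
)


def validation_report_paths(pdb_id: str) -> List[str]:
    """Return the five validation report paths of an entry (hypothetical helper)."""
    pdb_id = pdb_id.lower()
    if len(pdb_id) != 4:
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    directory = f"/pdb/validation_reports/{pdb_id[1:3]}/{pdb_id}"
    return [f"{directory}/{pdb_id}{suffix}" for suffix in _REPORT_SUFFIXES]
```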

status

This folder contains plain text files listing the PDB IDs of added, updated and obsoleted entries.

obsolete.dat

This file lists the PDB IDs of all obsoleted entries in a fixed-width format similar to the PDB format.

 LIST OF OBSOLETE COORDINATE ENTRIES AND SUCCESSORS
OBSLTE    31-JUL-94 116L     216L
OBSLTE    15-APR-98 125D     1AW6
OBSLTE    20-SEP-99 14PS     1QJB
OBSLTE    30-OCT-78 151C     251C
OBSLTE    15-JAN-91 156B     256B
OBSLTE    08-JUL-08 179L     
OBSLTE    07-DEC-04 1A0V     1Y46
OBSLTE    07-DEC-04 1A0W     1Y4F
OBSLTE    07-DEC-04 1A0X     1Y4G
...

From the second line onward, each record is laid out as follows:

| Columns | Comments | Example |
| --- | --- | --- |
| 1-6 | Data type identifier. Always "OBSLTE" | OBSLTE |
| 11-19 | Date of the update, as [day in 2 digits]-[month name in three letters]-[last 2 digits of the year] | 31-JUL-94 |
| 21-24 | PDB ID of the replaced/obsoleted entry | 116L |
| 30-33 | PDB ID of the superseding entry. Blank if the entry was obsoleted without a successor. | 216L |
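
The column layout above maps directly onto string slices. The following is a minimal parsing sketch, assuming the columns are 1-based and inclusive as in the PDB format; the names `ObsoleteRecord` and `parse_obsolete` are hypothetical, and the date is kept as raw text rather than guessing a century for the 2-digit year.

```python
from typing import List, NamedTuple, Optional


class ObsoleteRecord(NamedTuple):
    date: str                    # e.g. "31-JUL-94", left unparsed
    obsoleted_id: str            # columns 21-24
    successor_id: Optional[str]  # columns 30-33, None if blank


def parse_obsolete(text: str) -> List[ObsoleteRecord]:
    """Parse the contents of obsolete.dat into records (hypothetical helper)."""
    records = []
    for line in text.splitlines():
        if line[0:6] != "OBSLTE":  # skips the title line and anything unexpected
            continue
        successor = line[29:33].strip()
        records.append(
            ObsoleteRecord(
                date=line[10:19],
                obsoleted_id=line[20:24],
                successor_id=successor or None,
            )
        )
    return records
```

Slicing past the end of a short line returns an empty string, so records without a successor are handled even if trailing spaces were stripped.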

[yyyymmdd] directory

This folder contains text files listing the PDB IDs that were added, updated or obsoleted on yyyy/mm/dd (yyyy is the 4-digit year, mm the 2-digit month and dd the 2-digit day).

| File name | Comments |
| --- | --- |
| added.pdb | PDB IDs of newly added entries |
| added.sf | PDB IDs of newly added structure factor files |
| added.nmr | PDB IDs of newly added NMR distance restraint files |
| added.cs | PDB IDs of newly added chemical shift files |
| added.models | PDB IDs of newly added theoretical models * |
| modified.pdb | PDB IDs of updated entries |
| modified.sf | PDB IDs of updated structure factor files |
| modified.nmr | PDB IDs of updated NMR distance restraint files |
| modified.cs | PDB IDs of updated chemical shift files |
| modified.models | PDB IDs of updated theoretical models * |
| obsolete.pdb | PDB IDs of obsoleted/superseded entries |
| obsolete.sf | PDB IDs of obsoleted/superseded structure factor files |
| obsolete.nmr | PDB IDs of obsoleted/superseded NMR distance restraint files |
| obsolete.cs | PDB IDs of obsoleted/superseded chemical shift files |
| obsolete.models | PDB IDs of obsoleted/superseded theoretical models * |

* Theoretical models are no longer accepted, and all previously deposited models have already been moved out of the main PDB archive. All of them are listed here.
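
The fifteen file names in a dated folder are the combinations of three actions and five data types. As a sketch, they can be generated for a given day as below; the function name `status_files` is hypothetical, and the paths are relative to the status folder.

```python
from datetime import date
from typing import List

ACTIONS = ("added", "modified", "obsolete")
DATA_TYPES = ("pdb", "sf", "nmr", "cs", "models")


def status_files(day: date) -> List[str]:
    """Return the relative paths of the status files of one day (hypothetical helper)."""
    folder = day.strftime("%Y%m%d")  # yyyymmdd, as described above
    return [
        f"{folder}/{action}.{data_type}"
        for action in ACTIONS
        for data_type in DATA_TYPES
    ]
```

For instance, the first path returned for `date(2018, 11, 1)` is `20181101/added.pdb`.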

latest
A link to the directory of the latest update.
Created: 2013-03-18 (last edited: 2018-11-01)