How to download from the PDB/snapshot archive
PDB Archive
The PDB data sites are updated every Wednesday at 00:00 UTC / 09:00 JST.
(The PDB NextGen repositories are updated monthly, on the first Wednesday of each month at 00:00 UTC / 09:00 JST.)
The latest PDB raw data is available from the PDB Archive site.
- The available protocols are https, ftp, and rsync.
- PDBj provides two kinds of archive site: the data archive (data.pdbjlc1.pdbj.org), which includes the latest PDBj data, and the wwPDB archives (split into files.pdbj.org, files-versioned.pdbj.org, and files-nextgen.pdbj.org).
- The PDB archive provided by PDBj is a superset of the wwPDB data: it includes the original wwPDB data (core, versioned, and nextgen) as well as our own generated data and archives.
Archives | https | ftp | rsync
---|---|---|---
wwPDB core Archives@PDBj | https://files.pdbj.org/ | ftp://ftp.pdbj.org/ | rsync -az rsync.pdbj.org::
wwPDB Versioned Archive@PDBj | https://files-versioned.pdbj.org/ | ftp://ftp-versioned.pdbj.org/ | rsync -az rsync-versioned.pdbj.org::
wwPDB next generation Archive@PDBj | https://files-nextgen.pdbj.org/ | ftp://ftp-nextgen.pdbj.org/ | rsync -az rsync-nextgen.pdbj.org::
PDBj data Archive (including wwPDB Archives) | https://data.pdbjlc1.pdbj.org/ | ftp://data.pdbjlc1.pdbj.org/ | rsync -az data.pdbjlc1.pdbj.org::
See also "About the contents of the PDBj data site" page.
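For fetching a single entry over https, a small helper can build the file URL. The sketch below is an illustration only: the function name `pdb_entry_url` is hypothetical, and it assumes that the https site exposes the same `/pub/pdb/data/structures/divided/` tree that the rsync entry points refer to, with the usual wwPDB layout in which files are grouped by the two middle characters of the 4-character ID. Please check the resulting URL in a browser before relying on it.

```shell
# Build the HTTPS URL of one entry in the "divided" layout.
# Usage: pdb_entry_url <4-character PDB ID> [pdb|mmCIF|XML]
# Assumption: the https path mirrors the rsync module path /pub/pdb/data.
pdb_entry_url() {
    # Entry file names use lowercase IDs.
    id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    fmt="${2:-mmCIF}"
    # The subdirectory is named after characters 2-3 of the ID.
    mid=$(printf '%s' "$id" | cut -c2-3)
    base="https://data.pdbjlc1.pdbj.org/pub/pdb/data/structures/divided"
    case "$fmt" in
        pdb)   echo "$base/pdb/$mid/pdb$id.ent.gz" ;;
        mmCIF) echo "$base/mmCIF/$mid/$id.cif.gz" ;;
        XML)   echo "$base/XML/$mid/$id.xml.gz" ;;
        *)     echo "unknown format: $fmt" >&2; return 1 ;;
    esac
}

# Example (needs network access):
#   curl -fLO "$(pdb_entry_url 1abc)"
```

The ID `1abc` is only a placeholder; the function does not check that an entry with that ID exists.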
Using the FTP protocol
Connect to the PDBj FTP site (ftp://data.pdbjlc1.pdbj.org/) using your preferred FTP client in PASSIVE mode.
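If no graphical FTP client is at hand, a command-line tool such as curl can also be used. The following is a minimal sketch, not an official recipe: the function name `pdbj_ftp_list` and the `DRY_RUN` switch are hypothetical helpers introduced here. curl already uses passive mode by default; `--ftp-pasv` only makes that choice explicit. The default path `/pub/pdb/` is taken from the rsync entry point list on this page.

```shell
# List a directory on the PDBj FTP site in passive mode.
# Usage: pdbj_ftp_list [path]
# Set DRY_RUN=1 to print the curl command instead of running it.
pdbj_ftp_list() {
    path="${1:-/pub/pdb/}"
    # Assemble the command as positional parameters so it can be printed or run.
    set -- curl --silent --show-error --ftp-pasv --list-only \
        "ftp://data.pdbjlc1.pdbj.org${path}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example (needs network access):
#   pdbj_ftp_list /pub/pdb/data/
```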
Using the rsync protocol
This is useful for full/partial mirroring of the PDB archive. Several entry points are provided, which can be viewed using the following command:
$ rsync data.pdbjlc1.pdbj.org::
Welcome to PDBj FTP site.
ftp Top level of pdb data ftp tree aproximately 1.3T ( /pub/pdb )
ftp_data Data directory within ftp archive aproximately 652G ( /pub/pdb/data )
ftp_derived Derived data directory within ftp archive aproximately 467M ( /pub/pdb/derived_data )
ftp_doc Doc directory within ftp archive aproximately 391M ( /pub/pdb/doc )
emdb Top level of emdb data ftp tree aproximately 13T ( /pub/emdb )
ftp_versioned Top level of pdb_versioned data ftp tree aproximately 237G ( /pdb_versioned )
pdb_nextgen Top level of NextGen PDB archive tree approximately 85G ( /pdb_nextgen )
rsync Top level of PDBj FTP tree aproximately 21T ( / )
To download the entry files in PDB format the following rsync command can be used:
rsync -avz --delete data.pdbjlc1.pdbj.org::ftp_data/structures/divided/pdb/ ./pdb
To download the entry files in PDB exchange format (mmCIF) the following rsync command can be used:
rsync -avz --delete data.pdbjlc1.pdbj.org::ftp_data/structures/divided/mmCIF/ ./mmCIF
To download the entry files in PDBML format the following rsync command can be used:
rsync -avz --delete data.pdbjlc1.pdbj.org::ftp_data/structures/divided/XML/ ./XML
To download the EMDB archive, the following rsync command can be used:
rsync -avz --delete data.pdbjlc1.pdbj.org::emdb/ ./emdb
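Because `--delete` removes local files that are no longer present on the server, it can be worth previewing a run with rsync's `-n` (dry run) option first. The wrapper below is a sketch that reuses the sources from the commands above; the function names `pdbj_rsync_source` and `pdbj_mirror` are hypothetical and not part of any PDBj tool.

```shell
# Map a format name to the rsync source used in the commands above.
pdbj_rsync_source() {
    case "$1" in
        pdb|mmCIF|XML) echo "data.pdbjlc1.pdbj.org::ftp_data/structures/divided/$1/" ;;
        emdb)          echo "data.pdbjlc1.pdbj.org::emdb/" ;;
        *)             echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Mirror one format into ./<format>.
# Usage: pdbj_mirror <pdb|mmCIF|XML|emdb> [-n]
# Passing -n as the second argument asks rsync for a dry run:
# it lists what would be transferred or deleted without changing anything.
pdbj_mirror() {
    src=$(pdbj_rsync_source "$1") || return 1
    rsync -avz --delete ${2:+"$2"} "$src" "./$1"
}

# Example (needs network access):
#   pdbj_mirror mmCIF -n   # preview
#   pdbj_mirror mmCIF      # actual transfer
```

Note the sizes reported by the entry point list (for example, about 13T for emdb) before starting a full mirror.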
Snapshot Archive
The wwPDB provides a snapshot archive at the wwPDB Snapshots Archive (https://s3snapshots.rcsb.org/). It contains snapshots of the FTP archive taken just before the first release of each year or before any remediation. We provide a mirror at ftp://snapshots.pdbj.org.
Using the FTP protocol
Connect to ftp://snapshots.pdbj.org/ using your preferred FTP client in PASSIVE mode.
Using the rsync protocol
This is useful for full/partial mirroring of the snapshot archive. Several entry points are provided, which can be viewed using the following command:
$ rsync snapshots.pdbj.org::
all Entire Snapshots archive
20050106 January 6, 2005 snapshot
20060103 January 3, 2006 snapshot
20070102 January 2, 2007 snapshot
20070731 July 31, 2007 snapshot
20080107 January 7, 2008 snapshot
20090105 January 5, 2009 snapshot
20090316 March 16, 2009 snapshot
20100104 January 4, 2010 snapshot
20110103 January 3, 2011 snapshot
20110707 July 7, 2011 snapshot
20120102 January 2, 2012 snapshot
20130101 January 1, 2013 snapshot
20140102 January 2, 2014 snapshot
20141203 December 3, 2014 snapshot
20150102 January 2, 2015 snapshot
20160101 January 1, 2016 snapshot
20160302 March 2, 2016 snapshot (Validation Reports Only)
20170101 January 1, 2017 snapshot
20170308 March 8, 2017 snapshot (Validation Reports Only)
20170710 July 10, 2017 snapshot
20180101 January 1, 2018 snapshot
20180321 March 21, 2018 snapshot (Validation Reports Only)
20190101 January 1, 2019 snapshot
20200101 January 1, 2020 snapshot
20200610 June 10, 2020 snapshot (Validation Reports Only)
20200722 July 22, 2020 snapshot
20200916 September 16, 2020 snapshot (Validation Reports Only)
20210105 January 5, 2021 snapshot
20220103 January 3, 2022 snapshot
20230102 January 2, 2023 snapshot
20240101 January 1, 2024 snapshot
20250101 January 1, 2025 snapshot
To download the snapshot from 2005-01-06 and put it in the 20050106_snapshot directory, the following command can be used:
rsync -a snapshots.pdbj.org::20050106 20050106_snapshot
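When scripting against the snapshot list, it may help to filter out the `all` entry point and the snapshots marked "Validation Reports Only". The following sketch does this with awk; the function name `full_snapshots` is hypothetical, and it assumes the listing keeps the format shown above (an 8-digit date followed by a description).

```shell
# Read an rsync entry point listing on stdin and print the names of the
# full snapshots, i.e. 8-digit entries not marked "Validation Reports Only".
full_snapshots() {
    awk 'length($1) == 8 && $1 ~ /^[0-9]+$/ && !/Validation Reports Only/ { print $1 }'
}

# Example (needs network access): download the most recent full snapshot.
#   latest=$(rsync snapshots.pdbj.org:: | full_snapshots | sort | tail -n 1)
#   rsync -a "snapshots.pdbj.org::$latest" "${latest}_snapshot"
```

Since the names are dates in YYYYMMDD form, a plain `sort` orders them chronologically.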