SCOPe 2.08: Structural Classification of Proteins

Welcome to the SCOPe website!

SCOPe is a database developed at the Berkeley Lab and UC Berkeley that extends SCOP (version 1). SCOPe classifies many structures released since SCOP 1.75 through a combination of automation and manual curation, and corrects some errors, aiming to have the same accuracy as the fully hand-curated SCOP releases. SCOPe also incorporates and updates the Astral database.

In addition to new SCOPe releases, the SCOPe website provides integrated access to data found in all releases of the SCOP and Astral databases that feature stable identifiers (i.e., those since release 1.55). A history of all changes between consecutive releases of SCOP and SCOPe is available under the Stats & History menu.

To facilitate use of SCOPe data by SCOP and Astral users, we provide SCOPe data as parseable files in the same formats used by the SCOP and Astral databases. SCOPe uses the same stable identifiers (e.g., sunid, sid, sccs) that were used in prior releases of SCOP and Astral.
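As an illustration of working with these identifiers, the sketch below splits an sccs string into its hierarchy levels. It assumes the dotted form class.fold.superfamily.family (for example, a.1.1.2). The names Sccs, parse_sccs, and same_superfamily are hypothetical helpers written for this example, not part of any SCOPe distribution.

```python
from typing import NamedTuple, Optional


class Sccs(NamedTuple):
    """Hypothetical container for the levels encoded in an sccs string."""
    cls: str
    fold: Optional[int]
    superfamily: Optional[int]
    family: Optional[int]


def parse_sccs(sccs: str) -> Sccs:
    """Split an sccs such as 'a.1.1.2' into class, fold, superfamily, family.

    Assumes a dotted identifier with a letter class followed by up to three
    integer levels; missing lower levels are returned as None.
    """
    parts = sccs.strip().split(".")
    if not 1 <= len(parts) <= 4 or not parts[0].isalpha():
        raise ValueError(f"not a recognizable sccs: {sccs!r}")
    levels = [int(p) for p in parts[1:]]   # raises ValueError on non-integers
    levels += [None] * (3 - len(levels))   # pad levels that are absent
    return Sccs(parts[0], *levels)


def same_superfamily(a: str, b: str) -> bool:
    """True if both identifiers share class, fold, and superfamily."""
    x, y = parse_sccs(a), parse_sccs(b)
    return x.superfamily is not None and x[:3] == y[:3]
```

For instance, same_superfamily("a.1.1.2", "a.1.1.3") compares only the first three levels, so two domains in different families of the same superfamily are treated as related.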

Authors

The SCOPe authors are Naomi K. Fox, Steven E. Brenner, and John-Marc Chandonia.
The SCOP authors are Alexey G. Murzin, John-Marc Chandonia, Antonina Andreeva, Dave Howorth, Loredana Lo Conte, Bartlett G. Ailey, Steven E. Brenner, Tim J. P. Hubbard, and Cyrus Chothia.
The Astral authors are John-Marc Chandonia, Naomi K. Fox, Degui Zhi, Gary Hon, Loredana Lo Conte, Nigel Walker, Patrice Koehl, Michael Levitt, and Steven E. Brenner.
References to cite are on this page.

Mission

Nearly all proteins have structural similarities with other proteins, and in some of these cases, share a common evolutionary origin. The Structural Classification of Proteins — extended (SCOPe) knowledgebase aims to provide an accurate, detailed, and comprehensive description of the structural and evolutionary relationships amongst all proteins of known structure, along with resources for analyzing the protein structures and their sequences. By providing a broad survey of protein folds and authoritative information about relatives of proteins, particularly those too ancient to be readily recognized from sequence, SCOPe is a framework for future research and classification. SCOPe undertakes to provide interfaces and data to support all users of protein structure and evolutionary relationships, for research, education, and policy, at scales ranging from interactive exploration of relationships of proteins of interest, including nuances of their individual structures and variations, to comprehensive studies and methods that draw on the entirety of the protein universe. The SCOPe resource aims to be Findable/Accessible/Interoperable/Reusable (FAIR) and also equitable for all.

News

In SCOPe 2.08, we have continued to perform manual curation of new Folds, Superfamilies, and Families. We classified members of the 74 Pfam families with the most structures (all those with at least 25 PDB entries) that previously had no classified representative in SCOP or SCOPe. Among these families, 22 (30%) had at least one domain classified into a new SCOPe fold, 10 (14%) into a new superfamily in an existing fold, 33 (45%) into a new family within an existing superfamily, and 9 (12%) as a new protein within an existing family.
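The breakdown above can be checked with a few lines of arithmetic. This is a sketch that uses only the counts stated in the paragraph; the dictionary keys are labels chosen for this example.

```python
# Counts of Pfam families by the highest new level created, as stated above.
counts = {
    "new fold": 22,
    "new superfamily": 10,
    "new family": 33,
    "new protein": 9,
}
total = 74

# The four categories account for every one of the 74 families.
assert sum(counts.values()) == total

# Percentages rounded to the nearest whole number, as reported in the text.
percentages = {label: round(100 * n / total) for label, n in counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {counts[label]} of {total} ({pct}%)")
```

The rounded percentages add up to 101 rather than 100; this is a rounding effect, not a discrepancy in the counts.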

Variant searches. To assist in the analysis of genetic variants and to enable easier access to structural classification data, we built a search tool that maps human genetic variants to protein structures and their associated SCOPe data. Users can search for structures relevant to a genetic variant of interest by providing HGVS expressions or genome coordinates in either hg19 or GRCh38. Examples are on the advanced variant search page.

Annotation of structural heterogeneity. We have improved consistency in how structures in the same family are divided into domains, so that automated methods (e.g., deep learning classifiers) that rely on multiple alignments of homologous SCOPe domains will be less likely to produce incorrect results due to variable domain lengths within the same family. See our help page for further details.

Annotated repeat units. Some protein domains in SCOPe consist of several smaller tandem repeating units. The number of repeats can differ among domains in the same family. To support automated algorithms developed or trained on the SCOPe knowledgebase, we provide machine-parseable annotations of the extent of a single repeat unit for all families of repeats in classes a to g.

We have also improved our detection of cloning artifacts (e.g., expression tags). These tags are classified in a special class (l: Artifacts) to separate them from the homology-based curation in the rest of the SCOPe hierarchy. If such artifacts are left in domain sequences, they can produce spurious similarity between non-homologous sequences.

License

All data in SCOPe (including the data from older releases of SCOP and Astral) are freely available to all users.

Pronunciation

There are several alternative pronunciations of the vowels in the word SCOPe. All are considered correct.

Funding

This work was supported by the National Institutes of Health (R01-GM073109) through the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.