The modelcif.reference Python module
Classes for linking back to a sequence or structure database.
- class modelcif.reference.TargetReference(code, accession, align_begin=None, align_end=None, isoform=None, ncbi_taxonomy_id=None, organism_scientific=None, sequence_version_date=None, sequence_crc64=None, sequence=None, details=None)
Point to the sequence of a target modelcif.Entity in a sequence database. Typically a subclass such as UniProt is used. To use a custom database, make a new subclass and provide a docstring that describes the database, e.g.:
    class CustomRef(TargetReference):
        "my custom database"
Compare with modelcif.ReferenceDatabase, which describes multiple sequences used in template searches or alignment construction; this class relates only to the modeled sequence itself.
See also alignments to describe the correspondence between the database and entity sequences.
- Parameters:
code (str) – The name of the sequence in the database.
accession (str) – The database accession.
align_begin (int) – Beginning index of the sequence in the database. Deprecated; use alignments instead.
align_end (int) – Ending index of the sequence in the database. Deprecated; use alignments instead.
isoform (str) – Sequence isoform, if applicable.
ncbi_taxonomy_id (str) – Taxonomy identifier provided by NCBI.
organism_scientific (str) – Scientific name of the organism.
sequence_version_date (datetime.date or datetime.datetime) – Versioning date, e.g. for UniProtKB sequences this is usually the date of last modification from the DT line of an entry.
sequence_crc64 (str) – The CRC64 sum of the original database sequence.
sequence (str) – The complete database sequence, as a string of one-letter codes. If omitted, will default to the canonical sequence of the associated Entity.
details (str) – Longer text describing the sequence.
- alignments
All alignments between the reference and entity sequences, as Alignment objects. If none are provided, a simple 1:1 alignment is assumed.
- property other_details
More information about a custom reference type. By default it is the first line of the docstring.
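The docstring convention described above can be illustrated without the library. The following is a minimal sketch, not the modelcif implementation: ReferenceStandIn is a hypothetical stand-in class, and returning None when a subclass has no docstring is an assumption made only for this example.

```python
class ReferenceStandIn:
    """Stand-in base class for illustration; NOT part of modelcif."""

    @property
    def other_details(self):
        # Take the first line of the concrete class's own docstring, if any
        doc = type(self).__doc__ or ""
        lines = doc.strip().splitlines()
        # Assumption of this sketch: no docstring means no details (None)
        return lines[0].strip() if lines else None


class CustomRef(ReferenceStandIn):
    "my custom database"


class UndocumentedRef(ReferenceStandIn):
    pass


print(CustomRef().other_details)
print(UndocumentedRef().other_details)
```

Because the description is read from the subclass docstring, a custom database only needs a one-line docstring to be labeled in the output.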
- class modelcif.reference.UniProt(code, accession, align_begin=None, align_end=None, isoform=None, ncbi_taxonomy_id=None, organism_scientific=None, sequence_version_date=None, sequence_crc64=None, sequence=None, details=None)
Point to the sequence of a modelcif.Entity in UniProt.
These objects are typically passed to the modelcif.Entity constructor for target sequences (for templates, see TemplateReference).
See TargetReference for a description of the parameters.
- class modelcif.reference.Alignment(db_begin=1, db_end=None, entity_begin=1, entity_end=None, seq_dif=[])
A sequence range that aligns between the database and the entity. This describes part of the sequence in the sequence database (Sequence) and in the ihm.Entity. The two ranges must be the same length and have the same primary sequence (any differences must be described with SeqDif objects).
- Parameters:
db_begin (int) – The first residue in the database sequence that is used (defaults to 1, the start of the sequence).
db_end (int) – The last residue in the database sequence that is used (or None, the default, to extend to the end of the sequence).
entity_begin (int) – The first residue in the Entity sequence that is taken from the reference (defaults to 1, the start of the entity sequence).
entity_end (int) – The last residue in the Entity sequence that is taken from the reference (or None, the default, to extend to the end of the sequence).
seq_dif (sequence of SeqDif objects) – Single-point mutations made to the sequence.
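The range parameters above use residue numbers that start at 1, with None standing for the end of the sequence. The plain-Python sketch below shows one way to read those ranges and check the equal-length rule. It assumes both ends of a range are inclusive. resolve_ranges is a hypothetical helper, not a modelcif function, and both sequences are made up.

```python
def resolve_ranges(db_seq, entity_seq, db_begin=1, db_end=None,
                   entity_begin=1, entity_end=None):
    """Return the two aligned subsequences for 1-based, inclusive ranges."""
    # None is read as "through the last residue of that sequence"
    if db_end is None:
        db_end = len(db_seq)
    if entity_end is None:
        entity_end = len(entity_seq)
    # Convert 1-based inclusive ranges to Python slices
    db_part = db_seq[db_begin - 1:db_end]
    entity_part = entity_seq[entity_begin - 1:entity_end]
    if len(db_part) != len(entity_part):
        raise ValueError("ranges differ in length: %d vs %d"
                         % (len(db_part), len(entity_part)))
    return db_part, entity_part


db_sequence = "MKTAYIAKQRQISFVK"   # made-up database sequence
entity_sequence = "AYIAKQ"         # made-up modeled construct

# Entity residues 1-6 correspond to database residues 4-9
db_part, entity_part = resolve_ranges(db_sequence, entity_sequence,
                                      db_begin=4, db_end=9)
print(db_part, entity_part)
```

A check like this can be run before building Alignment objects to catch off-by-one errors in the range bounds.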
- class modelcif.reference.SeqDif(seq_id, db_monomer, monomer, details=None)
Annotate a sequence difference between a reference and entity sequence. See Alignment.
- Parameters:
seq_id (int) – The residue index in the entity sequence.
db_monomer (ihm.ChemComp) – The monomer type (as a ChemComp object) in the reference sequence.
monomer (ihm.ChemComp) – The monomer type (as a ChemComp object) in the entity sequence.
details (str) – Descriptive text for the sequence difference.
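To see what information each sequence difference carries, the sketch below lists the mismatched positions between two aligned ranges of equal length. It is only an illustration: point_differences is a hypothetical helper, one-letter codes stand in for the ChemComp objects that SeqDif expects, and the sequences are made up.

```python
def point_differences(db_part, entity_part, entity_begin=1):
    """List (seq_id, db_monomer, monomer) for each mismatched position."""
    if len(db_part) != len(entity_part):
        raise ValueError("aligned ranges must be the same length")
    diffs = []
    for offset, (db_res, ent_res) in enumerate(zip(db_part, entity_part)):
        if db_res != ent_res:
            # seq_id is the residue index in the entity sequence
            diffs.append((entity_begin + offset, db_res, ent_res))
    return diffs


# Made-up example: isoleucine in the database, leucine in the entity
diffs = point_differences("AYIAKQ", "AYLAKQ")
print(diffs)
```

Each tuple maps onto the seq_id, db_monomer and monomer parameters above, once the one-letter codes are replaced with the corresponding ChemComp objects.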
- class modelcif.reference.TemplateReference(accession, db_version_date=None)
Point to the structure of a modelcif.Template in a structure database.
These objects are typically passed to the modelcif.Template constructor for template sequences (for target sequences, see TargetReference).
Typically a subclass such as PDB is used. To use a custom database, make a new subclass and provide a docstring that describes the database, e.g.:
    class CustomRef(TemplateReference):
        "my custom database"
- Parameters:
accession (str) – The database accession.
db_version_date (datetime.date or datetime.datetime) – Versioning date, e.g. for PDB entries this is usually the value of _pdbx_audit_revision_history.revision_date.
- property other_details
More information about a custom reference type. By default it is the first line of the docstring.
- class modelcif.reference.PDB(accession, db_version_date=None)
Point to the structure of a modelcif.Template in PDB.
These objects are typically passed to the modelcif.Template constructor.
See TemplateReference for a description of the parameters.
- class modelcif.reference.AlphaFoldDB(accession, db_version_date=None)
Point to the structure of a modelcif.Template in AlphaFold DB.
These objects are typically passed to the modelcif.Template constructor.
See TemplateReference for a description of the parameters.
- class modelcif.reference.PubChem(accession, db_version_date=None)
Point to the structure of a modelcif.Template in PubChem.
These objects are typically passed to the modelcif.Template constructor.
See TemplateReference for a description of the parameters. Use the PubChem CID as the accession code.