The modelcif Python module¶
- class modelcif.System(title=None, id='model', database=None, model_details=None)[source]¶
Top-level class representing a complete modeled system.
- Parameters:
title (str) – Longer text description of the system.
id (str) – Unique identifier for this system in the mmCIF file.
database (Database) – If this system is part of an official database (e.g. SwissModel, ModBase), details of the database identifiers.
model_details (str) – Detailed description of the system, like an abstract.
The system contains a number of simple flat lists of various objects, for example alignments. After constructing objects, they should usually be added to these lists so that a hierarchy of classes is formed and is ultimately written out to mmCIF/BinaryCIF. After reading a file, the resulting System object will also populate these lists.
Most objects do not need to be explicitly added to the system, since they are referenced by other objects. For example, Template objects are not usually added to the system because they are added to alignments, which in turn are added to the system. If, however, an “orphan” Template is desired (not part of an alignment), the system does maintain an appropriate list (System.templates in this case) to which it can be added.
- alignments¶
All modeling alignments. See modelcif.alignment.
- authors¶
List of all authors of this system, as a list of strings (last name followed by initials, e.g. “Smith AJ”). When writing out a file, if this list is empty, all authors from the first citation (see citations and ihm.Citation) are used instead.
- citations¶
List of all citations. By convention the first citation describes the system itself. See ihm.Citation.
- comments¶
List of plain text comments. These will be added to the top of the mmCIF file.
- model_groups¶
All groups of models. See ModelGroup.
- repositories¶
Any additional files with extra data about this system. See modelcif.associated.Repository.
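The author fallback described for System.authors can be sketched in plain Python. This is only an illustration of the documented rule, not the library's code: FakeCitation and effective_authors are hypothetical names, and the sketch assumes a citation exposes its authors as a list of strings.

```python
from collections import namedtuple

# Hypothetical stand-in for ihm.Citation; only the authors field matters here.
FakeCitation = namedtuple("FakeCitation", ["title", "authors"])


def effective_authors(authors, citations):
    """Return the author list that would be written out.

    If `authors` is empty, fall back to the authors of the first
    citation, which by convention describes the system itself.
    """
    if authors:
        return list(authors)
    if citations:
        return list(citations[0].authors)
    return []


primary = FakeCitation("Model paper", ["Smith AJ", "Jones B"])
other = FakeCitation("Method paper", ["Doe C"])

# Explicit authors take precedence over any citation.
explicit = effective_authors(["Lee D"], [primary, other])
# With no explicit authors, the first citation supplies them.
fallback = effective_authors([], [primary, other])
```

Under this reading, leaving System.authors empty is a reasonable default when the first citation already lists the right people.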
- class modelcif.Database(id, code)[source]¶
Information about a System that is part of an official database.
If a System is part of an official database (e.g. SwissModel, ModBase), this class contains details of the database identifiers. It should be passed to the System constructor.
- class modelcif.Software(name, classification, description, location, type='program', version=None, citation=None)[source]¶
Software used as part of the modeling protocol.
- Parameters:
name (str) – The name of the software.
classification (str) – The major function of the software, for example ‘model building’, ‘sample preparation’, ‘data collection’.
description (str) – A longer text description of the software.
location (str) – Place where the software can be found (e.g. URL).
type (str) – Type of software (program/package/library/other).
version (str) – The version used.
citation (ihm.Citation) – Publication describing the software.
Generally these objects are added to groups (see SoftwareGroup), which can then be used to describe the software used in various parts of the modeling. (Software objects can also be used anywhere SoftwareGroup objects are accepted, in which case they act as if a group containing only a single member was used.)
See also System.software.
- class modelcif.SoftwareGroup(elements=(), parameters=None)[source]¶
A number of Software and/or SoftwareWithParameters objects that are grouped together.
This class can be used to group together multiple Software objects if multiple pieces of software were used together to generate a single alignment (see modelcif.alignment.AlignmentMode), to run a modeling step (see modelcif.protocol.Step), or to calculate a model quality score (see modelcif.qa_metric). It behaves like a regular Python list.
SoftwareWithParameters allows including both a piece of software, and the parameters with which it was used, in the group.
- Parameters:
elements (sequence) – Initial set of Software and/or SoftwareWithParameters objects.
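The two behaviours described above (a group that acts like a list, and a single Software object standing in for a one-member group) can be illustrated with stand-in classes. FakeSoftware, FakeSoftwareGroup and as_group are hypothetical names used only for this sketch; they are not part of modelcif, and the library may implement the equivalence differently.

```python
class FakeSoftware:
    """Hypothetical stand-in for a single piece of software."""

    def __init__(self, name):
        self.name = name


class FakeSoftwareGroup(list):
    """Hypothetical stand-in for a group: just a list subclass."""


def as_group(obj):
    """Treat a lone software object as a group with one member."""
    if isinstance(obj, FakeSoftwareGroup):
        return obj
    return FakeSoftwareGroup([obj])


aligner = FakeSoftware("aligner")
builder = FakeSoftware("builder")

# A group supports the usual list operations.
group = FakeSoftwareGroup([aligner])
group.append(builder)
names = [s.name for s in group]

# A single software object is handled as a one-element group.
single = as_group(builder)
```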
- class modelcif.SoftwareWithParameters(software, parameters=None)[source]¶
A piece of software and the parameters with which it was used.
See SoftwareGroup.
- Parameters:
software (modelcif.Software) – The software that was used.
parameters (sequence) – Sequence of parameters for the software, as SoftwareParameter objects.
- class modelcif.SoftwareParameter(name, value, description=None)[source]¶
A single parameter given to software used in modeling.
- class modelcif.Entity(sequence, alphabet=<class 'ihm.LPeptideAlphabet'>, description=None, details=None, source=None, references=[])[source]¶
Represent a unique molecular sequence.
This can be used both for template sequences (in which case the Entity is then used in a Template object) and for target (model) sequences (where it is used in an AsymUnit object).
(Note that template sequence Entity objects are not written out to the entity, entity_poly etc. tables in the mmCIF/BinaryCIF file by default. Instead, sequence information is captured in template-specific categories.)
- Parameters:
sequence (sequence) – The primary sequence, as a sequence of ihm.ChemComp objects, and/or codes looked up in alphabet. See ihm.Entity for examples.
alphabet (ihm.Alphabet) – The mapping from code to chemical components to use (it is not necessary to instantiate this class).
description (str) – A short text name for the sequence.
details (str) – Longer text describing the sequence.
source (ihm.source.Source) – The method by which the sample for this entity was produced.
references (sequence of reference.TargetReference objects) – For a target (model) sequence, information about this entity stored in external databases (for example the sequence in UniProt). For references to structure databases for templates, see Template instead.
See ihm.Entity for more information.
- branch_descriptors¶
String descriptors of branched chemical structure. These generally only make sense for oligosaccharide entities, and should be a list of BranchDescriptor objects.
- branch_links¶
Any links between components in a branched entity. This is a list of BranchLink objects.
- property formula_weight¶
Formula weight (dalton). This is calculated automatically from that of the chemical components.
- is_polymeric()[source]¶
Return True iff this entity represents a polymer, such as an amino acid sequence or DNA/RNA chain (and not a ligand or water)
- property seq_id_range¶
Sequence range
- class modelcif.AsymUnit(entity, details=None, auth_seq_id_map=0, id=None, strand_id=None, orig_auth_seq_id_map=None)[source]¶
An asymmetric unit, i.e. a unique instance of an Entity that was modeled.
Note that this class should not be used to describe crystal waters; for that, see WaterAsymUnit.
- Parameters:
entity (Entity) – The unique sequence of this asymmetric unit.
details (str) – Longer text description of this unit.
auth_seq_id_map – Mapping from internal 1-based consecutive residue numbering (seq_id) to PDB “author-provided” numbering (auth_seq_id plus an optional ins_code). This can be either an int offset, in which case auth_seq_id = seq_id + auth_seq_id_map with no insertion codes, or a mapping type (dict, list, tuple), in which case either auth_seq_id = auth_seq_id_map[seq_id] with no insertion codes, or auth_seq_id, ins_code = auth_seq_id_map[seq_id], i.e. the output of the mapping is either the author-provided number, or a 2-element tuple containing that number and an insertion code. (Note that if a list or tuple is used for the mapping, the first element in the list or tuple does not correspond to the first residue and will never be used, since seq_id can never be zero.) The default if not specified, or not in the mapping, is for auth_seq_id == seq_id and for no insertion codes to be used.
id (str) – User-specified ID (usually a string of one or more upper-case letters, e.g. A, B, C, AA). If not specified, IDs are automatically assigned alphabetically.
strand_id (str) – PDB or “author-provided” strand/chain ID. If not specified, it will be the same as the regular ID.
orig_auth_seq_id_map – Mapping from internal 1-based consecutive residue numbering (seq_id) to original “author-provided” numbering. This differs from auth_seq_id_map as the original numbering need not follow any defined scheme, while auth_seq_id_map must follow certain PDB-defined rules. This can be any mapping type (dict, list, tuple), in which case orig_auth_seq_id = orig_auth_seq_id_map[seq_id]. If the mapping is None (the default), or a given seq_id cannot be found in the mapping, orig_auth_seq_id = auth_seq_id. This mapping is only used in the various scheme tables, such as pdbx_poly_seq_scheme.
See System.asym_units.
- num_map¶
For branched entities read from files, mapping from provisional to final internal numbering (seq_id), or None if no mapping is necessary. See ihm.model.Model.add_atom().
- segment(gapped_sequence, seq_id_begin, seq_id_end)[source]¶
Get an object representing the alignment of part of this sequence.
- property seq_id_range¶
Sequence range
- property sequence¶
Primary sequence
- property strand_id¶
PDB or author-provided strand/chain ID
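The numbering rules given for auth_seq_id_map and orig_auth_seq_id_map in the AsymUnit parameters can be written out as two small functions. This is a plain-Python reading of the documented behaviour, not the library's implementation; resolve_auth_seq_id and resolve_orig_auth_seq_id are hypothetical helper names, and representing "no insertion code" as None is an assumption of the sketch.

```python
def resolve_auth_seq_id(seq_id, auth_seq_id_map=0):
    """Return (auth_seq_id, ins_code) for a 1-based seq_id."""
    # Case 1: an int offset, never with an insertion code.
    if isinstance(auth_seq_id_map, int):
        return seq_id + auth_seq_id_map, None
    # Case 2: a dict, list or tuple indexed by seq_id.
    try:
        value = auth_seq_id_map[seq_id]
    except (KeyError, IndexError):
        # Not in the mapping: auth_seq_id == seq_id, no insertion code.
        return seq_id, None
    if isinstance(value, tuple):
        auth_seq_id, ins_code = value
        return auth_seq_id, ins_code
    return value, None


def resolve_orig_auth_seq_id(seq_id, auth_seq_id, orig_auth_seq_id_map=None):
    """Return the original author numbering, defaulting to auth_seq_id."""
    if orig_auth_seq_id_map is None:
        return auth_seq_id
    try:
        return orig_auth_seq_id_map[seq_id]
    except (KeyError, IndexError):
        return auth_seq_id


offset_case = resolve_auth_seq_id(5, 10)
dict_case = resolve_auth_seq_id(2, {1: 100, 2: (100, "A")})
missing_case = resolve_auth_seq_id(3, {1: 100})
# With a list, index 0 is a placeholder because seq_id starts at 1.
list_case = resolve_auth_seq_id(1, [None, 50])
```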
- class modelcif.NonPolymerFromTemplate(template, explicit, details=None, auth_seq_id_map=0, id=None, strand_id=None)[source]¶
A non-polymer (e.g. ligand) in the model that is modeled from a non-polymer template.
These objects act just like AsymUnit and should be added to Assembly.
To represent a non-polymer that is modeled without a template, just use a regular AsymUnit.
- Parameters:
For the other parameters, see AsymUnit.
- class modelcif.Residue(seq_id, entity=None, asym=None)[source]¶
A single residue in an entity or asymmetric unit. Usually these objects are created by calling Entity.residue() or AsymUnit.residue().
- property auth_seq_id¶
Author-provided seq_id; only makes sense for asymmetric units
- property comp¶
Chemical component (residue type)
- property ins_code¶
Insertion code; only makes sense for asymmetric units
- class modelcif.Assembly(elements=(), name=None, description=None)[source]¶
A collection of parts of the system that were modeled together.
- Parameters:
This is implemented as a simple list of asymmetric units (or parts of them), i.e. a list of AsymUnit and/or AsymUnitRange objects. An Assembly is typically passed to the modelcif.model.Model constructor.
Note that the ModelCIF dictionary has deprecated the corresponding ma_struct_assembly category, so any name or description of the assembly will not be written to the mmCIF file. The ModelCIF dictionary requires that all models have the same composition.
- class modelcif.AsymUnitRange(asym, seq_id_begin, seq_id_end)[source]¶
Part of an asymmetric unit. Usually these objects are created from an AsymUnit, e.g. to get a range covering residues 4 through 7 in asym use:
    asym = modelcif.AsymUnit(entity)
    rng = asym(4,7)
- class modelcif.Transformation(rot_matrix, tr_vector)[source]¶
Rotation and translation applied to an object.
These objects are generally used to record the transformation that was applied to a Template to generate the starting structure used in modeling.
- Parameters:
rot_matrix – Rotation matrix (as a 3x3 array of floats) that places the object in its final position.
tr_vector – Translation vector (as a 3-element float list) that places the object in its final position.
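As a rough illustration of what rot_matrix and tr_vector describe, the sketch below applies a rotation followed by a translation to a single point. The order (rotate first, then translate) is an assumption of this sketch, since the text above does not state it, and apply_transformation is a hypothetical helper, not a modelcif function.

```python
def apply_transformation(rot_matrix, tr_vector, point):
    """Return rot_matrix * point + tr_vector for one 3D point.

    rot_matrix is a 3x3 nested list of floats, tr_vector and point
    are 3-element lists of floats.
    """
    return [
        sum(rot_matrix[i][j] * point[j] for j in range(3)) + tr_vector[i]
        for i in range(3)
    ]


# The identity transformation leaves the template where it is.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unchanged = apply_transformation(identity, [0.0, 0.0, 0.0], [1.5, -2.0, 4.0])

# A 90 degree rotation about the z axis, then a shift by (1, 2, 3).
rot_z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
moved = apply_transformation(rot_z90, [1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```

If a template was used exactly as deposited, the identity matrix and a zero vector are the natural values to record.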
- class modelcif.TemplateSegment(template, gapped_sequence, seq_id_begin, seq_id_end)[source]¶
An aligned part of a template (see modelcif.alignment.Pair).
Usually these objects are created from a Template using Template.segment(), e.g. to get a segment covering residues 1 through 3 in tmpl use:
    tmpl = modelcif.Template(entity, ...)
    seg = tmpl.segment('--ACG', 1, 3)
- class modelcif.Template(entity, asym_id, model_num, transformation, name=None, references=[], strand_id=None, entity_id=None)[source]¶
A single chain that was used as a template structure for modeling.
After creating a polymer template, use segment() to denote the part of its sequence used in any modeling alignments (see modelcif.alignment.Pair).
Non-polymer templates do not have alignments, and should instead be passed to one or more NonPolymerFromTemplate objects.
Template objects can also be used as inputs or outputs in modeling protocol steps; see modelcif.protocol.Step.
- Parameters:
entity (Entity) – The sequence of the chain.
asym_id (str) – The asym or chain ID in the template structure.
model_num (int) – The model number of the template structure.
transformation (Transformation) – Rotation and translation applied to the original template structure to get the starting model used in modeling.
name (str) – A short name for this template.
references (list of modelcif.reference.TemplateReference objects) – A list of pointers to reference databases (such as PDB) from which the template structure was taken.
strand_id (str) – PDB or “author-provided” strand/chain ID. If not specified, it will be the same as the regular asym_id.
entity_id (str) – If known, the ID of the entity for this template in its own mmCIF file.
- segment(gapped_sequence, seq_id_begin, seq_id_end)[source]¶
Get an object representing the alignment of part of this sequence.
- property seq_id_range¶
Sequence range
- property strand_id¶
PDB or author-provided strand/chain ID
- class modelcif.ReferenceDatabase(name, url, version=None, release_date=None)[source]¶
A reference database used in the modeling. This is typically a sequence database used for template search, alignments, etc. These objects are passed as input or output to modelcif.protocol.Step. See also modelcif.data.Data for more details.
Compare with modelcif.reference.TargetReference, which pertains to just the modeled sequence itself; this class describes multiple sequences.
- Parameters:
name (str) – Name of the database.
url (str) – Location of the database.
version (str) – Version of the database.
release_date (datetime.date) – Release date of the specified version.