The modelcif Python module¶
- class modelcif.System(title=None, id='model', database=None, model_details=None)[source]¶
Top-level class representing a complete modeled system.
- Parameters:
title (str) – Longer text description of the system.
id (str) – Unique identifier for this system in the mmCIF file.
database (Database) – If this system is part of an official database (e.g. SwissModel, ModBase), details of the database identifiers.
model_details (str) – Detailed description of the system, like an abstract.
The system contains a number of simple flat lists of various objects, for example alignments. After constructing objects they should usually be added to these lists so that a hierarchy of classes is formed and is ultimately written out to mmCIF/BinaryCIF. After reading a file the resulting System object will also populate these lists.
Most objects do not need to be explicitly added to the system since they are referenced by other objects. For example Template objects are not usually added to the system because they are added to alignments which in turn are added to the system. If however an “orphan” Template is desired (not part of an alignment) the system does maintain an appropriate list (System.templates in this case) to which it can be added.
- alignments¶
All modeling alignments. See modelcif.alignment.
- authors¶
List of all authors of this system, as a list of strings (last name followed by initials, e.g. “Smith AJ”). When writing out a file, if this list is empty, all authors from the first citation (see citations and ihm.Citation) are used instead.
- citations¶
List of all citations. By convention the first citation describes the system itself. See ihm.Citation.
- comments¶
List of plain text comments. These will be added to the top of the mmCIF file.
- data_usage¶
Information on usage of the data. See ihm.DataUsage.
- model_groups¶
All groups of models. See ModelGroup.
- repositories¶
Any additional files with extra data about this system. See modelcif.associated.Repository.
- revisions¶
Revision/update history. See ihm.Revision.
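The author fallback described under authors can be sketched in plain Python. This is only an illustration of the documented rule, not the library's code: StubCitation and authors_for_output are hypothetical names, and the stand-in models just the list of authors that a citation carries.

```python
from collections import namedtuple

# Hypothetical stand-in for a citation; only its author list matters here.
StubCitation = namedtuple("StubCitation", ["title", "authors"])


def authors_for_output(authors, citations):
    """Pick the author list to write, following the documented fallback."""
    if authors:
        # An explicit author list always wins.
        return list(authors)
    if citations:
        # By convention the first citation describes the system itself.
        return list(citations[0].authors)
    return []


citations = [
    StubCitation("System paper", ["Smith AJ", "Jones B"]),
    StubCitation("Method paper", ["Doe C"]),
]
explicit = authors_for_output(["Lee D"], citations)
fallback = authors_for_output([], citations)
```

Here explicit keeps the explicitly given author, while fallback takes both authors of the first citation only.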
- class modelcif.Database(id, code)[source]¶
Information about a System that is part of an official database.
If a System is part of an official database (e.g. SwissModel, ModBase), this class contains details of the database identifiers. It should be passed to the System constructor.
- class modelcif.Software(name, classification, description, location, type='program', version=None, citation=None)[source]¶
Software used as part of the modeling protocol.
- Parameters:
name (str) – The name of the software.
classification (str) – The major function of the software, for example ‘model building’, ‘sample preparation’, ‘data collection’.
description (str) – A longer text description of the software.
location (str) – Place where the software can be found (e.g. URL).
type (str) – Type of software (program/package/library/other).
version (str) – The version used.
citation (ihm.Citation) – Publication describing the software.
Generally these objects are added to groups (see SoftwareGroup) which can then be used to describe the software used in various parts of the modeling. Software objects can also be used anywhere a SoftwareGroup is accepted, in which case they act as if a group containing only a single member had been used.
See also System.software.
- class modelcif.SoftwareGroup(elements=(), parameters=None)[source]¶
A number of Software and/or SoftwareWithParameters objects that are grouped together.
This class can be used to group together multiple Software objects if multiple pieces of software were used together to generate a single alignment (see modelcif.alignment.AlignmentMode), to run a modeling step (see modelcif.protocol.Step), or to calculate a model quality score (see modelcif.qa_metric). It behaves like a regular Python list.
SoftwareWithParameters allows including both a piece of software, and the parameters with which it was used, in the group.
- Parameters:
elements (sequence) – Initial set of Software and/or SoftwareWithParameters objects.
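The two behaviors described for software groups (a group behaves like a Python list, and a lone Software object is treated as a group of one) can be sketched with stand-in classes. StubSoftware, StubSoftwareGroup and group_members are hypothetical names used only for illustration; they are not part of modelcif and say nothing about how the library implements this.

```python
class StubSoftware:
    """Hypothetical stand-in for a single piece of software."""

    def __init__(self, name):
        self.name = name


class StubSoftwareGroup(list):
    """Hypothetical stand-in for a group; behaves like a Python list."""


def group_members(software_or_group):
    """Return the members, treating a lone software object as a group of one."""
    if isinstance(software_or_group, StubSoftwareGroup):
        return list(software_or_group)
    return [software_or_group]


aligner = StubSoftware("aligner")
modeller = StubSoftware("modeller")
group = StubSoftwareGroup([aligner, modeller])
group.append(StubSoftware("scorer"))  # ordinary list methods work on the group

names_group = [s.name for s in group_members(group)]
names_single = [s.name for s in group_members(aligner)]
```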
- class modelcif.SoftwareWithParameters(software, parameters=None)[source]¶
A piece of software and the parameters with which it was used.
See SoftwareGroup.
- Parameters:
software (modelcif.Software) – The software that was used.
parameters (sequence) – Sequence of parameters for the software, as SoftwareParameter objects.
- class modelcif.SoftwareParameter(name, value, description=None)[source]¶
A single parameter given to software used in modeling.
- class modelcif.Entity(sequence, alphabet=<class 'ihm.LPeptideAlphabet'>, description=None, details=None, source=None, references=[])[source]¶
Represent a unique molecular sequence.
This can be used either for template sequences (in which case the Entity is then used in a Template object) or for target (model) sequences (where it is used in an AsymUnit object).
(Note that template sequence Entity objects are not written out to the entity, entity_poly etc. tables in the mmCIF/BinaryCIF file by default. Instead, sequence information is captured in template-specific categories.)
- Parameters:
sequence (sequence) – The primary sequence, as a sequence of ihm.ChemComp objects, and/or codes looked up in alphabet. See ihm.Entity for examples.
alphabet (ihm.Alphabet) – The mapping from code to chemical components to use (it is not necessary to instantiate this class).
description (str) – A short text name for the sequence.
details (str) – Longer text describing the sequence.
source (ihm.source.Source) – The method by which the sample for this entity was produced.
references (sequence of reference.TargetReference objects) – For a target (model) sequence, information about this entity stored in external databases (for example the sequence in UniProt). For references to structure databases for templates, see Template instead.
See ihm.Entity for more information.
- branch_descriptors¶
String descriptors of branched chemical structure. These generally only make sense for oligosaccharide entities, and should be a list of BranchDescriptor objects.
- branch_links¶
Any links between components in a branched entity. This is a list of BranchLink objects.
- property formula_weight¶
Formula weight (dalton). This is calculated automatically from that of the chemical components.
- is_polymeric()[source]¶
Return True iff this entity represents a polymer, such as an amino acid sequence or DNA/RNA chain (and not a ligand or water)
- property seq_id_range¶
Sequence range
- class modelcif.AsymUnit(entity, details=None, auth_seq_id_map=0, id=None, strand_id=None, orig_auth_seq_id_map=None)[source]¶
An asymmetric unit, i.e. a unique instance of an Entity that was modeled.
Note that this class should not be used to describe crystal waters; for that, see WaterAsymUnit.
- Parameters:
entity (Entity) – The unique sequence of this asymmetric unit.
details (str) – Longer text description of this unit.
auth_seq_id_map – Mapping from internal 1-based consecutive residue numbering (seq_id) to PDB “author-provided” numbering (auth_seq_id plus an optional ins_code). This can be either an int offset, in which case auth_seq_id = seq_id + auth_seq_id_map with no insertion codes, or a mapping type (dict, list, tuple) in which case auth_seq_id = auth_seq_id_map[seq_id] with no insertion codes, or auth_seq_id, ins_code = auth_seq_id_map[seq_id] - i.e. the output of the mapping is either the author-provided number, or a 2-element tuple containing that number and an insertion code. (Note that if a list or tuple is used for the mapping, the first element in the list or tuple does not correspond to the first residue and will never be used - since seq_id can never be zero.) The default if not specified, or not in the mapping, is for auth_seq_id == seq_id and for no insertion codes to be used.
id (str) – User-specified ID (usually a string of one or more upper-case letters, e.g. A, B, C, AA). If not specified, IDs are automatically assigned alphabetically.
strand_id (str) – PDB or “author-provided” strand/chain ID. If not specified, it will be the same as the regular ID.
orig_auth_seq_id_map – Mapping from internal 1-based consecutive residue numbering (seq_id) to original “author-provided” numbering. This differs from auth_seq_id_map as the original numbering need not follow any defined scheme, while auth_seq_id_map must follow certain PDB-defined rules. This can be any mapping type (dict, list, tuple) in which case orig_auth_seq_id = orig_auth_seq_id_map[seq_id]. If the mapping is None (the default), or a given seq_id cannot be found in the mapping, orig_auth_seq_id = auth_seq_id. This mapping is only used in the various scheme tables, such as pdbx_poly_seq_scheme.
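The numbering rules for auth_seq_id_map and orig_auth_seq_id_map can be restated as a small pure-Python sketch. The two helper functions below are hypothetical and written only from the parameter descriptions above; they are not the library's implementation, and using None for "no insertion code" is an assumption of this sketch.

```python
def resolve_auth_seq_id(seq_id, auth_seq_id_map=0):
    """Return (auth_seq_id, ins_code) for a 1-based seq_id."""
    if isinstance(auth_seq_id_map, int):
        # Plain integer offset, never with an insertion code.
        return seq_id + auth_seq_id_map, None
    try:
        mapped = auth_seq_id_map[seq_id]
    except (KeyError, IndexError):
        # Not in the mapping: fall back to auth_seq_id == seq_id.
        return seq_id, None
    if isinstance(mapped, (tuple, list)):
        # 2-element form: author-provided number plus insertion code.
        auth_seq_id, ins_code = mapped
        return auth_seq_id, ins_code
    return mapped, None


def resolve_orig_auth_seq_id(seq_id, auth_seq_id, orig_auth_seq_id_map=None):
    """Return the original author numbering, defaulting to auth_seq_id."""
    if orig_auth_seq_id_map is None:
        return auth_seq_id
    try:
        return orig_auth_seq_id_map[seq_id]
    except (KeyError, IndexError):
        return auth_seq_id


offset_result = resolve_auth_seq_id(5, 10)
dict_map = {1: 100, 2: (101, "A")}
plain_result = resolve_auth_seq_id(1, dict_map)
ins_code_result = resolve_auth_seq_id(2, dict_map)
missing_result = resolve_auth_seq_id(3, dict_map)
# With a list, index 0 is a placeholder because seq_id starts at 1.
list_result = resolve_auth_seq_id(1, [None, 7, 8])
```

Note how the list form needs a dummy first element: index 0 is never looked up.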
See System.asym_units.
- num_map¶
For branched entities read from files, mapping from provisional to final internal numbering (seq_id), or None if no mapping is necessary. See ihm.model.Model.add_atom().
- segment(gapped_sequence, seq_id_begin, seq_id_end)[source]¶
Get an object representing the alignment of part of this sequence.
- property seq_id_range¶
Sequence range
- property sequence¶
Primary sequence
- property strand_id¶
PDB or author-provided strand/chain ID
- class modelcif.WaterAsymUnit(entity, number, details=None, auth_seq_id_map=0, id=None, strand_id=None, orig_auth_seq_id_map=None)[source]¶
A collection of crystal waters, all with the same “chain” ID.
- Parameters:
number (int) – The number of water molecules in this unit.
For more information on this class and the rest of the parameters, see AsymUnit.
- property number_of_molecules¶
Number of molecules
- property seq_id_range¶
Sequence range
- property sequence¶
Primary sequence
- class modelcif.NonPolymerFromTemplate(template, explicit, details=None, auth_seq_id_map=0, id=None, strand_id=None)[source]¶
A non-polymer (e.g. ligand) in the model that is modeled from a non-polymer template.
These objects act just like AsymUnit and should be added to Assembly.
To represent a non-polymer that is modeled without a template, just use a regular AsymUnit.
- Parameters:
For the other parameters, see AsymUnit.
- class modelcif.Residue(seq_id, entity=None, asym=None)[source]¶
A single residue in an entity or asymmetric unit. Usually these objects are created by calling Entity.residue() or AsymUnit.residue().
- property auth_seq_id¶
Author-provided seq_id; only makes sense for asymmetric units
- property comp¶
Chemical component (residue type)
- property ins_code¶
Insertion code; only makes sense for asymmetric units
- class modelcif.Assembly(elements=(), name=None, description=None)[source]¶
A collection of parts of the system that were modeled together.
- Parameters:
This is implemented as a simple list of asymmetric units (or parts of them), i.e. a list of AsymUnit and/or AsymUnitRange objects. An Assembly is typically passed to the modelcif.model.Model constructor.
Note that the ModelCIF dictionary has deprecated the corresponding ma_struct_assembly category, so any name or description of the assembly will not be written to the mmCIF file. The ModelCIF dictionary requires that all models have the same composition.
- class modelcif.AsymUnitRange(asym, seq_id_begin, seq_id_end)[source]¶
Part of an asymmetric unit. Usually these objects are created from an AsymUnit, e.g. to get a range covering residues 4 through 7 in asym use:
    asym = ihm.AsymUnit(entity)
    rng = asym(4,7)
- class modelcif.Transformation(rot_matrix, tr_vector)[source]¶
Rotation and translation applied to an object.
These objects are generally used to record the transformation that was applied to a Template to generate the starting structure used in modeling.
- Parameters:
rot_matrix – Rotation matrix (as a 3x3 array of floats) that places the object in its final position.
tr_vector – Translation vector (as a 3-element float list) that places the object in its final position.
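As a rough illustration of what a rotation matrix and translation vector do to a coordinate, the sketch below applies them to a single point in plain Python. The helper apply_transformation is hypothetical (not part of modelcif), and the order of operations (rotate first, then translate, i.e. R·p + t) is an assumption of this sketch rather than something stated above.

```python
def apply_transformation(rot_matrix, tr_vector, point):
    """Rotate a 3D point, then translate it (assumed order: R * p + t)."""
    return [
        sum(rot_matrix[i][j] * point[j] for j in range(3)) + tr_vector[i]
        for i in range(3)
    ]


# Identity rotation with zero translation leaves the point where it is.
identity_rotation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unchanged = apply_transformation(
    identity_rotation, [0.0, 0.0, 0.0], [1.0, 2.0, 3.0])

# 90-degree rotation about the z axis, followed by a shift of (1, 2, 3).
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
moved = apply_transformation(rot_z_90, [1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```

In the second case the point (1, 0, 0) is rotated onto (0, 1, 0) and then shifted, giving (1, 3, 3).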
- class modelcif.TemplateSegment(template, gapped_sequence, seq_id_begin, seq_id_end)[source]¶
An aligned part of a template (see modelcif.alignment.Pair).
Usually these objects are created from a Template using Template.segment(), e.g. to get a segment covering residues 1 through 3 in tmpl use:
    tmpl = modelcif.Template(entity, ...)
    seg = tmpl.segment('--ACG', 1, 3)
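In the segment example, the gapped sequence '--ACG' contains three residues and the range 1 through 3 also covers three residues. A simple consistency check along those lines might look like the sketch below. The helpers are hypothetical, and both the use of '-' as the gap character and the inclusive range are assumptions read off that example; nothing above says whether the library itself performs such a check.

```python
def residues_in_gapped(gapped_sequence, gap_char="-"):
    """Count the non-gap positions in a gapped sequence string."""
    return sum(1 for code in gapped_sequence if code != gap_char)


def range_matches(gapped_sequence, seq_id_begin, seq_id_end):
    """True if the inclusive seq_id range covers exactly the aligned residues."""
    covered = seq_id_end - seq_id_begin + 1
    return residues_in_gapped(gapped_sequence) == covered


consistent = range_matches("--ACG", 1, 3)
inconsistent = range_matches("--ACG", 1, 4)
```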
- class modelcif.Template(entity, asym_id, model_num, transformation, name=None, references=[], strand_id=None, entity_id=None)[source]¶
A single database chain that was used as a template structure for modeling.
After creating a polymer template, use segment() to denote the part of its sequence used in any modeling alignments (see modelcif.alignment.Pair).
Non-polymer templates do not have alignments, and should instead be passed to one or more NonPolymerFromTemplate objects.
Template objects can also be used as inputs or outputs in modeling protocol steps; see modelcif.protocol.Step.
This class is intended for templates that were taken from reference databases such as PDB. For a non-deposited “custom” template, use the CustomTemplate class instead.
- Parameters:
entity (Entity) – The sequence of the chain.
asym_id (str) – The asym or chain ID in the template structure.
model_num (int) – The model number of the template structure.
transformation (Transformation) – Rotation and translation applied to the original template structure to get the starting model used in modeling.
name (str) – A short name for this template.
references (list of modelcif.reference.TemplateReference objects) – A list of pointers to reference databases (such as PDB) from which the template structure was taken.
strand_id (str) – PDB or “author-provided” strand/chain ID. If not specified, it will be the same as the regular asym_id.
entity_id (str) – If known, the ID of the entity for this template in its own mmCIF file.
- class modelcif.CustomTemplate(entity, asym_id, model_num, transformation, name=None, strand_id=None, entity_id=None, details=None)[source]¶
A chain that was used as a template structure for modeling.
This class is intended for templates that have not been deposited in a database such as PDB (for deposited templates, use the Template class instead). The coordinates of the atoms in these “custom” templates will be included in the mmCIF file; see the atoms member.
- Parameters:
details (str) – Information on how the template was created.
See Template for a description of the other parameters.
- atoms¶
Coordinates of all atoms as TemplateAtom objects
- class modelcif.TemplateAtom(seq_id, atom_id, type_symbol, x, y, z, het=False, biso=None, occupancy=None, charge=None, auth_seq_id=None, auth_atom_id=None, auth_comp_id=None)[source]¶
Coordinates of a single atom in a custom template.
This provides the coordinates for a template that has not been deposited in a database. See CustomTemplate for more information. These objects are added to the CustomTemplate.atoms list.
- Parameters:
seq_id (int) – The sequence ID of the residue represented by this atom. This should generally be a number starting at 1 for any polymer chain, water, or oligosaccharide. For ligands, a seq_id is not needed (as a given asym can only contain a single ligand), so either 1 or None can be used.
atom_id (str) – The name of the atom in the residue
type_symbol (str) – Element name
x (float) – x coordinate of the atom
y (float) – y coordinate of the atom
z (float) – z coordinate of the atom
het (bool) – True for HETATM sites, False (default) for ATOM
biso (float) – Temperature factor or equivalent (if applicable)
occupancy (float) – Fraction of the atom type present (if applicable)
charge (float) – Formal charge (if applicable)
auth_seq_id (int) – Author-provided sequence ID (if applicable; this is optional for polymers but required for ligands).
auth_atom_id (str) – Author-provided atom name (if needed)
auth_comp_id (str) – Author-provided residue name (if needed)
- class modelcif.ReferenceDatabase(name, url, version=None, release_date=None)[source]¶
A reference database used in the modeling. This is typically a sequence database used for template search, alignments, etc. These objects are passed as input or output to modelcif.protocol.Step. See also modelcif.data.Data for more details.
Compare with modelcif.reference.TargetReference, which pertains to just the modeled sequence itself; this class describes multiple sequences.
- Parameters:
name (str) – Name of the database.
url (str) – Location of the database.
version (str) – Version of the database.
release_date (datetime.date) – Release date of the specified version.
- class modelcif.Feature[source]¶
Base class for selecting parts of the system. This class should not be used itself; instead, see AtomFeature, PolyResidueFeature, and EntityInstanceFeature.
Generally it is expected that the entities selected by a given feature are all of the same type. For example, a feature should not select both a ligand and a polymer.
Features are typically used in QA metrics, passed to modelcif.qa_metric.Feature or modelcif.qa_metric.FeaturePairwise objects.
- class modelcif.AtomFeature(atoms, details=None)[source]¶
Selection of one or more atoms from the system. See Feature for more information.
Note that support for atom features in python-modelcif is currently rather rudimentary. Atoms must be selected by their “id”, not by the Atom Python object.