Entries and Relations

This page describes the entries and relations present in the ChemRecon ontology.

[Figure: ChemRecon schema, ../_images/schema.svg]

Overview

Entries

Entries in the ChemRecon database represent various types of biochemical objects.

Some entries are source entries. These contain only unmodified information taken directly from the source databases (see the main page for details).

Each source entry is identified (indexed) by its source_id, the unique identifier, and its id_type, the type of this identifier. In addition, each entry has an internal unique identifier, recon_id. This has no link to the source databases and is NOT guaranteed to be stable across ChemRecon versions, so the permanence of this identifier should not be relied upon.
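Since recon_id may change between versions, persistent references are probably best keyed on the (id_type, source_id) pair. The following is a minimal sketch of that idea only: SourceKey and stable_key are hypothetical helpers, not part of ChemRecon, and plain strings stand in for the id-type enum.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceKey:
    """Hypothetical stable key for a source entry (not a ChemRecon class)."""
    id_type: str    # stand-in for an id-type enum member
    source_id: str  # identifier in the source database


def stable_key(id_type: str, source_id: str) -> SourceKey:
    # Build the key from the index columns only; recon_id is deliberately excluded.
    return SourceKey(id_type=id_type, source_id=source_id)


# Annotations keyed this way do not depend on any particular recon_id.
# "example_db" and "C00001" are made-up values for illustration.
notes = {stable_key("example_db", "C00001"): "my annotation"}
```

A frozen dataclass is hashable, which is what allows it to serve as a dictionary key here.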

Other entries are structural entries, which are standardized versions of molecular structures and atom-to-atom maps. These are generated by ChemRecon based on the source entries, but are canonicalised and standardised to enable direct comparison.

class chemrecon.Entry(recon_id: int | None = None)

Bases: DatabaseObject, ABC

Generic base class for entries.

get_columns_with_values(include_recon_id=True) dict[Column, Any]

Get the columns of this entry with values.

get_index_columns_with_values() dict[Column, Any]

Get the index (primary key) columns of this entry with values.

recon_id: int | None

Internal identifier in the ChemRecon database. Normally, entries have a nonnegative recon_id, unique to the table. A negative recon_id indicates that the object is ‘virtual’, that is, it was created by a procedural relation, and does not exist in the database. A recon_id of None indicates that the entry is not stored in the database.
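The three cases described for recon_id can be told apart with a small check. This is an illustrative sketch; recon_id_status is a hypothetical helper name and the returned labels are chosen here, not defined by ChemRecon.

```python
from __future__ import annotations

from typing import Optional


def recon_id_status(recon_id: Optional[int]) -> str:
    """Classify a recon_id following the description above."""
    if recon_id is None:
        return "not stored"  # no identifier assigned
    if recon_id < 0:
        return "virtual"     # created by a procedural relation
    return "stored"          # nonnegative: regular database entry
```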

Entry types:

Relations

Relations represent various connections between entries.

The source and target entry types define which types of entries the relation refers to. Each relation is identified by its source and target (given by recon_id_1 and recon_id_2), and its attributes, if any.

Symmetric relations are those which have no directionality, so the source/target ordering does not matter. In these cases, the source and target entry types are the same, and the ‘source’ is always the entry with the lowest recon_id.
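The ordering convention for symmetric relations can be expressed in one line. The helper below is a hypothetical sketch of that convention, not a ChemRecon function.

```python
from __future__ import annotations


def normalize_symmetric(recon_id_a: int, recon_id_b: int) -> tuple[int, int]:
    """Return (recon_id_1, recon_id_2) with the lower id placed first."""
    return (min(recon_id_a, recon_id_b), max(recon_id_a, recon_id_b))
```

Normalizing a pair this way before a lookup means either ordering of the two entries finds the same relation.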

Some types of relations come in pairs, of which each is the inverse relation of the other, indicated by a slash (/).
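As a rough illustration of inverse pairs, the sketch below assumes that inverting a relation swaps its two endpoints and switches to the paired type. Rel, INVERSE and invert are hypothetical stand-ins; only the relation names are taken from this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rel:
    """Stand-in for a relation row: a type name plus source and target ids."""
    kind: str
    recon_id_1: int
    recon_id_2: int


# Two of the inverse pairs listed on this page.
INVERSE = {
    "CompoundIsA": "CompoundHasInstance",
    "CompoundHasPart": "CompoundIsPartOf",
}
# Make the lookup work in both directions.
INVERSE.update({v: k for k, v in list(INVERSE.items())})


def invert(rel: Rel) -> Rel:
    # Assumption: the inverse has the paired type and swapped endpoints.
    return Rel(INVERSE[rel.kind], rel.recon_id_2, rel.recon_id_1)
```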

Relation types can be procedural, in which case not all possible relations of this type exist in the database. These relations can be generated on-the-fly when querying relations or when generating entry graphs; this is done automatically and deterministically, so the process is transparent to the user (but may increase runtime). Note that procedural relations cannot be queried or traversed in the backwards direction.
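One way to picture a procedural relation is a lookup that returns stored targets first and then adds deterministically generated ones. The sketch below is an assumption about the general pattern, not ChemRecon's implementation; stored, targets and toy_generate are all hypothetical, and the toy rule exists only to make the example runnable.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical in-memory stand-in for stored relations: source id -> target ids.
stored: dict[int, list[int]] = {1: [10, 11]}


def targets(source_id: int, generate: Callable[[int], list[int]]) -> list[int]:
    """Return stored targets, then any generated ones not already present."""
    found = list(stored.get(source_id, []))
    for target in generate(source_id):
        if target not in found:
            found.append(target)
    return found


def toy_generate(source_id: int) -> list[int]:
    # Deterministic toy rule; the negative id marks a 'virtual' target.
    return [-(source_id + 1)]
```

Because generation starts from the source entry, nothing in this pattern supports finding sources from a given target, which matches the forward-only restriction described above.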

class chemrecon.Relation(recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: DatabaseObject, ABC, Generic

Generic base class for relations between entries.

get_columns_with_values(include_recon_ids: bool = True) dict[Column, Any]

Get the columns of this relation with values.

get_index_columns_with_values() dict[Column, Any]

Get the index (primary key) columns of this relation with values. This always includes recon_id_1 and recon_id_2, and may include attribute columns.

recon_id_1: int | None

Recon ID of source

recon_id_2: int | None

Recon ID of target

Relation types:
Entry graph relation types:

Entries

Source Entries

Compound

class chemrecon.Compound(source_id: str, id_type: IdTypeCompoundEnum, name: str | None = None, quality: Quality = None, recon_id: int | None = None, properties: list[str] | None = None)

Bases: SourceEntry

Represents a compound entry.

source_id: str

Identifier of the database entry. (index)

id_type: IdTypeCompoundEnum

Source of the database entry. (index)

name: str | None

Name of the database entry. This may be the common name, the IUPAC systematic name, or another name.

quality: Quality | None

Indicates the quality of the entry, if this is specified by the source database.

properties: list[str]

Additional properties

Reaction

class chemrecon.Reaction(source_id: str, id_type: IdTypeReactionEnum, name: str | None = None, is_transport: bool | None = None, is_reversible: bool | None = None, quality: Quality | None = None, recon_id: int | None = None, properties: list[str] | None = None)

Bases: SourceEntry

Represents a reaction entry.

source_id: str

Identifier of the database entry. (index)

id_type: IdTypeReactionEnum

Source of the database entry. (index)

is_transport: bool | None

Whether the reaction is explicitly marked as a transport reaction.

is_reversible: bool | None

Whether the reaction is explicitly marked as reversible.

quality: Quality | None

Indicates the quality of the entry, if this is specified by the source database.

properties: list[str]

Additional properties

Enzyme

class chemrecon.Enzyme(source_id: str, id_type: IdTypeEnzymeEnum, name: str | None = None, recon_id: int | None = None, quality: Quality | None = None, properties: list[str] | None = None)

Bases: SourceEntry

Represents an enzyme entry.

source_id: str

Identifier of the database entry. (index)

id_type: IdTypeEnzymeEnum

Source of the database entry. Always EC as of the current version. (index)

name: str | None

Name of the enzyme.

quality: Quality | None

Indicates the quality of the entry, if this is specified by the source database.

properties: list[str]

Additional properties

MolStructureRepr

class chemrecon.MolStructureRepr(source_id: str, implicit: bool, id_type: IdTypeStructureRepresentationEnum, recon_id: int | None = None)

Bases: Entry

Represents a source-provided representation of a molecular structure, such as an S_SMILES string, an S_MOLFILE, or similar.

source_id: str

Identifier string. (index)

id_type: IdTypeStructureRepresentationEnum

The type of the identifier string (e.g. SMILES, InChI). (index)

implicit: bool

If marked as implicit, this molecular structure only exists implicitly in an atom-to-atom map, and is not directly referenced by a known compound.

AAMRepr

class chemrecon.AAMRepr(source_id: str, id_type: IdTypeAAMEnum, recon_id: int | None = None)

Bases: Entry

Represents an atom-to-atom map representation, such as a reaction SMILES string.

source_id: str

Identifier string. (index)

id_type: IdTypeAAMEnum

The type of the identifier string (e.g. Reaction SMILES). (index)

Structural Entries

MolStructure

class chemrecon.MolStructure(smiles: str, std_feats: list[FeatureEnum] | None = None, molformula: str | None = None, recon_id: int | None = None)

Bases: Entry

Abstract representation of a structure in SMILES format.

smiles: str

The canonical SMILES identifier.

molformula: str | None

The sum formula.

std_feats: list[FeatureEnum] | None

The features w.r.t which this structure is standardized (see TODO).

AAM

class chemrecon.AAM(reaction_smiles: str, recon_id: int | None = None)

Bases: Entry

Represents an atom-to-atom map.

reaction_smiles: str

The canonical reaction SMILES string.

Relations

Compound Relations

CompoundReference

source/target: Compound (symmetric)

class chemrecon.CompoundReference(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Inter- or intra-database reference between compounds.

src: SourceDatabase

The source of the relation

CompoundIsA / CompoundHasInstance

source/target: Compound

class chemrecon.CompoundIsA(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Hierarchical relation which indicates that a compound is a member of a class of compounds.

src: SourceDatabase

The source of the relation

class chemrecon.CompoundHasInstance(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Compound, Compound]

Inverse of CompoundIsA: hierarchical relation which indicates that a class of compounds has a given compound as a member.

src: SourceDatabase

The source of the relation
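If a hierarchy is several levels deep, all classes of a compound can be collected by following is-a edges repeatedly. The sketch below assumes the edges have already been loaded into a plain dictionary; is_a and ancestors are hypothetical names, and the ids are made up.

```python
from __future__ import annotations

# Hypothetical is-a edges: member recon_id -> recon_ids of its classes.
is_a: dict[int, list[int]] = {1: [2], 2: [3], 3: []}


def ancestors(start: int) -> set[int]:
    """Collect every class reachable from start by following is-a edges."""
    seen: set[int] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in is_a.get(node, []):
            if parent not in seen:  # also protects against cycles
                seen.add(parent)
                stack.append(parent)
    return seen
```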

CompoundHasOldID / CompoundHasNewID

source/target: Compound

class chemrecon.CompoundHasOldID(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Compound, Compound]

Indicates correspondence between identifiers in different database versions. Can be used to resolve deprecated identifiers.

src: SourceDatabase

The source of the relation

class chemrecon.CompoundHasNewID(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Indicates correspondence between identifiers in different database versions. Can be used to resolve deprecated identifiers.

src: SourceDatabase

The source of the relation
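Resolving a deprecated identifier amounts to following replacement links until none remain. The sketch below assumes each deprecated identifier points to exactly one newer identifier; has_new_id and resolve_current are hypothetical, and the identifiers are invented for illustration.

```python
from __future__ import annotations

# Hypothetical replacement links: deprecated identifier -> newer identifier.
has_new_id: dict[str, str] = {"OLD-1": "OLD-2", "OLD-2": "CUR-3"}


def resolve_current(identifier: str) -> str:
    """Follow replacement links until an identifier has no newer version."""
    seen = {identifier}
    while identifier in has_new_id:
        identifier = has_new_id[identifier]
        if identifier in seen:  # guard against accidental cycles in the data
            raise ValueError("cycle in identifier history")
        seen.add(identifier)
    return identifier
```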

CompoundHasPart / CompoundIsPartOf

source/target: Compound

class chemrecon.CompoundHasPart(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Represents a relationship where one compound is a part of another.

src: SourceDatabase

The source of the relation

class chemrecon.CompoundIsPartOf(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Compound, Compound]

Represents a relationship where one compound is a part of another.

src: SourceDatabase

The source of the relation

CompoundHasConjugateAcid / CompoundHasConjugateBase

source/target: Compound

class chemrecon.CompoundHasConjugateAcid(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Represents a relationship indicating that a compound has a conjugate acid.

src: SourceDatabase

The source of the relation

class chemrecon.CompoundHasConjugateBase(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Compound, Compound]

Represents a relationship indicating that a compound has a conjugate base.

src: SourceDatabase

The source of the relation

CompoundHasTautomer

source/target: Compound (symmetric)

class chemrecon.CompoundHasTautomer(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Represents that two compounds are considered tautomers of each other.

src: SourceDatabase

The source of the relation

CompoundHasStereoIsomer

source/target: Compound (symmetric)

class chemrecon.CompoundHasStereoIsomer(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Represents that two compounds are considered stereoisomers of each other.

src: SourceDatabase

The source of the relation

CompoundHasIsotopologue

source/target: Compound (symmetric)

class chemrecon.CompoundHasIsotopologue(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, Compound]

Represents that two compounds are considered isotopologues of each other (identical in elements, but different in isotopic composition).

src: SourceDatabase

The source of the relation

Reaction Relations

ReactionInvolvesCompound / CompoundParticipatesInReaction

source: Reaction, target: Compound

class chemrecon.ReactionInvolvesCompound(n: int, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, Compound]

Each reaction is connected by this relation to the compounds which take part in the reaction, annotated with the stoichiometric coefficient.

n: int

The stoichiometric coefficient of the compound in the reaction

class chemrecon.CompoundParticipatesInReaction(n: int, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Compound, Reaction]

Inverse of ReactionInvolvesCompound: each compound is connected by this relation to the reactions in which it takes part, annotated with the stoichiometric coefficient.

n: int

The stoichiometric coefficient of the compound in the reaction
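Once the relations for a set of reactions are available, the coefficients can be grouped per reaction. The sketch below works on plain tuples rather than ChemRecon objects; the row layout and the function name are assumptions, and no sign convention for n is implied.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical rows: (reaction recon_id, compound recon_id, coefficient n).
rows = [(100, 1, 1), (100, 2, 2), (101, 1, 1)]


def stoichiometry_by_reaction(
    rows: list[tuple[int, int, int]],
) -> dict[int, dict[int, int]]:
    """Group coefficients by reaction: reaction id -> {compound id: n}."""
    table: dict[int, dict[int, int]] = defaultdict(dict)
    for reaction_id, compound_id, n in rows:
        table[reaction_id][compound_id] = n
    return dict(table)
```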

ReactionReference

source/target: Reaction (symmetric)

class chemrecon.ReactionReference(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, Reaction]

Inter- or intra-database reference between reactions.

src: SourceDatabase

The source of the reference.

ReactionIsA / ReactionHasInstance

source/target: Reaction

class chemrecon.ReactionIsA(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, Reaction]

Hierarchical relation which indicates that a reaction is a member of a class of reactions.

src: SourceDatabase

The source of the relation

class chemrecon.ReactionHasInstance(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Reaction, Reaction]

Inverse of ReactionIsA: hierarchical relation which indicates that a class of reactions has a given reaction as a member.

src: SourceDatabase

The source of the relation

ReactionHasNewID / ReactionHasOldID

source/target: Reaction

class chemrecon.ReactionHasNewID(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, Reaction]

Indicates correspondence between identifiers in different database versions. Can be used to resolve deprecated identifiers.

src: SourceDatabase

The source of the relation

class chemrecon.ReactionHasOldID(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Reaction, Reaction]

Indicates correspondence between identifiers in different database versions. Can be used to resolve deprecated identifiers.

src: SourceDatabase

The source of the relation

Enzyme Relations

EnzymeCatalyzesReaction / ReactionHasEnzyme

source: Enzyme, target: Reaction

class chemrecon.EnzymeCatalyzesReaction(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Enzyme, Reaction]

Relates an enzyme entry to a reaction which it catalyses.

src: SourceDatabase

The source of the relation

class chemrecon.ReactionHasEnzyme(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, Enzyme]

Relates a reaction entry to the given enzyme which catalyses the reaction.

src: SourceDatabase

The source of the relation

EnzymeIsA / EnzymeHasInstance

source/target: Enzyme

class chemrecon.EnzymeIsA(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Enzyme, Enzyme]

Hierarchical relation which indicates that an enzyme is a member of a class of enzymes. Because EC numbers are hierarchical, this relation can be resolved automatically from the number itself, e.g. 1.2.3.4 is_a 1.2.3.

src: SourceDatabase

The source of the relation

class chemrecon.EnzymeHasInstance(src: SourceDatabase, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[Enzyme, Enzyme]

Inverse of EnzymeIsA: hierarchical relation which indicates that a class of enzymes has a given enzyme as a member. Because EC numbers are hierarchical, this relation can be resolved automatically from the number itself, e.g. 1.2.3.4 is_a 1.2.3.

src: SourceDatabase

The source of the relation
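The example 1.2.3.4 is_a 1.2.3 suggests that the parent class is obtained by dropping the last component of the EC number. A minimal sketch of that rule follows; ec_parent is a hypothetical helper, and it assumes plain dot-separated numbers without placeholder components such as "-".

```python
from __future__ import annotations


def ec_parent(ec_number: str) -> str | None:
    """Return the enclosing EC class, or None for a top-level class."""
    parts = ec_number.split(".")
    if len(parts) <= 1:
        return None  # a single component has no parent
    return ".".join(parts[:-1])
```

Applying the function repeatedly walks up the hierarchy, from 1.2.3.4 to 1.2.3, then 1.2, then 1.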

Molecular Structure Representation Relations

CompoundHasStructureRepresentation

class chemrecon.CompoundHasStructureRepresentation(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Compound, MolStructureRepr]

Relates a compound entry to the structure representation (e.g. SMILES, InChI) given by that database.

src: SourceDatabase

The source database containing this representation.

Atom-to-Atom Map Representation Relations

ReactionHasAAMRepr

class chemrecon.ReactionHasAAMRepr(src: SourceDatabase = unknown, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[Reaction, AAMRepr]

Relates a reaction entry to the given AAM representation (e.g. RXN file).

src: SourceDatabase

The source database containing this AAM.

AAMReprInvolvesMolStructureRepr / MolStructureReprParticipatesInAAMRepr

class chemrecon.AAMReprInvolvesMolStructureRepr(index: int, stoich: int, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[AAMRepr, MolStructureRepr]

Relates the molecular structure representation involved in an atom-to-atom-mapped reaction. These molecular structures are implicitly a part of the AAM structure.

index: int

Index in the mapping string

stoich: int

The stoichiometric coefficient of the MolStructureRepr

class chemrecon.MolStructureReprParticipatesInAAMRepr(index: int, stoich: int, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[MolStructureRepr, AAMRepr]

Relates the molecular structure representation involved in an atom-to-atom-mapped reaction. These molecular structures are implicitly a part of the AAM structure.

index: int

Index in the mapping string

stoich: int

The stoichiometric coefficient of the MolStructureRepr

Molecular Structure Relations

CompoundHasMolStructure

class chemrecon.CompoundHasMolStructure(rel_1: CompoundHasStructureRepresentation, rel_2: MolStructureConvert, intermediate: MolStructureRepr, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: ComposedRelation[Compound, MolStructure, MolStructureRepr]

This relation gives the associated molecular structures of a compound in a standardized format. It is composed of the CompoundHasStructureRepresentation and MolStructureConvert relations. So if a compound entry includes an InChI or Molfile representation of a structure, this relation gives the same structure in the standardized, SMILES-based format.

src: SourceDatabase

The source database containing this structure.
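A composed relation can be thought of as a two-step walk through an intermediate entry. The sketch below illustrates this with plain dictionaries; the edge tables, the ids and the compose function are hypothetical and only mirror the Compound, MolStructureRepr, MolStructure chain described above.

```python
from __future__ import annotations

# Hypothetical edges keyed by recon_id.
compound_to_repr = {1: [20, 21]}            # Compound -> MolStructureRepr
repr_to_structure = {20: [300], 21: [300]}  # MolStructureRepr -> MolStructure


def compose(source_id: int) -> list[tuple[int, int]]:
    """Return (intermediate, target) pairs reachable in two steps."""
    return [
        (mid, target)
        for mid in compound_to_repr.get(source_id, [])
        for target in repr_to_structure.get(mid, [])
    ]
```

Keeping the intermediate id in each pair shows which representation led to a given structure, so two representations that convert to the same structure remain distinguishable.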

MolStructureStandardization

class chemrecon.MolStructureStandardization(feat: FeatureEnum, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: ProceduralRelation[MolStructure, MolStructure]

Standardization of molecular structures according to a particular ‘feature’ (Fragment, Isotope, Charge, Tautomerism, Stereochemical).

feat: FeatureEnum

The feature w.r.t. which the structure is standardized.

classmethod generate(take_entry: MolStructure) list[tuple[ProceduralRelation[MolStructure, MolStructure], MolStructure]]

Given a source entry (of type T1), generate the relations originating from it, each paired with its target entry.

MolStructureConvert

class chemrecon.MolStructureConvert(recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: ProceduralRelation[MolStructureRepr, MolStructure]

Standardized and canonical conversion of molecular structure representations (e.g. SMILES, InChI, …) into a consistent format, stored as a canonical SMILES string.

classmethod generate(take_entry: MolStructureRepr) list[tuple[ProceduralRelation[MolStructureRepr, MolStructure], MolStructure]]

Given a source entry (of type T1), generate the relations originating from it, each paired with its target entry.

Atom-to-Atom Map Relations

ReactionHasAAM

class chemrecon.ReactionHasAAM(rel_1: ReactionHasAAMRepr, rel_2: AAMConvert, intermediate: AAMRepr, recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: ComposedRelation[Reaction, AAM, AAMRepr]

This relation gives the associated AAM of a reaction in a standardized format. This relation is composed of the ReactionHasAAMRepr relation and the AAMConvert relations. So if a reaction entry includes an RXN or GML representation of a map, this relation gives the same map in the standardized, ReactionSMILES-based format.

src: SourceDatabase

The source database specifying this AAM.

AAMConvert

class chemrecon.AAMConvert(recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: ProceduralRelation[AAMRepr, AAM]

Standardized and canonical conversion of AAM representations (e.g. reaction SMILES, MolFile, …) into a consistent format, stored as a canonical reaction SMILES string.

classmethod generate(take_entry: AAMRepr) list[tuple[ProceduralRelation[AAMRepr, AAM], AAM]]

Given a source entry (of type T1), generate the relations originating from it, each paired with its target entry.

AAMInvolvesMolStructure / MolStructureParticipatesInAAM

class chemrecon.AAMInvolvesMolStructure(side: int, structure_atom_index_in_aam: list[int], recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: Relation[AAM, MolStructure]

Relates the molecular structures involved in an atom-to-atom-mapped reaction. These molecular structures are implicitly a part of the AAM structure.

side: int

The side of the reaction which contains this structure (-1 or 1)

structure_atom_index_in_aam: list[int]

Atom mapping numbers of the atoms in the structure w.r.t. the AAM.

class chemrecon.MolStructureParticipatesInAAM(side: int, structure_atom_index_in_aam: list[int], recon_id_1: int | None = None, recon_id_2: int | None = None)

Bases: InverseRelation[MolStructure, AAM]

Relates the molecular structures involved in an atom-to-atom-mapped reaction. These molecular structures are implicitly a part of the AAM structure.

side: int

The side of the reaction which contains this structure (-1 or 1)

structure_atom_index_in_aam: list[int]

Atom mapping numbers of the atoms in the structure w.r.t. the AAM.
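The side attribute makes it straightforward to split the structures of an AAM into its two sides. The sketch below uses plain tuples instead of relation objects and does not assume which of -1 and 1 denotes the left-hand side; the row layout and split_by_side are hypothetical.

```python
from __future__ import annotations

# Hypothetical rows: (structure recon_id, side, atom map numbers).
rows = [(5, -1, [1, 2]), (6, 1, [1]), (7, 1, [2])]


def split_by_side(
    rows: list[tuple[int, int, list[int]]],
) -> dict[int, list[int]]:
    """Return {side: [structure ids]} for the two sides -1 and 1."""
    sides: dict[int, list[int]] = {-1: [], 1: []}
    for structure_id, side, _atom_map in rows:
        sides[side].append(structure_id)
    return sides
```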

Entry Graph Relations

These relations are based on the results of the entry graph scoring algorithm.

CompoundSelectStructure

class chemrecon.CompoundSelectStructure(recon_id_1: int | None = None, recon_id_2: int | None = None, score: float = 0.0)

Bases: ProceduralRelationEG[Compound, MolStructure]

Procedural relation which computes the most related molecular structures for a given compound using entry graph scoring.

score: float

The score of the target entry in the entry graph.

recon_id_1: int | None

Recon ID of source

recon_id_2: int | None

Recon ID of target
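Given the scores attached to these relations, a caller might keep only the best-scoring targets. The sketch below assumes that a higher score means a better match, which this page does not state explicitly; best_targets is a hypothetical helper operating on (target recon_id, score) pairs.

```python
from __future__ import annotations


def best_targets(scored: list[tuple[int, float]]) -> list[int]:
    """Return the target ids sharing the highest score (empty input gives [])."""
    if not scored:
        return []
    top = max(score for _target, score in scored)
    return [target for target, score in scored if score == top]
```

Returning a list rather than a single id keeps ties visible instead of silently picking one of several equally scored structures.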

ReactionSelectAAM

class chemrecon.ReactionSelectAAM(recon_id_1: int | None = None, recon_id_2: int | None = None, score: float = 0.0)

Bases: ProceduralRelationEG[Reaction, AAM]

Procedural relation which computes the most related AAMs for a given reaction using entry graph scoring.

score: float

The score of the target entry in the entry graph.

recon_id_1: int | None

Recon ID of source

recon_id_2: int | None

Recon ID of target