GEMtractor
-
class modules.gemtractor.gemtractor.GEMtractor(sbml_file)
Bases: object
Main class for the GEMtractor
It reads an SBML file, trims entities, and extracts the encoded network.
Parameters: sbml_file (str) – path to the SBML file
_GEMtractor__find_genes(reaction)
Look for genes associated with a reaction.
Checks for gene annotations made with the FBC package first; otherwise it evaluates the reaction’s notes. If there are still no valid gene associations, the function assumes there is an undocumented catalyst and falls back to a gene with the reaction’s identifier (prefixed with ‘reaction_’).
Parameters: reaction (libsbml:Reaction) – the SBML S-Base reaction
Returns: the gene associations
Return type: str
-
_GEMtractor__get_expression_parser()
Get a parser for the gene-association strings of reactions.
Returns: the expression parser
Return type: pyparsing:ParserElement
-
_extract_genes_from_sbml_notes(annotation, default)
Extract the genes from a free-text SBML note.
Expects the gene associations as
<p>GENE_ASSOCIATION: a and (b or c)</p>
Parameters:
- annotation (str) – the annotation string
- default (str) – the default to return if no gene associations were found
Returns: the gene associations
Return type: str
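The note format above can be matched with a regular expression. The following is a minimal sketch only, not GEMtractor's implementation: the function name extract_genes_from_notes, the pattern, and the case-insensitive matching are assumptions made for illustration.

```python
import re

# Assumed pattern: a <p> element whose text starts with "GENE_ASSOCIATION:".
_GENE_ASSOCIATION = re.compile(
    r"<p>\s*GENE_ASSOCIATION:\s*([^<]*)</p>", re.IGNORECASE
)


def extract_genes_from_notes(annotation, default):
    """Return the gene-association string found in the notes, or `default`."""
    match = _GENE_ASSOCIATION.search(annotation)
    if match is None:
        return default
    genes = match.group(1).strip()
    # An empty association is treated like a missing one (an assumption).
    return genes if genes else default


notes = "<notes><body><p>GENE_ASSOCIATION: a and (b or c)</p></body></notes>"
print(extract_genes_from_notes(notes, "reaction_r1"))
print(extract_genes_from_notes("<notes/>", "reaction_r1"))
```

The second call has no matching paragraph, so it returns the default. The default used here mirrors the documented fallback of a gene named after the reaction identifier.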
-
_get_genes(reaction)
Get the genes associated with a reaction.
Caches the gene associations. If nothing is cached for the reaction, it runs _GEMtractor__find_genes(), _parse_expression(), and _unfold_complex_expression() to find them.
Parameters: reaction – the SBML S-Base reaction
Returns: list of GeneComplexes catalyzing the reaction
Return type: list of network.genecomplex.GeneComplex
-
_implode_genes(genes)
Implode a list of genes into a proper logical expression.
Joins the list with ‘or’, making sure every item is enclosed in brackets.
Parameters: genes (list of network.genecomplex.GeneComplex) – the list of optional genes
Returns: the logical expression (genes joined using ‘or’)
Return type: str
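The joining step can be illustrated with a small sketch. It does not use the real GeneComplex class: each complex is represented here as a plain list of gene identifiers joined with ‘and’, which is an assumption, as is the function name implode_genes.

```python
def implode_genes(complexes):
    """Join alternative gene complexes with 'or', bracketing every item.

    `complexes` is a list of lists of gene identifiers; this stands in for
    a list of GeneComplex objects (hypothetical representation).
    """
    return " or ".join("(" + " and ".join(genes) + ")" for genes in complexes)


print(implode_genes([["a", "b"], ["c"]]))  # (a and b) or (c)
```

Bracketing every item, including single genes, keeps the resulting expression unambiguous whatever the operator precedence of the reader.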
-
_overwrite_genes_in_sbml_notes(new_genes, reaction)
Set the gene associations in the notes of an SBML reaction.
Overwrites them if they were already set.
Parameters:
- new_genes (str) – the new gene associations
- reaction – the SBML reaction
-
_parse_expression(expression)
Parse a gene-association expression.
Uses the expression parser from _GEMtractor__get_expression_parser().
Parameters: expression (str) – the gene-association expression
Returns: the parse result
Return type: pyparsing:ParseResults
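To show what parsing a gene-association expression involves, here is a self-contained sketch. It is a hand-written recursive-descent parser, not the pyparsing-based parser the class uses, and it returns nested tuples rather than pyparsing ParseResults. The function name parse_expression, the tuple layout, and the precedence of ‘and’ over ‘or’ are assumptions.

```python
import re


def parse_expression(expression):
    """Parse 'a and (b or c)' into nested ('and' | 'or', [children]) tuples."""
    # Tokens are brackets or runs of characters that are neither
    # whitespace nor brackets (gene identifiers and the two operators).
    tokens = re.findall(r"\(|\)|[^\s()]+", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        pos += 1
        return token

    def parse_atom():
        token = take()
        if token == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if token == ")" or token.lower() in ("and", "or"):
            raise ValueError("unexpected token: " + token)
        return token  # a gene identifier

    def parse_and():
        parts = [parse_atom()]
        while peek() is not None and peek().lower() == "and":
            take()
            parts.append(parse_atom())
        return parts[0] if len(parts) == 1 else ("and", parts)

    def parse_or():
        parts = [parse_and()]
        while peek() is not None and peek().lower() == "or":
            take()
            parts.append(parse_and())
        return parts[0] if len(parts) == 1 else ("or", parts)

    tree = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: " + peek())
    return tree


print(parse_expression("a and (b or c)"))
```

A single gene identifier parses to a plain string, and malformed input such as a dangling operator raises ValueError.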
-
_set_genes_in_sbml(genes, reaction)
Set the genes of a reaction in an SBML model.
Tries to set the genes using the FBC package, if supported; otherwise sets the genes in the reaction’s notes. If genes already exist, they will be overwritten.
Parameters:
- genes (list of network.genecomplex.GeneComplex) – the new list of gene associations
- reaction – the reaction to annotate
-
_unfold_complex_expression(parseresult)
Unfold a gene-association parse result.
Takes a parse result and unfolds it into a list of alternative gene complexes, each of which can catalyze the reaction.
Parameters: parseresult – the result of the expression parser
Returns: the list of gene complexes catalyzing the reaction
Return type: list of network.genecomplex.GeneComplex
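Unfolding amounts to expanding the logical expression into its alternatives: ‘or’ concatenates alternatives, and ‘and’ combines one alternative from each operand. The sketch below illustrates this on a hypothetical nested-tuple tree instead of pyparsing ParseResults, with complexes as sorted lists of gene identifiers instead of GeneComplex objects; both representations and the function name unfold are assumptions.

```python
from itertools import product


def unfold(node):
    """Return the alternative complexes encoded by an and/or tree.

    A node is either a gene identifier (str) or a tuple
    (operator, [children]) with operator 'and' or 'or'.
    """
    if isinstance(node, str):
        return [[node]]
    operator, children = node
    unfolded = [unfold(child) for child in children]
    if operator == "or":
        # Alternatives of all children are alternatives of the node.
        return [cx for alternatives in unfolded for cx in alternatives]
    if operator == "and":
        # Pick one alternative per child and merge them into one complex.
        return [
            sorted({gene for cx in combo for gene in cx})
            for combo in product(*unfolded)
        ]
    raise ValueError("unknown operator: " + str(operator))


tree = ("and", ["a", ("or", ["b", "c"])])
print(unfold(tree))  # [['a', 'b'], ['a', 'c']]
```

For ‘a and (b or c)’ this yields two alternative complexes, a with b and a with c, either of which would be able to catalyze the reaction.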
-
extract_network_from_sbml()
Extract the network from the SBML model.
Goes through the SBML file and converts the (remaining) entities into GEMtractor’s own network structure.
Returns: the network (after optional trimming)
Return type: network.network.Network
-
get_gene_product_annotations(gene)
Get the annotations of a gene product.
If the document is annotated using the FBC package, there is a good chance of finding more information about a gene in the gene product’s annotations.
Parameters: gene (str) – the label of the gene product of interest
Returns: the annotations of the gene product labeled gene, or None if there are no such annotations
Return type: xml str
-
get_reaction_annotations(reactionid)
Get the annotations of an SBML reaction.
Parameters: reactionid (str) – the identifier of the reaction in the SBML document
Returns: the annotations of that reaction
Return type: xml str
-
get_sbml(filter_species=[], filter_reactions=[], filter_genes=[], filter_gene_complexes=[], remove_reaction_enzymes_removed=True, remove_ghost_species=False, discard_fake_enzymes=False, remove_reaction_missing_species=False, removing_enzyme_removes_complex=True)
Get a filtered SBML document from the model file.
Parses the SBML file, applies the trimming according to the arguments, and returns the trimmed model.
Parameters:
- filter_species (list of str) – species identifiers to get rid of
- filter_reactions (list of str) – reaction identifiers to get rid of
- filter_genes (list of str) – enzyme identifiers to get rid of
- filter_gene_complexes (list of str) – enzyme-complex identifiers to get rid of; every list item should be of the format ‘A + B + gene42’
- remove_reaction_enzymes_removed (bool) – should a reaction be removed if all its genes were removed?
- remove_ghost_species (bool) – should species that no longer participate in any reaction be removed, even though they might be required by other entities?
- discard_fake_enzymes (bool) – should fake enzymes (implicitly assumed enzymes, if no enzymes are annotated to a reaction) be removed?
- remove_reaction_missing_species (bool) – should a reaction be removed if one of its participating species was removed?
- removing_enzyme_removes_complex (bool) – if an enzyme is removed, should all enzyme complexes in which it participates be removed as well?
Returns: the SBML document
Return type:
-