merrin is a Python3 tool to compute metabolic regulatory rules from time series observations.
This implementation relies on merrinasp, an extension of the Answer Set Programming (ASP) solver clingo with quantified linear constraints.
To install the merrin package from the GitHub repository, run the pip command:
python3.X -m pip install git+https://github.com/bioasp/merrin
merrin can be used in the terminal as follows:
merrin [-h] -sbml SBML -pkn PKN -obj OBJ -obs OBS [-n NBSOL] [--lpsolver {glpk,gurobi}] [--timelimit TIMELIMIT]
[--optimization {all,subsetmin}] [--projection {network,node,trace}]
Inferred regulatory rules are displayed in the terminal in CSV format, see examples below.
Mandatory arguments:
-sbml SBML, --SBML SBML
Metabolic network in SBML file format.
-pkn PKN, --PKN PKN Prior Knowledge Network.
-obj OBJ, --objective-reaction OBJ
Objective reaction.
-obs OBS, --observations OBS
JSON file describing the input timeseries.
Optional arguments:
-n NBSOL
Number of solutions to enumerate (default: 0 for all)
--lpsolver {glpk,gurobi}
Linear solver to use (default: glpk)
--timelimit TIMELIMIT
Timelimit for each resolution, -1 if none (default: -1)
--optimization {all,subsetmin}
Select optimization mode: all networks or subset minimal ones only (default: subsetmin)
--projection {network,node,trace}
Select projection mode (default: network):
- network: enumerate all the rules of each network;
- node: enumerate the candidate rules for each node;
- trace: enumerate classes of networks with equivalent rFBA traces
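The options above can also be assembled from Python before handing them to the terminal. The following is only a minimal sketch: the helper name build_merrin_command and the file names are hypothetical, and only the flags and choices listed above are used.

```python
import shlex

# Choices copied from the usage text above.
LP_SOLVERS = {"glpk", "gurobi"}
OPTIMIZATIONS = {"all", "subsetmin"}
PROJECTIONS = {"network", "node", "trace"}


def build_merrin_command(sbml, pkn, objective, observations, nbsol=0,
                         lpsolver="glpk", timelimit=-1,
                         optimization="subsetmin", projection="network"):
    """Return the argument list for a merrin call (hypothetical helper)."""
    if lpsolver not in LP_SOLVERS:
        raise ValueError(f"unknown LP solver: {lpsolver}")
    if optimization not in OPTIMIZATIONS:
        raise ValueError(f"unknown optimization mode: {optimization}")
    if projection not in PROJECTIONS:
        raise ValueError(f"unknown projection mode: {projection}")
    return [
        "merrin",
        "-sbml", sbml,
        "-pkn", pkn,
        "-obj", objective,
        "-obs", observations,
        "-n", str(nbsol),
        "--lpsolver", lpsolver,
        "--timelimit", str(timelimit),
        "--optimization", optimization,
        "--projection", projection,
    ]


# Placeholder file names and objective reaction, for illustration only.
command = build_merrin_command("network.sbml", "pkn.txt", "Growth",
                               "observations.json", projection="node")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run once merrin is installed; the sketch itself does not launch anything.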
The metabolic network should be in SBML (Systems Biology Markup Language) version 3 format.
Prior Knowledge Network (PKN) is a text file where each line is such that:
node_1 sign node_2
with:
- node_1 and node_2 are two components of the regulatory or metabolic systems.
- sign in (0, -1, 1) such that:
  - -1 is an inhibition effect of node_1 on node_2;
  - 1 is an activation effect of node_1 on node_2;
  - 0 is an unknown effect (either activation or inhibition effect) of node_1 on node_2.
Examples
Carbon1 0 RPcl
RPcl 1 Tc2
Tc2 -1 RPcl
In this example, the regulatory rule of RPcl depends on an unknown interaction with Carbon1 and an inhibition effect of Tc2.
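To make the PKN format concrete, here is a minimal sketch of a reader for it. The functions parse_pkn and regulators_of are hypothetical helpers written for this illustration, not part of merrin; the sketch assumes whitespace-separated fields and skips blank lines.

```python
def parse_pkn(text):
    """Parse PKN lines of the form 'node_1 sign node_2' into edge tuples."""
    edges = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines (assumption of this sketch)
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields")
        source, sign_text, target = fields
        if sign_text not in {"-1", "0", "1"}:
            raise ValueError(f"line {line_number}: sign must be -1, 0 or 1")
        edges.append((source, int(sign_text), target))
    return edges


def regulators_of(edges, node):
    """Map each regulator of `node` to the sign of its effect."""
    return {source: sign for source, sign, target in edges if target == node}


PKN_TEXT = """Carbon1 0 RPcl
RPcl 1 Tc2
Tc2 -1 RPcl
"""

edges = parse_pkn(PKN_TEXT)
print(regulators_of(edges, "RPcl"))  # {'Carbon1': 0, 'Tc2': -1}
```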
merrin is compatible with any combination of the following data types: kinetics, fluxomics and transcriptomics.
The observations can be noisy. Note that it is preferable to leave out observations that are not certain.
Observations are described in a JSON file. Each time series observation is defined as follows:
{
"file": "path/to/the/csv/file",
"type": ["Kinetics","Fluxomics","Transcriptomics"], <- any non-empty subset
"constraints": {
"mutations": {
"node_u": true, <- forced activation
"node_v": false, <- forced inhibition
},
"bounds": {
"reaction": [lower_bound, upper_bound]
}
}
}
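A single observation entry can be sanity-checked before running the tool. The sketch below is not merrin's own validation: check_observation is a hypothetical helper, and it assumes that the "constraints" block may be omitted and that a lower bound should not exceed its upper bound, neither of which is stated above.

```python
import json

ALLOWED_TYPES = {"Kinetics", "Fluxomics", "Transcriptomics"}


def check_observation(entry):
    """Return a list of problems found in one observation entry."""
    problems = []
    if not isinstance(entry.get("file"), str):
        problems.append("'file' must be a path string")
    types = entry.get("type")
    if not isinstance(types, list) or not types:
        problems.append("'type' must be a non-empty list")
    elif not set(types) <= ALLOWED_TYPES:
        problems.append("'type' contains an unknown data type")
    # Assumption: a missing 'constraints' block means no constraint.
    constraints = entry.get("constraints", {})
    for node, state in constraints.get("mutations", {}).items():
        if not isinstance(state, bool):
            problems.append(f"mutation of {node} must be true or false")
    for reaction, bounds in constraints.get("bounds", {}).items():
        is_pair = isinstance(bounds, list) and len(bounds) == 2
        if not is_pair or not all(isinstance(b, (int, float)) for b in bounds):
            problems.append(f"bounds of {reaction} must be two numbers")
        elif bounds[0] > bounds[1]:  # assumption: lower <= upper
            problems.append(f"bounds of {reaction} are inverted")
    return problems


entry = json.loads("""
{
  "file": "path/to/the/csv/file",
  "type": ["Kinetics", "Transcriptomics"],
  "constraints": {
    "mutations": {"node_u": true, "node_v": false},
    "bounds": {"reaction": [0.0, 10.0]}
  }
}
""")
print(check_observation(entry))  # []
```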
The CSV file describing the observation needs to have a Time column with an integer timestamp for each observed time step.
For kinetics and fluxomics data types:
- Metabolites: real values, modeling the metabolite concentration in the substrate.
- The file needs to contain a biomass column with the measured value of the biomass.
For fluxomics data types:
- Reaction: real values, modeling the reaction activity rates in the metabolic network.
For transcriptomics data types:
- All values are binary (0 or 1), modeling the activity (1) or inactivity (0) of a component (metabolite, reaction, regulatory nodes).
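As an illustration of these column requirements, the sketch below checks a transcriptomics-only CSV: a Time column with integer timestamps and binary values everywhere else. The helper check_transcriptomics_csv and the sample data are hypothetical; the column names are borrowed from the PKN example above.

```python
import csv
import io


def check_transcriptomics_csv(text):
    """Return a list of problems found in a transcriptomics CSV."""
    reader = csv.DictReader(io.StringIO(text))
    if "Time" not in (reader.fieldnames or []):
        return ["missing Time column"]
    problems = []
    # Data rows start at line 2, after the header.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if column == "Time":
                try:
                    int(value)
                except (TypeError, ValueError):
                    problems.append(f"line {row_number}: Time is not an integer")
            elif value not in ("0", "1"):
                problems.append(f"line {row_number}: {column} is not binary")
    return problems


# Made-up observation, for illustration only.
SAMPLE = """Time,Carbon1,RPcl
0,1,0
1,1,1
2,0,1
"""
print(check_transcriptomics_csv(SAMPLE))  # []
```

Kinetics and fluxomics files would need a different check, since their metabolite, reaction and biomass columns hold real values.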
merrin generates a CSV file describing the inferred regulatory networks.
A rule set to 1 represents a constant value (i.e. always activated) for which no regulatory rules are necessary to explain the component dynamics.
Remark 1: If no regulatory networks are returned, then the instance is unsatisfiable.
Try changing the max_gap and max_error variables before launching merrin again.
Remark 2: For unsatisfiable instances with kinetics and/or fluxomics data, launching merrin with the observation declared as transcriptomics data only can sometimes allow inferring some regulatory networks.
Regulatory rules are returned in disjunctive normal form (DNF) with the following syntax:
R := 1 || C || (C_1 | ... | C_n)
C := L || (L_1 & ... & L_m)
L := N || !N
N := regulatory component name
with ! denoting the negation, & the logical and, and | the logical or.
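Because the rules are in DNF, a rule string can be evaluated against a Boolean state with very little code. The sketch below is an illustration only, not merrin's parser: parse_rule and evaluate_rule are hypothetical helpers, and they rely on the fact that, in DNF, dropping the parentheses does not change the meaning since & binds tighter than |.

```python
def parse_rule(rule):
    """Turn a DNF rule string into a list of clauses, or None for '1'.

    Each clause is a list of (component_name, expected_value) pairs.
    """
    flat = rule.replace("(", "").replace(")", "").replace(" ", "")
    if flat == "1":
        return None  # constant rule: always activated
    clauses = []
    for clause in flat.split("|"):
        literals = []
        for literal in clause.split("&"):
            negated = literal.startswith("!")
            literals.append((literal.lstrip("!"), not negated))
        clauses.append(literals)
    return clauses


def evaluate_rule(rule, state):
    """Evaluate a rule on `state`, a dict from component name to bool."""
    clauses = parse_rule(rule)
    if clauses is None:
        return True
    return any(
        all(state[name] == expected for name, expected in clause)
        for clause in clauses
    )


print(evaluate_rule("!RPb", {"RPb": False}))
print(evaluate_rule("(A & !B) | C", {"A": True, "B": True, "C": False}))
```

With these inputs the first call yields True (RPb is inactive) and the second yields False (neither clause is satisfied). The component names A, B and C are placeholders.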
Examples are provided in ./examples.
- The instance ./examples/instances/toy has been generated from the regulatory metabolic network and the experiments described in (Thuillier et al., 2021).
- The instance ./examples/instances/core-regulated has been generated from the regulatory metabolic network and the experiments described in (Covert et al., 2001).
- The instance ./examples/instances/large-scale has been generated from the regulatory metabolic network and the experiments described in (Covert et al., 2002).
To solve the core-regulated instance using the console command, see the bash file: ./examples/run-merrin.sh.
It can be executed with:
sh ./examples/run-merrin.sh
To solve the core-regulated instance from a Python script using merrin, check the Jupyter notebook: ./examples/notebook-merrin.ipynb.
Network projection: Infer regulatory networks.
Each row of the displayed CSV is a regulatory network and each column gives the rule of a given regulatory component.
Example 1: Network projection + All optimization
R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,1,!RPO2,1,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,1,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,!RPO2,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,!RPO2,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,!RPO2,!RPcl
...
Only the first 4 inferred regulatory networks are shown.
The node R2b is always set to 1: it does not have any regulatory rule and is therefore always activated.
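The network projection output can be read back with a standard CSV reader. The sketch below uses the four networks shown in Example 1 and lists the nodes whose rule is 1 in every network; read_networks and unregulated_nodes are hypothetical helpers written for this illustration.

```python
import csv
import io

# The four networks displayed in Example 1.
NETWORKS_CSV = """R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,1,!RPO2,1,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,1,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,!RPO2,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
!RPb,1,!RPO2,!RPO2,!RPb,!RPh,!Oxygen,R2b,Carbon1,Hext,!RPO2,!RPcl
"""


def read_networks(text):
    """Read each CSV row as one network: a dict from node to rule."""
    return list(csv.DictReader(io.StringIO(text)))


def unregulated_nodes(networks):
    """Nodes whose rule is the constant 1 in every network."""
    return [
        node for node in networks[0]
        if all(network[node] == "1" for network in networks)
    ]


networks = read_networks(NETWORKS_CSV)
print(unregulated_nodes(networks))  # ['R2b']
```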
Example 2: Network projection + Subset minimal optimization
R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,1,1,1,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
Node projection: Infer possible regulatory rules for each regulatory component. It will only output 1 row. Each cell contains a set of compatible regulatory rules separated by ';'.
Example 3: Node projection + All optimization
R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,!RPO2;1,!RPO2;1;RPO2,!RPb;1,!RPh,!Oxygen,R2b,Carbon1,Hext,!RPO2;1,!RPcl
The node R5a has 2 possible regulatory rules: !RPO2 or 1 (unregulated).
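For the node projection, each cell has to be split on ';' to recover the candidate rules. The sketch below does this for the row of Example 3; read_candidates is a hypothetical helper, not part of merrin.

```python
import csv
import io

# The single row displayed in Example 3.
NODE_CSV = """R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,!RPO2;1,!RPO2;1;RPO2,!RPb;1,!RPh,!Oxygen,R2b,Carbon1,Hext,!RPO2;1,!RPcl
"""


def read_candidates(text):
    """Map each node to its list of candidate rules."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {node: cell.split(";") for node, cell in row.items()}


candidates = read_candidates(NODE_CSV)
print(candidates["R5a"])  # ['!RPO2', '1']

# Nodes for which the observations leave more than one candidate rule.
ambiguous = [node for node, rules in candidates.items() if len(rules) > 1]
print(ambiguous)  # ['R5a', 'R5b', 'R7', 'Rres']
```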
Example 4: Node projection + Subset minimal optimization
R2a,R2b,R5a,R5b,R7,R8a,RPO2,RPb,RPcl,RPh,Rres,Tc2
!RPb,1,1,1,1,!RPh,!Oxygen,R2b,Carbon1,Hext,1,!RPcl
Trace projection: Infer possible regulatory rules for each rFBA trace compatible with the observations.
Each row of the displayed CSV is a class of regulatory networks compatible with an rFBA trace compatible with the observations.
Each cell contains a set of compatible regulatory rules separated by ';' for a dedicated node.
Remark: For the core-regulated instance, it yields the same CSV as the network projection.
To cite this tool:
Kerian Thuillier, Caroline Baroukh, Alexander Bockmayr, Ludovic Cottret, Loïc Paulevé, Anne Siegel, MERRIN: MEtabolic regulation rule INference from time series data, Bioinformatics, Volume 38, Issue Supplement_2, September 2022, Pages ii127–ii133, https://doi.org/10.1093/bioinformatics/btac479