MEIGOR: a software suite based on metaheuristics for global optimization in systems biology and bioinformatics


Introduction

MEIGO is an optimization suite programmed in R which implements metaheuristics for solving different nonlinear optimization problems in both continuous and integer domains arising in systems biology, bioinformatics and other areas. It consists of two main metaheuristics: the enhanced scatter search method, eSSR (Egea et al. 2009; Egea, Martí, and Banga 2010) for continuous and mixed-integer problems, and the variable neighbourhood search metaheuristic (Hansen, Mladenović, and Moreno-Pérez 2010), for integer problems.

Metaheuristics are useful for solving problems that, due to their complexity, cannot be solved by deterministic global optimization solvers, or for which a local solver systematically converges to local solutions. Metaheuristics do not guarantee optimality, but they are usually efficient at locating the vicinity of the global solution in modest computational time.

Both eSS and VNS are hybrid solvers, meaning that the stochastic optimization methods are combined with local solvers to improve efficiency. Both have been implemented in R and can be invoked within the MEIGO framework. This manual describes both methods and their corresponding parallelizable versions, and guides users through implementing their own optimization problems and choosing the best set of options for a particular instance.

Installing MEIGOR

MEIGOR can be installed from Bioconductor by typing:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("MEIGOR")

Continuous and mixed-integer problems: Enhanced Scatter Search (eSSR)

eSSR is an R implementation of the enhanced scatter search method (Egea et al. 2009; Egea, Martí, and Banga 2010) which is part of the MEIGO toolbox for global optimization in bioinformatics. It is a metaheuristic which seeks the global minimum of mixed-integer nonlinear programming (MINLP) problems specified by

\[\begin{equation} \min\limits_{x}f(x,p_1,p_2,...,p_n) \end{equation}\]

subject to

\[\begin{aligned}
c_{eq}(x) = 0\\
c_L \leq c(x) \leq c_U\\
x_L \leq x \leq x_U
\end{aligned}\]

where \(x\) is the vector of decision variables, and \(x_L\) and \(x_U\) its respective bounds. \(p_1,\ldots,p_n\) are optional extra input parameters to be passed to the objective function (see example in Section 3.3.3). \(c_{eq}\) is a set of equality constraints. \(c(x)\) is a set of inequality constraints with lower and upper bounds, \(c_L\) and \(c_U\). Finally, \(f(x,p_1,p_2,...,p_n)\) is the objective function to be minimized.

Quick start: How to carry out an optimization with eSSR

library(MEIGOR)
#Toy objective function (illustrative): minimum at the origin
objective<-function(x) x[1]^2+x[2]^2
problem<-list(f=objective, x_L=rep(-1,2), x_U=rep(1,2))
opts<-list(maxeval=500, local_solver='DHC')

Type Results<-MEIGO(problem,opts,algorithm="ESS") to perform the optimization. If your problem has additional constant parameters to be passed to the objective function, they are declared as input parameters after opts (e.g., type Results<-MEIGO(problem,opts,algorithm="ESS",p1,p2) if your model has two extra input parameters, \(p_1\) and \(p_2\)).

Regarding the objective function, the input parameter is the decision vector (with extra parameters \(p_1, p_2,\ldots, p_n\) if they were defined before calling the solver). The objective function must provide a scalar output parameter (the objective function value) and, for constrained problems, a second output parameter, which is a vector containing the values of the constraints. For problems containing equality constraints (\(=0\)), they must be defined before the inequality constraints. For a quick reference, consider the following example which will be later extended in section 3.3.3.

\[\begin{aligned}
\min_{x} f(x) = -x_4
\end{aligned}\]

subject to

\[\begin{aligned}
x_4 - x_3 + x_2 - x_1 + k_4 x_4 x_6 &= 0\\
x_1 - 1 + k_1 x_1 x_5 &= 0\\
x_2 - x_1 + k_2 x_2 x_6 &= 0\\
x_3 + x_1 - 1 + k_3 x_3 x_5 &= 0\\
x_5^{0.5} + x_6^{0.5} &\leq 4\\
0 \leq x_1, x_2, x_3, x_4 &\leq 1\\
0 \leq x_5, x_6 &\leq 16
\end{aligned}\]

with \(k_1\), \(k_2\), \(k_3\), \(k_4\) being extra parameters defined before calling the solver. The objective function for this problem would be:

ex3<-function(x,k1,k2,k3,k4){
    f=-x[4]
    #Equality constraints (declared before the inequality ones)
    g<-rep(0,5)
    g[1]=x[4]-x[3]+x[2]-x[1]+k4*x[4]*x[6]
    g[2]=x[1]-1+k1*x[1]*x[5]
    g[3]=x[2]-x[1]+k2*x[2]*x[6]
    g[4]=x[3]+x[1]-1+k3*x[3]*x[5]

    #Inequality constraint
    g[5]=x[5]^0.5+x[6]^0.5
    return(list(f=f,g=g))
}

This objective function can be invoked as in the code below. Let us assume that \(x\) has dimension 6 and we want the value 1 for each dimension. We also define the four extra input parameters \(k_1\) to \(k_4\) with values 0.2, 0.5, -0.1 and 0.9, respectively.

ex3(c(1,1,1,1,1,1),0.2,0.5,-0.1,0.9)

eSSR usage

Problem definition

To solve an optimization problem with eSSR, a list (named problem here) containing the following fields must be defined:

In addition, there are two optional fields:

If the problem contains additional constraints and/or integer or binary variables, the following fields should also be defined:
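As an illustration, the following sketch assembles a problem list from fields used in the examples of this section: the initial point x_0 is an illustrative choice of ours, neq, c_L and c_U reproduce the example of Section 3.3.3, and int_var/bin_var follow the mixed-integer example.

problem<-list(
    f="ex3",                 #objective function (function name or R function)
    x_L=rep(0,6),            #lower bounds of the decision variables
    x_U=c(rep(1,4),16,16),   #upper bounds of the decision variables
    x_0=rep(0.5,6),          #optional: initial point
    neq=4,                   #number of equality constraints
    c_L=-Inf, c_U=4,         #lower/upper bounds of the remaining constraints
    int_var=0, bin_var=0)    #number of integer and binary variables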

User options

The user may define a set of different options related to the optimization problem. They are defined in another list (named opts here) which has the following fields:

Global options

A set of options related to the global search phase of the algorithm may also be defined within the list opts:

Local options

eSSR is a global optimization method which performs local searches from selected initial points to accelerate the convergence to optimal solutions. Some options regarding the local search can be defined in the opts list, as follows:

Note that, for some problems, the local search may be inefficient, spending a large amount of computation time to provide low-quality solutions; this is the case for many noisy or ill-posed problems. In such instances, the local search may be deactivated by the user by setting the local solver field to zero (0).
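As a sketch, the following opts list combines global and local options that appear in the examples of this manual; the comments paraphrase their roles, and the package documentation remains the authoritative reference:

opts<-list(
    maxeval=750,             #maximum number of function evaluations
    maxtime=10,              #maximum CPU time in seconds
    ndiverse=40,             #number of diverse initial solutions
    dim_refset=10,           #size of the scatter search reference set
    local_solver='DHC',      #local solver used during the search
    local_finish='LBFGSB',   #local solver applied to the final solution
    local_n1=2, local_n2=3)  #frequency parameters for the local search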

Output

_eSSR_’s output is a list (called Results here) containing the following fields:

Guidelines for using eSSR

Although the default options of eSSR have been chosen to be robust for a large number of problems, tuning some parameters may increase its efficiency for a particular problem. Below is a list of suggestions for parameter choice depending on the type of problem the user faces.

Examples

In this section we will illustrate the usage of eSSR within MEIGO for solving different instances.

Unconstrained problem

\[\begin{aligned}
\min_{x} f(x) = 4x_1^2 - 2.1x_1^4 + \frac{1}{3}x_1^6 + x_1 x_2 - 4x_2^2 + 4x_2^4
\end{aligned}\]

subject to

\[\begin{aligned}
-1 \leq x_1, x_2 \leq 1
\end{aligned}\]

The objective function is defined in ex1.R. Note that, since this is an unconstrained problem, there is only one output argument, \(f\).

ex1 <- function(x){
y<-4*x[1]*x[1]-2.1*x[1]^4+1/3*x[1]^6+x[1]*x[2]-4*x[2]*x[2]+4*x[2]^4;
return(y)
}

The solver is called in main_ex1.R. This problem has two known global optima at \(x^*=(0.0898, -0.7127)\) and \(x^*=(-0.0898, 0.7127)\) with \(f(x^*)= -1.03163\).

Options set:

#=========================
#PROBLEM SPECIFICATIONS
#=========================
problem<-list(f="ex1",x_L=rep(-1,2),x_U=rep(1,2))
opts<-list(maxeval=500, ndiverse=40, local_solver='DHC',
local_finish='LBFGSB', local_iterprint=1)
#===============================
# END OF PROBLEM SPECIFICATIONS
#===============================
Results<-MEIGO(problem,opts,algorithm="ESS")

Constrained problem

\[\begin{aligned}
\min_{x} f(x) = -x_1 - x_2
\end{aligned}\]

subject to

\[\begin{aligned}
x_2 &\leq 2x_1^4 - 8x_1^3 + 8x_1^2 + 2\\
x_2 &\leq 4x_1^4 - 32x_1^3 + 88x_1^2 - 96x_1 + 36\\
0 &\leq x_1 \leq 3\\
0 &\leq x_2 \leq 4
\end{aligned}\]

The objective function is defined in ex2.R. Note that, since this is a constrained problem, there are two output arguments, \(f\) and \(g\).

ex2<-function(x){
F=-x[1]-x[2]
g<-rep(0,2)
g[1]<-x[2]-2*x[1]^4+8*x[1]^3-8*x[1]^2
g[2]<-x[2]-4*x[1]^4+32*x[1]^3-88*x[1]^2+96*x[1]
return(list(F=F,g=g))
}

The solver is called in main_ex2.R. The global optimum for this problem is located at \(x^*=[2.32952, 3.17849]\) with \(f(x^*)=-5.50801\).

Options set:

#=========================
#PROBLEM SPECIFICATIONS
#=========================
problem<-list(f="ex2",x_L=rep(0,2),x_U=c(3,4),
c_L=rep(-Inf,2), c_U=c(2,36))
opts<-list(maxeval=750, local_solver="DHC", local_n1=2, local_n2=3)
#=============================
#END OF PROBLEM SPECIFICATIONS
#=============================
Results<-MEIGO(problem,opts,algorithm="ESS")

Constrained problem with equality constraints

\[\begin{aligned}
\min_{x} f(x) = -x_4
\end{aligned}\]

subject to

\[\begin{aligned}
x_4 - x_3 + x_2 - x_1 + k_4 x_4 x_6 &= 0\\
x_1 - 1 + k_1 x_1 x_5 &= 0\\
x_2 - x_1 + k_2 x_2 x_6 &= 0\\
x_3 + x_1 - 1 + k_3 x_3 x_5 &= 0\\
x_5^{0.5} + x_6^{0.5} &\leq 4\\
0 \leq x_1, x_2, x_3, x_4 &\leq 1\\
0 \leq x_5, x_6 &\leq 16
\end{aligned}\]

with \(k_1 = 0.09755988\), \(k_3 = 0.0391908\), \(k_2 = 0.99k_1\) and \(k_4 = 0.9k_3\). The objective function is defined in ex3.R. Note that equality constraints must be declared before inequality constraints. Parameters \(k_1,\ldots, k_4\) are passed to the objective function through the main script, so they do not have to be recalculated in every function evaluation. See the input arguments below.

ex3<-function(x,k1,k2,k3,k4){
f=-x[4]
#Equality constraints
    g<-rep(0,5)
    g[1]=x[4]-x[3]+x[2]-x[1]+k4*x[4]*x[6]
    g[2]=x[1]-1+k1*x[1]*x[5]
    g[3]=x[2]-x[1]+k2*x[2]*x[6]
    g[4]=x[3]+x[1]-1+k3*x[3]*x[5]

#Inequality constraint
    g[5]=x[5]^0.5+x[6]^0.5
    return(list(f=f,g=g))
}

The solver is called in main_ex3.R. The global optimum for this problem is located at \(x^*=[0.77152, 0.516994, 0.204189, 0.388811, 3.0355, 5.0973]\) with \(f(x^*)= -0.388811\).

Options set:

#=========================
#PROBLEM SPECIFICATIONS
#=========================
problem<-list(f="ex3",x_L=rep(0,6),x_U=c(rep(1,4),16,16),
neq=4, c_L=-Inf, c_U=4)
opts<-list(maxtime=5, local_solver='solnp')
#=============================
#END OF PROBLEM SPECIFICATIONS
#=============================
k1=0.09755988
k3=0.0391908
k2=0.99*k1
k4=0.9*k3
Results<-MEIGO(problem,opts,algorithm="ESS",k1,k2,k3,k4)

Mixed integer problem

\[\begin{aligned}
\min_{x} f(x) = x_2^2 + x_3^2 + 2x_1^2 + x_4^2 - 5x_2 - 5x_3 - 21x_1 + 7x_4
\end{aligned}\]

subject to

\[\begin{aligned}
x_2^2 + x_3^2 + x_1^2 + x_4^2 + x_2 - x_3 + x_1 - x_4 &\leq 8\\
x_2^2 + 2x_3^2 + x_1^2 + 2x_4^2 - x_2 - x_4 &\leq 10\\
2x_2^2 + x_3^2 + x_1^2 + 2x_2 - x_3 - x_4 &\leq 5\\
0 \leq x_i \leq 10 \quad \forall i
\end{aligned}\]

Integer variables: \(x_2\), \(x_3\) and \(x_4\). In the function declaration in ex4.R they must occupy the last indices of the decision vector.

ex4<-function(x){
    F = x[2]^2 + x[3]^2 + 2.0*x[1]^2 + x[4]^2 - 5.0*x[2] - 5.0*x[3] - 21.0*x[1] + 7.0*x[4]
    g<-rep(0,3)
    g[1] = x[2]^2 + x[3]^2 + x[1]^2 + x[4]^2 + x[2] - x[3] + x[1] - x[4]
    g[2] = x[2]^2 + 2.0*x[3]^2 + x[1]^2 + 2.0*x[4]^2 - x[2] - x[4]
    g[3] = 2.0*x[2]^2 + x[3]^2 + x[1]^2 + 2.0*x[2] - x[3] - x[4]
    return(list(F=F, g=g))
}

The solver is called in main_ex4.R. The global optimum for this problem is located at \(x^*=[2.23607, 0, 1, 0]\) with \(f(x^*)=-40.9575\).

Options set:

#========================= PROBLEM SPECIFICATIONS ===========================
problem<-list(f="ex4", x_L=rep(0,4), x_U=rep(10,4), x_0=c(3,4,5,1),
int_var=3, c_L=rep(-Inf,3), c_U=c(8,10,5))
opts<-list(maxtime=2)
#========================= END OF PROBLEM SPECIFICATIONS =====================
Results<-MEIGO(problem,opts,algorithm="ESS")

essR_multistart application

An application of essR_multistart within MEIGO on problem ex3, using solnp as local solver, is presented in the script main_multistart_ex3.R. The number of initial points chosen is 10.

#=========================
#PROBLEM SPECIFICATIONS
#=========================
problem<-list(f="ex3",x_L=rep(0,6),x_U=c(rep(1,4),16,16),
neq=4, c_L=-Inf, c_U=4)
opts<-list(ndiverse=10, local_solver='SOLNP', local_tol=3)
#=============================
#END OF PROBLEM SPECIFICATIONS
#=============================
k1=0.09755988
k3=0.0391908
k2=0.99*k1
k4=0.9*k3
Results<-MEIGO(problem,opts,algorithm="multistart",k1,k2,k3,k4)

Integer optimization: Variable Neighbourhood Search (VNSR)

VNSR is an R implementation of the Variable Neighbourhood Search (VNS) metaheuristic, which is part of the MEIGO toolbox for global optimization in bioinformatics. VNS was first proposed by (Mladenović and Hansen 1997) for solving combinatorial and/or global optimization problems. The method guides a trial solution to search for an optimum in a certain area. After this optimum is located, the trial solution is perturbed to start searching in a new area (or neighbourhood). New neighbourhoods are defined following a distance criterion in order to achieve good diversity in the search. Different variants of the method have been published in recent years to adapt it to different types of problems (Hansen, Mladenović, and Moreno-Pérez 2010). VNSR implements some of these variants by means of different tuning parameters. VNSR seeks the global minimum of integer programming (IP) problems specified by

\[\begin{aligned}
\min_{x} f(x,p_1,p_2,\ldots,p_n)
\end{aligned}\]

subject to

\[\begin{aligned}
x_L \leq x \leq x_U
\end{aligned}\]

where \(x\) is the vector of (integer) decision variables, and \(x_L\) and \(x_U\) its respective bounds. \(p_1,\ldots,p_n\) are optional extra input parameters to be passed to the objective function, \(f\), to be minimized. The current VNSR version does not handle constraints other than bound constraints, so users should formulate their own method (e.g., a penalty function) to solve constrained problems.

Quick start: How to carry out an optimization with VNSR

Regarding the objective function, the input parameter is the decision vector (with extra parameters \(p_1, p_2,\ldots, p_n\) if they were defined before calling the solver). The objective function must provide a scalar output parameter (the objective function value).
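A minimal sketch follows, using a toy integer objective of our own (the squared distance to the origin); as with eSSR, extra parameters may be appended after opts:

library(MEIGOR)
#Illustrative integer objective: squared distance to the origin
objective<-function(x) sum(x^2)
problem<-list(f=objective, x_L=rep(-5,5), x_U=rep(5,5))
opts<-list(maxeval=1000)
Results<-MEIGO(problem,opts,algorithm="VNS")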

VNSR usage

Problem definition

A list named problem containing the following fields must be defined for running VNSR:

VNSR options

The user may define a set of different options related to the integer optimization problem. They are defined in another list (named opts here) which has the following fields:

The following options only apply when the local search is activated (i.e., opts$use_local=1)

Output

VNSR output is a list (called Results here) containing the following fields:

The list Results as well as problem and opts are stored and saved in a file called VNSR_report.RData.
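For example, a later R session can reload the report and inspect the stored lists (a sketch, assuming the file was saved in the current working directory and the standard xbest/fbest output fields):

load("VNSR_report.RData")
#Results, problem and opts are now available in the workspace
Results$fbest   #best objective function value found
Results$xbest   #best decision vector found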

Guidelines for using VNSR

Tuning the VNSR parameters may increase its efficiency for a particular problem. Below we explain the influence of each parameter and give suggestions for tuning.

Application example

To illustrate the use of VNSR we choose the 10-dimensional Rosenbrock function with integer decision variables as an example. The code of the Rosenbrock function is available inside the installation directory of the MEIGO package, under the benchmarks folder. Source the objective function and assign it to the problem list.

rosen10<-function(x){
    f<-0
    n=length(x)
    for (i in 1:(n-1)){
        f <- f + 100*(x[i]^2 - x[i+1])^2 + (x[i]-1)^2
    }
    return(f)
}

nvar<-10

Define the problem settings:

problem<-list(f="rosen10", x_L=rep(-5,nvar), x_U=rep(1,nvar))

Define the algorithm to be used:

algorithm="VNS"

Define the options for VNSR:

opts<-list(maxeval=2000, maxtime=3600*69, use_local=1,
     aggr=0, local_search_type=1, decomp=1, maxdist=0.5)

Call MEIGO:

Results<-MEIGO(problem,opts,algorithm)

Parallel computation in MEIGO

MEIGO allows the user to exploit multi-core computation with either of the implemented methods by means of their corresponding parallelizable versions: CeSSR and CeVNSR. The following sections include an introduction to the parallel computing packages needed to exploit these features, as well as application examples.

Quick notes about the parallel computing packages

In order to launch multiple threads of eSSR and/or VNSR we use the snow package, available in the CRAN repository. Installing and using this package on a cluster might not be entirely trivial, so we start with a few notes about it.

Snowfall

The snowfall package is based on the snow package and eases cluster programming. Detailed information can be found at https://cran.r-project.org/web/packages/snowfall/index.html. In MEIGO we allow the use of two mechanisms:

Usage

Options

The corresponding cooperative method for either of the algorithms included in MEIGO (i.e., CeSSR and CeVNSR) is invoked using the same problem declaration and options (see sections 3.2 and 4.2). To select the cooperative version of each method, the field algorithm must contain either CeSSR or CeVNSR. The list opts must now contain one set of options per thread (i.e., a list of lists) instead of a single scalar for each field. For example, if we want to use CeSSR with 3 different threads, we would define the options of maximum number of evaluations as

opts=list();
opts[[1]]$maxeval<-value1
opts[[2]]$maxeval<-value2
opts[[3]]$maxeval<-value3

The first n fields should contain the options for the n threads that are going to be used. The field n+1 should be named hosts. opts$hosts should be a vector containing as many elements as threads you want to launch in parallel. If you are using the 'SOCKS' mechanism from snowfall, each element of this vector should be a string with the address of the machine where you want to launch the optimization. For instance, opts$hosts=c('node1','node1','node2','node2','node5','localhost','127.0.0.1') will use 7 threads for each iteration: two at 'node1', two at 'node2', one at 'node5' and two on your local machine ('localhost' and '127.0.0.1').

Additionally, a set of extra options must be defined in the list opts. These extra options are the following:


Output

The CeSSR / CeVNSR output is a list containing the following fields:

CeSSR application example

CeSSR is an R implementation of the Cooperative enhanced Scatter Search method which is part of the MEIGO toolbox for global optimization in bioinformatics. The CeSS strategy (Villaverde, Egea, and Banga 2012) is a general purpose technique for global optimization, suited for a computing environment with several processors. Its key feature is the cooperation between the different programs (“threads”) that run in parallel in different processors. Each thread implements the enhanced Scatter Search metaheuristic (eSS) (Egea et al. 2009; Egea, Martí, and Banga 2010), which is also available in MEIGO.

To illustrate the use of CeSSR we use the Rosenbrock function as an example (the code of the D/2m-group Shifted and m-rotated Rastrigin function, f10, from the LSGO benchmarks (Tang, Yang, and Weise 2012) is also available inside the installation directory of the MEIGO package, under the benchmarks folder). Source the objective function and assign it to the problem list, defining the bounds for the decision variables:

rosen10<-function(x){
    f<-0
    n=length(x)
    for (i in 1:(n-1)){
         f <- f + 100*(x[i]^2 - x[i+1])^2 + (x[i]-1)^2
    }
    return(f)
}

nvar=20
problem<-list(f=rosen10, x_L=rep(-1000,nvar), x_U=rep(1000,nvar))

#Set 1 node and 3 CPUs per node
n_nodes=1
n_cpus_per_node=3

#Set different values for dim_refset, bal and n2 for up to 10 threads;
#only the first n_nodes*n_cpus_per_node sets are used
dim1 = 23;     bal1 = 0;     n2_1 = 0;
dim2 = 33;     bal2 = 0;     n2_2 = 0;
dim3 = 46;     bal3 = 0;     n2_3 = 2;
dim4 = 56;     bal4 = 0;     n2_4 = 4;
dim5 = 72;     bal5 = 0.25;  n2_5 = 7;
dim6 = 72;     bal6 = 0.25;  n2_6 = 10;
dim7 = 88;     bal7 = 0.25;  n2_7 = 15;
dim8 = 101;    bal8 = 0.5;   n2_8 = 20;
dim9 = 111;    bal9 = 0.25;  n2_9 = 50;
dim10 = 123;   bal10 = 0.25; n2_10 = 100;

opts_dim=c(dim1,dim2,dim3,dim4,dim5,dim6,dim7,dim8,dim9,dim10)
opts_bal=c(bal1,bal2,bal3,bal4,bal5,bal6,bal7,bal8,bal9,bal10)
opts_n2=c(n2_1,n2_2,n2_3,n2_4,n2_5,n2_6,n2_7,n2_8,n2_9,n2_10)
D=10

#Initialize counter and options
counter=0
opts=list()
hosts=c()

for(i in 1:n_nodes){
  for(j in 1:n_cpus_per_node){

    counter=counter+1

    #Set the name of every thread
    if(i<10)hosts=c(hosts,paste('node0',i,sep=""))
    if(i>=10 && i<100)hosts=c(hosts,paste('node',i,sep=""))

    opts[[counter]]=list()

    #Set specific options for each thread
    opts[[counter]]$local_balance   =   opts_bal[counter]
    opts[[counter]]$dim_refset      =   opts_dim[counter]
    opts[[counter]]$local_n2        =   opts_n2[counter]

    #Set common options for each thread

    opts[[counter]]$maxeval         =   10000
    opts[[counter]]$local_solver    =   "dhc"

    #Options not set will take default values for every thread

  }
}

#Set the address of each machine; this overrides the names built inside the 'for' loop
opts$hosts=c('localhost','localhost','localhost')

#Additional options for the cooperative methods that are not set here
#(e.g., ce_maxtime) will take their default values
opts$ce_niter=5
opts$ce_type="SOCKS"
opts$ce_isparallel=TRUE

#Call the solver
Results<-MEIGO(problem,opts,algorithm="CeSSR")

CeVNSR application example

CeVNSR is an extension of VNSR which makes use of the parallel computation packages available in R to reduce the time needed to solve a given integer programming (IP) problem. This implementation builds on the ideas explored by (Villaverde, Egea, and Banga 2012), which showed (in a nonlinear programming context) that, at least for a set of benchmarks, cooperative instances of an optimization algorithm exchanging information from time to time produced better results than running the same instances independently with an equivalent computational cost.

Next we show an example that illustrates how to configure the list of options used by CeVNSR. This example uses four threads on our local machine to solve the integer Rosenbrock problem. It can be found in the script main_CeVNSR_example.R in the benchmarks folder.

rosen10<-function(x){
    f<-0
    n=length(x)
    for (i in 1:(n-1)){
        f <- f + 100*(x[i]^2 - x[i+1])^2 + (x[i]-1)^2
    }
    return(f)
}

nvar=20

problem<-list(f=rosen10, x_L=rep(-1000,nvar), x_U=rep(1000,nvar))

opts=list()
opts[[1]]=list(use_local=1,aggr=1,local_search=1,decomp=1,maxdist=0.8,maxeval=2000)
opts[[2]]=list(use_local=1,aggr=0,local_search=2,decomp=0,maxdist=0.5,maxeval=2000)
opts[[3]]=list(use_local=1,aggr=0,local_search=2,decomp=0,maxdist=0.5,maxeval=2000)
opts[[4]]=list(use_local=1,aggr=0,local_search=2,decomp=0,maxdist=0.5,maxeval=2000)

opts$hosts=c('localhost','localhost','localhost','localhost')

opts$ce_niter=10
opts$ce_type="SOCKS"
opts$ce_isparallel= TRUE

Results=MEIGO(problem,opts, algorithm="CeVNSR")

Applications from Systems Biology

In this section we provide examples for two systems biology problems related to the use and calibration of logic models of cell signaling pathways.

Using eSS to fit a dynamic model

In this section we use eSS to calibrate a model of ordinary differential equations. Here the dynamics are simulated with the CNORode package (Henriques and Cokelaer 2012), which is part of the CellNOptR package (T.Cokelaer et al. 2012; Terfve et al. 2012).

The model shown here is a toy pathway for illustrative purposes. A detailed description of the biological problem and published case studies can be found at http://www.cellnopt.org/. Briefly, CNORode uses a technique called multivariate polynomial interpolation (Krumsiek et al. 2010) to transform a logic model into a system of ordinary differential equations. The final goal of the package is to calibrate such a model (its parameters) by means of optimization strategies, where the objective is to minimize the squared difference between the model predictions and the experimental data.

In this example, we start by loading the package and the necessary model and experimental data:

library(MEIGOR)
library(CNORode)
data("CNORodeExample",package="MEIGOR")
plotCNOlist(cnolist)

Figure 1: Plot of the experimental data for the CNORode example

This data was generated in silico, with \(5\%\) Gaussian noise added to simulate experimental error. Figure 1 shows the plot of the experimental data.

In order to visualize the logic model we can use the CellNOptR package (see Figure 2):
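A minimal sketch of this step, assuming the model and cnolist objects loaded above:

#Plot the network structure together with the experimental setup
plotModel(model, cnolist)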

Figure 2: The model structure for the CNORode example

The models generated by CNORode typically contain a large number of continuous parameters. To automatically generate a list with the set of continuous parameters and additional information (e.g., the parameter names) for this model, we can use the createLBodeContPars function. Here we set the option random to TRUE in order to generate a random initial solution for our problem, and plot the simulation results against the experimental data (see Figure 3).
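A sketch of this step follows; the parameter bounds below are illustrative choices of ours, not values prescribed by the package:

#Create the list of continuous parameters; random=TRUE generates a
#random initial solution within the bounds
initial_pars <- createLBodeContPars(model, LB_n=1, LB_k=0.1, LB_tau=0.01,
    UB_n=5, UB_k=0.9, UB_tau=10, random=TRUE)
#Plot the simulation with the random parameters against the data
simData <- plotLBodeFitness(cnolist, model, initial_pars)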

Figure 3: Fit for the initial (random) solution provided to the optimization solver
The simulation is plotted against the experimental data.

The CNORode package allows the generation of an objective function that can be used by optimization solvers. Basically, this function computes the squared difference between the experimental data and the simulation results:

f_hepato<-getLBodeContObjFunction(cnolist, model, initial_pars, indices=NULL,
    time = 1, verbose = 0, transfer_function = 3, reltol = 1e-04, atol = 1e-03,
    maxStepSize = Inf, maxNumSteps = 1e4, maxErrTestsFails = 50, nan_fac = 5)

In eSSR an optimization problem is defined by a list that should contain at least the objective function and the lower and upper bounds of the variables under study:

n_pars=length(initial_pars$LB)
problem<-list(f=f_hepato,
    x_L=initial_pars$LB[initial_pars$index_opt_pars],
    x_U=initial_pars$UB[initial_pars$index_opt_pars],
    x_0=initial_pars$LB[initial_pars$index_opt_pars])

Finally, the options for the solver must be defined. Please note that the settings used here do not provide the computational effort necessary to solve this problem; they are chosen so that the examples execute quickly. To improve the performance, increase maxeval, ndiverse and dim_refset, and use a local solver:

opts<-list(maxeval=100, local_solver=0,ndiverse=10,dim_refset=6)

To start the optimization procedure call MEIGO:

Results<-MEIGO(problem,opts,algorithm="ESS")

After the optimization we can use the obtained results to plot the model fitness with CNORode (see Figure 4):
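A sketch of this step, assuming the standard xbest field of the Results list:

#Copy the optimized values into the parameter list and plot the fit
opt_pars <- initial_pars
opt_pars$parValues[opt_pars$index_opt_pars] <- Results$xbest
simData <- plotLBodeFitness(cnolist, model, ode_parameters=opt_pars)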

Figure 4: Fitness of the solution obtained by the eSS in the CNORode parameter estimation problem

Using VNS to optimize logic models

In (Saez-Rodriguez et al. 2009) it is shown how to use a genetic algorithm to optimize a logic model against experimental data. From the optimization point of view, this corresponds to a binary decision problem that can be tackled by VNS.

The basic idea is that each binary decision variable corresponds to a logic gate (e.g., an AND gate) from the Boolean point of view, or to a hyperedge (an edge with multiple inputs) from the graph perspective. The CellNOptR software (Terfve et al. 2012) provides the necessary framework to simulate such models and obtain an objective function that can be used by VNS.

The example provided here is a synthetic pathway, meaning that both the data and the network structure shown here were engineered for illustrative purposes. A detailed description of the biological problem and published case studies can be found at http://www.cellnopt.org/.

The first step in this example is to load the CellNOptR package and a list (the cnolist) containing the data for several experiments under different perturbation conditions of upstream network stimuli (typically ligands) and downstream inhibitors (typically small-molecule inhibitors). CellNOptR is also used to visualize such data (see Figure 5):
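A sketch of this step; we assume here that the example data ships with MEIGOR under the dataset name CellNOptR_example and provides cnolist and model objects:

library(MEIGOR)
library(CellNOptR)
#Load the example data (dataset name assumed)
data("CellNOptR_example", package="MEIGOR")
cnolist <- CNOlist(cnolist)   #convert to a CNOlist object
plotCNOlist(cnolist)          #visualize the perturbation experiments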

Figure 5: Pseudo-experimental data for the CellNOptR (Boolean logic) example
Each row corresponds to a different experiment upon perturbation with different stimuli and inhibitors.

The original prior-knowledge network (a signed directed graph) can be expanded into a hypergraph containing all possible Boolean gates. CellNOptR can then be used to visualize the obtained network (see Figure 6).
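A sketch of this step using CellNOptR's preprocessing function, which compresses the network and expands it into all possible logic gates:

#Expand the prior-knowledge network into all possible Boolean gates
model <- preprocessing(data=cnolist, model=model, expansion=TRUE)
#Visualize the expanded network
plotModel(model, cnolist)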

Figure 6: The expanded model for the CellNOptR (Boolean logic) example
This representation contains all possible logic gates, where each binary decision variable corresponds to a hyperedge/logic gate.

After loading the experimental data and the prior-knowledge network, we need to define an objective function that can be used by VNS. In order to compute the fitness of the model we need to simulate it and score the deviation from the data. Again, CellNOptR automates most of this process, and thus our objective function can be defined as follows:

get_fobj<- function(cnolist,model){
    #Return an objective function that closes over the data and the model
    f<-function(x,model1=model,cnolist1=cnolist){
        #Prepare the fields needed for simulation
        simlist=prep4sim(model1)
        #Score the bit string x against the experimental data
        score=computeScoreT1(cnolist1, model1, x)
        return(score)
    }
    return(f)
}
fobj=get_fobj(cnolist,model)

After defining the objective function we need to define the optimization problem and the options for the solver:

nvar=16

problem<-list(f=fobj, x_L=rep(0,nvar), x_U=rep(1,nvar))
opts<-list(maxeval=2000, maxtime=30, use_local=1,
    aggr=0, local_search_type=1, decomp=1, maxdist=0.5)

Finally we call MEIGO using the VNS algorithm in order to solve the problem:

Results<-MEIGO(problem,opts,"VNS")

Once the optimization procedure is finished we plot the obtained solution (see Figure 7):
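A sketch of this step using CellNOptR's cutAndPlot, assuming the standard xbest output field:

#Cut the expanded model with the optimized bit string and plot the fit
cutAndPlot(CNOlist=cnolist, model=model, bStrings=list(Results$xbest))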

Figure 7: The Boolean logic model obtained after optimization with VNS

References

Belisle, C. J. P. 1992. “Convergence Theorems for a Class of Simulated Annealing Algorithms.” J. Applied Probability 29: 885–95.

Broyden, C. G. 1970. “The Convergence of a Class of Double-Rank Minimization Algorithms.” Journal of the Institute of Mathematics and Its Applications 6: 76–90.

Byrd, R. H., P. Lu, J. Nocedal, and C. Zhu. 1995. “A Limited Memory Algorithm for Bound Constrained Optimization.” SIAM J. Scientific Computing 16: 1190–1208.

de la Maza, M., and D. Yuret. 1994. “Dynamic Hill Climbing.” AI Expert 9 (3): 26–31.

Egea, J. A., E. Balsa-Canto, M.-S.G. García, and J. R. Banga. 2009. “Dynamic Optimization of Nonlinear Processes with an Enhanced Scatter Search Method.” Industrial & Engineering Chemistry Research 48 (9): 4388–4401.

Egea, J. A., R. Martí, and J. R. Banga. 2010. “An Evolutionary Algorithm for Complex Process Optimization.” Computers & Operations Research 37 (2): 315–24.

Fletcher, R., and C. M. Reeves. 1964. “Function Minimization by Conjugate Gradients.” Computer Journal 7: 148–54.

Ghalanos, Alexios, and Stefan Theussl. 2012. Rsolnp: General Non-Linear Optimization Using Augmented Lagrange Multiplier Method. R Package Version 1.12.

Hansen, P., N. Mladenović, and J. A. Moreno-Pérez. 2010. “Variable Neighbourhood Search: Methods and Applications.” Annals of Operations Research 175 (1): 367–407.

Hansen, P., N. Mladenović, and D Pérez-Brito. 2001. “Variable Neighborhood Decomposition Search.” Journal of Heuristics 7 (4): 335–50.

Henriques, David, and Thomas Cokelaer. 2012. CNORode: ODE Add-on to CellNOptR.

Krumsiek, Jan, Sebastian Pölsterl, Dominik Wittmann, and Fabian Theis. 2010. “Odefy-from Discrete to Continuous Models.” BMC Bioinformatics 11 (1): 233.

Mladenović, N., and P. Hansen. 1997. “Variable Neighborhood Search.” Computers and Operations Research 24: 1097–1100.

Nelder, J. A., and R. Mead. 1965. “A Simplex Algorithm for Function Minimization.” Computer Journal 7: 308–13.

Saez-Rodriguez, Julio, Leonidas G Alexopoulos, Jonathan Epperlein, Regina Samaga, Douglas A Lauffenburger, Steffen Klamt, and Peter K Sorger. 2009. “Discrete Logic Modelling as a Means to Link Protein Signalling Networks with Functional Analysis of Mammalian Signal Transduction.” Molecular Systems Biology 5 (1).

T. Cokelaer, F. Eduati, A. MacNamara, S. Schrier, and C. Terfve. 2012. CellNOptR: Training of Boolean Logic Models of Signalling Networks Using Prior Knowledge Networks and Perturbation Data.

Terfve, Camille, Thomas Cokelaer, David Henriques, Aidan MacNamara, Emanuel Goncalves, Melody K Morris, Martijn van Iersel, Douglas A Lauffenburger, and Julio Saez-Rodriguez. 2012. "CellNOptR: A Flexible Toolkit to Train Protein Signaling Networks to Data Using Multiple Logic Formalisms." BMC Syst Biol 6 (1): 133.

Villaverde, A. F., J. A. Egea, and J. R. Banga. 2012. “A Cooperative Strategy for Parameter Estimation in Large Scale Systems Biology Models.” BMC Systems Biology 6: 75.

Ye, Y. 1987. “Interior Algorithms for Linear, Quadratic and Linearly Constrained Non-Linear Programming.” PhD thesis, Stanford University.