5. Dynamic optimization of DAEs using direct collocation with CasADi

5.1. Algorithm overview

The direct collocation method can be used to solve dynamic optimization problems, including optimal control problems and parameter optimization problems. In the collocation method, the dynamic model variable profiles are approximated by piecewise polynomials. This method of approximating a differential equation corresponds to a fixed step implicit Runge-Kutta scheme, where the mesh defines the length of each step. Also, the number of collocation points in each element, or step, needs to be provided. This number corresponds to the stage order of the Runge-Kutta scheme. The selection of mesh is analogous to the choice of step length in a one-step algorithm for solving differential equations. Accordingly, the mesh needs to be fine-grained enough to ensure sufficiently accurate approximation of the differential constraint. For an overview of simultaneous optimization algorithms, see [2]. The nonlinear programming solvers IPOPT and WORHP can be used to solve the nonlinear program resulting from collocation. The needed first- and second-order derivatives are obtained using CasADi by algorithmic differentiation.

The collocation method requires that the model equations are twice continuously differentiable with respect to all of the variables. This for example means that the model can not contain integer variables or if clauses depending on the states.

Optimization models are represented using the class OptimizationProblem, which can be instantiated using the transfer_optimization_problem method. An object containing all the options for the optimization algorithm can be retrieved from the object:

from pyjmi import transfer_optimization_problem
op = transfer_optimization_problem(class_name, optimica_file_path)
opts = op.optimize_options()
opts? # View the help text

After options have been set, the options object can be propagated to the optimize method, which solves the optimization problem:

res = op.optimize(options=opts)

The standard options for the algorithm are shown in Table 6.1, “Standard options for the CasADi- and collocation-based optimization algorithm”. Additional documentation is available in the Python class documentation. The algorithm also has a lot of experimental options, which are not as well tested and some are intended for debugging purposes. These are shown in Table 6.2, “Experimental and debugging options for the CasADi- and collocation-based optimization algorithm”, and caution is advised when changing their default values.

Table 6.1. Standard options for the CasADi- and collocation-based optimization algorithm

n_e50Number of finite elements.
hsNoneElement lengths. Possible values: None, iterable of floats and "free" None: The element lengths are uniformly distributed. iterable of floats: Component i of the iterable specifies the length of element i. The lengths must be normalized in the sense that the sum of all lengths must be equal to 1. "free": The element lengths become optimization variables and are optimized according to the algorithm option free_element_lengths_data. WARNING: The "free" option is very experimental and will not always give desirable results.
n_cp3Number of collocation points in each element.
expand_to_sx"NLP"Whether to expand the CasADi MX graphs to SX graphs. Possible values: "NLP", "DAE", "no". "NLP": The entire NLP graph is expanded into SX. This will lead to high evaluation speed and high memory consumption. "DAE": The DAE, objective and constraint graphs for the dynamic optimization problem expressions are expanded into SX, but the full NLP graph is an MX graph. This will lead to moderate evaluation speed and moderate memory consumption. "no": All constructed graphs are MX graphs. This will lead to low evaluation speed and low memory consumption.
init_trajNoneVariable trajectory data used for initialization of the NLP variables.
nominal_trajNoneVariable trajectory data used for scaling of the NLP variables. This option is only applicable if variable scaling is enabled.
blocking_factorsNone (not used)Blocking factors are used to enforce piecewise constant inputs. The inputs may only change values at some of the element boundaries. The option is either None (disabled), given as an instance of pyjmi.optimization.casadi_collocation.BlockingFactors or as a list of blocking factors. If the options is a list of blocking factors, then each element in the list specifies the number of collocation elements for which all of the inputs must be constant. For example, if blocking_factors == [2, 2, 1], then the inputs will attain 3 different values (number of elements in the list), and it will change values between collocation element number 2 and 3 as well as number 4 and 5. The sum of all elements in the list must be the same as the number of collocation elements and the length of the list determines the number of separate values that the inputs may attain. See the documentation of the BlockingFactors class for how to use it. If blocking_factors is None, then the usual collocation polynomials are instead used to represent the controls.
external_dataNoneData used to penalize, constrain or eliminate certain variables.
delayed_feedbackNoneIf not None, should be a dict with mappings 'delayed_var': ('undelayed_var', delay_ne). For each key-value pair, adds the the constraint that the variable 'delayed_var' equals the value of the variable 'undelayed_var' delayed by delay_ne elements. The initial part of the trajectory for 'delayed_var' is fixed to its initial guess given by the init_traj option or the initialGuess attribute. 'delayed_var' will typically be an input. This is an experimental feature and is subject to change.
solver'IPOPT'Specifies the nonlinear programming solver to be used. Possible choices are 'IPOPT' and 'WORHP'.
IPOPT_optionsIPOPT defaultsIPOPT options for solution of NLP. See IPOPT's documentation for available options.
WORHP_optionsWORHP defaultsWORHP options for solution of NLP. See WORHP's documentation for available options.

Table 6.2. Experimental and debugging options for the CasADi- and collocation-based optimization algorithm

free_element_lengths_dataNoneData used for optimizing the element lengths if they are free. Should be None when hs != "free".
discr'LGR'Determines the collocation scheme used to discretize the problem. Possible values: "LG" and "LGR". "LG": Gauss collocation (Legendre-Gauss) "LGR": Radau collocation (Legendre-Gauss-Radau).
named_varsFalseIf enabled, the solver will create a duplicated set of NLP variables which have names corresponding to the Modelica/Optimica variable names. Symbolic expressions of the NLP consisting of the named variables can then be obtained using the get_named_var_expr method of the collocator class. This option is only intended for investigative purposes.
init_dualNoneDictionary containing vectors of initial guess for NLP dual variables. Intended to be obtained as the solution of an optimization problem which has an identical structure, which is stored in the dual_opt attribute of the result object. The dictionary has two keys, 'g' and 'x', containing vectors of the corresponding dual variable intial guesses. Note that when using IPOPT, the option warm_start_init_point has to be activated for this option to have an effect.
variable_scalingTrueWhether to scale the variables according to their nominal values or the trajectories provided with the nominal_traj option.
equation_scalingFalseWhether to scale the equations in collocated NLP. Many NLP solvers default to scaling the equations, but if it is done through this option the resulting scaling can be inspected.
nominal_traj_mode{"_default_mode": "linear"}Mode for computing scaling factors based on nominal trajectories. Four possible modes: "attribute": Time-invariant, linear scaling based on Nominal attribute "linear": Time-invariant, linear scaling "affine": Time-invariant, affine scaling "time-variant": Time-variant, linear scaling Option is a dictionary with variable names as keys and corresponding scaling modes as values. For all variables not occuring in the keys of the dictionary, the mode specified by the "_default_mode" entry will be used, which by default is "linear".
result_file_name""Specifies the name of the file where the result is written. Setting this option to an empty string results in a default file name that is based on the name of the model class.
write_scaled_resultFalseReturn the scaled optimization result if set to True, otherwise return the unscaled optimization result. This option is only applicable when variable_scaling is enabled and is only intended for debugging.
print_condition_numbersFalsePrints the condition numbers of the Jacobian of the constraints and of the simplified KKT matrix at the initial and optimal points. Note that this is only feasible for very small problems.
result_mode'collocation_points'Specifies the output format of the optimization result. Possible values: "collocation_points", "element_interpolation" and "mesh_points" "collocation_points": The optimization result is given at the collocation points as well as the start and final time point. "element_interpolation": The values of the variable trajectories are calculated by evaluating the collocation polynomials. The algorithm option n_eval_points is used to specify the evaluation points within each finite element. "mesh_points": The optimization result is given at the mesh points.
n_eval_points20The number of evaluation points used in each element when the algorithm option result_mode is set to "element_interpolation". One evaluation point is placed at each element end-point (hence the option value must be at least 2) and the rest are distributed uniformly.
checkpointFalseIf checkpoint is set to True, transcribed NLP is built with packed MX functions. Instead of calling the DAE residual function, the collocation equation function, and the lagrange term function n_e * n_cp times, the check point scheme builds an MXFunction evaluating n_cp collocation points at the same time, so that the packed MXFunction is called only n_e times. This approach improves the code generation and it is expected to reduce the memory usage for constructing and solving the NLP.
quadrature_constraintTrueWhether to use quadrature continuity constraints. This option is only applicable when using Gauss collocation. It is incompatible with eliminate_der_var set to True. True: Quadrature is used to get the values of the states at the mesh points. False: The Lagrange basis polynomials for the state collocation polynomials are evaluated to get the values of the states at the mesh points.
mutable_external_dataTrueIf true and the external_data option is used, the external data can be changed after discretization, e.g. during warm starting.

The last standard options, IPOPT_options and WORHP_options, serve as interfaces for setting options in IPOPT and WORHP. To exemplify the usage of these algorithm options, the maximum number of iterations in IPOPT can be set using the following syntax:

opts = model.optimize_options()
opts["IPOPT_options"]["max_iter"] = 10000

JModelica.org's CasADi framework does not support simulation and initialization of models. It is recommended to use PyFMI for these purposes instead.

Some statistics from the NLP solver can be obtained by issuing the command


The return argument of this function can be found by using the interactive help:

Get nonlinear programming solver statistics.
    return_status -- 
        Return status from nonlinear programming solver.
    nbr_iter -- 
        Number of iterations.
    objective -- 
       Final value of objective function.
    total_exec_time -- 
        Execution time.

5.1.1. Reusing the same discretization for several optimization solutions

When collocation is used to solve a dynamic optimization problem, the solution procedure is carried out in several steps:

  • Discretize the dynamic optimization problem, which is formulated in continuous time. The result is a large and sparse nonlinear program (NLP). The discretization step depends on the options as provided to the optimize method.

  • Solve the NLP.

  • Postprocess the NLP solution to extract an approximate solution to the original dynamic optimization problem.

Depending on the problem, discretization may account for a substantial amount of the total solution time, or even dominate it.

The same discretization can be reused for several solutions with different parameter values, but the same options. Discretization will be carried out each time the optimize method is called on the model. Instead of calling model.optimize(options=opts), a problem can be discretized using the prepare_optimization method:

solver = model.prepare_optimization(options=opts)

Alternatively, the solver can be retrieved from an existing optimization result, as solver = res.get_solver(). Manipulating the solver (e.g. setting parameters) may affect the original optimization problem object and vice versa.

The obtained solver object represents the discretized problem, and can be used to solve it using its own optimize method:

res = solver.optimize()

While options cannot be changed in general, parameter values, initial trajectories, external data, and NLP solver options can be changed on the solver object. Parameter values can be updated with

solver.set(parameter_name, value)

and current values retrieved with


New initial trajectories can be set with


where init_traj has the same format as used with the init_traj option.

External data can be updated with

solver.set_external_variable_data(variable_name, data)

(unless the mutable_external_data option is turned off). variable_name should correspond to one of the variables used in the external_data option passed to prepare_optimization. data should be the new data, in the same format as variable data used in the external_data option. The kind of external data used for the variable (eliminated/constrained/quadratic penalty) is not changed.

Settings to the nonlinear solver can be changed with

solver.set_solver_option(solver_name, name, value)

where solver_name is e g 'IPOPT' or 'WORHP'.

5.1.2. Warm starting

The solver object obtained from prepare_optimization can also be used for warm starting, where an obtained optimization solution (including primal and dual variables) is used as the initial guess for a new optimization with new parameter values.

To reuse the solver's last obtained solution as initial guess for the next optimization, warm starting can be enabled with


before calling solver.optimize(). This will reuse the last solution for the primal variables (unless solver.set_init_traj was called since the last solver.optimize) as well as the last solution for the dual variables.

When using the IPOPT solver with warm starting, several solver options typically also need to be set to see the benefits, e g:

def set_warm_start_options(solver, push=1e-4, mu_init=1e-1):    
    solver.set_solver_option('IPOPT', 'warm_start_init_point', 'yes')
    solver.set_solver_option('IPOPT', 'mu_init', mu_init)

    solver.set_solver_option('IPOPT', 'warm_start_bound_push', push)
    solver.set_solver_option('IPOPT', 'warm_start_mult_bound_push', push)
    solver.set_solver_option('IPOPT', 'warm_start_bound_frac', push)
    solver.set_solver_option('IPOPT', 'warm_start_slack_bound_frac', push)
    solver.set_solver_option('IPOPT', 'warm_start_slack_bound_push', push)


Smaller values of the push and mu arguments will make the solver place more trust in that the sought solution is close to the initial guess, i e, the last solution.

5.2. Examples

5.2.1. Optimal control

This tutorial is based on the Hicks-Ray Continuously Stirred Tank Reactors (CSTR) system. The model was originally presented in [1]. The system has two states, the concentration, c, and the temperature, T. The control input to the system is the temperature, Tc, of the cooling flow in the reactor jacket. The chemical reaction in the reactor is exothermic, and also temperature dependent; high temperature results in high reaction rate. The CSTR dynamics are given by:

This tutorial will cover the following topics:

  • How to solve a DAE initialization problem. The initialization model has equations specifying that all derivatives should be identically zero, which implies that a stationary solution is obtained. Two stationary points, corresponding to different inputs, are computed. We call the stationary points A and B respectively. Point A corresponds to operating conditions where the reactor is cold and the reaction rate is low, whereas point B corresponds to a higher temperature where the reaction rate is high.

  • An optimal control problem is solved where the objective is to transfer the state of the system from stationary point A to point B. The challenge is to ignite the reactor while avoiding uncontrolled temperature increases. It is also demonstrated how to set parameter and variable values in a model. More information about the simultaneous optimization algorithm can be found at JModelica.org API documentation.

  • The optimization result is saved to file and then the important variables are plotted.

The Python commands in this tutorial may be copied and pasted directly into a Python shell, in some cases with minor modifications. Alternatively, you may copy the commands into a text file, e.g., cstr_casadi.py.

Start the tutorial by creating a working directory and copy the file $JMODELICA_HOME/Python/pyjmi/examples/files/CSTR.mop to your working directory. An online version of CSTR.mop is also available (depending on which browser you use, you may have to accept the site certificate by clicking through a few steps). If you choose to create a Python script file, save it to the working directory. Compile and instantiate a model object

The functions and classes used in the tutorial script need to be imported into the Python script. This is done by the following Python commands. Copy them and paste them either directly into your Python shell or, preferably, into your Python script file.

import numpy as N
import matplotlib.pyplot as plt

from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem

To solve the initialization problem and simulate the model, we will first compile it as an FMU and load it in Python. These steps are described in more detail in Section 4.

# Compile the stationary initialization model into an FMU
init_fmu = compile_fmu("CSTR.CSTR_Init", "CSTR.mop")
# Load the FMU
init_model = load_fmu(init_fmu)

At this point, you may open the file CSTR.mop, containing the CSTR model and the static initialization model used in this section. Study the classes CSTR.CSTR and CSTR.CSTR_Init and make sure you understand the models. Before proceeding, have a look at the interactive help for one of the functions you used:

help(compile_fmu) Solve the DAE initialization problem

In the next step, we would like to specify the first operating point, A, by means of a constant input cooling temperature, and then solve the initialization problem assuming that all derivatives are zero.

# Set input for Stationary point A
Tc_0_A = 250
init_model.set('Tc', Tc_0_A)

# Solve the initialization problem using FMI

# Store stationary point A
[c_0_A, T_0_A] = init_model.get(['c', 'T'])

# Print some data for stationary point A
print(' *** Stationary point A ***')
print('Tc = %f' % Tc_0_A)
print('c = %f' % c_0_A)
print('T = %f' % T_0_A)

Notice how the method set is used to set the value of the control input. The initialization algorithm is invoked by calling the method initialize, which returns a result object from which the initialization result can be accessed. The values of the states corresponding to point A can then be extracted from the result object. Look carefully at the printouts in the Python shell to see a printout of the stationary values. Display the help text for the initialize method and take a moment to look it through. The procedure is now repeated for operating point B:

# Set inputs for Stationary point B
init_model.reset() # reset the FMU so that we can initialize it again
Tc_0_B = 280
init_model.set('Tc', Tc_0_B)

# Solve the initialization problem using FMI

# Store stationary point B
[c_0_B, T_0_B] = init_model.get(['c', 'T'])

# Print some data for stationary point B
print(' *** Stationary point B ***')
print('Tc = %f' % Tc_0_B)
print('c = %f' % c_0_B)
print('T = %f' % T_0_B)

We have now computed two stationary points for the system based on constant control inputs. In the next section, these will be used to set up an optimal control problem. Solving an optimal control problem

The optimal control problem we are about to solve is given by:

and is expressed in Optimica format in the class CSTR.CSTR_Opt2 in the CSTR.mop file above. Have a look at this class and make sure that you understand how the optimization problem is formulated and what the objective is.

Direct collocation methods often require good initial guesses in order to ensure robust convergence. Also, if the problem is non-convex, initialization is even more critical. Since initial guesses are needed for all discretized variables along the optimization interval, simulation provides a convenient means to generate state and derivative profiles given an initial guess for the control input(s). It is then convenient to set up a dedicated model for computation of initial trajectories. In the model CSTR.CSTR_Init_Optimization in the CSTR.mop file, a step input is applied to the system in order obtain an initial guess. Notice that the variable names in the initialization model must match those in the optimal control model.

First, compile the model and set model parameters:

# Compile the optimization initialization model
init_sim_fmu = compile_fmu("CSTR.CSTR_Init_Optimization", "CSTR.mop")

# Load the model
init_sim_model = load_fmu(init_sim_fmu)

# Set initial and reference values
init_sim_model.set('cstr.c_init', c_0_A)
init_sim_model.set('cstr.T_init', T_0_A)
init_sim_model.set('c_ref', c_0_B)
init_sim_model.set('T_ref', T_0_B)
init_sim_model.set('Tc_ref', Tc_0_B)

Having initialized the model parameters, we can simulate the model using the simulate function.

# Simulate with constant input Tc
init_res = init_sim_model.simulate(start_time=0., final_time=150.)

The method simulate first computes consistent initial conditions and then simulates the model in the interval 0 to 150 seconds. Take a moment to read the interactive help for the simulate method.

The simulation result object is returned and to retrieve the simulation data use Python dictionary access to retrieve the variable trajectories.

# Extract variable profiles
t_init_sim = init_res['time']
c_init_sim = init_res['cstr.c']
T_init_sim = init_res['cstr.T']
Tc_init_sim = init_res['cstr.Tc']

# Plot the initial guess trajectories
plt.subplot(3, 1, 1)
plt.plot(t_init_sim, c_init_sim)
plt.title('Initial guess obtained by simulation')

plt.subplot(3, 1, 2)
plt.plot(t_init_sim, T_init_sim)

plt.subplot(3, 1, 3)
plt.plot(t_init_sim, Tc_init_sim)
plt.ylabel('Cooling temperature')

Look at the plots and try to relate the trajectories to the optimal control problem. Why is this a good initial guess?

Once the initial guess is generated, we compile the optimal control problem:

# Compile and load optimization problem
op = transfer_optimization_problem("CSTR.CSTR_Opt2", "CSTR.mop")

We will now initialize the parameters of the model so that their values correspond to the optimization objective of transferring the system state from operating point A to operating point B. Accordingly, we set the parameters representing the initial values of the states to point A and the reference values in the cost function to point B:

# Set reference values
op.set('Tc_ref', Tc_0_B)
op.set('c_ref', float(c_0_B))
op.set('T_ref', float(T_0_B))

# Set initial values
op.set('cstr.c_init', float(c_0_A))
op.set('cstr.T_init', float(T_0_A))

We will also set some optimization options. In this case, we decrease the number of finite elements in the mesh from 50 to 19, to be able to illustrate that simulation and optimization might not give the exact same result. This is done by setting the corresponding option and providing it as an argument to the optimize method. We also lower the tolerance of IPOPT to get a more accurate result. We are now ready to solve the actual optimization problem. This is done by invoking the method optimize:

# Set options
opt_opts = op.optimize_options()
opt_opts['n_e'] = 19 # Number of elements
opt_opts['init_traj'] = init_res
opt_opts['nominal_traj'] = init_res
opt_opts['IPOPT_options']['tol'] = 1e-10

# Solve the optimal control problem
res = op.optimize(options=opt_opts)

You should see the output of IPOPT in the Python shell as the algorithm iterates to find the optimal solution. IPOPT should terminate with a message like 'Optimal solution found' or 'Solved to acceptable level' in order for an optimum to have been found. The optimization result object is returned and the optimization data are stored in res.

We can now retrieve the trajectories of the variables that we intend to plot:

# Extract variable profiles
c_res = res['cstr.c']
T_res = res['cstr.T']
Tc_res = res['cstr.Tc']
time_res = res['time']
c_ref = res['c_ref']
T_ref = res['T_ref']
Tc_ref = res['Tc_ref']

Finally, we plot the result using the functions available in matplotlib:

# Plot the results
plt.subplot(3, 1, 1)
plt.plot(time_res, c_res)
plt.plot(time_res, c_ref, '--')
plt.title('Optimized trajectories')

plt.subplot(3, 1, 2)
plt.plot(time_res, T_res)
plt.plot(time_res, T_ref, '--')

plt.subplot(3, 1, 3)
plt.plot(time_res, Tc_res)
plt.plot(time_res, Tc_ref, '--')
plt.ylabel('Cooling temperature')

You should now see the plot shown in Figure 6.2, “Optimal profiles for the CSTR problem.”.

Figure 6.2. Optimal profiles for the CSTR problem.

Optimal profiles for the CSTR problem.

Take a minute to analyze the optimal profiles and to answer the following questions:

  1. Why is the concentration high in the beginning of the interval?

  2. Why is the input cooling temperature high in the beginning of the interval? Verify optimal control solution

Solving optimal control problems by means of direct collocation implies that the differential equation is approximated by a time-discrete counterpart. The accuracy of the solution is dependent on the method of collocation and the number of elements. In order to assess the accuracy of the discretization, we may simulate the system using the optimal control profile as input. With this approach, the state profiles are computed with high accuracy and the result may then be compared with the profiles resulting from optimization. Notice that this procedure does not verify the optimality of the resulting optimal control profiles, but only the accuracy of the discretization of the dynamics.

We start by compiling and loading the model used for simulation:

# Compile model
sim_fmu = compile_fmu("CSTR.CSTR", "CSTR.mop")

# Load model
sim_model = load_fmu(sim_fmu)

The solution obtained from the optimization are values at a finite number of time points, in this case the collocation points. With the JMI framework, these values are normally interpolated linearly when used for simulation purposes. The CasADi framework also supports obtaining all the collocation polynomials for all the input variables in the form of a function instead, which can be used during simulation for greater accuracy. We obtain it from the result object in the following manner.

# Get optimized input
(_, opt_input) = res.get_opt_input()

We specify the initial values and simulate using the optimal trajectory:

# Set initial values
sim_model.set('c_init', c_0_A)
sim_model.set('T_init', T_0_A)

# Simulate using optimized input
sim_opts = sim_model.simulate_options()
sim_opts['CVode_options']['rtol'] = 1e-6
sim_opts['CVode_options']['atol'] = 1e-8
res = sim_model.simulate(start_time=0., final_time=150.,
                         input=('Tc', opt_input), options=sim_opts)

Finally, we load the simulated data and plot it to compare with the optimized trajectories:

# Extract variable profiles
time_sim = res['time']

# Plot the results


plt.ylabel('Cooling temperature')

You should now see the plot shown in Figure 6.3, “Optimal control profiles and simulated trajectories corresponding to the optimal control input.”.

Figure 6.3. Optimal control profiles and simulated trajectories corresponding to the optimal control input.

Optimal control profiles and simulated trajectories corresponding to the optimal control input.

Discuss why the simulated trajectories differ from their optimized counterparts. Exercises

After completing the tutorial you may continue to modify the optimization problem and study the results.

  1. Remove the constraint on cstr.T. What is then the maximum temperature?

  2. Play around with weights in the cost function. What happens if you penalize the control variable with a larger weight? Do a parameter sweep for the control variable weight and plot the optimal profiles in the same figure.

  3. Add terminal constraints ('cstr.T(finalTime)=someParameter') for the states so that they are equal to point B at the end of the optimization interval. Now reduce the length of the optimization interval. How short can you make the interval?

  4. Try varying the number of elements in the mesh and the number of collocation points in each interval. References

[1] G.A. Hicks and W.H. Ray. Approximation Methods for Optimal Control Synthesis. Can. J. Chem. Eng., 40:522–529, 1971.

[2] Bieger, L., A. Cervantes, and A. Wächter (2002): "Advances in simultaneous strategies for dynamic optimization." Chemical Engineering Science, 57, pp. 575-593.

5.2.2. Minimum time problems

Minimum time problems are dynamic optimization problems where not only the control inputs are optimized, but also the final time. Typically, elements of such problems include initial and terminal state constraints and an objective function where the transition time is minimized. The following example will be used to illustrate how minimum time problems are formulated in Optimica. We consider the optimization problem:

subject to the Van der Pol dynamics:

and the constraints:

This problem is encoded in the following Optimica specification:

optimization VDP_Opt_Min_Time (objective = finalTime,
                               startTime = 0,

  // The states
  Real x1(start = 0,fixed=true);
  Real x2(start = 1,fixed=true);

  // The control signal
  input Real u(free=true,min=-1,max=1);

  // Dynamic equations
  der(x1) = (1 - x2^2) * x1 - x2 + u;
  der(x2) = x1;

  // terminal constraints
end VDP_Opt_Min_Time;

Notice how the class attribute finalTime is set to be free in the optimization. The problem is solved by the following Python script:

# Import numerical libraries
import numpy as N
import matplotlib.pyplot as plt

# Import the JModelica.org Python packages
from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem

vdp = transfer_optimization_problem("VDP_Opt_Min_Time", "VDP_Opt_Min_Time.mop")
res = vdp.optimize()

# Extract variable profiles

# Plot



The resulting control and state profiles are shown in Figure 6.4, “Minimum time profiles for the Van der Pol Oscillator.”. Notice the difference as compared to Figure Figure 6.1, “Optimal profiles for the VDP oscillator”, where the Van der Pol oscillator system is optimized using a quadratic objective function.

Figure 6.4. Minimum time profiles for the Van der Pol Oscillator.

Minimum time profiles for the Van der Pol Oscillator.

5.2.3. Optimization under delay constraints

In some applications, it can be useful to solve dynamic optimization problems that include time delays in the model. Collocation based optimization schemes are well suited to handle this kind of models, since the whole state trajectory is available at the same time. The direct collocation method using CasADi contains an experimental implementation of such delays, which we will describe with an example. Please note that the implementation of this feature is experimental and subject to change.

We consider the optimization problem

subject to the dynamics

and the boundary conditions

The effect of positive is initially to increase , but after a time delay of time , it comes back with twice the effect in the negative direction through .

We model everything except the delay constraint in the Optimica specification

optimization DelayTest(startTime = 0, finalTime = 1,
                       objectiveIntegrand = 4*x^2 + u1^2 + u2^2)
    input Real u1, u2;
    Real x(start = 1, fixed=true);
    der(x) = u1 - 2*u2;
    x(finalTime) = 0;
end DelayTest;

The problem is then solved in the following python script. Notice how delay constraint is added using the delayed_feedback option, and the initial part of is set using the initialGuess attribute:

# Import numerical libraries
import numpy as np
import matplotlib.pyplot as plt

# Import JModelica.org Python packages
from pyjmi import transfer_optimization_problem

n_e = 20
delay_n_e = 5
horizon = 1.0
delay = horizon*delay_n_e/n_e

# Compile and load optimization problem
opt = transfer_optimization_problem("DelayTest", "DelayedFeedbackOpt.mop")

# Set value for u2(t) when t < delay
opt.getVariable('u2').setAttribute('initialGuess', 0.25)

# Set algorithm options
opts = opt.optimize_options()
opts['n_e'] = n_e
# Set delayed feedback from u1 to u2
opts['delayed_feedback'] = {'u2': ('u1', delay_n_e)}

# Optimize
res = opt.optimize(options=opts)

# Extract variable profiles
x_res = res['x']
u1_res = res['u1']
u2_res = res['u2']
time_res = res['time']

# Plot results
plt.plot(time_res, x_res, time_res, u1_res, time_res, u2_res)
plt.plot(time_res+delay, u1_res, '--')
plt.legend(('x', 'u1', 'u2', 'delay(u1)'))

The resulting control and state profiles are shown in Figure 6.5, “Optimization result for delayed feedback example.”. Notice that grows initially since is set positive to exploit the greater control gain that appears delayed through . At time , the delayed value of ceases to influence within the horizon, and immediately switches sign to drive down to its final value .

Figure 6.5. Optimization result for delayed feedback example.

Optimization result for delayed feedback example.

5.2.4. Parameter estimation

In this tutorial it will be demonstrated how to solve parameter estimation problems. We consider a quadruple tank system depicted in Figure 6.6, “A schematic picture of the quadruple tank process.”.

Figure 6.6. A schematic picture of the quadruple tank process.

A schematic picture of the quadruple tank process.

The dynamics of the system are given by the differential equations:

Where the nominal parameter values are given in Table 6.3, “Parameters for the quadruple tank process.”.

Table 6.3. Parameters for the quadruple tank process.

Parameter nameValueUnit

The states of the model are the tank water levels x1, x2, x3, and x4. The control inputs, u1 and u2, are the flows generated by the two pumps.

The Modelica model for the system is located in QuadTankPack.mop. Download the file to your working directory and open it in a text editor. Locate the class QuadTankPack.QuadTank and make sure you understand the model. In particular, notice that all model variables and parameters are expressed in SI units.

Measurement data, available in qt_par_est_data.mat, has been logged in an identification experiment. Download also this file to your working directory.

Open a text file and name it qt_par_est_casadi.py. Then enter the imports:

import os
from collections import OrderedDict

from scipy.io.matlab.mio import loadmat
import matplotlib.pyplot as plt
import numpy as N

from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem
from pyjmi.optimization.casadi_collocation import ExternalData

into the file. Next, we compile the model, which is used for simulation, and the optimization problem, which is used for estimating parameter values. We will take a closer look at the optimization formulation later, so do not worry about that one for the moment. The initial states for the experiment are stored in the optimization problem, which we propagate to the model for simulation.

# Compile and load FMU, which is used for simulation
model = load_fmu(compile_fmu('QuadTankPack.QuadTank', "QuadTankPack.mop"))

# Transfer problem to CasADi Interface, which is used for estimation
op = transfer_optimization_problem("QuadTankPack.QuadTank_ParEstCasADi",

# Set initial states in model, which are stored in the optimization problem
x_0_names = ['x1_0', 'x2_0', 'x3_0', 'x4_0']
x_0_values = op.get(x_0_names)
model.set(x_0_names, x_0_values)

Next, we enter code to open the data file, extract the measurement time series and plot the measurements:

# Load measurement data from file
data = loadmat("qt_par_est_data.mat", appendmat=False)

# Extract data series
t_meas = data['t'][6000::100, 0] - 60
y1_meas = data['y1_f'][6000::100, 0] / 100
y2_meas = data['y2_f'][6000::100, 0] / 100
y3_meas = data['y3_d'][6000::100, 0] / 100
y4_meas = data['y4_d'][6000::100, 0] / 100
u1 = data['u1_d'][6000::100, 0]
u2 = data['u2_d'][6000::100, 0]

# Plot measurements and inputs
plt.subplot(2, 2, 1)
plt.plot(t_meas, y3_meas)
plt.subplot(2, 2, 2)
plt.plot(t_meas, y4_meas)
plt.subplot(2, 2, 3)
plt.plot(t_meas, y1_meas)
plt.subplot(2, 2, 4)
plt.plot(t_meas, y2_meas)

plt.subplot(2, 1, 1)
plt.plot(t_meas, u1)
plt.subplot(2, 1, 2)
plt.plot(t_meas, u2)

You should now see two plots showing the measurement state profiles and the control input profiles similar to Figure 6.7, “Measured state profiles.” and Figure 6.8, “Control inputs used in the identification experiment.”.

Figure 6.7. Measured state profiles.

Measured state profiles.

Figure 6.8. Control inputs used in the identification experiment.

Control inputs used in the identification experiment.

In order to evaluate the accuracy of nominal model parameter values, we simulate the model using the same initial state and inputs values as in the performed experiment used to obtain the measurement data. First, a matrix containing the input trajectories is created:

# Build input trajectory matrix for use in simulation
u = N.transpose(N.vstack([t_meas, u1, u2]))

Now, the model can be simulated:

# Simulate model response with nominal parameter values
res_sim = model.simulate(input=(['u1', 'u2'], u),
                         start_time=0., final_time=60.)

The simulation result can now be extracted:

# Load simulation result
x1_sim = res_sim['x1']
x2_sim = res_sim['x2']
x3_sim = res_sim['x3']
x4_sim = res_sim['x4']
t_sim  = res_sim['time']
u1_sim = res_sim['u1']
u2_sim = res_sim['u2']

and then plotted:

# Plot simulation result
plt.subplot(2, 2, 1)
plt.plot(t_sim, x3_sim)
plt.subplot(2, 2, 2)
plt.plot(t_sim, x4_sim)
plt.subplot(2, 2, 3)
plt.plot(t_sim, x1_sim)
plt.subplot(2, 2, 4)
plt.plot(t_sim, x2_sim)

plt.subplot(2, 1, 1)
plt.plot(t_sim, u1_sim, 'r')
plt.subplot(2, 1, 2)
plt.plot(t_sim, u2_sim, 'r')

Figure 6.9, “Simulation result for the nominal model.” shows the result of the simulation.

Figure 6.9. Simulation result for the nominal model.

Simulation result for the nominal model.

Here, the simulated profiles are given by the green curves. Clearly, there is a mismatch in the response, especially for the two lower tanks. Think about why the model does not match the data, i.e., which parameters may have wrong values.

The next step towards solving a parameter estimation problem is to identify which parameters to tune. Typically, parameters which are not known precisely are selected. Also, the selected parameters need of course affect the mismatch between model response and data, when tuned. In a first attempt, we aim at decreasing the mismatch for the two lower tanks, and therefore we select the lower tank outflow areas, a1 and a2, as parameters to optimize. The Optimica specification for the estimation problem is contained in the class QuadTankPack.QuadTank_ParEstCasADi:

optimization QuadTank_ParEstCasADi(startTime=0, finalTime=60)

    extends QuadTank(x1(fixed=true), x1_0=0.06255,
                     x2(fixed=true), x2_0=0.06045,
                     x3(fixed=true), x3_0=0.02395,
                     x4(fixed=true), x4_0=0.02325,
                     a1(free=true, min=0, max=0.1e-4),
                     a2(free=true, min=0, max=0.1e-4));
end QuadTank_ParEstCasADi;

We have specified the time horizon to be one minute, which matches the length of the experiment, and that we want to estimate a1 and a2 by setting free=true for them. Unlike optimal control, the cost function is not specified using Optimica. This is instead specified from Python, using the ExternalData class and the code below.

# Create external data object for optimization
Q = N.diag([1., 1., 10., 10.])
data_x1 = N.vstack([t_meas, y1_meas])
data_x2 = N.vstack([t_meas, y2_meas])
data_u1 = N.vstack([t_meas, u1])
data_u2 = N.vstack([t_meas, u2])
quad_pen = OrderedDict()
quad_pen['x1'] = data_x1
quad_pen['x2'] = data_x2
quad_pen['u1'] = data_u1
quad_pen['u2'] = data_u2
external_data = ExternalData(Q=Q, quad_pen=quad_pen)

This will create an objective which is integral of the squared difference between the measured profiles for x1 and x2 and the corresponding model profiles. We will also introduce corresponding penalties for the two input variables, which are left as optimization variables. It would also have been possible to eliminate the input variables from the estimation problem by using the eliminated parameter of ExternalData. See the documentation of ExternalData for how to do this. Finally, we use a square matrix Q to weight the different components of the objective. We choose larger weights for the inputs, as we have larger faith in those values.

We are now ready to solve the optimization problem. We first set some options, where we specify the number of elements (time-discretization grid), the external data, and also provide the simulation with the nominal parameter values as an initial guess for the solution, which is also used to scale the variables instead of the variables' nominal attributes (if they have any):

# Set optimization options and optimize
opts = op.optimize_options()
opts['n_e'] = 60 # Number of collocation elements
opts['external_data'] = external_data
opts['init_traj'] = res_sim
opts['nominal_traj'] = res_sim
res = op.optimize(options=opts) # Solve estimation problem

Now, let's extract the optimal values of the parameters a1 and a2 and print them to the console:

# Extract estimated values of parameters
a1_opt = res.initial("a1")
a2_opt = res.initial("a2")

# Print estimated parameter values
print('a1: ' + str(a1_opt*1e4) + 'cm^2')
print('a2: ' + str(a2_opt*1e4) + 'cm^2')

You should get an output similar to:

a1: 0.0266cm^2
a2: 0.0271cm^2

The estimated values are slightly smaller than the nominal values - think about why this may be the case. Also note that the estimated values do not necessarily correspond to the physically true values. Rather, the parameter values are adjusted to compensate for all kinds of modeling errors in order to minimize the mismatch between model response and measurement data.

Next we plot the optimized profiles:

# Load state profiles
x1_opt = res["x1"]
x2_opt = res["x2"]
x3_opt = res["x3"]
x4_opt = res["x4"]
u1_opt = res["u1"]
u2_opt = res["u2"]
t_opt  = res["time"]

# Plot estimated trajectories
plt.subplot(2, 2, 1)
plt.plot(t_opt, x3_opt, 'k')
plt.subplot(2, 2, 2)
plt.plot(t_opt, x4_opt, 'k')
plt.subplot(2, 2, 3)
plt.plot(t_opt, x1_opt, 'k')
plt.subplot(2, 2, 4)
plt.plot(t_opt, x2_opt, 'k')

plt.subplot(2, 1, 1)
plt.plot(t_opt, u1_opt, 'k')
plt.subplot(2, 1, 2)
plt.plot(t_opt, u2_opt, 'k')

You will see the plot shown in Figure 6.10, “State profiles corresponding to estimated values of a1 and a2.”.

Figure 6.10. State profiles corresponding to estimated values of a1 and a2.

State profiles corresponding to estimated values of a1 and a2.

The profiles corresponding to the estimated values of a1 and a2 are shown in black curves. As can be seen, the match between the model response and the measurement data has been significantly increased. Is the behavior of the model consistent with the estimated parameter values?

Nevertheless, there is still a mismatch for the upper tanks, especially for tank 4. In order to improve the match, a second estimation problem may be formulated, where the parameters a1, a2, a3, a4 are free optimization variables, and where the squared errors of all four tank levels are penalized. Do this as an exercise!

5.3. Investigating optimization progress

This section describes some tools that can be used to investigate the progress of the nonlinear programming solver on an optimization problem. This information can be useful when debugging convergence problems; some of it (e.g. dual variables) may also be useful to gain a better understanding of the properties of an optimization problem. To make sense of the information that can be retrieved, we first give an overview of the collocation procedure that transcribes the optimization problem into a Nonlinear Program (NLP).

Methods for inspecting progress are divided into low level and high level methods, where the low level methods provide details of the underlying NLP while the high level methods are oriented towards the optimization problem as seen in the model formulation.

All functionality related to inspection of solver progress is exposed through the solver object as returned through the prepare_optimization method. If the optimization has been done through the optimize method instead, the solver can be obtained as in

res = model.optimize(options=opts)
solver = res.get_solver()

5.3.1. Collocation

To be able to solve a dynamic optimization problem, it is first discretized through collocation. Time is divided into elements (time intervals), and time varying variables are approximated by a low order polynomial over each element. Each polynomial piece is described by sample values at a number of collocation points (default 3) within the element. The result is that each time varying variable in the model is instantiated into one NLP variable for each collocation point within each element. Some variables may also need to be instantiated at additional points, such as the initial point which is typically not a collocation point.

The equations in a model are divided into initial equations, dae equations, path constraints and point constraints. These equations are also instantiated at different time points to become constraints in the NLP. Initial equations and point constraints are instantiated only once. Dae equations and path constraints are instantiated at collocation point of each element and possibly some additional points.

When using the methods described below, each model equation is referred to as a pair (eqtype, eqind). The string eqtype may be either 'initial', 'dae', 'path_eq', 'path_ineq', 'point_eq', or 'point_ineq'. The equation index eqind gives the index within the given equation type, and is a nonnegative integer less than the number of equations within the type. The symbolic model equations corresponding to given pairs (eqtype, eqind) can be retrieved through the get_equations method:

eq      = solver.get_equations(eqtype, 0)     # first equation of type eqtype
eqs     = solver.get_equations(eqtype, [1,3]) # second and fourth equation
all_eqs = solver.get_equations(eqtype)        # all equations of the given type

Apart from the model equations, collocation may also instantiate additional kinds of constraints, such as continuity constraints to enforce continuity of states between elements and collocation constraints to prescribe the coupling between states and their derivatives. These constraints have their own eqtype strings. A list of all equation types that are used in a given model can be retrieved using

eqtypes = solver.get_constraint_types()

5.3.2. Inspecting residuals

Given a potential solution to the NLP, the residual of a constraint is a number that specifies how close it is to being satisfied. For equalities, the residual must be (close to) zero for the solution to be feasible. For inequalities, the residual must be in a specified range, typically nonpositive. The constraint violation is zero if the residual is within bounds, and gives the signed distance to the closest bound otherwise; for equality constraints, this is the same as the residual. Methods for returning residuals actually return the violation by default, but have an option to get the raw residual.

For a feasible solution, all violations are (almost) zero. If an optimization converges to an infeasible point or does not have time to converge to a feasible one then the residuals show which constraints the NLP solver was unable to satisfy. If one problematic constraint comes into conflict with a number of constraints, all of them will likely have nonzero violations.

Residual values for a given equation type can be retrieved as a function of time through

r = solver.get_residuals(eqtype)

where r is an array of residuals of shape (n_timepoints, n_equations). There are also optional arguments: inds gives a subset of equation indices (e.g. inds=[0, 1]), point specifies whether to evaluate residuals at the optimization solution (point='opt', default) or the initial point (point='init'), and raw specifies whether to return constraint violations (raw=False, default) or raw residuals (raw=True).

The corresponding time points can be retrieved with

t, i, k = solver.get_constraint_points(eqtype)

where t, i, and k are vectors that give the time, element index, and collocation point index for each instantiation.

To get an overview of which residuals are the largest,


returns a list of equation types sorted by descending residual norm, and


returns a list of equation indices of the given type sorted by residual norm.

By default, the methods above work with the unscaled residuals that result directly from collocation. If the equation_scaling option is turned on, the constraints will be rescaled before they are sent to the NLP solver. It might be of more interest to look at the size of the scaled residuals, since these are what the NLP solver will try to make small. The above methods can then be made to work with the scaled residuals instead of the unscaled by use of the scaled=True keyword argument. The residual scale factors can also be retrieved in analogy to solver.get_residuals through

scales = solver.get_residual_scales(eqtype)

and an overview of the residual scale factors (or inverse scale factors with inv=True) can be gained from


5.3.3. Inspecting the constraint Jacobian

When solving the collocated NLP, the NLP solver typically has to evaluate the Jacobian of the constraint residual functions. Convergence problems can sometimes be related to numerical problems with the constraint Jacobian. In particular, Ipopt will never consider potential solution if there are nonfinite (infinity or not-a-number) entries in the Jacobian. If the Jacobian has such entries at the initial guess, the optimizer will give up completely.

The constraint Jacobian comes from the NLP. As seen from the original model, it contains the derivatives of the model equations (and also e.g. the collocation equations) with respect to the model variables at different time points. If one or several problematic entries are found in the Jacobian, it is often helpful to know the model equation and variable that they correspond to.

The set of (model equation, model variable) pairs that correspond to nonfinite entries in the constraint Jacobian can be printed with


or returned with

entries = solver.find_nonfinite_jacobian_entries()

There are also methods to allow to make more custom analyses of this kind. To instead list all Jacobian entries with an absolute value greater than 10, one can use

J = solver.get_nlp_jacobian() # Get the raw NLP constraint Jacobian as a (sparse) scipy.csc_matrix

# Find the indices of all entries with absolute value > 10
J.data = abs(J.data) > 10
c_inds, xx_inds = N.nonzero(J)

entries = solver.get_model_jacobian_entries(c_inds, xx_inds) # Map the indices to equations and variables in the model
solver.print_jacobian_entries(entries) # Print them

To get the Jacobian with residual scaling applied, use the scaled_residuals=True option.

5.3.4. Inspecting dual variables

Many NLP solvers (including Ipopt) produce a solution that consists of not only the primal variables (the actual NLP variables), but also one dual variable for each constraint in the NLP. Upon convergence, the value of each dual variable gives the change in the optimal objective per unit change in the residual. Thus, the dual variables can give an idea of which constraints are most hindering when it comes to achieving a lower objective value, however, they must be interpreted in relation to how much it might be possible to change any given constraint.

Dual variable values for a given equation type can be retrieved as a function of time through

d = solver.get_constraint_duals(eqtype)

in analogy to solver.get_residuals. To get constraint duals for the equation scaled problem, use the scaled=True keyword argument. Just as with get_residuals, the corresponding time points can be retrieved with

t, i, k = solver.get_constraint_points(eqtype)

Besides regular constraints, the NLP can also contain upper and lower bounds on variables. These will correspond to the Modelica min and max attributes for instantiated model variables. The dual variables for the bounds on a given model variable var can be retrieved as a function of time through

d = solver.get_bound_duals(var)

The corresponding time points can be retrieved with

t, i, k = solver.get_variable_points(var)

5.3.5. Inspecting low level information about NLP solver progress

The methods described above generally hide the actual collocated NLP and only require to work with model variables and equations, instantiated at different points. There also exist lower level methods that expose the NLP level information and its mapping to the original model more directly, and may be useful for more custom applications. These include

  • get_nlp_variables, get_nlp_residuals, get_nlp_bound_duals, and get_nlp_constraint_duals to get raw vectors from the NLP solution.

  • get_nlp_variable_bounds and get_nlp_residual_bounds to get the corresponding bounds used in the NLP.

  • get_nlp_residual_scales to get the raw residual scale factors.

  • get_nlp_variable_indices and get_nlp_constraint_indices to get mappings from model variables and equations to their NLP counterparts.

  • get_point_time to get the times of collocation points (i, k).

  • get_model_variables and get_model_constraints to map from NLP variables and constraints to the corresponding model variables and equations.

The low level constraint Jacobian methods get_nlp_jacobian, get_model_jacobian_entries, and the print_jacobian_entries method have already been covered in the section about jacobians above.