## 6. Dynamic optimization of DAEs using direct collocation with CasADi

### 6.1. Algorithm overview

The default algorithm described in Section 5 is well tested, robust, and has been used in several large-scale industrial projects. Due to its implementation in C, however, its flexibility is limited and extending it requires significant effort. Therefore, development of a new algorithm for optimal control and parameter estimation of DAEs has been initiated, based on the same collocation scheme that has proven successful in the default algorithm. The new algorithm uses the CasADi package for evaluation of first- and second-order derivatives and for integration with the IPOPT solver. The two main benefits of using CasADi, through its Python interface, are increased flexibility and extensibility of the implementation, and improved performance. The goal is for the CasADi-based algorithm to replace the C-based algorithm once it has sufficiently matured.

The direct collocation method can be used to solve dynamic optimization problems, including optimal control problems and parameter optimization problems. In the collocation method, the dynamic model variable profiles are approximated by piecewise polynomials. This method of approximating a differential equation corresponds to a fixed step implicit Runge-Kutta scheme, where the mesh defines the length of each step. Also, the number of collocation points in each element, or step, needs to be provided. This number corresponds to the stage order of the Runge-Kutta scheme. The selection of mesh is analogous to the choice of step length in a one-step algorithm for solving differential equations. Accordingly, the mesh needs to be fine-grained enough to ensure sufficiently accurate approximation of the differential constraint. For an overview of simultaneous optimization algorithms, see [2]. The algorithm IPOPT is used to solve the non-linear program resulting from collocation.
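To illustrate the polynomial approximation underlying collocation, the following sketch (not part of the toolchain; the state values at the points are hypothetical) evaluates the Lagrange interpolation polynomial that represents a state on one finite element with n_cp = 3 Radau points:

```python
import numpy as np

# On one element, a state profile is represented by the Lagrange polynomial
# through its values at the element start and the n_cp = 3 Radau collocation
# points (normalized to [0, 1]). The state values below are hypothetical.
tau = [0.0, (4 - np.sqrt(6)) / 10, (4 + np.sqrt(6)) / 10, 1.0]
x_vals = [1.0, 0.9, 0.6, 0.4]

def lagrange_eval(points, values, tau_eval):
    """Evaluate the interpolating polynomial at normalized time tau_eval."""
    result = 0.0
    for k, (tk, xk) in enumerate(zip(points, values)):
        basis = 1.0
        for j, tj in enumerate(points):
            if j != k:
                basis *= (tau_eval - tj) / (tk - tj)
        result += xk * basis
    return result

# The polynomial reproduces the given values at the collocation points exactly
print(lagrange_eval(tau, x_vals, 1.0))  # → 0.4
```

The collocation equations then require that the time derivative of this polynomial matches the model dynamics at each collocation point, which is what makes the scheme equivalent to a fixed-step implicit Runge-Kutta method.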

The collocation method requires that the model to be optimized does not contain discontinuities such as if equations, when clauses or integer variables.
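For example, a model containing an if equation such as the following (the variables are hypothetical) cannot be handled by this algorithm:

```modelica
equation
  // Discontinuous heat transfer law; not supported by the collocation algorithm
  if T > T_crit then
    Q = 0;
  else
    Q = k * (T_amb - T);
  end if;
```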

The mathematical formulation of the algorithm can be found in the JMI API documentation.

Optimization models are represented using the class OptimizationProblem. An object containing all the options for the optimization algorithm can be retrieved from the model object:

opts = model.optimize_options()

opts? # View the help text

After options have been set, the options object can be propagated to the optimize method:

res = model.optimize(options=opts)

The options for the algorithm are shown in Table 7.3. Additional documentation is available in the Python class documentation.

Table 7.3. Options for the CasADi- and collocation-based optimization algorithm

| Option | Default | Description |
|---|---|---|
| n_e | 50 | Number of finite elements. |
| hs | None | Element lengths. Possible values: None, iterable of floats, and "free". None: the element lengths are uniformly distributed. Iterable of floats: component i of the iterable specifies the length of element i; the lengths must be normalized so that they sum to 1. "free": the element lengths become optimization variables and are optimized according to the algorithm option free_element_lengths_data. WARNING: the "free" option is very experimental and will not always give desirable results. |
| free_element_lengths_data | None | Data used for optimizing the element lengths if they are free. Should be None when hs != "free". |
| n_cp | 3 | Number of collocation points in each element. |
| discr | 'LGR' | Determines the collocation scheme used to discretize the problem. Possible values: "LG" (Gauss collocation, Legendre-Gauss) and "LGR" (Radau collocation, Legendre-Gauss-Radau). |
| expand_to_SX | True | Whether to expand the CasADi MX graphs to SX graphs. |
| named_vars | False | Whether to name the NLP variables according to their corresponding Modelica/Optimica names. This disables the solution of the NLP and should only be used for investigative purposes. |
| init_traj | None | Variable trajectory data used for initialization of the NLP variables. |
| variable_scaling | True | Whether to scale the variables according to their nominal values or the trajectories provided with the nominal_traj option. |
| nominal_traj | None | Variable trajectory data used for scaling of the NLP variables. Only applicable if variable scaling is enabled. |
| nominal_traj_mode | {"_default_mode": "linear"} | Mode for computing scaling factors based on nominal trajectories. Four possible modes: "attribute" (time-invariant, linear scaling based on the nominal attribute), "linear" (time-invariant, linear scaling), "affine" (time-invariant, affine scaling), and "time-variant" (time-variant, linear scaling). The option is a dictionary with variable names as keys and corresponding scaling modes as values. For all variables not occurring among the keys, the mode given by the "_default_mode" entry is used, which by default is "linear". |
| result_file_name | "" | Name of the file where the result is written. An empty string gives a default file name based on the name of the model class. |
| write_scaled_result | False | Return the scaled optimization result if set to True, otherwise the unscaled result. Only applicable when variable_scaling is enabled and only intended for debugging. |
| print_condition_numbers | False | Print the condition numbers of the Jacobian of the constraints and of the simplified KKT matrix at the initial and optimal points. Note that this is only feasible for very small problems. |
| result_mode | 'collocation_points' | Output format of the optimization result. Possible values: "collocation_points", "element_interpolation", and "mesh_points". "collocation_points": the result is given at the collocation points as well as the start and final time points. "element_interpolation": the variable trajectories are computed by evaluating the collocation polynomials; the algorithm option n_eval_points specifies the evaluation points within each finite element. "mesh_points": the result is given at the mesh points. |
| n_eval_points | 20 | Number of evaluation points used in each element when result_mode is set to "element_interpolation". One evaluation point is placed at each element end-point (hence the value must be at least 2) and the rest are distributed uniformly. |
| blocking_factors | None (not used) | Iterable of blocking factors, where each element specifies the number of elements over which the control profiles should be constant. For example, if blocking_factors == [2, 1, 5], then u_0 = u_1 and u_3 = u_4 = u_5 = u_6 = u_7. The sum of the elements in the iterable must equal the number of elements. If None, the usual collocation polynomials are instead used to represent the controls. |
| quadrature_constraint | True | Whether to use quadrature continuity constraints. Only applicable when using Gauss collocation; incompatible with eliminate_der_var set to True. True: quadrature is used to get the values of the states at the mesh points. False: the Lagrange basis polynomials for the state collocation polynomials are evaluated to get the values of the states at the mesh points. |
| eliminate_der_var | False | True: the variables representing the derivatives are eliminated via the collocation equations and are thus not part of the NLP, with the exception of \dot{x}_{1, 0}, which is not eliminated since the collocation equations are not enforced at t_0. False: the derivative variables are kept as NLP variables and the collocation equations enter as constraints. |
| eliminate_cont_var | False | True: let the same variables represent both the values of the states at the start of each element and at the end of the previous element. False: for Radau collocation, extra variables x_{i, 0}, representing the states at the start of each element, are created and constrained to equal the corresponding variable at the end of the previous element. For Gauss collocation, extra variables x_{i, n_cp + 1}, representing the states at the end of each element, are created and constrained to equal the corresponding variable at the start of the succeeding element. |
| measurement_data | None | Data used to penalize, constrain, or eliminate certain variables. |
| IPOPT_options | IPOPT defaults | IPOPT options for the solution of the NLP. See the documentation of CasADi's IPOPT interface for available options. |

A few of the options are experimental, so changing default values should be done with some care.
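As an example of the hs option, a non-uniform mesh can be specified as follows (a sketch: a plain dictionary stands in for the object returned by optimize_options, and the relative element lengths are hypothetical):

```python
import numpy as np

# Sketch of specifying non-uniform element lengths via the hs option.
# A plain dict stands in for the object returned by model.optimize_options().
opts = {"n_e": 4, "hs": None}

raw = np.array([1.0, 1.0, 2.0, 4.0])  # hypothetical relative element lengths
opts["hs"] = list(raw / raw.sum())    # normalize so that the lengths sum to 1

print(opts["hs"])  # the last element covers half of the optimization horizon
```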

The last option, IPOPT_options, serves as an interface for setting options in IPOPT. Note that this option utilizes CasADi's IPOPT interface, which includes more options than just the original IPOPT options. Thus, please see the documentation for CasADi's IpoptSolver class for a complete list of the available IPOPT options, which is available at http://casadi.sourceforge.net/api/html/dd/df1/classCasADi_1_1IpoptSolver.html. To exemplify the usage of this algorithm option, the maximum number of iterations in IPOPT can be set using the following syntax:

opts = model.optimize_options()
opts["IPOPT_options"]["max_iter"] = 10000

The algorithm is in many ways similar to the JMI algorithm, but there are a few key differences. As mentioned above, it offers increased performance, mainly for two reasons. The first is that this algorithm constructs automatic differentiation graphs for the entire NLP, leading to very fast function evaluations, whereas the JMI algorithm only constructs graphs for evaluation of residuals. The second is that second-order derivatives are computed using automatic differentiation, whereas the JMI algorithm uses a quasi-Newton method; this affects both convergence speed and robustness. The exact Hessian computation can be disabled by setting the IPOPT option generate_hessian to False, in which case the same quasi-Newton approach as in the JMI algorithm is used.

JModelica.org's CasADi framework does not support simulation and initialization of models. It is recommended to use PyFMI for these purposes instead.

Some statistics from IPOPT can be obtained by issuing the command

res.solver.get_ipopt_statistics()


The return values of this function are described by the interactive help:

help(res.solver.get_ipopt_statistics)
Get statistics from the last optimization run.

Returns::

return_status --
Return status from IPOPT.

nbr_iter --
Number of iterations.

objective --
Final value of objective function.

total_exec_time --
Execution time.

### 6.2. Examples

#### 6.2.1. Optimal control

This tutorial is based on the Hicks-Ray Continuously Stirred Tank Reactor (CSTR) system. The model was originally presented in [1]. The system has two states: the concentration, c, and the temperature, T. The control input to the system is the temperature, Tc, of the cooling flow in the reactor jacket. The chemical reaction in the reactor is exothermic and temperature dependent: a high temperature results in a high reaction rate. The CSTR dynamics are given by:

This tutorial will cover the following topics:

• How to solve a DAE initialization problem. The initialization model has equations specifying that all derivatives should be identically zero, which implies that a stationary solution is obtained. Two stationary points, corresponding to different inputs, are computed. We call the stationary points A and B respectively. Point A corresponds to operating conditions where the reactor is cold and the reaction rate is low, whereas point B corresponds to a higher temperature where the reaction rate is high.

• An optimal control problem is solved where the objective is to transfer the state of the system from stationary point A to point B. The challenge is to ignite the reactor while avoiding uncontrolled temperature increases. It is also demonstrated how to set parameter and variable values in a model. More information about the simultaneous optimization algorithm can be found in the JModelica.org API documentation.

• The optimization result is saved to file and then the important variables are plotted.

The Python commands in this tutorial may be copied and pasted directly into a Python shell, in some cases with minor modifications. Alternatively, you may copy the commands into a text file, e.g., cstr_casadi.py.

Start the tutorial by creating a working directory and copying the file $JMODELICA_HOME/Python/pyjmi/examples/files/CSTR.mop to it. An online version of CSTR.mop is also available (depending on which browser you use, you may have to accept the site certificate by clicking through a few steps). If you choose to create a Python script file, save it to the working directory.

##### Compile and instantiate a model object

The functions and classes used in the tutorial script need to be imported into the Python script. This is done by the following Python commands. Copy them and paste them either directly into your Python shell or, preferably, into your Python script file.

import numpy as N
import matplotlib.pyplot as plt

from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem


To solve the initialization problem and simulate the model, we will first compile it as an FMU and load it in Python. These steps are described in more detail in Section 4.

# Compile the stationary initialization model into an FMU
init_fmu = compile_fmu("CSTR.CSTR_Init", "CSTR.mop")

# Load the FMU
init_model = load_fmu(init_fmu)



At this point, you may open the file CSTR.mop, containing the CSTR model and the static initialization model used in this section. Study the classes CSTR.CSTR and CSTR.CSTR_Init and make sure you understand the models. Before proceeding, have a look at the interactive help for one of the functions you used:

help(compile_fmu)

##### Solve the DAE initialization problem

In the next step, we would like to specify the first operating point, A, by means of a constant input cooling temperature, and then solve the initialization problem assuming that all derivatives are zero.

# Set input for Stationary point A
Tc_0_A = 250
init_model.set('Tc', Tc_0_A)

# Solve the initialization problem using FMI
init_model.initialize()

# Store stationary point A
[c_0_A, T_0_A] = init_model.get(['c', 'T'])

# Print some data for stationary point A
print(' *** Stationary point A ***')
print('Tc = %f' % Tc_0_A)
print('c = %f' % c_0_A)
print('T = %f' % T_0_A)


Notice how the method set is used to set the value of the control input. The initialization algorithm is invoked by calling the method initialize, and the values of the states corresponding to point A are then extracted with the method get. Look at the output in the Python shell to see the stationary values. Display the help text for the initialize method and take a moment to look it through. The procedure is now repeated for operating point B:

# Set inputs for Stationary point B
init_model.reset() # reset the FMU so that we can initialize it again
Tc_0_B = 280
init_model.set('Tc', Tc_0_B)

# Solve the initialization problem using FMI
init_model.initialize()

# Store stationary point B
[c_0_B, T_0_B] = init_model.get(['c', 'T'])

# Print some data for stationary point B
print(' *** Stationary point B ***')
print('Tc = %f' % Tc_0_B)
print('c = %f' % c_0_B)
print('T = %f' % T_0_B)


We have now computed two stationary points for the system based on constant control inputs. In the next section, these will be used to set up an optimal control problem.

##### Solving an optimal control problem

The optimal control problem we are about to solve is given by:

and is expressed in Optimica format in the class CSTR.CSTR_Opt2 in the CSTR.mop file above. Have a look at this class and make sure that you understand how the optimization problem is formulated and what the objective is.

Direct collocation methods often require good initial guesses in order to ensure robust convergence. Since initial guesses are needed for all discretized variables along the optimization interval, simulation provides a convenient means to generate state and derivative profiles given an initial guess for the control input(s). It is therefore common to set up a dedicated model for computation of initial trajectories. In the model CSTR.CSTR_Init_Optimization in the CSTR.mop file, a step input is applied to the system in order to obtain an initial guess. Notice that the variable names in the initialization model must match those in the optimal control model.

First, compile the model and set model parameters:

# Compile the optimization initialization model
init_sim_fmu = compile_fmu("CSTR.CSTR_Init_Optimization", "CSTR.mop")

# Load the FMU
init_sim_model = load_fmu(init_sim_fmu)

# Set initial and reference values
init_sim_model.set('cstr.c_init', c_0_A)
init_sim_model.set('cstr.T_init', T_0_A)
init_sim_model.set('c_ref', c_0_B)
init_sim_model.set('T_ref', T_0_B)
init_sim_model.set('Tc_ref', Tc_0_B)


Having initialized the model parameters, we can simulate the model using the simulate function.

init_res = init_sim_model.simulate(start_time=0., final_time=150.)


The method simulate first computes consistent initial conditions and then simulates the model in the interval 0 to 150 seconds. Take a moment to read the interactive help for the simulate method.

The simulation result object is returned, and the variable trajectories can be retrieved using Python dictionary access:

# Extract variable profiles
t_init_sim = init_res['time']
c_init_sim = init_res['cstr.c']
T_init_sim = init_res['cstr.T']
Tc_init_sim = init_res['cstr.Tc']

# Plot the initial guess trajectories
plt.close(1)
plt.figure(1)
plt.subplot(3, 1, 1)
plt.plot(t_init_sim, c_init_sim)
plt.grid()
plt.ylabel('Concentration')
plt.title('Initial guess obtained by simulation')

plt.subplot(3, 1, 2)
plt.plot(t_init_sim, T_init_sim)
plt.grid()
plt.ylabel('Temperature')

plt.subplot(3, 1, 3)
plt.plot(t_init_sim, Tc_init_sim)
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()


Look at the plots and try to relate the trajectories to the optimal control problem. Why is this a good initial guess?

Once the initial guess is generated, we compile the optimal control problem:

# Compile and load optimization problem
op = transfer_optimization_problem("CSTR.CSTR_Opt2", "CSTR.mop")


We will now initialize the parameters of the model so that their values correspond to the optimization objective of transferring the system state from operating point A to operating point B. Accordingly, we set the parameters representing the initial values of the states to point A and the reference values in the cost function to point B:

# Set reference values
op.set('Tc_ref', Tc_0_B)
op.set('c_ref', float(c_0_B))
op.set('T_ref', float(T_0_B))

# Set initial values
op.set('cstr.c_init', float(c_0_A))
op.set('cstr.T_init', float(T_0_A))


Collocation-based optimization algorithms often require a good initial guess in order to achieve fast convergence. Moreover, if the problem is non-convex, initialization is even more critical. Initial guesses can be provided in Optimica by the initialGuess attribute; see the CSTR.mop file for an example of this. Notice that initialization in the case of collocation-based optimization methods means initialization of all the control and state profiles as functions of time. In some cases, it is sufficient to use constant profiles, for which the initialGuess attribute works well. In more difficult cases, however, it may be necessary to initialize the profiles using simulation data, where an initial guess for the input(s) has been used to generate the profiles for the dependent variables. This approach is used in this tutorial.
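For instance, a constant initial profile could be provided directly in the Optimica model via the initialGuess attribute (a sketch with hypothetical variable declarations):

```modelica
Real T(initialGuess = 300);
input Real Tc(free = true, initialGuess = 280);
```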

We are now ready to solve the actual optimization problem. This is done by invoking the method optimize:

# Set options
opt_opts = op.optimize_options()
opt_opts['n_e'] = 19 # Number of elements
opt_opts['init_traj'] = init_res.result_data
opt_opts['IPOPT_options']['tol'] = 1e-10

# Solve the optimal control problem
res = op.optimize(options=opt_opts)


In this case, we reduce the number of finite elements in the mesh from the default 50 to 19, by setting the corresponding option and providing the options object as an argument to the optimize method. We also tighten the tolerance of IPOPT to get a more accurate result. You should see the output of IPOPT in the Python shell as the algorithm iterates towards the optimal solution. IPOPT should terminate with a message like 'Optimal solution found' or 'Solved to an acceptable level' for an optimum to have been found. The optimization result object is returned and the optimization data are stored in res.

We can now retrieve the trajectories of the variables that we intend to plot:

# Extract variable profiles
c_res = res['cstr.c']
T_res = res['cstr.T']
Tc_res = res['cstr.Tc']
time_res = res['time']

c_ref = res.final('c_ref') # extract constant value as last element
T_ref = res.final('T_ref')
Tc_ref = res.final('Tc_ref')


Finally, we plot the result using the functions available in matplotlib:

# Plot the results
plt.close(2)
plt.figure(2)
plt.subplot(3, 1, 1)
plt.plot(time_res, c_res)
plt.plot([time_res[0], time_res[-1]], [c_ref, c_ref], '--')
plt.grid()
plt.ylabel('Concentration')
plt.title('Optimized trajectories')

plt.subplot(3, 1, 2)
plt.plot(time_res, T_res)
plt.plot([time_res[0], time_res[-1]], [T_ref, T_ref], '--')
plt.grid()
plt.ylabel('Temperature')

plt.subplot(3, 1, 3)
plt.plot(time_res, Tc_res)
plt.plot([time_res[0], time_res[-1]], [Tc_ref, Tc_ref], '--')
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()


Notice that parameters are returned as scalar values whereas variables are returned as vectors, and that this must be taken into account when plotting. You should now see the plot shown in Figure 7.11.

Take a minute to analyze the optimal profiles and to answer the following questions:

1. Why is the concentration high in the beginning of the interval?

2. Why is the input cooling temperature high in the beginning of the interval?

##### Verify optimal control solution

Solving optimal control problems by means of direct collocation implies that the differential equation is approximated by a time-discrete counterpart. The accuracy of the solution is dependent on the method of collocation and the number of elements. In order to assess the accuracy of the discretization, we may simulate the system using the optimal control profile as input. With this approach, the state profiles are computed with high accuracy and the result may then be compared with the profiles resulting from optimization. Notice that this procedure does not verify the optimality of the resulting optimal control profiles, but only the accuracy of the discretization of the dynamics.
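One way to quantify the discrepancy, sketched below with synthetic data, is to interpolate the simulated trajectory onto the optimization time points and compute a relative error (the helper function and the data are illustrative, not part of the tutorial files):

```python
import numpy as np

def max_relative_error(t_opt, y_opt, t_sim, y_sim):
    """Maximum pointwise deviation between an optimized trajectory and a
    finely resolved simulated one, relative to the trajectory's magnitude."""
    y_at_opt = np.interp(t_opt, t_sim, y_sim)  # simulated values at the optimization points
    return np.max(np.abs(y_at_opt - y_opt)) / np.max(np.abs(y_opt))

# Synthetic example: a coarse "optimized" trajectory vs. a fine "simulated" one
t_opt = np.linspace(0.0, 150.0, 20)
t_sim = np.linspace(0.0, 150.0, 2000)
y_opt = np.exp(-t_opt / 50.0)
y_sim = np.exp(-t_sim / 50.0)
print(max_relative_error(t_opt, y_opt, t_sim, y_sim))  # small for matching trajectories
```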

# Compile the model
sim_fmu = compile_fmu("CSTR.CSTR", "CSTR.mop")

# Load the FMU
sim_model = load_fmu(sim_fmu)



The solution obtained from the optimization consists of values at a finite number of time points, in this case the collocation points. In the JMI framework, these values are normally interpolated linearly when used for simulation purposes. The CasADi framework can additionally return the collocation polynomials for all input variables in the form of a function, which can be used during simulation for greater accuracy. We obtain it from the result object in the following manner:

# Get optimized input
(_, opt_input) = res.solver.get_opt_input()


We specify the initial values and simulate using the optimal trajectory:

# Set initial values
sim_model.set('c_init', c_0_A)
sim_model.set('T_init', T_0_A)

# Simulate using optimized input
sim_opts = sim_model.simulate_options()
sim_opts['CVode_options']['rtol'] = 1e-6
sim_opts['CVode_options']['atol'] = 1e-8
res = sim_model.simulate(start_time=0., final_time=150.,
input=('Tc', opt_input), options=sim_opts)


Finally, we load the simulated data and plot it to compare with the optimized trajectories:

# Extract variable profiles
c_sim = res['c']
T_sim = res['T']
Tc_sim = res['Tc']
time_sim = res['time']

# Plot the results
plt.figure(3)
plt.clf()
plt.subplot(3, 1, 1)
plt.plot(time_res, c_res, '--')
plt.plot(time_sim, c_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Concentration')

plt.subplot(3, 1, 2)
plt.plot(time_res, T_res, '--')
plt.plot(time_sim, T_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Temperature')

plt.subplot(3, 1, 3)
plt.plot(time_res, Tc_res, '--')
plt.plot(time_sim, Tc_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()


You should now see the plot shown in Figure 7.12.

Discuss why the simulated trajectories differ from their optimized counterparts.

##### Exercises

After completing the tutorial you may continue to modify the optimization problem and study the results.

1. Remove the constraint on cstr.T. What is then the maximum temperature?

2. Play around with weights in the cost function. What happens if you penalize the control variable with a larger weight? Do a parameter sweep for the control variable weight and plot the optimal profiles in the same figure.

3. Add terminal constraints ('cstr.T(finalTime)=someParameter') for the states so that they are equal to point B at the end of the optimization interval. Now reduce the length of the optimization interval. How short can you make the interval?

4. Try varying the number of elements in the mesh and the number of collocation points in each interval.

##### References

[1] G.A. Hicks and W.H. Ray. "Approximation Methods for Optimal Control Synthesis." Canadian Journal of Chemical Engineering, 40:522–529, 1971.

[2] L.T. Biegler, A.M. Cervantes, and A. Wächter. "Advances in simultaneous strategies for dynamic optimization." Chemical Engineering Science, 57:575–593, 2002.

#### 6.2.2. Minimum time problems

Minimum time problems are dynamic optimization problems where not only the control inputs are optimized, but also the final time. Typically, elements of such problems include initial and terminal state constraints and an objective function where the transition time is minimized. The following example will be used to illustrate how minimum time problems are formulated in Optimica. We consider the optimization problem:

$$\min_{u(t),\, t_f} \; t_f$$

subject to the Van der Pol dynamics:

$$\dot{x}_1 = (1 - x_2^2)\,x_1 - x_2 + u, \quad x_1(0) = 0$$

$$\dot{x}_2 = x_1, \quad x_2(0) = 1$$

and the constraints:

$$-1 \leq u(t) \leq 1, \quad x_1(t_f) = 0, \quad x_2(t_f) = 0$$

This problem is encoded in the following Optimica specification:

optimization VDP_Opt_Min_Time (objective = finalTime,
                               startTime = 0,
                               finalTime(free=true, min=0.2, initialGuess=1))

  // The states
  Real x1(start=0, fixed=true);
  Real x2(start=1, fixed=true);

  // The control signal
  input Real u(free=true, min=-1, max=1);

equation
  // Dynamic equations
  der(x1) = (1 - x2^2) * x1 - x2 + u;
  der(x2) = x1;

constraint
  // Terminal constraints
  x1(finalTime) = 0;
  x2(finalTime) = 0;
end VDP_Opt_Min_Time;


Notice how the class attribute finalTime is set to be free in the optimization. The problem is solved by the following Python script:

# Import numerical libraries
import numpy as N
import matplotlib.pyplot as plt

# Import the JModelica.org Python packages
from pymodelica import compile_fmu
from pyjmi import transfer_optimization_problem

vdp = transfer_optimization_problem("VDP_Opt_Min_Time", "VDP_Opt_Min_Time.mop")
res = vdp.optimize()

# Extract variable profiles
x1 = res['x1']
x2 = res['x2']
u = res['u']
t = res['time']

# Plot
plt.figure(1)
plt.clf()
plt.subplot(3, 1, 1)
plt.plot(t, x1)
plt.grid()
plt.ylabel('x1')

plt.subplot(3, 1, 2)
plt.plot(t, x2)
plt.grid()
plt.ylabel('x2')

plt.subplot(3, 1, 3)
plt.plot(t, u, 'x-')
plt.grid()
plt.ylabel('u')
plt.xlabel('time')
plt.show()


The resulting control and state profiles are shown in Figure 7.13. Notice the difference compared to Figure 7.1, where the Van der Pol oscillator system is optimized using a quadratic objective function.