The direct collocation method can be used to solve dynamic optimization problems, including optimal control problems and parameter optimization problems. In the collocation method, the dynamic model variable profiles are approximated by piecewise polynomials. This method of approximating a differential equation corresponds to a fixed step implicit Runge-Kutta scheme, where the mesh defines the length of each step. Also, the number of collocation points in each element, or step, needs to be provided. This number corresponds to the stage order of the Runge-Kutta scheme. The selection of mesh is analogous to the choice of step length in a one-step algorithm for solving differential equations. Accordingly, the mesh needs to be fine-grained enough to ensure sufficiently accurate approximation of the differential constraint. For an overview of simultaneous optimization algorithms, see [2]. The nonlinear programming solvers IPOPT and WORHP can be used to solve the nonlinear program resulting from collocation. The needed first- and second-order derivatives are obtained using CasADi by algorithmic differentiation.
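As an illustration of the Radau scheme underlying the default discretization, the collocation points for `n_cp` points within a normalized element can be computed as the roots of P<sub>n-1</sub> − P<sub>n</sub>, where P<sub>k</sub> is the k-th Legendre polynomial. This is a standalone sketch for intuition only; the function name is illustrative and not part of the pyjmi API:

```python
import numpy as np
from numpy.polynomial import legendre

def radau_points(n_cp):
    # Legendre-Gauss-Radau points on (0, 1], right endpoint included:
    # roots of P_{n_cp-1}(x) - P_{n_cp}(x) on [-1, 1], mapped to (0, 1].
    c = np.zeros(n_cp + 1)
    c[n_cp - 1] = 1.0
    c[n_cp] = -1.0
    return np.sort((legendre.legroots(c) + 1.0) / 2.0)
```

For the default `n_cp = 3` this gives points at approximately 0.155, 0.645 and 1.0 within each element.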

The collocation method requires that the model equations are twice continuously differentiable with respect to all of the variables. This means, for example, that the model cannot contain integer variables or if-clauses that depend on the states.
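A common workaround when a model contains a non-smooth conditional is to replace it with a smooth approximation before optimization. The sketch below is purely illustrative (the `smooth_step` helper is not part of JModelica.org); it approximates the step `if x > 0 then 1 else 0` with a C-infinity function:

```python
import math

def smooth_step(x, eps=1e-3):
    # Smooth approximation of the step "if x > 0 then 1 else 0".
    # Smaller eps gives a sharper (but more nonlinear) transition.
    return 0.5 * (1.0 + math.tanh(x / eps))
```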

Optimization models are represented using the class `OptimizationProblem`, which can be instantiated using the `transfer_optimization_problem` method. An object containing all the options for the optimization algorithm can be retrieved from the object:

```python
from pyjmi import transfer_optimization_problem

op = transfer_optimization_problem(class_name, optimica_file_path)
opts = op.optimize_options()
opts? # View the help text
```

After options have been set, the options object can be passed to the `optimize` method, which solves the optimization problem:

```python
res = op.optimize(options=opts)
```

The standard options for the algorithm are shown in Table 6.1, “Standard options for the CasADi- and collocation-based optimization algorithm”. Additional documentation is available in the Python class documentation. The algorithm also has many experimental options, which are less well tested, and some of which are intended purely for debugging. These are shown in Table 6.2, “Experimental and debugging options for the CasADi- and collocation-based optimization algorithm”; caution is advised when changing their default values.

**Table 6.1. Standard options for the CasADi- and collocation-based optimization algorithm**

Option | Default | Description |
---|---|---|
`n_e` | 50 | Number of finite elements. |
`hs` | None | Element lengths. Possible values: None, iterable of floats, and "free". None: the element lengths are uniformly distributed. Iterable of floats: component i of the iterable specifies the length of element i; the lengths must be normalized in the sense that they sum to 1. "free": the element lengths become optimization variables and are optimized according to the algorithm option free_element_lengths_data. WARNING: the "free" option is very experimental and will not always give desirable results. |
`n_cp` | 3 | Number of collocation points in each element. |
`expand_to_sx` | "NLP" | Whether to expand the CasADi MX graphs to SX graphs. Possible values: "NLP", "DAE", "no". "NLP": the entire NLP graph is expanded into SX, leading to high evaluation speed and high memory consumption. "DAE": the DAE, objective and constraint graphs for the dynamic optimization problem expressions are expanded into SX, but the full NLP graph is an MX graph, leading to moderate evaluation speed and moderate memory consumption. "no": all constructed graphs are MX graphs, leading to low evaluation speed and low memory consumption. |
`init_traj` | None | Variable trajectory data used for initialization of the NLP variables. |
`nominal_traj` | None | Variable trajectory data used for scaling of the NLP variables. This option is only applicable if variable scaling is enabled. |
`blocking_factors` | None (not used) | Blocking factors are used to enforce piecewise constant inputs: the inputs may only change values at some of the element boundaries. The option is either None (disabled), an instance of pyjmi.optimization.casadi_collocation.BlockingFactors, or a list of blocking factors. If the option is a list, each element of the list specifies the number of collocation elements for which all of the inputs must be constant. For example, if blocking_factors == [2, 2, 1], then the inputs will attain 3 different values (the number of elements in the list), changing value between collocation elements 2 and 3 as well as between elements 4 and 5. The sum of all elements in the list must equal the number of collocation elements, and the length of the list determines the number of separate values that the inputs may attain. See the documentation of the BlockingFactors class for how to use it. If blocking_factors is None, the usual collocation polynomials are instead used to represent the controls. |
`external_data` | None | Data used to penalize, constrain or eliminate certain variables. |
`delayed_feedback` | None | If not `None`, should be a `dict` with mappings `'delayed_var': ('undelayed_var', delay_ne)`. For each key-value pair, adds the constraint that the variable `'delayed_var'` equals the value of the variable `'undelayed_var'` delayed by `delay_ne` elements. The initial part of the trajectory for `'delayed_var'` is fixed to its initial guess, given by the `init_traj` option or the `initialGuess` attribute. `'delayed_var'` will typically be an input. This is an experimental feature and is subject to change. |
`solver` | 'IPOPT' | Specifies the nonlinear programming solver to be used. Possible choices are 'IPOPT' and 'WORHP'. |
`IPOPT_options` | IPOPT defaults | IPOPT options for the solution of the NLP. See IPOPT's documentation for available options. |
`WORHP_options` | WORHP defaults | WORHP options for the solution of the NLP. See WORHP's documentation for available options. |
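To make the list form of `blocking_factors` concrete, the hypothetical helper below (not part of pyjmi) expands a factor list into the ranges of collocation elements over which the inputs are held constant:

```python
def blocking_ranges(factors):
    # For blocking factors [2, 2, 1] over 5 elements, the inputs are constant
    # over elements 1-2, 3-4 and 5, i.e. they change between elements 2 and 3
    # and between elements 4 and 5, matching the description in Table 6.1.
    ranges, start = [], 1
    for f in factors:
        ranges.append((start, start + f - 1))
        start += f
    return ranges
```

With factors `[2, 2, 1]` this yields `[(1, 2), (3, 4), (5, 5)]`, so the inputs attain three distinct values over five elements.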

**Table 6.2. Experimental and debugging options for the CasADi- and collocation-based optimization algorithm**

Option | Default | Description |
---|---|---|
`free_element_lengths_data` | None | Data used for optimizing the element lengths if they are free. Should be None when hs != "free". |
`discr` | 'LGR' | Determines the collocation scheme used to discretize the problem. Possible values: "LG" and "LGR". "LG": Gauss collocation (Legendre-Gauss). "LGR": Radau collocation (Legendre-Gauss-Radau). |
`named_vars` | False | If enabled, the solver will create a duplicated set of NLP variables whose names correspond to the Modelica/Optimica variable names. Symbolic expressions of the NLP consisting of the named variables can then be obtained using the get_named_var_expr method of the collocator class. This option is only intended for investigative purposes. |
`init_dual` | None | Dictionary containing vectors of initial guesses for the NLP dual variables. Intended to be obtained as the solution of an optimization problem with an identical structure, which is stored in the dual_opt attribute of the result object. The dictionary has two keys, 'g' and 'x', containing vectors of the corresponding dual variable initial guesses. Note that when using IPOPT, the option warm_start_init_point has to be activated for this option to have an effect. |
`variable_scaling` | True | Whether to scale the variables according to their nominal values or the trajectories provided with the nominal_traj option. |
`equation_scaling` | False | Whether to scale the equations in the collocated NLP. Many NLP solvers scale the equations by default, but if it is done through this option the resulting scaling can be inspected. |
`nominal_traj_mode` | {"_default_mode": "linear"} | Mode for computing scaling factors based on nominal trajectories. Four possible modes: "attribute": time-invariant, linear scaling based on the nominal attribute. "linear": time-invariant, linear scaling. "affine": time-invariant, affine scaling. "time-variant": time-variant, linear scaling. The option is a dictionary with variable names as keys and the corresponding scaling modes as values. For all variables not occurring in the keys of the dictionary, the mode specified by the "_default_mode" entry will be used, which by default is "linear". |
`result_file_name` | "" | Specifies the name of the file where the result is written. Setting this option to an empty string results in a default file name based on the name of the model class. |
`write_scaled_result` | False | Return the scaled optimization result if set to True, otherwise return the unscaled optimization result. This option is only applicable when variable_scaling is enabled and is only intended for debugging. |
`print_condition_numbers` | False | Prints the condition numbers of the Jacobian of the constraints and of the simplified KKT matrix at the initial and optimal points. Note that this is only feasible for very small problems. |
`result_mode` | 'collocation_points' | Specifies the output format of the optimization result. Possible values: "collocation_points", "element_interpolation" and "mesh_points". "collocation_points": the optimization result is given at the collocation points as well as the start and final time points. "element_interpolation": the values of the variable trajectories are calculated by evaluating the collocation polynomials; the algorithm option n_eval_points is used to specify the evaluation points within each finite element. "mesh_points": the optimization result is given at the mesh points. |
`n_eval_points` | 20 | The number of evaluation points used in each element when the algorithm option result_mode is set to "element_interpolation". One evaluation point is placed at each element end-point (hence the option value must be at least 2) and the rest are distributed uniformly. |
`checkpoint` | False | If `checkpoint` is set to `True`, the transcribed NLP is built with packed MX functions. Instead of calling the DAE residual function, the collocation equation function and the Lagrange term function `n_e * n_cp` times, the checkpoint scheme builds an `MXFunction` evaluating `n_cp` collocation points at the same time, so that the packed `MXFunction` is called only `n_e` times. This approach improves the code generation and is expected to reduce the memory usage for constructing and solving the NLP. |
`quadrature_constraint` | True | Whether to use quadrature continuity constraints. This option is only applicable when using Gauss collocation and is incompatible with eliminate_der_var set to True. True: quadrature is used to get the values of the states at the mesh points. False: the Lagrange basis polynomials for the state collocation polynomials are evaluated to get the values of the states at the mesh points. |
`mutable_external_data` | True | If True and the `external_data` option is used, the external data can be changed after discretization, e.g. during warm starting. |
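The placement of evaluation points described for `n_eval_points` (one point at each end-point, the rest uniform) amounts to a simple linear spacing per element; the helper name below is illustrative, not part of pyjmi:

```python
import numpy as np

def element_eval_points(n_eval_points):
    # Normalized evaluation points within one finite element:
    # one point at each end-point, the rest uniformly distributed.
    assert n_eval_points >= 2
    return np.linspace(0.0, 1.0, n_eval_points)
```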

The last standard options, `IPOPT_options` and `WORHP_options`, serve as interfaces for setting options in IPOPT and WORHP. To exemplify the usage of these algorithm options, the maximum number of iterations in IPOPT can be set using the following syntax:

```python
opts = model.optimize_options()
opts["IPOPT_options"]["max_iter"] = 10000
```

JModelica.org's CasADi framework does not support simulation and initialization of models. It is recommended to use PyFMI for these purposes instead.

Some statistics from the NLP solver can be obtained by issuing the command

```python
res_opt.get_solver_statistics()
```

The return argument of this function can be found by using the interactive help:

```python
help(res_opt.get_solver_statistics)
```

```
Get nonlinear programming solver statistics.

Returns::
    return_status -- Return status from nonlinear programming solver.
    nbr_iter -- Number of iterations.
    objective -- Final value of objective function.
    total_exec_time -- Execution time.
```

When collocation is used to solve a dynamic optimization problem, the solution procedure is carried out in several steps:

1. Discretize the dynamic optimization problem, which is formulated in continuous time. The result is a large and sparse nonlinear program (NLP). The discretization step depends on the options provided to the `optimize` method.

2. Solve the NLP.

3. Postprocess the NLP solution to extract an approximate solution to the original dynamic optimization problem.

Depending on the problem, discretization may account for a substantial amount of the total solution time, or even dominate it.

The same discretization can be reused for several solutions with different parameter values, but the same options. Discretization will be carried out each time the `optimize` method is called on the model. Instead of calling `model.optimize(options=opts)`, a problem can be discretized using the `prepare_optimization` method:

```python
solver = model.prepare_optimization(options=opts)
```

Alternatively, the solver can be retrieved from an existing optimization result, as `solver = res.get_solver()`. Manipulating the solver (e.g. setting parameters) may affect the original optimization problem object and vice versa.

The obtained solver object represents the discretized problem and can be used to solve it with its own `optimize` method:

```python
res = solver.optimize()
```

While options in general cannot be changed, parameter values, initial trajectories, external data and NLP solver options can be changed on the solver object. Parameter values can be updated with

```python
solver.set(parameter_name, value)
```

and current values retrieved with

```python
solver.get(parameter_name)
```

New initial trajectories can be set with

```python
solver.set_init_traj(init_traj)
```

where `init_traj` has the same format as used with the `init_traj` option.

External data can be updated with

```python
solver.set_external_variable_data(variable_name, data)
```

(unless the `mutable_external_data` option is turned off). `variable_name` should correspond to one of the variables used in the `external_data` option passed to `prepare_optimization`. `data` should be the new data, in the same format as variable data used in the `external_data` option. The kind of external data used for the variable (eliminated/constrained/quadratic penalty) is not changed.
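As a purely illustrative sketch of trajectory-style variable data (check the documentation of the classes used with the `external_data` option for the exact format they expect), measurement data is typically assembled as matched arrays of sample times and values:

```python
import numpy as np

# Hypothetical measured temperature trajectory: sample times paired with values.
t_meas = np.linspace(0.0, 150.0, 151)
T_meas = 350.0 + 10.0 * np.sin(0.1 * t_meas)
data = np.vstack([t_meas, T_meas])  # one row of times, one row of values
```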

Settings of the nonlinear solver can be changed with

```python
solver.set_solver_option(solver_name, name, value)
```

where `solver_name` is e.g. `'IPOPT'` or `'WORHP'`.

The solver object obtained from `prepare_optimization` can also be used for *warm starting*, where an obtained optimization solution (including primal and dual variables) is used as the initial guess for a new optimization with new parameter values.

To reuse the solver's last obtained solution as the initial guess for the next optimization, warm starting can be enabled with

```python
solver.set_warm_start(True)
```

before calling `solver.optimize()`. This will reuse the last solution for the primal variables (unless `solver.set_init_traj` was called since the last `solver.optimize`) as well as the last solution for the dual variables.

When using the IPOPT solver with warm starting, several solver options typically also need to be set to see the benefits, e.g.:

```python
def set_warm_start_options(solver, push=1e-4, mu_init=1e-1):
    solver.set_solver_option('IPOPT', 'warm_start_init_point', 'yes')
    solver.set_solver_option('IPOPT', 'mu_init', mu_init)
    solver.set_solver_option('IPOPT', 'warm_start_bound_push', push)
    solver.set_solver_option('IPOPT', 'warm_start_mult_bound_push', push)
    solver.set_solver_option('IPOPT', 'warm_start_bound_frac', push)
    solver.set_solver_option('IPOPT', 'warm_start_slack_bound_frac', push)
    solver.set_solver_option('IPOPT', 'warm_start_slack_bound_push', push)

set_warm_start_options(solver)
```

Smaller values of the `push` and `mu_init` arguments will make the solver place more trust in the sought solution being close to the initial guess, i.e., the last solution.

This tutorial is based on the Hicks-Ray Continuously Stirred Tank Reactor (CSTR) system. The model was originally presented in [1]. The system has two states, the concentration, c, and the temperature, T. The control input to the system is the temperature, Tc, of the cooling flow in the reactor jacket. The chemical reaction in the reactor is exothermic and temperature dependent: a high temperature results in a high reaction rate. The CSTR dynamics are given by:
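The standard Hicks-Ray model from [1] has the form below; the notation follows the usual presentation of this model and the symbols may not match the parameter names used in `CSTR.mop` exactly:

```latex
\dot{c} = \frac{F_0\,(c_0 - c)}{V} - k_0\, c\, e^{-E/(R T)}
\qquad
\dot{T} = \frac{F_0\,(T_0 - T)}{V}
          - \frac{\Delta H}{\rho\, c_p}\, k_0\, c\, e^{-E/(R T)}
          + \frac{2U}{r\, \rho\, c_p}\,(T_c - T)
```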

This tutorial will cover the following topics:

- How to solve a DAE initialization problem. The initialization model has equations specifying that all derivatives should be identically zero, which implies that a stationary solution is obtained. Two stationary points, corresponding to different inputs, are computed. We call the stationary points A and B respectively. Point A corresponds to operating conditions where the reactor is cold and the reaction rate is low, whereas point B corresponds to a higher temperature where the reaction rate is high.

- How to solve an optimal control problem where the objective is to transfer the state of the system from stationary point A to point B. The challenge is to ignite the reactor while avoiding uncontrolled temperature increases. It is also demonstrated how to set parameter and variable values in a model. More information about the simultaneous optimization algorithm can be found in the JModelica.org API documentation.

- How to save the optimization result to file and plot the important variables.

The Python commands in this tutorial may be copied and pasted directly into a Python shell, in some cases with minor modifications. Alternatively, you may copy the commands into a text file, e.g., `cstr_casadi.py`.

Start the tutorial by creating a working directory and copying the file `$JMODELICA_HOME/Python/pyjmi/examples/files/CSTR.mop` to your working directory. An online version of `CSTR.mop` is also available (depending on which browser you use, you may have to accept the site certificate by clicking through a few steps). If you choose to create a Python script file, save it to the working directory.

The functions and classes used in the tutorial script need to be imported into the Python script. This is done by the following Python commands. Copy them and paste them either directly into your Python shell or, preferably, into your Python script file.

```python
import numpy as N
import matplotlib.pyplot as plt
from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem
```

To solve the initialization problem and simulate the model, we will first compile it as an FMU and load it in Python. These steps are described in more detail in Section 4.

```python
# Compile the stationary initialization model into an FMU
init_fmu = compile_fmu("CSTR.CSTR_Init", "CSTR.mop")

# Load the FMU
init_model = load_fmu(init_fmu)
```

At this point, you may open the file `CSTR.mop`, containing the CSTR model and the static initialization model used in this section. Study the classes `CSTR.CSTR` and `CSTR.CSTR_Init` and make sure you understand the models. Before proceeding, have a look at the interactive help for one of the functions you used:

```python
help(compile_fmu)
```

In the next step, we would like to specify the first operating point, A, by means of a constant input cooling temperature, and then solve the initialization problem assuming that all derivatives are zero.

```python
# Set input for Stationary point A
Tc_0_A = 250
init_model.set('Tc', Tc_0_A)

# Solve the initialization problem using FMI
init_model.initialize()

# Store stationary point A
[c_0_A, T_0_A] = init_model.get(['c', 'T'])

# Print some data for stationary point A
print(' *** Stationary point A ***')
print('Tc = %f' % Tc_0_A)
print('c = %f' % c_0_A)
print('T = %f' % T_0_A)
```

Notice how the method `set` is used to set the value of the control input. The initialization algorithm is invoked by calling the method `initialize`, which returns a result object from which the initialization result can be accessed. The values of the states corresponding to point A can then be extracted from the result object. Look carefully at the printouts in the Python shell to see the stationary values. Display the help text for the `initialize` method and take a moment to look it through. The procedure is now repeated for operating point B:

```python
# Set inputs for Stationary point B
init_model.reset() # reset the FMU so that we can initialize it again
Tc_0_B = 280
init_model.set('Tc', Tc_0_B)

# Solve the initialization problem using FMI
init_model.initialize()

# Store stationary point B
[c_0_B, T_0_B] = init_model.get(['c', 'T'])

# Print some data for stationary point B
print(' *** Stationary point B ***')
print('Tc = %f' % Tc_0_B)
print('c = %f' % c_0_B)
print('T = %f' % T_0_B)
```

We have now computed two stationary points for the system based on constant control inputs. In the next section, these will be used to set up an optimal control problem.

The optimal control problem we are about to solve is given by:

and is expressed in Optimica format in the class `CSTR.CSTR_Opt2` in the `CSTR.mop` file above. Have a look at this class and make sure that you understand how the optimization problem is formulated and what the objective is.

Direct collocation methods often require good initial guesses in order to ensure robust convergence. Also, if the problem is non-convex, initialization is even more critical. Since initial guesses are needed for all discretized variables along the optimization interval, simulation provides a convenient means to generate state and derivative profiles given an initial guess for the control input(s). It is then convenient to set up a dedicated model for computation of initial trajectories. In the model `CSTR.CSTR_Init_Optimization` in the `CSTR.mop` file, a step input is applied to the system in order to obtain an initial guess. Notice that the variable names in the initialization model must match those in the optimal control model.

First, compile the model and set model parameters:

```python
# Compile the optimization initialization model
init_sim_fmu = compile_fmu("CSTR.CSTR_Init_Optimization", "CSTR.mop")

# Load the model
init_sim_model = load_fmu(init_sim_fmu)

# Set initial and reference values
init_sim_model.set('cstr.c_init', c_0_A)
init_sim_model.set('cstr.T_init', T_0_A)
init_sim_model.set('c_ref', c_0_B)
init_sim_model.set('T_ref', T_0_B)
init_sim_model.set('Tc_ref', Tc_0_B)
```

Having initialized the model parameters, we can simulate the model using the `simulate` function.

```python
# Simulate with constant input Tc
init_res = init_sim_model.simulate(start_time=0., final_time=150.)
```

The method `simulate` first computes consistent initial conditions and then simulates the model in the interval 0 to 150 seconds. Take a moment to read the interactive help for the `simulate` method.

The simulation result object is returned; use Python dictionary access on it to retrieve the variable trajectories.

```python
# Extract variable profiles
t_init_sim = init_res['time']
c_init_sim = init_res['cstr.c']
T_init_sim = init_res['cstr.T']
Tc_init_sim = init_res['cstr.Tc']

# Plot the initial guess trajectories
plt.close(1)
plt.figure(1)
plt.hold(True)
plt.subplot(3, 1, 1)
plt.plot(t_init_sim, c_init_sim)
plt.grid()
plt.ylabel('Concentration')
plt.title('Initial guess obtained by simulation')
plt.subplot(3, 1, 2)
plt.plot(t_init_sim, T_init_sim)
plt.grid()
plt.ylabel('Temperature')
plt.subplot(3, 1, 3)
plt.plot(t_init_sim, Tc_init_sim)
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()
```

Look at the plots and try to relate the trajectories to the optimal control problem. Why is this a good initial guess?

Once the initial guess is generated, we compile the optimal control problem:

```python
# Compile and load optimization problem
op = transfer_optimization_problem("CSTR.CSTR_Opt2", "CSTR.mop")
```

We will now initialize the parameters of the model so that their values correspond to the optimization objective of transferring the system state from operating point A to operating point B. Accordingly, we set the parameters representing the initial values of the states to point A and the reference values in the cost function to point B:

```python
# Set reference values
op.set('Tc_ref', Tc_0_B)
op.set('c_ref', float(c_0_B))
op.set('T_ref', float(T_0_B))

# Set initial values
op.set('cstr.c_init', float(c_0_A))
op.set('cstr.T_init', float(T_0_A))
```

We will also set some optimization options. In this case, we decrease the number of finite elements in the mesh from 50 to 19, to be able to illustrate that simulation and optimization might not give the exact same result. This is done by setting the corresponding option and providing it as an argument to the `optimize` method. We also lower the tolerance of IPOPT to get a more accurate result. We are now ready to solve the actual optimization problem. This is done by invoking the method `optimize`:

```python
# Set options
opt_opts = op.optimize_options()
opt_opts['n_e'] = 19 # Number of elements
opt_opts['init_traj'] = init_res
opt_opts['nominal_traj'] = init_res
opt_opts['IPOPT_options']['tol'] = 1e-10

# Solve the optimal control problem
res = op.optimize(options=opt_opts)
```

You should see the output of IPOPT in the Python shell as the algorithm iterates to find the optimal solution. IPOPT should terminate with a message like 'Optimal solution found' or 'Solved to acceptable level' in order for an optimum to have been found. The optimization result object is returned and the optimization data are stored in `res`.

We can now retrieve the trajectories of the variables that we intend to plot:

```python
# Extract variable profiles
c_res = res['cstr.c']
T_res = res['cstr.T']
Tc_res = res['cstr.Tc']
time_res = res['time']
c_ref = res['c_ref']
T_ref = res['T_ref']
Tc_ref = res['Tc_ref']
```

Finally, we plot the result using the functions available in matplotlib:

```python
# Plot the results
plt.close(2)
plt.figure(2)
plt.hold(True)
plt.subplot(3, 1, 1)
plt.plot(time_res, c_res)
plt.plot(time_res, c_ref, '--')
plt.grid()
plt.ylabel('Concentration')
plt.title('Optimized trajectories')
plt.subplot(3, 1, 2)
plt.plot(time_res, T_res)
plt.plot(time_res, T_ref, '--')
plt.grid()
plt.ylabel('Temperature')
plt.subplot(3, 1, 3)
plt.plot(time_res, Tc_res)
plt.plot(time_res, Tc_ref, '--')
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()
```

You should now see the plot shown in Figure 6.2, “Optimal profiles for the CSTR problem.”.

Take a minute to analyze the optimal profiles and to answer the following questions:

- Why is the concentration high in the beginning of the interval?

- Why is the input cooling temperature high in the beginning of the interval?

Solving optimal control problems by means of direct collocation implies that the differential equation is approximated by a time-discrete counterpart. The accuracy of the solution is dependent on the method of collocation and the number of elements. In order to assess the accuracy of the discretization, we may simulate the system using the optimal control profile as input. With this approach, the state profiles are computed with high accuracy and the result may then be compared with the profiles resulting from optimization. Notice that this procedure does not verify the optimality of the resulting optimal control profiles, but only the accuracy of the discretization of the dynamics.

We start by compiling and loading the model used for simulation:

```python
# Compile model
sim_fmu = compile_fmu("CSTR.CSTR", "CSTR.mop")

# Load model
sim_model = load_fmu(sim_fmu)
```

The solution obtained from the optimization consists of values at a finite number of time points, in this case the collocation points. With the JMI framework, these values are normally interpolated linearly when used for simulation purposes. The CasADi framework also supports obtaining all the collocation polynomials for all the input variables in the form of a function, which can be used during simulation for greater accuracy. We obtain it from the result object in the following manner.

```python
# Get optimized input
(_, opt_input) = res.get_opt_input()
```

We specify the initial values and simulate using the optimal trajectory:

```python
# Set initial values
sim_model.set('c_init', c_0_A)
sim_model.set('T_init', T_0_A)

# Simulate using optimized input
sim_opts = sim_model.simulate_options()
sim_opts['CVode_options']['rtol'] = 1e-6
sim_opts['CVode_options']['atol'] = 1e-8
res = sim_model.simulate(start_time=0., final_time=150.,
                         input=('Tc', opt_input), options=sim_opts)
```

Finally, we load the simulated data and plot it to compare with the optimized trajectories:

```python
# Extract variable profiles
c_sim = res['c']
T_sim = res['T']
Tc_sim = res['Tc']
time_sim = res['time']

# Plot the results
plt.figure(3)
plt.clf()
plt.hold(True)
plt.subplot(311)
plt.plot(time_res, c_res, '--')
plt.plot(time_sim, c_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Concentration')
plt.subplot(312)
plt.plot(time_res, T_res, '--')
plt.plot(time_sim, T_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Temperature')
plt.subplot(313)
plt.plot(time_res, Tc_res, '--')
plt.plot(time_sim, Tc_sim)
plt.legend(('optimized', 'simulated'))
plt.grid()
plt.ylabel('Cooling temperature')
plt.xlabel('time')
plt.show()
```

You should now see the plot shown in Figure 6.3, “Optimal control profiles and simulated trajectories corresponding to the optimal control input.”.

**Figure 6.3. Optimal control profiles and simulated trajectories corresponding to the optimal control input.**

Discuss why the simulated trajectories differ from their optimized counterparts.

After completing the tutorial you may continue to modify the optimization problem and study the results.

- Remove the constraint on cstr.T. What is then the maximum temperature?

- Play around with the weights in the cost function. What happens if you penalize the control variable with a larger weight? Do a parameter sweep for the control variable weight and plot the optimal profiles in the same figure.

- Add terminal constraints (`cstr.T(finalTime)=someParameter`) for the states so that they are equal to point B at the end of the optimization interval. Then reduce the length of the optimization interval. How short can you make the interval?

- Try varying the number of elements in the mesh and the number of collocation points in each element.

[1] G.A. Hicks and W.H. Ray. Approximation Methods for Optimal Control Synthesis. *Can. J. Chem. Eng.*, 40:522–529, 1971.

[2] Biegler, L., A. Cervantes, and A. Wächter (2002): "Advances in simultaneous strategies for dynamic process optimization." *Chemical Engineering Science*, **57**, pp. 575–593.

Minimum time problems are dynamic optimization problems where not only the control inputs are optimized, but also the final time. Typically, elements of such problems include initial and terminal state constraints and an objective function where the transition time is minimized. The following example will be used to illustrate how minimum time problems are formulated in Optimica. We consider the optimization problem:

$$\min\; t_f$$

subject to the Van der Pol dynamics:

$$\dot{x}_1 = (1 - x_2^2)\,x_1 - x_2 + u, \qquad \dot{x}_2 = x_1$$

and the constraints:

$$x_1(0) = 0, \quad x_2(0) = 1, \quad -1 \le u \le 1, \quad x_1(t_f) = 0, \quad x_2(t_f) = 0$$

This problem is encoded in the following Optimica specification:

```
optimization VDP_Opt_Min_Time (objective = finalTime, startTime = 0,
                               finalTime(free=true, min=0.2, initialGuess=1))
  // The states
  Real x1(start = 0, fixed=true);
  Real x2(start = 1, fixed=true);
  // The control signal
  input Real u(free=true, min=-1, max=1);
equation
  // Dynamic equations
  der(x1) = (1 - x2^2) * x1 - x2 + u;
  der(x2) = x1;
constraint
  // terminal constraints
  x1(finalTime) = 0;
  x2(finalTime) = 0;
end VDP_Opt_Min_Time;
```

Notice how the class attribute `finalTime` is set to be free in the optimization. The problem is solved by the following Python script:

```python
# Import numerical libraries
import numpy as N
import matplotlib.pyplot as plt

# Import the JModelica.org Python packages
from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem

vdp = transfer_optimization_problem("VDP_Opt_Min_Time", "VDP_Opt_Min_Time.mop")
res = vdp.optimize()

# Extract variable profiles
x1 = res['x1']
x2 = res['x2']
u = res['u']
t = res['time']

# Plot
plt.figure(1)
plt.clf()
plt.subplot(311)
plt.plot(t, x1)
plt.grid()
plt.ylabel('x1')
plt.subplot(312)
plt.plot(t, x2)
plt.grid()
plt.ylabel('x2')
plt.subplot(313)
plt.plot(t, u, 'x-')
plt.grid()
plt.ylabel('u')
plt.xlabel('time')
plt.show()
```

The resulting control and state profiles are shown in Figure 6.4, “Minimum time profiles for the Van der Pol Oscillator.”. Notice the difference as compared to Figure 6.1, “Optimal profiles for the VDP oscillator”, where the Van der Pol oscillator system is optimized using a quadratic objective function.

In some applications, it can be useful to solve dynamic optimization problems that include time delays in the model. Collocation-based optimization schemes are well suited to handle this kind of model, since the whole state trajectory is available at the same time. The direct collocation method using CasADi contains an experimental implementation of such delays, which we will describe with an example. Please note that the implementation of this feature is experimental and subject to change.

We consider the optimization problem

$$\min_{u_1(t),\,u_2(t)} \int_0^1 \left(4x^2 + u_1^2 + u_2^2\right)\,dt$$

subject to the dynamics

$$\dot{x}(t) = u_1(t) - 2\,u_2(t), \qquad u_2(t) = u_1(t - d) \;\;\text{for } t \ge d$$

and the boundary conditions

$$x(0) = 1, \qquad x(1) = 0$$

where d is a fixed time delay (d = 0.25 in the script below), and u_2(t) for t < d is prescribed. The effect of positive u_1 is initially to increase x, but after the time delay d, it comes back with twice the effect in the negative direction through u_2.

We model everything except the delay constraint in the Optimica specification

optimization DelayTest(startTime = 0, finalTime = 1,
                       objectiveIntegrand = 4*x^2 + u1^2 + u2^2)
  input Real u1, u2;
  Real x(start = 1, fixed=true);
equation
  der(x) = u1 - 2*u2;
constraint
  x(finalTime) = 0;
end DelayTest;

The problem is then solved in the following Python script. Notice how the delay constraint is added using the `delayed_feedback` option, and how the initial part of u2 is set using the `initialGuess` attribute:

# Import numerical libraries
import numpy as np
import matplotlib.pyplot as plt

# Import JModelica.org Python packages
from pyjmi import transfer_optimization_problem

n_e = 20
delay_n_e = 5
horizon = 1.0
delay = horizon*delay_n_e/n_e

# Compile and load optimization problem
opt = transfer_optimization_problem("DelayTest", "DelayedFeedbackOpt.mop")

# Set value for u2(t) when t < delay
opt.getVariable('u2').setAttribute('initialGuess', 0.25)

# Set algorithm options
opts = opt.optimize_options()
opts['n_e'] = n_e
# Set delayed feedback from u1 to u2
opts['delayed_feedback'] = {'u2': ('u1', delay_n_e)}

# Optimize
res = opt.optimize(options=opts)

# Extract variable profiles
x_res = res['x']
u1_res = res['u1']
u2_res = res['u2']
time_res = res['time']

# Plot results
plt.plot(time_res, x_res, time_res, u1_res, time_res, u2_res)
plt.plot(time_res + delay, u1_res, '--')
plt.legend(('x', 'u1', 'u2', 'delay(u1)'))
plt.show()

The resulting control and state profiles are shown in Figure 6.5, “Optimization result for delayed feedback example.”. Notice that x grows initially, since u1 is set positive to exploit the greater control gain that appears delayed through u2. At time 1 - delay, the delayed value of u1 ceases to influence x within the horizon, and u1 immediately switches sign to drive x down to its final value x(1) = 0.
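The structure of the delayed feedback constraint can be pictured on a uniform element grid, outside of any collocation details. The following sketch (illustrative only, not the actual collocation transcription; the `u1` profile is hypothetical) shows how a delay of `delay_n_e` elements ties u2 in element k to u1 in element k - delay_n_e, while the first `delay_n_e` elements of u2 are fixed by the `initialGuess` attribute:

```python
import numpy as np

n_e, delay_n_e = 20, 5
u1 = np.linspace(0.0, 1.0, n_e)       # hypothetical u1 profile, one value per element
u2 = np.empty(n_e)
u2[:delay_n_e] = 0.25                 # initial part, set via initialGuess
u2[delay_n_e:] = u1[:n_e - delay_n_e] # delayed feedback u2(t) = u1(t - delay)
```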

In this tutorial it will be demonstrated how to solve parameter estimation problems. We consider a quadruple tank system depicted in Figure 6.6, “A schematic picture of the quadruple tank process.”.

The dynamics of the system are given by the differential equations:

$$\dot{x}_1 = -\frac{a_1}{A_1}\sqrt{2g\,x_1} + \frac{a_3}{A_1}\sqrt{2g\,x_3} + \frac{\gamma_1 k_1}{A_1}u_1$$

$$\dot{x}_2 = -\frac{a_2}{A_2}\sqrt{2g\,x_2} + \frac{a_4}{A_2}\sqrt{2g\,x_4} + \frac{\gamma_2 k_2}{A_2}u_2$$

$$\dot{x}_3 = -\frac{a_3}{A_3}\sqrt{2g\,x_3} + \frac{(1-\gamma_2)k_2}{A_3}u_2$$

$$\dot{x}_4 = -\frac{a_4}{A_4}\sqrt{2g\,x_4} + \frac{(1-\gamma_1)k_1}{A_4}u_1$$

where the nominal parameter values are given in Table 6.3, “Parameters for the quadruple tank process.”.

**Table 6.3. Parameters for the quadruple tank process.**

| Parameter name | Value | Unit |
|---|---|---|
| `A_i` | 4.9 | cm^2 |
| `a_i` | 0.03 | cm^2 |
| `k_i` | 0.56 | cm^2 V^-1 s^-1 |
| `γ_i` | 0.3 | V cm^-1 |

The states of the model are the tank water levels x1, x2, x3, and x4. The control inputs, u1 and u2, are the flows generated by the two pumps.
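To get a feel for the dynamics before working with the Modelica model, they can be sketched in plain Python. This is only an illustration, assuming the standard quadruple-tank equations with identical tank areas, outlet areas, pump gains, and valve settings taken from Table 6.3 (working in cm units, with hypothetical constant pump voltages); the actual model used in the tutorial is the one in QuadTankPack.mop:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Nominal parameters from Table 6.3, in cm-based units; g in cm/s^2
A, a, k, gamma, g = 4.9, 0.03, 0.56, 0.3, 981.0

def quad_tank_rhs(t, x, u1, u2):
    """Quadruple-tank dynamics; x holds the four tank levels in cm."""
    x1, x2, x3, x4 = np.maximum(x, 0.0)  # guard the square roots
    dx1 = -a/A*np.sqrt(2*g*x1) + a/A*np.sqrt(2*g*x3) + gamma*k/A*u1
    dx2 = -a/A*np.sqrt(2*g*x2) + a/A*np.sqrt(2*g*x4) + gamma*k/A*u2
    dx3 = -a/A*np.sqrt(2*g*x3) + (1 - gamma)*k/A*u2
    dx4 = -a/A*np.sqrt(2*g*x4) + (1 - gamma)*k/A*u1
    return [dx1, dx2, dx3, dx4]

# Simulate one minute with hypothetical constant 5 V on both pumps,
# starting from the experiment's initial levels (in cm)
sol = solve_ivp(quad_tank_rhs, [0.0, 60.0], [6.255, 6.045, 2.395, 2.325],
                args=(5.0, 5.0), max_step=0.5)
```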

The Modelica model for the system is located in QuadTankPack.mop. Download the file to your working directory and open it in a text editor. Locate the class `QuadTankPack.QuadTank` and make sure you understand the model. In particular, notice that all model variables and parameters are expressed in SI units.

Measurement data, available in qt_par_est_data.mat, has been logged in an identification experiment. Download also this file to your working directory.

Open a text file and name it `qt_par_est_casadi.py`. Then enter the imports:

import os
from collections import OrderedDict

from scipy.io import loadmat
import matplotlib.pyplot as plt
import numpy as N

from pymodelica import compile_fmu
from pyfmi import load_fmu
from pyjmi import transfer_optimization_problem
from pyjmi.optimization.casadi_collocation import ExternalData

into the file. Next, we compile and load the model, which is used for simulation, and transfer the optimization problem, which is used for estimating parameter values. We will take a closer look at the optimization formulation later, so do not worry about that one for the moment. The initial states for the experiment are stored in the optimization problem, and we propagate them to the model for simulation.

# Compile and load FMU, which is used for simulation
model = load_fmu(compile_fmu("QuadTankPack.QuadTank", "QuadTankPack.mop"))

# Transfer problem to CasADi Interface, which is used for estimation
op = transfer_optimization_problem("QuadTankPack.QuadTank_ParEstCasADi",
                                   "QuadTankPack.mop")

# Set initial states in model, which are stored in the optimization problem
x_0_names = ['x1_0', 'x2_0', 'x3_0', 'x4_0']
x_0_values = op.get(x_0_names)
model.set(x_0_names, x_0_values)

Next, we enter code to open the data file, extract the measurement time series and plot the measurements:

# Load measurement data from file
data = loadmat("qt_par_est_data.mat", appendmat=False)

# Extract data series
t_meas = data['t'][6000::100, 0] - 60
y1_meas = data['y1_f'][6000::100, 0] / 100
y2_meas = data['y2_f'][6000::100, 0] / 100
y3_meas = data['y3_d'][6000::100, 0] / 100
y4_meas = data['y4_d'][6000::100, 0] / 100
u1 = data['u1_d'][6000::100, 0]
u2 = data['u2_d'][6000::100, 0]

# Plot measurements and inputs
plt.close(1)
plt.figure(1)
plt.subplot(2, 2, 1)
plt.plot(t_meas, y3_meas)
plt.title('x3')
plt.grid()
plt.subplot(2, 2, 2)
plt.plot(t_meas, y4_meas)
plt.title('x4')
plt.grid()
plt.subplot(2, 2, 3)
plt.plot(t_meas, y1_meas)
plt.title('x1')
plt.xlabel('t[s]')
plt.grid()
plt.subplot(2, 2, 4)
plt.plot(t_meas, y2_meas)
plt.title('x2')
plt.xlabel('t[s]')
plt.grid()

plt.close(2)
plt.figure(2)
plt.subplot(2, 1, 1)
plt.plot(t_meas, u1)
plt.title('u1')
plt.grid()
plt.subplot(2, 1, 2)
plt.plot(t_meas, u2)
plt.title('u2')
plt.xlabel('t[s]')
plt.grid()
plt.show()

You should now see two plots showing the measurement state profiles and the control input profiles similar to Figure 6.7, “Measured state profiles.” and Figure 6.8, “Control inputs used in the identification experiment.”.

In order to evaluate the accuracy of the nominal model parameter values, we simulate the model using the same initial states and input values as in the experiment used to obtain the measurement data. First, a matrix containing the input trajectories is created:

# Build input trajectory matrix for use in simulation
u = N.transpose(N.vstack([t_meas, u1, u2]))

Now, the model can be simulated:

# Simulate model response with nominal parameter values
res_sim = model.simulate(input=(['u1', 'u2'], u),
                         start_time=0., final_time=60.)

The simulation result can now be extracted:

# Load simulation result
x1_sim = res_sim['x1']
x2_sim = res_sim['x2']
x3_sim = res_sim['x3']
x4_sim = res_sim['x4']
t_sim = res_sim['time']
u1_sim = res_sim['u1']
u2_sim = res_sim['u2']

and then plotted:

# Plot simulation result
plt.figure(1)
plt.subplot(2, 2, 1)
plt.plot(t_sim, x3_sim)
plt.subplot(2, 2, 2)
plt.plot(t_sim, x4_sim)
plt.subplot(2, 2, 3)
plt.plot(t_sim, x1_sim)
plt.subplot(2, 2, 4)
plt.plot(t_sim, x2_sim)

plt.figure(2)
plt.subplot(2, 1, 1)
plt.plot(t_sim, u1_sim, 'r')
plt.subplot(2, 1, 2)
plt.plot(t_sim, u2_sim, 'r')
plt.show()

Figure 6.9, “Simulation result for the nominal model.” shows the result of the simulation.

Here, the simulated profiles are given by the green curves. Clearly, there is a mismatch in the response, especially for the two lower tanks. Think about why the model does not match the data, i.e., which parameters may have wrong values.

The next step towards solving a parameter estimation problem is to identify which parameters to tune. Typically, parameters which are not known precisely are selected. Also, the selected parameters must, of course, affect the mismatch between model response and data when tuned. In a first attempt, we aim at decreasing the mismatch for the two lower tanks, and therefore select the lower tank outflow areas, a1 and a2, as the parameters to optimize. The Optimica specification for the estimation problem is contained in the class `QuadTankPack.QuadTank_ParEstCasADi`:

optimization QuadTank_ParEstCasADi(startTime=0, finalTime=60)
  extends QuadTank(x1(fixed=true), x1_0=0.06255,
                   x2(fixed=true), x2_0=0.06045,
                   x3(fixed=true), x3_0=0.02395,
                   x4(fixed=true), x4_0=0.02325,
                   a1(free=true, min=0, max=0.1e-4),
                   a2(free=true, min=0, max=0.1e-4));
end QuadTank_ParEstCasADi;

We have specified the time horizon to be one minute, which matches the length of the experiment, and that we want to estimate a1 and a2 by setting `free=true` for them. Unlike in optimal control, the cost function is not specified using Optimica. It is instead specified from Python, using the `ExternalData` class and the code below.

# Create external data object for optimization
Q = N.diag([1., 1., 10., 10.])
data_x1 = N.vstack([t_meas, y1_meas])
data_x2 = N.vstack([t_meas, y2_meas])
data_u1 = N.vstack([t_meas, u1])
data_u2 = N.vstack([t_meas, u2])
quad_pen = OrderedDict()
quad_pen['x1'] = data_x1
quad_pen['x2'] = data_x2
quad_pen['u1'] = data_u1
quad_pen['u2'] = data_u2
external_data = ExternalData(Q=Q, quad_pen=quad_pen)

This creates an objective which is the integral of the squared difference between the measured profiles for x1 and x2 and the corresponding model profiles. We also introduce corresponding penalties for the two input variables, which are left as optimization variables. It would also have been possible to eliminate the input variables from the estimation problem by using the `eliminated` parameter of `ExternalData`; see the documentation of `ExternalData` for how to do this. Finally, we use the square matrix Q to weight the different components of the objective. We choose larger weights for the inputs, as we have greater faith in those values.

We are now ready to solve the optimization problem. We first set some options: the number of elements (the time discretization grid), the external data, and the simulation result obtained with the nominal parameter values. The simulation result serves as an initial guess for the solution and is also used to scale the variables, instead of the variables' nominal attributes (if they have any):

# Set optimization options and optimize
opts = op.optimize_options()
opts['n_e'] = 60  # Number of collocation elements
opts['external_data'] = external_data
opts['init_traj'] = res_sim
opts['nominal_traj'] = res_sim
res = op.optimize(options=opts)  # Solve estimation problem
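The effect of trajectory-based scaling can be pictured with a small numeric sketch: each time-varying NLP variable is (roughly) divided by the value of its nominal trajectory at the same time point, so that a well-scaled variable stays near one even if its physical magnitude varies. The trajectories below are hypothetical and only illustrate the idea:

```python
import numpy as np

t = np.linspace(0.0, 60.0, 7)
x_nom = 0.06 * np.exp(-t / 30.0)  # hypothetical nominal trajectory
x = 0.05 * np.exp(-t / 30.0)      # hypothetical solution candidate

# Trajectory scaling keeps the scaled variable near one even though
# the unscaled variable decays substantially over the horizon
x_scaled = x / x_nom
```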

Now, let's extract the optimal values of the parameters a1 and a2 and print them to the console:

# Extract estimated values of parameters
a1_opt = res.initial("a1")
a2_opt = res.initial("a2")

# Print estimated parameter values
print('a1: ' + str(a1_opt*1e4) + 'cm^2')
print('a2: ' + str(a2_opt*1e4) + 'cm^2')

You should get an output similar to:

a1: 0.0266cm^2
a2: 0.0271cm^2

The estimated values are slightly smaller than the nominal values - think about why this may be the case. Also note that the estimated values do not necessarily correspond to the physically true values. Rather, the parameter values are adjusted to compensate for all kinds of modeling errors in order to minimize the mismatch between model response and measurement data.

Next we plot the optimized profiles:

# Load state profiles
x1_opt = res["x1"]
x2_opt = res["x2"]
x3_opt = res["x3"]
x4_opt = res["x4"]
u1_opt = res["u1"]
u2_opt = res["u2"]
t_opt = res["time"]

# Plot estimated trajectories
plt.figure(1)
plt.subplot(2, 2, 1)
plt.plot(t_opt, x3_opt, 'k')
plt.subplot(2, 2, 2)
plt.plot(t_opt, x4_opt, 'k')
plt.subplot(2, 2, 3)
plt.plot(t_opt, x1_opt, 'k')
plt.subplot(2, 2, 4)
plt.plot(t_opt, x2_opt, 'k')

plt.figure(2)
plt.subplot(2, 1, 1)
plt.plot(t_opt, u1_opt, 'k')
plt.subplot(2, 1, 2)
plt.plot(t_opt, u2_opt, 'k')
plt.show()

You will see the plot shown in Figure 6.10, “State profiles corresponding to estimated values of a1 and a2.”.

The profiles corresponding to the estimated values of a1 and a2 are shown as black curves. As can be seen, the match between the model response and the measurement data has been significantly improved. Is the behavior of the model consistent with the estimated parameter values?

Nevertheless, there is still a mismatch for the upper tanks, especially for tank 4. In order to improve the match, a second estimation problem may be formulated, where the parameters a1, a2, a3, a4 are free optimization variables, and where the squared errors of all four tank levels are penalized. Do this as an exercise!

This section describes some tools that can be used to investigate the progress of the nonlinear programming solver on an optimization problem. This information can be useful when debugging convergence problems; some of it (e.g. dual variables) may also be useful to gain a better understanding of the properties of an optimization problem. To make sense of the information that can be retrieved, we first give an overview of the collocation procedure that transcribes the optimization problem into a Nonlinear Program (NLP).

Methods for inspecting progress are divided into low level and high level methods, where the low level methods provide details of the underlying NLP while the high level methods are oriented towards the optimization problem as seen in the model formulation.

All functionality related to inspection of solver progress is exposed through the solver object as returned by the `prepare_optimization` method. If the optimization has been done through the `optimize` method instead, the solver can be obtained as in:

res = model.optimize(options=opts)
solver = res.get_solver()

To be able to solve a dynamic optimization problem, it is first discretized through collocation. Time is divided into *elements* (time intervals), and time varying variables are approximated by a low order polynomial over each element. Each polynomial piece is described by sample values at a number of *collocation points* (default 3) within the element. The result is that each time varying variable in the model is *instantiated* into one NLP variable for each collocation point within each element. Some variables may also need to be instantiated at additional points, such as the initial point which is typically not a collocation point.
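The per-element polynomial can be pictured as a Lagrange interpolation through the collocation values. The following sketch evaluates such a polynomial from sample values at hypothetical (approximately Radau) points on a unit element; since the interpolating polynomial through three points is quadratic, values of x(t) = t² are reproduced exactly:

```python
def lagrange_eval(tau_points, values, tau):
    """Evaluate the Lagrange interpolation polynomial through
    (tau_points[j], values[j]) at the point tau."""
    total = 0.0
    for j, tj in enumerate(tau_points):
        basis = 1.0
        for m, tm in enumerate(tau_points):
            if m != j:
                # Lagrange basis polynomial: 1 at tj, 0 at the other points
                basis *= (tau - tm) / (tj - tm)
        total += values[j] * basis
    return total

# Three collocation values of x(t) = t^2 on a unit element
taus = [0.155, 0.645, 1.0]       # approximately Radau points
vals = [t**2 for t in taus]
```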

The equations in a model are divided into initial equations, DAE equations, path constraints and point constraints. These equations are also instantiated at different time points to become constraints in the NLP. Initial equations and point constraints are instantiated only once. DAE equations and path constraints are instantiated at the collocation points of each element and possibly at some additional points.

When using the methods described below, each model equation is referred to as a pair `(eqtype, eqind)`. The string `eqtype` may be either `'initial'`, `'dae'`, `'path_eq'`, `'path_ineq'`, `'point_eq'`, or `'point_ineq'`. The equation index `eqind` gives the index within the given equation type, and is a nonnegative integer less than the number of equations within the type. The symbolic model equations corresponding to given pairs `(eqtype, eqind)` can be retrieved through the `get_equations` method:

eq = solver.get_equations(eqtype, 0)        # first equation of type eqtype
eqs = solver.get_equations(eqtype, [1, 3])  # second and fourth equation
all_eqs = solver.get_equations(eqtype)      # all equations of the given type

Apart from the model equations, collocation may also instantiate additional kinds of constraints, such as *continuity constraints* to enforce continuity of states between elements and *collocation constraints* to prescribe the coupling between states and their derivatives. These constraints have their own `eqtype` strings. A list of all equation types used in a given model can be retrieved using

eqtypes = solver.get_constraint_types()

Given a potential solution to the NLP, the *residual* of a constraint is a number that specifies how close it is to being satisfied. For equalities, the residual must be (close to) zero for the solution to be feasible. For inequalities, the residual must be in a specified range, typically nonpositive. The constraint *violation* is zero if the residual is within bounds, and gives the signed distance to the closest bound otherwise; for equality constraints, this is the same as the residual. Methods for returning residuals actually return the violation by default, but have an option to get the raw residual.

For a feasible solution, all violations are (almost) zero. If an optimization converges to an infeasible point or does not have time to converge to a feasible one then the residuals show which constraints the NLP solver was unable to satisfy. If one problematic constraint comes into conflict with a number of constraints, all of them will likely have nonzero violations.
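The violation concept described above can be illustrated with a small self-contained helper (illustrative only, not part of the solver API); equality constraints use `lb == ub`, and inequalities use an interval such as `(-inf, 0]`:

```python
import numpy as np

def violation(residual, lb, ub):
    """Zero if residual lies within [lb, ub]; otherwise the signed
    distance to the closest bound (for equalities, lb == ub, this
    equals the residual itself)."""
    residual = np.asarray(residual, dtype=float)
    return np.where(residual < lb, residual - lb,
                    np.where(residual > ub, residual - ub, 0.0))
```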

Residual values for a given equation type can be retrieved as a function of time through

r = solver.get_residuals(eqtype)

where `r` is an array of residuals of shape `(n_timepoints, n_equations)`. There are also optional arguments: `inds` gives a subset of equation indices (e.g. `inds=[0, 1]`), `point` specifies whether to evaluate residuals at the optimization solution (`point='opt'`, the default) or at the initial point (`point='init'`), and `raw` specifies whether to return constraint violations (`raw=False`, the default) or raw residuals (`raw=True`).

The corresponding time points can be retrieved with

t, i, k = solver.get_constraint_points(eqtype)

where `t`, `i`, and `k` are vectors that give the time, element index, and collocation point index for each instantiation.

To get an overview of which residuals are the largest,

solver.get_residual_norms()

returns a list of equation types sorted by descending residual norm, and

solver.get_residual_norms(eqtype)

returns a list of equation indices of the given type sorted by residual norm.

By default, the methods above work with the unscaled residuals that result directly from collocation. If the `equation_scaling` option is turned on, the constraints are rescaled before being sent to the NLP solver. It may then be of more interest to look at the scaled residuals, since these are what the NLP solver tries to make small. The above methods can be made to work with the scaled residuals instead by passing the `scaled=True` keyword argument. The residual scale factors can also be retrieved, in analogy to `solver.get_residuals`, through

scales = solver.get_residual_scales(eqtype)

and an overview of the residual scale factors (or inverse scale factors with `inv=True`) can be gained from

When solving the collocated NLP, the NLP solver typically has to evaluate the Jacobian of the constraint residual functions. Convergence problems can sometimes be related to numerical problems with the constraint Jacobian. In particular, Ipopt will never consider a potential solution if there are nonfinite (infinity or not-a-number) entries in the Jacobian. If the Jacobian has such entries at the initial guess, the optimizer gives up completely.

The constraint Jacobian comes from the NLP. As seen from the original model, it contains the derivatives of the model equations (and also e.g. the collocation equations) with respect to the model variables at different time points. If one or several problematic entries are found in the Jacobian, it is often helpful to know the model equation and variable that they correspond to.
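The underlying check can be illustrated in plain numpy, independent of the solver: nonfinite entries of a (here dense, hypothetical) Jacobian are located by combining `np.isfinite` with `np.nonzero`:

```python
import numpy as np

# Hypothetical dense Jacobian with one infinite and one NaN entry
J = np.array([[1.0, np.inf],
              [0.0, np.nan]])

# Row/column indices of all nonfinite entries
rows, cols = np.nonzero(~np.isfinite(J))
```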

The set of (model equation, model variable) pairs that correspond to nonfinite entries in the constraint Jacobian can be printed with

solver.print_nonfinite_jacobian_entries()

or returned with

entries = solver.find_nonfinite_jacobian_entries()

There are also methods that allow more custom analyses of this kind. For example, to list all Jacobian entries with an absolute value greater than 10, one can use

# Get the raw NLP constraint Jacobian as a (sparse) scipy.csc_matrix
J = solver.get_nlp_jacobian()

# Find the indices of all entries with absolute value > 10
J.data = abs(J.data) > 10
c_inds, xx_inds = N.nonzero(J)

# Map the indices to equations and variables in the model
entries = solver.get_model_jacobian_entries(c_inds, xx_inds)

# Print them
solver.print_jacobian_entries(entries)

To get the Jacobian with residual scaling applied, use the `scaled_residuals=True` option.

Many NLP solvers (including Ipopt) produce a solution that consists of not only the primal variables (the actual NLP variables), but also one *dual variable* for each constraint in the NLP. Upon convergence, the value of each dual variable gives the change in the optimal objective per unit change in the residual. Thus, the dual variables can give an idea of which constraints are most hindering when it comes to achieving a lower objective value. However, they must be interpreted in relation to how much it might be possible to change any given constraint.
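This interpretation of dual variables can be verified on a small equality-constrained quadratic program, which has a closed-form KKT system. The problem below is hypothetical and unrelated to any model; it solves min ½xᵀQx subject to Ax = b and checks that the dual predicts the change of the optimal objective per unit change in the constraint level b (with the convention L = f + λᵀ(Ax − b), the sensitivity is df*/db = −λ):

```python
import numpy as np

Q = np.eye(2)
A = np.array([[1.0, 1.0]])

def solve_kkt(b):
    """Solve min 0.5*x'Qx s.t. Ax = b via the KKT system
    [Q A'; A 0][x; lam] = [0; b]; returns x, lam, and the optimal objective."""
    K = np.block([[Q, A.T], [A, np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([np.zeros(2), np.atleast_1d(b)]))
    x, lam = sol[:2], sol[2:]
    return x, lam, 0.5 * x @ Q @ x

x, lam, f = solve_kkt(1.0)
# Perturbing b and re-solving shows (f(b+eps) - f(b))/eps ~= -lam
```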

Dual variable values for a given equation type can be retrieved as a function of time through

d = solver.get_constraint_duals(eqtype)

in analogy to `solver.get_residuals`. To get constraint duals for the equation-scaled problem, use the `scaled=True` keyword argument. Just as with `get_residuals`, the corresponding time points can be retrieved with

t, i, k = solver.get_constraint_points(eqtype)

Besides regular constraints, the NLP can also contain upper and lower bounds on variables. These correspond to the Modelica `min` and `max` attributes for instantiated model variables. The dual variables for the bounds on a given model variable `var` can be retrieved as a function of time through

d = solver.get_bound_duals(var)

The corresponding time points can be retrieved with

t, i, k = solver.get_variable_points(var)

The methods described above generally hide the actual collocated NLP and only require the user to work with model variables and equations, instantiated at different points. There also exist lower level methods that expose the NLP-level information and its mapping back to the original model more directly; these may be useful for more custom applications. They include:

- `get_nlp_variables`, `get_nlp_residuals`, `get_nlp_bound_duals`, and `get_nlp_constraint_duals` to get raw vectors from the NLP solution.
- `get_nlp_variable_bounds` and `get_nlp_residual_bounds` to get the corresponding bounds used in the NLP.
- `get_nlp_residual_scales` to get the raw residual scale factors.
- `get_nlp_variable_indices` and `get_nlp_constraint_indices` to get mappings from model variables and equations to their NLP counterparts.
- `get_point_time` to get the times of collocation points `(i, k)`.
- `get_model_variables` and `get_model_constraints` to map from NLP variables and constraints to the corresponding model variables and equations.

The low level constraint Jacobian methods `get_nlp_jacobian` and `get_model_jacobian_entries`, and the `print_jacobian_entries` method, have already been covered in the section about Jacobians above.