API Reference
Public API
DecisionRules.AbstractIntegerStrategy — Type
AbstractIntegerStrategy
Extension point for preparing models with discrete variables before reading duals or solver sensitivities.
DecisionRules.ContinuousRelaxationIntegerStrategy — Type
ContinuousRelaxationIntegerStrategy()
Relax all binary/integer constraints to continuous bounds (binary → [0,1]), solve the resulting LP, and read duals in that relaxed state.
Compared to FixedDiscreteIntegerStrategy:
- Faster: one LP solve instead of MIP + LP.
- Smoother gradients: no integer fixing means no zero-gradient dead zones.
- Less accurate: the LP solution may have fractional integer variables, so the gradient does not correspond to any feasible integer assignment.
A practical pattern is to train with ContinuousRelaxationIntegerStrategy during warmup (smooth landscape for initial learning) and switch to FixedDiscreteIntegerStrategy later (integer-accurate gradients for fine-tuning).
DecisionRules.FixedDiscreteIntegerStrategy — Type
FixedDiscreteIntegerStrategy()
Solve the original model, fix binary/integer variables to their incumbent values, relax integrality, re-solve the fixed continuous model, and read duals or sensitivities in that fixed-incumbent continuous state.
The returned derivative-like information is local to the incumbent integer assignment and should be interpreted as a postprocessing surrogate, not as a full differentiable MIP method.
DecisionRules.NoIntegerStrategy — Type
NoIntegerStrategy()
Default integer strategy. Solves the model exactly as-is and preserves the historical continuous-model behavior.
DecisionRules.RolloutEvaluation — Type
RolloutEvaluation(subproblems, state_params_in, state_params_out, initial_state,
                  scenarios; stride=1, policy_state=:realized)
Evaluation helper that assesses the policy with a stage-wise rollout (the deployment semantics of a target-trajectory policy) on a fixed held-out scenario set. Deterministic-equivalent evaluation re-optimizes all stages jointly and can absorb stage-wise-unfollowable targets through the slack penalty, silently overstating policy quality; the rollout metric is the guard that detects this.
policy_state controls what state is passed back into the policy:
- :realized pipes the previous realized state into the policy. This is the deployment/closed-loop rollout semantics.
- :target pipes the previous target state into the policy, matching the deterministic-equivalent target-generation semantics from simulate_states while still solving the stage subproblems sequentially.
scenarios must be a vector of materialized scenarios, sampled once before training (e.g. [DecisionRules.sample(uncertainty_samples) for _ in 1:n]), so every evaluation uses the same fixed set. subproblems may be the training subproblems (all stage parameters are rewritten on every solve) or a separately built copy; when training on a deterministic equivalent, pass the stage-wise subproblems here.
Call evaluation(iter, model), e.g. from within a record callback. On every stride-th call it rolls the policy out over the fixed set and reports:
- metrics/rollout_objective_no_deficit: the rollout objective excluding the target-slack penalty term (the operational cost), and
- metrics/rollout_target_violation_share: the realized slack penalty divided by the full rollout objective (NaN when undefined).
Policy comparisons should only be trusted when the violation share is small (≤ ~0.05); a larger share means the policy's targets are not followable stage by stage and the reported cost is not what deployment would realize. The latest values are kept in last_objective_no_deficit / last_violation_share for custom logging. Calls on batches that are not a multiple of stride are a no-op and leave the cached values unchanged.
DecisionRules.SampleLog — Type
SampleLog(; on_sample=(s, models, sample_log) -> nothing,
          objective_no_deficit_fn=get_objective_no_target_deficit)
Per-sample logger with local cache state for the training loops, patterned after SaveBest. During each batch the training loop calls sample_log(s, det_equivalent_or_subproblems) right after sample s has been simulated (successful solves only; a failed solve throws exactly as before). The default behavior caches, per sample, the full objective (objective_value) and the objective excluding the target-slack penalty term (objective_no_deficit_fn). The cache is cleared at the start of every batch and handed to the per-batch record(sample_log, iter, model) callback.
on_sample is an optional hook called as on_sample(s, models, sample_log) after the default caching. It receives the live JuMP model(s), so it can inspect termination statuses or dump the details of a suspicious sample for debugging — without paying any per-sample logging cost in the default configuration.
DecisionRules.StateConditionedPolicy — Type
StateConditionedPolicy
A policy architecture that separates temporal encoding from state conditioning:
- encoder: a recurrent cell (LSTMCell/GRUCell/RNNCell, or a Chain of them) that encodes only the uncertainty sequence (temporal dependencies)
- combiner: a Dense layer that combines the encoder output with the previous state to produce the next state
Flux's recurrent cells are stateless (Flux >= 0.16): each call returns (output, new_state) instead of mutating an internal Recur. StateConditionedPolicy therefore carries the encoder's recurrent state itself in state, threading it through one call per stage. Call Flux.reset! to clear it (back to Flux.initialstates) at the start of a rollout.
Input format: [uncertainty..., previous_state...]
DecisionRules.compute_parameter_dual — Method
compute_parameter_dual(model::JuMP.Model, param::JuMP.VariableRef)
Compute the dual value (sensitivity) of a parameter in a solved JuMP model.
The parameter dual represents ∂(objective)/∂(parameter_value) and is computed by:
- Finding all constraints where the parameter appears
- For each constraint, computing: -coefficient * constraint_dual
- For the objective, adding the coefficient (or negative for maximization)
- Summing all contributions
This works for any solved model, not just convex ones, as long as dual values are available.
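The summation above can be checked by hand on a tiny problem. The sketch below is plain Julia with no solver: `sketch_parameter_dual` is a hypothetical helper (not part of DecisionRules), and the dual value 3 is assumed from JuMP's sign convention for a `>=` constraint in a minimization problem rather than computed.

```julia
# Hypothetical helper: `terms` holds (coefficient_of_p, constraint_dual) pairs and
# `objective_coefficient` is the coefficient of p in a minimization objective.
function sketch_parameter_dual(terms, objective_coefficient)
    constraint_part = sum(-coef * dual for (coef, dual) in terms; init = 0.0)
    return constraint_part + objective_coefficient
end

# Problem: min 3x + p  subject to  x - 2p >= 0, x >= 0.
# The coefficient of p in the constraint is -2; the constraint dual is assumed to be 3.
sensitivity = sketch_parameter_dual([(-2.0, 3.0)], 1.0)

# Cross-check: at the optimum x = 2p, so the objective is 7p and d(objective)/dp = 7.
```

The hand-derived value and the summed contributions agree, which is the property the parameter dual is meant to capture.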
Arguments
- model: A solved JuMP model
- param: A parameter variable (created with @variable(model, p in MOI.Parameter(value)))
Returns
- The dual value (sensitivity) of the parameter
Example
using JuMP, HiGHS, DecisionRules

model = Model(HiGHS.Optimizer)
@variable(model, x >= 0)
@variable(model, p in MOI.Parameter(1.0))
@constraint(model, con, x >= 2 * p)
@objective(model, Min, 3 * x + p)
optimize!(model)
# con is x - 2p >= 0, so p has coefficient -2 and the rule above gives 2 * dual(con) + 1
dual_p = compute_parameter_dual(model, p)
DecisionRules.create_deficit! — Method
create_deficit!(model::JuMP.Model, len::Int; penalty_l1=nothing, penalty_l2=nothing, penalty=nothing)
Create deficit variables to penalize state deviations in a JuMP model.
Supports three modes:
- L1 norm only: Uses MOI.NormOneCone (default if no penalty specified)
- L2 squared norm only: Uses sum of squared deviations (solver-compatible alternative to SecondOrderCone)
- Both norms: Creates both constraints with separate penalties
Arguments
- model: The JuMP model to add deficit variables to
- len: Number of deficit variables (typically dimension of state)
- penalty_l1: Penalty coefficient for L1 norm (NormOneCone). If nothing and L1 is used, defaults to max objective coefficient.
- penalty_l2: Penalty coefficient for L2 squared norm (sum of squares). If nothing and L2 is used, defaults to max objective coefficient.
- penalty: Legacy argument. If provided and penalty_l1/penalty_l2 are both nothing, uses this for L1 norm only.
Returns
- norm_deficit: Single variable representing total penalized deviation (for logging compatibility)
- _deficit: Vector of deficit variables for each state dimension
Examples
# L1 norm only (default behavior, backwards compatible)
norm_deficit, _deficit = create_deficit!(model, 3; penalty=1000.0)
# L2 norm only
norm_deficit, _deficit = create_deficit!(model, 3; penalty_l2=1000.0)
# Both L1 and L2 norms
norm_deficit, _deficit = create_deficit!(model, 3; penalty_l1=1000.0, penalty_l2=500.0)
DecisionRules.default_annealed_schedule — Method
default_annealed_schedule(num_batches::Int)
Build the default annealed target-penalty schedule over num_batches training batches: multipliers 0.1 -> 1.0 -> 10.0 -> 30.0 with phase lengths proportional to 2/2/4/16 of the horizon (the last phase takes the remainder; every phase keeps at least one batch). For num_batches < 4 the last num_batches multipliers are used, one batch each, so the run always ends at the strong-penalty phase.
Returns a Vector{Tuple{Int,Int,Float64}} of (first_batch, last_batch, multiplier) entries suitable for the penalty_schedule keyword of train_multistage and train_multiple_shooting. Multipliers are applied relative to the penalty the model was built with (the objective coefficient of the norm_deficit variables created by create_deficit!), so with penalty=:auto the effective penalty is multiplier * max |objective coefficient|.
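To make the phase layout concrete, here is a minimal sketch of one way such a schedule could be built. `sketch_annealed_schedule` is a hypothetical name, and the rounding rule (floor the proportional length, clamp to at least one batch) is an assumption; the package's actual rounding may differ.

```julia
const SKETCH_MULTIPLIERS = [0.1, 1.0, 10.0, 30.0]
const SKETCH_WEIGHTS = [2, 2, 4, 16]   # phase-length weights; they sum to 24

function sketch_annealed_schedule(num_batches::Int)
    num_batches >= 1 || throw(ArgumentError("num_batches must be positive"))
    if num_batches < 4
        # Short runs: keep the last `num_batches` multipliers, one batch each.
        mults = SKETCH_MULTIPLIERS[(end - num_batches + 1):end]
        return [(i, i, mults[i]) for i in 1:num_batches]
    end
    total = sum(SKETCH_WEIGHTS)
    # Assumed rounding rule: floor the proportional length, keep at least one batch.
    lengths = [max(1, fld(num_batches * w, total)) for w in SKETCH_WEIGHTS[1:3]]
    push!(lengths, num_batches - sum(lengths))   # last phase takes the remainder
    schedule = Tuple{Int,Int,Float64}[]
    first_batch = 1
    for (len, mult) in zip(lengths, SKETCH_MULTIPLIERS)
        push!(schedule, (first_batch, first_batch + len - 1, mult))
        first_batch += len
    end
    return schedule
end

schedule_24 = sketch_annealed_schedule(24)
schedule_2 = sketch_annealed_schedule(2)
```

With 24 batches the phases cover batches 1-2, 3-4, 5-8 and 9-24; with 2 batches only the 10.0 and 30.0 multipliers remain.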
DecisionRules.default_record — Method
default_record(sample_log, iter, model)
Default per-batch recording callback: prints the same two per-batch lines as the historical record_loss default (metrics/loss = mean objective excluding the target-slack penalty, then metrics/training_loss = mean full objective) and returns false (training continues). Return true from a custom record to stop training.
DecisionRules.dense_multilayer_nn — Method
dense_multilayer_nn(num_inputs, num_outputs, layers; activation=Flux.relu, dense=Dense)
Create a multi-layer neural network with the specified architecture.
Arguments
- num_inputs::Int: Number of input features
- num_outputs::Int: Number of output features
- layers::Vector{Int}: Hidden layer sizes
- activation: Activation function (default: Flux.relu)
- dense: Layer type (Dense, LSTM, etc.)
DecisionRules.materialize_tangent — Method
materialize_tangent(x)
Recursively convert ChainRulesCore tangent types (MutableTangent, Tangent) to plain NamedTuples/Arrays that Flux.update! can handle.
This is needed because Zygote produces MutableTangent for mutable structs (like Flux.Recur), but Flux.update!/Optimisers.jl expects plain NamedTuples.
DecisionRules.normalize_recur_state — Method
normalize_recur_state(state)
Return a copy of a Flux.state object where any Recur-like nodes have their state field set to cell.state0. This avoids Flux.loadmodel! tie errors when loading into freshly constructed recurrent layers.
DecisionRules.policy_input_dim — Method
policy_input_dim(num_uncertainties, num_states)
Compute the input dimension for a policy network.
Policy networks receive [uncertainty..., previous_state...] as input, so the input dimension is num_uncertainties + num_states.
This format is consistent between subproblems and deterministic equivalent formulations, enabling warmstarting policies trained with det_eq for use with subproblems.
Arguments
- num_uncertainties::Int: Number of uncertainty parameters per stage
- num_states::Int: Number of state variables
Returns
Int: Total input dimension for the policy network
DecisionRules.policy_input_dim — Method
policy_input_dim(uncertainty_samples, initial_state)
Compute the input dimension for a policy network from problem data.
Arguments
- uncertainty_samples: Uncertainty samples from problem construction
- initial_state: Initial state vector
Returns
Int: Total input dimension for the policy network
DecisionRules.predict_window_targets — Method
predict_window_targets(decision_rule, s_in, uncertainties_vec)
Predict one target per stage in a window. This is an AD-friendly scan:
target_1 = π([u_1; s_in])
target_2 = π([u_2; target_1])
...
DecisionRules.setup_shooting_windows — Method
setup_shooting_windows(subproblems, state_params_in, state_params_out, initial_state,
                       uncertainties; window_size, model_factory=() -> JuMP.Model())
Build window models for multiple shooting.
Notes:
- We store only the uncertainty PARAMETER refs (not sample sets) in WindowData.
DecisionRules.simulate_multiple_shooting — Method
simulate_multiple_shooting(windows, decision_rule, initial_state, uncertainty_sample, uncertainties_vec)
- uncertainty_sample: per-stage sampled tuples (param, value) in the same layout as the existing sampler output: Vector{Vector{Tuple{VariableRef,<:Real}}}
- uncertainties_vec: per-stage vectors (Float32) used as policy inputs
Returns total objective across windows. Gradients flow through:
- targets within each window (via solve_window rrule)
- realized end state between windows (via solve_window rrule seeding on end vars)
DecisionRules.simulate_multistage — Method
simulate_multistage(det_equivalent::JuMP.Model, state_params_in, state_params_out,
                    initial_state, uncertainties, decision_rules) -> Float64
Convenience overload: rolls out decision_rules to produce target states, then calls the deterministic-equivalent simulate_multistage to solve the coupled problem.
DecisionRules.simulate_multistage — Method
simulate_multistage(subproblems, state_params_in, state_params_out,
                    initial_state, uncertainties, decision_rules) -> Float64
Stage-wise (single shooting) forward simulation. Rolls decision_rules over uncertainties, solving one subproblem per stage. The realized state from each stage feeds the next via get_next_state. Returns the total objective across all stages (Extension §2, Eq. 2.1–2.4).
DecisionRules.simulate_multistage — Method
simulate_multistage(det_equivalent, state_params_in, state_params_out,
                    uncertainties, states) -> Float64
Deterministic-equivalent (direct transcription) forward pass. Sets all parameter values from states and uncertainties into the coupled det_equivalent model, solves it, and returns the objective value (Extension §1, Eq. 1.1).
DecisionRules.simulate_stage — Method
simulate_stage(subproblem, state_param_in, state_param_out, uncertainty,
               state_in, state_out_target) -> Float64
Set parameter values on subproblem (incoming state, outgoing target, uncertainty), solve it, and return the objective value. Used as the inner solve in single-shooting rollouts (Extension §2, Eq. 2.1).
DecisionRules.simulate_states — Method
simulate_states(initial_state, uncertainties, decision_rule) -> Vector{Vector}
Roll out decision_rule over uncertainties to produce a target-state trajectory. At each stage the policy receives [uncertainty..., previous_state...] and outputs the next target state. Returns a length-(T+1) vector of states starting with initial_state.
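The rollout described above is a scan over the stages. A minimal sketch, assuming the policy is any plain function of the concatenated input (`sketch_simulate_states` and `toy_policy` are hypothetical names, not package functions):

```julia
# `policy` maps [uncertainty..., previous_state...] to the next target state.
function sketch_simulate_states(initial_state, uncertainties, policy)
    # accumulate threads the previous state through the stages; `init` seeds stage 1.
    targets = accumulate(uncertainties; init = initial_state) do prev_state, u
        policy(vcat(u, prev_state))
    end
    return [initial_state, targets...]
end

# Toy linear "policy" for a 1-dimensional state: next = 0.5 * previous + uncertainty.
toy_policy(x) = [0.5 * x[2] + x[1]]

states = sketch_simulate_states([2.0], [[1.0], [0.0], [3.0]], toy_policy)
```

Three stages produce four states, the first being the initial state, which matches the length-(T+1) convention.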
DecisionRules.solve_window — Method
solve_window(window_model, window_state_in_params, window_state_out_params,
             s_in, targets)
Solve a deterministic-equivalent window model.
Arguments
- window_model: JuMP model (DiffOpt-enabled) for the window
- window_state_in_params: Vector of MOI.Parameter vars for window initial state
- window_state_out_params: per-stage vector of tuples (target_param, realized_var)
- s_in: numeric initial state
- targets: vector of numeric targets, one per stage in the window
Returns
- (objective, s_out): objective value, realized end state (Float32 vector)
DecisionRules.state_conditioned_policy — Method
state_conditioned_policy(n_uncertainty, n_state, n_output, layers;
                         activation=Flux.relu, encoder_type=Flux.LSTM)
Create a StateConditionedPolicy with the specified architecture.
Arguments
- n_uncertainty::Int: Number of uncertainty input dimensions
- n_state::Int: Number of state dimensions (both input and output)
- n_output::Int: Number of output dimensions (typically same as n_state)
- layers::Vector{Int}: Hidden layer sizes for the encoder
- activation: Activation function for dense layers (default: relu)
- encoder_type: Recurrent layer/cell type (LSTM, GRU, RNN, or their *Cell variants; default: Flux.LSTM). Must support Flux.initialstates and the stateful (x, state) -> (output, new_state) call (Flux >= 0.16).
Architecture
- Encoder: encoder_type(n_uncertainty => layers[1]) -> ... -> layers[end]
- Combiner: Dense(layers[end] + n_state => n_output)
DecisionRules.train_multiple_shooting — Method
train_multiple_shooting(model, initial_state, windows, uncertainty_sampler; ...)
This mirrors the other training loops:
- Reuse pre-built window models.
- For each SGD step, sample uncertainties, build uncertainties_vec for the policy, evaluate simulate_multiple_shooting, and update parameters.
DecisionRules.train_multistage — Method
train_multistage(model, initial_state, det_equivalent::JuMP.Model,
                 state_params_in, state_params_out, uncertainty_sampler; kwargs...)
Train a policy with the deterministic equivalent (direct transcription, Extension §1). Each SGD step samples uncertainty trajectories, rolls out target states with Base.accumulate, solves the coupled det_equivalent, and updates model. Gradient: Eq. 1.2, $λ^s ⊙ ∇_θ π$.
DecisionRules.train_multistage — Method
train_multistage(model, initial_state, subproblems, state_params_in,
                 state_params_out, uncertainty_sampler; kwargs...)
Train a policy with stage-wise decomposition (single shooting, Extension §2). Each SGD step samples num_train_per_batch uncertainty trajectories, rolls out the policy through simulate_multistage (stage-wise overload), and updates model via the Flux optimizer.
DecisionRules.variable_to_parameter — Method
variable_to_parameter(model, variable; initial_value=0.0, deficit=nothing)
Replace a decision variable with an MOI.Parameter and bind them via an equality constraint. When deficit is provided, the constraint becomes variable + deficit == parameter and both the parameter and the deficit variable are returned.
Internal Functions
DecisionRules.STRICT_GRADIENTS — Constant
STRICT_GRADIENTS
Global flag controlling gradient fallback behavior in rrules.
When false (default), rrule pullbacks return zero gradients with a warning when the solver terminates unsuccessfully — this keeps training alive when a few samples hit numerical trouble.
When true, the same situation throws an error instead. Enable this in tests to verify that controlled test cases never silently fall through to zero gradients:
DecisionRules.STRICT_GRADIENTS[] = true
ChainRulesCore.rrule — Method
ChainRulesCore.rrule for get_last_realized_state
Computes gradients w.r.t.:
- s_in (window start numeric state)
- targets (all target vectors in the window)
Given cotangents (Δs_out), we:
- seed reverse variables (realized end state vars) with Δs_out
- reverse_differentiate!
- read reverse sensitivities w.r.t. window_state_in_params and all target params
ChainRulesCore.rrule — Method
ChainRulesCore.rrule(get_next_state, subproblem, state_param_in, state_param_out, state_in, state_out_target)
Correct reverse-mode rule using DiffOpt:
- Seeds reverse on realized output variables with Δstate_out
- Calls DiffOpt.reverse_differentiate!
- Reads sensitivities wrt parameter vars (state_param_in, state_param_out parameters)
- Returns VJP wrt the numeric inputs state_in and state_out_target
Assumptions:
- subproblem is a JuMP.Model constructed with Model(() -> DiffOpt.diff_optimizer(...))
- state_param_in::Vector{JuMP.VariableRef} are JuMP Parameter variables (incoming state parameters)
- state_param_out::Vector{Tuple{JuMP.VariableRef,JuMP.VariableRef}} holds (target-Parameter variable, realized-state Variable) per component
- get_next_state(...) updates parameter values, optimize!s, and returns a Vector matching the realized-state variables
ChainRulesCore.rrule — Method
ChainRulesCore.rrule(::typeof(simulate_multistage), det_equivalent, state_params_in,
                     state_params_out, uncertainties, states)
Reverse-mode rule for the deterministic-equivalent (full-horizon) solve.
Mathematical basis (TS-DDR, arXiv:2405.14973, Eq. 1.2; Extension §1)
For the coupled problem $Q(w;θ) = \min \sum_t f_t + C_δ\|δ_t\|$ the gradient estimator is:
$∇_θ E[Q] ≈ \frac{1}{S} \sum_s λ^s ⊙ ∇_θ π(·; θ)$
where $λ_t$ is the dual of the target constraint $x_t + δ_t = \hat{x}_t$. The pullback returns $Δ_{states}$ such that $Δ_{states}[1]$ holds the parameter duals of the initial-state parameters and $Δ_{states}[t+1]$ holds the target-constraint duals $λ_t$ for each stage.
Fallback strategy
Same as simulate_stage: tries compute_parameter_dual first, falls back to DiffOpt reverse differentiation if pdual raises. Solver failure (bad termination status) returns zero gradients or throws depending on STRICT_GRADIENTS.
ChainRulesCore.rrule — Method
ChainRulesCore.rrule(::typeof(simulate_stage), subproblem, state_param_in,
                     state_param_out, uncertainty, state_in, state_out)
Reverse-mode rule for a single-stage subproblem solve.
Mathematical basis (TS-DDR, arXiv:2405.14973; Extension §2 Eq. 2.5)
For stage problem $q_t(x_{t-1}, w_t; \hat{x}_t)$, the sensitivities are:
∂q_t/∂(state_in) = μ_t (dual of dynamics constraint w.r.t. incoming state)
∂q_t/∂(target) = λ_t (dual of target constraint w.r.t. target x̂_t)
These are the Lagrange multipliers that compute_parameter_dual (pdual) extracts from the solved model. This is the preferred path: closed-form and exact whenever the solver exposes constraint duals.
Fallback strategy
- pdual (parameter duals): tried first.
- DiffOpt reverse differentiation: used if pdual raises (e.g. the optimizer wrapper does not expose conic duals). Computes the same sensitivities via implicit differentiation of the KKT system.
- Zero gradients: only when the solver terminated with an unsuccessful status (not OPTIMAL / ALMOST_OPTIMAL / LOCALLY_SOLVED). A warning is emitted. Set DecisionRules.STRICT_GRADIENTS[] = true to throw instead.
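The three tiers can be sketched in plain Julia without a solver. Everything here is a stand-in: `SKETCH_STRICT` plays the role of the strict flag, `closed_form` and `implicit` are hypothetical zero-argument closures for the two gradient paths, and checking the termination status before trying either path is an assumption about ordering.

```julia
const SKETCH_STRICT = Ref(false)   # stand-in for a global strict-gradients flag

function sketch_gradient(solved_ok::Bool, closed_form, implicit, n::Int)
    if !solved_ok
        # Unsuccessful termination: throw in strict mode, otherwise warn and return zeros.
        SKETCH_STRICT[] && error("solver terminated unsuccessfully")
        @warn "solver terminated unsuccessfully; returning zero gradients"
        return zeros(n)
    end
    try
        return closed_form()       # preferred path: parameter duals
    catch
        return implicit()          # fallback path: e.g. implicit differentiation
    end
end

g_primary = sketch_gradient(true, () -> [1.0], () -> [2.0], 1)
g_fallback = sketch_gradient(true, () -> error("no duals"), () -> [2.0], 1)
g_zero = sketch_gradient(false, () -> [1.0], () -> [2.0], 1)
```

Keeping the flag in a `Ref` lets tests flip it at runtime without redefining a constant.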
ChainRulesCore.rrule — Method
ChainRulesCore.rrule for solve_window
Computes gradients w.r.t.:
- s_in (window start numeric state)
- targets (all target vectors in the window)
Given cotangents (Δobj_val), we:
- seed reverse variables (objective and realized end state vars) with Δobj_val
- reverse_differentiate!
- read reverse sensitivities w.r.t. window_state_in_params and all target params
ChainRulesCore.rrule — Method
ChainRulesCore.rrule(::typeof(set_window_uncertainties!), window::WindowData, uncertainty_sample)
Declare set_window_uncertainties! as non-differentiable (mutates solver state).
DecisionRules._apply_deficit_penalty_multiplier! — Method
_apply_deficit_penalty_multiplier!(model::JuMP.Model, bases::Dict{VariableRef,Float64}, multiplier::Real) -> JuMP.Model
_apply_deficit_penalty_multiplier!(models::Vector{JuMP.Model}, bases::Vector, multiplier::Real) -> Vector{JuMP.Model}
Mutate model (or each model in models) in place, setting every deficit variable's objective coefficient to multiplier * base using the bases from _deficit_penalty_bases. Return the mutated model(s).
DecisionRules._as_cell — Method
_as_cell(layer)
Return the underlying recurrent cell of layer. Flux.LSTM/GRU/RNN wrap a cell (LSTMCell/GRUCell/RNNCell) in a .cell field; if layer has no such field it is already a cell and is returned unchanged.
DecisionRules._check_deficit_penalty_bases — Method
_check_deficit_penalty_bases(bases) -> typeof(bases)
Return bases (from _deficit_penalty_bases) unchanged if it has at least one entry; otherwise throw ArgumentError, since a penalty_schedule would then have nothing to scale.
DecisionRules._deficit_penalty_bases — Method
_deficit_penalty_bases(model::JuMP.Model; deficit_name="norm_deficit") -> Dict{VariableRef,Float64}
_deficit_penalty_bases(models::Vector{JuMP.Model}; deficit_name="norm_deficit") -> Vector{Dict{VariableRef,Float64}}
Capture the current objective coefficient of every deficit variable as the multiplier base for _apply_deficit_penalty_multiplier!. A variable counts as a deficit variable if deficit_name occurs in its name and its linear objective coefficient (see _linear_objective_coefficient) is nonzero.
Must be called before any penalty_schedule multiplier is applied, so the captured coefficients reflect the as-built penalties.
DecisionRules._get_dual_from_affine_constraints — Method
_get_dual_from_affine_constraints(model::JuMP.Model, param::JuMP.VariableRef, S)
Get dual contribution from scalar affine constraints of set type S.
DecisionRules._get_dual_from_constraints — Method
_get_dual_from_constraints(model::JuMP.Model, param::JuMP.VariableRef)
Compute the dual contribution from all constraints containing the parameter.
DecisionRules._get_dual_from_objective — Method
_get_dual_from_objective(model::JuMP.Model, param::JuMP.VariableRef)
Get the dual contribution from the objective function. If parameter appears in the objective with coefficient c, contribution is:
- +c for minimization
- -c for maximization
DecisionRules._get_dual_from_quadratic_constraints — Method
_get_dual_from_quadratic_constraints(model::JuMP.Model, param::JuMP.VariableRef, S)
Get dual contribution from quadratic constraints of set type S. Handles both affine terms (parameter appears linearly) and quadratic terms (parameter appears in products p*v or p*p).
DecisionRules._get_dual_from_vector_affine_constraints — Method
_get_dual_from_vector_affine_constraints(model::JuMP.Model, param::JuMP.VariableRef, F, S)
Get dual contribution from vector affine constraints (like conic constraints).
DecisionRules._get_objective_parameter_coefficient — Method
_get_objective_parameter_coefficient(obj, param::JuMP.VariableRef)
Get the coefficient of a parameter in the objective function.
DecisionRules._get_parameter_coefficient — Method
_get_parameter_coefficient(expr::JuMP.GenericAffExpr, param::JuMP.VariableRef)
Get the coefficient of a parameter in an affine expression.
DecisionRules._get_parameter_coefficient_from_affine — Method
_get_parameter_coefficient_from_affine(expr::JuMP.GenericQuadExpr, param::JuMP.VariableRef)
Get the coefficient of a parameter from the affine part of a quadratic expression.
DecisionRules._get_parameter_coefficient_from_quadratic — Method
_get_parameter_coefficient_from_quadratic(expr::JuMP.GenericQuadExpr, param::JuMP.VariableRef)
Get the effective coefficient of a parameter from quadratic terms. For terms like coef * p * v, the effective coefficient is coef * value(v). For terms like coef * p * p, the effective coefficient is 2 * coef * value(p).
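The two rules are just the partial derivative of each quadratic term with respect to p. A small sketch on plain data, assuming a hypothetical representation where each term is a `(coef, a, b)` tuple for `coef * a * b` and current values live in a `Dict` (this is not how JuMP stores expressions):

```julia
function sketch_quadratic_coefficient(terms, p::Symbol, current::Dict{Symbol,Float64})
    total = 0.0
    for (coef, a, b) in terms
        if a == p && b == p
            total += 2 * coef * current[p]     # d/dp (coef * p^2) = 2 * coef * p
        elseif a == p
            total += coef * current[b]         # d/dp (coef * p * v) = coef * v
        elseif b == p
            total += coef * current[a]         # same rule with the factors swapped
        end
    end
    return total
end

current = Dict(:p => 3.0, :v => 5.0)
# 4 * p * v + 0.5 * p * p  gives  4 * 5 + 2 * 0.5 * 3
effective = sketch_quadratic_coefficient([(4.0, :p, :v), (0.5, :p, :p)], :p, current)
```

Terms that do not involve p contribute nothing, so they can be mixed freely into `terms`.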
DecisionRules._init_recurrent_state — Method
_init_recurrent_state(encoder)
Return the initial recurrent state for encoder: Flux.initialstates(encoder) for a single cell, or a tuple of per-layer initial states for a Chain of cells.
DecisionRules._linear_objective_coefficient — Method
_linear_objective_coefficient(model::JuMP.Model, variable::VariableRef) -> Float64
Return variable's linear coefficient in model's objective, or 0.0 if variable does not appear in it. Supports affine and quadratic objectives (quadratic terms are ignored); throw ArgumentError for any other objective type.
DecisionRules._penalty_multiplier_for — Method
_penalty_multiplier_for(schedule, iter::Int) -> Float64
Return the multiplier of the phase containing batch iter. If iter is past the schedule's last phase, return that phase's multiplier (hold the final value steady).
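A lookup with this behavior can be sketched in a few lines. `sketch_multiplier_for` is a hypothetical name, and the sketch assumes `iter >= 1` and a schedule that is already contiguous from batch 1:

```julia
function sketch_multiplier_for(schedule, iter::Int)
    for (first_batch, last_batch, multiplier) in schedule
        first_batch <= iter <= last_batch && return multiplier
    end
    # Past the end of the schedule: hold the final multiplier.
    return last(schedule)[3]
end

schedule = [(1, 2, 0.1), (3, 4, 1.0), (5, 8, 10.0)]
inside = sketch_multiplier_for(schedule, 3)
beyond = sketch_multiplier_for(schedule, 20)
```

Batch 3 falls in the second phase, while batch 20 is past the schedule and keeps the last phase's multiplier.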
DecisionRules._record_loss_adapter — Method
_record_loss_adapter(record_loss)
Adapt the deprecated 4-argument record_loss(iter, model, loss, tag) callback to the record(sample_log, iter, model) interface, reproducing the historical two-call contract: record_loss is called first with tag = "metrics/loss", and only if that returns false is it called again with tag = "metrics/training_loss". Return the result of whichever call is made last.
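The two-call contract can be illustrated with closures. In this sketch the sample log is a hypothetical NamedTuple holding two loss vectors (`objective_no_deficit` and `objective`); the real SampleLog layout and the exact loss values passed to the legacy callback are assumptions.

```julia
sketch_mean(xs) = sum(xs) / length(xs)

function sketch_record_loss_adapter(record_loss)
    return function (sample_log, iter, model)
        loss = sketch_mean(sample_log.objective_no_deficit)
        stop = record_loss(iter, model, loss, "metrics/loss")
        stop && return true          # first call asked to stop: skip the second call
        training_loss = sketch_mean(sample_log.objective)
        return record_loss(iter, model, training_loss, "metrics/training_loss")
    end
end

# Legacy-style callback that records the tags it sees and never stops training.
calls = String[]
legacy(iter, model, loss, tag) = (push!(calls, tag); false)

record = sketch_record_loss_adapter(legacy)
log_stub = (objective_no_deficit = [1.0, 3.0], objective = [2.0, 6.0])
stopped = record(log_stub, 1, nothing)
```

A legacy callback that returns true on the first tag would be called only once, and the adapter would return true.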
DecisionRules._resolve_penalty_schedule — Method
_resolve_penalty_schedule(penalty_schedule, num_batches::Int)
Resolve the penalty_schedule keyword of train_multistage and train_multiple_shooting into a Vector{Tuple{Int,Int,Float64}} of (first_batch, last_batch, multiplier) phases, or nothing if penalty scaling is disabled:
- nothing returns nothing (no scaling);
- :default_annealed returns default_annealed_schedule(num_batches);
- any other value is checked with _validate_penalty_schedule and returned as-is.
DecisionRules._resolve_record — Method
_resolve_record(record, record_loss)
Resolve the record/record_loss keywords of the training loops into a single per-batch callback. Return record unchanged if record_loss is nothing; otherwise require record === default_record and return _record_loss_adapter(record_loss). Throw ArgumentError if both a custom record and a record_loss are given.
DecisionRules._step_encoder — Method
_step_encoder(encoder, x, state) -> (output, new_state)
Advance encoder by one step on input x from recurrent state, returning the output and the updated state. For a Chain of cells, each layer's output feeds the next and each layer's state is threaded independently.
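The state threading for a chain of cells can be sketched without Flux. Here a "cell" is any function `(x, state) -> (output, new_state)`; `running_sum_cell` and `sketch_step_chain` are hypothetical stand-ins, not Flux or package functions.

```julia
# Toy cell: adds the input to its state and returns the sum as both output and new state.
running_sum_cell(x, state) = (x .+ state, x .+ state)

function sketch_step_chain(cells::Tuple, x, states::Tuple)
    new_states = ()
    for (cell, state) in zip(cells, states)
        x, new_state = cell(x, state)            # this layer's output feeds the next layer
        new_states = (new_states..., new_state)  # each layer keeps its own state
    end
    return x, new_states
end

cells = (running_sum_cell, running_sum_cell)
states = ([0.0], [10.0])
out1, states = sketch_step_chain(cells, [1.0], states)
out2, states = sketch_step_chain(cells, [1.0], states)
```

Because the states are returned rather than mutated in place, the caller decides where they live, which is what lets a policy carry them across stages and reset them between rollouts.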
DecisionRules._target_violation_share — Method
_target_violation_share(objective::Real, objective_no_deficit::Real) -> Float64
Return the target-violation share, (objective - objective_no_deficit) / objective. Return NaN if either input is nonfinite or abs(objective) <= 1e-12 (share undefined).
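The formula and its guards translate directly into code. A minimal sketch under the stated rules (`sketch_violation_share` is a hypothetical name):

```julia
function sketch_violation_share(objective::Real, objective_no_deficit::Real)
    (isfinite(objective) && isfinite(objective_no_deficit)) || return NaN
    abs(objective) <= 1e-12 && return NaN       # share undefined near zero objective
    return (objective - objective_no_deficit) / objective
end

# Full objective 105, operational cost 100: the slack penalty is 5 of 105.
share = sketch_violation_share(105.0, 100.0)
undefined = sketch_violation_share(0.0, 0.0)
```

In this example the share is a little under 0.05, so it would sit just inside the threshold suggested for trusting policy comparisons.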
DecisionRules._validate_penalty_schedule — Method
_validate_penalty_schedule(schedule) -> typeof(schedule)
Validate an explicit penalty_schedule, a Vector of (first_batch, last_batch, multiplier) tuples: phases must be non-empty, contiguous, start at batch 1, satisfy first_batch <= last_batch, and have finite positive multipliers. Return schedule unchanged, or throw ArgumentError describing the first violation found.
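The listed conditions can be checked in a single pass. This is a sketch of such a validator, not the package's implementation; the function name and error messages are hypothetical.

```julia
function sketch_validate_schedule(schedule)
    isempty(schedule) && throw(ArgumentError("penalty_schedule must be non-empty"))
    expected_first = 1   # the first phase starts at batch 1; later phases must be contiguous
    for (first_batch, last_batch, multiplier) in schedule
        first_batch == expected_first ||
            throw(ArgumentError("phase must start at batch $expected_first, got $first_batch"))
        first_batch <= last_batch ||
            throw(ArgumentError("phase ($first_batch, $last_batch) has first_batch > last_batch"))
        (isfinite(multiplier) && multiplier > 0) ||
            throw(ArgumentError("multiplier must be finite and positive, got $multiplier"))
        expected_first = last_batch + 1
    end
    return schedule
end

valid = sketch_validate_schedule([(1, 2, 0.1), (3, 5, 1.0)])
```

A schedule with a gap, such as `[(1, 2, 0.1), (4, 5, 1.0)]`, fails the contiguity check on its second phase.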
DecisionRules.deterministic_equivalent! — Method
deterministic_equivalent!(model, subproblems, state_params_in, state_params_out,
                          initial_state, uncertainties)
Build the deterministic-equivalent (direct transcription) JuMP model by copying all stage subproblems into model. Variables are renamed with a #t suffix to avoid conflicts. Stage coupling is enforced by identifying the realized state variable of stage t with the incoming state parameter of stage t+1.
Returns (model, uncertainties_new) where uncertainties_new maps the original uncertainty parameter refs to the new refs in the combined model.
DecisionRules.discrete_variables — Method
discrete_variables(model::JuMP.Model)
Return all binary or integer variables in model.
DecisionRules.extract_uncertainty_params — Method
extract_uncertainty_params(window_uncertainties_new)
Normalize uncertainty data to a per-stage vector of parameter VariableRefs.
Accepts either:
- Vector{Vector{Tuple{VariableRef, Any}}} (common in this package), or
- Vector{Vector{VariableRef}}.
DecisionRules.get_last_realized_state — Method
get_last_realized_state(window_model, window_state_in_params, window_state_out_params,
                        s_in, targets)
Get the realized end state from the window model after solving.
DecisionRules.get_next_state — Method
get_next_state(subproblem, state_param_in, state_param_out, state_in,
               state_out_target) -> Vector
Return the realized state from the most recent solve of subproblem by reading the values of the realized-state variables in state_param_out.
DecisionRules.pdual — Method
pdual(v::VariableRef) -> Float64
Compute $∂Q/∂p$ for a JuMP parameter variable $p$ in a solved model, where $Q$ is the optimal objective value. By the envelope theorem / Lagrangian duality this equals the sum of $-\text{coef} \times \text{dual}$ over all constraints where $p$ appears, plus the objective coefficient of $p$.
This is the key quantity in TS-DDR (arXiv:2405.14973): the dual $λ_t$ of the target constraint gives the sensitivity $∂Q/∂\hat{x}_t$ used in Eq. 1.2.
DecisionRules.set_window_uncertainties! — Method
set_window_uncertainties!(window, uncertainty_sample)
Set sampled uncertainty values into the window model parameters.
- window.uncertainty_params[t][i] is the parameter VariableRef in the window model
- uncertainty_sample[global_t][i][2] is the sampled numeric value (original structure)
DecisionRules.windows_equivalent! — Method
windows_equivalent!(model, subproblems, state_params_in, state_params_out, initial_state, uncertainties)
Create a window equivalent without mutating the original subproblems and without adding extra variables/constraints beyond those already present in the subproblems.
DecisionRules.with_sensitivity_solution — Method
with_sensitivity_solution(f, model, integer_strategy)
Run f(model) while model is in a state where duals or DiffOpt sensitivities can be read. Integer strategies that temporarily mutate the model must restore it before returning, including when f throws.
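The restore-on-throw requirement is the usual try/finally pattern. A minimal sketch with a hypothetical stand-in model (a `Dict` with an `:integer` flag) rather than a JuMP model or a real integer strategy:

```julia
function sketch_with_relaxation(f, model::Dict)
    saved = model[:integer]
    model[:integer] = false          # temporary mutation (e.g. relaxing integrality)
    try
        return f(model)
    finally
        model[:integer] = saved      # runs on normal return and when f throws
    end
end

model = Dict{Symbol,Any}(:integer => true)
seen_inside = sketch_with_relaxation(m -> m[:integer], model)
restored = model[:integer]
```

Taking `f` as the first argument also allows the `do`-block call form, which keeps the code that reads duals visually nested inside the temporary state.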
Flux.reset! — Method
Flux.reset!(m::StateConditionedPolicy)
Reset the encoder's recurrent state to Flux.initialstates, e.g. before starting a new rollout.