API Reference

Public API

DecisionRules.ContinuousRelaxationIntegerStrategyType
ContinuousRelaxationIntegerStrategy()

Relax all binary/integer constraints to continuous bounds (binary → [0,1]), solve the resulting LP, and read duals in that relaxed state.

Compared to FixedDiscreteIntegerStrategy:

  • Faster: one LP solve instead of MIP + LP.
  • Smoother gradients: no integer fixing means no zero-gradient dead zones.
  • Less accurate: the LP solution may have fractional integer variables, so the gradient does not correspond to any feasible integer assignment.

A practical pattern is to train with ContinuousRelaxationIntegerStrategy during warmup (smooth landscape for initial learning) and switch to FixedDiscreteIntegerStrategy later (integer-accurate gradients for fine-tuning).

source
DecisionRules.FixedDiscreteIntegerStrategyType
FixedDiscreteIntegerStrategy()

Solve the original model, fix binary/integer variables to their incumbent values, relax integrality, re-solve the fixed continuous model, and read duals or sensitivities in that fixed-incumbent continuous state.

The returned derivative-like information is local to the incumbent integer assignment and should be interpreted as a postprocessing surrogate, not as a full differentiable MIP method.

source
DecisionRules.RolloutEvaluationType
RolloutEvaluation(subproblems, state_params_in, state_params_out, initial_state,
                  scenarios; stride=1, policy_state=:realized)

Evaluation helper that assesses the policy with a stage-wise rollout (the deployment semantics of a target-trajectory policy) on a fixed held-out scenario set. Deterministic-equivalent evaluation re-optimizes all stages jointly and can absorb stage-wise-unfollowable targets through the slack penalty, silently overstating policy quality; the rollout metric is the guard that detects this.

policy_state controls what state is passed back into the policy:

  • :realized pipes the previous realized state into the policy. This is the deployment/closed-loop rollout semantics.
  • :target pipes the previous target state into the policy, matching the deterministic-equivalent target-generation semantics from simulate_states while still solving the stage subproblems sequentially.

scenarios must be a vector of materialized scenarios, sampled once before training (e.g. [DecisionRules.sample(uncertainty_samples) for _ in 1:n]), so every evaluation uses the same fixed set. subproblems may be the training subproblems (all stage parameters are rewritten on every solve) or a separately built copy; when training on a deterministic equivalent, pass the stage-wise subproblems here.

Call evaluation(iter, model), e.g. from within a record callback. Once every stride calls, it rolls the policy out over the fixed set and reports:

  • metrics/rollout_objective_no_deficit: the rollout objective excluding the target-slack penalty term (the operational cost), and
  • metrics/rollout_target_violation_share: the realized slack penalty divided by the full rollout objective (NaN when undefined).

Policy comparisons should only be trusted when the violation share is small (≤ ~0.05); a larger share means the policy's targets are not followable stage by stage and the reported cost is not what deployment would realize. The latest values are kept in last_objective_no_deficit / last_violation_share for custom logging. Calls on batches that are not a multiple of stride are a no-op and leave the cached values unchanged.

source
DecisionRules.SampleLogType
SampleLog(; on_sample=(s, models, sample_log) -> nothing,
          objective_no_deficit_fn=get_objective_no_target_deficit)

Per-sample logger with local cache state for the training loops, patterned after SaveBest. During each batch the training loop calls sample_log(s, det_equivalent_or_subproblems) right after sample s has been simulated (successful solves only; a failed solve throws exactly as before). The default behavior caches, per sample, the full objective (objective_value) and the objective excluding the target-slack penalty term (objective_no_deficit_fn). The cache is cleared at the start of every batch and handed to the per-batch record(sample_log, iter, model) callback.

on_sample is an optional hook called as on_sample(s, models, sample_log) after the default caching. It receives the live JuMP model(s), so it can inspect termination statuses or dump the details of a suspicious sample for debugging — without paying any per-sample logging cost in the default configuration.

source
DecisionRules.StateConditionedPolicyType
StateConditionedPolicy

A policy architecture that separates temporal encoding from state conditioning:

  • encoder: a recurrent cell (LSTMCell/GRUCell/RNNCell, or a Chain of them) that encodes only the uncertainty sequence (temporal dependencies)
  • combiner: a Dense layer that combines the encoder output with the previous state to produce the next state

Flux's recurrent cells are stateless (Flux >= 0.16): each call returns (output, new_state) instead of mutating an internal Recur. StateConditionedPolicy therefore carries the encoder's recurrent state itself in state, threading it through one call per stage. Call Flux.reset! to clear it (back to Flux.initialstates) at the start of a rollout.

Input format: [uncertainty..., previous_state...]

source
DecisionRules.compute_parameter_dualMethod
compute_parameter_dual(model::JuMP.Model, param::JuMP.VariableRef)

Compute the dual value (sensitivity) of a parameter in a solved JuMP model.

The parameter dual represents ∂(objective)/∂(parameter_value) and is computed by:

  1. Finding all constraints where the parameter appears
  2. For each constraint, computing: -coefficient * constraint_dual
  3. For the objective, adding the coefficient (or negative for maximization)
  4. Summing all contributions

This works for any solved model, not just convex ones, as long as dual values are available.
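
As a worked instance of the steps above, consider the small model used in this entry's example: minimize $3x + p$ subject to $x \ge 2p$, $x \ge 0$, with $p = 1$. The check below is a hand calculation, not package output. It assumes the constraint is normalized as $x - 2p \ge 0$ (so the coefficient of $p$ is $-2$) and that the dual $\mu$ of a binding $\ge$ constraint in a minimization problem is nonnegative, which is JuMP's sign convention.

```latex
\begin{aligned}
x^\star(p) &= 2p, \qquad Q(p) = 3\,x^\star(p) + p = 7p, \\
\mu &= 3 \quad \text{(dual of } x - 2p \ge 0\text{, from stationarity in } x\text{)}, \\
\frac{\partial Q}{\partial p}
  &= \underbrace{-(-2)\,\mu}_{\text{constraint term}}
   + \underbrace{1}_{\text{objective term}}
   = 6 + 1 = 7 .
\end{aligned}
```

The two routes agree: differentiating $Q(p) = 7p$ directly gives 7, and so does summing the constraint and objective contributions.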

Arguments

  • model: A solved JuMP model
  • param: A parameter variable (created with @variable(model, p in MOI.Parameter(value)))

Returns

  • The dual value (sensitivity) of the parameter

Example

using JuMP, HiGHS, DecisionRules

model = Model(HiGHS.Optimizer)
@variable(model, x >= 0)
@variable(model, p in MOI.Parameter(1.0))
@constraint(model, con, x >= 2 * p)
@objective(model, Min, 3 * x + p)
optimize!(model)
# The constraint normalizes to x - 2p >= 0, so p has coefficient -2 there.
dual_p = DecisionRules.compute_parameter_dual(model, p)  # Expected: 2 * dual(con) + 1 = 7
source
DecisionRules.create_deficit!Method
create_deficit!(model::JuMP.Model, len::Int; penalty_l1=nothing, penalty_l2=nothing, penalty=nothing)

Create deficit variables to penalize state deviations in a JuMP model.

Supports three modes:

  • L1 norm only: Uses MOI.NormOneCone (default if no penalty specified)
  • L2 squared norm only: Uses sum of squared deviations (solver-compatible alternative to SecondOrderCone)
  • Both norms: Creates both constraints with separate penalties

Arguments

  • model: The JuMP model to add deficit variables to
  • len: Number of deficit variables (typically dimension of state)
  • penalty_l1: Penalty coefficient for L1 norm (NormOneCone). If nothing and L1 is used, defaults to max objective coefficient.
  • penalty_l2: Penalty coefficient for L2 squared norm (sum of squares). If nothing and L2 is used, defaults to max objective coefficient.
  • penalty: Legacy argument. If provided and penalty_l1/penalty_l2 are both nothing, uses this for L1 norm only.

Returns

  • norm_deficit: Single variable representing total penalized deviation (for logging compatibility)
  • _deficit: Vector of deficit variables for each state dimension

Examples

# L1 norm only (default behavior, backwards compatible)
norm_deficit, _deficit = create_deficit!(model, 3; penalty=1000.0)

# L2 norm only
norm_deficit, _deficit = create_deficit!(model, 3; penalty_l2=1000.0)

# Both L1 and L2 norms
norm_deficit, _deficit = create_deficit!(model, 3; penalty_l1=1000.0, penalty_l2=500.0)
source
DecisionRules.default_annealed_scheduleMethod
default_annealed_schedule(num_batches::Int)

Build the default annealed target-penalty schedule over num_batches training batches: multipliers 0.1 -> 1.0 -> 10.0 -> 30.0 with phase lengths proportional to 2/2/4/16 of the horizon (the last phase takes the remainder; every phase keeps at least one batch). For num_batches < 4 the last num_batches multipliers are used, one batch each, so the run always ends at the strong-penalty phase.

Returns a Vector{Tuple{Int,Int,Float64}} of (first_batch, last_batch, multiplier) entries suitable for the penalty_schedule keyword of train_multistage and train_multiple_shooting. Multipliers are applied relative to the penalty the model was built with (the objective coefficient of the norm_deficit variables created by create_deficit!), so with penalty=:auto the effective penalty is multiplier * max |objective coefficient|.

source
DecisionRules.default_recordMethod
default_record(sample_log, iter, model)

Default per-batch recording callback: prints the same two per-batch lines as the historical record_loss default (metrics/loss = mean objective excluding the target-slack penalty, then metrics/training_loss = mean full objective) and returns false (training continues). Return true from a custom record to stop training.

source
DecisionRules.dense_multilayer_nnMethod
dense_multilayer_nn(num_inputs, num_outputs, layers; activation=Flux.relu, dense=Dense)

Create a multi-layer neural network with the specified architecture.

Arguments

  • num_inputs::Int: Number of input features
  • num_outputs::Int: Number of output features
  • layers::Vector{Int}: Hidden layer sizes
  • activation: Activation function (default: Flux.relu)
  • dense: Layer type (Dense, LSTM, etc.)
source
DecisionRules.materialize_tangentMethod
materialize_tangent(x)

Recursively convert ChainRulesCore tangent types (MutableTangent, Tangent) to plain NamedTuples/Arrays that Flux.update! can handle.

This is needed because Zygote produces MutableTangent for mutable structs (like Flux.Recur), but Flux.update!/Optimisers.jl expects plain NamedTuples.

source
DecisionRules.normalize_recur_stateMethod
normalize_recur_state(state)

Return a copy of a Flux.state object where any Recur-like nodes have their state field set to cell.state0. This avoids Flux.loadmodel! tie errors when loading into freshly constructed recurrent layers.

source
DecisionRules.policy_input_dimMethod
policy_input_dim(num_uncertainties, num_states)

Compute the input dimension for a policy network.

Policy networks receive [uncertainty..., previous_state...] as input, so the input dimension is num_uncertainties + num_states.

This format is consistent between subproblems and deterministic equivalent formulations, enabling warmstarting policies trained with det_eq for use with subproblems.

Arguments

  • num_uncertainties::Int: Number of uncertainty parameters per stage
  • num_states::Int: Number of state variables

Returns

  • Int: Total input dimension for the policy network
source
DecisionRules.policy_input_dimMethod
policy_input_dim(uncertainty_samples, initial_state)

Compute the input dimension for a policy network from problem data.

Arguments

  • uncertainty_samples: Uncertainty samples from problem construction
  • initial_state: Initial state vector

Returns

  • Int: Total input dimension for the policy network
source
DecisionRules.predict_window_targetsMethod
predict_window_targets(decision_rule, s_in, uncertainties_vec)

Predict one target per stage in a window. This is an AD-friendly scan:

  target_1 = π([u_1; s_in])
  target_2 = π([u_2; target_1])
  ...

source
DecisionRules.setup_shooting_windowsMethod
setup_shooting_windows(subproblems, state_params_in, state_params_out, initial_state,
                       uncertainties; window_size, model_factory=() -> JuMP.Model())

Build window models for multiple shooting.

Notes:

  • We store only the uncertainty PARAMETER refs (not sample sets) in WindowData.
source
DecisionRules.simulate_multiple_shootingMethod
simulate_multiple_shooting(windows, decision_rule, initial_state, uncertainty_sample, uncertainties_vec)
  • uncertainty_sample: per-stage sampled tuples (param, value) matching your existing sampler output: Vector{Vector{Tuple{VariableRef,<:Real}}}
  • uncertainties_vec: per-stage vectors (Float32) used as policy inputs

Returns total objective across windows. Gradients flow through:

  • targets within each window (via solve_window rrule)
  • realized end state between windows (via solve_window rrule seeding on end vars)
source
DecisionRules.simulate_multistageMethod
simulate_multistage(det_equivalent::JuMP.Model, state_params_in, state_params_out,
                    initial_state, uncertainties, decision_rules) -> Float64

Convenience overload: rolls out decision_rules to produce target states, then calls the deterministic-equivalent simulate_multistage to solve the coupled problem.

source
DecisionRules.simulate_multistageMethod
simulate_multistage(subproblems, state_params_in, state_params_out,
                    initial_state, uncertainties, decision_rules) -> Float64

Stage-wise (single shooting) forward simulation. Rolls decision_rules over uncertainties, solving one subproblem per stage. The realized state from each stage feeds the next via get_next_state. Returns the total objective across all stages (Extension §2, Eq. 2.1–2.4).

source
DecisionRules.simulate_multistageMethod
simulate_multistage(det_equivalent, state_params_in, state_params_out,
                    uncertainties, states) -> Float64

Deterministic-equivalent (direct transcription) forward pass. Sets all parameter values from states and uncertainties into the coupled det_equivalent model, solves it, and returns the objective value (Extension §1, Eq. 1.1).

source
DecisionRules.simulate_stageMethod
simulate_stage(subproblem, state_param_in, state_param_out, uncertainty,
               state_in, state_out_target) -> Float64

Set parameter values on subproblem (incoming state, outgoing target, uncertainty), solve it, and return the objective value. Used as the inner solve in single-shooting rollouts (Extension §2, Eq. 2.1).

source
DecisionRules.simulate_statesMethod
simulate_states(initial_state, uncertainties, decision_rule) -> Vector{Vector}

Roll out decision_rule over uncertainties to produce a target-state trajectory. At each stage the policy receives [uncertainty..., previous_state...] and outputs the next target state. Returns a length-(T+1) vector of states starting with initial_state.
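
The rollout described above can be sketched in plain Julia. This is an illustration only, not the package implementation: toy_policy and rollout_targets are hypothetical names, and the "policy" is a plain function standing in for a trained network.

```julia
# Toy stand-in for a trained policy (hypothetical): maps
# [uncertainty..., previous_state...] to the next target state.
# Here the next target is simply the previous state plus the uncertainty.
toy_policy(input) = input[2:2] .+ input[1:1]

# Roll the policy over the uncertainty sequence, feeding each target back in.
function rollout_targets(initial_state, uncertainties, policy)
    step(prev_state, u) = policy(vcat(u, prev_state))
    # accumulate threads the previous target through the scan.
    targets = accumulate(step, uncertainties; init=initial_state)
    # Prepend the initial state so the result has length T + 1.
    return vcat([initial_state], targets)
end

states = rollout_targets([10.0], [[1.0], [2.0], [3.0]], toy_policy)
```

With three stages the result has four entries, the first being the initial state; each later entry depends only on the previous target and the current uncertainty.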

source
DecisionRules.solve_windowMethod
solve_window(window_model, window_state_in_params, window_state_out_params,
             s_in, targets)

Solve a deterministic-equivalent window model.

Arguments

  • window_model: JuMP model (DiffOpt-enabled) for the window
  • window_state_in_params: Vector of MOI.Parameter vars for window initial state
  • window_state_out_params: per-stage vector of tuples (target_param, realized_var)
  • s_in: numeric initial state
  • targets: vector of numeric targets, one per stage in the window

Returns

  • (objective, s_out): objective value, realized end state (Float32 vector)
source
DecisionRules.state_conditioned_policyMethod
state_conditioned_policy(n_uncertainty, n_state, n_output, layers;
                         activation=Flux.relu, encoder_type=Flux.LSTM)

Create a StateConditionedPolicy with the specified architecture.

Arguments

  • n_uncertainty::Int: Number of uncertainty input dimensions
  • n_state::Int: Number of state dimensions (both input and output)
  • n_output::Int: Number of output dimensions (typically same as n_state)
  • layers::Vector{Int}: Hidden layer sizes for the encoder
  • activation: Activation function for dense layers (default: relu)
  • encoder_type: Recurrent layer/cell type (LSTM, GRU, RNN, or their *Cell variants; default: Flux.LSTM). Must support Flux.initialstates and the explicit-state call (x, state) -> (output, new_state) (Flux >= 0.16).

Architecture

  • Encoder: encoder_type(n_uncertainty => layers[1]) -> ... -> layers[end]
  • Combiner: Dense(layers[end] + n_state => n_output)
source
DecisionRules.train_multiple_shootingMethod
train_multiple_shooting(model, initial_state, windows, uncertainty_sampler; ...)

This mirrors the other training loops:

  • Reuse pre-built window models.
  • For each SGD step, sample uncertainties, build uncertainties_vec for the policy, evaluate simulate_multiple_shooting, and update parameters.
source
DecisionRules.train_multistageMethod
train_multistage(model, initial_state, det_equivalent::JuMP.Model,
                 state_params_in, state_params_out, uncertainty_sampler; kwargs...)

Train a policy with the deterministic equivalent (direct transcription, Extension §1). Each SGD step samples uncertainty trajectories, rolls out target states with Base.accumulate, solves the coupled det_equivalent, and updates model. Gradient: Eq. 1.2, $λ^s ⊙ ∇_θ π$.

source
DecisionRules.train_multistageMethod
train_multistage(model, initial_state, subproblems, state_params_in,
                 state_params_out, uncertainty_sampler; kwargs...)

Train a policy with stage-wise decomposition (single shooting, Extension §2). Each SGD step samples num_train_per_batch uncertainty trajectories, rolls out the policy through simulate_multistage (stage-wise overload), and updates model via the Flux optimizer.

source
DecisionRules.variable_to_parameterMethod
variable_to_parameter(model, variable; initial_value=0.0, deficit=nothing)

Replace a decision variable with an MOI.Parameter and bind them via an equality constraint. When deficit is provided, the constraint becomes variable + deficit == parameter and both the parameter and the deficit variable are returned.

source

Internal Functions

DecisionRules.STRICT_GRADIENTSConstant
STRICT_GRADIENTS

Global flag controlling gradient fallback behavior in rrules.

When false (default), rrule pullbacks return zero gradients with a warning when the solver terminates unsuccessfully — this keeps training alive when a few samples hit numerical trouble.

When true, the same situation throws an error instead. Enable this in tests to verify that controlled test cases never silently fall through to zero gradients:

DecisionRules.STRICT_GRADIENTS[] = true
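
The flag pattern can be sketched in plain Julia as follows. This is a minimal illustration of the documented behavior, not the package's rrule code: STRICT and fallback_gradient are hypothetical names.

```julia
# Hypothetical stand-in for the package flag: a global Ref that can be toggled at runtime.
const STRICT = Ref(false)

# Pass the gradient through on success. On solver failure, return zeros
# with a warning, or throw when the strict flag is set.
function fallback_gradient(solved_ok::Bool, grad::Vector{Float64})
    solved_ok && return grad
    STRICT[] && error("solver terminated unsuccessfully; refusing to return zero gradients")
    @warn "solver terminated unsuccessfully; returning zero gradients"
    return zero(grad)
end

g_ok = fallback_gradient(true, [1.0, -2.0])
g_failed = fallback_gradient(false, [1.0, -2.0])

STRICT[] = true
threw = try
    fallback_gradient(false, [1.0, -2.0])
    false
catch
    true
end
STRICT[] = false  # restore the default
```

Because the flag is global, a test suite that enables it should restore the default afterwards.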
source
ChainRulesCore.rruleMethod

ChainRulesCore.rrule for get_last_realized_state

Computes gradients w.r.t.:

  • s_in (window start numeric state)
  • targets (all target vectors in the window)

Given cotangents (Δs_out), we:

  • seed reverse variables (realized end state vars) with Δs_out
  • reverse_differentiate!
  • read reverse sensitivities w.r.t. window_state_in_params and all target params
source
ChainRulesCore.rruleMethod
ChainRulesCore.rrule(get_next_state, subproblem, state_param_in, state_param_out, state_in, state_out_target)

Correct reverse-mode rule using DiffOpt:

  • Seeds reverse on realized output variables with Δstate_out
  • Calls DiffOpt.reverse_differentiate!
  • Reads sensitivities wrt parameter vars (state_param_in, state_param_out parameters)
  • Returns VJP wrt the numeric inputs state_in and state_out_target

Assumptions:

  • subproblem is a JuMP.Model constructed with Model(() -> DiffOpt.diff_optimizer(...))
  • state_param_in::Vector{JuMP.VariableRef} are JuMP Parameter variables (incoming state parameters)
  • state_param_out::Vector{Tuple{JuMP.VariableRef,JuMP.VariableRef}} holds (target-Parameter variable, realized-state Variable) per component
  • get_next_state(...) updates parameter values, optimize!s, and returns a Vector matching the realized-state variables
source
ChainRulesCore.rruleMethod
ChainRulesCore.rrule(::typeof(simulate_multistage), det_equivalent, state_params_in,
                     state_params_out, uncertainties, states)

Reverse-mode rule for the deterministic-equivalent (full-horizon) solve.

Mathematical basis (TS-DDR, arXiv:2405.14973, Eq. 1.2; Extension §1)

For the coupled problem $Q(w;θ) = \min \sum_t \left( f_t + C_δ\|δ_t\| \right)$ the gradient estimator is:

∇_θ E[Q] ≈ (1/S) Σ_s  λ^s ⊙ ∇_θ π(·; θ)

where $λ_t$ is the dual of the target constraint $x_t + δ_t = \hat{x}_t$. The pullback returns $Δ_{states}$ such that $Δ_{states}[1]$ holds the parameter duals of the initial-state parameters and $Δ_{states}[t+1]$ holds the target-constraint duals $λ_t$ for each stage.

Fallback strategy

Same as simulate_stage: tries compute_parameter_dual first, falls back to DiffOpt reverse differentiation if pdual raises. Solver failure (bad termination status) returns zero gradients or throws depending on STRICT_GRADIENTS.

source
ChainRulesCore.rruleMethod
ChainRulesCore.rrule(::typeof(simulate_stage), subproblem, state_param_in,
                     state_param_out, uncertainty, state_in, state_out)

Reverse-mode rule for a single-stage subproblem solve.

Mathematical basis (TS-DDR, arXiv:2405.14973; Extension §2 Eq. 2.5)

For stage problem $q_t(x_{t-1}, w_t; \hat{x}_t)$, the sensitivities are:

∂q_t/∂(state_in)  = μ_t    (dual of dynamics constraint w.r.t. incoming state)
∂q_t/∂(target)     = λ_t    (dual of target constraint w.r.t. target x̂_t)

These are the Lagrange multipliers that compute_parameter_dual (pdual) extracts from the solved model. This is the preferred path: closed-form and exact whenever the solver exposes constraint duals.

Fallback strategy

  1. pdual (parameter duals) — tried first.
  2. DiffOpt reverse differentiation — if pdual raises (e.g. the optimizer wrapper does not expose conic duals). Computes the same sensitivities via implicit differentiation of the KKT system.
  3. Zero gradients: only when the solver terminated with an unsuccessful status (not OPTIMAL / ALMOST_OPTIMAL / LOCALLY_SOLVED). A warning is emitted. Set DecisionRules.STRICT_GRADIENTS[] = true to throw instead.
source
ChainRulesCore.rruleMethod

ChainRulesCore.rrule for solve_window

Computes gradients w.r.t.:

  • s_in (window start numeric state)
  • targets (all target vectors in the window)

Given cotangents (Δobj_val), we:

  • seed reverse variables (objective and realized end state vars) with Δobj_val
  • reverse_differentiate!
  • read reverse sensitivities w.r.t. window_state_in_params and all target params
source
ChainRulesCore.rruleMethod

ChainRulesCore.rrule(::typeof(set_window_uncertainties!), window::WindowData, uncertainty_sample)

Declare set_window_uncertainties! as non-differentiable (mutates solver state).

source
DecisionRules._apply_deficit_penalty_multiplier!Method
_apply_deficit_penalty_multiplier!(model::JuMP.Model, bases::Dict{VariableRef,Float64}, multiplier::Real) -> JuMP.Model
_apply_deficit_penalty_multiplier!(models::Vector{JuMP.Model}, bases::Vector, multiplier::Real) -> Vector{JuMP.Model}

Mutate model (or each model in models) in place, setting every deficit variable's objective coefficient to multiplier * base using the bases from _deficit_penalty_bases. Return the mutated model(s).

source
DecisionRules._as_cellMethod
_as_cell(layer)

Return the underlying recurrent cell of layer. Flux.LSTM/GRU/RNN wrap a cell (LSTMCell/GRUCell/RNNCell) in a .cell field; if layer has no such field it is already a cell and is returned unchanged.

source
DecisionRules._deficit_penalty_basesMethod
_deficit_penalty_bases(model::JuMP.Model; deficit_name="norm_deficit") -> Dict{VariableRef,Float64}
_deficit_penalty_bases(models::Vector{JuMP.Model}; deficit_name="norm_deficit") -> Vector{Dict{VariableRef,Float64}}

Capture the current objective coefficient of every deficit variable as the multiplier base for _apply_deficit_penalty_multiplier!. A variable counts as a deficit variable if deficit_name occurs in its name and its linear objective coefficient (see _linear_objective_coefficient) is nonzero.

Must be called before any penalty_schedule multiplier is applied, so the captured coefficients reflect the as-built penalties.
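
Why the bases must be captured first can be seen in a small plain-Julia sketch. It uses a Dict of name => coefficient as a hypothetical stand-in for a JuMP objective; capture_bases and apply_multiplier! are illustrative names, not the package functions.

```julia
# Hypothetical stand-in for a model objective: variable name => linear coefficient.
objective = Dict("norm_deficit[1]" => 1000.0, "generation" => 5.0, "norm_deficit[2]" => 0.0)

# Capture the as-built coefficient of every variable whose name contains
# `deficit_name` and whose coefficient is nonzero.
capture_bases(obj; deficit_name="norm_deficit") =
    Dict(k => v for (k, v) in obj if occursin(deficit_name, k) && v != 0.0)

# Set each deficit coefficient to multiplier * base (not multiplier * current value).
function apply_multiplier!(obj, bases, multiplier)
    for (name, base) in bases
        obj[name] = multiplier * base
    end
    return obj
end

bases = capture_bases(objective)
apply_multiplier!(objective, bases, 0.1)   # coefficient becomes 100.0
apply_multiplier!(objective, bases, 10.0)  # coefficient becomes 10000.0, not 1000.0
```

Scaling from the stored base rather than from the current coefficient keeps successive schedule phases from compounding.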

source
DecisionRules._get_dual_from_objectiveMethod
_get_dual_from_objective(model::JuMP.Model, param::JuMP.VariableRef)

Get the dual contribution from the objective function. If parameter appears in the objective with coefficient c, contribution is:

  • +c for minimization
  • -c for maximization
source
DecisionRules._get_dual_from_quadratic_constraintsMethod
_get_dual_from_quadratic_constraints(model::JuMP.Model, param::JuMP.VariableRef, S)

Get dual contribution from quadratic constraints of set type S. Handles both affine terms (parameter appears linearly) and quadratic terms (parameter appears in products p*v or p*p).

source
DecisionRules._get_parameter_coefficient_from_quadraticMethod
_get_parameter_coefficient_from_quadratic(expr::JuMP.GenericQuadExpr, param::JuMP.VariableRef)

Get the effective coefficient of a parameter from quadratic terms. For terms like coef * p * v, the effective coefficient is coef * value(v). For terms like coef * p * p, the effective coefficient is 2 * coef * value(p).
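
Both rules are just the partial derivative of the term with respect to p, evaluated at the current solution. The plain-Julia check below compares them against a central finite difference; the function names and the numeric values are chosen purely for illustration.

```julia
# Effective coefficient of p in a bilinear term coef * p * v: d/dp = coef * v.
bilinear_coefficient(coef, v_value) = coef * v_value

# Effective coefficient of p in a squared term coef * p * p: d/dp = 2 * coef * p.
squared_coefficient(coef, p_value) = 2 * coef * p_value

# Compare both against a central finite difference at p = 1.5, v = 4.0.
p0, v0, h = 1.5, 4.0, 1e-6
fd(f) = (f(p0 + h) - f(p0 - h)) / (2h)

fd_bilinear = fd(p -> 3.0 * p * v0)   # term 3.0 * p * v
fd_squared = fd(p -> 0.5 * p * p)     # term 0.5 * p * p
```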

source
DecisionRules._init_recurrent_stateMethod
_init_recurrent_state(encoder)

Return the initial recurrent state for encoder: Flux.initialstates(encoder) for a single cell, or a tuple of per-layer initial states for a Chain of cells.

source
DecisionRules._linear_objective_coefficientMethod
_linear_objective_coefficient(model::JuMP.Model, variable::VariableRef) -> Float64

Return variable's linear coefficient in model's objective, or 0.0 if variable does not appear in it. Supports affine and quadratic objectives (quadratic terms are ignored); throw ArgumentError for any other objective type.

source
DecisionRules._penalty_multiplier_forMethod
_penalty_multiplier_for(schedule, iter::Int) -> Float64

Return the multiplier of the phase containing batch iter. If iter is past the schedule's last phase, return that phase's multiplier (hold the final value steady).
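
A minimal plain-Julia sketch of this lookup, assuming the (first_batch, last_batch, multiplier) tuple format described for penalty_schedule. The name multiplier_for and the hand-written schedule are illustrative only.

```julia
# Return the multiplier of the phase containing `iter` (assumed >= 1).
# Past the last phase, hold the final multiplier.
function multiplier_for(schedule::Vector{Tuple{Int,Int,Float64}}, iter::Int)
    for (first_batch, last_batch, multiplier) in schedule
        first_batch <= iter <= last_batch && return multiplier
    end
    return schedule[end][3]
end

# Hand-written schedule in the (first_batch, last_batch, multiplier) format.
schedule = [(1, 2, 0.1), (3, 4, 1.0), (5, 8, 10.0), (9, 24, 30.0)]

m_first = multiplier_for(schedule, 1)
m_mid = multiplier_for(schedule, 6)
m_past = multiplier_for(schedule, 100)
```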

source
DecisionRules._record_loss_adapterMethod
_record_loss_adapter(record_loss)

Adapt the deprecated 4-argument record_loss(iter, model, loss, tag) callback to the record(sample_log, iter, model) interface, reproducing the historical two-call contract: record_loss is called first with tag = "metrics/loss", and only if that returns false is it called again with tag = "metrics/training_loss". Return the result of whichever call is made last.
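
The two-call contract can be sketched in plain Julia as below. This is an assumption-laden illustration: the sample_log fields objective and objective_no_deficit and the helper names are hypothetical, and the real cache layout may differ.

```julia
mean_of(xs) = sum(xs) / length(xs)

# Wrap a legacy 4-argument callback into a (sample_log, iter, model) callback.
function adapt_record_loss(record_loss)
    return function (sample_log, iter, model)
        # First call: mean objective excluding the target-slack penalty.
        stop = record_loss(iter, model, mean_of(sample_log.objective_no_deficit), "metrics/loss")
        stop && return true
        # Second call only if the first did not request a stop.
        return record_loss(iter, model, mean_of(sample_log.objective), "metrics/training_loss")
    end
end

calls = String[]
legacy(iter, model, loss, tag) = (push!(calls, tag); false)

record = adapt_record_loss(legacy)
# A NamedTuple stands in for the sample log (hypothetical field names).
stopped = record((objective=[12.0, 8.0], objective_no_deficit=[10.0, 6.0]), 1, nothing)
```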

source
DecisionRules._resolve_penalty_scheduleMethod
_resolve_penalty_schedule(penalty_schedule, num_batches::Int)

Resolve the penalty_schedule keyword of train_multistage and train_multiple_shooting into a Vector{Tuple{Int,Int,Float64}} of (first_batch, last_batch, multiplier) phases, or nothing if penalty scaling is disabled.

source
DecisionRules._resolve_recordMethod
_resolve_record(record, record_loss)

Resolve the record/record_loss keywords of the training loops into a single per-batch callback. Return record unchanged if record_loss is nothing; otherwise require record === default_record and return _record_loss_adapter(record_loss). Throw ArgumentError if both a custom record and a record_loss are given.

source
DecisionRules._step_encoderMethod
_step_encoder(encoder, x, state) -> (output, new_state)

Advance encoder by one step on input x from recurrent state, returning the output and the updated state. For a Chain of cells, each layer's output feeds the next and each layer's state is threaded independently.
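
The state threading can be sketched without Flux. ToyCell and step_chain below are hypothetical stand-ins; only the (x, state) -> (output, new_state) calling convention is taken from the description above.

```julia
# Toy stateless cell (hypothetical): returns (output, new_state) instead of mutating itself.
struct ToyCell
    decay::Float64
end

function (c::ToyCell)(x, h)
    h_new = c.decay .* h .+ x
    return h_new, h_new
end

# Step a tuple of cells: each layer's output feeds the next layer,
# and each layer's state is threaded separately.
function step_chain(cells::Tuple, x, states::Tuple)
    new_states = ()
    for (cell, h) in zip(cells, states)
        x, h_new = cell(x, h)
        new_states = (new_states..., h_new)
    end
    return x, new_states
end

cells = (ToyCell(0.5), ToyCell(0.0))
states = ([0.0], [0.0])
out1, states = step_chain(cells, [1.0], states)
out2, states = step_chain(cells, [1.0], states)
```

The caller owns the state tuple and passes it back in on the next step, which is what lets a rollout start again from a fresh initial state.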

source
DecisionRules._target_violation_shareMethod
_target_violation_share(objective::Real, objective_no_deficit::Real) -> Float64

Return the target-violation share, (objective - objective_no_deficit) / objective. Return NaN if either input is nonfinite or abs(objective) <= 1e-12 (share undefined).
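
A plain-Julia sketch of the formula and its NaN guards, written from the description above rather than copied from the package. The name violation_share and the sample numbers are illustrative.

```julia
# Share of the objective that comes from the target-slack penalty.
function violation_share(objective::Real, objective_no_deficit::Real)
    (isfinite(objective) && isfinite(objective_no_deficit)) || return NaN
    abs(objective) <= 1e-12 && return NaN   # share undefined near zero
    return (objective - objective_no_deficit) / objective
end

share = violation_share(105.0, 100.0)   # penalty of 5.0 out of 105.0
trusted = share <= 0.05                 # threshold suggested for RolloutEvaluation
```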

source
DecisionRules._validate_penalty_scheduleMethod
_validate_penalty_schedule(schedule) -> typeof(schedule)

Validate an explicit penalty_schedule, a Vector of (first_batch, last_batch, multiplier) tuples: phases must be non-empty, contiguous, start at batch 1, satisfy first_batch <= last_batch, and have finite positive multipliers. Return schedule unchanged, or throw ArgumentError describing the first violation found.
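
The listed checks can be sketched in plain Julia as follows. This is an illustration of the rules as stated, not the package code; validate_schedule and the error messages are hypothetical.

```julia
function validate_schedule(schedule)
    isempty(schedule) && throw(ArgumentError("schedule must be non-empty"))
    expected_first = 1   # the first phase must start at batch 1
    for (i, (first_batch, last_batch, multiplier)) in enumerate(schedule)
        first_batch == expected_first ||
            throw(ArgumentError("phase $i must start at batch $expected_first"))
        first_batch <= last_batch ||
            throw(ArgumentError("phase $i has first_batch > last_batch"))
        (isfinite(multiplier) && multiplier > 0) ||
            throw(ArgumentError("phase $i needs a finite positive multiplier"))
        expected_first = last_batch + 1   # contiguity: next phase starts right after
    end
    return schedule
end

ok = validate_schedule([(1, 2, 0.1), (3, 10, 1.0)])

# A gap between batch 2 and batch 4 violates contiguity.
gap_error = try
    validate_schedule([(1, 2, 0.1), (4, 10, 1.0)])
    nothing
catch err
    err
end
```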

source
DecisionRules.deterministic_equivalent!Method
deterministic_equivalent!(model, subproblems, state_params_in, state_params_out,
                          initial_state, uncertainties)

Build the deterministic-equivalent (direct transcription) JuMP model by copying all stage subproblems into model. Variables are renamed with a #t suffix to avoid conflicts. Stage coupling is enforced by identifying the realized state variable of stage t with the incoming state parameter of stage t+1.

Returns (model, uncertainties_new) where uncertainties_new maps the original uncertainty parameter refs to the new refs in the combined model.

source
DecisionRules.extract_uncertainty_paramsMethod
extract_uncertainty_params(window_uncertainties_new)

Normalize uncertainty data to a per-stage vector of parameter VariableRefs.

Accepts either:

  • Vector{Vector{Tuple{VariableRef, Any}}} (common in this package), or
  • Vector{Vector{VariableRef}}.
source
DecisionRules.get_last_realized_stateMethod
get_last_realized_state(window_model, window_state_in_params, window_state_out_params,
                        s_in, targets)

Get the realized end state from the window model after solving.

source
DecisionRules.get_next_stateMethod
get_next_state(subproblem, state_param_in, state_param_out, state_in,
               state_out_target) -> Vector

Return the realized state from the most recent solve of subproblem by reading the values of the realized-state variables in state_param_out.

source
DecisionRules.pdualMethod
pdual(v::VariableRef) -> Float64

Compute $∂Q/∂p$ for a JuMP parameter variable $p$ in a solved model, where $Q$ is the optimal objective value. By the envelope theorem / Lagrangian duality this equals the sum of $-\text{coef} \times \text{dual}$ over all constraints where $p$ appears, plus the objective coefficient of $p$.

This is the key quantity in TS-DDR (arXiv:2405.14973): the dual $λ_t$ of the target constraint gives the sensitivity $∂Q/∂\hat{x}_t$ used in Eq. 1.2.

source
DecisionRules.set_window_uncertainties!Method
set_window_uncertainties!(window, uncertainty_sample)

Set sampled uncertainty values into the window model parameters.

  • window.uncertainty_params[t][i] is the parameter VariableRef in the window model
  • uncertainty_sample[global_t][i][2] is the sampled numeric value (original structure)
source
DecisionRules.windows_equivalent!Method
windows_equivalent!(model, subproblems, state_params_in, state_params_out, initial_state, uncertainties)

Create a window equivalent without mutating the original subproblems and without adding extra variables/constraints beyond those already present in the subproblems.

source
DecisionRules.with_sensitivity_solutionMethod
with_sensitivity_solution(f, model, integer_strategy)

Run f(model) while model is in a state where duals or DiffOpt sensitivities can be read. Integer strategies that temporarily mutate the model must restore it before returning, including when f throws.
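
The restore-on-exit requirement maps naturally onto try/finally. The sketch below uses a hypothetical ToyModel with a single flag in place of a JuMP model, and with_relaxed is an illustrative name, not the package function.

```julia
# Hypothetical toy model: a flag standing in for "integrality is currently relaxed".
mutable struct ToyModel
    relaxed::Bool
end

# Run f(model) in the temporarily mutated state and always restore it afterwards.
function with_relaxed(f, model::ToyModel)
    model.relaxed = true
    try
        return f(model)
    finally
        model.relaxed = false   # runs on normal return and when f throws
    end
end

model = ToyModel(false)
seen_inside = with_relaxed(m -> m.relaxed, model)

threw = try
    with_relaxed(m -> error("dual query failed"), model)
    false
catch
    true
end
```

Taking f as the first argument also allows the do-block form, with_relaxed(model) do m ... end.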

source
Flux.reset!Method
Flux.reset!(m::StateConditionedPolicy)

Reset the encoder's recurrent state to Flux.initialstates, e.g. before starting a new rollout.

source