import Pkg; Pkg.activate(@__DIR__); Pkg.instantiate()

The content of this notebook comes from the CMU course "Optimal-Control-16-745" by Zachary Manchester:

https://github.com/Optimal-Control-16-745/lecture-notebooks/blob/main/Lecture%203/minimization.ipynb

Lecture 3 — Minimization via Newton's Method

Key ideas:

• ∂f/∂x is a row vector; we use ∇f(x) = (∂f/∂x)^T (a column vector)

• Newton for minimization solves: H(x) Δx = -∇f(x)

• Update: x ← x + Δx, where H = ∇²f is the Hessian

• If H ≻ 0 (positive definite), the Newton step is a descent direction

• Regularization ("damped Newton"): add βI until H ≻ 0 to guarantee descent

• Line search: avoid overshooting a minimum (see the preview sketch after this list)
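Taken together, these bullets describe one "guarded" Newton update. As a preview, here is a minimal sketch of such a step for a general f: Rⁿ → R, using ForwardDiff for the derivatives; the quadratic test function fq and the fixed β increment are my own illustrative choices, not from the lecture:

using LinearAlgebra, ForwardDiff

# One regularized Newton step with Armijo backtracking (sketch)
function guarded_newton_step(f, x; β=1.0, b=0.1, c=0.5)
    g = ForwardDiff.gradient(f, x)     # ∇f(x), stored as a column vector
    H = ForwardDiff.hessian(f, x)      # ∇²f(x)
    while !isposdef(H)                 # damped Newton: regularize until H ≻ 0
        H += β*I
    end
    Δx = -(H \ g)                      # descent direction, since H ≻ 0
    α = 1.0                            # backtracking line search (Armijo)
    while f(x + α*Δx) > f(x) + b*α*dot(g, Δx)
        α *= c
    end
    return x + α*Δx
end

# Hypothetical convex quadratic: the full Newton step solves it exactly
fq(x) = 0.5*dot(x, [2.0 0.0; 0.0 10.0], x) - dot([1.0, 1.0], x)
guarded_newton_step(fq, [5.0, -3.0])   # lands on the minimizer [0.5, 0.1] in one step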

In [1]:
using LinearAlgebra
using ForwardDiff
using PyPlot
In [2]:
"""
f(x)


Scalar objective used to illustrate Newton's method on a nonconvex quartic.
 

f(x) = x^4 + x^3 - x^2 - x


This function has both minima and maxima, so a naive Newton step can ascend if
`∇²f(x) < 0` at the current iterate (see lecture notes on sufficient conditions).
"""

function f(x)
    return x.^4 + x.^3 - x.^2 - x   # broadcast (.^) so `f` also works on the vector of plot points
end
Out[2]:
f (generic function with 1 method)
In [3]:
"""
∇f(x)

Analytic gradient (first derivative) of `f`.
"""
function ∇f(x)
    return 4.0*x.^3 + 3.0*x.^2 - 2.0*x - 1.0
end
Out[3]:
∇f (generic function with 1 method)
In [4]:
"""
Analytic Hessian (second derivative) of `f`.
"""

function ∇2f(x)
    return 12.0*x.^2 + 6.0*x - 2.0
end
Out[4]:
∇2f (generic function with 1 method)
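ForwardDiff is imported above but not otherwise used in this notebook; as a quick sanity check (my addition, not in the original), we can confirm the hand-derived derivatives against automatic differentiation at an arbitrary test point:

x0 = 0.3   # arbitrary test point
@assert ∇f(x0) ≈ ForwardDiff.derivative(f, x0)
@assert ∇2f(x0) ≈ ForwardDiff.derivative(∇f, x0)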
In [5]:
x = LinRange(-1.75,1.25,1000)
Out[5]:
1000-element LinRange{Float64, Int64}:
 -1.75, -1.747, -1.74399, -1.74099, …, 1.24099, 1.24399, 1.247, 1.25
In [6]:
p = plot(x,f(x))
Out[6]:
1-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x16801bfa0>
In [7]:
function newton_step(x0)
    # Solve ∇²f(x0)Δx = -∇f(x0) and take the full step (for scalars, A\b is just b/A)
    xn = x0 - ∇2f(x0)\∇f(x0)
end
Out[7]:
newton_step (generic function with 1 method)
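Before stepping through the updates one at a time with plots, it may help to see the whole iteration at once. A small driver loop (my own sketch, with an ad-hoc gradient tolerance) runs newton_step to convergence from x = -1.5:

xk = let x = -1.5
    while abs(∇f(x)) > 1e-8   # stop when the gradient (nearly) vanishes
        x = newton_step(x)
    end
    x
end
# converges to x = -1, where ∇2f(-1) = 4 > 0, i.e. a local minimum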

Let's check that our Newton step implementation does what we expect in a "nice" region of the function (think basin of attraction).

In [8]:
xguess = -1.5
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[8]:
1-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1685ec460>
In [9]:
xnew = newton_step(xguess[end])
xguess = [xguess xnew]
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[9]:
2-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1685ec730>
 PyObject <matplotlib.lines.Line2D object at 0x1685ec820>

Let's now try a different starting point for Newton to motivate the need for "globalization strategies" that make Newton's method work more reliably and more generally!

In [10]:
xguess = 0.0
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[10]:
1-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1693ba920>
In [11]:
xnew = newton_step(xguess[end])
xguess = [xguess xnew]
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[11]:
2-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1694506d0>
 PyObject <matplotlib.lines.Line2D object at 0x1694507c0>

Wow! Newton actually pushed us to a worse solution with a higher objective value. Why?

Globalization Strategy 1: Regularization

In [12]:
∇2f(0.0)
Out[12]:
-2.0
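The Hessian is negative at x = 0, so the local quadratic model is an inverted parabola: Newton jumps toward its stationary point, which means stepping uphill. Iterating plain newton_step from 0 (my own check, not in the original notebook) converges to the nearby local maximum:

xk = let x = 0.0
    for _ in 1:20          # plain Newton, no safeguards
        x = newton_step(x)
    end
    x
end
(xk, ∇2f(xk))              # x ≈ -0.39 with ∇2f(xk) < 0: a local maximum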
In [13]:
function regularized_newton_step(x0)
    β = 1.0
    H = ∇2f(x0)
    # Shift the Hessian by βI until it is positive definite ("damped Newton"),
    # which guarantees the resulting step is a descent direction
    while !isposdef(H)
        H = H + β*I
    end
    xn = x0 - H\∇f(x0)
end
Out[13]:
regularized_newton_step (generic function with 1 method)
In [14]:
xguess = 0.0
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[14]:
1-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x170479de0>
In [15]:
xnew = regularized_newton_step(xguess[end])
xguess = [xguess xnew]
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[15]:
2-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1704eba00>
 PyObject <matplotlib.lines.Line2D object at 0x1704ebaf0>

This is better: the regularized step now moves in a descent direction, but we overshot the minimum! We will fix this with a line search.
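To quantify the overshoot (my own check, not in the original notebook): at x0 = 0 the loop shifts H from -2 up to +1, so Δx = -(1)\(-1) = 1 and the step lands at x = 1.0, while the nearby minimizer is the positive root of ∇f(x) = (x+1)(4x² - x - 1):

xstar = (1 + sqrt(17))/8   # ≈ 0.64, the right-hand local minimizer
f(xnew), f(xstar)          # f(1.0) = 0.0 = f(0.0): the full step gave no decrease at all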

Globalization Strategy 2: Line Search

In [16]:
function backtracking_regularized_newton_step(x0)
    b = 0.1    # Armijo tolerance: fraction of the predicted decrease we insist on
    c = 0.5    # backtracking factor: shrink α by half on each rejection
    β = 1.0
    H = ∇2f(x0)
    while !isposdef(H)
        H = H + β*I
    end
    Δx = -H\∇f(x0)

    # Backtracking line search: shrink α until the Armijo sufficient-decrease
    # condition f(x0 + αΔx) ≤ f(x0) + bα∇f(x0)ᵀΔx holds
    α = 1.0
    while f(x0 + α*Δx) > f(x0) + b*α*∇f(x0)*Δx
        α = c*α
    end
    print(α)    # show the accepted step length
    xn = x0 + α*Δx
end
Out[16]:
backtracking_regularized_newton_step (generic function with 1 method)
In [17]:
xguess = 0.0
plot(x, f(x))
plot(xguess, f(xguess), "rx")
Out[17]:
1-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x17058d6c0>
In [18]:
xnew = backtracking_regularized_newton_step(xguess[end])
xguess = [xguess xnew]
plot(x, f(x))
plot(xguess, f(xguess), "rx")
0.5
Out[18]:
2-element Vector{PyCall.PyObject}:
 PyObject <matplotlib.lines.Line2D object at 0x1705df850>
 PyObject <matplotlib.lines.Line2D object at 0x1705df940>
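Putting it all together (a wrap-up sketch of my own, not part of the original notebook): iterating the regularized, line-searched step from the same bad starting point now converges to the nearby minimum instead of a maximum:

xk = let x = 0.0
    while abs(∇f(x)) > 1e-8
        x = backtracking_regularized_newton_step(x)
    end
    x
end
(xk, ∇2f(xk))   # x ≈ 0.64 with ∇2f(xk) > 0: a genuine local minimum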