Summer of learning: Week 2
2025 June 01
$\S$1. Goals for next week
I suppose this can be in order of decreasing importance.
- start reading Hubbard and Hubbard's vector calculus; the Colley textbook was fine until around chapter 4 or 5, when it started losing rigor; the appendices with the proofs of the important results were essential
- still have to finish LADR (it turns out that there are a lot of exercises)
- still have to start learning silver (it turns out that coding makes me really annoyed and tired; even if this has stopped being true, I have yet to actually spend enough time on it)
- still have to figure out how to do physics (maybe review the exercises from the earlier Morin chapters, which I skipped because I was lazy)
- Tao Analysis I
$\S$2. LADR
$\S$2.1. Upper triangular matrices
Definition 1. (Upper triangular matrix)
An upper triangular matrix is a matrix such that each entry below the main diagonal is zero.
If a matrix $M$ is upper triangular, then we can say the following about its corresponding operator $T$ (under some basis, say, $\beta=\{v_1,\dots,v_n\}$):
Proposition 2.
The matrix of $T$ under the basis $\beta$ is upper triangular if and only if for each $1\le k\le n$, $Tv_k\in {\operatorname{span}}(v_1,\dots,v_k)$.
Proof.
For the if direction, notice that upper triangularity means
\[ Tv_k=\sum_{i=1}^{k}M_{i,k}v_i, \]
which means $Tv_k\in {\operatorname{span}}(v_1,\dots,v_k)$. The reverse direction follows similarly.
$\square$
Corollary 3.
${\operatorname{span}}(v_1,\dots,v_k)$ is invariant under $T$.
Theorem 4.
Suppose $T\in \mathcal L(V)$ has an upper-triangular matrix with respect to some basis $v_1,\dots,v_n$ of $V$, and the entries on the diagonal of this matrix are $\lambda_1,\dots,\lambda_n$. Then
\[ (T-\lambda_1 I)\dots(T-\lambda_n I)=0. \]
Proof.
This follows by induction on $k$: we claim $(T-\lambda_1 I)\dots(T-\lambda_k I)$ vanishes on ${\operatorname{span}}(v_1,\dots,v_k)$. For the base case, $(T-\lambda_1 I)v_1=0$, so (as the factors commute) any vector $v\in{\operatorname{span}}(v_1)$ satisfies
\[ (T-\lambda_2 I)(T-\lambda_3 I)\dots(T-\lambda_n I)(T-\lambda_1 I)v=0. \]
For the inductive step, $(T-\lambda_k I)v_k\in {\operatorname{span}}(v_1,\dots,v_{k-1})$ by Proposition 2, so $(T-\lambda_1 I)\dots(T-\lambda_k I)v_k=0$ assuming the claim for $k-1$; combined with the inductive hypothesis (and commutativity again), the product vanishes on all of ${\operatorname{span}}(v_1,\dots,v_k)$. Taking $k=n$ gives the theorem.
$\square$
Theorem 5.
Suppose $T\in \mathcal L(V)$ has an upper-triangular matrix with respect to some basis $v_1,\dots,v_n$ of $V$. Then the eigenvalues of $T$ are precisely the entries $\lambda_1,\dots,\lambda_n$ on the diagonal of the matrix.
Proof.
We have that $\lambda_1$ is an eigenvalue of $T$, since $Tv_1=\lambda_1v_1$. For $2\le k\le n$, notice that $T-\lambda_k I$ maps ${\operatorname{span}}(v_1,\dots,v_k)$ into ${\operatorname{span}}(v_1,\dots,v_{k-1})$, and
\[ \dim{\operatorname{span}}(v_1,\dots,v_k)=k,\ \dim{\operatorname{span}}(v_1,\dots,v_{k-1})=k-1. \]
Thus, $T-\lambda_k I$ restricted to ${\operatorname{span}}(v_1,\dots,v_k)$ is not injective, and there exists $v\neq 0$ in ${\operatorname{span}}(v_1,\dots,v_k)$ such that $(T-\lambda_k I)v=0$; this is an eigenvector for $\lambda_k$, as desired.
To show that there exist no other eigenvalues, note that by Theorem 4 the minimal polynomial of $T$ divides $(z-\lambda_1)\dots(z-\lambda_n)$; since every eigenvalue of $T$ is a root of the minimal polynomial, there are no roots other than the $\lambda_k$.
$\square$
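Both theorems are concrete matrix statements, so here's a quick numerical sanity check (my own sketch in Python/numpy, not from the book):
```python
import numpy as np

# A random upper-triangular matrix: the product of (A - lambda_i I) over
# the diagonal entries should vanish (Theorem 4), and the eigenvalues
# should be exactly the diagonal entries (Theorem 5).
rng = np.random.default_rng(0)
A = np.triu(rng.standard_normal((4, 4)))
diag = np.diag(A)
P = np.eye(4)
for lam in diag:
    P = P @ (A - lam * np.eye(4))
print(np.allclose(P, 0))                                                # True
print(np.allclose(np.sort(np.linalg.eigvals(A).real), np.sort(diag)))   # True
```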
$\S$2.2. Diagonalizable operators
Definition 6.
The eigenspace of an operator $T\in \mathcal L(V)$ and eigenvalue $\lambda$ is defined as
\[ E(\lambda, T)={\operatorname{null}\,}(T-\lambda I)=\{v\in V:Tv=\lambda v\}.\]
One could also call it ${\operatorname{ker}}(T-\lambda I)$.
Sums of eigenspaces are direct, since eigenvectors corresponding to distinct eigenvalues are linearly independent. Hence
\[ \dim E(\lambda_1,T)+\dots+\dim E(\lambda_m,T)=\dim (E(\lambda_1,T)\oplus\dots\oplus E(\lambda_m,T))\le \dim V. \]
This fact was proven earlier in the book, but I’ll write it down here:
Theorem 7.
Consider a list of eigenvectors $v_1,\dots,v_n$ corresponding to distinct eigenvalues $\lambda_1,\dots,\lambda_n$. This list is linearly independent.
Proof.
By induction. This is clearly true for $n=1$, so we assume it is true for $n=k-1$ and prove it true for $n=k$.
Let $a_1,\dots,a_k$ be scalars such that
\[ a_1v_1+\dots+a_kv_k=0.\]
Apply $T-\lambda_k I$ to both sides; this gives
\[ a_1(\lambda_1-\lambda_k)v_1+\dots+a_{k-1}(\lambda_{k-1}-\lambda_k)v_{k-1}=0.\]
By the inductive hypothesis $v_1,\dots,v_{k-1}$ are linearly independent, and each $\lambda_i-\lambda_k$ is nonzero, so our original $a_1,\dots, a_{k-1}$ were all zero; then $a_kv_k=0$ forces $a_k=0$ as well, which is the conclusion.
$\square$
Before we write down the condition for diagonalizability, here are some preliminary results:
Proposition 8. Let $T\in \mathcal L(V)$, and let $\lambda_1,\dots,\lambda_m$ denote the distinct eigenvalues of $T$. The following are equivalent.
- (a) $T$ is diagonalizable.
- (b) $V$ has a basis of eigenvectors of $T$.
- (c) $V=E(\lambda_1,T)\oplus\dots\oplus E(\lambda_m,T)$.
- (d) $\dim V=\dim E(\lambda_1,T)+\dots+\dim E(\lambda_m, T)$.
Proof.
We show three equivalences:
- (c)$\iff$(d): Proven by the earlier discussion.
- (a)$\iff$(b): If $v_1,\dots,v_n$ is a basis of eigenvectors with $Tv_k=\lambda_kv_k$ (eigenvalues repeated as necessary), then
\[ \mathcal M(T) =
\begin{bmatrix}
\lambda_1&&&0\\
&\lambda_2&&\\
&&\ddots&\\
0&&&\lambda_n
\end{bmatrix},
\]
which is diagonal; conversely, a diagonal matrix says exactly that each basis vector is an eigenvector.
- (d)$\iff$(b): Given (d), take a basis of each $E(\lambda_k,T)$; we know that adjoining such bases gives another linearly independent list in $V$, which by (d) has length $\dim V$, so putting them all together gives a basis of $V$ consisting of eigenvectors. Conversely, a basis of eigenvectors shows $\dim E(\lambda_1,T)+\dots+\dim E(\lambda_m,T)\ge \dim V$, and the reverse inequality was the earlier discussion.
$\square$
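Condition (d) is the easiest to test numerically. A sketch (mine; using $\dim{\operatorname{null}\,}(A-\lambda I)=n-\operatorname{rank}(A-\lambda I)$):
```python
import numpy as np

def eigenspace_dim_sum(A, tol=1e-8):
    """Sum of dim E(lambda, A) over the distinct eigenvalues lambda, using
    dim null(A - lambda I) = n - rank(A - lambda I)."""
    n = A.shape[0]
    distinct = []
    for lam in np.linalg.eigvals(A):
        if not any(abs(lam - mu) < 1e-6 for mu in distinct):
            distinct.append(lam)
    return sum(n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
               for lam in distinct)

D = np.diag([1.0, 4.0, 3.0, 4.0])       # diagonalizable (it's diagonal)
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])              # Jordan block: not diagonalizable
print(eigenspace_dim_sum(D))  # 4 == dim V
print(eigenspace_dim_sum(J))  # 1 <  dim V
```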
And while I’m at it this result is also needed later:
Theorem 9. (Null space and range of a polynomial in $T$ are invariant under $T$)
Let $p(x)$ be a polynomial. Then ${\operatorname{range}\,} p(T)$ and ${\operatorname{null}\,} p(T)$ are invariant under $T$.
Proof.
For ${\operatorname{null}\,} p(T)$, take a vector $v\in {\operatorname{null}\,} p(T)$. Then
\[ p(T)(Tv)=(p(T)T)v=(Tp(T))v=T(p(T)v)=T0=0, \]
so $Tv\in{\operatorname{null}\,}p(T)$. For ${\operatorname{range}\,} p(T)$, suppose $u\in {\operatorname{range}\,} p(T)$. Then there exists $v\in V$ with $u=p(T)v$, and
\[ Tu=T(p(T)v)=p(T)(Tv), \]
i.e. $Tu\in {\operatorname{range}\,} p(T)$.
$\square$
So if $T$ has $\dim V$ distinct eigenvalues, the largest number possible (why?), this makes $T$ diagonalizable. But this condition isn't necessary, as this matrix shows:
\[
\begin{bmatrix}
1&0&0&0\\
0&4&0&0\\
0&0&3&0\\
0&0&0&4
\end{bmatrix}.
\]
This matrix is diagonal, hence certainly diagonalizable, yet it has only three distinct eigenvalues; its minimal polynomial turns out to be $(x-1)(x-3)(x-4)$ rather than the characteristic polynomial $(x-1)(x-3)(x-4)^2$.
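A one-line check of that claim (again my own numpy sketch):
```python
import numpy as np

# Check that (A - I)(A - 3I)(A - 4I) = 0 for A = diag(1, 4, 3, 4),
# so the minimal polynomial divides (and here equals) (x-1)(x-3)(x-4).
A = np.diag([1.0, 4.0, 3.0, 4.0])
I = np.eye(4)
print(np.allclose((A - I) @ (A - 3 * I) @ (A - 4 * I), 0))  # True
```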
So here is the condition which is both necessary and sufficient:
Theorem 10. (Diagonalizability criterion)
$T$ is diagonalizable if and only if the minimal polynomial of $T$ is of the form
\[ p(z)=(z-\lambda_1)\dots(z-\lambda_m) \]
for distinct $\lambda_1,\dots,\lambda_m$. That is, $p(z)$ splits into linear factors and has no repeated roots.
Proof.
The only-if direction is easy (if $T$ is diagonalizable with distinct eigenvalues $\lambda_1,\dots,\lambda_m$, then $(T-\lambda_1I)\dots(T-\lambda_mI)$ kills a basis of eigenvectors, while no proper factor kills all of them), so we prove the if direction by inducting on $m$. If $m=1$, then $T-\lambda_1I=0$, i.e. $T=\lambda_1I$. So $T$ is diagonalizable.
Now assume the result for minimal polynomials with $m-1$ distinct roots. We have that ${\operatorname{range}\,}(T-\lambda_m I)$ is invariant under $T$ (Theorem 9), so $T|_{{\operatorname{range}\,}(T-\lambda_m I)}$ is an operator on it. If $u\in {\operatorname{range}\,}(T-\lambda_mI)$, then $u=(T-\lambda_mI)v$ for some $v\in V$, so
\[ (T-\lambda_1I)\dots(T-\lambda_{m-1}I)u=(T-\lambda_1I)\dots(T-\lambda_mI)v=0. \]
This shows that $(x-\lambda_1)\dots(x-\lambda_{m-1})$ is a polynomial multiple of the minimal polynomial of $T|_{{\operatorname{range}\,}(T-\lambda_m I)}$. Hence by our induction hypothesis ${\operatorname{range}\,}(T-\lambda_m I)$ has a basis of eigenvectors of $T$.
Now we show that $V={\operatorname{range}\,}(T-\lambda_m I)\oplus {\operatorname{null}\,}(T-\lambda_m I)$, which is enough: adjoining the basis above to a basis of ${\operatorname{null}\,}(T-\lambda_m I)$ (whose vectors are eigenvectors with eigenvalue $\lambda_m$) then gives a basis of $V$ consisting of eigenvectors. By rank–nullity it suffices to show the intersection is trivial. If a vector $u$ is in both, then $Tu=\lambda_mu$ and $u=(T-\lambda_mI)w$ for some $w$, so
\[
0=(T-\lambda_1I)\dots(T-\lambda_{m-1}I)(T-\lambda_mI)w=(T-\lambda_1I)\dots(T-\lambda_{m-1}I)u=(\lambda_m-\lambda_1)\dots(\lambda_m-\lambda_{m-1})u,
\]
and since the $\lambda_i$ are distinct, the scalar is nonzero, forcing $u=0$.
$\square$
This means that one can restrict a diagonalizable $T\in \mathcal L(V)$ to an invariant subspace $U\subseteq V$, and $T|_U$ will still be diagonalizable. (Look at the minimal polynomials of both: the minimal polynomial of $T|_U$ divides that of $T$.)
Theorem 11. (Gershgorin disk theorem)
Suppose $T\in \mathcal L(V)$ and $v_1,\dots,v_n$ is a basis of $V$; let $A=\mathcal M(T,(v_1,\dots,v_n))$. Then each eigenvalue of $T$ is contained in some Gershgorin disk of $T$, where the Gershgorin disks with respect to this basis are defined as
\[\{z\in {\mathbb F} \mid |z-A_{j,j}|\le \sum_{\substack{1\le k\le n\\ k\neq j}}|A_{j,k}| \}\]
for $1\le j\le n$.
Proof.
Let $w$ be an eigenvector and $\lambda$ its corresponding eigenvalue. Then there exist $c_1, \dots, c_n\in {\mathbb F}$ such that
\[w=c_1 v_1+\dots+c_nv_n.\]
Applying $T$ to both sides gives
\begin{align*}
\lambda c_1v_1+\dots+\lambda c_n v_n &= \sum_{j=1}^{n} c_j Tv_j \\
&= \sum_{i=1}^{n} \left( \sum_{k=1}^{n} A_{i,k}c_k \right)v_i.
\end{align*}
Now pick the index $j$ for which $|c_j|$ is largest; note $c_j\neq 0$ since $w\neq 0$. Comparing the coefficients of $v_j$,
\begin{align*}
\lambda c_j &= \sum_{1\le k\le n}A_{j,k}c_k \\
\implies |\lambda - A_{j,j}| &= \left| \sum_{1\le k\le n,k\neq j} A_{j,k}\frac{c_k}{c_j} \right|\\
&\le \sum_{1\le k\le n,k\neq j} |A_{j,k}|\left|\frac{c_k}{c_j}\right|\le\sum_{1\le k\le n,k\neq j} |A_{j,k}|,
\end{align*}
where the last step uses $|c_k/c_j|\le 1$.
$\square$
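A quick numerical check of the theorem (my sketch):
```python
import numpy as np

# Every eigenvalue of a random matrix should land in at least one
# Gershgorin disk |z - A[j,j]| <= sum of |A[j,k]| over k != j.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # off-diagonal row sums
print(all(np.any(np.abs(lam - centers) <= radii + 1e-9)
          for lam in np.linalg.eigvals(A)))  # True
```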
$\S$2.3. Exercises
Problem 1. (LADR 5C.3)
Suppose $T\in \mathcal L(V)$ is invertible and $v_1, \dots, v_n$ is a basis of $V$ with respect to which the matrix of $T$ is upper triangular, with $\lambda_1, \dots, \lambda_n$ on the diagonal. Show that the matrix of $T^{-1}$ is also upper triangular with respect to the basis $v_1, \dots, v_n$, with
\[ \frac{1}{\lambda_1},\dots,\frac{1}{\lambda_n} \]
on the diagonal.
Solution.
Because the matrix of $T$ is upper triangular, each ${\operatorname{span}}(v_1,\dots,v_k)$ is invariant under $T$; since $T$ is invertible, $T$ restricted to this span is invertible on it, so $T^{-1}$ also maps ${\operatorname{span}}(v_1,\dots,v_k)$ into itself. Hence $T^{-1}v_k\in{\operatorname{span}}(v_1,\dots,v_k)$, which by Proposition 2 means the matrix of $T^{-1}$ is also upper triangular.
But then consider the matrix of $TT^{-1}=I$: the diagonal entries of a product of upper-triangular matrices are the products of the corresponding diagonal entries, and here they must all be $1$, so the entries on the diagonal of the matrix of $T^{-1}$ must be
\[ \frac{1}{\lambda_1},\dots,\frac{1}{\lambda_n}. \]
$\square$
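A numerical sanity check of both conclusions (my sketch; the diagonal is fixed by hand to keep the matrix comfortably invertible):
```python
import numpy as np

# 5C.3 numerically: the inverse of an upper-triangular matrix is upper
# triangular, with the reciprocals of the original diagonal entries.
rng = np.random.default_rng(2)
diag = np.array([1.0, 2.0, -3.0, 0.5])
T = np.triu(rng.standard_normal((4, 4)), k=1) + np.diag(diag)
Tinv = np.linalg.inv(T)
print(np.allclose(Tinv, np.triu(Tinv)))      # True: still upper triangular
print(np.allclose(np.diag(Tinv), 1 / diag))  # True: reciprocal diagonal
```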
Problem 2. (LADR 5C.6)
Suppose ${\mathbb F}={\mathbb C}$, $V$ is finite-dimensional, and $T\in \mathcal L(V)$. Prove that if $k\in \{1,\dots,\dim V\}$, then $V$ has a $k$-dimensional subspace invariant under $T$.
Solution.
Because ${\mathbb F}={\mathbb C}$, there is some basis for which $\mathcal M(T)$ is upper triangular. Take the first $k$ vectors in this basis; their span is $k$-dimensional and invariant under $T$ (Corollary 3).
$\square$
Problem 3. (LADR 5C.8)
Suppose $B$ is a square matrix with complex entries. Prove that there exists an invertible square matrix $A$ with complex entries such that $A^{-1} BA$ is an upper-triangular matrix.
Solution.
Let $T\in\mathcal L({\mathbb C}^n)$ be the operator whose matrix in the standard basis is $B$. Since we are over ${\mathbb C}$, there is a basis in which the matrix of $T$ is upper triangular; let $A$ be the change-of-basis matrix whose columns are those basis vectors. Then by the change-of-basis formula, $A^{-1} BA$ is the matrix of $T$ in that basis, which is the desired upper-triangular matrix.
$\square$
Problem 4. (LADR 5C.11)
Suppose ${\mathbb F}={\mathbb C}$ and $V$ is finite-dimensional. Prove that if $T\in \mathcal L(V)$, then there exists a basis of $V$ with respect to which $T$ has a lower-triangular matrix.
Solution.
Take the basis $v_1,\dots,v_n$ under which $T$ has an upper-triangular matrix and reverse it: with respect to $v_n,\dots,v_1$, the matrix of $T$ is lower triangular.
$\square$
Problem 5. (LADR 5D.1)
Suppose $V$ is a finite-dimensional complex vector space and $T\in \mathcal L(V)$.
- Prove that if $T^4=I$, then $T$ is diagonalizable.
- Prove that if $T^4=T$, then $T$ is diagonalizable.
- Give an example of an operator $T\in \mathcal L({\mathbb C}^4)$ such that $T^4=T^2$ and $T$ is not diagonalizable.
Solution.
- Since $T^4=I$, the minimal polynomial divides $x^4-1=(x-1)(x+1)(x-i)(x+i)$, which has no repeated roots.
- Since $T^4=T$, the minimal polynomial divides $x^4-x=x(x-1)(x-\omega)(x-\omega^2)$, where $\omega=e^{2\pi i/3}$; again there are no repeated roots.
- Consider
\[ \begin{bmatrix}
0&0&0&0\\
0&0&0&0\\
1&0&0&0\\
0&1&0&0
\end{bmatrix},\]
which, under the standard basis, sends $e_1\to e_3,e_2\to e_4,e_3\to 0,e_4\to 0$; hence $T^2=0$ and $T^4=T^2$. It is not diagonalizable: it is nilpotent but nonzero, so its minimal polynomial is $x^2$, which has a repeated root.
$\square$
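Checking the part (c) matrix numerically (my sketch):
```python
import numpy as np

# The matrix from part (c): sends e1 -> e3, e2 -> e4, and e3, e4 -> 0.
T = np.zeros((4, 4))
T[2, 0] = T[3, 1] = 1.0
print(np.allclose(np.linalg.matrix_power(T, 4),
                  np.linalg.matrix_power(T, 2)))  # True: T^4 = T^2 (both 0)
# Not diagonalizable: the only eigenvalue is 0, yet
# dim E(0, T) = 4 - rank(T) = 2 < 4, so eigenvectors can't span C^4.
print(np.linalg.matrix_rank(T))  # 2
```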
Problem 6. (LADR 5D.2)
Suppose $T\in \mathcal L(V)$ has a diagonal matrix $A$ with respect to some basis of $V$. Prove that if $\lambda \in {\mathbb F}$ then $\lambda$ appears on the diagonal of $A$ precisely $\dim E(\lambda, T)$ times.
Solution.
Said basis of $V$ must be a basis of eigenvectors of $T$. The basis vectors with eigenvalue $\lambda$ lie in $E(\lambda,T)$, and they in fact span it: writing any $v\in E(\lambda, T)$ in this basis, applying $T-\lambda I$ kills $v$ but scales the coefficient of each basis vector with eigenvalue $\mu\neq\lambda$ by $\mu-\lambda\neq 0$, so those coefficients vanish. Hence exactly $\dim E(\lambda, T)$ of the basis vectors have eigenvalue $\lambda$, and these correspond precisely to the columns with $\lambda$ on the diagonal of $A$.
$\square$
Problem 7. (LADR 5D.3)
Suppose $V$ is finite-dimensional and $T\in \mathscr L(V)$. Prove that if the operator $T$ is diagonalizable, then $V={\operatorname{null}\,} T\oplus {\operatorname{range}\,} T$.
Solution.
Let $\lambda_1,\dots,\lambda_m$ be the distinct eigenvalues of $T$, so that $V=E(\lambda_1,T)\oplus\dots\oplus E(\lambda_m,T)$ by Proposition 8. If $0$ is not an eigenvalue, then $T$ is injective, so ${\operatorname{null}\,} T=\{0\}$ and ${\operatorname{range}\,} T=V$ by rank–nullity, and we're done. Otherwise, say $\lambda_1=0$. Then ${\operatorname{null}\,} T=E(0,T)$, and every $E(\lambda_k,T)$ with $\lambda_k\neq 0$ lies inside ${\operatorname{range}\,} T$ (if $Tv=\lambda_kv$ then $v=T(v/\lambda_k)$), so
\[ {\operatorname{range}\,} T\supseteq E(\lambda_2,T)\oplus\dots\oplus E(\lambda_m,T). \]
Comparing dimensions using rank–nullity shows this containment is an equality, hence $V={\operatorname{null}\,} T\oplus {\operatorname{range}\,} T$.
$\square$
Problem 8. (LADR 5D.4)
Suppose $V$ is finite-dimensional and $T\in \mathscr L(V)$. Prove that the following are equivalent.
- $V={\operatorname{null}\,} T\oplus {\operatorname{range}\,} T$.
- $V={\operatorname{null}\,} T+{\operatorname{range}\,} T$.
- ${\operatorname{null}\,} T\cap {\operatorname{range}\,} T=\{0\}$.
Solution.
By rank–nullity, $\dim V=\dim {\operatorname{null}\,} T+\dim {\operatorname{range}\,} T$. So if (b) holds, then
\[ \dim ({\operatorname{null}\,} T\cap {\operatorname{range}\,} T)=\dim {\operatorname{null}\,} T+\dim {\operatorname{range}\,} T-\dim ({\operatorname{null}\,} T+{\operatorname{range}\,} T)=0, \]
i.e. the sum is direct and (a) and (c) hold. Conversely, (a) trivially implies (b), and if (c) holds then ${\operatorname{null}\,} T+{\operatorname{range}\,} T$ is a direct sum of dimension $\dim V$, which gives (a).
$\square$
Problem 9. (LADR 5D.5)
Suppose $V$ is a finite-dimensional complex vector space and $T\in \mathscr L(V)$. Prove that $T$ is diagonalizable if and only if
\[ V={\operatorname{null}\,}(T-\lambda I)\oplus {\operatorname{range}\,}(T-\lambda I) \]
for every $\lambda \in {\mathbb C}$.
Solution.
We will use the fact that $T-\lambda_1 I$ and $T-\lambda_2 I$ commute. This is true because
\[(T-\lambda_1 I)(T-\lambda 2 I)=(T-\lambda_2 I)(T-\lambda_1I)=T^2-(\lambda_1+\lambda_2)T+\lambda_1\lambda_2 I.\]
If $\lambda$ is not an eigenvalue of $V$, then ${\operatorname{null}\,}(T-\lambda I)=\{0\}$. Then $V={\operatorname{null}\,}(T-\lambda I)\oplus {\operatorname{range}\,}(T-\lambda I)$ if and only if $T-\lambda I$ is surjective, which it is if $T$ is diagonalizable.
So assume that $\lambda$ (henceforth $\lambda_i$) is an eigenvalue of $V$, and let the minimal polynomial of $T$ be
\[ p(x)=(x-\lambda_1)\dots(x-\lambda_n). \]
Take a vector $u\in {\operatorname{null}\,}(T-\lambda_iI)\cap {\operatorname{range}\,}(T-\lambda_iI)$; there exists a vector $v\in V$ such that $(T-\lambda_i)v=u$, and hence
\begin{align*}
(T-\lambda_1I)\dots(T-\lambda_n I)v &= 0 \\
(T-\lambda_1I)\dots(T-\lambda_nI)(T-\lambda_iI)v &= 0\\
(T-\lambda_1I)\dots(T-\lambda_nI)u &= 0\\
(\lambda_i-\lambda_1)\dots(\lambda_i-\lambda_n)u &= 0.
\end{align*}
If $p(x)$ has any double roots, then $u$ does not have to be zero, and ${\operatorname{null}\,} (T-\lambda_i I)\cap {\operatorname{range}\,}(T-\lambda_i I)\neq \{0\}$. However if $p(x)$ does not (which happens if and only if $T$ is diagonalizable), then $u$ must be zero and ${\operatorname{null}\,}(T-\lambda_iI)\cap {\operatorname{range}\,}(T-\lambda_iI)=\{0\}$ i.e. $V={\operatorname{null}\,}(T-\lambda_iI)\cap {\operatorname{range}\,}(T-\lambda_iI)$.
$\square$
Problem 10. (LADR 5D.15)
Suppose $V$ is a finite-dimensional complex vector space, $T\in \mathcal L(V)$, and $p$ is the minimal polynomial of $T$. Prove that the following are equivalent:
- $T$ is diagonalizable.
- There does not exist $\lambda\in {\mathbb C}$ such that $p$ is a polynomial multiple of $(z-\lambda)^2$.
- $p$ and its derivative $p’$ have no zeros in common.
- The greatest common divisor of $p$ and $p’$ is the constant polynomial $1$.
Solution.
The equivalence of (a) and (b) is exactly Theorem 10 (over ${\mathbb C}$ the minimal polynomial always splits, so (b) says it has no repeated roots). (c) and (d) are equivalent because over ${\mathbb C}$, $p$ and $p'$ have a common zero if and only if they have a common non-constant factor. Finally, (b) and (c) are equivalent by the well-known fact that $p$ and $p'$ share a root if and only if $p$ has a repeated root.
$\square$
Problem 11. (LADR 5D.16)
Suppose that $T\in \mathscr L(V)$ is diagonalizable. Let $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$. Prove that a subspace $U$ of $V$ is invariant under $T$ if and only if there exist subspaces $U_1, \dots, U_m$ of $V$ such that $U_k\subseteq E(\lambda_k,T)$ for each $k$ and $U=U_1\oplus \dots\oplus U_m$.
Solution.
For the forward direction, consider $T|_U$, which must also be diagonalizable, as its minimal polynomial divides the minimal polynomial of $T$. Hence there is a basis of eigenvectors of $T|_U$ which spans $U$ (and these eigenvectors are also eigenvectors of $T$ in $V$).
Then take $U_i$ to be the span of those basis eigenvectors corresponding to the eigenvalue $\lambda_i$; the sum $U=U_1\oplus \dots \oplus U_m$ is direct because eigenvectors corresponding to distinct eigenvalues are linearly independent. Conversely, if $U=U_1\oplus\dots\oplus U_m$ with each $U_k\subseteq E(\lambda_k,T)$, then $T$ multiplies each $U_k$ by $\lambda_k$, so $TU\subseteq U$.
$\square$
Problem 12. (LADR 5D.22)
Suppose $T\in \mathscr L(V)$ and $A$ is an $n$-by-$n$ matrix that is the matrix of $T$ with respect to some basis of $V$. Prove that if
\[ |A_{j,j}| > \sum_{\substack{k=1\\k\neq j}}^{n} |A_{j,k}| \]
for each $j\in \{1,\dots,n\}$, then $T$ is invertible.
Solution.
The condition states that the Gershgorin disks of $T$ with respect to the chosen basis of $V$ do not contain zero. By Theorem 11, every eigenvalue of $T$ lies in some disk, so $0$ is not an eigenvalue; hence $T$ is injective, and therefore invertible.
$\square$
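A quick check (my sketch; the diagonal shift is just one convenient way to force strict dominance):
```python
import numpy as np

# 5D.22 numerically: make a random matrix strictly diagonally dominant,
# then check that it is invertible (full rank).
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
A += np.diag(1.0 + np.sum(np.abs(A), axis=1))  # force |A[j,j]| > row sum
radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
print(np.all(np.abs(np.diag(A)) > radii))      # True: strictly dominant
print(np.linalg.matrix_rank(A) == 5)           # True: invertible
```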
$\S$3. Colley Vector Calculus
$\S$3.1. Vector-valued functions, or something
For a (continuously differentiable) path $\mathbf x(t):{\mathbb R}\to {\mathbb R}^n$ we can define:
- Length
\[\int_a^b ||\mathbf x’(t)||dt\]
- Arc-length parametrization
\[s(t)=\int_a^t||\mathbf x’(\tau)||d\tau\]
- Unit tangent vector
\[\mathbf T(t)=\frac{\mathbf x’(t)}{||\mathbf x’(t)||}.\]
The arc-length parametrization allows us to consider properties of the curve which rely only on its shape, rather than on how fast the curve is traversed, since a curve parametrized by $s$ moves at unit speed. Examples are the curvature and torsion, which will require:
- Unit normal vector
\[\mathbf N=\frac{d\mathbf T/dt}{||d\mathbf T/dt||}=\frac{d\mathbf T/ds}{||d\mathbf T/ds||}\]
- Binormal vector
\[\mathbf B =\mathbf T\times \mathbf N\]
The significance of this is that $\mathbf T,\mathbf N,$ and $\mathbf B$ create a frame of coordinates with origin at the current point, since they are mutually orthogonal (by definition) and of unit length. This is helpful later.
It turns out that there are scalars $\kappa$ and $\tau$ such that
\[\frac{d\mathbf T}{ds}=\kappa \mathbf N,\qquad \frac{d\mathbf B}{ds}=-\tau \mathbf N.\]
These are the curvature and torsion, respectively, of $\mathbf x(t)$. We have
\[\kappa(t)=\frac{||d\mathbf T/dt||}{ds/dt} = \left\Vert\frac{d\textbf T}{ds}\right\Vert.\]
And then note that
\[\mathbf v(t)=\mathbf x’(t) = \dot s \mathbf T,\]
so we can get the acceleration
\begin{align*}
\mathbf a(t)&= \frac{d}{dt} (\dot s\mathbf T)\\
&= \ddot s\mathbf T+\dot s\frac{d\mathbf T}{dt}\\
&= \ddot s\mathbf T + \dot s\cdot \frac{ds}{dt}\cdot \frac{d\mathbf T}{ds} \\
&= \ddot s\mathbf T+\dot s^2 \kappa\mathbf N.
\end{align*}
This gives that the tangent component of acceleration is just $\ddot s$!
We can also find the curvature of the path by calculating $\mathbf v\times \mathbf a$:
\[\mathbf v\times \mathbf a=(\dot s\mathbf T)\times (\ddot s\mathbf T+\dot s^2\kappa \mathbf N)=\dot s^3\kappa \mathbf B.\]
Taking norms of both sides and using $||\mathbf v||=\dot s$, we get:
\[ \kappa = \frac{||\mathbf v\times\mathbf a||}{||\mathbf v||^3}. \]
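This formula is nice because it needs no arc-length parametrization. As a check, the helix $\mathbf x(t)=(a\cos t, a\sin t, bt)$ famously has constant curvature $a/(a^2+b^2)$; a quick numpy sketch (mine):
```python
import numpy as np

# Curvature of the helix x(t) = (a cos t, a sin t, b t) from
# kappa = |v x a| / |v|^3; closed form is a / (a^2 + b^2).
a, b, t = 2.0, 1.0, 0.7
v = np.array([-a * np.sin(t), a * np.cos(t), b])        # x'(t)
acc = np.array([-a * np.cos(t), -a * np.sin(t), 0.0])   # x''(t)
kappa = np.linalg.norm(np.cross(v, acc)) / np.linalg.norm(v) ** 3
print(kappa, a / (a**2 + b**2))  # both 0.4
```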
Then there are also these things called the Frenet-Serret formulas (derivatives taken with respect to arc length $s$) that go like
\[
\begin{bmatrix}
\mathbf T’\\
\mathbf N’\\
\mathbf B’
\end{bmatrix} =
\begin{bmatrix}
0&\kappa&0\\
-\kappa&0&\tau\\
0&-\tau&0
\end{bmatrix} \begin{bmatrix}
\mathbf T\\
\mathbf N\\
\mathbf B
\end{bmatrix}
\]
which is... a skew-symmetric matrix, if you know what that means...
Then exercise 44 goes on about some “Darboux rotation vector” which is uh... I’ll just ignore this bit.
$\S$3.2. Maxima and minima
good old taylor series am i right gang
anyways if you want to generalize those to multiple dimensions you gotta use the hessian
\[
Hf=
\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2}& \frac{\partial^2 f}{\partial x_1\partial x_2}&\dots&\frac{\partial^2 f}{\partial x_1\partial x_n}\\
\frac{\partial^2 f}{\partial x_2\partial x_1}& \frac{\partial^2 f}{\partial x_2^2}&\dots&\frac{\partial^2 f}{\partial x_2\partial x_n}\\
\vdots&\vdots&\ddots&\vdots\\
\frac{\partial^2 f}{\partial x_n\partial x_1}& \frac{\partial^2 f}{\partial x_n\partial x_2}&\dots&\frac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}
\]
and this gives
\[ f(\mathbf x)\approx p_2(\mathbf x)=f(\mathbf a)+Df(\mathbf a)\mathbf h+\frac 12 \mathbf h^{T}Hf(\mathbf a)\mathbf h,\quad \text{where } \mathbf h=\mathbf x-\mathbf a, \]
which is a good approximation to order two (because I said so (the proof is really tedious and the same verification as in one variable)).
At a critical point, the Hessian tells you whether $f$ has a local minimum, a local maximum, or a saddle point, according to whether $Hf$ is positive definite, negative definite, or indefinite (although if $\det Hf=0$ the critical point is degenerate and you must investigate further, which is a bummer honestly).
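In code, the test is literally "look at the signs of the eigenvalues of the Hessian"; a sketch for the saddle of $f(x,y)=x^2-y^2$ at the origin:
```python
import numpy as np

# Second-derivative test at a critical point via Hessian eigenvalues,
# for f(x, y) = x^2 - y^2 (saddle at the origin).
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])  # Hessian of x^2 - y^2
eigs = np.linalg.eigvalsh(H)
if np.all(eigs > 0):
    print("local minimum")
elif np.all(eigs < 0):
    print("local maximum")
elif np.any(eigs > 0) and np.any(eigs < 0):
    print("saddle point")    # this branch fires
else:
    print("degenerate: test inconclusive")
```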
Theorem 12. (Lagrange multipliers)
If the constraint set $g(\mathbf x)=c$ (a level set; a surface in three dimensions) is bounded and closed with $\nabla g\neq \mathbf 0$ on it, and some function $f(\mathbf x)$ is to be optimized over this set, then the optimum occurs at some point at which
\[\nabla f(\mathbf x)=\lambda \nabla g(\mathbf x)\]
for some scalar $\lambda$.
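As a small worked example (my own sympy sketch, not from Colley): maximizing $f=xy$ on the unit circle by solving the Lagrange system directly.
```python
import sympy as sp

# Lagrange multipliers for f = x*y on the circle g: x^2 + y^2 = 1;
# solve grad f = lam * grad g together with the constraint.
x, y, lam = sp.symbols("x y lam", real=True)
f = x * y
g = x**2 + y**2 - 1
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
for sol in sp.solve(eqs, [x, y, lam], dict=True):
    print(sol, " f =", f.subs(sol))
# Maximum 1/2 where x = y = ±1/sqrt(2), minimum -1/2 where x = -y.
```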
$\S$3.3. Exercises
Problem 1. (Colley Vector Calculus 4.23)
- Show that the maximum value of $f(x,y,z)=x^2y^2z^2$ subject to the constraint that $x^2+y^2+z^2=a^2$ is
\[ \frac{a^{6}}{27}=\left( \frac{a^2}{3} \right) ^3. \]
- Use part (a) to show that, for all $x$, $y$, and $z$,
\[ (x^2y^2z^2)^{1/3}\le \frac{x^2+y^2+z^2}{3}. \]
- Show that, for any positive numbers $x_1,x_2,\dots,x_n$,
\[ (x_1x_2\dots x_n)^{1/n} \le \frac{x_1+x_2+\dots+x_n}{n}. \]
- When does equality hold?
Solution.
We'll just prove AM-GM without all the bother of part (a). We would like to maximize $P=y_1^2y_2^2\dots y_n^2$ subject to $y_1^2+y_2^2+\dots+y_n^2=a^2$; the constraint set is closed and bounded, so a maximum exists, and it clearly isn't at a point where some $y_i=0$. By Lagrange multipliers,
\[ \langle 2P/y_1, 2P/y_2,\dots, 2P/y_n\rangle = \lambda \langle 2y_1,\dots,2y_n \rangle, \]
which holds exactly when all the $y_i^2$ are equal (to $a^2/n$), and this is where the maximum is achieved. So
\[ (y_1y_2\dots y_n)^2 \le \left( \frac{a^2}{n} \right)^n=\left( \frac{y_1^2+\dots+y_n^2}{n} \right)^n. \]
Substituting $x_i=y_i^2$ and taking $n$-th roots gives AM-GM, with equality exactly when all the $x_i$ are equal.
$\square$
Problem 2. (Colley Vector Calculus 4.24)
Show that, given a quadratic form $f(x_1,\dots,x_n)=\mathbf x^{T}A\mathbf x$ with $A$ symmetric and the constraint $g(\mathbf x)=\Vert \mathbf x\Vert^2=1$, the Lagrange condition $\nabla f=\lambda \nabla g$ is equivalent to
\[ A\mathbf x=\lambda \mathbf x; \]
conclude that the absolute minimum/maximum value $f$ attains on the unit hypersphere is the smallest/largest eigenvalue of $A$.
Solution.
The first part is just plug-in-and-verify: $\nabla f=2A\mathbf x$ (using the symmetry of $A$) and $\nabla g=2\mathbf x$, so $\nabla f=\lambda\nabla g$ is exactly $A\mathbf x=\lambda\mathbf x$.
Then, at any such point we have
\[f=\mathbf x^T A\mathbf x=\lambda \mathbf x^T \mathbf x=\lambda \Vert \mathbf x\Vert^2=\lambda,\]
since $\Vert \mathbf x\Vert=1$ on the unit hypersphere. The extrema exist (the sphere is compact) and occur at such points, so the minimum and maximum of $f$ are the smallest and largest eigenvalues of $A$.
$\square$
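And a numerical illustration of the conclusion (my sketch): values of the quadratic form on random unit vectors stay pinned between the extreme eigenvalues.
```python
import numpy as np

# Values of f(x) = x^T A x over random unit vectors lie between the
# smallest and largest eigenvalues of the symmetric matrix A.
rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                          # a random symmetric matrix
X = rng.standard_normal((3, 10000))
X /= np.linalg.norm(X, axis=0)             # random points on the unit sphere
vals = np.einsum("ij,ji->i", X.T @ A, X)   # x^T A x for each column x
lo, hi = np.linalg.eigvalsh(A)[[0, -1]]    # smallest and largest eigenvalues
print(lo <= vals.min() and vals.max() <= hi)  # True
```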