## Abstract

The conditions for a functional on a vector space to be a norm, and some examples of both vector and matrix norms.
Keywords: Norms, Vector Spaces, Vectors, Matrices

## Norm Conditions

Let $V$ be a vector space over the field $\mathbb{F}$. A functional $\Vert \cdot \Vert$ on $V$ from its elements to the non-negative real numbers, $\Vert \cdot \Vert: V \rightarrow \mathbb{R}^{+}$, is a norm if it satisfies the following conditions for $\psi, \sigma \in V$ and $\lambda \in \mathbb{F}$:
1. $\Vert \psi \Vert \geq 0$, where $\Vert \psi \Vert = 0$ if and only if $\psi = 0$
2. $\Vert \lambda \psi \Vert = \vert \lambda \vert ~ \Vert \psi \Vert$
3. $\Vert \psi + \sigma \Vert \leq \Vert \psi \Vert + \Vert \sigma \Vert$
Norms can be thought of as functionals that measure the size of the elements of a vector space. A vector space equipped with a norm is called a normed vector space.
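As a quick sanity check, the snippet below numerically spot-checks the three conditions for the Euclidean norm on random vectors (a minimal sketch using NumPy; it illustrates rather than proves the axioms).

```python
import numpy as np

rng = np.random.default_rng(0)
psi, sigma = rng.standard_normal(5), rng.standard_normal(5)
lam = -2.5
norm = np.linalg.norm  # the Euclidean norm as a concrete example

# Condition 1: non-negativity, with zero only for the zero vector.
assert norm(psi) > 0 and norm(np.zeros(5)) == 0

# Condition 2: absolute homogeneity.
assert np.isclose(norm(lam * psi), abs(lam) * norm(psi))

# Condition 3: the triangle inequality.
assert norm(psi + sigma) <= norm(psi) + norm(sigma)
```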
## Vector Norm Examples
Let $\psi \in V$ be a vector such that $\psi \in \mathbb{F}^{n}$ with elements $(\psi_1, \psi_2, \ldots, \psi_n)$. The $l_{p}$-norm is given by
$$\Vert \psi \Vert_{p} = \big( \vert \psi_1 \vert^p + \vert \psi_2 \vert^p + \ldots + \vert \psi_n \vert^p \big)^{1/p}$$

With $p=1$ and $p=2$ being the 1-norm and the Euclidean ($l_2$) norm:
$$\begin{aligned}
\Vert \psi \Vert_{1} &= \vert \psi_1 \vert + \vert \psi_2 \vert + \ldots + \vert \psi_n \vert \\
\Vert \psi \Vert_{2} &= \big( \vert \psi_1 \vert^2 + \vert \psi_2 \vert^2 + \ldots + \vert \psi_n \vert^2 \big)^{1/2}
\end{aligned}$$

For the same $\psi \in \mathbb{F}^{n}$ with elements $(\psi_1, \psi_2, \ldots, \psi_n)$, the sup-norm is given by
$$\Vert \psi \Vert_{\infty} = \max \{\vert \psi_i \vert : 1 \leq i \leq n \}.$$

This is just the element of $\psi$ with the largest absolute value.
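These vector norms are easy to compute directly; as a sketch (assuming NumPy), the explicit sums can be checked against `numpy.linalg.norm`, whose `ord` argument selects the norm (`np.inf` gives the sup-norm):

```python
import numpy as np

psi = np.array([3.0, -4.0, 1.0])

l1 = np.abs(psi).sum()                   # 1-norm: 8
l2 = np.sqrt((np.abs(psi) ** 2).sum())   # Euclidean norm: sqrt(26)
sup = np.abs(psi).max()                  # sup-norm: 4

assert np.isclose(l1, np.linalg.norm(psi, ord=1))
assert np.isclose(l2, np.linalg.norm(psi, ord=2))
assert np.isclose(sup, np.linalg.norm(psi, ord=np.inf))
```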
## Matrix Norms

In addition to satisfying the conditions above, to be considered a norm on a vector space $M$ whose elements are matrices, the functional $\Omega: M \rightarrow \mathbb{R}^{+}$ must satisfy the following condition (submultiplicativity) for all $A, B \in M$:
$$\Vert AB \Vert \leq \Vert A \Vert ~ \Vert B \Vert$$
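As a sketch of this submultiplicativity condition (assuming NumPy), the Frobenius and spectral norms both satisfy it for random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))

for ord_ in ["fro", 2]:  # Frobenius and spectral norms
    lhs = np.linalg.norm(A @ B, ord=ord_)
    rhs = np.linalg.norm(A, ord=ord_) * np.linalg.norm(B, ord=ord_)
    assert lhs <= rhs + 1e-12  # ||AB|| <= ||A|| ||B||
```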
## Matrix Norm Examples

The following norms depend only on the singular values of the matrices they act upon. This means that the following conditions hold:
$$\Vert A \Vert = \Vert A V \Vert = \Vert U A \Vert$$

where $V$ and $U$ are unitary matrices.
Hence, these norms are also all unitarily invariant, meaning $\Vert A \Vert = \Vert U A U^{\dagger} \Vert$.
The singular value decomposition of a matrix $A \in \mathbb{M}_{mn}(\mathbb{C})$ is
$$A = U D V^{\dagger},$$

where $U$ is an $m \times m$ complex unitary matrix, $V$ is an $n \times n$ complex unitary matrix, and $D$ is an $m \times n$ diagonal matrix with non-negative real numbers on the diagonal. These non-negative real numbers are the singular values of $A$.
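A minimal sketch of the decomposition (assuming NumPy, whose `svd` returns $U$, the singular values, and $V^{\dagger}$):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 5
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)   # U is m x m, Vh = V^dagger is n x n
D = np.zeros((m, n))
np.fill_diagonal(D, s)        # m x n diagonal matrix of singular values

assert np.allclose(A, U @ D @ Vh)  # A = U D V^dagger
assert np.all(s >= 0)              # singular values are non-negative reals
```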
Consider a norm $\Vert \cdot \Vert: \mathbb{M}_{mn}(\mathbb{C}) \rightarrow \mathbb{R}^+$ that depends only on the singular values of the matrix it is applied to, and let $W$ be a unitary operator. Consider now
$$\begin{aligned}
AW &= UDV^\dagger W \\
&= UD\tilde{V}^\dagger,
\end{aligned}$$

where $\tilde{V}^\dagger = V^\dagger W$ is just a different unitary matrix. Hence, the singular values of $A$ and $AW$ are the same - the non-negative real numbers on the diagonal of $D$ - and therefore $\Vert A W \Vert = \Vert A \Vert$.
The same argument proves that $\Vert W A \Vert = \Vert A \Vert$.
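A sketch of this invariance (assuming NumPy; the random unitary is built from a QR decomposition of a random complex matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
W, _ = np.linalg.qr(rng.standard_normal((4, 4))
                    + 1j * rng.standard_normal((4, 4)))  # random unitary

sv = lambda M: np.linalg.svd(M, compute_uv=False)

# A, AW and WA all share the same singular values ...
assert np.allclose(sv(A), sv(A @ W)) and np.allclose(sv(A), sv(W @ A))
# ... so e.g. the sum of singular values (trace norm) is unitarily invariant.
assert np.isclose(sv(A).sum(), sv(W @ A @ W.conj().T).sum())
```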
A matrix $A \in \mathbb{M}_{nn}(\mathbb{C})$ has an eigen-decomposition if it can be written as
$$A = Q P Q^{\dagger},$$

where $Q$ has the eigenvectors of $A$ as its columns and $P$ is a diagonal matrix with the eigenvalues of $A$ on its diagonal. Here $Q^{\dagger} = Q^{-1}$ requires $Q$ to be unitary, which is the case when $A$ is normal (for example, Hermitian).
All matrices have a singular value decomposition, but not all matrices have an eigen-decomposition. Moreover, unlike the singular values, the eigenvalues can be negative or complex.
Note, for Hermitian matrices, the singular values are the absolute values of the eigenvalues.
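A quick numerical illustration of this relation between eigenvalues and singular values (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
H = rng.standard_normal((4, 4))
H = H + H.T                        # a real symmetric (Hermitian) matrix

eigs = np.linalg.eigvalsh(H)       # real, possibly negative eigenvalues
svs = np.linalg.svd(H, compute_uv=False)  # non-negative, descending order

# For a Hermitian matrix the singular values are |eigenvalues|.
assert np.allclose(np.sort(np.abs(eigs))[::-1], svs)
```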
Let $A \in M$ be a matrix such that $A \in \mathbb{M}_{mn}(\mathbb{F})$. The trace norm (or one norm) is given by

$$\Vert A \Vert_{\textrm{tr}} = \sum_{i} \mu_{i}(A),$$

where the $\mu_{i}(A)$ are the singular values of $A$. Hence, the trace norm of $A$ is the sum of the singular values of $A$. The trace norm is often instead denoted $\Vert A \Vert_1$.
It is also given by

$$\Vert A \Vert_{\textrm{tr}} = \textrm{tr} \big( \sqrt{A^{\dagger}A} \big).$$
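A sketch computing the trace norm both ways (assuming NumPy, where it is available as the `'nuc'` (nuclear) ord of `numpy.linalg.norm`):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 3))

# Sum of singular values ...
tr_norm = np.linalg.svd(A, compute_uv=False).sum()

# ... equals tr(sqrt(A^dagger A)) = sum of sqrt eigenvalues of A^dagger A.
tr_norm_alt = np.sqrt(np.linalg.eigvalsh(A.conj().T @ A).clip(min=0)).sum()

assert np.isclose(tr_norm, tr_norm_alt)
assert np.isclose(tr_norm, np.linalg.norm(A, ord="nuc"))
```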
Let $A \in M$ be a matrix such that $A \in \mathbb{M}_{mn}(\mathbb{F})$. The Frobenius norm is given by

$$\Vert A \Vert_{F} = \sqrt{ \sum_{j}^{n} \sum_{i}^{m} \vert a_{ij} \vert ^{2} }.$$

It is also given by

$$\Vert A \Vert_{F} = \sqrt{\textrm{tr}(A^{\dagger}A)}.$$
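Both expressions are cheap to evaluate; a sketch (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))

frob_entries = np.sqrt((np.abs(A) ** 2).sum())       # entrywise definition
frob_trace = np.sqrt(np.trace(A.conj().T @ A).real)  # sqrt(tr(A^dagger A))

assert np.isclose(frob_entries, frob_trace)
assert np.isclose(frob_entries, np.linalg.norm(A, ord="fro"))
```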
Let $A \in M$ be a matrix such that $A \in \mathbb{M}_{mn}(\mathbb{F})$, and let $\psi \in \mathbb{F}^{n}$ be a vector with elements $(\psi_1, \psi_2, \ldots, \psi_n)$. The $l_2$ vector norm is given by

$$\Vert \psi \Vert_{2} = \big( \vert \psi_1 \vert^2 + \vert \psi_2 \vert^2 + \ldots + \vert \psi_n \vert^2 \big)^{1/2}$$

The $l_2$ norm of a matrix (also called the spectral norm) is given by the norm induced by (the operator norm of) the $l_2$ vector norm,
$$\Vert A \Vert_2 = \sup_{\psi} \frac{ \Vert A \psi \Vert_2 }{ \Vert \psi \Vert_{2}}$$

It is also given by

$$\Vert A \Vert_2 = \sqrt{\lambda_{\textrm{max}}(A^{\dagger}A)} = \mu_{\textrm{max}}(A),$$

where $\lambda_{\textrm{max}}(A^{\dagger}A)$ is the largest eigenvalue of $A^{\dagger}A$ and $\mu_{\textrm{max}}(A)$ is the largest singular value of $A$.
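A sketch of both characterisations (assuming NumPy): the supremum is approximated by sampling random unit vectors, and the exact value is the largest singular value.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 3))

# Approximate the induced-norm supremum over random unit vectors.
psis = rng.standard_normal((3, 10_000))
psis /= np.linalg.norm(psis, axis=0)
induced_est = np.linalg.norm(A @ psis, axis=0).max()

spectral = np.linalg.svd(A, compute_uv=False).max()  # largest singular value

assert induced_est <= spectral + 1e-12  # samples never exceed the supremum
assert np.isclose(spectral, np.linalg.norm(A, ord=2))
```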
Let $A \in M$ be a matrix such that $A \in \mathbb{M}_{mn}(\mathbb{F})$. The $k$-Ky Fan norm is given by

$$\Vert A \Vert^{k}_{*} = \sum_{i}^{k} \mu^{\downarrow}_{i}(A),$$

where $\mu^{\downarrow}_i(A)$ is the $i$th singular value of $A$, ordered such that $\mu^{\downarrow}_i(A) \geq \mu^{\downarrow}_{i+1}(A)~\forall~i$. Hence the Ky Fan norm is the sum of the $k$ largest singular values.
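A sketch (assuming NumPy; `ky_fan` is a hypothetical helper, relying on `svd` returning singular values in descending order):

```python
import numpy as np

def ky_fan(A, k):
    """Hypothetical helper: sum of the k largest singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)  # already in descending order
    return s[:k].sum()

A = np.diag([3.0, 2.0, 1.0])
assert np.isclose(ky_fan(A, 1), 3.0)  # k = 1 recovers the spectral norm
assert np.isclose(ky_fan(A, 3), 6.0)  # k = n recovers the trace norm
```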
**Ky Fan Dominance**: Let $B \in M$ be a matrix such that $B \in \mathbb{M}_{mn}(\mathbb{F})$. If

$$\Vert A \Vert^{k}_{*} \leq \Vert B \Vert^{k}_{*}$$

for all $k$, then for all unitarily invariant norms it is the case that

$$\Vert A \Vert \leq \Vert B \Vert.$$
Let $A \in M$ be a matrix such that $A \in \mathbb{M}_{mn}(\mathbb{F})$. The Schatten norms are given by

$$\Vert A \Vert_{p} = \bigg( \sum_{i} \big[\mu_{i}(A)\big]^{p} \bigg)^{\frac{1}{p}},$$

where the $\mu_i(A)$ are the singular values of $A$.
Equally, Schatten norms can be written as

$$\Vert A \Vert_{p} = \textrm{tr} \big[ \vert A \vert ^{p} \big]^{\frac{1}{p}},$$

where $\vert A \vert = \sqrt{A^{\dagger}A}$.
If $p=1$ you get the trace norm, if $p=2$ you get the Frobenius norm, and as $p \rightarrow \infty$ you recover the spectral norm (the matrix $l_2$ norm).
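A sketch of these special cases (assuming NumPy; `schatten` is a hypothetical helper, and the $p \rightarrow \infty$ limit is approximated with a large finite $p$):

```python
import numpy as np

def schatten(A, p):
    """Hypothetical helper: the Schatten p-norm of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return (s ** p).sum() ** (1 / p)

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 4))

assert np.isclose(schatten(A, 1), np.linalg.norm(A, ord="nuc"))  # trace norm
assert np.isclose(schatten(A, 2), np.linalg.norm(A, ord="fro"))  # Frobenius
# Large p approaches the spectral norm (largest singular value).
assert np.isclose(schatten(A, 200), np.linalg.norm(A, ord=2), rtol=1e-2)
```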
The following norms do not necessarily depend on the singular values, and hence the conditions above might not hold.

Let $A \in M$ be a matrix such that $A \in \mathbb{F}^{m \times n}$. If $A$ is a linear map $A: V \rightarrow W$, where $V = \mathbb{F}^{n}$ and $W = \mathbb{F}^{m}$, then the operator norm (an induced norm) is given by
$$\Vert A \Vert_{op} = \inf \{c \geq 0: \Vert A \psi \Vert_{W} \leq c \Vert \psi \Vert_{V}~\forall~\psi \in V \},$$

where $\Vert \cdot \Vert_{V}$ and $\Vert \cdot \Vert_{W}$ are vector norms on $V$ and $W$ respectively.
The operator norm can be colloquially thought of as the maximum factor by which the map $A$ can lengthen a vector in $V$.
Let $A \in M$ be a matrix such that $A \in \mathbb{F}^{m \times n}$. The $L_{p,q}$ norm, where $p,q \geq 1$, is given by
$$\Vert A \Vert_{p,q} = \Bigg( \sum^{n}_{j} \bigg( \sum_{i}^{m} \vert a_{ij} \vert^{p} \bigg)^{\frac{q}{p}} \Bigg)^{\frac{1}{q}},$$

where the $a_{ij}$ are the elements of $A$.
The $L_{2,1}$ norm, which is the sum of the $l_{2}$ vector norms of the columns of $A$, is given by
$$\Vert A \Vert_{2,1} = \sum^{n}_{j} \Vert a_j \Vert_{2} = \sum^{n}_{j} \bigg( \sum_{i}^{m} \vert a_{ij} \vert^{2} \bigg)^{\frac{1}{2}},$$

where the $a_{j}$ are the columns of the matrix $A$.
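A sketch (assuming NumPy; `l_pq` is a hypothetical helper, with the inner sum over rows $i$ and the outer sum over columns $j$):

```python
import numpy as np

def l_pq(A, p, q):
    """Hypothetical helper: the L_{p,q} norm of A."""
    col_norms = (np.abs(A) ** p).sum(axis=0) ** (1 / p)  # l_p of each column
    return (col_norms ** q).sum() ** (1 / q)             # l_q of those

A = np.array([[3.0, 0.0],
              [4.0, 2.0]])

# L_{2,1}: the columns have l2 norms 5 and 2, so the norm is 7.
assert np.isclose(l_pq(A, p=2, q=1), 7.0)
```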
## Cross Norms

Consider a tensor product space $V = \mathcal{X} \otimes \mathcal{Y}$.

A norm, $\Vert \cdot \Vert$, is a cross norm if, for $A \in \mathcal{X}$ and $B \in \mathcal{Y}$,

$$\Vert A \otimes B \Vert = \Vert A \Vert ~ \Vert B \Vert.$$

All norms that depend only on the singular values are cross norms.
Note: this is a somewhat hand-wavey proof.
Let the singular value decomposition of a matrix $A \in \mathbb{M}_{mn}(\mathbb{C})$ be

$$A = U D_A V^{\dagger},$$

and the singular value decomposition of a matrix $B \in \mathbb{M}_{mn}(\mathbb{C})$ be

$$B = W D_B T^{\dagger}.$$

The singular value decomposition of $A \otimes B$ can then be seen to be

$$A \otimes B = (U \otimes W)(D_A \otimes D_B)(V \otimes T)^\dagger,$$

since the tensor product of two unitary matrices is unitary and $D_A \otimes D_B$ is diagonal with non-negative entries. Hence, the singular values of $A \otimes B$ are the products of the singular values of $A$ with the singular values of $B$. By observing this fact, it can be seen that any norm that depends only on the singular values of a matrix can be factorised into a product of the norm applied to $A$ and $B$ separately.
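The same fact can be checked numerically via the Kronecker product (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((2, 3))

sv = lambda M: np.linalg.svd(M, compute_uv=False)

# Singular values of A (x) B are all pairwise products of those of A and B.
pairwise = np.sort(np.outer(sv(A), sv(B)).ravel())[::-1]
assert np.allclose(sv(np.kron(A, B)), pairwise)

# Hence e.g. the trace norm is a cross norm: it factorises over the product.
assert np.isclose(sv(np.kron(A, B)).sum(), sv(A).sum() * sv(B).sum())
```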
## Useful Norm Inequalities
Let $\mathcal{H}$ be a Hilbert space, and let $A$ and $B$ be operators on the space.

Hölder's inequality for Schatten norms states that

$$\Vert AB \Vert_r \leq \Vert A \Vert_p ~ \Vert B \Vert_q,$$

where $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$.
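A sketch of one instance of this inequality, $p = q = 2$ and $r = 1$ (assuming NumPy; `schatten` is the hypothetical helper from above):

```python
import numpy as np

def schatten(A, p):
    s = np.linalg.svd(A, compute_uv=False)
    return (s ** p).sum() ** (1 / p)

rng = np.random.default_rng(10)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))

# 1/2 + 1/2 = 1/1, so ||AB||_1 <= ||A||_2 ||B||_2.
assert schatten(A @ B, 1) <= schatten(A, 2) * schatten(B, 2) + 1e-12
```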
Note, Hölder's inequality is more general than this, and applies to vector spaces other than Hilbert spaces and to objects other than operators.
Let $A \in \mathcal{H}_1 \otimes \mathcal{H}_2$ be an operator acting on the bipartite Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, and let $\Vert \cdot \Vert_p$ be the Schatten norm. Then:

$$\Vert \textrm{tr}_1 [ A ] \Vert_p \leq \big[ \dim(\mathcal{H}_1) \big]^{(p-1)/p} \Vert A \Vert_p.$$

The result is more general than this; more details can be found in Rastegin (2012).
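A sketch of this bound (assuming NumPy; `partial_trace_1` is a hypothetical helper that traces out the first tensor factor by reshaping the matrix into a four-index tensor):

```python
import numpy as np

def partial_trace_1(A, d1, d2):
    """Hypothetical helper: trace out the first factor of A on H1 (x) H2."""
    return np.einsum("ikil->kl", A.reshape(d1, d2, d1, d2))

def schatten(A, p):
    return (np.linalg.svd(A, compute_uv=False) ** p).sum() ** (1 / p)

d1, d2, p = 3, 4, 2
rng = np.random.default_rng(11)
A = rng.standard_normal((d1 * d2, d1 * d2))

lhs = schatten(partial_trace_1(A, d1, d2), p)
rhs = d1 ** ((p - 1) / p) * schatten(A, p)  # dim(H1)^((p-1)/p) ||A||_p
assert lhs <= rhs + 1e-12
```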
Let $\bm{x}, \bm{y} \in V$, where $V$ is an inner product space over $\mathbb{F}$ with inner product $(\cdot, \cdot): V \times V \rightarrow \mathbb{F}$. The Cauchy-Schwarz inequality states that

$$\vert (\bm{x}, \bm{y}) \vert \leq \Vert \bm{x} \Vert ~ \Vert \bm{y} \Vert,$$

where $\vert \cdot \vert$ is the absolute value and

$$\Vert \bm{x} \Vert = \sqrt{ (\bm{x}, \bm{x}) }.$$
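A sketch for the standard complex inner product (assuming NumPy; `np.vdot(y, x)` computes $\sum_i x_i \overline{y_i}$, an inner product linear in the first argument):

```python
import numpy as np

rng = np.random.default_rng(12)
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y = rng.standard_normal(6) + 1j * rng.standard_normal(6)

lhs = abs(np.vdot(y, x))                     # |(x, y)|
rhs = np.linalg.norm(x) * np.linalg.norm(y)  # ||x|| ||y||
assert lhs <= rhs + 1e-12                    # Cauchy-Schwarz
```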
## Dual Norms

Consider a normed vector space $V$, where the norm is $\Vert \cdot \Vert$. There also exist norms on the dual space of $V$, denoted $V^*$, which is the space of linear functionals on $V$. One method of finding norms on the dual space is to use norms on the primal space. These so-called dual norms are defined as
$$\Vert f \Vert^* \coloneqq \sup_{X} \big\{ \vert f(X) \vert : \Vert X \Vert \leq 1, ~ X \in V \big\},$$

where $f \in V^*$ and $X \in V$. Intuitively, this can be understood as finding the $X$ within the unit ball for which $\vert f(X) \vert$ is largest.
These norms can be written in terms of only the primal vector space $V$ using the Riesz Representation Theorem.
Consider a finite-dimensional vector space $V$ with an inner product, denoted by $(\cdot, \cdot)$, that is linear in the first argument and anti-linear in the second argument.
The Riesz Representation Theorem then states that for every linear functional $f \in V^*$ there exists a unique vector $Y \in V$ such that
$$f(X) = (X, Y), \quad X \in V.$$

Using the Riesz Representation Theorem, one can identify each linear functional $f \in V^*$ with a $Y \in V$. The dual norms are then typically written as
$$\Vert Y \Vert^* \coloneqq \sup_{X} \big\{ \vert (X,Y) \vert : \Vert X \Vert \leq 1, ~ X \in V \big\},$$

where $f(X) = (X,Y)$ and the vector $Y$, whilst an element of $V$, represents an element of $V^*$.
For the inner product $(X,Y) = \textrm{tr} \big[ X^\dagger Y \big]$, for example, the dual norm is
$$\Vert Y \Vert^* \coloneqq \sup_{X} \big\{ \vert \textrm{tr} \big[ X^\dagger Y \big] \vert : \Vert X \Vert \leq 1, ~ X \in V \big\}.$$

## Dual Norm Examples
The dual of the trace norm (one norm), $\Vert A \Vert_1$, is

$$\Vert A \Vert_1^* = \Vert A \Vert_{\infty},$$

such that the dual norm of the trace norm is the infinity norm.
The dual of the infinity norm (operator norm), $\Vert A \Vert_{\infty}$, is

$$\Vert A \Vert_{\infty}^* = \Vert A \Vert_{1},$$

such that the dual norm of the infinity norm is the trace norm.
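A sketch of the first example (assuming NumPy): over the trace-norm unit ball, the pairing $\vert \textrm{tr}[X^\dagger Y] \vert$ never exceeds $\Vert Y \Vert_\infty$, and the supremum is attained at the rank-one matrix built from the top singular vectors of $Y$.

```python
import numpy as np

rng = np.random.default_rng(13)
Y = rng.standard_normal((4, 4))
U, s, Vh = np.linalg.svd(Y)

# The supremum is attained at X = u_1 v_1^dagger, which has ||X||_tr = 1 ...
X = np.outer(U[:, 0], Vh[0])
assert np.isclose(abs(np.trace(X.conj().T @ Y)), s.max())  # = ||Y||_inf

# ... while random feasible X stay below ||Y||_inf.
for _ in range(100):
    R = rng.standard_normal((4, 4))
    R /= np.linalg.norm(R, ord="nuc")  # normalise so ||R||_tr = 1
    assert abs(np.trace(R.conj().T @ Y)) <= s.max() + 1e-12
```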
Note, some norms are self-dual, meaning the dual of the norm returns the same norm; the Frobenius norm is one example.
## The Dual of a Dual

The dual of a dual norm returns the original norm.
This can be used to give alternative ways to phrase the original norm. For example, the dual of the trace norm is defined as
$$\Vert Y \Vert_{\rm tr}^* \coloneqq \sup_{X} \big\{ \vert (X,Y) \vert : \Vert X \Vert_{\rm tr} \leq 1, ~ X \in V \big\}.$$

Now, the dual of $\Vert Y \Vert_{\rm tr}^*$ is defined as
$$\big[ \Vert Y \Vert_{\rm tr}^* \big]^* \coloneqq \sup_{X} \big\{ \vert (X,Y) \vert : \Vert X \Vert_{\rm tr}^* \leq 1, ~ X \in V \big\}.$$

But, the dual of the dual is the original norm, such that
$$\big[ \Vert Y \Vert_{\rm tr}^* \big]^* = \Vert Y \Vert_{\rm tr}.$$

It was shown in the examples above that the dual of the trace norm is the infinity norm. Hence, an equivalent form for the trace norm is
$$\Vert Y \Vert_{\rm tr} = \sup_{X} \big\{ \vert (X,Y) \vert : \Vert X \Vert_{\infty} \leq 1, ~ X \in V \big\}.$$
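A sketch of this equivalent form (assuming NumPy): for $Y = U D V^\dagger$, the supremum over the spectral-norm unit ball is attained at the unitary $X = U V^\dagger$, which recovers the sum of the singular values.

```python
import numpy as np

rng = np.random.default_rng(14)
Y = rng.standard_normal((4, 4))
U, s, Vh = np.linalg.svd(Y)

X = U @ Vh                                        # unitary, so ||X||_inf = 1
assert np.isclose(np.linalg.norm(X, ord=2), 1.0)
# tr(X^dagger Y) = tr(V U^dagger U D V^dagger) = tr(D), the singular value sum
assert np.isclose(np.trace(X.conj().T @ Y), s.sum())
```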
## References

Rastegin, A. E. (2012). Relations for certain symmetric norms and anti-norms before and after partial trace. arXiv:1202.3853. doi:10.48550/ARXIV.1202.3853