For of my research projects, I needed to decompose a multivariate polynomial into a linear combination of so-called pure monomials. These are monomials of the form \(\lambda^r\) where \(\lambda\) is some linear form. I had never read the proof that such a basis exists, but it turned out that the proof is quite neat, so I decided to write a short exposition of this proof here (mainly for my own reference). This is based on parts of the first two chapters of (Reznick 1992).
Let us set the stage: We work over the real numbers. Then we can consider polynomials of degree \(d\) on \(\mathbb{R}^n\), the set of which we denote by \(\mathrm{Pol}^d(\mathbb{R}^n)\). Bourbaki (2007) tells us that we can identify \(\mathrm{Pol}^d(\mathbb{R}^n) \cong \mathrm{Sym}^d((\mathbb{R}^n)^{\ast})\), the \(d\)th symmetric tensor power of forms on \(\mathbb{R}^n\). From this we see that \(\mathrm{Pol}^d(\mathbb{R}^n)\) is a finite-dimensional vector space with dimension \(\dim \mathrm{Pol}^d(\mathbb{R}^n) = \binom{n + d - 1}{d}\) (See for example this mathSE question). We have a standard basis \(dx^1,\ldots,dx^n\) of \(\mathbb{R}^n\) induced by the standard coordinates on \(\mathbb{R}^n\). Our first goal now is to obtain a more concrete expression for multivariate polynomials.
It is useful to use multi-indices to write expressions with polynomials. A multi-index is an \(n\)-tuple \(\alpha \in \mathbb{N}^n\) and it indicates the number of times each coordinate is chosen: So the \(i\)th coordinate is chosen \(\alpha_i\) times. The degree of the multi-index is \(|\alpha| = \sum_{i = 1}^n\alpha_i\) and its factorial is \(\alpha! = \prod_{i = 1}^n \alpha_i!\). We denote the set of all multi-indices in dimension \(n\) of degree \(d\) by
\begin{equation*} \mathcal{I}(n, d) = \{ \alpha \in \mathbb{N}^n \mid |\alpha| = d \}. \end{equation*}
Then the set \(\{dx^{\alpha} \mid \alpha \in \mathcal{I}(n, d)\}\) is a basis for \(\mathrm{Pol}^d(\mathbb{R}^n)\). We also introduce the following shorthand for the multinomial coefficient
\begin{equation*} c(\alpha) = \frac{|\alpha|!}{\alpha!}. \end{equation*}
Then any polynomial \(p \in \mathrm{Pol}^d(\mathbb{R}^n)\) can written as \[ p = \sum_{\alpha \in \mathcal{I}(n, d)} c(\alpha) a(p;\alpha) dx^{\alpha}, \] for real numbers \(a(p; \alpha)\).
We now introduce an inner product on \(\mathrm{Pol}^d(\mathbb{R}^n)\). For all \(p, q \in \mathrm{Pol}^d(\mathbb{R}^n)\), we define \[ \langle p, q \rangle = \sum_{\alpha \in \mathcal{I}(n, d)} c(\alpha)a(p;\alpha)a(q;\alpha). \] This is a inner product since it can be seen as an \(L^2\)-inner product with respect to some weighted counting measure on \(\mathcal{I}(n, d)\).
For each vector \(v \in \mathbb{R}^n\), we get a degree \(d\) polynomial by the usual dot product on \(\mathbb{R}^n\). That is, for \(x \in \mathbb{R}^n\) we have \[ (v\cdot)^d(x) := (v \cdot x)^d = \left( \sum_{i = 1}^n v^ix^i \right)^d = \sum_{\alpha \in \mathcal{I}(n, d)} c(\alpha) v^{\alpha}x^{\alpha}. \] This demonstrates that \(a((v\cdot)^d; \alpha) = v^{\alpha}\). Then we observe that for any polynomial \(p \in \mathrm{Pol}^d(\mathbb{R}^n)\), we have \[ \langle p, (v \cdot)^d \rangle = \sum_{\alpha \in \mathcal{I}(n, d)} c(\alpha)a(p; \alpha)v^{\alpha} = p(v). \]
For a subset \(A \subset \mathbb{R}^n\) we define
\begin{equation*} \mathrm{span}_d(A) = \left\{ \sum_k \lambda_k (v_k\cdot)^d \middle| v_k \in A, \lambda_k \in \mathbb{R} \right\}. \end{equation*}
We can use the observation above to give an description of the orthogonal complement of \(\mathrm{span}_d(A)\):
\begin{equation*} \mathrm{span}_d(A)^{\perp} = \{ p \in \mathrm{Pol}^d(\mathbb{R}^n) \mid \forall v \in A, p(v) = 0\}. \end{equation*}
This is reminiscent of the vanishing ideal in algebraic geometry, which is \(I(A) = \{p \in \mathbb{R}[X_{1},\ldots,X_n] \mid \forall v \in A, p(v) = 0\}\). In fact, \(\mathrm{span}_d(A)^{\perp} = I(A) \cap \mathrm{Pol}^d(\mathbb{R}^n).\)
Then the first result is:
Proof.
This proof is not really constructive1, but luckily we can actually write down an explicit basis of this form. We have the following criterion to determine \(d\)-independence of a set.
Proof.
\((\Leftarrow)\) Suppose that such \(h_j\) exist and suppose that \(q = \sum_k \lambda_k (\alpha_k \cdot)^d - 0\). Then using our observation about inner products with \((v \cdot)^d\) we get
\begin{equation*} 0 = \langle q, h_j \rangle = \left\langle \sum_k \lambda_k (\alpha_k \cdot)^d, h_j\right\rangle = \sum_k \lambda_k h_j(\alpha_k) = \lambda_j \end{equation*}
for all \(1 \leq j \leq r\). So it follows that \(A\) is \(d\)-independent.
\((\Rightarrow)\) Suppose that \(\{\alpha_1,\ldots,\alpha_r\}\) are \(d\)-independent. Then the set \(\{(\alpha_1\cdot)^d,\ldots,(\alpha_r\cdot)^d\}\) is linearly independent in the inner product space \(\mathrm{Pol}^d(\mathbb{R}^n)\). So ?
Note that we can view the set \(\mathcal{I}(n, d)\) of multi-indices as a subset of \(\mathbb{R}^n\). The following theorem tells us that this set even induces a basis of \(\mathrm{Pol}^d(\mathbb{R}^n)\). The idea of the proof is to construct the dual forms for Serret’s theorem explicitly.
Proof.
We first note that \(|\mathcal{I}(n, d)| = \dim \mathrm{Pol}^d(\mathbb{R}^n)\), so \(\mathcal{I}(n, d)\) has the correct cardinality. It remains to show that it \(d\)-independent.
Note that for \(\alpha \not= \beta \in \mathcal{I}(n, d)\) there exists a \(1 \leq j \leq r\) such that \(\alpha_j \leq \beta_j\) (We have an \(r\)-tuple of integers \(\varepsilon = \alpha - \beta \not= 0\) and \(|\varepsilon| = 0\), so it must contain at least one negative integer). So then the polynomial \(\prod_{k = 0}^{\beta_j - 1}(x_j - k)\) evaluates to \(0\) on \(\alpha\) while it evaluates to something non-zero on \(\beta\). However, this polynomial is not homogeneous. For all \(\alpha \in \mathcal{I}(n, d)\) we have by definition \(d = \alpha_1 + \cdots \alpha_n\), so the homogeneous polynomial
\begin{equation*} \prod_{k = 0}^{\beta_j - 1}(dx_j - k(x_1 + \cdots + x_n)) \end{equation*}
is \(d\) times the previous polynomial on elements of \(\mathcal{I}(n, d)\). Then we define for each \(\beta \in \mathcal{I}(n, d)\)
\begin{equation*} h_{\beta}(x) = \prod_{j = 1}^n\prod_{k = 0}^{\beta_j - 1}(dx_j - k(x_1 + \cdots + x_n)). \end{equation*}
This a homogeneous polynomial of degree \(\beta_1 + \cdots + \beta_n = d\) so \(h_{\beta} \in \mathrm{Pol}^d(\mathbb{R}^n)\). Moreover, by the observation above, for all \(\alpha \in \mathrm{Pol}^d(\mathbb{R}^n)\), \(\alpha \not= \beta\), we have \(h_{\beta}(\alpha) = 0\). Finally,
\begin{equation*} h_{\beta}(\beta) = d^d\prod_j^n \prod_{k = 0}^{\beta_j - 1}(\beta_j - k) = d^d\prod_{j = 1}^n\beta_j! = d^d\beta! \not= 0. \end{equation*}
So by normalizing the \(h_{\beta}\), we have found a set of dual forms and so by Serret’s theorem it follows that \(\mathcal{I}(n, d)\) is \(d\)-independent.
References
One can argue that for finite-dimensional vector spaces, the existence of a basis is constructive. However, this is some algorithmic procedure that does not readily give a closed form for all basis elements at once. ↩︎