We study the average of sum, in the sense of integral.

Read moreThroughout we consider the Hilbert space \(L^2=L^2(\mathbb{R})\), the space of all complex-valued functions with real variable such that \(f \in L^2\) if and only if \[ \lVert f \rVert_2^2=\int_{-\infty}^{\infty}|f(t)|^2dm(t)<\infty \] where \(m\) denotes the ordinary Lebesgue measure (in fact it's legitimate to consider Riemann integral in this context).

For each \(t \geq 0\), we assign an bounded linear operator \(Q(t)\) such that \[ (Q(t)f)(s)=f(s+t). \] This is indeed bounded since we have \(\lVert Q(t)f \rVert_2 = \lVert f \rVert_2\) as the Lebesgue measure is translate-invariant. This is a left translation operator with a single step \(t\).

Suppose \(1 < p < \infty\) and \(f \in L^p((0,\infty))\) (with respect to Lebesgue measure of course) is a nonnegative function, take \[ F(x) = \frac{1}{x}\int_0^x f(t)dt \quad 0 < x <\infty, \] we have Hardy's inequality \(\def\lrVert[#1]{\lVert #1 \rVert}\) \[ \lrVert[F]_p \leq q\lrVert[f]_p \] where \(\frac{1}{p}+\frac{1}{q}=1\) of course.

There are several ways to prove it. I think there are several good reasons to write them down thoroughly since that may be why you find this page. Maybe you are burnt out since it's *left as exercise*. You are assumed to have enough knowledge of Lebesgue measure and integration.

Let \(S_1,S_2 \subset \mathbb{R}\) be two measurable set, suppose \(F:S_1 \times S_2 \to \mathbb{R}\) is measurable, then \[ \left[\int_{S_2} \left\vert\int_{S_1}F(x,y)dx \right\vert^pdy\right]^{\frac{1}{p}} \leq \int_{S_1} \left[\int_{S_2} |F(x,y)|^p dy\right]^{\frac{1}{p}}dx. \] A proof can be found at here by turning to Example A9. You may need to replace all measures with Lebesgue measure \(m\).

Now let's get into it. For a measurable function in this place we should have \(G(x,t)=\frac{f(t)}{x}\). If we put this function inside this inequality, we see \[ \begin{aligned} \lrVert[F]_p &= \left[\int_0^\infty \left\vert \int_0^x \frac{f(t)}{x}dt \right\vert^p dx\right]^{\frac{1}{p}} \\ &= \left[\int_0^\infty \left\vert \int_0^1 f(ux)du \right\vert^p dx\right]^{\frac{1}{p}} \\ &\leq \int_0^1 \left[\int_0^\infty |f(ux)|^pdx\right]^{\frac{1}{p}}du \\ &= \int_0^1 \left[\int_0^\infty |f(ux)|^pudx\right]^{\frac{1}{p}}u^{-\frac{1}{p}}du \\ &= \lrVert[f]_p \int_0^1 u^{-\frac{1}{p}}du \\ &=q\lrVert[f]_p. \end{aligned} \] Note we have used change-of-variable twice and the inequality once.

I have no idea how people came up with this solution. Take \(xF(x)=\int_0^x f(t)t^{u}t^{-u}dt\) where \(0<u<1-\frac{1}{p}\). Hölder's inequality gives us \[ \begin{aligned} xF(x) &= \int_0^x f(t)t^ut^{-u}dt \\ &\leq \left[\int_0^x t^{-uq}dt\right]^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}} \\ &=\left(\frac{1}{1-uq}x^{1-uq}\right)^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}} \end{aligned} \] Hence \[ \begin{aligned} F(x)^p & \leq \frac{1}{x^p}\left\{\left(\frac{1}{1-uq}x^{1-uq}\right)^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}}\right\}^{p} \\ &= \left(\frac{1}{1-uq}\right)^{\frac{p}{q}}x^{\frac{p}{q}(1-uq)-p}\int_0^x f(t)^pt^{up}dt \\ &= \left(\frac{1}{1-uq}\right)^{p-1}x^{-up-1}\int_0^x f(t)^pt^{up}dt \end{aligned} \]

Note we have used the fact that \(\frac{1}{p}+\frac{1}{q}=1 \implies p+q=pq\) and \(\frac{p}{q}=p-1\). Fubini's theorem gives us the final answer: \[ \begin{aligned} \int_0^\infty F(x)^pdx &\leq \int_0^\infty\left[\left(\frac{1}{1-uq}\right)^{p-1}x^{-up-1}\int_0^x f(t)^pt^{up}dt\right]dx \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\int_0^\infty dx\int_0^x f(t)^pt^{up}x^{-up-1}dt \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\int_0^\infty dt\int_t^\infty f(t)^pt^{up}x^{-up-1}dx \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}\int_0^\infty f(t)^pdt. \end{aligned} \] It remains to find the minimum of \(\varphi(u) = \left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}\). This is an elementary calculus problem. By taking its derivative, we see when \(u=\frac{1}{pq}<1-\frac{1}{p}\) it attains its minimum \(\left(\frac{p}{p-1}\right)^p=q^p\). Hence we get \[ \int_0^\infty F(x)^pdx \leq q^p\int_0^\infty f(t)^pdt, \] which is exactly what we want. Note the constant \(q\) cannot be replaced with a smaller one. We simply proved the case when \(f \geq 0\). For the general case, one simply needs to take absolute value.

This approach makes use of properties of \(L^p\) space. Still we assume that \(f \geq 0\) but we also assume \(f \in C_c((0,\infty))\), that is, \(f\) is continuous and has compact support. Hence \(F\) is differentiable in this situation. Integration by parts gives \[ \int_0^\infty F^p(x)dx=xF(x)^p\vert_0^\infty- p\int_0^\infty xdF^p = -p\int_0^\infty xF^{p-1}(x)F'(x)dx. \] Note since \(f\) has compact support, there are some \([a,b]\) such that \(f >0\) only if \(0 < a \leq x \leq b < \infty\) and hence \(xF(x)^p\vert_0^\infty=0\). Next it is natural to take a look at \(F'(x)\). Note we have \[ F'(x) = \frac{f(x)}{x}-\frac{\int_0^x f(t)dt}{x^2}, \] hence \(xF'(x)=f(x)-F(x)\). A substitution gives us \[ \int_0^\infty F^p(x)dx = -p\int_0^\infty F^{p-1}(x)[f(x)-F(x)]dx, \] which is equivalent to say \[ \int_0^\infty F^p(x)dx = \frac{p}{p-1}\int_0^\infty F^{p-1}(x)f(x)dx. \] Hölder's inequality gives us \[ \begin{aligned} \int_0^\infty F^{p-1}(x)f(x)dx &\leq \left[\int_0^\infty F^{(p-1)q}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}} \\ &=\left[\int_0^\infty F^{p}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}}. \end{aligned} \] Together with the identity above we get \[ \int_0^\infty F^p(x)dx = q\left[\int_0^\infty F^{p}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}} \] which is exactly what we want since \(1-\frac{1}{q}=\frac{1}{p}\) and all we need to do is divide \(\left[\int_0^\infty F^pdx\right]^{1/q}\) on both sides. So what's next? Note \(C_c((0,\infty))\) is dense in \(L^p((0,\infty))\). For any \(f \in L^p((0,\infty))\), we can take a sequence of functions \(f_n \in C_c((0,\infty))\) such that \(f_n \to f\) with respect to \(L^p\)-norm. Taking \(F=\frac{1}{x}\int_0^x f(t)dt\) and \(F_n = \frac{1}{x}\int_0^x f_n(t)dt\), we need to show that \(F_n \to F\) pointwise, so that we can use Fatou's lemma. For \(\varepsilon>0\), there exists some \(m\) such that \(\lrVert[f_n-f]_p < \frac{1}{n}\). Thus \[ \begin{aligned} |F_n(x)-F(x)| &= \frac{1}{x}\left\vert \int_0^x f_n(t)dt - \int_0^x f(t)dt \right\vert \\ &\leq \frac{1}{x} \int_0^x |f_n(t)-f(t)|dt \\ &\leq \frac{1}{x} \left[\int_0^x|f_n(t)-f(t)|^pdt\right]^{\frac{1}{p}}\left[\int_0^x 1^qdt\right]^{\frac{1}{q}} \\ &=\frac{1}{x^{1/p}}\left[\int_0^x|f_n(t)-f(t)|^pdt\right]^{\frac{1}{p}} \\ &\leq \frac{1}{x^{1/p}}\lrVert[f_n-f]_p <\frac{\varepsilon}{x^{1/p}}. \end{aligned} \] Hence \(F_n \to F\) pointwise, which also implies that \(|F_n|^p \to |F|^p\) pointwise. For \(|F_n|\) we have \[ \begin{aligned} \int_0^\infty |F_n(x)|^pdx &= \int_0^\infty \left\vert\frac{1}{x}\int_0^x f_n(t)dt\right\vert^p dx \\ &\leq \int_0^\infty \left[\frac{1}{x}\int_0^x |f_n(t)|dt\right]^{p}dx \\ &\leq q\int_0^\infty |f_n(t)|^pdt \end{aligned} \] note the third inequality follows since we have already proved it for \(f \geq 0\). By Fatou's lemma, we have \[ \begin{aligned} \int_0^\infty |F(x)|^pdx &= \int_0^\infty \lim_{n \to \infty}|F_n(x)|^pdx \\ &\leq \lim_{n \to \infty} \int_0^\infty |F_n(x)|^pdx \\ &\leq \lim_{n \to \infty}q^p\int_0^\infty |f_n(x)|^pdx \\ &=q^p\int_0^\infty |f(x)|^pdx. \end{aligned} \]

Throughout, let \((X,\mathfrak{M},\mu)\) be a measure space where \(\mu\) is positive.

If \(f\) is of \(L^p(\mu)\), which means \(\lVert f \rVert_p=\left(\int_X |f|^p d\mu\right)^{1/p}<\infty\), or equivalently \(\int_X |f|^p d\mu<\infty\), then we may say \(|f|^p\) is of \(L^1(\mu)\). In other words, we have a function \[
\begin{aligned}
\lambda: L^p(\mu) &\to L^1(\mu) \\
f &\mapsto |f|^p.
\end{aligned}
\] This function does not have to be one to one due to absolute value. But we hope this function to be *fine* enough, at the very least, we hope it is continuous.

Here, \(f \sim g\) means that \(f-g\) equals to \(0\) almost everywhere with respect to \(\mu\). It can be easily verified that this is a equivalence relation.

We still use \(\varepsilon-\delta\) argument but it's in a metric space. Suppose \((X,d_1)\) and \((Y,d_2)\) are two metric spaces and \(f:X \to Y\) is a function. We say \(f\) is continuous at \(x_0 \in X\) if for any \(\varepsilon>0\), there exists some \(\delta>0\) such that \(d_2(f(x_0),f(x))<\varepsilon\) whenever \(d_1(x_0,x)<\delta\). Further, we say \(f\) is continuous on \(X\) if \(f\) is continuous at every point \(x \in X\).

For \(1\leq p<\infty\), we already have a metric by \[ d(f,g)=\lVert f-g \rVert_p \] given that \(d(f,g)=0\) if and only if \(f \sim g\). This is complete and makes \(L^p\) a Banach space. But for \(0<p<1\) (yes we are going to cover that), things are much more different, and there is one reason: Minkowski inequality holds reversely! In fact we have \[ \lVert f+g \rVert_p \geq \lVert f \rVert_p + \lVert g \rVert_p \] for \(0<p<1\). In fact, \(L^p\) space has too many weird things when \(0<p<1\). Precisely,

For \(0<p<1\), \(L^p(\mu)\) is locally convex if and only if \(\mu\) assumes finitely many values. (Proof.)

On the other hand, for example, \(X=[0,1]\) and \(\mu=m\) be the Lebesgue measure, then \(L^p(\mu)\) has *no* open convex subset other than \(\varnothing\) and \(L^p(\mu)\) itself. However,

A topological vector space \(X\) is normable if and only if its origin has a convex bounded neighbourhood. (See Kolmogorov's normability criterion.)

Therefore \(L^p(m)\) is not normable, hence not Banach.

We have gone too far. We need a metric that is fine enough.

*In this subsection we always have \(0<p<1\).*

Define \[ \Delta(f)=\int_X |f|^p d\mu \] for \(f \in L^p(\mu)\). We will show that we have a metric by \[ d(f,g)=\Delta(f-g). \] Fix \(y\geq 0\), consider the function \[ f(x)=(x+y)^p-x^p. \] We have \(f(0)=y^p\) and \[ f'(x)=p(x+y)^{p-1}-px^{p-1} \leq px^{p-1}-px^{p-1}=0 \] when \(x > 0\) and hence \(f(x)\) is nonincreasing on \([0,\infty)\), which implies that \[ (x+y)^p \leq x^p+y^p. \] Hence for any \(f\), \(g \in L^p\), we have \[ \Delta(f+g)=\int_X |f+g|^p d\mu \leq \int_X |f|^p d\mu + \int_X |g|^p d\mu=\Delta(f)+\Delta(g). \] This inequality ensures that \[ d(f,g)=\Delta(f-g) \] is a metric. It's immediate that \(d(f,g)=d(g,f) \geq 0\) for all \(f\), \(g \in L^p(\mu)\). For the triangle inequality, note that \[ d(f,h)+d(g,h)=\Delta(f-h)+\Delta(h-g) \geq \Delta((f-h)+(h-g))=\Delta(f-g)=d(f,g). \] This is translate-invariant as well since \[ d(f+h,g+h)=\Delta(f+h-g-h)=\Delta(f-g)=d(f,g) \] The completeness can be verified in the same way as the case when \(p>1\). In fact, this metric makes \(L^p\) a locally bounded F-space.

The metric of \(L^1\) is defined by \[ d_1(f,g)=\lVert f-g \rVert_1=\int_X |f-g|d\mu. \] We need to find a relation between \(d_p(f,g)\) and \(d_1(\lambda(f),\lambda(g))\), where \(d_p\) is the metric of the corresponding \(L^p\) space.

As we have proved, \[ (x+y)^p \leq x^p+y^p. \] Without loss of generality we assume \(x \geq y\) and therefore \[ x^p=(x-y+y)^p \leq (x-y)^p+y^p. \] Hence \[ x^p-y^p \leq (x-y)^p. \] By interchanging \(x\) and \(y\), we get \[ |x^p-y^p| \leq |x-y|^p. \] Replacing \(x\) and \(y\) with \(|f|\) and \(|g|\) where \(f\), \(g \in L^p\), we get \[ \int_{X}\lvert |f|^p-|g|^p \rvert d\mu \leq \int_X |f-g|^p d\mu. \] But \[ d_1(\lambda(f),\lambda(g))=\int_{X}\lvert |f|^p-|g|^p \rvert d\mu \\ d_p(f,g)=\Delta(f-g)= d\mu \leq \int_X |f-g|^p d\mu \] and we therefore have \[ d_1(\lambda(f),\lambda(g)) \leq d_p(f,g). \] Hence \(\lambda\) is continuous (and in fact, Lipschitz continuous and uniformly continuous) when \(0<p<1\).

It's natural to think about Minkowski's inequality and Hölder's inequality in this case since they are critical inequality enablers. You need to think about some examples of how to create the condition to use them and get a fine result. In this section we need to prove that \[ |x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1}). \] This inequality is surprisingly easy to prove however. We will use nothing but the mean value theorem. Without loss of generality we assume that \(x > y \geq 0\) and define \(f(t)=t^p\). Then \[ \frac{f(x)-f(y)}{x-y}=f'(\zeta)=p\zeta^{p-1} \] where \(y < \zeta < x\). But since \(p-1 \geq 0\), we see \(\zeta^{p-1} < x^{p-1} <x^{p-1}+y^{p-1}\). Therefore \[ f(x)-f(y)=x^p-y^p=p(x-y)\zeta^{p-1}<p(x-y)(x^{p-1}-y^{p-1}). \] For \(x=y\) the equality holds.

Therefore \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) &= \int_X \left||f|^p-|g|^p\right|d\mu \\
&\leq \int_Xp\left||f|-|g|\right|(|f|^{p-1}+|g|^{p-1})d\mu
\end{aligned}
\] By *Hölder's inequality*, we have \[
\begin{aligned}
\int_X ||f|-|g||(|f|^{p-1}+|g|^{p-1})d\mu & \leq \left[\int_X \left||f|-|g|\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^q\right]^{1/q} \\
&\leq \left[\int_X \left|f-g\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^q\right]^{1/q} \\
&=\lVert f-g \rVert_p \left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^q\right]^{1/q}.
\end{aligned}
\] By *Minkowski's inequality*, we have \[
\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^q\right]^{1/q} \leq \left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X |g|^{(p-1)q}d\mu\right]^{1/q}
\] Now things are clear. Since \(1/p+1/q=1\), or equivalently \(1/q=(p-1)/p\), suppose \(\lVert f \rVert_p\), \(\lVert g \rVert_p \leq R\), then \((p-1)q=p\) and therefore \[
\left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X |g|^{(p-1)q}d\mu\right]^{1/q} = \lVert f \rVert_p^{p-1}+\lVert g \rVert_p^{p-1} \leq 2R^{p-1}.
\] Summing the inequalities above, we get \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p =2pR^{p-1}d_p(f,g)
\end{aligned}
\] hence \(\lambda\) is continuous.

We have proved that \(\lambda\) is continuous, and when \(0<p<1\), we have seen that \(\lambda\) is Lipschitz continuous. It's natural to think about its differentiability afterwards, but the absolute value function is not even differentiable so we may have no chance. But this is still a fine enough result. For example we have no restriction to \((X,\mathfrak{M},\mu)\) other than the positivity of \(\mu\). Therefore we may take \(\mathbb{R}^n\) as the Lebesgue measure space here, or we can take something else.

It's also interesting how we use elementary Calculus to solve some much more abstract problems.