1. Introduction
1.1. Background and Motivation
Analog-to-digital (A/D) conversion is a process in which bandlimited signals, e.g., audio signals, are digitized for storage and transmission; this is feasible thanks to the classical sampling theorem. In particular, the theorem shows that discrete sampling suffices to capture all features of a given bandlimited signal, provided that the sampling rate exceeds the Nyquist rate.
Given a function $f \in L^1(\mathbb{R})$, its Fourier transform $\hat{f}$ is defined as
\[
\hat{f}(\xi) = \int_{\mathbb{R}} f(t)\, e^{-2\pi i \xi t}\, dt.
\]
The Fourier transform can also be uniquely extended to $L^2(\mathbb{R})$ as a unitary transformation.
Definition 1.1.
Given $\Omega > 0$, a function $f \in L^2(\mathbb{R})$ is called bandlimited with bandwidth $\Omega$ if its Fourier transform $\hat{f}$ is supported in $[-\Omega, \Omega]$.
An important component of A/D conversion is the following theorem:
Theorem 1.2 (Classical Sampling Theorem).
Given $f \in L^2(\mathbb{R})$ bandlimited with bandwidth $\Omega$, for any $g \in L^1(\mathbb{R})$ satisfying
\[
\hat{g} = 1 \quad \text{on } [-\Omega, \Omega]
\]
and
\[
\hat{g}(\xi) = 0 \quad \text{for } |\xi| \geq \tfrac{1}{\tau} - \Omega,
\]
with $0 < \tau \leq \frac{1}{2\Omega}$, one has
(1)
\[
f(t) = \tau \sum_{n \in \mathbb{Z}} f(n\tau)\, g(t - n\tau),
\]
where the convergence is both uniform on compact sets of $\mathbb{R}$ and in $L^2(\mathbb{R})$.
In particular, for $\tau = \frac{1}{2\Omega}$ and $g(t) = 2\Omega \operatorname{sinc}(2\Omega t)$, where $\operatorname{sinc}(t) = \frac{\sin(\pi t)}{\pi t}$, the following identity holds in $L^2(\mathbb{R})$:
\[
f(t) = \sum_{n \in \mathbb{Z}} f\!\left(\tfrac{n}{2\Omega}\right) \operatorname{sinc}(2\Omega t - n).
\]
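The critical-rate case can be checked numerically. The sketch below is a heuristic illustration with parameters of our own choosing (not from the text): it samples a pure tone below the band limit at the Nyquist rate and reconstructs it from a truncated sinc series.

```python
import numpy as np

# Heuristic check of Shannon reconstruction at the critical rate:
# sample a tone below the band limit and rebuild it from a truncated sinc series.
Omega = 1.0                     # band limit (illustrative)
tau = 1.0 / (2 * Omega)         # critical sampling period
f = lambda t: np.sin(2 * np.pi * 0.3 * t)   # tone at frequency 0.3 < Omega

n = np.arange(-2000, 2001)      # truncation of the infinite sampling series
samples = f(n * tau)

def reconstruct(t):
    # np.sinc(x) = sin(pi*x)/(pi*x), matching the interpolation kernel
    return float(np.sum(samples * np.sinc(2 * Omega * t - n)))

t0 = 0.37                       # evaluation point far from the truncation edges
err = abs(reconstruct(t0) - f(t0))
```

The residual error comes only from truncating the series; it shrinks as more samples are kept.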
However, the discrete nature of digital data storage makes it impossible to store the samples $f(n\tau)$ exactly. Instead, quantized samples $q_n$ chosen from a predetermined finite alphabet $\mathscr{A}$ are stored. This results in the following reconstructed signal:
\[
\tilde{f}(t) = \tau \sum_{n \in \mathbb{Z}} q_n\, g(t - n\tau).
\]
As for the choice of the quantized samples $q_n$, we shall discuss the following two schemes.

Pulse Code Modulation (PCM):
Quantized samples are taken as the direct round-off of the current sample, i.e.,
(2)
\[
q_n = \arg\min_{a \in \mathscr{A}} \lvert f(n\tau) - a \rvert.
\]
$\Sigma\Delta$ Quantization:
A sequence of auxiliary variables $(u_n)$ is introduced for this scheme. The pair $(q_n, u_n)$ is defined recursively as
\[
q_n = \arg\min_{a \in \mathscr{A}} \lvert u_{n-1} + f(n\tau) - a \rvert, \qquad u_n = u_{n-1} + f(n\tau) - q_n.
\]
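Both schemes can be sketched in a few lines. The following is a simplified numerical illustration, not the paper's construction: it assumes a uniform mid-rise alphabet and the standard greedy first-order $\Sigma\Delta$ rule $u_n = u_{n-1} + y_n - q_n$, with $q_n$ the alphabet element nearest to $u_{n-1} + y_n$; all parameter values are illustrative.

```python
import numpy as np

delta, K = 0.5, 3                               # illustrative alphabet parameters
alphabet = delta * (np.arange(-K, K) + 0.5)     # uniform, symmetric about 0

def nearest(w):
    # round-off to the nearest alphabet element
    return alphabet[np.argmin(np.abs(alphabet - w))]

y = np.sin(2 * np.pi * 0.01 * np.arange(200))   # oversampled samples, |y| <= 1

# PCM: direct round-off of each sample
q_pcm = np.array([nearest(v) for v in y])
pcm_err = np.max(np.abs(y - q_pcm))

# First-order Sigma-Delta: u_n = u_{n-1} + y_n - q_n, greedy choice of q_n
u, u_hist = 0.0, []
for v in y:
    qn = nearest(u + v)
    u = u + v - qn                              # auxiliary variable update
    u_hist.append(u)
u_max = max(abs(w) for w in u_hist)
```

With this alphabet the greedy rule never overloads, so the auxiliary variables stay bounded by $\delta/2$, i.e., the scheme is stable in the sense used below.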
$\Sigma\Delta$ quantization was introduced in 1963 [15] and is still widely used, due to its advantages over PCM. Specifically, $\Sigma\Delta$ quantization is robust against hardware imperfection [9], a decisive weakness of PCM. For $\Sigma\Delta$ quantization, and for the more general noise-shaping schemes to be explained below, the boundedness of the auxiliary sequence $(u_n)$ turns out to be essential, as most analyses of $\Sigma\Delta$ quantization rely on it for error estimation. Schemes with bounded auxiliary variables are said to be stable.

Despite its merits over PCM, $\Sigma\Delta$ quantization merely produces linear error decay with respect to the bits used, as opposed to the exponential error decay produced by its counterpart PCM. Thus, it is desirable to generalize $\Sigma\Delta$ quantization to achieve higher-order error decay.
Given $r \in \mathbb{N}$, one can consider an $r$th order $\Sigma\Delta$ quantization scheme, as investigated by Daubechies and DeVore:
Theorem 1.3 (Higher Order $\Sigma\Delta$ Quantization, [8]).
Consider the following stable $\Sigma\Delta$ quantization scheme:
\[
(\Delta^r u)_n = f(n\tau) - q_n,
\]
where $(q_n)$ and $(u_n)$ are the quantized samples and auxiliary variables, respectively. Then, for all $t \in \mathbb{R}$,
\[
\lvert f(t) - \tilde{f}(t) \rvert \leq C_{g,r}\, \lVert u \rVert_{\infty}\, \tau^r.
\]
The existence of such schemes is also proven in the same paper. This improves the error decay rate from linear to an arbitrary polynomial degree while preserving the merits of a first-order $\Sigma\Delta$ quantization scheme.
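For concreteness, a greedy second-order instance ($r = 2$) can be simulated. This is a common textbook variant, not necessarily the scheme constructed in [8]; with a sufficiently large alphabet the greedy rule stays stable, i.e., the auxiliary variables remain bounded.

```python
import numpy as np

# Greedy second-order Sigma-Delta: (Delta^2 u)_n = y_n - q_n, i.e.
#   u_n = 2*u_{n-1} - u_{n-2} + y_n - q_n,
# with q_n the alphabet element nearest to w_n = 2*u_{n-1} - u_{n-2} + y_n.
delta, K = 1.0, 4                               # alphabet large enough for stability
alphabet = delta * (np.arange(-K, K) + 0.5)

y = 0.9 * np.cos(2 * np.pi * 0.003 * np.arange(500))   # input with |y| <= 0.9

u1 = u2 = 0.0                                   # u_{n-1} and u_{n-2}
u_max = 0.0
for v in y:
    w = 2 * u1 - u2 + v
    qn = alphabet[np.argmin(np.abs(alphabet - w))]
    u2, u1 = u1, w - qn                         # shift the recursion window
    u_max = max(u_max, abs(u1))
```

By induction, $|u_n| \leq \delta/2$ as long as the quantizer never overloads, which the alphabet size above guarantees for this input.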
From here, a natural question arises: is it possible to generalize the $\Sigma\Delta$ quantization scheme further so that the reconstruction error decay matches the exponential decay of PCM? Two solutions have been proposed for this question. The first is to create new quantization schemes, known as noise-shaping quantization schemes. A brief summary of their development is provided in Section 1.2.
The other possibility is to drastically enhance data-storage efficiency while maintaining the same level of reconstruction accuracy, and signal decimation belongs in this category. The process is as follows: given an $r$th order $\Sigma\Delta$ quantization scheme, the quantized samples $q_n$ are filtered and then downsampled by a factor $\rho$, yielding a coarser sequence $\tilde{q}_m$.
The process in which we convert the quantized samples $q_n$ to $\tilde{q}_m$ is called signal decimation.
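A simplified first-order ($r = 1$) illustration of the idea, with toy data of our own choosing: subsampling the running sums of $q$ and then differencing at the coarse scale recovers block sums of $q$, each of which lies in a small finite set and is therefore cheap to store.

```python
import numpy as np

rho = 8                                           # downsampling factor (illustrative)
rng = np.random.default_rng(0)
q = rng.choice([-0.5, 0.5], size=rho * 25)        # toy 1-bit quantized sequence

running = np.cumsum(q)                            # first-order "integration" of q
v = running[rho - 1::rho]                         # keep every rho-th running sum
coarse_diff = np.diff(np.concatenate(([0.0], v))) # difference at the coarse scale
block_sums = q.reshape(-1, rho).sum(axis=1)       # ground truth: sums of rho samples
```

Each coarse difference takes one of at most $\rho + 1$ values here, versus $\rho$ raw bits per block, which is the storage saving decimation exploits.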
Decimation has long been known in the engineering community [5], and it was observed that decimation results in exponential error decay with respect to the bit rate, even though the observation remained a conjecture until 2015 [10], when Daubechies and Saab proved the following theorem:
Theorem 1.4 (Signal Decimation for Bandlimited Functions, [10]).
Given , , and , there exists a function such that
(3) 
Moreover, the number of bits needed for each Nyquist interval is
(4) 
Consequently,
From (3) and (4), we can see that the reconstruction error after decimation still decays polynomially with respect to the sampling frequency. As for data storage, the number of bits needed drops from linear to logarithmic in the oversampling rate. Thus, the reconstruction error decays exponentially with respect to the bits used.
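The bit-rate arithmetic behind this statement can be made explicit. With schematic constants $C, C'$ (ours, not the paper's), an $r$th order scheme with oversampling rate $\lambda$ gives

```latex
\varepsilon(\lambda) \approx C\,\lambda^{-r}
\qquad\text{and}\qquad
b \approx C'\, r \log_2 \lambda
\;\;\Longrightarrow\;\;
\lambda = 2^{\,b/(C'r)}
\quad\text{and}\quad
\varepsilon(b) \approx C\, 2^{-b/C'},
```

so the polynomial decay in $\lambda$ becomes exponential decay in the bit budget $b$.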
In [16], the author extended decimation to signals in finite-dimensional spaces. Such signals are sampled by finite frames, and a brief introduction to finite frames is given in Appendix A. Using the alternative decimation operator introduced in the same paper, it is proven that, up to second order $\Sigma\Delta$ quantization, results similar to Theorem 1.4 can be achieved:
Definition 1.5 (Alternative Decimation).
Given fixed , the alternative decimation operator is defined to be , where

is the integration operator satisfying
Here, the cyclic convention is adopted: For any , .

is the subsampling operator satisfying
where .
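The two factors can be written down concretely in a simplified, non-cyclic form (the definition above uses a cyclic convention, which this sketch deliberately drops; all dimensions are illustrative):

```python
import numpy as np

m, rho = 6, 3                       # coarse dimension and subsampling factor
N = m * rho

Sigma = np.tril(np.ones((N, N)))    # integration: (Sigma x)_n = x_1 + ... + x_n
D = np.eye(N) - np.eye(N, k=-1)     # backward difference, the inverse of Sigma
S = np.zeros((m, N))
S[np.arange(m), rho * np.arange(1, m + 1) - 1] = 1.0   # keep every rho-th entry
```

In this non-cyclic setting integration and backward difference are exact inverses, which is the mechanism the alternative decimation operator exploits.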
Definition 1.6 (Unitarily Generated Frames (UGF)).
Given a Hermitian matrix $G \in \mathbb{C}^{d \times d}$, called the generator, and a unit-norm base vector $\varphi_0 \in \mathbb{C}^d$, the unitarily generated frame $\{\varphi_n\}_{n=1}^{N}$ is defined by $\varphi_n = e^{2\pi i n G / N} \varphi_0$.
Theorem 1.7 (Alternative Decimation for Finite Frames, [16]).
Given , , , , and as the generator, base vector, eigenvalues, eigenvectors, and the corresponding UGF, respectively, and . Suppose

,

, and

,
then the dual frame combined with the $r$th order $\Sigma\Delta$ quantization has a polynomial reconstruction error decay rate of degree $r$ with respect to the oversampling rate:
Moreover, the total number of bits used to record the quantized samples is bits, where the constant depends on . Suppose is fixed as ; then, as a function of the bits used at each entry, satisfies
The constant is independent of the oversampling rate.
1.2. Prior Works
1.2.1. $\Sigma\Delta$ Quantization for Bandlimited Functions
Despite its simple form and robustness, $\Sigma\Delta$ quantization only results in linear error decay as the sampling period $\tau \to 0$. It was later proven in [8] that a generalization of $\Sigma\Delta$ quantization, namely the $r$th order $\Sigma\Delta$ quantization, exists for arbitrary $r$, and for such schemes the error decay is of polynomial order $r$. Leveraging the different constants for this family of quantization schemes, sub-exponential decay can also be achieved. A different family of quantization schemes was shown in [14] to have exponential error decay with a small exponent. In [11], the exponent was improved.
1.2.2. Finite Frames
$\Sigma\Delta$ quantization can also be applied to finite frames. It is proven in [3] that, for any family of frames with bounded frame variation, the reconstruction error decays linearly with respect to the oversampling rate $\lambda = N/d$, where the frame is an $N \times d$ matrix. Among different choices of dual frames, [4] proposed the so-called Sobolev dual, which achieves the minimum induced matrix 2-norm for reconstruction. By carefully matching the dual frame to the quantization scheme, [7] proved that using such duals for random frames results in exponential decay with a near-optimal exponent, with high probability.
1.2.3. Decimation
In [5], using the assumption that the noise in $\Sigma\Delta$ quantization is random, along with numerical experiments, it was asserted that decimation greatly reduces the number of bits needed while maintaining the reconstruction accuracy. In [10], a rigorous proof was given to show that this assertion is indeed valid, and that the reduction in bits used turns the linear decay into exponential decay with respect to the bit rate.
As for the adaptation to finite-dimensional signals, the author proved in [16] that there exists a similar operator, called the alternative decimation operator, that behaves similarly to decimation for bandlimited signals. In particular, for first and second order $\Sigma\Delta$ quantization, it is possible to achieve exponential reconstruction error decay with respect to the bit rate as well. However, as with the caveat of decimation, it merely improves storage efficiency while maintaining the same level of reconstruction error. Thus, the error rate with respect to the oversampling rate remains the same (quadratic for the second order), which is still inferior to other noise-shaping schemes.
2. Main Results
We have seen in Theorem 1.7 that alternative decimation is only applicable up to the second order. Thus, we aim to extend these results to arbitrary orders, and the solution we present here is called adapted decimation.
Definition 2.1 (Adapted Decimation).
Given , the adapted decimation operator is defined to be
where is the usual backward difference matrix, satisfies , and has .
It will be shown that, for unitarily generated frames satisfying the conditions specified in Theorem 2.2, an $r$th order $\Sigma\Delta$ quantization coupled with the corresponding adapted decimation has an $r$th order polynomial reconstruction error decay rate with respect to the oversampling ratio. As for data storage, decimation allows for highly efficient storage, making the error decay exponential with respect to the bits used.
For the rest of the paper, we shall also assume that our $\Sigma\Delta$ quantization scheme is stable, i.e., the auxiliary sequence remains bounded as the dimension grows.
We also choose the quantization alphabet to be uniformly spaced and symmetric around the origin: given $\delta > 0$ and $K \in \mathbb{N}$, we define
\[
\mathscr{A}_K^\delta = \left\{ \left(k + \tfrac{1}{2}\right)\delta : k = -K, -K+1, \dots, K-1 \right\}.
\]
For complex Euclidean spaces, we define $Q$ to be the mid-rise quantizer with alphabet of length $2K$. Throughout this paper we shall always deal with such quantizers.
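A minimal implementation of such a quantizer, assuming the mid-rise alphabet $\{(k + \frac{1}{2})\delta : k = -K, \dots, K-1\}$ (parameter values below are illustrative):

```python
import numpy as np

def midrise_quantizer(x, delta, K):
    # 2K-level mid-rise quantizer with alphabet {(k + 1/2)*delta : k = -K..K-1}
    k = np.floor(np.asarray(x) / delta)
    k = np.clip(k, -K, K - 1)       # saturate at the edges of the alphabet
    return (k + 0.5) * delta

delta, K = 0.25, 4
x = np.array([0.0, 0.3, -0.61, 0.97])      # inputs within the non-overload range
q = midrise_quantizer(x, delta, K)
```

For inputs within the non-overload range $|x| \leq K\delta$, the quantization error is at most $\delta/2$ entrywise.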
Theorem 2.2.
Given , , , , and as the generator, base vector, eigenvalues, eigenvectors, and the corresponding UGF, respectively, and fixed. Suppose

,

,

,

, and

there exists a sequence from the stable $r$th order $\Sigma\Delta$ quantization satisfying
where , then the following statements are true:

Recursivity: For all , there exists such that .

Signal reconstruction: is a frame.

Error estimate: For the dual frame , the reconstruction error satisfies
(6) 
Efficient data storage: Suppose the length of the quantization alphabet is ; then the total number of bits used to record the quantized samples is bits. Furthermore, suppose is fixed as ; then, as a function of the bits used at each entry, satisfies
(7) where , independent of .
Remark 2.3.
Harmonic frames are unitarily generated frames with the generator being the diagonal matrix with entries and the base vector , so Theorem 2.2 is applicable to harmonic frames as well.
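As a sanity check, a harmonic frame can be built explicitly and verified to be tight; the normalization below is a common convention and may differ from the text's elided one.

```python
import numpy as np

# Harmonic frame: N vectors in C^d with entries exp(2*pi*i*n*k/N)/sqrt(d),
# arranged as the rows of F.  Tightness: F^* F = (N/d) * I.
d, N = 3, 7
n, k = np.meshgrid(np.arange(N), np.arange(d), indexing="ij")
F = np.exp(2j * np.pi * n * k / N) / np.sqrt(d)
gram = F.conj().T @ F               # sum of outer products of the frame vectors
```

Tightness means every signal in $\mathbb{C}^d$ is perfectly recovered from its frame coefficients up to the constant $N/d$.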
3. Proof of Theorem 2.2
First, we claim that scales down to the usual backward-difference matrix under the undersampling matrix :
Lemma 3.1.
Given with ,
where is the dimensional backward difference matrix.
Proof.
Note that, for ,
For , . ∎
The most important estimate in Theorem 2.2 is (6), where we find the reconstruction error under adapted decimation . It can be written as follows: given a signal and the matrix , the reconstruction error satisfies
where the fourth equality follows from Lemma 3.1, and the last factor is the $\infty$-to-2 norm of . Thus, in order to obtain this estimate, we need to answer the following questions:

Is a frame? What is the lower frame bound of ?

What is ?
The first question will be answered positively in Proposition 3.11, and the estimate in the third question is given in Lemma 3.13.
Aside from the reconstruction error estimate, we also need to calculate the bits needed to record the decimated sample , which will be done in Lemma 3.14.
3.1. The Effect of Adapted Decimation on the Frame
We start by introducing the following notation:
Definition 3.2.
Given a constant $c$, the corresponding constant matrix has $c$ in all of its entries.
The following two lemmas are needed for the description in Proposition 3.5.
Lemma 3.3.
Proof.
For any , the th row of can be written as
Now,
where is a diagonal matrix.
Thus,
Thus, . ∎
Lemma 3.4.
where .
Proof.
For any ,
∎
Proposition 3.5.
Given ,
(8) 
3.2. Cancellation Between Residual Terms of
From (8), we can divide into two parts: being the main term, and the rest being residual terms. In this section, we shall investigate the behavior of the residual terms.
Define a double sequence recursively by
Let and . We first examine the form of each before calculating the cancellation between and .
Lemma 3.6.
For any and ,
Proof.
First, it can easily be seen that for all by induction on . Then, by definition and induction on ,
∎
Lemma 3.7.
For and ,
Proof.
We shall prove this by induction on . For and ,
For ,
For and ,
As for ,
where the third equality follows from the fact that . ∎
Proposition 3.8.
For ,