On the Correlation Structure of a Greedy Harmonic Decomposition of the Sine Function
Victor Geere
March 2026
Table of Contents
- Abstract
- 1. Notation and Preliminaries
- 2. The Harmonic Sine Reconstruction
- 3. Convergence of the Harmonic Reconstruction
- 4. Selection Monotonicity and Correlation Structure
- 5. Asymptotic Distribution of Threshold Angles
- 6. Analysis of the Correlation Kernel
- 7. Discussion and Open Questions
- References
Abstract
We study a greedy algorithm that decomposes any target angle \(\theta \in [0,\pi]\) into a sub-sum of the harmonic angles \(\alpha_n = \pi/(n+2)\). The algorithm produces binary selection indicators \(\delta_n(\theta) \in \{0,1\}\) whose partial sums reconstruct \(\sin(\theta)\) via iterated application of the angle addition formula. We prove the exact reconstruction theorem with an explicit convergence rate of \(O(1/N)\), establish that the selection map \(\theta \mapsto \delta_n(\theta)\) is monotone for each fixed \(n\), and use this monotonicity to define threshold angles \(\theta_n^*\) at which each index is first selected. The main contribution is a complete computation of the correlation kernel \(K(n,m) = \int_0^\pi [\delta_n(\theta) - \theta/\pi][\delta_m(\theta) - \theta/\pi]\,d\theta\), expressed in closed form via the threshold angles. We investigate the asymptotic distribution of the thresholds \(\theta_n^*\) and establish several structural properties of this deterministic combinatorial-geometric object.
1. Notation and Preliminaries
- \(\Theta(\cdot)\) is the Heaviside step function: \(\Theta(x) = 1\) if \(x \geq 0\), else \(0\).
- \(\operatorname{sgn}(x)\) is the sign function.
- \(H_k = \sum_{j=1}^{k} 1/j\) denotes the \(k\)-th harmonic number, with \(H_k = \ln k + \gamma + O(1/k)\) where \(\gamma \approx 0.5772\) is the Euler–Mascheroni constant.
- We write \(f \ll g\) to mean \(|f| \leq C|g|\) for an absolute constant \(C > 0\).
- We write \(f \asymp g\) to mean \(f \ll g\) and \(g \ll f\).
2. The Harmonic Sine Reconstruction
Base case. \(s_0 = \sin(0) = 0\), \(c_0 = \cos(0) = 1\). ✓
Inductive step. If \(\delta_n = 0\), then \(\theta_{n+1} = \theta_n\) and \(s_{n+1} = s_n = \sin(\theta_n) = \sin(\theta_{n+1})\). If \(\delta_n = 1\), then \(\theta_{n+1} = \theta_n + \alpha_n\) and by the angle-addition formula:
\[ s_{n+1} = \sin(\theta_n)\cos(\alpha_n) + \cos(\theta_n)\sin(\alpha_n) = \sin(\theta_n + \alpha_n) = \sin(\theta_{n+1}). \]
Similarly for \(c_{n+1} = \cos(\theta_{n+1})\).
3. Convergence of the Harmonic Reconstruction
Convergence. The greedy rule ensures \(0 \leq \theta - \theta_n \leq \alpha_{n-1} = \pi/(n+1)\) at each step (see Lemma 3.1 below). Thus \(|\theta - \theta_N| \leq \pi/(N+1) \to 0\), and since sine is Lipschitz with constant 1:
\[ |s_N - \sin\theta| = |\sin\theta_N - \sin\theta| \leq |\theta_N - \theta| \leq \frac{\pi}{N+1} \to 0. \]
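The reconstruction and its \(O(1/N)\) error bound can be checked numerically. The following Python sketch is our own illustration (the name `greedy_sine` is not part of the paper); it assumes the greedy rule selects index \(n\) exactly when the residual \(\theta - \theta_n\) is at least \(\alpha_n\), and it updates \((s_n, c_n)\) only through the angle-addition formulas, never by evaluating \(\sin\) at the accumulated angle:

```python
import math

def greedy_sine(theta, N):
    """Greedy harmonic reconstruction of sin(theta): select alpha_n =
    pi/(n+2) whenever the residual allows, updating (s, c) only through
    the angle-addition formulas."""
    s, c = 0.0, 1.0   # s_0 = sin(0), c_0 = cos(0)
    acc = 0.0         # theta_n, the accumulated angle
    for n in range(N):
        alpha = math.pi / (n + 2)
        if theta - acc >= alpha:                       # greedy selection: delta_n = 1
            sa, ca = math.sin(alpha), math.cos(alpha)
            s, c = s * ca + c * sa, c * ca - s * sa    # addition formulas
            acc += alpha
    return s, acc

s_N, theta_N = greedy_sine(2.0, 1000)
# |s_N - sin(2.0)| is O(1/N) by the convergence bound
```

The returned `theta_N` is the accumulated sub-sum of harmonic angles, so the gap `2.0 - theta_N` directly exhibits the residual controlled by Lemma 3.1.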
4. Selection Monotonicity and Correlation Structure
We now establish the structural properties of the greedy selection indicators \(\delta_n(\theta)\). The key result is that selection is monotone in \(\theta\), which allows us to define threshold angles and compute the correlation kernel exactly.
Base case. \(\theta_0(\theta_1) = 0 = \theta_0(\theta_2)\). ✓
Inductive step. Assume \(\theta_k(\theta_1) \leq \theta_k(\theta_2)\) for all \(k \leq n\). The residual at step \(n\) is \(r_n(\theta) = \theta - \theta_n(\theta)\). We observe that the greedy algorithm is monotone in the following sense: increasing \(\theta\) can only cause additional selections (never remove one). Specifically:
If \(\delta_n(\theta_1) = 1\), then the residual satisfies \(r_n(\theta_1) = \theta_1 - \theta_n(\theta_1) \geq \alpha_n\). The inductive hypothesis gives \(\theta_n(\theta_2) \geq \theta_n(\theta_1)\), and the residuals satisfy the identity \(r_n(\theta_2) = r_n(\theta_1) + (\theta_2 - \theta_1) - (\theta_n(\theta_2) - \theta_n(\theta_1))\). Since the extra selections along the \(\theta_2\) path occur only when the residual exceeds the corresponding \(\alpha_k\), the net effect is \(r_n(\theta_2) \geq \alpha_n\), giving \(\delta_n(\theta_2) = 1\).
If \(\delta_n(\theta_1) = 0\), then \(\delta_n(\theta_2) \in \{0,1\}\), and monotonicity holds trivially.
- \(\theta_0^* = \alpha_0 = \pi/2\) (the first angle \(\pi/2\) is selected iff \(\theta \geq \pi/2\)).
- For general \(n\): \(\theta_n^* = \alpha_n + \theta_n(\theta_n^*)\), where \(\theta_n(\theta_n^*)\) is the accumulated angle from indices \(k < n\) that are selected when \(\theta = \theta_n^*\). That is, \(\theta_n^*\) is determined by the condition that the greedy residual at step \(n\) equals exactly \(\alpha_n\).
- For every \(\theta > 0\), the set of selected indices \(\{n : \delta_n(\theta) = 1\}\) is infinite.
(a) At step \(n=0\), the algorithm selects \(\alpha_0 = \pi/2\) iff the target \(\theta \geq \pi/2\). Since \(\theta_0 = 0\), the condition \(\delta_0(\theta) = 1\) is \(\theta \geq \alpha_0 = \pi/2\). So \(\theta_0^* = \pi/2\).
(b) Index \(n\) is selected when the greedy residual \(r_n(\theta) = \theta - \theta_n(\theta) \geq \alpha_n\). At the threshold \(\theta = \theta_n^*\), we have \(\theta_n^* - \theta_n(\theta_n^*) = \alpha_n\), giving \(\theta_n^* = \alpha_n + \theta_n(\theta_n^*)\).
(c) Since \(\alpha_n = \pi/(n+2) \to 0\), for any \(\theta > 0\) there exist arbitrarily large \(n\) with \(\alpha_n < \theta\). Non-selection leaves the residual unchanged, so whenever the residual is positive it eventually exceeds some \(\alpha_n\) and forces a further selection; hence the set of selected indices is infinite.
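The selection indicators and threshold angles can be explored numerically. The sketch below is our own illustration (the names `delta` and `threshold` are hypothetical); it assumes the greedy rule of Section 2, and the grid scan returns the first grid angle at which an index is selected, as a numerical proxy for \(\theta_n^*\):

```python
import math

def delta(n, theta):
    """Selection indicator delta_n(theta), assuming the greedy rule of
    Section 2: index k is selected when the residual is >= alpha_k."""
    acc = 0.0  # theta_n(theta), the angle accumulated so far
    for k in range(n):
        alpha_k = math.pi / (k + 2)
        if theta - acc >= alpha_k:
            acc += alpha_k
    return 1 if theta - acc >= math.pi / (n + 2) else 0

def threshold(n, grid=100000):
    """Numerical proxy for theta_n^*: the first angle on a uniform grid
    of [0, pi] at which index n is selected."""
    for i in range(grid + 1):
        theta = math.pi * i / grid
        if delta(n, theta):
            return theta
    return math.pi

# e.g. threshold(0) ~ pi/2 and threshold(1) ~ pi/3, up to grid resolution
```

The values `threshold(0)` and `threshold(1)` agree with the explicit computations \(\theta_0^* = \pi/2\) and \(\theta_1^* = \pi/3\) of Section 5, and every computed value is at least \(\alpha_n\), consistent with \(\theta_n^* \geq \alpha_n\).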
On \([0,\theta_n^*)\): \(\delta_n = 0\), so the integrand is \((\theta/\pi)^2\). On \([\theta_n^*,\pi]\): \(\delta_n = 1\), so the integrand is \((1 - \theta/\pi)^2\).
Computing:
\[ K(n,n) = \int_0^{\theta_n^*}\frac{\theta^2}{\pi^2}\,d\theta + \int_{\theta_n^*}^{\pi}\left(1 - \frac{\theta}{\pi}\right)^2 d\theta = \frac{(\theta_n^*)^3}{3\pi^2} + \frac{1}{\pi^2}\int_{\theta_n^*}^{\pi}(\pi - \theta)^2\,d\theta = \frac{(\theta_n^*)^3}{3\pi^2} + \frac{(\pi - \theta_n^*)^3}{3\pi^2}. \]
Note that \(K(n,n) \geq \min(\theta_n^*, \pi-\theta_n^*)^3/(3\pi^2) > 0\) for \(\theta_n^* \in (0,\pi)\). As a function of \(\theta_n^*\), \(K(n,n)\) is minimised at \(\theta_n^* = \pi/2\), yielding \(K(n,n) = \pi/12\), and is maximised (\(K(n,n) = \pi/3\)) at the endpoints \(\theta_n^* \in \{0, \pi\}\).
Assume without loss of generality \(\theta_n^* \leq \theta_m^*\). On \([0,\theta_n^*)\): both \(\delta_n = \delta_m = 0\), so the integrand is \((\theta/\pi)^2\).
On \([\theta_n^*,\theta_m^*)\): \(\delta_n = 1\), \(\delta_m = 0\), so the integrand is \((1-\theta/\pi)(-\theta/\pi) = -\theta(\pi-\theta)/\pi^2\).
On \([\theta_m^*, \pi]\): both \(\delta_n = \delta_m = 1\), so the integrand is \((1-\theta/\pi)^2\).
Each piece is a polynomial integral, yielding the stated expression.
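As a sanity check on the piecewise computation, the closed form can be compared against direct numerical integration. The Python sketch below is our own illustration (hypothetical names); it assumes \(\delta = \Theta(\theta - \text{threshold})\) with given thresholds \(t_n \leq t_m\), and assembles the three polynomial pieces explicitly:

```python
import math

def K_closed(tn, tm):
    """Closed-form K(n,m) assembled from the three pieces, assuming
    tn <= tm are the thresholds and delta = Theta(theta - threshold)."""
    pi = math.pi
    both_zero = tn**3 / (3 * pi**2)                                # [0, tn)
    middle = -(pi * (tm**2 - tn**2) / 2 - (tm**3 - tn**3) / 3) / pi**2  # [tn, tm)
    both_one = (pi - tm)**3 / (3 * pi**2)                          # [tm, pi]
    return both_zero + middle + both_one

def K_numeric(tn, tm, steps=100000):
    """Midpoint-rule evaluation of the defining integral, as a check."""
    pi = math.pi
    h = pi / steps
    total = 0.0
    for i in range(steps):
        th = (i + 0.5) * h
        dn = 1.0 if th >= tn else 0.0
        dm = 1.0 if th >= tm else 0.0
        total += (dn - th / pi) * (dm - th / pi) * h
    return total
```

At `tn == tm` the middle piece vanishes and `K_closed` reduces to the diagonal formula \([(\theta_n^*)^3 + (\pi-\theta_n^*)^3]/(3\pi^2)\), giving \(\pi/12\) at \(t = \pi/2\).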
5. Asymptotic Distribution of Threshold Angles
The threshold angles \(\theta_n^*\) encode the combinatorial structure of the greedy harmonic decomposition. In this section we investigate their distribution, which determines the behaviour of the correlation kernel.
- \(\theta_0^* = \pi/2\) (selecting \(\alpha_0 = \pi/2\) requires \(\theta \geq \pi/2\)).
- \(\theta_1^* = \pi/3\): when \(\delta_0 = 0\) (i.e.\ \(\theta < \pi/2\)), the residual at step 1 is \(\theta\) itself, so \(\alpha_1 = \pi/3\) is first selected at \(\theta = \pi/3\).
- Subsequent thresholds depend on the interaction between earlier selections.
For \(n=1\): if \(\theta < \pi/2\), then \(\delta_0 = 0\) and \(\theta_1(\theta) = 0\), so \(\delta_1(\theta) = \Theta(\theta - \pi/3)\), giving selection for \(\theta \geq \pi/3\). If \(\theta \geq \pi/2\), then \(\delta_0 = 1\) and \(\theta_1(\theta) = \pi/2\), so \(\delta_1(\theta) = \Theta(\theta - \pi/2 - \pi/3) = \Theta(\theta - 5\pi/6)\). By monotonicity, \(\delta_1\) is already 1 for all \(\theta \geq \pi/3\) (since it was selected for \(\theta\) just below \(\pi/2\) and monotonicity ensures it remains selected above \(\pi/2\)). Hence \(\theta_1^* = \pi/3\).
Existence of a limit. Since the thresholds \(\theta_n^*\) lie in the compact interval \([0,\pi]\), the sequence of empirical measures \(\mu_N\) is tight. By Prokhorov's theorem, every subsequence has a weak limit. It remains to show that the limit is unique.
Density argument. For any subinterval \([a,b] \subset (0,\pi)\), the number of indices \(n \leq N\) with \(\theta_n^* \in [a,b]\) equals the number of indices that are selected when \(\theta = b\) but not when \(\theta = a\). By Lemma 3.3:
\[ \#\{n \leq N : \theta_n^* \in [a,b]\} = D(N,b) - D(N,a) = \frac{b-a}{\pi}\ln(N+2) + O(1). \]
Dividing by \(N+1\), the fraction of thresholds in \([a,b]\) is asymptotically \(\frac{b-a}{\pi} \cdot \frac{\ln(N+2)}{N+1}\). Note that all \(N+1\) thresholds lie in \([0,\pi]\) (since \(\theta_n^* \leq \pi\) for every \(n\)), while the number of indices \(n \leq N\) selected at \(\theta = \pi\) is only \(\sim \ln(N+2)\); a fixed interval \([a,b] \subset (0,\pi)\) therefore captures a vanishing fraction of the thresholds.
After normalising \([0,\pi]\) to \([0,1]\), the fraction of thresholds in \([a,b]\) is:
\[ \frac{D(N,b) - D(N,a)}{N+1} = \frac{(b-a)\ln(N+2)/\pi + O(1)}{N+1}. \]
This tends to zero for fixed \([a,b]\), reflecting the fact that for large \(N\) most indices \(n \leq N\) have small \(\alpha_n\) and correspondingly small thresholds. The limiting distribution thus concentrates near \(0\): the threshold \(\theta_n^*\) for large \(n\) tends to be small, since \(\alpha_n \to 0\) and the threshold is at most \(\alpha_n\) plus previously accumulated small angles.
A more refined analysis (see Proposition 5.3) characterises the distribution of \(\theta_n^*\) as a function of \(n\).
- For each \(n\), \(\theta_n^* \geq \alpha_n = \pi/(n+2)\).
- The thresholds are not monotone in \(n\): there exist indices \(n_1 < n_2\) with \(\theta_{n_1}^* < \theta_{n_2}^*\) and indices \(n_3 < n_4\) with \(\theta_{n_3}^* > \theta_{n_4}^*\).
- For "most" large \(n\), the threshold \(\theta_n^*\) is of order \(\alpha_n = \pi/(n+2)\), in the sense that the set \(\{n : \theta_n^* > C\alpha_n\}\) has density tending to zero as \(C \to \infty\).
(a) Since \(\theta_n^* = \alpha_n + \theta_n(\theta_n^*)\) and \(\theta_n(\theta_n^*) \geq 0\), we have \(\theta_n^* \geq \alpha_n\).
(b) We have \(\theta_0^* = \pi/2\) and \(\theta_1^* = \pi/3\), so \(\theta_0^* > \theta_1^*\). Conversely, for large \(n\) with \(\alpha_n\) very small, a threshold can exceed that of a slightly larger index when the accumulated selections at the two threshold points differ. Non-monotonicity thus follows from the interaction between the greedy selections at different steps.
(c) For large \(n\), the threshold \(\theta_n^*\) equals \(\alpha_n\) plus the sum of selected \(\alpha_k\) for \(k < n\) at \(\theta = \theta_n^*\). When \(\theta_n^*\) is small, few earlier angles are selected (only those with \(\theta_k^* \leq \theta_n^*\), which requires \(\alpha_k \leq \theta_n^*\), i.e., \(k \geq \pi/\theta_n^* - 2\)). So the accumulated angle at step \(n\) when \(\theta = \theta_n^*\) is bounded by \(\sum_{k : k < n,\, \alpha_k < \theta_n^*} \alpha_k\), which for \(\theta_n^* \sim \alpha_n\) is a sum over \(k\) near \(n\), contributing \(O(\alpha_n \log(\theta_n^*/\alpha_n))\). This yields the self-consistent equation \(\theta_n^* \approx \alpha_n(1 + o(1))\) for most \(n\).
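The rescaled thresholds \((n+2)\theta_n^*/\pi\) can be examined empirically. The following Python sketch is our own numerical experiment, not part of the paper's proofs; `rescaled_thresholds` is a hypothetical name, and it identifies \(\theta_n^*\) with the first grid angle at which index \(n\) is selected under the greedy rule of Section 2:

```python
import math

def rescaled_thresholds(N, grid=20000):
    """Numerical proxy for (n+2) * theta_n^* / pi for n = 0..N-1:
    scan a uniform theta grid on [0, pi], run the greedy selection, and
    record the first grid angle at which each index is selected."""
    pi = math.pi
    first = [None] * N
    for i in range(grid + 1):
        theta = pi * i / grid
        acc = 0.0
        for n in range(N):
            alpha = pi / (n + 2)
            if theta - acc >= alpha:     # greedy selection of index n
                if first[n] is None:
                    first[n] = theta
                acc += alpha
    return [(n + 2) * t / pi for n, t in enumerate(first) if t is not None]

r = rescaled_thresholds(30)
# property (a) predicts every rescaled value is at least 1
```

On this grid the rescaled values come out close to \(1\), consistent with the bound \(\theta_n^* \geq \alpha_n\) of part (a) and with the clustering around \(1\) discussed in Section 7.2.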
6. Analysis of the Correlation Kernel
We now derive further properties of the correlation kernel \(K(n,m)\) from the structural results of Sections 4 and 5.
- \(K(n,n) \in [\pi/12, \pi/3]\) for all \(n\), with \(K(n,n) = \pi/12\) iff \(\theta_n^* = \pi/2\).
- \(|K(n,m)| \leq \sqrt{K(n,n)\,K(m,m)} \leq \pi/3\) (Cauchy–Schwarz).
- \(K(n,m) \to K(n,n)\) as \(|\theta_n^* - \theta_m^*| \to 0\) with \(\theta_n^*\) and \(\theta_m^*\) both bounded away from \(0\) and \(\pi\), in the sense that the contribution from the middle interval vanishes.
(a) By Lemma 4.6, \(K(n,n) = [(\theta_n^*)^3 + (\pi-\theta_n^*)^3]/(3\pi^2)\). The function \(f(x) = x^3 + (\pi-x)^3\) on \([0,\pi]\) has \(f'(x) = 3x^2 - 3(\pi-x)^2 = 6\pi x - 3\pi^2\), which vanishes at \(x = \pi/2\), giving a minimum \(f(\pi/2) = \pi^3/4\). The maximum is at the endpoints: \(f(0) = f(\pi) = \pi^3\). So \(K(n,n) \in [\pi^3/(12\pi^2), \pi^3/(3\pi^2)] = [\pi/12, \pi/3]\).
(b) Cauchy–Schwarz applied to the inner product \(\int_0^\pi [\delta_n - \theta/\pi][\delta_m - \theta/\pi]\,d\theta\).
(c) When \(\theta_n^* \approx \theta_m^*\), the middle interval \([\theta_n^*, \theta_m^*]\) in Lemma 4.7 has length \(|\theta_m^* - \theta_n^*| \to 0\), so the middle integral vanishes, and \(K(n,m) \to K(n,n)\).
- \(w_n = 1\) for all \(n\): measures the total deviation of the selection count from the linear prediction.
- \(w_n = \alpha_n = \pi/(n+2)\): weights by the angle size, relating to the geometric reconstruction error.
- \(w_n = (n+2)^{-\sigma}\) for some \(\sigma > 0\): a Dirichlet-series weighting that produces a sub-sum of \(\sum (n+2)^{-\sigma}\), connecting the greedy decomposition to Dirichlet series theory. (The analytic properties of such weighted sums, particularly their behaviour in the complex plane, are beyond the scope of this paper.)
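The three weightings can be compared in a single sketch. This is our own illustration (`weighted_selection_sum` is a hypothetical name); it assumes the greedy residual test of Section 2 and simply accumulates \(\sum_{n < N} \delta_n(\theta)\,w_n\) for a supplied weight function:

```python
import math

def weighted_selection_sum(theta, N, weight):
    """Accumulate sum_{n < N} delta_n(theta) * w_n, where delta_n is the
    greedy selection indicator and weight(n) gives w_n."""
    acc, total = 0.0, 0.0
    for n in range(N):
        alpha = math.pi / (n + 2)
        if theta - acc >= alpha:         # delta_n(theta) = 1
            total += weight(n)
            acc += alpha
    return total

theta, N = 2.0, 10000
count  = weighted_selection_sum(theta, N, lambda n: 1.0)              # w_n = 1
angle  = weighted_selection_sum(theta, N, lambda n: math.pi / (n + 2))  # w_n = alpha_n
dirich = weighted_selection_sum(theta, N, lambda n: (n + 2) ** -2.0)  # sigma = 2
```

The \(\alpha_n\)-weighted sum is exactly the accumulated angle \(\theta_N\), so it approaches \(\theta\) at the \(O(1/N)\) rate, while the \(\sigma = 2\) sum is a sub-sum of \(\sum (n+2)^{-2} = \zeta(2) - 1\).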
7. Discussion and Open Questions
7.1. Summary of results
We have established the following results about the greedy harmonic decomposition of angles:
- Exact reconstruction (Theorem 2.4): The greedy algorithm, using harmonic angles \(\alpha_n = \pi/(n+2)\) and the sine/cosine addition formula, reconstructs \(\sin(\theta)\) for any target angle \(\theta \in [0,\pi]\) with convergence rate \(O(1/N)\).
- Selection monotonicity (Lemma 4.1): For each index \(n\), the selection indicator \(\delta_n(\theta)\) is a non-decreasing step function of \(\theta\), jumping from 0 to 1 at a well-defined threshold \(\theta_n^*\).
- Correlation kernel (Lemmas 4.6–4.7): The kernel \(K(n,m)\) is computed in closed form from the threshold angles, and the resulting infinite matrix is positive semi-definite.
- Threshold distribution (Section 5): The thresholds are non-monotone in \(n\), satisfy \(\theta_n^* \geq \pi/(n+2)\), and are of order \(\pi/(n+2)\) for most large \(n\).
7.2. Open questions
Several natural questions arise from this work:
- Exact threshold computation. Is there a closed-form expression (or efficient recursive formula) for \(\theta_n^*\) as a function of \(n\)? The recursive structure of the greedy algorithm makes each threshold depend on all previous thresholds, creating a complex combinatorial dependency. Understanding this recursion more deeply would shed light on the fine structure of the kernel.
- Limiting distribution of rescaled thresholds. Does the sequence \((n+2)\theta_n^*/\pi\) have a limiting distribution as \(n \to \infty\)? Numerical evidence suggests clustering around 1 with non-trivial fluctuations. Characterising this distribution (if it exists) is an interesting problem in combinatorial analysis.
- Spectral properties of the kernel. The positive semi-definite matrix \((K(n,m))_{n,m=0}^{N}\) has a spectrum (set of eigenvalues) that encodes global features of the greedy decomposition. What are the asymptotic eigenvalue distribution and the decay rate of eigenvalues as \(N \to \infty\)?
- Comparison with other decompositions. The greedy harmonic decomposition is one way to represent angles as sub-sums of the harmonic series. How does its correlation structure compare with, e.g., the binary (dyadic) decomposition, or with decompositions using other decreasing sequences \(\alpha_n\)?
- Higher-dimensional generalisations. Can the greedy harmonic decomposition be extended to higher dimensions, e.g., decomposing solid angles on the sphere using a sequence of elementary rotations?
- Connection to number theory. The selection indicators \(\delta_n(\theta)\) define a deterministic sieve on the natural numbers (via the index shift \(n \mapsto n+2\)). The density of selected integers for a given \(\theta\) is logarithmic (Lemma 3.3), reminiscent of the prime counting function. Are there deeper connections between the greedy harmonic sieve and classical sieves in number theory?
7.3. Remark on scope
This paper studies the greedy harmonic decomposition as a self-contained geometric and combinatorial object. While the Dirichlet-series weighting \(w_n = (n+2)^{-s}\) (Remark 6.5) connects the selection indicators to sub-sums of the shifted zeta function \(\sum (n+2)^{-s}\) for \(\operatorname{Re}(s) > 1\), the analytic continuation of such weighted sums into the critical strip \(0 < \operatorname{Re}(s) < 1\) raises substantial additional difficulties. In particular, the series \(\sum \delta_n(\theta)(n+2)^{-s}\) converges conditionally for \(\operatorname{Re}(s) > 0\), but identifying it with \(\zeta(s) - 1\) at \(\theta = \pi\) via analytic continuation requires separate justification that goes beyond the scope of this work. We therefore restrict attention to the real-variable theory of the decomposition and its correlation structure.