This tutorial covers the interpolation procedures within AutoSignal that retain frequency domain information. The two prediction algorithms within AutoSignal are also covered.
**Generating A Test Signal For Interpolation**
Select the Generate Signal option in the Edit menu or Main toolbar.
For this tutorial, we will create a data stream consisting of three very closely spaced sinusoids and white noise. The center sinusoid is reduced in power so that distortion in the frequency domain representation of interpolated data is likely to cause the loss of this spectral feature.
Click Read and select the file tutor8a.sig from the Signals subdirectory.
The following signal expression is imported:
SRATE=5000
NYQ=SRATE/2
AMP1=1
AMP2=0.5
AMP3=1
FREQ1=NYQ*0.1
FREQ2=NYQ*0.11
FREQ3=NYQ*0.12
PHASE1=PI/2
PHASE2=PI
PHASE3=3*PI/2
F1=AMP1*SIN(2*PI*X*FREQ1+PHASE1)
F2=AMP2*SIN(2*PI*X*FREQ2+PHASE2)
F3=AMP3*SIN(2*PI*X*FREQ3+PHASE3)
Y=F1+F2+F3
The X (time) values vary from 0 to 0.051 with a 0.0002 sample increment. The Nyquist frequency is 2500 (half the sampling rate). The first spectral peak is at frequency 250. The second peak is at frequency 275 and contains one-quarter the power of the first. The final peak is at frequency 300 and contains the same power as the initial peak.
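The time grid and peak positions described above can be checked with a short Python sketch. This is not AutoSignal code: the variable names simply mirror the signal expression, and the white noise is left out because its level is not part of the expression shown.

```python
import math

SRATE = 5000.0
NYQ = SRATE / 2
N = 256

amps = [1.0, 0.5, 1.0]
freqs = [NYQ * 0.1, NYQ * 0.11, NYQ * 0.12]
phases = [math.pi / 2, math.pi, 3 * math.pi / 2]

# Time axis: 256 samples spaced 1/SRATE = 0.0002 apart
x = [n / SRATE for n in range(N)]

# Noise-free sum of the three sinusoids
y = [sum(a * math.sin(2 * math.pi * t * f + p)
         for a, f, p in zip(amps, freqs, phases))
     for t in x]

# The power of a sinusoid is proportional to its amplitude squared
power_ratio = (amps[1] / amps[0]) ** 2

print(freqs, x[-1], power_ratio)
```

The last time value is 255 × 0.0002 = 0.051, and the half-amplitude center sinusoid carries one-quarter of the power of its neighbors.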
Click OK to process the current signal.
An AutoSignal graph is presented containing the 256 point generated data. While the two Fourier-based interpolation procedures are normally safe, this example illustrates an instance where this is not the case.
Click OK to accept the generated data. Click Yes when asked to update the main data table with the revised data.
**Fourier Spectrum**
Select the Fourier Spectrum option from the Spectral menu or toolbar. Use the Best Exact N algorithm with Nmin set to 8192. Change the plot to dB, and set the signal count in the sig field to 3. Zoom in on the peaks region using the left mouse button.
This is not an easy data set to spectrally analyze. The frequencies of the three sinusoids span only 2% of the Nyquist range and the data sequence length is only 256. The resolution of Fourier methods is insufficient to map the three components. To do so, we will use the best AR (Autoregressive) spectral algorithm for resolving sinusoids in noise.
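A rough back-of-envelope check shows why the Fourier spectrum struggles here. The sketch below assumes an untapered (rectangular) data window, for which the main lobe of a sinusoid is 2·SRATE/N wide from null to null; the window actually applied in your AutoSignal session may differ.

```python
SRATE = 5000.0
N = 256

bin_spacing = SRATE / N             # spacing of the N-point DFT frequency grid
main_lobe_width = 2 * SRATE / N     # null-to-null main lobe, rectangular window assumed
peak_separation = 275.0 - 250.0     # distance between adjacent sinusoids

lobes_overlap = main_lobe_width > peak_separation
print(bin_spacing, main_lobe_width, peak_separation, lobes_overlap)
```

The grid spacing is 19.53125 and the main lobe is 39.0625 wide, while adjacent sinusoids are only 25 apart, so the lobes of neighboring peaks overlap and the weaker center peak is easily buried.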
Click OK to close the Fourier spectral dialog.
**AR Frequency Spectrum**
Select the AR (AutoRegressive) Spectrum option from the Spectral menu or toolbar. Be sure the Data Svd FB algorithm is selected and set the model order to 80 and the Signal Subspace to 6. Be sure Full Range and Adaptive n are checked and that a dB plot format is selected. Zoom in on the three spectral peaks.
The three spectral peaks are properly identified using this high resolution algorithm.
Click the Graphically Select Signal and Noise Subspaces button.
The singular values give a clear indication of six signal-bearing eigenmodes, confirming that there are only three spectral peaks present. The remaining eigenmodes capture only noise. A good interpolation procedure must preserve the spectral properties of the data. The interpolation methods that follow will be evaluated to see if they yield only three spectral features at these three original frequencies.
Leave the signal subspace set to 6. Click OK to close the singular values dialog.
Click OK to close the AR spectral dialog.
**Fourier Upsampling**
Select the Fourier Upsampling option in the Process menu or toolbar. Enter 5 for the upsampling Factor. Note that 1276 data points are in the output stream.
Click OK to close the upsampling procedure and answer Yes to updating the data table.
This zero-insertion FFT procedure is fast, but is limited to integer upsampling ratios. When data are upsampled by a factor of 5, the Nyquist frequency also increases by a factor of 5. In this algorithm, all information beyond the original Nyquist frequency is automatically zeroed. This creates a discontinuity in the frequency spectrum at the original Nyquist frequency. The zero-insertion algorithm is probably the most widely used upsampling procedure.
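The idea behind zero-insertion upsampling can be sketched in plain Python. This is a generic illustration, not AutoSignal's implementation: `dft` and `upsample_zero_insertion` are hypothetical helper names, a slow O(N²) transform stands in for the FFT, and the demo uses a short 16-point signal rather than the tutorial data.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t, v in enumerate(values))
        out.append(s / n if inverse else s)
    return out

def upsample_zero_insertion(y, factor):
    """Insert zeros between samples, then zero every bin at or above the
    original Nyquist frequency and transform back."""
    n = len(y)
    stuffed = [0.0] * (n * factor)
    stuffed[::factor] = y                    # original samples, zeros in between
    spec = dft(stuffed)
    m = len(spec)
    for k in range(m):
        if min(k, m - k) >= n // 2:          # distance of bin k from DC
            spec[k] = 0.0
    # Scale by `factor` to compensate for the inserted zeros
    return [factor * v.real for v in dft(spec, inverse=True)]

# Short demo: 16 samples holding 2 and 3 full cycles
N, FACTOR = 16, 5
y = [math.sin(2 * math.pi * 2 * n / N) + 0.5 * math.cos(2 * math.pi * 3 * n / N)
     for n in range(N)]
up = upsample_zero_insertion(y, FACTOR)
print(len(up))
```

Every fifth output value reproduces an original sample, and the values in between follow the band-limited curve. Note that this generic version returns N × factor points (1280 for a 256-point record), whereas AutoSignal reports 1276 = (256 − 1) × 5 + 1, so its endpoint handling evidently differs from this sketch.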
Again select the Fourier Spectrum option. Using the dropdown box, set Nmin to 1276, the new data size.
Note that the Fourier spectrum is now considerably more complicated. Noise exists only up to the original Nyquist limit. A transition to noise-free frequencies follows. The discontinuity is apparent.
Set Nmin to 8192 and zoom in on the peaks region.
This is the classic case where no amount of interpolation can help resolve the third (low-power) peak. The additional data elements arising from the interpolation go into extending the Nyquist limit, not into adding resolution. The zero-padding interpolates the existing spectrum, but cannot generate a higher resolution spectrum. The only way to get the resolution necessary to resolve this third peak is to sample a longer data record.
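The arithmetic behind this point is short enough to check directly. The sketch below assumes the usual rule of thumb that the achievable Fourier resolution is roughly the reciprocal of the time spanned by the record.

```python
SRATE, N = 5000.0, 256
FACTOR = 5

N_UP = (N - 1) * FACTOR + 1          # 1276 points, as reported after upsampling
SRATE_UP = SRATE * FACTOR            # sample rate after upsampling

span = (N - 1) / SRATE               # time spanned by the original samples
span_up = (N_UP - 1) / SRATE_UP      # time spanned by the upsampled samples

print(SRATE / 2, SRATE_UP / 2)       # Nyquist limit before and after
print(span, span_up)                 # record span before and after
print(1 / span, 1 / span_up)         # approximate resolution before and after
```

The Nyquist limit rises from 2500 to 12500, but the record still spans 0.051, so the reciprocal of the span (about 19.6) is unchanged and remains close to the 25 spacing between adjacent peaks.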
Click OK to close the Fourier spectral dialog.
Select the AR (AutoRegressive) Spectrum option.
Click the Graphically Select Signal and Noise Subspaces button.
There is now evidence of only the two higher power components. The lower power peak is not mapped by the AR spectrum. Further, the two higher power peaks are not as sharp and the singular value plot is impacted by the frequency domain discontinuity. The AR spectrum does not work as well because the signal becomes more complicated as a consequence of the interpolation. The important point is that the spectrum is degraded by this technique.
Click OK to close the singular values dialog.
Click OK to close the AR spectral dialog.
Click the Reset XY Data button in the main toolbar to undo the upsampling and restore the original data.
**Fourier Interpolation**
Select the Fourier Interpolation option in the Process menu or toolbar. To compare this algorithm with the zero-insertion interpolation, enter 1276 for the output length n. Be sure the Low Pass post-filter box is not checked.
This procedure offers thresholding, both numerical and graphical, as well as the ability to reconstruct the Fourier basis functions or their derivatives. Since this procedure constructs the interpolation directly from the phase-bearing sine basis functions, you can set any interpolation density desired. Further, the expanded Nyquist range can be estimated or the same low pass filter at the original Nyquist limit can be implemented.
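Building the interpolation directly from the sinusoidal basis functions can be sketched as follows. This is a minimal illustration under stated assumptions, not AutoSignal's procedure: `dft` and `fourier_interpolate` are hypothetical names, no thresholding or post-filter is applied, and the output positions are assumed to run from the first sample to the last.

```python
import cmath
import math

def dft(values):
    """Naive O(N^2) forward discrete Fourier transform."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(values))
            for k in range(n)]

def fourier_interpolate(y, n_out):
    """Evaluate the DFT sinusoids of y at n_out evenly spaced positions
    running from the first sample to the last sample."""
    n = len(y)
    spec = dft(y)
    out = []
    for m in range(n_out):
        pos = m * (n - 1) / (n_out - 1)              # fractional sample index
        total = 0j
        for k, coef in enumerate(spec):
            signed_k = k if k <= n // 2 else k - n   # upper bins are negative frequencies
            total += coef * cmath.exp(2j * math.pi * signed_k * pos / n)
        out.append(total.real / n)
    return out

# Short demo: 16 samples interpolated to 76 = (16 - 1) * 5 + 1 points
N = 16
y = [math.sin(2 * math.pi * 2 * n / N) + 0.5 * math.cos(2 * math.pi * 3 * n / N)
     for n in range(N)]
dense = fourier_interpolate(y, 76)
odd_density = fourier_interpolate(y, 100)            # any output length is allowed
print(len(dense), len(odd_density))
```

Because the sinusoids are evaluated at arbitrary positions, the output length is not tied to an integer multiple of the input length. With 256 input points and an output length of 1276, the same formula gives positions spaced exactly one-fifth of a sample apart, which is why 1276 matches the factor-of-5 upsampling above.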
Click OK to close the Fourier interpolation procedure and answer Yes to updating the data table.
Again select the Fourier Spectrum option. Using the dropdown box, set Nmin to 1276, the new data size.
Although the interpolated data are generated from the Fourier basis functions and no high frequency filtration occurs, the frequency spectrum contains the same discontinuity.
Click OK to close the Fourier spectral dialog.
Select the AR (AutoRegressive) Spectrum option.
The same effect is observed with this type of interpolation. The peaks are not as sharp and the low power peak is no longer resolved.
Click the Graphically Select Signal and Noise Subspaces button.
The same eigenvalue plot is also observed.
As with the zero-insertion method, a degradation in the spectrum has occurred with the Fourier parametric interpolation. Although this example represents an extreme case, it does illustrate that even the best interpolation procedures which preserve spectral properties are not free of distortion.
Click OK to close the singular values dialog.
Click OK to close the AR spectral dialog.
**Generating A Test Signal For Exploring Prediction Algorithms**
Closely related to interpolation, and carrying a far greater number of caveats, are the prediction algorithms in AutoSignal. There are two procedures which can be used for prediction and forecasting. The first is based on AR modeling, and the second relies on fitting a multiple harmonic model to the data. In order to explore both, we will create a data series with underlying harmonic oscillations and a great deal of noise.
Select the Generate Signal option in the Edit menu or Main toolbar.
Click Read and select the file tutor8b.sig from the Signals subdirectory.
The following signal expression is imported:
SRATE=5000
NYQ=SRATE/2
AMP1=1.2
AMP2=0.8
AMP3=1.0
AMP4=1.2
FREQ1=NYQ*0.02
FREQ2=NYQ*0.05
FREQ3=NYQ*0.08
FREQ4=NYQ*0.12
PHASE1=0
PHASE2=PI/2
PHASE3=PI
PHASE4=3*PI/2
F1=AMP1*SIN(2*PI*X*FREQ1+PHASE1)
F2=AMP2*SIN(2*PI*X*FREQ2+PHASE2)
F3=AMP3*SIN(2*PI*X*FREQ3+PHASE3)
F4=AMP4*SIN(2*PI*X*FREQ4+PHASE4)
Y=F1+F2+F3+F4
The X (time) values vary from 0 to 0.051 with a 0.0002 sample increment. The Nyquist frequency is 2500 (half the sampling rate). The four underlying harmonic components are at frequencies 50, 125, 200, and 300.
Click OK to process the current signal.
An AutoSignal graph is presented containing the 256 point generated data. We will truncate the data used for prediction at the minimum value in the series and see how well the prediction algorithms can forecast the remaining 71 data values.
Click OK to accept the generated data. Click Yes when asked to update the main data table with the revised data.
**AR Linear Prediction**
An AR (Autoregressive) model forecasts future values by a linear relationship with some number of prior values. It can also be used to back forecast prior values by a linear relationship with some number of subsequent values. In this example, we will only explore the prediction of future values.
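To see why a linear relationship with prior values can forecast oscillatory data, consider the idealized noise-free case. A single sampled sinusoid obeys y[n] = 2·cos(w)·y[n-1] − y[n-2], and a sum of four sinusoids obeys the order 8 recursion obtained by multiplying four such factors together. The Python sketch below builds those coefficients from the known tutor8b frequencies and forecasts the last 71 samples from the first 185 (the samples through x = 0.0368). It assumes known frequencies and no noise; AutoSignal instead estimates the coefficients from the noisy data, and that estimation step is not reproduced here.

```python
import math

SRATE = 5000.0
FREQS = [50.0, 125.0, 200.0, 300.0]
AMPS = [1.2, 0.8, 1.0, 1.2]
PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

def signal(n):
    """Noise-free tutor8b signal at sample index n."""
    t = n / SRATE
    return sum(a * math.sin(2 * math.pi * f * t + p)
               for a, f, p in zip(AMPS, FREQS, PHASES))

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Each sinusoid contributes the factor 1 - 2*cos(w)*z^-1 + z^-2
poly = [1.0]
for f in FREQS:
    w = 2 * math.pi * f / SRATE
    poly = poly_mul(poly, [1.0, -2 * math.cos(w), 1.0])

# Rearranged so that y[n] = sum(coeffs[k] * y[n-1-k])
coeffs = [-c for c in poly[1:]]

known = [signal(n) for n in range(185)]      # samples through x = 0.0368
history = list(known)
for _ in range(71):                          # forecast the remaining samples
    history.append(sum(c * history[-1 - k] for k, c in enumerate(coeffs)))

predicted = history[185:]
actual = [signal(n) for n in range(185, 256)]
max_err = max(abs(p - a) for p, a in zip(predicted, actual))
print(len(coeffs), len(predicted), max_err)
```

In this idealized setting the forecast tracks the true signal closely over all 71 points. With noisy data the coefficients must be estimated, which is where the model order and the signal subspace selection in the steps that follow come into play.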
Select the AR Linear Prediction option in the Time menu or toolbar. Select the Data Svd Fwd algorithm. Be sure the Single Order box is checked and set the model order to 60.
Click the Graphically Select Signal and Noise Subspaces button and select eigenmode 8 as the last signal-bearing eigenmode.
Be sure Stabilization is set to None, and that x start is set to 0. Enter 0.0368 in the x end field. Set the Predicted Points n to 71.
The Autoregressive model fits the data through the 0.0368 minimum. The remaining 71 values serve as a reference for how reliable the prediction is. The prediction is surprisingly accurate as a consequence of the in-situ noise removal offered by the SVD (singular value decomposition) thresholding.
It is important that the quality of the AR model fit be assessed in order to know whether or not the prediction is likely to have any validity.
**AR Fit Residuals**
Click on the View Residuals button. Be sure the SNP option is selected. It is the second from the end of the toolbar.
The stabilized normal probability plot should be similar to the following graph where the residuals are confirmed as being normally distributed. The blue line is a 90% critical limit. In 1 of 10 random Gaussian noise sets, a single SNP point will reach this 90% critical limit. The green line is a 95% limit, the yellow line a 99% limit, and the red line is a 99.9% limit.
This is an important check on the validity of the AR fit. If the residuals lack a normal distribution, there is a good chance the AR model order is insufficient to map the oscillations in the data.
Close the Residuals window.
**Goodness Of Fit For The AR Model**
Note the r² value in the informational field. The value should be approximately 0.90. Although this is not a superb goodness of fit, it is very good given the measure of random noise added to this data set.
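For reference, the conventional coefficient of determination can be computed as shown below. That AutoSignal's r² follows this exact definition is an assumption, since the tutorial does not define it; the function name and the toy values are purely illustrative.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))   # residual sum of squares
    sst = sum((o - mean) ** 2 for o in observed)                # total sum of squares
    return 1.0 - sse / sst

# Toy values, not AutoSignal output
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.9, 5.0]
r2 = r_squared(observed, fitted)
print(r2)
```

Under this definition, random noise in the observations inflates the residual sum of squares even when the model captures the underlying oscillations perfectly, which is why a noisy data set caps the attainable r².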
Select the Apply Eigendecomposition and Reconstruction to Local Copy of Data option. Be sure the CovM FB algorithm is selected and set the Eigendecomposition matrix Order to 60. Using the left mouse button, enclose eigenmodes 1 through 8.
Click OK to implement the eigenfiltering.
With the noise removed, the AR fit should now show an r² value in excess of 0.99. When you are uncertain whether a weak AR goodness of fit is due to the presence of noise or due to data that cannot be adequately modeled using AR methods, eigendecomposition denoising is an excellent way to see the quality of the AR fit when most observation noise is removed. Be very cautious of using predictions from AR model fits with low goodness of fit values.
Click the Reset button to restore the data to its initial state. Since the reset also restores the full data range, again enter 0.0368 in the x end field.
**Complex Roots**
Click the Plot Roots button.
AR roots on or very near the unit circle are usually indicative of signal content, whereas those within the interior are typically associated with noise. Here you will note that there are eight roots very close to the unit circle.
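The count of eight follows from the fact that each real sinusoid corresponds to a conjugate pair of roots, so four sinusoids give eight roots. The Python sketch below shows the idealized noise-free positions for the tutor8b frequencies and how a root angle maps back to a frequency. Roots estimated from noisy data will scatter around these positions.

```python
import cmath
import math

SRATE = 5000.0
FREQS = [50.0, 125.0, 200.0, 300.0]

roots = []
for f in FREQS:
    w = 2 * math.pi * f / SRATE                          # radians per sample
    roots += [cmath.exp(1j * w), cmath.exp(-1j * w)]     # conjugate pair for one sinusoid

magnitudes = [abs(z) for z in roots]

# The angle of each upper-half-plane root maps back to a frequency
recovered = [cmath.phase(z) * SRATE / (2 * math.pi) for z in roots if z.imag > 0]
print(len(roots), magnitudes, recovered)
```

The same pairing explains the six signal-bearing eigenmodes seen earlier for the three-sinusoid data set.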
Click the Magnitude option.
The eight complex roots closest to the unit circle are even more evident in a magnitude plot. In this instance, two of the roots are slightly outside the unit circle. The results you see may be different because of the random noise component. We will now stabilize the AR model by moving these roots to the unit circle.
Close the Complex Roots option. Select the Unit Circle Out stabilization.
If there are roots outside the unit circle in your data set, you should observe a subtle change in the predicted values.
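The reason for stabilizing is that a root of magnitude r contributes a term that scales as r to the power n after n predicted points, so even a root slightly outside the unit circle produces a growing oscillation. The sketch below illustrates one plausible reading of this stabilization: roots outside the circle are pulled back onto it while keeping their angle. The function name, the angle-preserving choice, and the example roots are assumptions for illustration, not AutoSignal's documented behavior.

```python
import cmath

def stabilize_unit_circle_out(roots):
    """Hypothetical helper: move roots lying outside the unit circle onto it,
    preserving each root's angle; interior roots are left unchanged."""
    return [z / abs(z) if abs(z) > 1.0 else z for z in roots]

# Illustrative roots: one slightly outside the circle, one well inside
roots = [1.01 * cmath.exp(0.3j), 0.6 * cmath.exp(1.1j)]
stable = stabilize_unit_circle_out(roots)

# Growth of an unstabilized root of magnitude 1.01 over 71 predicted points
growth = 1.01 ** 71
print([abs(z) for z in stable], growth)
```

A magnitude of only 1.01 roughly doubles the amplitude of its component over 71 points, which is why the change in the predicted values can be visible even when the roots are only slightly outside the circle.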
**Multiple AR Orders**
AutoSignal offers the means to readily assess the predictions from a range of model orders.
Uncheck the Single Order box. Enter 40 in the min field and 80 in the max field. Be sure the increment inc is set to 5.
Note that the accuracy of the predictions persists for a greater duration when a higher order is used.
Click the Display as 3D Plot button. Be sure Contour is selected.
Only the predicted region is plotted. Note that the predictions are stable at approximately orders 55 and higher.
Close the 3D View window. Check the Single Order box.
Close the Linear Prediction procedure. Answer No when asked to update the main data table.
**Parametric Prediction Using Harmonic Models**
An alternative to AR modeling involves the use of harmonic or sinusoidal models to map the trends in a data sequence. An FFT equal to the length of a data series can perfectly reproduce the data using phase-bearing sinusoid basis functions. When a subset of this number of sinusoids is optimized for the frequencies, amplitudes, and phases, a harmonic parametric model that approximates the data may be possible.
Select the Parametric Interpolation and Prediction option in the Process menu or toolbar. Set the algorithm to MUSIC and the model order to 60. Set the Signal Subspace to 8. Be sure the sinusoidal model is Undamped and that NL Optimization Enable is checked. Leave x start at 0 and set x end to 0.0368. For the output, set n to 256, leave x start at 0, and set x end to 0.051.
Click on the Set Confidence/Prediction Intervals button. Be sure Prediction Intervals is checked and that a 95% Confidence is selected. Click OK to close the Intervals dialog.
These settings duplicate the prediction that was made in the AR Linear Prediction option. The MUSIC algorithm very accurately identifies the four sinusoidal components that were used to generate the data. These frequencies, and the amplitudes and phases from a four-component linear sinusoidal fit, are then refined using AutoSignal's non-linear optimization procedure. The parametric model is then used to construct output data which can include any measure of prediction (extrapolation) desired.
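The linear part of this procedure can be sketched in Python. Once the frequencies are fixed, each component A·sin(wt + p) can be written as a·sin(wt) + b·cos(wt), which is linear in a and b, so ordinary least squares recovers the amplitudes and phases. The sketch assumes the four frequencies are already known and the data are noise-free; the MUSIC frequency estimation and the non-linear refinement are not reproduced, and all function names are hypothetical.

```python
import math

SRATE = 5000.0
FREQS = [50.0, 125.0, 200.0, 300.0]              # assumed already identified
AMPS = [1.2, 0.8, 1.0, 1.2]
PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

def truth(t):
    return sum(a * math.sin(2 * math.pi * f * t + p)
               for a, f, p in zip(AMPS, FREQS, PHASES))

def basis(t):
    """One sine and one cosine term per frequency."""
    row = []
    for f in FREQS:
        row += [math.sin(2 * math.pi * f * t), math.cos(2 * math.pi * f * t)]
    return row

def solve(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Fit on the first 185 samples (x = 0 through 0.0368) via the normal equations
times = [n / SRATE for n in range(185)]
rows = [basis(t) for t in times]
data = [truth(t) for t in times]
size = 2 * len(FREQS)
ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
atb = [sum(r[i] * d for r, d in zip(rows, data)) for i in range(size)]
coef = solve(ata, atb)

# Amplitude of each component from its sine/cosine pair
fit_amps = [math.hypot(coef[2 * i], coef[2 * i + 1]) for i in range(len(FREQS))]

# Extrapolate the fitted model over the remaining 71 samples
future = [n / SRATE for n in range(185, 256)]
predicted = [sum(c * b for c, b in zip(coef, basis(t))) for t in future]
max_err = max(abs(p - truth(t)) for p, t in zip(predicted, future))
print(fit_amps, max_err)
```

Because the model is an explicit function of time, extrapolation is simply a matter of evaluating it beyond the fitted range, which is why the output x end can be set to any value.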
The predictions are excellent as evidenced by the 95% prediction intervals (blue curves).
Click the Hide Y Plot button. Change x end to 0.1.
In fact, the narrow confidence limits for predictions persist far into the future.
The AR model considerations associated with assessing the validity of the predictions apply equally to a model consisting of optimized sinusoids or damped sinusoids. The residuals should be checked for normality to appraise the completeness of the model. The goodness of fit statistics should be high, or if noise is present, the goodness of fit values should become very good when a denoised version of the signal is used for the prediction.
**Predictions In The Real World**
The AR and harmonic prediction algorithms are primarily for processing signal data containing significant deterministic components. Data that are indistinguishable from noise will not be predicted by either algorithm. Even when deterministic components are present, they may be too complex to be modeled by a linear combination of previous values or by a collection of sinusoids.
The two prediction procedures highlighted in this tutorial are robust, effective algorithms that can manage a large measure of observation noise. The key to valid predictions using them rests almost entirely on whether a successful fit to the underlying data trend can be realized using AR or harmonic models. Some data sets will be favorably described by these models and others will not.