Separating Signal from Noise
I want to experiment with modeling price changes over time as the combination of a smooth trend component overlaid with a random noise component. My goal is to examine the statistical properties of each constituent component and compare the results to the statistical properties of the undecomposed market price.
In this post, I use a least squares moving average to determine the smooth component of a price series. This is our signal, so to speak. The difference between the market price and the smooth component is the noise component, which I have dubbed the “dither.” All calculations are based on the logarithmic price values.
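The decomposition described above can be sketched in a few lines of Python. This is a minimal sketch, not necessarily the exact calculation behind the charts: it assumes the common definition of a least squares moving average (fit a straight line to each trailing window and take the fitted value at the most recent point), and the function names and the window length of 20 are placeholders chosen for illustration.

```python
import numpy as np

def lsma(values, window):
    """Least squares moving average (assumed definition): fit a line to each
    trailing window and keep the fitted value at the window's last point."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    out = np.full(n, np.nan)           # undefined until a full window exists
    x = np.arange(window, dtype=float)
    x_mean = x.mean()
    sxx = ((x - x_mean) ** 2).sum()
    for i in range(window - 1, n):
        y = values[i - window + 1 : i + 1]
        slope = ((x - x_mean) * (y - y.mean())).sum() / sxx
        intercept = y.mean() - slope * x_mean
        out[i] = intercept + slope * x[-1]
    return out

def decompose(prices, window=20):
    """Split log prices into a smooth component and a dither component.
    The window length is a hypothetical choice, not taken from the post."""
    log_price = np.log(np.asarray(prices, dtype=float))
    smooth = lsma(log_price, window)
    dither = log_price - smooth        # noise = market minus smooth
    return log_price, smooth, dither
```

By construction, the smooth and dither components add back up to the logarithmic market price wherever the moving average is defined.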
S&P 500 ETF (Daily)
The first set of data I want to examine is a series of daily closing prices of an S&P 500 index tracking fund covering a period of approximately 21 years. The first chart below shows the market price and the smooth component. The second one shows the noise component. Here are the charts:
The dither component of the daily price fluctuation is separated from the smooth component that represents the general trend, creating two distinct data sets that can be analyzed individually. In my post titled The Distribution of Price Fluctuations, I plotted the daily returns of the market price in a histogram, along with a fitted normal and Laplace density function. We can perform the same analysis not only on the market price data but also on the separated smooth and dither components. See charts below:
As might be expected, based on a previous study, the histogram for the market price data has the shape of a Laplace distribution. For the smooth data set, the shape of the histogram also looks roughly like that of a Laplace distribution, although it’s a bit distorted. Notice, however, that the standard deviation of the smooth data is about an order of magnitude smaller than that of the market price data. Also notice how the peak in the smooth price histogram is shifted noticeably to the right, indicating a general uptrend in the data. The rightward shift is present in the market price data as well, but it’s not as noticeable because of the larger dispersion of daily price moves in the market price data.
Looking at the dither component, the shape of the histogram resembles that of a Laplace distribution about as neatly as the shape of the histogram for the market price does. The standard deviation is about the same as that of the market price data as well. To gain more insights, let’s look at some concrete numbers concerning the analysis of these three data sets:
These values are the maximum likelihood estimates for each data set. The tables above show the estimated location and scale parameters for both the normal density function and the Laplace density function. My post titled Normal and Laplace Distributions provides the details on how these values are calculated.
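For readers who want to reproduce estimates like these, here is a minimal sketch using the standard closed-form maximum likelihood estimators: the sample mean and root mean squared deviation for the normal density, and the sample median and mean absolute deviation from the median for the Laplace density. It assumes that the price changes are first differences of a log-valued series; the function names are placeholders for illustration.

```python
import numpy as np

def changes(series):
    """First differences of a log-valued series, dropping undefined entries
    (for example, the leading gap before the moving average is available)."""
    d = np.diff(np.asarray(series, dtype=float))
    return d[~np.isnan(d)]

def fit_normal(x):
    """Normal MLE: location is the mean, scale is the 1/n standard deviation."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = np.sqrt(np.mean((x - mu) ** 2))
    return mu, sigma

def fit_laplace(x):
    """Laplace MLE: location is the median, scale is the mean absolute
    deviation from the median."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    b = np.mean(np.abs(x - mu))
    return mu, b

def normal_pdf(x, mu, sigma):
    """Normal density, for overlaying on a histogram."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def laplace_pdf(x, mu, b):
    """Laplace density, for overlaying on a histogram."""
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)
```

Applying `fit_normal` and `fit_laplace` to `changes(...)` of the market price, the smooth component, and the dither component would give one pair of location and scale values per data set. Note that the Laplace location is a median, so it responds to how many moves are up or down rather than to how large they are, and it need not agree with the mean about the direction of the drift.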
The location parameter for the normal density function is roughly the same for both the market price and the smooth component, while the estimated parameter value for the dither component is about an order of magnitude smaller. This suggests that the smooth component embodies the general direction of the price trend, while the dither component is relatively neutral. We can see a similar pattern in the location parameter values for the Laplace density function.
The scale parameter values for both density functions are roughly the same for both the market price and the dither component, while the values for the smooth component are an order of magnitude smaller. This seems to imply that the dither component embodies most of the noise that obscures the otherwise smooth trend in the original market price.
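The pattern in the normal location values follows partly from how the decomposition is built. The short derivation below uses my own notation, which does not appear in the calculations above: p, s, and d stand for the logarithmic market price, the smooth component, and the dither component, and Δ denotes the change from one observation to the next.

```latex
p_t = s_t + d_t
\quad\Longrightarrow\quad
\Delta p_t = \Delta s_t + \Delta d_t ,
\qquad \Delta x_t = x_t - x_{t-1}

\frac{1}{N}\sum_{t=1}^{N} \Delta p_t
  = \frac{1}{N}\sum_{t=1}^{N} \Delta s_t
  + \frac{1}{N}\sum_{t=1}^{N} \Delta d_t ,
\qquad
\frac{1}{N}\sum_{t=1}^{N} \Delta d_t = \frac{d_N - d_0}{N}
```

The sample means add up exactly, and the mean change in the dither telescopes to the difference between its last and first values divided by the number of observations. As long as the dither stays within a limited range, that quantity shrinks as the sample grows, which leaves the smooth component carrying nearly all of the average drift. The sample median, which is the maximum likelihood estimate of the Laplace location, has no such additive property.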
S&P 500 ETF (Intraday)
The next set of data I want to look at is a series of intraday prices of the same S&P 500 index tracking fund evaluated previously. This data set contains one-minute intraday data covering a single trading day. The charts below show the market price along with the smooth trend component and the dither noise component:
The intraday price series contains a couple of sudden price moves that are not tracked very well by the least squares moving average. This results in large spikes on the noise chart. Let’s take a look at the histogram for the market price data series and compare it to that of the separate smooth and dither components:
The histogram for the market price data looks like it might approximate the shape of the Laplace density function, but it has a set of shoulders not present in the model function. The histogram for the smooth component has a shape that is even less well defined. But look at the shape of the histogram for the dither component: it looks like an almost ideal approximation of the Laplace density function. Let’s take a look at the numbers:
The location and scale parameters fitted to the normal density function follow the same pattern we saw in the previous data set. The smooth component represents the trend and the dither component represents the noise. For the Laplace density function, this pattern also holds for the scale parameter but not for the location parameter. Interestingly, the location parameter fitted to the Laplace density function implies a sideways trend in the market price but indicates an upward bias in both the smooth component and the dither component.
Japanese Yen (Daily)
Now let’s take a look at the daily exchange rate between the US dollar and the Japanese yen. This data set covers a range of about 18 years. Here are the charts:
The smooth component seems to track the market price fairly well most of the time, but there does appear to be some noticeable lag following reversals. The dither component seems to oscillate up and down in cycles. Here are the histograms:
The histograms for both the market price data and the dither component closely resemble the shape of the Laplace density function. For the smooth component, the histogram has a general bell shape, but it looks a bit too irregular and asymmetrical to be characterized as having the shape of a normal or a Laplace density function. Here are the numbers:
The pattern here is very similar to that of the previous data set. These results suggest that the smooth component captures the trend and the dither component captures the noise. As with the estimates in the previous section, the location parameters estimated for the Laplace distribution give confusing results.
Chinese Yuan (Intraday)
The next data set is a series of intraday exchange rates between the Chinese yuan and the US dollar covering a period of approximately 24 hours. Each data point is one minute apart. Here are the charts showing the intraday market prices along with the separated smooth and dither components:
In my post titled The Very Strange Chinese Yuan, I examined a different set of intraday exchange rates between the Chinese yuan and the US dollar. In that post, I demonstrated a peculiar triple peak pattern in the distribution of price movements. We can observe the same phenomenon in this data set as well:
The shape of the histogram for the market price data shows the triple peak pattern that is characteristic of intraday exchange rates between the yuan and dollar. The histogram for the smooth component exhibits a roughly bell-shaped distribution with no indication of the triple peak pattern at all. The fitted density functions for the smooth component are both shifted to the left, which can be attributed to the downward trend visible in the price chart. The histogram for the dither component, on the other hand, clearly shows the triple peak pattern, indicating that this distinctive noise pattern is almost entirely removed from the price trend. Here are the parameter estimates for the density functions:
Again, we see a pattern here similar to that of the previous data sets. For the parameters fitted to the normal density function, the magnitude of the values gives evidence that the smooth component represents the trend and the dither component represents the noise. We can see this mirrored in the scale parameter values fitted to the Laplace density function, but the fitted location parameter values are not as intuitive. For the Laplace density function, the location parameter fitted to the market price data is zero even though there is an obvious downward trend in the price chart.
Bitcoin (Daily)
The final set of data I want to examine is the series of daily Bitcoin prices covering a period of about five years. Here are the charts:
The price chart shows a fairly consistent multi-year trend followed by a distinct reversal. The noise chart exhibits what appears to be a cyclical pattern, although the periods don’t seem to be evenly spaced. Here are the histograms:
The histograms for both the market price and the dither component have a shape that resembles the Laplace density function. The histogram for the smooth component has a sloppy and irregular shape. Here are the numbers:
Not surprisingly, these results mirror what we’ve seen with the other data sets. The parameters fitted to the normal density function and the scale parameters fitted to the Laplace density function indicate that the smooth and dither components represent the trend and noise, respectively. The location parameters estimated for the Laplace density function remain a bit more mysterious.
Final Thoughts
I think the most obvious conclusion to draw from this experiment is that a smooth price signal can be separated from the unrelated noise in price fluctuations. More specifically, the particular shape of the distribution of price movements—a shape that typically resembles that of a Laplace distribution—can be detached from the smooth price trend. The characteristics of the distribution of market price fluctuations are largely a consequence of the noise component independent of the trend component. Even in the case of the intraday Chinese yuan prices, the idiosyncratic triple peak distribution can be isolated to the noise component only.
Another interesting observation in the data sets examined here is that the noise is not entirely random. There is a structure to it. There are undeniable up and down cycles. My initial thought is to apply Fourier analysis to extract a cyclical component from the residual noise. I am curious what the distribution characteristics of the residual noise would look like if it could be isolated from the cyclical component as well as the trend component. The up and down cycles are somewhat irregular, however, which might make a Fourier analysis difficult. I think this is something worth further investigation.
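As a starting point for that investigation, here is a minimal sketch of how a periodogram of the dither component might be computed with NumPy's FFT routines. The function names are placeholders, the input is assumed to contain no undefined values, and picking the single strongest frequency is only a rough heuristic. With irregular cycles, the power is likely to be spread across many neighboring frequencies rather than concentrated in one peak.

```python
import numpy as np

def periodogram(x, sample_spacing=1.0):
    """Power at each non-negative frequency of a real-valued series.
    The series is demeaned first so the zero-frequency term does not dominate."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    spectrum = np.fft.rfft(x)
    power = np.abs(spectrum) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=sample_spacing)
    return freqs, power

def dominant_period(x, sample_spacing=1.0):
    """Period (in units of sample_spacing) of the strongest nonzero frequency."""
    freqs, power = periodogram(x, sample_spacing)
    k = 1 + np.argmax(power[1:])       # skip the zero-frequency bin
    return 1.0 / freqs[k]

# Synthetic illustration only: a pure sine wave that repeats every 25 samples.
t = np.arange(200)
example = np.sin(2.0 * np.pi * t / 25.0)
period = dominant_period(example)
```

On real dither data, plotting the full `power` array against `freqs` would be more informative than relying on `dominant_period` alone.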
The techniques used in this article rely on a least squares moving average to determine the smooth trend component of a price series. While the least squares moving average is great for tracking sustained price trends, the disadvantage is that it reacts slowly to sharp reversals in the trend. The slow reaction to trend reversals can produce artificially large spikes in the noise component. There might be better smoothing algorithms worth exploring.