In a recent post, I discussed the use of the multivariate Ornstein-Uhlenbeck model as an alternative to linear regressions. In this post, I discuss the MVOU model as an alternative to principal components analysis. As an example, we’ll use USD plain vanilla swap rates up to tenors of ten years, shown (centered) since 1997 in the graph below.
The first choice we need to make when conducting PCA is which covariance matrix to use – the one containing covariances of daily changes in swap rates, or the one containing covariances of the original, undifferenced data. As it happens, these two covariance matrices are quite different for this data set, as illustrated by the graph below, which shows the differences between the two correlation matrices. Note that the differences for shorter tenors are quite large.
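As a rough illustration, the gap between the two candidate correlation matrices can be computed along the following lines (a sketch: `rates` is a hypothetical array of daily rate levels, one column per tenor; the post’s actual data set is not reproduced here):

```python
import numpy as np

def correlation_gap(rates: np.ndarray) -> np.ndarray:
    """Difference between the correlation matrix of rate levels and the
    correlation matrix of daily changes, entry by entry."""
    corr_levels = np.corrcoef(rates, rowvar=False)       # undifferenced data
    corr_changes = np.corrcoef(np.diff(rates, axis=0), rowvar=False)
    return corr_levels - corr_changes
```

Large entries in this difference matrix – as the post finds for the shorter tenors – are a sign that the two covariance choices lead to genuinely different analyses.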
In contrast, it’s not necessary to choose between covariance matrices in the MVOU framework, as conditional covariances can differ as a function of the time between observations, with the unconditional covariances as a limit. In fact, in our case the differences between elements of the correlation matrices shown above are the result of considerable interaction effects. In particular, we find significant attractions among nearest neighbors, with the 1Y rate particularly attracted to the 2Y rate.
Our next step in the PCA framework is to project the current swap rates into a smaller subspace. As the largest two eigenvalues of the covariance matrix constitute 99.87% of the sum of all ten eigenvalues, we’ll project our swap rates onto the two-dimensional plane corresponding to the two eigenvectors associated with these two largest eigenvalues. The differences between the current rates and the projected rates are shown in the next graph.
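A minimal sketch of this projection step (function and variable names are illustrative, not taken from the post):

```python
import numpy as np

def pca_residuals(rates: np.ndarray, n_factors: int = 2):
    """Project the most recent (centered) rate vector onto the span of the
    leading eigenvectors of the covariance matrix and return the residual,
    along with the share of variance the retained factors explain."""
    mu = rates.mean(axis=0)
    centered = rates - mu
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    V = eigvecs[:, -n_factors:]              # leading eigenvectors
    x = centered[-1]                         # current centered rates
    residual = x - V @ (V.T @ x)             # part outside the factor plane
    explained = eigvals[-n_factors:].sum() / eigvals.sum()
    return residual, explained
```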
Note that the 1Y swap rate and the rates of longer-dated swaps are currently below this plane, such that the residuals are negative, while the residuals for the other swap rates are positive.
At this point, many analysts typically refer to a graph of residuals over time, such as the one shown below. Concluding from this graph that the residuals are mean-reverting, analysts often assume that the current residuals will revert toward zero, with swap rates below the plane increasing and with swap rates above the plane decreasing in the future.
At this point, we could model these PCA residuals as a multivariate mean-reverting system, allowing us to estimate the reversion speed and volatility of each of the variables. But if we were willing to model the residuals as a multivariate mean-reverting system, presumably we would simply have modeled the original data as a multivariate mean-reverting system. And in fact, I’ve never seen anyone model the residuals this way.
The more typical approach at this point in the PCA is to choose two or three swap rates to form a portfolio that would benefit in the event that the residuals returned to zero. For example, given the relative magnitude of their residuals, it wouldn’t be surprising for an analyst to form a butterfly spread consisting of the 1Y, 4Y, and 10Y swap tenors.
Having chosen the specific tenors, the next step is to select a specific weight for each tenor to be used in constructing a butterfly spread. A typical approach is to create a butterfly spread with a value that, in theory, will not change due to a change in the values of either of our first two factors. In our case, if we arbitrarily fix the weight of the 4Y rate to be 2, the weights of the 1Y and 10Y rates that would immunize the spread against changes in the values of the first two factors would be -0.860 and -1.27.
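The immunizing wing weights come from a small linear system: the butterfly’s loading on each of the two leading eigenvectors must be zero. A sketch, assuming the 1Y, 4Y, and 10Y tenors sit at positions 0, 3, and 9 of the rate vector:

```python
import numpy as np

def butterfly_weights(eigvecs: np.ndarray, idx=(0, 3, 9), mid_weight=2.0):
    """Wing weights that zero the spread's loadings on the first two
    factors.  `eigvecs` holds the two leading eigenvectors as columns;
    `idx` gives the positions of the chosen tenors (an assumption
    about the data layout)."""
    V = eigvecs[list(idx), :]                # 3 x 2 factor loadings
    # For each factor k:  w1*V[0,k] + mid_weight*V[1,k] + w10*V[2,k] = 0
    A = np.column_stack([V[0], V[2]])        # 2 x 2 system in (w1, w10)
    b = -mid_weight * V[1]
    w1, w10 = np.linalg.solve(A, b)
    return w1, mid_weight, w10
```

Fixing the belly weight at 2 and solving for the wings reproduces the structure of the weights quoted above.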
Having chosen weights for our spread, the next step in the typical PCA is to graph the resulting butterfly; the PCA-weighted 1Y/4Y/10Y spread is shown in the graph below.
The spread gives the appearance of being mean-reverting, which is encouraging. And it currently has a value of 20 bp. More specifically, this is the change in the value of the fly that would result if the 1Y residual, 4Y residual, and 10Y residual all assumed values of zero simultaneously.
To further quantify the risk and return of this PCA-weighted fly, many analysts would model the weighted fly as a univariate Ornstein-Uhlenbeck process. Taking this approach, we estimate that this weighted fly reverts around its mean with a half-life of 94 days. And in the context of this calibrated, univariate Ornstein-Uhlenbeck process, the expected change over a period of three months is a decline of 10 bp, with a standard deviation of 31.4 bp, corresponding to an annualized Sharpe ratio of 0.64.
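This univariate step can be sketched as follows, using a standard AR(1)-based OU estimator (which may differ in detail from the one behind the figures in the post):

```python
import numpy as np

def calibrate_ou(series: np.ndarray, dt: float = 1 / 252):
    """Fit dX = kappa*(mu - X)dt + sigma*dW to a daily series by
    regressing each observation on its predecessor."""
    x0, x1 = series[:-1], series[1:]
    phi, c = np.polyfit(x0, x1, 1)              # x1 ≈ phi*x0 + c
    kappa = -np.log(phi) / dt                   # reversion speed (per year)
    mu = c / (1 - phi)                          # long-run mean
    resid = x1 - (phi * x0 + c)
    sigma = resid.std(ddof=2) * np.sqrt(2 * kappa / (1 - phi ** 2))
    half_life = np.log(2) / kappa               # in years
    return kappa, mu, sigma, half_life

def horizon_stats(x_now, kappa, mu, sigma, h):
    """Expected change, standard deviation, and annualized Sharpe ratio
    of the fitted OU process over a horizon of h years."""
    exp_change = (mu - x_now) * (1 - np.exp(-kappa * h))
    std = sigma * np.sqrt((1 - np.exp(-2 * kappa * h)) / (2 * kappa))
    return exp_change, std, exp_change / std / np.sqrt(h)
```

Note that the quoted figures are mutually consistent under this kind of model: a 10 bp expected decline against a 31.4 bp standard deviation over a quarter annualizes to a Sharpe ratio of roughly 0.64.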
Having said that, there are various problems with this stage of our principal components analysis. One of the more serious – though less discussed – problems is that a linear combination of variables that follow a multivariate Ornstein-Uhlenbeck process can’t be modeled as a univariate OU process (unless the weight vector is a left eigenvector of the transition matrix of the process – equivalently, orthogonal to all but one of its right eigenvectors). In our PCA, we haven’t explicitly modeled the residuals as a multivariate mean-reverting process – but that’s presumably what we have in mind when we rely on the mean-reverting properties of the residuals to choose swap tenors and associated weights for constructing a weighted spread.
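To see why, recall the conditional mean of an MVOU process (a standard result; the notation here is mine, not the post’s):

```latex
dX_t = \Theta(\mu - X_t)\,dt + S\,dW_t
\quad\Longrightarrow\quad
\mathbb{E}\left[X_{t+h} \mid X_t\right] = \mu + e^{-\Theta h}\left(X_t - \mu\right).
```

For a weighted spread s = w′X, the conditional mean w′μ + w′e^(−Θh)(X − μ) depends on the whole vector X, not on s alone; it collapses to a function of the spread by itself only when w′e^(−Θh) is proportional to w′, ie when w is a left eigenvector of Θ.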
Even if we assume that our weighted spread can be modeled by a univariate OU process, the Sharpe ratio we’ve calculated is for a naïve strategy in which we position for the spread to decline and then simply leave this static position in place, with no updating of the position, for a fixed horizon of three months. At a minimum, we almost surely would exit the position in the event it reached its long-run mean (zero, by construction, in this case). So our naïve ‘sell-and-hold’ strategy almost surely understates the ex ante Sharpe ratio associated with a strategy we’d actually use in practice.
Another issue to note is that at no point in this principal components analysis have we optimized any notion of return or of risk-adjusted return. For example, the weights of our butterfly spread were chosen to ensure that the PCA-weighted fly has no exposure to changes in the first or second factors. This constraint sometimes appears intuitive in the context of a PCA, but there’s no economic basis for this decision. No trader wakes up in the morning with any particular goals pertaining to the vectors associated with the largest two eigenvalues of a covariance matrix. So while these PCA weights aren’t entirely arbitrary, they haven’t been chosen to optimize any economic quantity with which we’d typically be concerned, such as an ex ante Sharpe ratio or the expected value of a utility function.
In contrast, let’s see how we’d proceed in the context of a multivariate Ornstein-Uhlenbeck framework.
While there’s nothing in the MVOU framework keeping us from projecting current swap rates into a smaller subspace, there’s no particular benefit in doing this. The focus of the MVOU framework is not to identify instruments whose prices appear relatively rich or cheap currently. Rather, it’s to identify linear combinations of instruments for which the expected change in future value is attractive relative to the associated risk. For example, the expected change for each swap rate over a three-month horizon, according to our calibrated MVOU model, is shown in the graph below.
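Under MVOU dynamics, this conditional expected change has a closed form; a sketch follows (parameter names are placeholders for the post’s calibrated values, which are not reproduced here):

```python
import numpy as np

def _expm(A: np.ndarray) -> np.ndarray:
    """Matrix exponential via eigendecomposition (assumes A diagonalizable)."""
    vals, vecs = np.linalg.eig(A)
    return ((vecs * np.exp(vals)) @ np.linalg.inv(vecs)).real

def mvou_expected_change(x_now, mu, theta, h):
    """E[X_{t+h} - X_t | X_t] = (I - exp(-theta*h)) @ (mu - x_now),
    the expected change of each rate over a horizon of h years."""
    n = len(x_now)
    return (np.eye(n) - _expm(-theta * h)) @ (mu - x_now)
```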
Note that while the residuals from a two-factor PCA had two real roots (ie, the pattern crossed the horizontal axis twice), the expected changes in the graph above are monotonic and cross the horizontal axis only once. With that in mind, these expected changes are more suggestive of a steepening trade than of a butterfly spread. For example, a simple equal-weighted 2s10s steepener has an annualized ex ante Sharpe ratio of 1.10 – considerably better than the 1Y/4Y/10Y butterfly analyzed in our PCA.
As a comparison, let’s consider the same tenors as above, namely 1Y, 4Y, and 10Y. If we apply the PCA weights above of -0.860, 2, and -1.27, we obtain an ex ante, annualized Sharpe ratio of essentially zero – considerably less than the 0.64 suggested when we modeled the PCA-weighted fly as a univariate Ornstein-Uhlenbeck process.
Two issues help explain why the Sharpe ratios for the same trade are so different in the PCA and MVOU frameworks. First, the covariance matrix used in the PCA corresponds to the levels of these swap rates – ie, to the unconditional covariance matrix characterizing rates over the long run. But in our MVOU framework, the three-month conditional covariance matrix resembles the covariance matrix of daily differences more than it resembles the covariance matrix corresponding to the levels of the historical swap rates.
Second, in the MVOU framework, the expected change in the PCA-weighted butterfly spread is a linear combination of the conditional expected values of the 1Y, 4Y, and 10Y swap rates. In the PCA framework, the expected change in the PCA-weighted butterfly spread is produced by a univariate Ornstein-Uhlenbeck model calibrated to the historical data for this fly. As we mentioned earlier, a linear combination of variables following a MVOU process won’t follow a univariate OU process, except under quite restrictive circumstances.
Part of the problem is that the expected value of the butterfly spread in the MVOU framework is conditioned on all three of the relevant swap rates. In contrast, in the PCA framework, the expected value of the butterfly spread is conditioned on one number – the butterfly spread – which doesn’t convey as much information as is conveyed by retaining distinct values for all three swap rates.
The MVOU model imposes more structure on the data. In particular, it not only specifies a scatter matrix through which common shocks are transmitted to all the swap rates; it also contains a transition matrix that specifies the interactions among the various swap rates as a function of their (centered) distances from each other (and from their own long-run means).
In the PCA framework, the only things we need to specify are the covariance matrix and the number of factors we presume are driving the system (ie, the dimension of the subspace into which we want to project our data). In the MVOU framework, we don’t need to specify the dimensionality of any subspace, but we do need to specify – and estimate – the possible interactions between the variables described by the transition matrix.
Lest we credit the PCA framework with false economies, it’s worth recalling that we do need to specify whether we want to use the covariance matrix corresponding to daily changes or the covariance matrix corresponding to the original, undifferenced data. And in making the choice to use one matrix rather than another, we’re essentially choosing to discard the information contained in the matrix we’ve decided not to use. In contrast, in the MVOU framework, one of the roles of the transition matrix is to introduce horizon-dependence into the covariance matrix. The transition matrix allows us to match the covariance matrix corresponding to differenced data over the short term and to match the covariance matrix corresponding to the undifferenced data over the long run.
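This horizon-dependence can be made explicit: the conditional covariance over a horizon h is an integral that starts out proportional to the short-run (differenced-data) covariance and converges to the unconditional covariance as h grows. A numerical sketch (with illustrative names, not the post’s code):

```python
import numpy as np

def _expm(A: np.ndarray) -> np.ndarray:
    """Matrix exponential via eigendecomposition (assumes A diagonalizable)."""
    vals, vecs = np.linalg.eig(A)
    return ((vecs * np.exp(vals)) @ np.linalg.inv(vecs)).real

def conditional_cov(theta, scatter, h, n_steps=2000):
    """Cov[X_{t+h} | X_t] = integral_0^h exp(-theta*s) Q exp(-theta'*s) ds,
    with Q = scatter @ scatter.T, approximated by the trapezoid rule."""
    Q = scatter @ scatter.T
    grid = np.linspace(0.0, h, n_steps)
    vals = np.array([_expm(-theta * s) @ Q @ _expm(-theta.T * s) for s in grid])
    ds = grid[1] - grid[0]
    return ds * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)
```

For small h this is approximately Q*h, mirroring the covariance of daily changes; as h grows it settles at the unconditional covariance, which solves the Lyapunov equation theta@Sigma + Sigma@theta.T = Q.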
In the interest of full disclosure, I should admit that specifying and calibrating the parameters of an MVOU model requires more effort than calculating the eigenvalues and eigenvectors of a covariance matrix. In particular, while the elements of the scatter matrix are often quite close to a scaled Cholesky decomposition of the covariance matrix of differenced data, calibration of the transition matrix typically requires a bit of effort.
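One common calibration route, sketched here under the assumption of daily data (and not necessarily the estimator behind the post’s figures), is to fit a discrete VAR(1) by least squares and then map it to continuous time:

```python
import numpy as np

def calibrate_mvou(rates: np.ndarray, dt: float = 1 / 252):
    """Fit X_{t+1} = c + A X_t + eps by least squares, then map to
    continuous time: Theta = -log(A)/dt, with the matrix log taken
    through an eigendecomposition (assumes A is diagonalizable with
    eigenvalues off the negative real axis)."""
    X0, X1 = rates[:-1], rates[1:]
    Z = np.column_stack([np.ones(len(X0)), X0])
    B, *_ = np.linalg.lstsq(Z, X1, rcond=None)
    c, A = B[0], B[1:].T                       # X1 ≈ c + X0 @ A.T
    vals, vecs = np.linalg.eig(A)
    theta = -((vecs * np.log(vals.astype(complex))) @ np.linalg.inv(vecs)).real / dt
    mu = np.linalg.solve(np.eye(len(c)) - A, c)    # long-run means
    cov_eps = np.cov(X1 - Z @ B, rowvar=False)     # ≈ S S' * dt for small dt
    return theta, mu, cov_eps
```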
But while the MVOU framework requires somewhat more effort than does the PCA framework, the returns on this additional investment are typically significant. In fact, our advice to traders and analysts is to compare results using both methods. When there are significant differences, as there are in the example considered here, it’s useful to investigate and to understand the reasons for the differences.
The multivariate Ornstein-Uhlenbeck model is no panacea. It requires somewhat greater effort to implement than does PCA, and it requires more assumptions than does the PCA framework. But it provides greater flexibility to model relevant dynamics in our data, with the potential to produce better trading performance as a result. In our view, it’s a particularly useful tool to have available on a trading desk.