1#' calculate Normalized Getmansky Smoothing Index
2#'
3#' Proposed by Getmansky et al to provide a normalized measure of "liquidity
4#' risk."
5#'
6#' To measure the effects of smoothing, Getmansky, Lo, et al (2004) define a
7#' "smoothing profile" as a vector of coefficients for an MLE fit on returns
8#' using a two-period moving-average process.
9#'
10#' The moving-average process of order \eqn{k=2} (specified using
11#' \code{MAorder}) gives \eqn{R_t = \theta_{0} R_{t} + \theta_1 R_{t -1} +
12#' \theta_2 R_{t-2}}, under the constraint that the sum of the coefficients is
13#' equal to 1. In , the \code{arima} function allows us to create an MA(2)
14#' model using an "ARIMA(p,d,q)" model, where \eqn{p} is the number of
15#' autoregressive terms (AR), \eqn{d} is the degree of differencing, and
16#' \eqn{q} is the number of lagged forecast errors (MA) in the prediction
17#' equation.  The \code{order} parameter allows us to specify the three
18#' components \eqn{(p, d, q)} as an argument, e.g., \code{order = c(0, 0, 2)}.
19#' The \code{method} specifies how to fit the model, in this case using maximum
20#' likelihood estimation (MLE) in a fashion similar to the estimation of
21#' standard moving-average time series models, using:
22#'
23#' \code{arima(ra, order=c(0,0,2), method="ML", transform.pars=TRUE,
24#' include.mean=FALSE)}
25#'
26#' \code{include.mean}: Getmansky, et al. (2004) p 555 "By applying the above
27#' procedure to observed de-meaned returns...", so we set that parameter to
28#' 'FALSE'.
29#'
30#' \code{transform.pars}: ibid, "we impose the additional restriction that the
31#' estimated MA(k) process be invertible," so we set the parameter to 'TRUE'.
32#'
33#' The coefficients, \eqn{\theta_{j}}, are then normalized to sum to
34#' interpreted as a "weighted average of the fund's true returns over the most
35#' recent \eqn{k + 1} periods, including the current period."
36#'
37#' If these weights are disproportionately centered on a small number of lags,
38#' relatively little serial correlation will be induced. However, if the
39#' weights are evenly distributed among many lags, this would show higher
40#' serial correlation.
41#'
42#' The paper notes that because \eqn{\theta_j \in [0, 1]}, \eqn{\xi} is also
43#' confined to the unit interval, and is minimized when all the
44#' \eqn{\theta_j}'s are identical.  That implies a value of \eqn{1/(k + 1)} for
45#' \eqn{\xi}, and a maximum value of \eqn{\xi = 1} when one coefficient is 1
46#' and the rest are 0.  In the context of smoothed returns, a lower value of
47#' \eqn{\xi} implies more smoothing, and the upper bound of 1 implies no
48#' smoothing.
49#'
50#' The "smoothing index", represented as \eqn{\xi}, is calculated the same way
51#' the Herfindahl index.  The Herfindal measure is well known in the industrial
52#' organization literature as a measure of the concentration of firms in a
53#' given industry where \eqn{y_j} represents the market share of firm \eqn{j}.
54#'
55#' This method (as well as the implementation described in the paper), does not
56#' enforce \eqn{\theta_j \in [0, 1]}, so \eqn{\xi} is not limited to that range
57#' either.  All we can say is that lower values are "less liquid" and higher
58#' values are "more liquid" or mis-specified.  In this function, setting the
59#' parameter neg.thetas = FALSE does enforce the limitation, eliminating
60#' negative autocorrelation coefficients from the calculation (the papers below
61#' do not make an economic case for eliminating negative autocorrelation,
62#' however).
63#'
64#' Interpretation of the resulting value is difficult.  All we can say is that
65#' lower values appear to have autocorrelation structure like we might expect
66#' of "less liquid" instruments.  Higher values appear "more liquid" or are
67#' poorly fit or mis-specified.
68#'
69#' @param R an xts, vector, matrix, data frame, timeSeries or zoo object of
70#' asset returns
71#' @param neg.thetas if FALSE, function removes negative coefficients (thetas)
72#' when calculating the index
73#' @param MAorder specify the number of periods used to calculate the moving
74#' average, defaults to 2
75#' @param verbose if TRUE, return a list containing the Thetas in addition to
76#' the smoothing index/
77#' @param \dots any other passthru parameters
78#' @section Acknowledgments: Thanks to Dr. Stefan Albrecht, CFA, for invaluable
79#' input.
80#' @author Peter Carl
81#' @references Chan, Nicholas, Mila Getmansky, Shane M. Haas, and Andrew W. Lo.
82#' 2005. Systemic Risk and Hedge Funds. NBER Working Paper Series (11200).
83#' Getmansky, Mila, Andrew W. Lo, and Igor Makarov. 2004. An Econometric Model
84#' of Serial Correlation and Illiquidity in Hedge Fund Returns. Journal of
85#' Financial Economics (74): 529-609.
86###keywords ts multivariate distribution models
87#' @examples
88#'
89#' data(managers)
90#' data(edhec)
91#' SmoothingIndex(managers[,1,drop=FALSE])
92#' SmoothingIndex(managers[,1:8])
93#' SmoothingIndex(edhec)
94#'
95#' @export
96SmoothingIndex <-
97function (R, neg.thetas = FALSE, MAorder=2, verbose = FALSE, ...)
98{ # @author Peter Carl
99
100    # Description:
101    # SmoothingIndex
102
103    # ra    log return vector
104
105    # Function:
106    if (is.vector(R)) {
107        x = na.omit(R)
108
109        MA2 = NULL
110        thetas = 0
111        smoothing.index = 0
112
113        # First, create a a maximum likelihood estimation fit for an MA process.
114
115        # include.mean: Getmansky, et al. JFE 2004 p 555 "By applying the above procedure
116        # to observed de-meaned returns...", so set parameter to FALSE
117        # transform.pars: ibid, "we impose the additional restriction that the estimated MA(k)
118        # process be invertible." so set the parameter to TRUE
119        MA2 = arima(R, order=c(0,0,MAorder), method="ML", transform.pars=TRUE, include.mean=FALSE)
120
121        # Page 555:
122        #
123        # "Because of the scaling property Eq. (52) of the MA(k) likelihood function, a
124        # simple procedure for obtaining estimates of our smoothing model with the
125        # normalization Eq. (49) is to transform estimates (\theta; sigma) from standard
126        # MA(k) estimation packages such as SAS or RATS by dividing each \theta_i by 1 + \theta_1 +
127        # \theta_2 ... \theta_k and multiplying sigma by the same factor. The likelihood function
128        # remains unchanged but the transformed smoothing coefficients will now satisfy
129        # Eq. (49)."
130
131        # From the arima function above, MA2$coef contains two coefficients, and no intercept value.
132        # The calculation below adjusts for that.
133        coefMA2=0
134        if(neg.thetas == FALSE)
135            for (i in 1:length(coef(MA2)))
136                coefMA2[i] = max(0,coef(MA2)[i])
137    #         coefMA2 = max(0,coef(MA2)) # enforces no negative thetas
138        else
139            coefMA2 = coef(MA2) # allows negative thetas
140
141        # Dr. Stefan Albrecht, CFA points out, "I assume that you have to take:"
142        thetas = c(1, coefMA2) / (1 + sum(coefMA2))
143    #
144    #
145    #     thetas = as.numeric((MA2$coef)/sum(MA2$coef))
146
147        # This measure is well known in the industrial organization literature as the HeRfindahl
148        # index, a measure of the concentration of firms in a given industry where yj represents the
149        # market share of firm j: Because theta_j; 1; x is also confined to the unit interval,
150        # and is minimized when all the theta_j 's are identical, which implies a value of 1=\delta_k \phi 1 \phi
151        # for x; and is maximized when one coefficient is 1 and the rest are 0, in which case x 1/4 1:
152        # In the context of smoothed returns, a lower value of x implies more smoothing, and the upper bound
153        # of 1 implies no smoothing, hence we shall refer to x as a "smoothing index".
154
155        smoothing.index = sum(thetas^2) # Calc'd as Herfindahl index would be, referred to as \xi, below
156
157        # The interpretation of this is tricky:
158
159        # "Because \theta_j \varepsilon [0, 1], \xi is also confined to the unit interval, and is minimized when all
160        # the \theta_j 's are identical, which implies a value of 1/(k + 1) for \xi, and is maximized when
161        # one coefficient is 1 and the rest are 0, in which case \xi = 1. In the context of smoothed
162        # returns, a lower value of \xi implies more smoothing, and the upper bound of 1 implies no
163        # smoothing, hence we shall refer to \xi as a "smoothing index"."
164
165        # That's fine, except that this method (as described in the paper), does not enforce
166        # \thetaj \varepsilon [0, 1], so \xi is not limited to that range either.  All we can say is that lower values
167        # are "less liquid" and higher values are "more liquid" or mis-specified.
168
169        if(verbose)
170            return(list(SmoothingIndex = smoothing.index, Thetas = thetas))
171        else
172            return(smoothing.index)
173
174    }
175    else {
176        R = checkData(R, method = "matrix", ... = ...)
177        result = apply(R, 2, SmoothingIndex, neg.thetas = neg.thetas, MAorder = MAorder, verbose = verbose, ... = ...)
178        if(length(result) ==1)
179            return(result)
180        else {
181            dim(result) = c(1,NCOL(R))
182            colnames(result) = colnames(R)
183            rownames(result) = "Smoothing Index"
184            return(result)
185        }
186    }
187}
188
189###############################################################################
190# R (http://r-project.org/) Econometrics for Performance and Risk Analysis
191#
192# Copyright (c) 2004-2020 Peter Carl and Brian G. Peterson
193#
194# This R package is distributed under the terms of the GNU Public License (GPL)
195# for full details see the file COPYING
196#
197# $Id$
198#
199###############################################################################
200