# -*- coding: utf-8 -*-
"""Sandwich covariance estimators


Created on Sun Nov 27 14:10:57 2011

Author: Josef Perktold
Author: Skipper Seabold for HCxxx in linear_model.RegressionResults
License: BSD-3

Notes
-----

For calculating the sandwich covariance, we have two versions.

version 1: use pinv
pinv(x) scale pinv(x).T, used currently in linear_model, with scale
being 1d (or a diagonal matrix), i.e.
(x'x)^(-1) x' scale x (x'x)^(-1).
In general scale is (nobs, nobs) and therefore pretty large. General
formulas for scale in the cluster case are in [4], which can be found
(as of 2017-05-20) at
http://www.tandfonline.com/doi/abs/10.1198/jbes.2010.07136
This paper also has the second version.

version 2:
(x'x)^(-1) S (x'x)^(-1)    with S = x' scale x,    S is (k_vars, k_vars),
and (x'x)^(-1) is available as normalized_cov_params.



S = sum (x*u) dot (x*u)' = sum x*u*u'*x'  where the sum can aggregate
over observations or groups. u is the regression residual.

x is (nobs, k_vars)
u is (nobs, 1)
x*u is (nobs, k_vars)
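
As a hedged illustration (added, not in the original notes; all names
below are made up), the two versions agree for a 1d scale::

    import numpy as np
    nobs, k_vars = 20, 3
    x = np.random.randn(nobs, k_vars)
    scale = np.random.rand(nobs)                   # e.g. squared residuals
    pinv_x = np.linalg.pinv(x)                     # (x'x)^(-1) x'
    v1 = np.dot(pinv_x, scale[:, None] * pinv_x.T)        # version 1
    S = np.dot(x.T, scale[:, None] * x)            # S = x' diag(scale) x
    xxi = np.linalg.inv(np.dot(x.T, x))            # (x'x)^(-1)
    v2 = np.dot(xxi, np.dot(S, xxi))               # version 2
    assert np.allclose(v1, v2)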


For cluster robust standard errors, we first sum (x*u) within each cluster
(over all other dimensions, including time) and then take the dot product
(sum of outer products)

S = sum_g(x*u)' dot sum_g(x*u)

For HAC by clusters, we first sum over groups for each time period, and then
use HAC on the group sums of (x*u).
If we have several groups, we have to sum first over all relevant groups, and
then take the outer product sum. This can be done by summing using indicator
functions or matrices or with explicit loops. Alternatively we calculate
separate covariance matrices for each group, sum them and subtract the
doubly counted intersection.
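
A hedged sketch of the group-sum step (added; continues the illustrative
arrays above, with integer group labels in range(4))::

    u = np.random.randn(nobs)                   # stand-in for residuals
    xu = x * u[:, None]                         # x_i * u_i, (nobs, k_vars)
    groups = np.random.randint(0, 4, size=nobs)
    gsums = np.zeros((4, k_vars))
    np.add.at(gsums, groups, xu)                # sum x*u within each group
    S_cluster = np.dot(gsums.T, gsums)          # sum of outer products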

Not checked in detail yet: degrees of freedom or small sample correction
factors; see the references.


This is also the general case for MLE and GMM.

in MLE:  hessian H, outer product of jacobian S,  cov_hjjh = HJJH,
which reduces to the above in the linear case, but can be used
generally, e.g. in discrete models, and is misnamed in
GenericLikelihoodModel

in GMM it's similar, but I would have to look up the details (it comes
out in sandwich form by default, it's in the sandbox); standard
Newey-West or similar are applied to the covariance matrix of the
moment conditions

quasi-MLE: MLE with a mis-specified model where the parameter estimates
are fine (consistent?) but cov_params needs to be adjusted, similarly to
or the same as in sandwiches. (I did not go through any details yet.)

TODO
----
* small sample correction factors: done for cluster, not yet for HAC
* automatic lag-length selection for Newey-West HAC,
  -> added: nlags = floor(4 * (T/100)^(2/9))  Reference: xtscc paper [2],
     Newey-West; note this will not be optimal in the panel context,
     see Petersen [3]
* HAC should maybe return the chosen nlags
* get consistent notation; it varies by paper: S, scale, sigma?
* replace diag(hat_matrix) calculations in cov_hc2, cov_hc3


References
----------
[1] John C. Driscoll and Aart C. Kraay, “Consistent Covariance Matrix
Estimation with Spatially Dependent Panel Data,” Review of Economics and
Statistics 80, no. 4 (1998): 549-560.

[2] Daniel Hoechle, “Robust Standard Errors for Panel Regressions with
Cross-Sectional Dependence,” The Stata Journal 7, no. 3 (2007): 281-312.

[3] Mitchell A. Petersen, “Estimating Standard Errors in Finance Panel Data
Sets: Comparing Approaches,” Review of Financial Studies 22, no. 1
(January 1, 2009): 435-480.

[4] A. Colin Cameron, Jonah B. Gelbach, and Douglas L. Miller, “Robust
Inference With Multiway Clustering,” Journal of Business and Economic
Statistics 29 (April 2011): 238-249.


not used yet:
A.C. Cameron, J.B. Gelbach, and D.L. Miller, “Bootstrap-Based Improvements
for Inference with Clustered Errors,” The Review of Economics and
Statistics 90, no. 3 (2008): 414-427.

"""
import numpy as np

from statsmodels.tools.grouputils import combine_indices, group_sums
from statsmodels.stats.moment_helpers import se_cov

__all__ = ['cov_cluster', 'cov_cluster_2groups', 'cov_hac', 'cov_nw_panel',
           'cov_white_simple',
           'cov_hc0', 'cov_hc1', 'cov_hc2', 'cov_hc3',
           'se_cov', 'weights_bartlett', 'weights_uniform']


#----------- from linear_model.RegressionResults
'''
    HC0_se
        White's (1980) heteroskedasticity robust standard errors.
        Defined as sqrt(diag((X.T X)^(-1) X.T diag(e_i^(2)) X (X.T X)^(-1)))
        where e_i = resid[i]
        HC0_se is a property.  It is not evaluated until it is called.
        When it is called the RegressionResults instance will then have
        another attribute cov_HC0, which is the full heteroskedasticity
        consistent covariance matrix and also `het_scale`, which is in
        this case just resid**2.  HCCM matrices are only appropriate for OLS.
    HC1_se
        MacKinnon and White's (1985) alternative heteroskedasticity robust
        standard errors.
        Defined as sqrt(diag(n/(n-p)*HC_0))
        HC1_se is a property.  It is not evaluated until it is called.
        When it is called the RegressionResults instance will then have
        another attribute cov_HC1, which is the full HCCM and also `het_scale`,
        which is in this case n/(n-p)*resid**2.  HCCM matrices are only
        appropriate for OLS.
    HC2_se
        MacKinnon and White's (1985) alternative heteroskedasticity robust
        standard errors.
        Defined as sqrt(diag((X.T X)^(-1) X.T diag(e_i^(2)/(1-h_ii)) X
        (X.T X)^(-1))) where h_ii = x_i (X.T X)^(-1) x_i.T
        HC2_se is a property.  It is not evaluated until it is called.
        When it is called the RegressionResults instance will then have
        another attribute cov_HC2, which is the full HCCM and also `het_scale`,
        which is in this case resid^(2)/(1-h_ii).  HCCM matrices are only
        appropriate for OLS.
    HC3_se
        MacKinnon and White's (1985) alternative heteroskedasticity robust
        standard errors.
        Defined as sqrt(diag((X.T X)^(-1) X.T diag(e_i^(2)/(1-h_ii)^(2)) X
        (X.T X)^(-1))) where h_ii = x_i (X.T X)^(-1) x_i.T
        HC3_se is a property.  It is not evaluated until it is called.
        When it is called the RegressionResults instance will then have
        another attribute cov_HC3, which is the full HCCM and also `het_scale`,
        which is in this case resid^(2)/(1-h_ii)^(2).  HCCM matrices are
        only appropriate for OLS.

'''

# Note: HCCM stands for Heteroskedasticity Consistent Covariance Matrix
def _HCCM(results, scale):
    '''
    sandwich with pinv(x) * diag(scale) * pinv(x).T

    where pinv(x) = (X'X)^(-1) X'
    and scale is (nobs,)
    '''
    H = np.dot(results.model.pinv_wexog,
               scale[:, None] * results.model.pinv_wexog.T)
    return H

def cov_hc0(results):
    """
    See statsmodels.RegressionResults
    """

    het_scale = results.resid**2  # or whitened residuals? only OLS?
    cov_hc0 = _HCCM(results, het_scale)

    return cov_hc0

def cov_hc1(results):
    """
    See statsmodels.RegressionResults
    """

    het_scale = results.nobs / results.df_resid * results.resid**2
    cov_hc1 = _HCCM(results, het_scale)
    return cov_hc1

def cov_hc2(results):
    """
    See statsmodels.RegressionResults
    """

    # TODO: could be optimized to compute only the diagonal of the hat
    # matrix instead of the full (nobs, nobs) product
    h = np.diag(np.dot(results.model.exog,
                       np.dot(results.normalized_cov_params,
                              results.model.exog.T)))
    het_scale = results.resid**2 / (1 - h)
    cov_hc2_ = _HCCM(results, het_scale)
    return cov_hc2_

def cov_hc3(results):
    """
    See statsmodels.RegressionResults
    """

    # TODO: as in cov_hc2, only the diagonal of the hat matrix is needed
    h = np.diag(np.dot(results.model.exog,
                       np.dot(results.normalized_cov_params,
                              results.model.exog.T)))
    het_scale = (results.resid / (1 - h))**2
    cov_hc3_ = _HCCM(results, het_scale)
    return cov_hc3_

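
# Hedged usage sketch (added; not part of the original module). For an OLS
# fit these helpers should agree with the cov_HCx / HCx_se attributes of
# RegressionResults; the data below is made up for illustration.
'''
>>> import numpy as np
>>> import statsmodels.api as sm
>>> np.random.seed(12345)
>>> x = sm.add_constant(np.random.randn(100, 2))
>>> y = x.sum(1) + np.random.randn(100)
>>> res = sm.OLS(y, x).fit()
>>> np.allclose(cov_hc0(res), res.cov_HC0)
True
>>> np.allclose(np.sqrt(np.diag(cov_hc3(res))), res.HC3_se)
True
'''
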
#---------------------------------------

def _get_sandwich_arrays(results, cov_type=''):
    """Helper function to get scores from results

    Parameters
    ----------
    results : results instance or tuple
        Either a model results instance or a tuple of
        (jac, hessian_inv) arrays.
    cov_type : str
        If 'clu', then the freq_weights adjustment is skipped.

    Returns
    -------
    xu : ndarray
        scores or jacobian of the loglikelihood contributions
    hessian_inv : ndarray
        inverse hessian or normalized_cov_params, the bread of the sandwich
    """

    if isinstance(results, tuple):
        # assume we have jac and hessian_inv
        jac, hessian_inv = results
        xu = jac = np.asarray(jac)
        hessian_inv = np.asarray(hessian_inv)
    elif hasattr(results, 'model'):
        if hasattr(results, '_results'):
            # remove wrapper
            results = results._results
        # assume we have a results instance
        if hasattr(results.model, 'jac'):
            xu = results.model.jac(results.params)
            hessian_inv = np.linalg.inv(results.model.hessian(results.params))
        elif hasattr(results.model, 'score_obs'):
            xu = results.model.score_obs(results.params)
            hessian_inv = np.linalg.inv(results.model.hessian(results.params))
        else:
            xu = results.model.wexog * results.wresid[:, None]

            hessian_inv = np.asarray(results.normalized_cov_params)

        # experimental support for freq_weights
        if hasattr(results.model, 'freq_weights') and not cov_type == 'clu':
            # we do not want to square the weights in the covariance
            # calculations; assumes that freq_weights are incorporated in
            # score_obs or equivalent, and that xu/score_obs is 2D
            # temporary asarray
            xu /= np.sqrt(np.asarray(results.model.freq_weights)[:, None])

    else:
        raise ValueError('need either tuple of (jac, hessian_inv) or '
                         'results instance')

    return xu, hessian_inv


def _HCCM1(results, scale):
    '''
    sandwich with pinv(x) * scale * pinv(x).T

    where pinv(x) = (X'X)^(-1) X'
    and scale is (nobs, nobs), or (nobs,) for a diagonal matrix diag(scale)

    Parameters
    ----------
    results : result instance
       need to contain regression results, uses results.model.pinv_wexog
    scale : ndarray (nobs,) or (nobs, nobs)
       scale matrix, treated as diagonal matrix if scale is one-dimensional

    Returns
    -------
    H : ndarray (k_vars, k_vars)
        robust covariance matrix for the parameter estimates

    '''
    if scale.ndim == 1:
        H = np.dot(results.model.pinv_wexog,
                   scale[:, None] * results.model.pinv_wexog.T)
    else:
        H = np.dot(results.model.pinv_wexog,
                   np.dot(scale, results.model.pinv_wexog.T))
    return H

def _HCCM2(hessian_inv, scale):
    '''
    sandwich with (X'X)^(-1) * scale * (X'X)^(-1)

    scale is (k_vars, k_vars)
    hessian_inv, e.g. results.normalized_cov_params, provides (X'X)^(-1)

    Parameters
    ----------
    hessian_inv : ndarray (k_vars, k_vars)
       the bread of the sandwich, e.g. results.normalized_cov_params
    scale : ndarray (k_vars, k_vars)
       scale matrix

    Returns
    -------
    H : ndarray (k_vars, k_vars)
        robust covariance matrix for the parameter estimates

    '''
    if scale.ndim == 1:
        scale = scale[:, None]

    xxi = hessian_inv
    H = np.dot(np.dot(xxi, scale), xxi.T)
    return H

#TODO: other kernels, move ?
def weights_bartlett(nlags):
    '''Bartlett weights for HAC

    this will be moved to another module

    Parameters
    ----------
    nlags : int
       highest lag in the kernel window, this does not include the zero lag

    Returns
    -------
    kernel : ndarray, (nlags+1,)
        weights for Bartlett kernel

    '''

    # includes the zero lag
    return 1 - np.arange(nlags + 1) / (nlags + 1.)

def weights_uniform(nlags):
    '''uniform weights for HAC

    this will be moved to another module

    Parameters
    ----------
    nlags : int
       highest lag in the kernel window, this does not include the zero lag

    Returns
    -------
    kernel : ndarray, (nlags+1,)
        weights for uniform kernel

    '''

    # includes the zero lag
    return np.ones(nlags + 1)


kernel_dict = {'bartlett': weights_bartlett,
               'uniform': weights_uniform}
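
# Hedged illustration (added, not in the original module): the Bartlett
# weights decline linearly and include the zero lag.
'''
>>> weights_bartlett(3)
array([ 1.  ,  0.75,  0.5 ,  0.25])
>>> weights_uniform(3)
array([ 1.,  1.,  1.,  1.])
'''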


def S_hac_simple(x, nlags=None, weights_func=weights_bartlett):
    '''inner covariance matrix for HAC (Newey, West) sandwich

    assumes we have a single time series where the zero axis consists of
    consecutive, equally spaced time periods


    Parameters
    ----------
    x : ndarray (nobs,) or (nobs, k_var)
        data, for HAC this is the array of x_i * u_i
    nlags : int or None
        highest lag to include in kernel window. If None, then
        nlags = floor(4 * (T/100)^(2/9)) is used.
    weights_func : callable
        weights_func is called with nlags as argument to get the kernel
        weights. default are Bartlett weights

    Returns
    -------
    S : ndarray, (k_vars, k_vars)
        inner covariance matrix for sandwich

    Notes
    -----
    used by cov_hac_simple

    options might change when other kernels besides Bartlett are available.

    '''

    if x.ndim == 1:
        x = x[:, None]
    n_periods = x.shape[0]
    if nlags is None:
        nlags = int(np.floor(4 * (n_periods / 100.)**(2. / 9.)))

    weights = weights_func(nlags)

    S = weights[0] * np.dot(x.T, x)  # weights[0] is 1, just for completeness

    for lag in range(1, nlags + 1):
        s = np.dot(x[lag:].T, x[:-lag])
        S += weights[lag] * (s + s.T)

    return S
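
# Hedged sketch (added): with nlags=0 only the lag-zero term x'x remains,
# so the HAC inner matrix reduces to the White inner matrix.
'''
>>> import numpy as np
>>> xu = np.random.randn(50, 3)
>>> np.allclose(S_hac_simple(xu, nlags=0), S_white_simple(xu))
True
'''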

def S_white_simple(x):
    '''inner covariance matrix for White heteroscedasticity sandwich


    Parameters
    ----------
    x : ndarray (nobs,) or (nobs, k_var)
        data, for White this is the array of x_i * u_i

    Returns
    -------
    S : ndarray, (k_vars, k_vars)
        inner covariance matrix for sandwich

    Notes
    -----
    this is just dot(X.T, X)

    '''
    if x.ndim == 1:
        x = x[:, None]

    return np.dot(x.T, x)


def S_hac_groupsum(x, time, nlags=None, weights_func=weights_bartlett):
    '''inner covariance matrix for HAC over group sums sandwich

    This assumes we have complete, equally spaced time periods.
    The number of time periods per group need not be the same, but we need
    at least one observation for each time period.

    Groups here are a single categorical variable, or everything except the
    time dimension. This first aggregates x over groups for each time
    period, then applies HAC to the sum per period.

    Parameters
    ----------
    x : ndarray (nobs,) or (nobs, k_var)
        data, for HAC this is the array of x_i * u_i
    time : ndarray, (nobs,)
        time indices, assumed to be integers in range(n_periods)
    nlags : int or None
        highest lag to include in kernel window. If None, then
        nlags = floor(4 * (T/100)^(2/9)) is used.
    weights_func : callable
        weights_func is called with nlags as argument to get the kernel
        weights. default are Bartlett weights

    Returns
    -------
    S : ndarray, (k_vars, k_vars)
        inner covariance matrix for sandwich

    References
    ----------
    Daniel Hoechle, xtscc paper [2]
    Driscoll and Kraay [1]

    '''
    #needs groupsums

    x_group_sums = group_sums(x, time).T  #TODO: transposed return in group_sums

    return S_hac_simple(x_group_sums, nlags=nlags, weights_func=weights_func)


def S_crosssection(x, group):
    '''inner covariance matrix for White on group sums sandwich

    for a single categorical group only; the group can also be the
    product/intersection of several groups

    This is used by cov_cluster and indirectly verified

    '''
    x_group_sums = group_sums(x, group).T  #TODO: why transposed

    return S_white_simple(x_group_sums)


def cov_crosssection_0(results, group):
    '''this one is still wrong, use cov_cluster instead'''

    #TODO: currently used version of group_sums requires 2d resid
    scale = S_crosssection(results.resid[:, None], group)
    scale = np.squeeze(scale)
    cov = _HCCM1(results, scale)
    return cov

def cov_cluster(results, group, use_correction=True):
    '''cluster robust covariance matrix

    Calculates sandwich covariance matrix for a single cluster, i.e. grouped
    variables.

    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    group : ndarray, (nobs,)
       cluster labels, one per observation; converted to integer codes if
       not already integers
    use_correction : bool
       If true (default), then the small sample correction factor is used.

    Returns
    -------
    cov : ndarray, (k_vars, k_vars)
        cluster robust covariance matrix for parameter estimates

    Notes
    -----
    same result as Stata in UCLA example and same as Petersen

    '''
    xu, hessian_inv = _get_sandwich_arrays(results, cov_type='clu')

    if not hasattr(group, 'dtype') or group.dtype != np.dtype('int'):
        clusters, group = np.unique(group, return_inverse=True)
    else:
        clusters = np.unique(group)

    scale = S_crosssection(xu, group)

    nobs, k_params = xu.shape
    n_groups = len(clusters)  # replace with stored group attributes if available

    cov_c = _HCCM2(hessian_inv, scale)

    if use_correction:
        cov_c *= (n_groups / (n_groups - 1.) *
                  ((nobs - 1.) / float(nobs - k_params)))

    return cov_c
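
# Hedged usage sketch (added; variable names are illustrative). Cluster
# robust standard errors for an OLS fit with one group label per
# observation:
'''
>>> import numpy as np
>>> import statsmodels.api as sm
>>> np.random.seed(12345)
>>> nobs, n_groups = 500, 50
>>> firm = np.repeat(np.arange(n_groups), nobs // n_groups)
>>> x = sm.add_constant(np.random.randn(nobs, 2))
>>> y = x.sum(1) + np.random.randn(nobs)
>>> res = sm.OLS(y, x).fit()
>>> bse_clu = se_cov(cov_cluster(res, firm))
'''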

def cov_cluster_2groups(results, group, group2=None, use_correction=True):
    '''cluster robust covariance matrix for two groups/clusters

    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    group : ndarray
       cluster labels for the first cluster, or, if group2 is None, an
       array with two columns containing both sets of cluster labels
    group2 : ndarray or None
       cluster labels for the second cluster
    use_correction : bool
       If true (default), then the small sample correction factor is used.

    Returns
    -------
    cov_both : ndarray, (k_vars, k_vars)
        cluster robust covariance matrix for parameter estimates, for both
        clusters
    cov_0 : ndarray, (k_vars, k_vars)
        cluster robust covariance matrix for parameter estimates for first
        cluster
    cov_1 : ndarray, (k_vars, k_vars)
        cluster robust covariance matrix for parameter estimates for second
        cluster

    Notes
    -----

    verified against Petersen's table, (4 decimal print precision)
    '''

    if group2 is None:
        if group.ndim != 2 or group.shape[1] != 2:
            raise ValueError('if group2 is not given, then group needs to '
                             'be an array with two columns')
        group0 = group[:, 0]
        group1 = group[:, 1]
    else:
        group0 = group
        group1 = group2
        group = (group0, group1)


    cov0 = cov_cluster(results, group0, use_correction=use_correction)
    cov1 = cov_cluster(results, group1, use_correction=use_correction)

    # cov of cluster formed by intersection of two groups
    cov01 = cov_cluster(results,
                        combine_indices(group)[0],
                        use_correction=use_correction)

    # robust cov matrix for union of groups
    cov_both = cov0 + cov1 - cov01

    # return all three (for now?)
    return cov_both, cov0, cov1
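
# Hedged sketch (added): two-way clustering as in Cameron, Gelbach and
# Miller [4], continuing the illustrative names from the sketch above;
# the two one-way matrices are added and the intersection subtracted.
'''
>>> time = np.tile(np.arange(nobs // n_groups), n_groups)
>>> cov_both, cov_firm, cov_time = cov_cluster_2groups(res, firm,
...                                                    group2=time)
>>> bse_both = se_cov(cov_both)
'''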


def cov_white_simple(results, use_correction=True):
    '''
    heteroscedasticity robust covariance matrix (White)

    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    use_correction : bool
       If true (default), then the small sample correction factor
       nobs / (nobs - k_params) is used.

    Returns
    -------
    cov : ndarray, (k_vars, k_vars)
        heteroscedasticity robust covariance matrix for parameter estimates

    Notes
    -----
    With use_correction=False this produces the same result as cov_hc0,
    which does not include any small sample correction.

    verified (against LinearRegressionResults and Petersen)

    See Also
    --------
    cov_hc1, cov_hc2, cov_hc3 : heteroscedasticity robust covariance matrices
        with small sample corrections

    '''
    xu, hessian_inv = _get_sandwich_arrays(results)
    sigma = S_white_simple(xu)

    cov_w = _HCCM2(hessian_inv, sigma)  # add bread to sandwich

    if use_correction:
        nobs, k_params = xu.shape
        cov_w *= nobs / float(nobs - k_params)

    return cov_w


def cov_hac_simple(results, nlags=None, weights_func=weights_bartlett,
                   use_correction=True):
    '''
    heteroscedasticity and autocorrelation robust covariance matrix
    (Newey-West)

    Assumes we have a single time series where the zero axis consists of
    consecutive, equally spaced time periods.


    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    nlags : int or None
        highest lag to include in kernel window. If None, then
        nlags = floor(4 * (T/100)^(2/9)) is used.
    weights_func : callable
        weights_func is called with nlags as argument to get the kernel
        weights. default are Bartlett weights
    use_correction : bool
        If true (default), then the small sample correction factor
        nobs / (nobs - k_params) is used.

    Returns
    -------
    cov : ndarray, (k_vars, k_vars)
        HAC robust covariance matrix for parameter estimates

    Notes
    -----
    verified only for nlags=0, which is just White;
    just guessing on the correction factor, need reference

    options might change when other kernels besides Bartlett are available.

    '''
    xu, hessian_inv = _get_sandwich_arrays(results)
    sigma = S_hac_simple(xu, nlags=nlags, weights_func=weights_func)

    cov_hac = _HCCM2(hessian_inv, sigma)

    if use_correction:
        nobs, k_params = xu.shape
        cov_hac *= nobs / float(nobs - k_params)

    return cov_hac

cov_hac = cov_hac_simple   # alias for users

#---------------------- use time lags corrected for groups
#the following were copied from a different experimental script,
#groupidx is a tuple; observations are assumed to be stacked by group member
#and sorted by time. An equal number of periods is not required, but equal
#spacing is.
#I think this is pure within-group HAC: apply HAC to each group member
#separately

def lagged_groups(x, lag, groupidx):
    '''
    assumes sorted by time, groupidx is tuple of start and end values
    not optimized, just to get a working version, loop over groups
    '''
    out0 = []
    out_lagged = []
    for lo, up in groupidx:
        if lo + lag < up:  # group is longer than lag
            out0.append(x[lo + lag:up])
            out_lagged.append(x[lo:up - lag])

    if out0 == []:
        raise ValueError('all groups are empty after taking lags')
    return np.vstack(out0), np.vstack(out_lagged)


def S_nw_panel(xw, weights, groupidx):
    '''inner covariance matrix for HAC for panel data

    no denominator nobs used

    no reference for this, just accounting for time indices
    '''
    nlags = len(weights) - 1

    S = weights[0] * np.dot(xw.T, xw)  # weights[0] is 1, just for completeness
    for lag in range(1, nlags + 1):
        xw0, xwlag = lagged_groups(xw, lag, groupidx)
        s = np.dot(xw0.T, xwlag)
        S += weights[lag] * (s + s.T)
    return S


def cov_nw_panel(results, nlags, groupidx, weights_func=weights_bartlett,
                 use_correction='hac'):
    '''Panel HAC robust covariance matrix

    Assumes we have a panel of time series with consecutive, equally spaced
    time periods. Data is assumed to be in long format with the time series
    of each individual stacked into one array. Panel can be unbalanced.

    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    nlags : int or None
        Highest lag to include in kernel window. Currently, no default
        because the optimal length will depend on the number of observations
        per cross-sectional unit.
    groupidx : list of tuple
        each tuple should contain the start and end index for an individual.
        (groupidx might change in future).
    weights_func : callable
        weights_func is called with nlags as argument to get the kernel
        weights. default are Bartlett weights
    use_correction : 'cluster' or 'hac' or False
        If False, then no small sample correction is used.
        If 'hac' (default), then the same correction as in the single time
        series case, cov_hac, is used.
        If 'cluster', then the same correction as in cov_cluster is used.


    Returns
    -------
    cov : ndarray, (k_vars, k_vars)
        HAC robust covariance matrix for parameter estimates

    Notes
    -----
    For nlags=0, this is just White covariance, cov_white.
    If the kernel is uniform, `weights_uniform`, with nlags equal to the
    number of observations per unit in a balanced panel, then cov_cluster
    and cov_hac_panel are identical.

    Tested against Stata `newey` command with same defaults.

    Options might change when other kernels besides Bartlett and uniform are
    available.

    '''
    if nlags == 0:  # so we can reproduce HC0 White
        weights = [1, 0]  # to avoid the scalar check in hac_nw
    else:
        weights = weights_func(nlags)

    xu, hessian_inv = _get_sandwich_arrays(results)

    S_hac = S_nw_panel(xu, weights, groupidx)
    cov_hac = _HCCM2(hessian_inv, S_hac)
    if use_correction:
        nobs, k_params = xu.shape
        if use_correction == 'hac':
            cov_hac *= nobs / float(nobs - k_params)
        elif use_correction in ['c', 'clu', 'cluster']:
            n_groups = len(groupidx)
            cov_hac *= n_groups / (n_groups - 1.)
            cov_hac *= ((nobs - 1.) / float(nobs - k_params))

    return cov_hac
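
# Hedged sketch (added): building groupidx as (start, end) pairs for data
# stacked by individual, continuing the illustrative names from the
# sketches above (balanced panel with nobs // n_groups periods per firm).
'''
>>> t_per_group = nobs // n_groups
>>> groupidx = [(i * t_per_group, (i + 1) * t_per_group)
...             for i in range(n_groups)]
>>> cov_p = cov_nw_panel(res, nlags=2, groupidx=groupidx,
...                      use_correction='hac')
'''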


def cov_nw_groupsum(results, nlags, time, weights_func=weights_bartlett,
                    use_correction=0):
    '''Driscoll and Kraay panel robust covariance matrix

    Robust covariance matrix for panel data of Driscoll and Kraay.

    Assumes we have a panel of time series where the time index is available.
    The time index is assumed to represent equally spaced periods. At least
    one observation per period is required.

    Parameters
    ----------
    results : result instance
       result of a regression, uses results.model.exog and results.resid
       TODO: this should use wexog instead
    nlags : int or None
        Highest lag to include in kernel window. Currently, no default
        because the optimal length will depend on the number of observations
        per cross-sectional unit.
    time : ndarray of int
        this should contain the coding for the time period of each
        observation. time periods should be integers in range(n_periods).
    weights_func : callable
        weights_func is called with nlags as argument to get the kernel
        weights. default are Bartlett weights
    use_correction : 'cluster' or 'hac' or False
        If False (default), then no small sample correction is used.
        If 'hac', then the same correction as in the single time series
        case, cov_hac, is used.
        If 'cluster', then the same correction as in cov_cluster is used.

    Returns
    -------
    cov : ndarray, (k_vars, k_vars)
        HAC robust covariance matrix for parameter estimates

    Notes
    -----
    Tested against Stata xtscc package, which uses no small sample correction

    This first sums the relevant variables for each time period over all
    individuals/groups, and then applies the same kernel weighted averaging
    over time as in HAC.

    Warning:
    In an example with a short panel (few time periods and many individuals)
    with mainly across-individual variation, this estimator did not produce
    reasonable results.

    Options might change when other kernels besides Bartlett and uniform are
    available.

    References
    ----------
    Daniel Hoechle, xtscc paper [2]
    Driscoll and Kraay [1]

    '''

    xu, hessian_inv = _get_sandwich_arrays(results)

    S_hac = S_hac_groupsum(xu, time, nlags=nlags, weights_func=weights_func)
    cov_hac = _HCCM2(hessian_inv, S_hac)
    if use_correction:
        nobs, k_params = xu.shape
        if use_correction == 'hac':
            cov_hac *= nobs / float(nobs - k_params)
        elif use_correction in ['c', 'cluster']:
            n_groups = len(np.unique(time))
            cov_hac *= n_groups / (n_groups - 1.)
            cov_hac *= ((nobs - 1.) / float(nobs - k_params))

    return cov_hac

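
# Hedged sketch (added): Driscoll-Kraay standard errors via the time index,
# continuing the illustrative names from the sketches above.
'''
>>> cov_dk = cov_nw_groupsum(res, nlags=2, time=time)
>>> bse_dk = se_cov(cov_dk)
'''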