\name{svytable}
\alias{svreptable}
\alias{svytable}
\alias{svytable.svyrep.design}
\alias{svytable.survey.design}
\alias{svychisq}
\alias{svychisq.survey.design}
\alias{svychisq.svyrep.design}
\alias{summary.svytable}
\alias{print.summary.svytable}
\alias{summary.svreptable}
\alias{degf}
\alias{degf.svyrep.design}
\alias{degf.survey.design2}
\alias{degf.twophase}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{Contingency tables for survey data}
\description{
  Contingency tables and chi-squared tests of association for survey data.
}
\usage{
\method{svytable}{survey.design}(formula, design, Ntotal = NULL, round = FALSE, ...)
\method{svytable}{svyrep.design}(formula, design, Ntotal = sum(weights(design, "sampling")), round = FALSE, ...)
\method{svychisq}{survey.design}(formula, design,
   statistic = c("F", "Chisq", "Wald", "adjWald", "lincom", "saddlepoint"), na.rm = TRUE, ...)
\method{svychisq}{svyrep.design}(formula, design,
   statistic = c("F", "Chisq", "Wald", "adjWald", "lincom", "saddlepoint"), na.rm = TRUE, ...)
\method{summary}{svytable}(object,
   statistic = c("F", "Chisq", "Wald", "adjWald", "lincom", "saddlepoint"), ...)
degf(design, ...)
\method{degf}{survey.design2}(design, ...)
\method{degf}{svyrep.design}(design, tol = 1e-5, ...)
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{formula}{Model formula specifying margins for the table (using \code{+} only).}
  \item{design}{Survey design object.}
  \item{statistic}{Test statistic to compute. See Details below.}
  \item{Ntotal}{A population total or set of population stratum totals
    to normalise to.}
  \item{round}{Should the table entries be rounded to the nearest
    integer?}
  \item{na.rm}{Remove missing values.}
  \item{object}{Output from \code{svytable}.}
  \item{...}{For \code{svytable} these are passed to \code{xtabs}. Use
    \code{exclude=NULL}, \code{na.action=na.pass} to include \code{NA}s
    in the table.}
  \item{tol}{Tolerance for \code{\link{qr}} in computing the matrix rank.}
 }
\details{

The \code{svytable} function computes a weighted crosstabulation. This
is especially useful for producing graphics. It is sometimes easier
to use \code{\link{svytotal}} or \code{\link{svymean}}, which also
produce standard errors, design effects, etc.

The frequencies in the table can be normalised to some convenient total
such as 100 or 1.0 by specifying the \code{Ntotal} argument.  If the
formula has a left-hand side, the mean or sum of this variable rather
than the frequency is tabulated.

The \code{Ntotal} argument can be either a single number or a data
frame whose first column gives the (first-stage) sampling strata and
whose second column gives the population size in each stratum.  In this
second case the \code{svytable} command performs `post-stratification':
tabulating and scaling to the population within strata and then adding
up the strata.

As with other \code{xtabs} objects, the output of \code{svytable} can be
processed by \code{ftable} for more attractive display. The
\code{summary} method for \code{svytable} objects calls \code{svychisq}
for a test of independence.

\code{svychisq} computes first- and second-order Rao-Scott corrections to
the Pearson chi-squared test, and two Wald-type tests.

The default (\code{statistic="F"}) is the Rao-Scott second-order
correction.  The p-values are computed with a Satterthwaite
approximation to the distribution and with denominator degrees of
freedom as recommended by Thomas and Rao (1990). The alternative
\code{statistic="Chisq"} adjusts the Pearson chi-squared statistic by a
design effect estimate and then compares it to the chi-squared
distribution it would have under simple random sampling.

The \code{statistic="Wald"} test is that proposed by Koch et al (1975)
and used by the SUDAAN software package. It is a Wald test based on the
differences between the observed cell counts and those expected under
independence. The adjustment given by \code{statistic="adjWald"} reduces
the statistic when the number of PSUs is small compared to the number of
degrees of freedom of the test. Thomas and Rao (1987) compare these
tests and find the adjustment beneficial.

\code{statistic="lincom"} replaces the numerator of the Rao-Scott F with
the exact asymptotic distribution, which is a linear combination of
chi-squared variables (see \code{\link{pchisqsum}}), and
\code{statistic="saddlepoint"} uses a saddlepoint approximation to this
distribution.  The \code{CompQuadForm} package is needed for
\code{statistic="lincom"} but not for
\code{statistic="saddlepoint"}. The saddlepoint approximation is
especially useful when the p-value is very small (as in large-scale
multiple testing problems).

For designs using replicate weights the code is essentially the same as
for designs with sampling structure, since the necessary variance
computations are done by the appropriate methods of
\code{\link{svytotal}} and \code{\link{svymean}}.  The exception is that
the degrees of freedom are computed as one less than the rank of the
matrix of replicate weights (by \code{degf}).


At the moment, \code{svychisq} works only for 2-dimensional tables.
}
\value{
  The table commands return an \code{xtabs} object; \code{svychisq}
  returns an \code{htest} object.
}
\references{
Davies RB (1973) "Numerical inversion of a characteristic function"
Biometrika 60:415-7.

Duchesne P, Lafaye de Micheaux P (2010) "Computing the distribution of
quadratic forms: Further comparisons between the Liu-Tang-Zhang
approximation and exact methods" Computational Statistics and Data
Analysis 54:858-862.

Koch GG, Freeman DH, Freeman JL (1975) "Strategies in the
multivariate analysis of data from complex surveys" International
Statistical Review 43:59-78.

Rao JNK, Scott AJ (1984) "On Chi-squared Tests For Multiway
Contingency Tables with Cell Proportions Estimated From Survey Data"
Annals of Statistics 12:46-60.

Sribney WM (1998) "Two-way contingency tables for survey or clustered
data" Stata Technical Bulletin 45:33-49.

Thomas DR, Rao JNK (1987) "Small-sample comparison of level and power
for simple goodness-of-fit statistics under cluster sampling" JASA 82:630-636.

}

\note{Rao and Scott (1984) leave open one computational issue. In
  computing `generalised design effects' for these tests, should the
  variance under simple random sampling be estimated using the observed
  proportions or the predicted proportions under the null
  hypothesis? \code{svychisq} uses the observed proportions, following
  simulations by Sribney (1998), and the choices made in Stata.}


\seealso{\code{\link{svytotal}} and \code{\link{svymean}} report totals
  and proportions by category for factor variables.

  See \code{\link{svyby}} and \code{\link{ftable.svystat}} to construct
  more complex tables of summary statistics.

  See \code{\link{svyloglin}} for loglinear models.

  See \code{\link{regTermTest}} for Rao-Scott tests in regression models.

  See \url{https://notstatschat.rbind.io/2019/06/08/design-degrees-of-freedom-brief-note/}
  for an explanation of the design degrees of freedom with replicate weights.

}
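\examples{
  data(api)
  ## unweighted table from the full population file, for comparison
  xtabs(~sch.wide+stype, data=apipop)

  ## cluster sample of school districts
  dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
  summary(dclus1)

  ## weighted crosstabulation estimated from the sample
  (tbl <- svytable(~sch.wide+stype, dclus1))
  plot(tbl)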
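
  ## Illustrative sketch (not part of the original examples), using the
  ## dclus1 design defined above: normalise the weighted table to a
  ## convenient total. With Ntotal=100 the cells are rescaled to add to 100.
  (pct <- svytable(~sch.wide+stype, dclus1, Ntotal=100))

  ## A formula with a left-hand side tabulates that variable instead of
  ## frequencies: here, weighted enrollment (enroll) by school type.
  svytable(enroll~stype, dclus1)

  ## ftable gives a flatter display of a three-way table.
  ftable(svytable(~sch.wide+comp.imp+stype, dclus1))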
  fourfoldplot(svytable(~sch.wide+comp.imp+stype,design=dclus1,round=TRUE), conf.level=0)

  ## default Rao-Scott second-order test, then the first-order correction
  svychisq(~sch.wide+stype, dclus1)
  summary(tbl, statistic="Chisq")
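
  ## Illustrative sketch (not part of the original examples), using the
  ## dclus1 design defined above: the same test of independence with other
  ## choices of statistic described in Details.
  svychisq(~sch.wide+stype, dclus1, statistic="Wald")
  (sp <- svychisq(~sch.wide+stype, dclus1, statistic="saddlepoint"))

  ## statistic="lincom" needs the CompQuadForm package, so guard the call.
  if (requireNamespace("CompQuadForm", quietly=TRUE)) {
    print(svychisq(~sch.wide+stype, dclus1, statistic="lincom"))
  }
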
  svychisq(~sch.wide+stype, dclus1, statistic="adjWald")

  ## the same analyses with a replicate-weights version of the design
  rclus1 <- as.svrepdesign(dclus1)
  summary(svytable(~sch.wide+stype, rclus1))
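
  ## Illustrative sketch (not part of the original examples), using the
  ## dclus1 and rclus1 designs defined above: the design degrees of freedom
  ## used as denominator degrees of freedom by the tests.
  (df1 <- degf(dclus1))
  ## For replicate weights this is one less than the rank of the matrix
  ## of replicate weights.
  (dfr <- degf(rclus1))
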
  svychisq(~sch.wide+stype, rclus1, statistic="adjWald")

}
\keyword{survey}% at least one, from doc/KEYWORDS
\keyword{category}% __ONLY ONE__ keyword per line
\keyword{htest}% __ONLY ONE__ keyword per line