\chapter{Matrices}
\label{chap:matrices}

Matrices are one- or two-dimensional arrays of double-precision
floating-point numbers. Hansl users who are accustomed to other matrix
languages should note that multi-index objects are not
supported. Matrices have rows and columns, and that's it.

\section{Matrix indexing}
\label{sec:mat-index}

Individual matrix elements are accessed through the \verb|[r,c]|
syntax, where indexing starts at 1: \texttt{X[3,4]} indicates the
element of $X$ in the third row, fourth column. For example,
\begin{code}
  matrix X = zeros(2,3)
  X[2,1] = 4
  print X
\end{code}
produces
\begin{code}
X (2 x 3)

  0   0   0
  4   0   0
\end{code}

Here are some more advanced ways to access matrix elements:
\begin{enumerate}
\item If the matrix has only one row (column), the column (row)
  specification can be omitted, as in \texttt{x[3]}.
\item Including the comma but omitting the row or column specification
  means ``take them all'', as in \texttt{x[4,]} (fourth row, all columns).
\item For square matrices, the special syntax \texttt{x[diag]} can be
  used to access the diagonal.
\item Consecutive rows or columns can be specified via the colon
  (\texttt{:}) character, as in \texttt{x[,2:4]} (columns 2 to 4).
  But note that, unlike in some other matrix languages, the syntax
  \texttt{[m:n]} is illegal if $m>n$.
\item It is possible to use a vector to hold indices into a matrix.
  For example, if $e = [2,3,6]$, then \texttt{X[,e]} contains the
  second, third and sixth columns of $X$.
\end{enumerate}
Moreover, matrices can be empty (zero rows and columns).
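
By way of illustration, the following snippet (with made-up values,
using the \texttt{seq()} and \texttt{mshape()} functions described in
Table~\ref{tab:essential-matfuncs} below) exercises each of the forms
just listed:
\begin{code}
  matrix X = mshape(seq(1,16), 4, 4)  # a 4x4 matrix, for illustration
  matrix e = {2, 3, 4}                # a vector of column indices

  eval X[4,]      # fourth row, all columns
  eval X[diag]    # main diagonal of the square matrix X
  eval X[,2:4]    # columns 2 to 4
  eval X[,e]      # the same columns, selected via the vector e
\end{code}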

In the example above, the matrix \texttt{X} was constructed using
the function \texttt{zeros()}, whose meaning should be obvious, but
matrix elements can also be specified directly, as in
\begin{code}
scalar a = 2*3
matrix A = { 1, 2, 3 ; 4, 5, a }
\end{code}
The matrix is defined by rows; the elements on each row are separated
by commas and rows are separated by semicolons.  The whole expression
must be wrapped in braces.  Spaces within the braces are not
significant. The above expression defines a $2\times3$ matrix.

Note that each element should be a numerical value, the name of a
scalar variable, or an expression that evaluates to a scalar. In the
example above the scalar \texttt{a} was first assigned a value and
then used in matrix construction. (Also note, in passing, that
\texttt{a} and \texttt{A} are two separate identifiers, due to
case-sensitivity.)

\section{Matrix operations}
\label{sec:mat-op}

Matrix sum, difference and product are obtained via \texttt{+},
\texttt{-} and \texttt{*}, respectively. The prime operator
(\texttt{'}) can act as a unary operator, in which case it transposes
the preceding matrix, or as a binary operator, in which case it acts
as in ordinary matrix algebra, multiplying the transpose of the first
matrix into the second one.\footnote{In fact, in this case an
  optimized algorithm is used; you should always use \texttt{a'a}
  instead of \texttt{a'*a} for maximal precision and performance.}
Errors are flagged if conformability is a problem. For example:
\begin{code}
  matrix a = {11, 22 ; 33, 44}  # a is square 2 x 2
  matrix b = {1,2,3; 3,2,1}     # b is 2 x 3

  matrix c = a'         # c is the transpose of a
  matrix d = a*b        # d is a 2x3 matrix equal to a times b

  matrix gina = b'd     # valid: gina is 3x3
  matrix lina = d + b   # valid: lina is 2x3

  /* -- these would generate errors if uncommented ----- */

  # pina = a + b  # sum non-conformability
  # rina = d * b  # product non-conformability
\end{code}

Other noteworthy matrix operators include \texttt{\^{}} (matrix power),
\texttt{**} (Kronecker product), and the ``concatenation'' operators,
\verb|~| (horizontal) and \texttt{|} (vertical). Readers are invited
to try them out by running the following code:
\begin{code}
matrix A = {2,1;0,1}
matrix B = {1,1;1,0}

matrix KP = A ** B
matrix PWR = A^3
matrix HC = A ~ B
matrix VC = A | B

print A B KP PWR HC VC
\end{code}
Note, in particular, that $A^3 = A \cdot A \cdot A$, which is different
from what you get by computing the cube of each element of $A$
separately.
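
The element-wise alternative is the ``dot'' power operator,
\texttt{.\^{}}, one of the dot operators discussed below; a quick
comparison makes the distinction concrete:
\begin{code}
matrix A = {2,1; 0,1}
eval A^3     # matrix power A*A*A, i.e. {8,7; 0,1}
eval A.^3    # each element cubed, i.e. {8,1; 0,1}
\end{code}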

Hansl also supports matrix left- and right-``division'', via the
\verb'\' and \verb'/' operators, respectively. The expression
\verb|A\b| solves $Ax = b$ for the unknown $x$. $A$ is assumed to be
an $m \times n$ matrix with full column rank. If $A$ is square the
method is LU decomposition. If $m > n$ the QR decomposition is used to
find the least squares solution. In most cases, this is numerically
more robust and more efficient than inverting $A$ explicitly.
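
As a quick sketch (the matrix values below are made up for
illustration), the following compares left-division with explicit
inversion; both yield the solution of $Ax = b$, but \verb|A\b| is the
preferred idiom:
\begin{code}
matrix A = {2,1; 1,3}
matrix b = {5; 10}

matrix x1 = A\b        # solve Ax = b via LU: x = {1; 3}
matrix x2 = inv(A)*b   # same solution, via explicit inversion

print x1 x2
\end{code}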

Element-by-element operations are supported by the so-called ``dot''
operators, which are obtained by putting a dot (``\texttt{.}'') before
the corresponding operator. For example, the code
\begin{code}
A = {1,2; 3,4}
B = {-1,0; 1,-1}
eval A * B
eval A .* B
\end{code}
produces
\begin{code}
   1   -2
   1   -4

  -1    0
   3   -4
\end{code}


It's easy to verify that the first operation performed is regular
matrix multiplication $A \cdot B$, whereas the second one is the
Hadamard (element-by-element) product $A \odot B$. In fact, dot
operators are more general and powerful than shown in the example
above; see the chapter on matrices in \GUG{} for details.

Dot and concatenation operators are less rigid than ordinary matrix
operations in terms of conformability requirements: in most cases
hansl will try to do ``the obvious thing''. For example, a common
idiom in hansl is \texttt{Y = X ./ w}, where $X$ is an $n \times k$
matrix and $w$ is an $n \times 1$ vector. The result $Y$ is an
$n \times k$ matrix in which each row of $X$ is divided by the
corresponding element of $w$. In proper matrix notation, this
operation would be written as
\[
  Y = \langle w \rangle^{-1} X,
\]
where $\langle \cdot \rangle$ indicates a diagonal
matrix. Translating the above expression literally would imply
creating a diagonal matrix out of $w$ and then inverting it, which is
computationally much more expensive than using the dot operation. A
detailed discussion is provided in \GUG.
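
A brief sketch, with made-up values, shows the idiom in action; each
row of \texttt{Y} equals the corresponding row of \texttt{X} divided
by the matching element of \texttt{w}:
\begin{code}
matrix X = {2, 4; 6, 9; 8, 12}   # a 3x2 example matrix
matrix w = {2; 3; 4}             # a 3x1 vector of divisors

matrix Y = X ./ w   # yields {1,2; 2,3; 2,3}
print Y
\end{code}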

Hansl provides a reasonably comprehensive set of matrix functions,
that is, functions that produce and/or operate on matrices. For a
full list, see the \GCR, but a basic ``survival kit'' is provided
in Table~\ref{tab:essential-matfuncs}.  Moreover, most scalar
functions, such as \texttt{abs()} and \texttt{log()}, will operate on
a matrix element-by-element.

\begin{table}[htbp]
  \centering
  \small
  \begin{tabular}{rp{0.6\textwidth}}
    \textbf{Function(s)} & \textbf{Purpose} \\
    \hline
    \texttt{rows(X), cols(X)} & return the number of rows and columns
    of $X$, respectively \\
    \texttt{zeros(r,c), ones(r,c)} & produce matrices with $r$ rows
    and $c$ columns, filled with zeros and ones, respectively \\
    \texttt{mshape(X,r,c)} & rearrange the elements of $X$ into a
    matrix with $r$ rows and $c$ columns \\
    \texttt{I(n)} & identity matrix of size $n$ \\
    \texttt{seq(a,b)} & generate a row vector containing the integers
    from $a$ to $b$ \\
    \texttt{inv(A)} & invert, if possible, the matrix $A$ \\
    \texttt{maxc(A), minc(A), meanc(A)} & return a row vector holding
    the maximum, minimum and mean of each column of $A$, respectively \\
    \texttt{maxr(A), minr(A), meanr(A)} & return a column vector holding
    the maximum, minimum and mean of each row of $A$, respectively \\
    \texttt{mnormal(r,c), muniform(r,c)} & generate $r \times c$
    matrices filled with standard Gaussian and uniform pseudo-random
    numbers, respectively \\
    \hline
  \end{tabular}
  \caption{Essential set of hansl matrix functions}
  \label{tab:essential-matfuncs}
\end{table}

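As a small demonstration (the values are made up for illustration),
the following lines exercise a few entries from the table, together
with the element-wise application of a scalar function:
\begin{code}
matrix S = seq(1,6)          # row vector containing 1..6
matrix M = mshape(S, 2, 3)   # reshape into a 2x3 matrix

eval rows(M)    # 2
eval cols(M)    # 3
eval maxc(M)    # column maxima: a 1x3 row vector
eval meanr(M)   # row means: a 2x1 column vector
eval log(M)     # scalar function applied element by element
\end{code}
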
The following piece of code is meant to provide a concise example of
all the features mentioned above.

\begin{code}
# example: OLS using matrices

# fix the sample size
scalar T = 256

# construct the vector of coefficients by direct entry
matrix beta = {1.5, 2.5, -0.5} # note: row vector

# construct the matrix of independent variables
matrix Z = mnormal(T, cols(beta)) # built-in functions

# now construct the dependent variable: note the
# usage of the "dot" and transpose operators

matrix y = {1.2} .+ Z*beta' + mnormal(T, 1)

# now do estimation
matrix X = 1 ~ Z  # concatenation operator
matrix beta_hat1 = inv(X'X) * (X'y) # OLS by hand
matrix beta_hat2 = mols(y, X)       # via the built-in function
matrix beta_hat3 = X\y              # via matrix division

print beta_hat1 beta_hat2 beta_hat3
\end{code}

\section{Matrix pointers}
\label{sec:mat-pointers}

Hansl uses the ``by value'' convention for passing parameters to
functions. That is, when a variable is passed to a function as an
argument, what the function actually gets is a \emph{copy} of the
variable, which means that the value of the variable at the caller
level is not modified by anything that goes on inside the function.
But the use of pointers allows a function and its caller to cooperate
such that an outer variable can be modified by the function.
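
User-defined functions (covered later in this primer) can exploit the
same mechanism. The following minimal sketch, with a hypothetical
function named \texttt{double\_it}, illustrates the idea:
\begin{code}
function void double_it (matrix *m)
    m = 2*m    # modifies the caller's matrix in place
end function

matrix A = {1, 2; 3, 4}
double_it(&A)
print A    # A now holds {2,4; 6,8}
\end{code}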

This mechanism is used by some built-in matrix functions to provide
more than one ``return'' value. The primary result is always provided
by the return value proper, but certain auxiliary values may be
retrieved via ``pointerized'' arguments; this usage is flagged by
prepending the ampersand symbol, ``\texttt{\&}'', to the name of the
argument variable.

The \texttt{eigensym} function, which performs the eigen-analysis of
symmetric matrices, is a case in point. In the example below the first
argument $A$ represents the input data, that is, the matrix
whose analysis is required. This variable will not be modified in any
way by the function call. The primary result is the vector of
eigenvalues of $A$, which is here assigned to the variable
\texttt{ev}. The (optional) second argument, \texttt{\&V} (which may
be read as ``the address of \texttt{V}''), is used to retrieve the
right eigenvectors of $A$. A variable named in this way must already
be declared, but it need not be of the right dimensions to receive
the result; it will be resized as needed.
\begin{code}
matrix A = {1,2 ; 2,5}
matrix V
matrix ev = eigensym(A, &V)
print A ev V
\end{code}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "hansl-primer"
%%% End: