\chapter{Matrices}
\label{chap:matrices}

Matrices are one- or two-dimensional arrays of double-precision
floating-point numbers. Hansl users who are accustomed to other matrix
languages should note that multi-index objects are not
supported. Matrices have rows and columns, and that's it.

\section{Matrix indexing}
\label{sec:mat-index}

Individual matrix elements are accessed through the \verb|[r,c]|
syntax, where indexing starts at 1. For example, \texttt{X[3,4]}
indicates the element of $X$ in the third row, fourth column. Thus the
code
\begin{code}
  matrix X = zeros(2,3)
  X[2,1] = 4
  print X
\end{code}
produces
\begin{code}
X (2 x 3)

  0   0   0
  4   0   0
\end{code}

Here are some more advanced ways to access matrix elements:
\begin{enumerate}
\item In case the matrix has only one row (column), the column (row)
  specification can be omitted, as in \texttt{x[3]}.
\item Including the comma but omitting the row or column specification
  means ``take them all'', as in \texttt{x[4,]} (fourth row, all
  columns).
\item For square matrices, the special syntax \texttt{x[diag]} can be
  used to access the diagonal.
\item Consecutive rows or columns can be specified via the colon
  (\texttt{:}) character, as in \texttt{x[,2:4]} (columns 2 to 4).
  Note that, unlike in some other matrix languages, the syntax
  \texttt{[m:n]} is illegal if $m>n$.
\item It is possible to use a vector to hold indices into a matrix.
  For example, if $e = [2,3,6]$, then \texttt{X[,e]} contains the
  second, third and sixth columns of $X$.
\end{enumerate}
Moreover, matrices can be empty (zero rows and columns).
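As a quick illustration of the indexing variants listed above, the
following sketch (the specific numbers are arbitrary) exercises each of
them in turn:
\begin{code}
matrix X = {1, 2, 3; 4, 5, 6; 7, 8, 9}  # a 3 x 3 matrix

eval X[2,]      # second row, all columns
eval X[,1:2]    # all rows, columns 1 and 2
eval X[diag]    # the main diagonal of X

matrix e = {1, 3}
eval X[,e]      # the first and third columns of X
\end{code}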
In the example above, the matrix \texttt{X} was constructed using
the function \texttt{zeros()}, whose meaning should be obvious, but
matrix elements can also be specified directly, as in
\begin{code}
scalar a = 2*3
matrix A = { 1, 2, 3 ; 4, 5, a }
\end{code}
The matrix is defined by rows; the elements on each row are separated
by commas and rows are separated by semicolons. The whole expression
must be wrapped in braces. Spaces within the braces are not
significant. The above expression defines a $2\times3$ matrix.

Note that each element should be a numerical value, the name of a
scalar variable, or an expression that evaluates to a scalar. In the
example above the scalar \texttt{a} was first assigned a value and
then used in matrix construction. (Also note, in passing, that
\texttt{a} and \texttt{A} are two separate identifiers, due to
case-sensitivity.)

\section{Matrix operations}
\label{sec:mat-op}

Matrix sum, difference and product are obtained via \texttt{+},
\texttt{-} and \texttt{*}, respectively. The prime operator
(\texttt{'}) can act as a unary operator, in which case it transposes
the preceding matrix, or as a binary operator, in which case it acts
as in ordinary matrix algebra, multiplying the transpose of the first
matrix into the second one.\footnote{In fact, in this case an
  optimized algorithm is used; you should always use \texttt{a'a}
  instead of \texttt{a'*a} for maximal precision and performance.}
Errors are flagged if conformability is a problem.
For example:
\begin{code}
  matrix a = {11, 22 ; 33, 44}  # a is square, 2 x 2
  matrix b = {1, 2, 3; 3, 2, 1} # b is 2 x 3

  matrix c = a'                 # c is the transpose of a
  matrix d = a*b                # d is a 2 x 3 matrix equal to a times b

  matrix gina = b'd             # valid: gina is 3 x 3
  matrix lina = d + b           # valid: lina is 2 x 3

  /* -- these would generate errors if uncommented ----- */

  # pina = a + b                # sum non-conformability
  # rina = d * b                # product non-conformability
\end{code}

Other noteworthy matrix operators include \verb|^| (matrix power),
\texttt{**} (Kronecker product), and the ``concatenation'' operators,
\verb|~| (horizontal) and \texttt{|} (vertical). Readers are invited
to try them out by running the following code:
\begin{code}
matrix A = {2,1; 0,1}
matrix B = {1,1; 1,0}

matrix KP = A ** B
matrix PWR = A^3
matrix HC = A ~ B
matrix VC = A | B

print A B KP PWR HC VC
\end{code}
Note, in particular, that $A^3 = A \cdot A \cdot A$, which is
different from what you get by computing the cube of each element of
$A$ separately.

Hansl also supports matrix left- and right-``division'', via the
\verb'\' and \verb'/' operators, respectively. The expression
\verb|A\b| solves $Ax = b$ for the unknown $x$; $A$ is assumed to be
an $m \times n$ matrix with full column rank. If $A$ is square the
method is LU decomposition; if $m > n$, the QR decomposition is used
to find the least squares solution. In most cases this is numerically
more robust and more efficient than inverting $A$ explicitly.

Element-by-element operations are supported by the so-called ``dot''
operators, which are obtained by putting a dot (``\texttt{.}'')
before the corresponding operator.
For example, the code
\begin{code}
A = {1,2; 3,4}
B = {-1,0; 1,-1}
eval A * B
eval A .* B
\end{code}
produces
\begin{code}
   1   -2
   1   -4

  -1    0
   3   -4
\end{code}

It's easy to verify that the first operation performed is regular
matrix multiplication $A \cdot B$, whereas the second one is the
Hadamard (element-by-element) product $A \odot B$. In fact, dot
operators are more general and powerful than shown in the example
above; see the chapter on matrices in \GUG{} for details.

Dot and concatenation operators are less rigid than ordinary matrix
operations in terms of conformability requirements: in most cases
hansl will try to do ``the obvious thing''. For example, a common
idiom in hansl is \texttt{Y = X ./ w}, where $X$ is an $n \times k$
matrix and $w$ is an $n \times 1$ vector. The result $Y$ is an
$n \times k$ matrix in which each row of $X$ is divided by the
corresponding element of $w$. In proper matrix notation, this
operation would be written as
\[
  Y = \langle w \rangle^{-1} X,
\]
where $\langle \cdot \rangle$ indicates a diagonal matrix. Translating
the above expression literally would imply creating a diagonal matrix
out of $w$ and then inverting it, which is computationally much more
expensive than using the dot operation. A detailed discussion is
provided in \GUG.

Hansl provides a reasonably comprehensive set of matrix functions,
that is, functions that produce and/or operate on matrices. For a
full list, see the \GCR, but a basic ``survival kit'' is provided
in Table~\ref{tab:essential-matfuncs}. Moreover, most scalar
functions, such as \texttt{abs()}, \texttt{log()} etc., will operate
on a matrix element by element.
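To make the row-wise division idiom concrete, here is a minimal sketch
(the numbers are arbitrary):
\begin{code}
matrix X = {2, 4; 6, 9; 8, 12}  # 3 x 2
matrix w = {2; 3; 4}            # 3 x 1

# each row of X is divided by the corresponding element of w,
# yielding {1, 2; 2, 3; 2, 3}
matrix Y = X ./ w
print Y
\end{code}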
\begin{table}[htbp]
  \centering
  \small
  \begin{tabular}{rp{0.6\textwidth}}
    \textbf{Function(s)} & \textbf{Purpose} \\
    \hline
    \texttt{rows(X), cols(X)} & return the number of rows and columns
    of $X$, respectively \\
    \texttt{zeros(r,c), ones(r,c)} & produce matrices with $r$ rows
    and $c$ columns, filled with zeros and ones, respectively \\
    \texttt{mshape(X,r,c)} & rearrange the elements of $X$ into a
    matrix with $r$ rows and $c$ columns \\
    \texttt{I(n)} & identity matrix of size $n$ \\
    \texttt{seq(a,b)} & generate a row vector containing the integers
    from $a$ to $b$ \\
    \texttt{inv(A)} & invert, if possible, the matrix $A$ \\
    \texttt{maxc(A), minc(A), meanc(A)} & return a row vector
    with the max, min and mean of each column of $A$, respectively \\
    \texttt{maxr(A), minr(A), meanr(A)} & return a column vector
    with the max, min and mean of each row of $A$, respectively \\
    \texttt{mnormal(r,c), muniform(r,c)} & generate $r \times c$
    matrices filled with standard Gaussian and uniform pseudo-random
    numbers, respectively \\
    \hline
  \end{tabular}
  \caption{Essential set of hansl matrix functions}
  \label{tab:essential-matfuncs}
\end{table}

The following piece of code is meant to provide a concise example of
all the features mentioned above.
\begin{code}
# example: OLS using matrices

# fix the sample size
scalar T = 256

# construct vector of coefficients by direct imputation
matrix beta = {1.5, 2.5, -0.5}      # note: row vector

# construct the matrix of independent variables
matrix Z = mnormal(T, cols(beta))   # built-in functions

# now construct the dependent variable: note the
# usage of the "dot" and transpose operators

matrix y = {1.2} .+ Z*beta' + mnormal(T, 1)

# now do estimation
matrix X = 1 ~ Z                    # concatenation operator
matrix beta_hat1 = inv(X'X) * (X'y) # OLS by hand
matrix beta_hat2 = mols(y, X)       # via the built-in function
matrix beta_hat3 = X\y              # via matrix division

print beta_hat1 beta_hat2 beta_hat3
\end{code}

\section{Matrix pointers}
\label{sec:mat-pointers}

Hansl uses the ``by value'' convention for passing parameters to
functions. That is, when a variable is passed to a function as an
argument, what the function actually gets is a \emph{copy} of the
variable, which means that the value of the variable at the caller
level is not modified by anything that goes on inside the function.
The use of pointers, however, allows a function and its caller to
cooperate so that a variable at the outer level can be modified by
the function.

This mechanism is used by some built-in matrix functions to provide
more than one ``return'' value. The primary result is always provided
by the return value proper, but certain auxiliary values may be
retrieved via ``pointerized'' arguments; this usage is flagged by
prepending the ampersand symbol, ``\texttt{\&}'', to the name of the
argument variable.

The \texttt{eigensym} function, which performs the eigen-analysis of
symmetric matrices, is a case in point. In the example below the first
argument $A$ represents the input data, that is, the matrix
whose analysis is required.
This variable will not be modified in any
way by the function call. The primary result is the vector of
eigenvalues of $A$, which is here assigned to the variable
\texttt{ev}. The (optional) second argument, \texttt{\&V} (which may
be read as ``the address of \texttt{V}''), is used to retrieve the
right eigenvectors of $A$. A variable named in this way must be
already declared, but it need not be of the right dimensions to
receive the result; it will be resized as needed.
\begin{code}
matrix A = {1,2 ; 2,5}
matrix V
matrix ev = eigensym(A, &V)
print A ev V
\end{code}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "hansl-primer"
%%% End: