1.. _spm-dicom:
2
3======================
4 SPM DICOM conversion
5======================
6
7These are some notes on the algorithms that SPM_ uses to convert from
8DICOM_ to nifti_.  There are other notes in :ref:`dicom-mosaic`.
9
10The relevant SPM files are ``spm_dicom_headers.m``,
11``spm_dicom_dict.mat`` and ``spm_dicom_convert.m``.  These notes refer
12the version in SPM8, as of around January 2010.
13
14``spm_dicom_dict.mat``
15======================
16
17This is obviously a Matlab ``.mat`` file.  It contains variables
18``group`` and ``element``, and ``values``, where ``values`` is a struct
19array, one element per (group, element) pair, with fields ``name`` and
20``vr`` (the last a cell array).
21
22
23``spm_dicom_headers.m``
24=======================
25
26Reads the given DICOM files into a struct.  It looks like this was
27written by John Ahsburner (JA).  Relevant fixes are:
28
29File opening
30------------
31
32When opening the DICOM file, SPM (subfunction ``readdicomfile``)
33
34#. opens as little endian
35#. reads 4 characters starting at pos 128
36#. checks if these are ``DICM``; if so then continues file read;
37   otherwise, tests to see if this is what SPM calls *truncated DICOM
38   file format* - lacking 128 byte lead in and ``DICM`` string:
39
40   #. Seeks to beginning of file
41   #. Reads two unsigned short values into ``group`` and ``tag``
42   #. If the (``group``, ``element``) pair exist in
43      ``spm_dicom_dict.mat``, then set file pointer to 0 and continue
44      read with ``read_dicom`` subfunction..
45   #. If ``group`` == 8 and ``element`` == 0, this is apparently the
46      signature for a 'GE Twin+excite' for which JA notes there is no
47      documentation; set file pointer to 0 and continue read with
48      ``read_dicom`` subfunction.
49   #. Otherwise - crash out with error saying that this is not DICOM file.
50
51tag read for Philips Integra
52----------------------------
53
54The ``read_dicom`` subfunction reads a tag, then has a loop during which
55the tag is processed (by setting values into the return structure).  At
56the end of the loop, it reads the next tag.  The loop breaks when the
57current tag is empty, or is the item delimitation tag (group=FFFE,
58element=E00D).
59
60After it has broken out of the loop, if the last tag was (FFFE, E00D)
61(item delimitation tag), and the tag length was not 0, then SPM sets the
62file pointer back by 4 bytes from the current position.  JA comments
63that he didn't find that in the standard, but that it seemed to be
64needed for the Philips Integra.
65
66Tag length
67----------
68
69Tag lengths as read in ``read_tag`` subfunction.  If current format is
70explicit (as in 'explicit little endian'):
71
72#. For VR of \x00\x00, then group, element must be (FFFE, E00D) (item
73   delimitation tag). JA comments that GE 'ImageDelimitationItem' has
74   no VR, just 4 0 bytes.  In this case the tag length is zero, and we
75   read another two bytes ahead.
76
77There's a check for not-even tag length.  If not even:
78
79#. 4294967295 appears to be OK - and decoded as Inf for tag length.
80#. 13 appears to mean 10 and is reset to be 10
81#. Any other odd number is not valid and gives a tag length of 0
82
83``SQ`` VR type (Sequnce of items type)
84--------------------------------------
85
86tag length of 13 set to tag length 10.
87
88
89``spm_dicom_convert.m``
90=======================
91
92Written by John Ashburner and Jesper Andersson.
93
94File categorization
95-------------------
96
97SPM makes a special case of Siemens 'spectroscopy images'.  These are
98images that have 'SOPClassUID' == '1.3.12.2.1107.5.9.1' and the private
99tag of (29, 1210); for these it pulls out the affine, and writes a
100volume of ones corresponding to the acquisition planes.
101
102For images that are not spectroscopy:
103
104* Discards images that do not have any of ('MR', 'PT', 'CT') in 'Modality' field.
105* Discards images lacking any of 'StartOfPixelData', 'SamplesperPixel',
106  'Rows', 'Columns', 'BitsAllocated', 'BitsStored', 'HighBit',
107  'PixelRespresentation'
108* Discards images lacking any of 'PixelSpacing', 'ImagePositionPatient',
109  'ImageOrientationPatient' - presumably on the basis that SPM cannot
110  reconstruct the affine.
111* Fields 'SeriesNumber', 'AcquisitionNumber' and 'InstanceNumber' are
112  set to 1 if absent.
113
114Next SPM distinguishes between :ref:`dicom-mosaic` and standard DICOM.
115
116Mosaic images are those with the Siemens private tag::
117
118  (0029, 1009) [CSA Image Header Version]          OB: '20100114'
119
120and a readable CSA header (see :ref:`dicom-mosaic`), and with
121non-empty fields from that header of 'AcquisitionMatrixText',
122'NumberOfImagesInMosaic', and with non-zero 'NumberOfImagesInMosaic'.  The
123rest are standard DICOM.
124
125For converting mosaic format, see :ref:`dicom-mosaic`.  The rest of this
126page refers to standard (slice by slice) DICOMs.
127
128.. _spm-volume-sorting:
129
130Sorting files into volumes
131--------------------------
132
133First pass
134~~~~~~~~~~
135
136Take first header, put as start of first volume.   For each subsequent header:
137
138#. Get ``ICE_Dims`` if present.  Look for Siemens 'CSAImageHeaderInfo',
139   check it has a 'name' field, then pull dimensions out of 'ICE_Dims'
140   field in form of 9 integers separated by '_', where 'X' in this
141   string replaced by '-1' - giving 'ICE1'
142
143Then, for each currently identified volume:
144
145#. If we have ICE1 above, and we do have 'CSAIMageHeaderInfo', with a
146   'name', in the first header in this volume, then extract ICE dims in
147   the same way as above, for the first header in this volume, and check
148   whether all but ICE1[6:8] are the same as ICE2.  Set flag that all
149   ICE dims are identical for this volume.  Set this flag to True if we
150   did not have ICE1 or CSA information.
151#. Match the current header to the current volume iff the following match:
152
153   #. SeriesNumber
154   #. Rows
155   #. Columns
156   #. ImageOrientationPatient (to tolerance of sum squared difference 1e-4)
157   #. PixelSpacing (to tolerance of sum squared difference 1e-4)
158   #. ICE dims as defined above
159   #. ImageType (iff imagetype exists in both)
160   #. SequenceName (iff sequencename exists in both)
161   #. SeriesInstanceUID (iff exists in both)
162   #. EchoNumbers (iff exists in both)
163
164#. If the current header matches the current volume, insert it there,
165   otherwise make a new volume for this header
166
167.. _spm-second-pass:
168
169Second pass
170~~~~~~~~~~~
171
172We now have a list of volumes, where each volume is a list of headers
173that may match.
174
175For each volume:
176
177#. Estimate the z direction cosine by (effectively) finding the cross
178   product of the x and y direction cosines contained in
179   'ImageOrientationPatient' - call this ``z_dir_cos``
180#. For each header in this volume, get the z coordinate by taking the
181   dot product of the 'ImagePositionPatient' vector and ``z_dir_cos``
182   (see :ref:`dicom-z-from-slice`).
183#. Sort the headers according to this estimated z coordinate.
184#. If this volume is more than one slice, and there are any slices with
185   the same z coordinate (as defined above), run the
186   :ref:`dicom-img-resort` on this volume - on the basis that it may
187   have caught more than one volume-worth of slices.  Return one or more
188   volume's worth of lists.
189
190Final check
191~~~~~~~~~~~
192
193For each volume, recalculate z coordinate as above.  Calculate the z
194gaps.  Subtract the mean of the z gaps from all z gaps.  If the average of the
195(gap-mean(gap)) is greater than 1e-4, then print a warning that there
196are missing DICOM files.
197
198.. _dicom-img-resort:
199
200Possible volume resort
201~~~~~~~~~~~~~~~~~~~~~~
202
203This step happens if there were volumes with slices having the same z
204coordinate in the :ref:`spm-second-pass` step above.  The resort is on the
205set of DICOM headers that were in the volume, for which there were
206slices with identical z coordinates.  We'll call the list of headers
207that the routine is still working on - ``work_list``.
208
209#. If there is no 'InstanceNumber' field for the first header in
210   ``work_list``, bail out.
211#. Print a message about the 'AcquisitionNumber' not changing from
212   volume to volume.  This may be a relic from previous code, because
213   this version of SPM does not use the 'AcquisitionNumber' field except
214   for making filenames.
215#. Calculate the z coordinate as for :ref:`spm-second-pass`, for each
216   DICOM header.
217#. Sort the headers by 'InstanceNumber'
218#. If any headers have the same 'InstanceNumber', then discard all but
219   the first header with the same number.  At this point the remaining
220   headers in ``work_list`` will have different 'InstanceNumber's, but
221   may have the same z coordinate.
222#. Now sort by z coordinate
223#. If there are ``N`` headers, make a ``N`` length vector of flags
224   ``is_processed``, for which all values == False
225#. Make an output list of header lists, call it ``hdr_vol_out``, set to empty.
226#. While there are still any False elements in ``is_processed``:
227
228   #. Find first header for which corresponding ``is_processed`` is
229      False - call this ``hdr_to_check``
230   #. Collect indices (in ``work_list``) of headers which have the same
231      z coordinate as ``hdr_to_check``, call this list
232      ``z_same_indices``.
233   #. Sort ``work_list[z_same_indices]`` by 'InstanceNumber'
234   #. For each index in ``z_same_indices`` such that ``i`` indexes the
235      indices, and ``zsind`` is ``z_same_indices[i]``: append header
236      corresponding to ``zsind`` to ``hdr_vol_out[i]``.  This assumes
237      that the original ``work_list`` contained two or more volumes,
238      each with an identical set of z coordinates.
239   #. Set corresponding ``is_processed`` flag to True for all ``z_same_indices``.
240
241#. Finally, if the headers in ``work_list`` have 'InstanceNumber's that
242   cannot be sorted to a sequence ascending in units of 1, or if any
243   of the lists in ``hdr_vol_out`` have different lengths, emit a
244   warning about missing DICOM files.
245
246Writing DICOM volumes
247---------------------
248
249This means - writing DICOM volumes from standard (slice by slice) DICOM
250datasets rather than :ref:`dicom-mosaic`.
251
252Making the affine
253~~~~~~~~~~~~~~~~~
254
255We need the (4,4) affine $A$ going from voxel (array) coordinates in the
256DICOM pixel data, to mm coordinates in the :ref:`dicom-pcs`.
257
258This section tries to explain how SPM achieves this, but I don't
259completely understand their method.  See :ref:`dicom-3d-affines` for
260what I believe to be a simpler explanation.
261
262First define the constants, matrices and vectors as in
263:ref:`dicom-affine-defs`.
264
265$N$ is the number of slices in the volume.
266
267Then define the following matrices:
268
269.. math::
270
271   R = \left(\begin{smallmatrix}1 & a & 1 & 0\\1 & b & 0 & 1\\1 & c & 0 & 0\\1 & d & 0 & 0\end{smallmatrix}\right)
272
273   L = \left(\begin{smallmatrix}T^{1}_{{1}} & e & F_{{11}} \Delta{r} & F_{{12}} \Delta{c}\\T^{1}_{{2}} & f & F_{{21}} \Delta{r} & F_{{22}} \Delta{c}\\T^{1}_{{3}} & g & F_{{31}} \Delta{r} & F_{{32}} \Delta{c}\\1 & h & 0 & 0\end{smallmatrix}\right)
274
275For a volume with more than one slice (header), then $a=1; b=1, c=N, d=1$. $e, f, g$ are the values from $T^N$,
276and $h == 1$.
277
278For a volume with only one slice (header) $a=0, b=0, c=1, d=0$ and $e,
279f, g, h$ are $n_1 \Delta{s}, n_2 \Delta{s}, n_3 \Delta{s}, 0$.
280
281The full transform appears to be $A_{spm} = R L^{-1}$.
282
283Now, SPM, don't forget, is working in terms of Matlab array indexing,
284which starts at (1,1,1) for a three dimensional array, whereas DICOM
285expects a (0,0,0) start (see :ref:`dicom-slice-affine`).  In this
286particular part of the SPM DICOM code, somewhat confusingly, the (0,0,0)
287to (1,1,1) indexing is dealt with in the $A$ transform, rather than the
288``analyze_to_dicom`` transformation used by SPM in other places. So, the
289transform $A_{spm}$ goes from (1,1,1) based voxel indices to mm.  To
290get the (0, 0, 0)-based transform we want, we need to pre-apply the
291transform to take 0-based voxel indices to 1-based voxel indices:
292
293.. math::
294
295   A = R L^{-1} \left(\begin{smallmatrix}1 & 0 & 0 & 1\\0 & 1 & 0 & 1\\0 & 0 & 1 & 1\\0 & 0 & 0 & 1\end{smallmatrix}\right)
296
297This formula with the definitions above result in the single and multi
298slice formulae in :ref:`dicom-3d-affine-formulae`.
299
300See :download:`derivations/spm_dicom_orient.py` for the derivations and
301some explanations.
302
303Writing the voxel data
304~~~~~~~~~~~~~~~~~~~~~~
305
306Just apply scaling and offset from 'RescaleSlope' and 'RescaleIntercept'
307for each slice and write volume.
308
309.. include:: ../links_names.txt
310