1%Copyright (C) 2007-2014  Brian Langenberger
2%This work is licensed under the
3%Creative Commons Attribution-Share Alike 3.0 United States License.
4%To view a copy of this license, visit
5%http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to
6%Creative Commons,
7%171 Second Street, Suite 300,
8%San Francisco, California, 94105, USA.
9
10\chapter{MP3}
MP3 is the de facto standard for lossy audio.
12It is little more than a series of MPEG frames with an
13optional ID3v2 metadata header and optional ID3v1 metadata
14footer.
15
16MP3 decoders are assumed to be very tolerant of anything in
17the stream that doesn't look like an MPEG frame, ignoring such
18junk until the next frame is found.
Since MP3 files have no standard container format in which
non-MPEG data can be placed, metadata such as ID3 tags are often
made `sync-safe' by formatting them so that decoders won't
mistake them for MPEG frames.
\section{The MP3 File Stream}
24\begin{figure}[h]
25\includegraphics{figures/mp3/stream.pdf}
26\end{figure}
27\begin{table}[h]
28\begin{tabular}{|c||l||l||r|r|r||l|}
29\hline
& & & \multicolumn{3}{c||}{Sample Rate (Hz)} & \\
bits & MPEG ID & Layer & MPEG-1 & MPEG-2 & MPEG-2.5 & Channels \\
32\hline
33\texttt{00} & MPEG-2.5 & reserved & 44100 & 22050 & 11025 & Stereo \\
34\texttt{01} & reserved & Layer III & 48000 & 24000 & 12000 & Joint stereo \\
35\texttt{10} & MPEG-2 & Layer II & 32000 & 16000 & 8000 & Dual channel stereo\\
36\texttt{11} & MPEG-1 & Layer I & reserved & reserved & reserved & Mono \\
37\hline
38\end{tabular}
39\end{table}
40\par
41\noindent
Layer I frames always contain 384 samples.
Layer II frames always contain 1152 samples.
Layer III frames contain 1152 samples in MPEG-1 streams
and 576 samples in MPEG-2 and MPEG-2.5 streams.
If the \VAR{Protection} bit is 0, the frame header is followed by a
16 bit CRC.
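\par
\noindent
As a rough illustration of how the table above is used, the following
Python sketch unpacks a 4 byte frame header.
The bit positions (11 sync bits, then MPEG ID, layer, protection, bitrate,
sample rate, pad, one private bit and channels) are assumed from the
standard MPEG audio header layout, and the function name
\texttt{parse\_frame\_header} and its dictionary keys are hypothetical.

```python
# Lookup tables taken from the header table above; reserved values are omitted.
MPEG_IDS = {0b00: "MPEG-2.5", 0b10: "MPEG-2", 0b11: "MPEG-1"}
LAYERS = {0b01: 3, 0b10: 2, 0b11: 1}
SAMPLE_RATES = {
    "MPEG-1": (44100, 48000, 32000),
    "MPEG-2": (22050, 24000, 16000),
    "MPEG-2.5": (11025, 12000, 8000),
}
CHANNELS = ("Stereo", "Joint stereo", "Dual channel stereo", "Mono")


def parse_frame_header(header: bytes) -> dict:
    """Decode the first 4 bytes of an MPEG frame (assumed bit layout)."""
    word = int.from_bytes(header[:4], "big")
    if (word >> 21) != 0x7FF:  # the top 11 bits must all be set
        raise ValueError("no frame sync")
    mpeg_id = (word >> 19) & 0b11
    layer = (word >> 17) & 0b11
    rate_bits = (word >> 10) & 0b11
    if mpeg_id not in MPEG_IDS or layer not in LAYERS or rate_bits == 0b11:
        raise ValueError("reserved value in header")
    version = MPEG_IDS[mpeg_id]
    return {
        "mpeg_id": version,
        "layer": LAYERS[layer],
        "has_crc": ((word >> 16) & 1) == 0,  # Protection bit 0 means a CRC follows
        "bitrate_bits": (word >> 12) & 0b1111,  # index into the bitrate table
        "sample_rate": SAMPLE_RATES[version][rate_bits],
        "pad": (word >> 9) & 1,
        "channels": CHANNELS[(word >> 6) & 0b11],
    }
```

The bitrate is returned as its raw 4 bit index so that the caller can
look it up in the bitrate table for the frame's MPEG ID and layer.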
46
47\pagebreak
48
49\begin{table}[h]
50{\relsize{-2}
51\begin{tabular}{|c||r|r|r|r|r|}
52\hline
53& MPEG-1 & MPEG-1 & MPEG-1 & MPEG-2 & MPEG-2 \\
54bits & Layer-1 & Layer-2 & Layer-3 & Layer-1 & Layer-2/3 \\
55\hline
56\texttt{0000} & free & free & free & free & free \\
57\texttt{0001} & 32 & 32 & 32 & 32 & 8 \\
58\texttt{0010} & 64 & 48 & 40 & 48 & 16 \\
59\texttt{0011} & 96 & 56 & 48 & 56 & 24 \\
60\texttt{0100} & 128 & 64 & 56 & 64 & 32 \\
61\texttt{0101} & 160 & 80 & 64 & 80 & 40 \\
62\texttt{0110} & 192 & 96 & 80 & 96 & 48 \\
63\texttt{0111} & 224 & 112 & 96 & 112 & 56 \\
64\texttt{1000} & 256 & 128 & 112 & 128 & 64 \\
65\texttt{1001} & 288 & 160 & 128 & 144 & 80 \\
66\texttt{1010} & 320 & 192 & 160 & 160 & 96 \\
67\texttt{1011} & 352 & 224 & 192 & 176 & 112 \\
68\texttt{1100} & 384 & 256 & 224 & 192 & 128 \\
69\texttt{1101} & 416 & 320 & 256 & 224 & 144 \\
70\texttt{1110} & 448 & 384 & 320 & 256 & 160 \\
71\texttt{1111} & bad & bad & bad & bad & bad \\
72\hline
73\end{tabular}
74}
75\caption{Bitrate in 1000 bits per second}
76\end{table}
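To find the total size of an MPEG frame in bytes, use one of the following
formulas, in which the bitrate is given in bits per second,
the pad value is the header's pad bit (0 or 1)
and each division is rounded down:
\begin{align}
\intertext{Layer I:}
\text{Byte Length} &= \left ( \left \lfloor \frac{12 \times \text{Bitrate}}{\text{Sample Rate}} \right \rfloor + \text{Pad} \right ) \times 4 \\
\intertext{Layer II/III:}
\text{Byte Length} &= \left \lfloor \frac{144 \times \text{Bitrate}}{\text{Sample Rate}} \right \rfloor + \text{Pad}
\end{align}
For MPEG-2 and MPEG-2.5 Layer III frames, which hold 576 samples
rather than 1152, the factor 144 becomes 72.
For example, an MPEG-1 Layer III frame with a sampling rate of 44100Hz,
a bitrate of 128kbps and a set pad bit is 418 bytes long, including the header.
\begin{equation}
\left \lfloor \frac{144 \times 128000}{44100} \right \rfloor + 1 = 417 + 1 = 418
\end{equation}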
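\par
\noindent
The frame size formulas can be sketched in Python as follows.
The function name \texttt{frame\_byte\_length} and its \texttt{mpeg1}
parameter are hypothetical.
The factor of 72 used for MPEG-2 and MPEG-2.5 Layer III frames is an
assumption based on those frames holding 576 samples instead of 1152.

```python
def frame_byte_length(layer: int, bitrate: int, sample_rate: int,
                      pad: int, mpeg1: bool = True) -> int:
    """Return the size of one MPEG frame in bytes, header included.

    bitrate is in bits per second, sample_rate in Hz, pad is 0 or 1.
    """
    if layer == 1:
        # Layer I frames are measured in 4 byte slots.
        return (12 * bitrate // sample_rate + pad) * 4
    # Assumption: MPEG-2/2.5 Layer III frames hold 576 samples, so use 72.
    factor = 144 if (layer == 2 or mpeg1) else 72
    return factor * bitrate // sample_rate + pad
```

Calling \texttt{frame\_byte\_length(3, 128000, 44100, 1)} reproduces
the 418 byte example above.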
90
91\clearpage
92
\subsection{The Xing Header}
94
An MP3 frame header contains the track's sampling rate,
bitrate and number of channels.
97However, because MP3 files are little more than
98concatenated MPEG frames, there is no obvious place to
99store the track's total length.
100Since the length of each frame is a constant number of samples,
101one can calculate the track length by counting the number of frames.
102This method is the most accurate but is also quite slow.
103
For MP3 files in which all frames have the same bitrate
(also known as constant bitrate, or CBR, files),
one can divide the total size of the file in bits
(minus any ID3 headers/footers)
by the bitrate in bits per second to determine its length in seconds.
If an MP3 file has no Xing header in its first frame,
one can assume it is CBR.
110
111An MP3 file that does contain a Xing header in its first frame
112can be assumed to be variable bitrate, or VBR.
In that case, the bitrate of the first frame cannot be used as a
basis to calculate the length of the entire file.
Instead, one must use the information from the Xing header,
which contains that length.
117
118All of the fields within a Xing header are big-endian.
119\begin{figure}[h]
120\includegraphics{figures/mp3/xing.pdf}
121\end{figure}
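\par
\noindent
The two length calculations can be sketched in Python as follows.
The function names are hypothetical, and the VBR variant assumes the
Xing header supplies the total number of frames in the file.

```python
def cbr_seconds(audio_bytes: int, bitrate: int) -> float:
    """Length of a CBR file: audio size in bits over bits per second.

    audio_bytes must exclude any ID3v2 header and ID3v1 footer.
    """
    return audio_bytes * 8 / bitrate


def vbr_seconds(total_frames: int, samples_per_frame: int,
                sample_rate: int) -> float:
    """Length of a VBR file from a frame count (assumed to come
    from the Xing header) and the constant samples-per-frame value."""
    return total_frames * samples_per_frame / sample_rate
```

For instance, 4000000 bytes of audio at 128000 bits per second
is 250 seconds long.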
122
123\clearpage
124
125\section{ID3v1 Tags}
126ID3v1 tags are very simple metadata tags appended to an MP3 file.
127All of the fields are fixed length, padded with NULLs if necessary,
128and the text encoding is undefined.
There are two versions of ID3v1 tags.
ID3v1.1 stores a track number as a 1 byte value
at the end of the comment field, preceded by a NULL byte.
If the second-to-last byte of the comment field is not NULL (0x00),
assume we're dealing with a classic ID3v1 tag without a
track number.
135
136\subsection{ID3v1}
137
138\begin{figure}[h]
139\includegraphics{figures/mp3/id3v1.pdf}
140\end{figure}
141
142\subsection{ID3v1.1}
143
144\begin{figure}[h]
145\includegraphics{figures/mp3/id3v11.pdf}
146\end{figure}
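\par
\noindent
A minimal Python sketch of reading either tag version follows.
It assumes the usual 128 byte layout
(`TAG', three 30 byte text fields, a 4 byte year,
a 30 byte comment and a 1 byte genre)
and, since the text encoding is undefined, decodes text as Latin-1.
The name \texttt{parse\_id3v1} is hypothetical.

```python
import struct


def parse_id3v1(tag: bytes) -> dict:
    """Parse a 128 byte ID3v1 or ID3v1.1 tag (assumed field layout)."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")
    title, artist, album, year, comment, genre = struct.unpack(
        "3x30s30s30s4s30sB", tag)

    def text(raw: bytes) -> str:
        # Fields are NULL padded; Latin-1 is an assumption.
        return raw.split(b"\x00", 1)[0].decode("latin-1")

    track = None
    if comment[28] == 0:
        # ID3v1.1: a NULL byte followed by a 1 byte track number.
        track = comment[29]
        comment = comment[:28]
    return {"title": text(title), "artist": text(artist),
            "album": text(album), "year": text(year),
            "comment": text(comment), "track": track, "genre": genre}
```

A classic ID3v1 tag yields a \texttt{track} of \texttt{None}.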
147
148\clearpage
149
150\subsection{ID3v1.1 Tag Example}
151\begin{figure}[h]
152  \includegraphics{figures/mp3/id3v11-example.pdf}
153\end{figure}
154\begin{table}[h]
155  \begin{tabular}{rl}
156    track title : & \texttt{some track name} \\
157    artist name : & \texttt{artist's name} \\
158    album name : & \texttt{album title} \\
159    year : & \texttt{2012} \\
160    comment : & \texttt{a lengthy comment field} \\
161    track number : & \texttt{1} \\
162    genre : & \texttt{0} \\
163  \end{tabular}
164\end{table}
165
166\clearpage
167
168\section{ID3v2 Tags}
169
170The ID3v2 tag was invented to address the deficiencies in the original
171ID3v1 tag.
172ID3v2 comes in three similar but not entirely compatible variants:
173ID3v2.2, ID3v2.3 and ID3v2.4.
174All of its fields are big-endian.
175
176\begin{figure}[h]
177\includegraphics{figures/mp3/id3v2_stream.pdf}
178\end{figure}
179\par
180\noindent
181\VAR{major version} is 2 for ID3v2.2, 3 for ID3v2.3 and 4 for ID3v2.4.
182\VAR{minor version} is always 0.
183The four \VAR{size} fields are recombined as follows:
184\begin{equation*}
185  \text{ID3 Frames size} = (\text{size}_3 \times 2 ^ {21}) + (\text{size}_2 \times 2 ^ {14}) + (\text{size}_1 \times 2 ^ 7) + \text{size}_0
186\end{equation*}
Because the most significant bit of each \VAR{size} field is always 0,
no size value will appear to be an MP3 frame sync.
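\par
\noindent
The recombination can be sketched in Python as follows.
The names \texttt{decode\_syncsafe} and \texttt{encode\_syncsafe}
are hypothetical, and each size field is assumed to occupy one byte
whose top bit is 0.

```python
def decode_syncsafe(data: bytes) -> int:
    """Combine 7 bit size fields, most significant first."""
    value = 0
    for byte in data:
        if byte & 0x80:
            raise ValueError("top bit set; not a sync-safe value")
        value = (value << 7) | byte
    return value


def encode_syncsafe(value: int, length: int = 4) -> bytes:
    """Split a size into `length` bytes of 7 bits each."""
    if not 0 <= value < (1 << (7 * length)):
        raise ValueError("value does not fit")
    return bytes((value >> (7 * i)) & 0x7F for i in reversed(range(length)))
```

With four fields the largest representable size is $2^{28} - 1$ bytes.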
189
190\clearpage
191
192\subsection{ID3v2 Header Example}
193\begin{figure}[h]
194  \includegraphics{figures/mp3/id3v2_stream-example.pdf}
195\end{figure}
196\begin{table}[h]
197  \begin{tabular}{rl}
198    ID : & \texttt{"ID3"} \\
199    major version : & \texttt{3} \\
200    minor version : & \texttt{0} \\
201    flags : & \texttt{0} \\
202    $\text{size}_3$ : & \texttt{0} \\
203    $\text{size}_2$ : & \texttt{0} \\
204    $\text{size}_1$ : & \texttt{1} \\
205    $\text{size}_0$ : & \texttt{21} \\
206    ID3 Frames size : & $(0 \times 2 ^ {21}) + (0 \times 2 ^ {14}) + (1 \times 2 ^ 7) + 21 = 149$ bytes \\
207  \end{tabular}
208\end{table}
209\par
210\noindent
This indicates an ID3v2.3 tag with 149 bytes of ID3v2.3 frames.
212Since there is no `total number of frames' field,
213we must use the size field to determine when to stop reading
214additional ID3 frames.
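\par
\noindent
Reading the header can be sketched in Python as follows.
The name \texttt{parse\_id3v2\_header} is hypothetical, and the sketch
assumes a 10 byte header in which every field after the 3 byte ID
is a single byte.

```python
def parse_id3v2_header(header: bytes) -> tuple:
    """Return (major version, minor version, flags, frames size)."""
    if len(header) < 10 or header[:3] != b"ID3":
        raise ValueError("not an ID3v2 header")
    major, minor, flags = header[3], header[4], header[5]
    size = 0
    for byte in header[6:10]:
        # Each size field contributes its low 7 bits.
        size = (size << 7) | (byte & 0x7F)
    return major, minor, flags, size
```

A reader would then consume exactly \texttt{size} bytes of frame data
following the header.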
215
216\clearpage
217
218\subsection{ID3v2.2 Header}
219
220\begin{figure}[h]
221\includegraphics{figures/id3v22/header.pdf}
222\end{figure}
223\par
224\noindent
225The \VAR{unsync} and \VAR{compression} flags are normally unused.
226
227\subsection{ID3v2.2 Frames}
228\begin{figure}[h]
229\includegraphics{figures/id3v22/frames.pdf}
230\end{figure}
231\par
232\noindent
233\VAR{frame ID} is a 3 byte ASCII string.
234\VAR{frame size} is the length of the frame data,
235not including its 6 byte header.
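\par
\noindent
Walking a block of ID3v2.2 frames can be sketched in Python as follows.
The name \texttt{iter\_id3v22\_frames} is hypothetical.
The sketch assumes the 6 byte header is a 3 byte ID followed by a
3 byte big-endian size, and that a frame ID of all NULL bytes marks
the start of padding.

```python
def iter_id3v22_frames(frames: bytes):
    """Yield (frame ID, frame data) pairs from ID3v2.2 frame bytes."""
    offset = 0
    while offset + 6 <= len(frames):
        frame_id = frames[offset:offset + 3]
        if frame_id == b"\x00\x00\x00":
            break  # assumed to be padding after the last frame
        size = int.from_bytes(frames[offset + 3:offset + 6], "big")
        yield frame_id.decode("ascii"), frames[offset + 6:offset + 6 + size]
        offset += 6 + size
```

The \texttt{frames} argument is the region whose length is given by
the ID3v2 header's size field.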
236
237\clearpage
238
239\subsubsection{COM Frame}
240\begin{figure}[h]
241\includegraphics{figures/id3v22/com.pdf}
242\end{figure}
243\par
244\noindent
245This frame supports a lengthy string of comment text.
246\VAR{encoding} is 0 for Latin-1, 1 for UCS-2,
247indicating the text encoding of the \VAR{short description}
248and \VAR{comment text} fields.
249\VAR{language} is a 3 byte ASCII string.
250\VAR{short description} is a NULL-terminated string
251containing a short description of the comment.
252Note that for UCS-2 comments, the NULL terminator is 2 bytes.
253The remainder of the frame is the comment text itself.
254
255\subsubsection{COM Frame Example}
256\begin{figure}[h]
257  \includegraphics{figures/id3v22/com-example.pdf}
258\end{figure}
259\begin{table}[h]
260  \begin{tabular}{rl}
261    encoding : & \texttt{0} (Latin-1) \\
262    language : & \texttt{"eng"} \\
263    short description : & \texttt{""} (empty string) \\
264    comment text : & \texttt{"comment text"} \\
265  \end{tabular}
266\end{table}
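\par
\noindent
Splitting a COM frame's data into its fields can be sketched in Python
as follows.
The names \texttt{split\_terminated} and \texttt{parse\_com} are
hypothetical, and UCS-2 text is decoded with Python's \texttt{utf-16}
codec on the assumption that it begins with a byte order mark.

```python
def split_terminated(data: bytes, encoding: int) -> tuple:
    """Split data at its first NULL terminator (1 or 2 bytes wide)."""
    if encoding == 0:
        end = data.index(b"\x00")
        return data[:end], data[end + 1:]
    # UCS-2: the terminator is 2 NULL bytes on a 2 byte boundary.
    for i in range(0, len(data) - 1, 2):
        if data[i:i + 2] == b"\x00\x00":
            return data[:i], data[i + 2:]
    raise ValueError("missing NULL terminator")


def parse_com(data: bytes) -> tuple:
    """Return (encoding, language, short description, comment text)."""
    encoding = data[0]
    language = data[1:4].decode("ascii")
    codec = "latin-1" if encoding == 0 else "utf-16"  # BOM assumed for UCS-2
    description, text = split_terminated(data[4:], encoding)
    return encoding, language, description.decode(codec), text.decode(codec)
```

Searching for the 2 byte terminator only at even offsets avoids
matching a NULL pair that straddles two UCS-2 characters.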
267
268\clearpage
269
270\subsubsection{PIC Frame}
271\begin{figure}[h]
272\includegraphics{figures/id3v22/pic.pdf}
273\end{figure}
274\par
275\noindent
276This frame contains an embedded image file.
\VAR{image format} is a 3 byte string indicating the format of the image,
typically `JPG' for JPEG images or `PNG' for PNG images.
\VAR{description} is a Latin-1 or UCS-2 encoded NULL-terminated string,
depending on whether \VAR{encoding} is 0 or 1, respectively.
281\VAR{picture type} is one of the following:
282\begin{table}[h]
283{\relsize{-1}
284\begin{tabular}{|r|l||r|l|}
285\hline
286value & type & value & type \\
287\hline
2880 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
2892 & other file icon & 3 & cover (front) \\
2904 & cover (back) & 5 & leaflet page \\
2916 & media (e.g. label side of CD) & 7 & lead artist / lead performer / soloist \\
2928 & artist / performer & 9 & conductor \\
29310 & band / orchestra & 11 & composer \\
29412 & lyricist / text writer & 13 & recording location \\
29514 & during recording & 15 & during performance \\
29616 & movie / video screen capture & 17 & a bright colored fish \\
29718 & illustration & 19 & band / artist logotype \\
29820 & publisher / studio logotype & &  \\
299\hline
300\end{tabular}
301\caption{PIC image types}
302}
303\end{table}
304
305\clearpage
306
307\subsubsection{PIC Frame Example}
308\begin{figure}[h]
309  \includegraphics{figures/id3v22/pic-example.pdf}
310\end{figure}
311\begin{table}[h]
312  \begin{tabular}{rl}
313    encoding : & \texttt{0} (Latin-1) \\
314    image format : & \texttt{"PNG"} \\
315    image type : & \texttt{3} (front cover) \\
316    description : & \texttt{"Description"} \\
317\end{tabular}
318\end{table}
319
320\clearpage
321
322\subsubsection{Text Frames}
323\begin{figure}[h]
324\includegraphics{figures/id3v22/t__.pdf}
325\end{figure}
326\par
327\noindent
328Frames whose ID starts with `T' are for textual information.
329\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
330\par
331\noindent
332{\relsize{-1}
333\begin{tabular}{r|l}
334frame ID & meaning \\
335\hline
336\texttt{TAL} & album / movie / show title \\
337\texttt{TBP} & BPM (beats per minute) \\
338\texttt{TCM} & composer \\
339\texttt{TCO} & content type \\
340\texttt{TCR} & copyright message \\
341\texttt{TDA} & date \\
342\texttt{TDY} & playlist delay \\
343\texttt{TEN} & encoded by \\
344\texttt{TFT} & file type \\
345\texttt{TIM} & time \\
346\texttt{TKE} & initial key \\
347\texttt{TLA} & language(s) \\
348\texttt{TLE} & length \\
349\texttt{TMT} & media type \\
350\texttt{TOA} & original artist(s) / performer(s) \\
351\texttt{TOF} & original filename \\
\texttt{TOL} & original lyricist(s) / text writer(s) \\
353\texttt{TOR} & original release year \\
354\texttt{TOT} & original album / movie / show title \\
355\texttt{TP1} & lead artist(s) / performer(s) / soloist(s) / performing group \\
356\texttt{TP2} & band / orchestra / accompaniment \\
357\texttt{TP3} & conductor / performer refinement \\
358\texttt{TP4} & interpreted, remixed, or otherwise modified by \\
359\texttt{TPA} & album number / part of a set \\
360\texttt{TPB} & publisher \\
361\texttt{TRC} & ISRC (International Standard Recording Code) \\
362\texttt{TRD} & recording dates \\
363\texttt{TRK} & track number / position in set \\
364\texttt{TSI} & size \\
365\texttt{TSS} & software / hardware and settings used for encoding \\
366\texttt{TT1} & content group description \\
367\texttt{TT2} & title / songname / content description \\
368\texttt{TT3} & subtitle / description refinement \\
369\texttt{TXT} & lyricist / text writer \\
370\texttt{TYE} & year \\
371\end{tabular}
372}
373
374\clearpage
375
376The \texttt{TRK} and \texttt{TPA} numeric fields may be extended
377with the ``/'' character, indicating the field's total number.
378For example, a \texttt{TRK} of ``3/5'' means the track is number
3793 out of a total of 5.
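\par
\noindent
Splitting such a field can be sketched in Python as follows.
The name \texttt{parse\_number\_pair} is hypothetical.

```python
def parse_number_pair(text: str) -> tuple:
    """Split a TRK or TPA value into (number, total or None)."""
    number, _, total = text.partition("/")
    return int(number), (int(total) if total else None)
```

A field without a ``/'' yields a total of \texttt{None}.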
380
381\subsubsection{Text Frame Example}
382\begin{figure}[h]
383  \includegraphics{figures/id3v22/t__-example.pdf}
384\end{figure}
385\begin{table}[h]
386\begin{tabular}{rl}
387frame ID : & \texttt{"TT2"} (title / song name / content description) \\
388encoding : & \texttt{0} (Latin-1) \\
389text : & \texttt{"some track name"}
390\end{tabular}
391\end{table}
392
393\subsubsection{User-Defined Text Frame}
394\begin{figure}[h]
395\includegraphics{figures/id3v22/txx.pdf}
396\end{figure}
397\par
398\noindent
399This frame is for user-defined text information.
400\VAR{description} is a NULL-terminated string indicating
401what the frame is for.
402\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
403
404\clearpage
405
406\subsubsection{URL Frames}
407\begin{figure}[h]
408  \includegraphics{figures/id3v22/w__.pdf}
409\end{figure}
410\par
411\noindent
412Frames whose ID begins with `W' contain URL links to external resources.
413\par
414\begin{table}[h]
415  \begin{tabular}{|r|l|}
416    \hline
417    Frame ID & Meaning \\
418    \hline
419    \texttt{WAF} & official audio file webpage \\
420    \texttt{WAR} & official artist / performer webpage \\
421    \texttt{WAS} & official audio source webpage \\
422    \texttt{WCM} & commercial information \\
423    \texttt{WCP} & copyright / legal information \\
    \texttt{WPB} & publisher's official webpage \\
425    \hline
426  \end{tabular}
427\end{table}
428\begin{figure}[h]
429  \includegraphics{figures/id3v22/wxx.pdf}
430\end{figure}
431\par
432\noindent
433This is a frame for user-defined links to external resources.
434\VAR{description} is a NULL-terminated string indicating
435what the frame is for.
436\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
437
438\clearpage
439
440\subsubsection{ULT Frame}
441\begin{figure}[h]
442\includegraphics{figures/id3v22/ult.pdf}
443\end{figure}
444\par
445\noindent
446This frame is for unsynchronized (non-karaoke) lyrics text,
447similar to a comment frame.
448\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
449\VAR{language} is a 3 byte ASCII string.
450\VAR{short description} is a NULL-terminated string
451containing a short description of the lyrics.
452Note that for UCS-2 lyrics, the NULL terminator is 2 bytes.
453The remainder of the frame is the lyrics text itself.
454
455\clearpage
456
457\subsubsection{GEO Frame}
458\begin{figure}[h]
459  \includegraphics{figures/id3v22/geo.pdf}
460\end{figure}
461\par
462\noindent
463This frame contains an embedded file (i.e. a ``general encapsulated object'').
464\VAR{MIME Type} and \VAR{filename} are encoded as NULL-terminated
465Latin-1 strings.
466\VAR{description} is encoded as a Latin-1 or UCS-2 NULL-terminated
467string, depending on if \VAR{encoding} is 0 or 1, respectively.
468The remainder of the frame is the file's binary data.
469
470\clearpage
471
472\subsection{ID3v2.3 Header}
473\begin{figure}[h]
474\includegraphics{figures/id3v23/header.pdf}
475\end{figure}
476\par
477\noindent
478The \VAR{unsync}, \VAR{extended}, \VAR{experimental} and \VAR{footer}
479flags are normally unused.
480
481\subsection{ID3v2.3 Frames}
482\begin{figure}[h]
483  \includegraphics{figures/id3v23/frames.pdf}
484\end{figure}
485\par
486\noindent
487\VAR{frame ID} is a 4 byte ASCII string.
488\VAR{frame size} is the length of the frame data,
489not including its 10 byte header.
490The \VAR{tag alter}, \VAR{file alter}, \VAR{read only},
491\VAR{compression}, \VAR{encryption} and \VAR{grouping} flags
492are normally unused.
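\par
\noindent
Walking a block of frames with 10 byte headers can be sketched in
Python as follows.
The name \texttt{iter\_frames} is hypothetical.
The sketch assumes the header is a 4 byte ID, a 4 byte big-endian size
and 2 bytes of flags, and that a frame ID of all NULL bytes marks
the start of padding.
The optional \texttt{syncsafe} argument is an addition for ID3v2.4,
whose frame sizes are sync-safe.

```python
def iter_frames(frames: bytes, syncsafe: bool = False):
    """Yield (frame ID, flags, frame data) from ID3v2.3/2.4 frame bytes."""
    offset = 0
    while offset + 10 <= len(frames):
        frame_id = frames[offset:offset + 4]
        if frame_id == b"\x00\x00\x00\x00":
            break  # assumed to be padding after the last frame
        raw = frames[offset + 4:offset + 8]
        if syncsafe:
            # ID3v2.4: 7 significant bits per size byte
            size = (raw[0] << 21) | (raw[1] << 14) | (raw[2] << 7) | raw[3]
        else:
            # ID3v2.3: plain 32 bit big-endian size
            size = int.from_bytes(raw, "big")
        flags = int.from_bytes(frames[offset + 8:offset + 10], "big")
        yield frame_id.decode("ascii"), flags, frames[offset + 10:offset + 10 + size]
        offset += 10 + size
```

The two size encodings agree for frames shorter than 128 bytes,
so a wrong choice only shows up on larger frames.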
493
494\clearpage
495
496\subsubsection{APIC Frame}
497\begin{figure}[h]
498\includegraphics{figures/id3v23/apic.pdf}
499\end{figure}
500\par
501\noindent
502This frame contains an embedded image file.
503\VAR{MIME type} is a NULL-terminated MIME type such as ``image/jpeg''
504or ``image/png''.
\VAR{description} is a Latin-1 or UCS-2 encoded NULL-terminated string,
depending on whether \VAR{encoding} is 0 or 1, respectively.
\VAR{picture type} is one of the following:
508\begin{table}[h]
509{\relsize{-1}
510\begin{tabular}{|r|l||r|l|}
511\hline
512value & type & value & type \\
513\hline
5140 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
5152 & other file icon & 3 & cover (front) \\
5164 & cover (back) & 5 & leaflet page \\
5176 & media (e.g. label side of CD) & 7 & lead artist / lead performer / soloist \\
5188 & artist / performer & 9 & conductor \\
51910 & band / orchestra & 11 & composer \\
52012 & lyricist / text writer & 13 & recording location \\
52114 & during recording & 15 & during performance \\
52216 & movie / video screen capture & 17 & a bright colored fish \\
52318 & illustration & 19 & band / artist logotype \\
52420 & publisher / studio logotype & &  \\
525\hline
526\end{tabular}
527\caption{APIC image types}
528}
529\end{table}
530
531\clearpage
532
533\subsubsection{APIC Frame Example}
534\begin{figure}[h]
535  \includegraphics{figures/id3v23/apic-example.pdf}
536\end{figure}
537\begin{table}[h]
538\begin{tabular}{rl}
539text encoding : & \texttt{0} (Latin-1) \\
540MIME type : & \texttt{"image/png"} \\
541picture type : & \texttt{3} (front cover) \\
542description : & \texttt{"Description"} \\
543\end{tabular}
544\end{table}
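\par
\noindent
Splitting an ID3v2.3 APIC frame's data can be sketched in Python
as follows.
The name \texttt{parse\_apic} is hypothetical.
The sketch assumes the field order encoding, MIME type, picture type,
description and image data, and decodes UCS-2 text with Python's
\texttt{utf-16} codec on the assumption that it begins with a
byte order mark.

```python
def parse_apic(data: bytes) -> tuple:
    """Return (MIME type, picture type, description, image bytes)."""
    encoding = data[0]
    mime_end = data.index(b"\x00", 1)  # MIME type is always Latin-1
    mime_type = data[1:mime_end].decode("latin-1")
    picture_type = data[mime_end + 1]
    rest = data[mime_end + 2:]
    if encoding == 0:
        end = rest.index(b"\x00")
        description, image = rest[:end].decode("latin-1"), rest[end + 1:]
    else:
        # UCS-2: 2 byte NULL terminator on a 2 byte boundary
        end = next(i for i in range(0, len(rest) - 1, 2)
                   if rest[i:i + 2] == b"\x00\x00")
        description, image = rest[:end].decode("utf-16"), rest[end + 2:]
    return mime_type, picture_type, description, image
```

Everything after the description's terminator is the image file itself
and can be written out unchanged.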
545
546\clearpage
547
548\subsubsection{Text Frames}
549\begin{figure}[h]
550  \includegraphics{figures/id3v23/t___.pdf}
551\end{figure}
552\par
553\noindent
554Frames beginning with `T' are text frames.
555\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
556\begin{table}[h]
557  {\relsize{-2}
558    \begin{tabular}{|c|l||c|l|}
559      \hline
560      ID & Meaning & ID & Meaning \\
561      \hline
562      \texttt{TALB} & album name &
563      \texttt{TBPM} & beats-per-minute \\
564      \texttt{TCOM} & composer &
565      \texttt{TCON} & content type \\
566      \texttt{TCOP} & copyright message &
567      \texttt{TDAT} & date \\
568      \texttt{TDLY} & playlist delay &
569      \texttt{TENC} & encoded by \\
570      \texttt{TEXT} & lyricist / text writer &
571      \texttt{TFLT} & file type \\
572      \texttt{TIME} & time &
573      \texttt{TIT1} & content group description \\
574      \texttt{TIT2} & title / songname / content description &
575      \texttt{TIT3} & subtitle / description refinement \\
576      \texttt{TKEY} & initial key &
577      \texttt{TLAN} & language(s) \\
578      \texttt{TLEN} & length &
579      \texttt{TMED} & media type \\
580      \texttt{TOAL} & original album / movie / show title &
581      \texttt{TOFN} & original filename \\
582      \texttt{TOLY} & original lyricist(s) / text writer(s) &
583      \texttt{TOPE} & original artist(s) / performer(s) \\
584      \texttt{TORY} & original release year &
585      \texttt{TOWN} & file owner / licensee \\
586      \texttt{TPE1} & lead performer(s) / soloist(s) &
587      \texttt{TPE2} & band / orchestra / accompaniment \\
588      \texttt{TPE3} & conductor / performer refinement &
589      \texttt{TPE4} & interpreted, remixed, or otherwise modified by \\
590      \texttt{TPOS} & part of a set &
591      \texttt{TPUB} & publisher \\
592      \texttt{TRCK} & track number / position in set &
593      \texttt{TRDA} & recording dates \\
594      \texttt{TRSN} & internet radio station name &
595      \texttt{TRSO} & internet radio station owner \\
596      \texttt{TSIZ} & size &
597      \texttt{TSRC} & ISRC (International Standard Recording Code) \\
598      \texttt{TSSE} & software / hardware and encoding settings &
599      \texttt{TYER} & year \\
600      \hline
601    \end{tabular}
602  }
603\end{table}
604
605\subsubsection{Text Frame Example}
606\begin{figure}[h]
607  \includegraphics{figures/id3v23/t___-example.pdf}
608\end{figure}
609\par
610\noindent
611\begin{tabular}{rl}
frame ID : & \texttt{"TIT2"} (title / song name / content description) \\
613encoding : & \texttt{0} (Latin-1) \\
614text : & \texttt{"Track Name"} \\
615\end{tabular}
616
617\clearpage
618
619\subsubsection{User-Defined Text Frame}
620
621\begin{figure}[h]
622  \includegraphics{figures/id3v23/txxx.pdf}
623\end{figure}
624\par
625\noindent
626This frame is for user-defined text information.
627\VAR{description} is a NULL-terminated string indicating
628what the frame is for.
629\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
630
631\subsubsection{GEOB Frame}
632\begin{figure}[h]
633\includegraphics{figures/id3v23/geob.pdf}
634\end{figure}
635This frame contains an embedded file (i.e. a ``general encapsulated object'').
636\VAR{MIME type} is a NULL-terminated Latin-1 string.
\VAR{filename} and \VAR{content description} are NULL-terminated
strings encoded as Latin-1 or UCS-2, depending on whether
\VAR{encoding} is 0 or 1, respectively.
640\VAR{encapsulated object} is binary file data.
641
642\clearpage
643
644\subsubsection{COMM Frame}
645\begin{figure}[h]
646\includegraphics{figures/id3v23/comm.pdf}
647\end{figure}
648\par
649\noindent
650This frame supports a lengthy string of comment text.
651\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
652\VAR{language} is a 3 byte ASCII string.
653\VAR{description} is a NULL-terminated string
654containing a short description of the comment.
655Note that for UCS-2 comments, the NULL terminator is 2 bytes.
656The remainder of the frame is the comment text itself.
657
658\subsubsection{COMM Frame Example}
659\begin{figure}[h]
660  \includegraphics{figures/id3v23/comm-example.pdf}
661\end{figure}
662\begin{table}[h]
663\begin{tabular}{rl}
664encoding : & \texttt{0} (Latin-1) \\
665language : & \texttt{"eng"} \\
666description : & \texttt{""} (empty string) \\
667comment text : & \texttt{"Comment Text"} \\
668\end{tabular}
669\end{table}
670
671\clearpage
672
673\subsubsection{URL Frames}
674\begin{figure}[h]
675\includegraphics{figures/id3v23/w___.pdf}
676\end{figure}
677\par
678\noindent
679Frames beginning with `W' are URL links to external resources.
680\par
681\begin{table}[h]
682\begin{tabular}{|c|l|}
683\hline
684ID & Meaning \\
685\hline
686\texttt{WCOM} & commercial information \\
687\texttt{WCOP} & copyright / legal information \\
688\texttt{WOAF} & official audio file webpage \\
689\texttt{WOAR} & official artist/performer webpage \\
690\texttt{WOAS} & official audio source webpage \\
691\texttt{WORS} & official internet radio station homepage \\
692\texttt{WPAY} & payment \\
693\texttt{WPUB} & publisher's official webpage \\
694\hline
695\end{tabular}
696\end{table}
697
698\subsubsection{User-Defined URL Frame}
699
700\begin{figure}[h]
701\includegraphics{figures/id3v23/wxxx.pdf}
702\end{figure}
703\par
704\noindent
705This is a frame for user-defined links to external resources.
706\VAR{description} is a NULL-terminated string indicating
707what the frame is for.
708\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
709
710\clearpage
711
712\subsubsection{USLT Frame}
713\begin{figure}[h]
714  \includegraphics{figures/id3v23/uslt.pdf}
715\end{figure}
716\par
717\noindent
718This frame is for unsynchronized (non-karaoke) lyrics text,
719similar to a comment frame.
720\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
721\VAR{language} is a 3 byte ASCII string.
722\VAR{description} is a NULL-terminated string
723containing a short description of the lyrics.
Note that for UCS-2 lyrics, the NULL terminator is 2 bytes.
725The remainder of the frame is the lyrics text itself.
726
727\clearpage
728
729\subsection{ID3v2.4 Header}
730\begin{figure}[h]
731\includegraphics{figures/id3v24/header.pdf}
732\end{figure}
733
734\subsection{ID3v2.4 Frames}
735\begin{figure}[h]
736  \includegraphics{figures/id3v24/frames.pdf}
737\end{figure}
738\par
739\noindent
The ID3v2.4 frame size field is also ``sync safe'', like the ID3v2 headers.
Each frame also contains \VAR{tag alter}, \VAR{file alter}, \VAR{read only},
\VAR{grouping}, \VAR{compression}, \VAR{encryption}, \VAR{unsync}
and \VAR{data length} flags which are typically unused.
744
745\clearpage
746
747\subsubsection{APIC Frame}
748\begin{figure}[h]
749\includegraphics{figures/id3v24/apic.pdf}
750\end{figure}
751\par
752\noindent
753This frame contains an embedded image file.
754\VAR{MIME type} is a NULL-terminated MIME type such as ``image/jpeg''
755or ``image/png''.
756\VAR{encoding} determines the encoding of the NULL-terminated
757\VAR{description} field.
758Valid encodings are \texttt{0} for Latin-1,
759\texttt{1} for UTF-16,
760\texttt{2} for UTF-16BE
761and \texttt{3} for UTF-8.
762\VAR{picture type} is one of the following:
763\begin{table}[h]
764{\relsize{-1}
765\begin{tabular}{|r|l||r|l|}
766\hline
767value & type & value & type \\
768\hline
7690 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
7702 & other file icon & 3 & cover (front) \\
7714 & cover (back) & 5 & leaflet page \\
7726 & media (e.g. label side of CD) & 7 & lead artist / lead performer / soloist \\
7738 & artist / performer & 9 & conductor \\
77410 & band / orchestra & 11 & composer \\
77512 & lyricist / text writer & 13 & recording location \\
77614 & during recording & 15 & during performance \\
77716 & movie / video screen capture & 17 & a bright colored fish \\
77818 & illustration & 19 & band / artist logotype \\
77920 & publisher / studio logotype & &  \\
780\hline
781\end{tabular}
782\caption{APIC image types}
783}
784\end{table}
785
786\clearpage
787
788\subsubsection{APIC Frame Example}
789\begin{figure}[h]
790\includegraphics{figures/id3v24/apic-example.pdf}
791\end{figure}
792\begin{table}[h]
793\begin{tabular}{rl}
794encoding : & \texttt{0} (Latin-1) \\
795MIME type : & \texttt{"image/png"} \\
796picture type : & \texttt{3} (front cover) \\
797description : & \texttt{"Description"} \\
798\end{tabular}
799\end{table}
800\par
801\noindent
802As with all ID3v2.4 frames, the \VAR{frame size} field contains
803embedded 0 bits as follows:
804\begin{figure}[h]
805\includegraphics{figures/id3v24/size.pdf}
806\end{figure}
807\par
808\noindent
809which indicates the frame's data size is:
810\begin{equation*}
811(0 \times 2 ^ {21}) + (0 \times 2 ^ {14}) + (1 \times 2 ^ {7}) + 55 = 183 \text{ bytes}
812\end{equation*}
813
814\clearpage
815
816\subsubsection{Text Frames}
817\begin{figure}[h]
818  \includegraphics{figures/id3v24/t___.pdf}
819\end{figure}
820\par
821\noindent
822Frames beginning with `T' are text frames.
823\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
8242 for UTF-16BE encoding and 3 for UTF-8 encoding.
825\begin{table}[h]
826  {\relsize{-2}
827    \begin{tabular}{|c|l||c|l|}
828      \hline
829      ID & Meaning & ID & Meaning \\
830      \hline
831      \texttt{TALB} & album name &
832      \texttt{TBPM} & beats-per-minute \\
833      \texttt{TCOM} & composer &
834      \texttt{TCON} & content type \\
835      \texttt{TCOP} & copyright message &
836      \texttt{TDAT} & date \\
837      \texttt{TDEN} & encoding time &
838      \texttt{TDLY} & playlist delay \\
839      \texttt{TDOR} & original release time &
840      \texttt{TDRC} & recording time \\
841      \texttt{TDRL} & release time &
842      \texttt{TDTG} & tagging time \\
843      \texttt{TENC} & encoded by &
844      \texttt{TEXT} & lyricist / text writer \\
845      \texttt{TFLT} & file type &
846      \texttt{TIPL} & involved people list \\
847      \texttt{TIT1} & content group description &
848      \texttt{TIT2} & title / songname / content description \\
849      \texttt{TIT3} & subtitle / description refinement &
850      \texttt{TKEY} & initial key \\
851      \texttt{TLAN} & language(s) &
852      \texttt{TLEN} & length \\
853      \texttt{TMCL} & musician credits list &
854      \texttt{TMED} & media type \\
855      \texttt{TMOO} & mood &
856      \texttt{TOAL} & original album / movie / show title \\
857      \texttt{TOFN} & original filename &
858      \texttt{TOLY} & original lyricist(s) / text writer(s) \\
859      \texttt{TOPE} & original artist(s) / performer(s) &
860      \texttt{TOWN} & file owner / licensee \\
861      \texttt{TPE1} & lead performer(s) / soloist(s) &
862      \texttt{TPE2} & band / orchestra / accompaniment \\
863      \texttt{TPE3} & conductor / performer refinement &
864      \texttt{TPE4} & interpreted, remixed, or otherwise modified by \\
865      \texttt{TPOS} & part of a set &
866      \texttt{TPRO} & produced notice \\
867      \texttt{TPUB} & publisher &
868      \texttt{TRCK} & track number / position in set \\
869      \texttt{TRSN} & internet radio station name &
870      \texttt{TRSO} & internet radio station owner \\
871      \texttt{TSOA} & album sort order &
872      \texttt{TSOP} & performer sort order \\
873      \texttt{TSOT} & title sort order &
874      \texttt{TSRC} & ISRC (International Standard Recording Code) \\
875      \texttt{TSSE} & software / hardware and encoding settings &
876      \texttt{TSST} & set subtitle \\
877      \hline
878    \end{tabular}
879  }
880\end{table}
881
882\clearpage
883
884\subsubsection{Text Frame Example}
885\begin{figure}[h]
886  \includegraphics{figures/id3v24/t___-example.pdf}
887\end{figure}
888\begin{table}[h]
889\begin{tabular}{rl}
890frame ID : & \texttt{"TIT2"} (title / song name / content description) \\
891encoding : & \texttt{0} (Latin-1) \\
text : & \texttt{"Track Name"} \\
893\end{tabular}
894\end{table}
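\par
\noindent
Decoding the text of an ID3v2.4 text frame can be sketched in Python
as follows.
The names \texttt{CODECS} and \texttt{decode\_text\_frame} are
hypothetical.
The sketch assumes UTF-16 text (encoding 1) begins with a byte order
mark and strips any trailing NULL terminator.

```python
# ID3v2.4 encoding byte mapped to a Python codec name
CODECS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def decode_text_frame(data: bytes) -> str:
    """Decode a text frame's data: 1 encoding byte, then the text."""
    codec = CODECS[data[0]]
    # Python's utf-16 codec consumes the byte order mark, if present.
    return data[1:].decode(codec).rstrip("\x00")
```

An unknown encoding byte raises \texttt{KeyError} rather than
guessing at a codec.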
895
896\subsubsection{User-Defined Text Frame}
897\begin{figure}[h]
898  \includegraphics{figures/id3v24/txxx.pdf}
899\end{figure}
900\par
901\noindent
902This frame is for user-defined text information.
903\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
9042 for UTF-16BE encoding and 3 for UTF-8 encoding.
905\VAR{description} is a NULL-terminated string indicating
906what the frame is for.
907
908\clearpage
909
910\subsubsection{COMM Frame}
911\begin{figure}[h]
912\includegraphics{figures/id3v24/comm.pdf}
913\end{figure}
914\par
915\noindent
916This frame supports a lengthy string of comment text.
917\VAR{encoding} determines the encoding of the NULL-terminated
918\VAR{description} field and \VAR{comment text} field.
919Valid encodings are \texttt{0} for Latin-1,
920\texttt{1} for UTF-16,
921\texttt{2} for UTF-16BE
922and \texttt{3} for UTF-8.
923\VAR{language} is a 3 byte ASCII string.
924
925\subsubsection{COMM Frame Example}
926\begin{figure}[h]
927\includegraphics{figures/id3v24/comm-example.pdf}
928\end{figure}
929\begin{table}[h]
930\begin{tabular}{rl}
931encoding : & \texttt{0} (Latin-1) \\
932language : & \texttt{"eng"} \\
933description : & \texttt{""} (empty string) \\
934comment text : & \texttt{"Comment Text"} \\
935\end{tabular}
936\end{table}
937
938\clearpage
939
940\subsubsection{URL Frames}
941\begin{figure}[h]
942\includegraphics{figures/id3v24/w___.pdf}
943\end{figure}
944\par
945\noindent
946Frames with IDs beginning with `W' are URL links to external resources.
947\par
948\begin{table}[h]
949\begin{tabular}{|c|l|}
950\hline
951ID & Meaning \\
952\hline
953\texttt{WCOM} & commercial information \\
954\texttt{WCOP} & copyright / legal information \\
955\texttt{WOAF} & official audio file webpage \\
956\texttt{WOAR} & official artist/performer webpage \\
957\texttt{WOAS} & official audio source webpage \\
958\texttt{WORS} & official internet radio station homepage \\
959\texttt{WPAY} & payment \\
960\texttt{WPUB} & publisher's official webpage \\
961\hline
962\end{tabular}
963\end{table}
964
965\clearpage
966
967\subsubsection{User-Defined URL Frame}
968
969\begin{figure}[h]
970\includegraphics{figures/id3v24/wxxx.pdf}
971\end{figure}
972\par
973\noindent
974This is a frame for user-defined links to external resources.
975\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
9762 for UTF-16BE encoding and 3 for UTF-8 encoding.
977\VAR{description} is a NULL-terminated string indicating
978what the frame is for.
979
980\clearpage
981
982\subsubsection{GEOB Frame}
983\begin{figure}[h]
984  \includegraphics{figures/id3v24/geob.pdf}
985\end{figure}
986\par
987\noindent
988This frame contains an embedded file (i.e. a ``general encapsulated object'').
989\VAR{MIME type} is a NULL-terminated Latin-1 string.
\VAR{filename} and \VAR{content description} are NULL-terminated
strings encoded as Latin-1, UTF-16, UTF-16BE or UTF-8,
depending on the \VAR{encoding} field.
993\VAR{encapsulated object} is binary file data.
994
995\clearpage
996
997\subsubsection{USLT Frame}
998\begin{figure}[h]
999  \includegraphics{figures/id3v24/uslt.pdf}
1000\end{figure}
1001\par
1002\noindent
1003This frame is for unsynchronized (non-karaoke) lyrics text,
1004similar to a comment frame.
1005\VAR{language} is a 3 byte ASCII string.
\VAR{description} and \VAR{lyrics text} are NULL-terminated strings
encoded as Latin-1, UTF-16, UTF-16BE or UTF-8,
depending on the \VAR{encoding} field.
1009