%Copyright (C) 2007-2014 Brian Langenberger
%This work is licensed under the
%Creative Commons Attribution-Share Alike 3.0 United States License.
%To view a copy of this license, visit
%http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to
%Creative Commons,
%171 Second Street, Suite 300,
%San Francisco, California, 94105, USA.

\chapter{MP3}
MP3 is the de facto standard for lossy audio.
It is little more than a series of MPEG frames with an
optional ID3v2 metadata header and optional ID3v1 metadata
footer.

MP3 decoders are assumed to be very tolerant of anything in
the stream that doesn't look like an MPEG frame, ignoring such
junk until the next frame is found.
Since MP3 files have no standard container format in which
non-MPEG data can be placed, metadata such as ID3 tags are often
made `sync-safe' by formatting them in a way that decoders won't
mistake tags for MPEG frames.
\section{the MP3 File Stream}
\begin{figure}[h]
\includegraphics{figures/mp3/stream.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{|c||l||l||r|r|r||l|}
\hline
& & & \multicolumn{3}{c||}{Sample Rate} & \\
bits & MPEG ID & Description & MPEG-1 & MPEG-2 & MPEG-2.5 & Channels \\
\hline
\texttt{00} & MPEG-2.5 & reserved & 44100 & 22050 & 11025 & Stereo \\
\texttt{01} & reserved & Layer III & 48000 & 24000 & 12000 & Joint stereo \\
\texttt{10} & MPEG-2 & Layer II & 32000 & 16000 & 8000 & Dual channel stereo\\
\texttt{11} & MPEG-1 & Layer I & reserved & reserved & reserved & Mono \\
\hline
\end{tabular}
\end{table}
\par
\noindent
Layer I frames always contain 384 samples.
Layer II and Layer III frames always contain 1152 samples.
If the \VAR{Protection} bit is 0, the frame header is followed by a
16 bit CRC.
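\par
\noindent
As an illustrative calculation (not part of the format itself),
the duration of a single frame follows from its sample count
and the sampling rate.
For example, at a sampling rate of 44100Hz:
\begin{align*}
\text{MPEG-1 Layer I:} \quad & \frac{384}{44100} \approx 8.71 \text{ milliseconds} \\
\text{MPEG-1 Layer III:} \quad & \frac{1152}{44100} \approx 26.12 \text{ milliseconds}
\end{align*}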

\pagebreak

\begin{table}[h]
{\relsize{-2}
\begin{tabular}{|c||r|r|r|r|r|}
\hline
& MPEG-1 & MPEG-1 & MPEG-1 & MPEG-2 & MPEG-2 \\
bits & Layer-1 & Layer-2 & Layer-3 & Layer-1 & Layer-2/3 \\
\hline
\texttt{0000} & free & free & free & free & free \\
\texttt{0001} & 32 & 32 & 32 & 32 & 8 \\
\texttt{0010} & 64 & 48 & 40 & 48 & 16 \\
\texttt{0011} & 96 & 56 & 48 & 56 & 24 \\
\texttt{0100} & 128 & 64 & 56 & 64 & 32 \\
\texttt{0101} & 160 & 80 & 64 & 80 & 40 \\
\texttt{0110} & 192 & 96 & 80 & 96 & 48 \\
\texttt{0111} & 224 & 112 & 96 & 112 & 56 \\
\texttt{1000} & 256 & 128 & 112 & 128 & 64 \\
\texttt{1001} & 288 & 160 & 128 & 144 & 80 \\
\texttt{1010} & 320 & 192 & 160 & 160 & 96 \\
\texttt{1011} & 352 & 224 & 192 & 176 & 112 \\
\texttt{1100} & 384 & 256 & 224 & 192 & 128 \\
\texttt{1101} & 416 & 320 & 256 & 224 & 144 \\
\texttt{1110} & 448 & 384 & 320 & 256 & 160 \\
\texttt{1111} & bad & bad & bad & bad & bad \\
\hline
\end{tabular}
}
\caption{Bitrate in 1000 bits per second}
\end{table}
To find the total size of an MPEG frame, use one of the following
formulas:
\begin{align}
\intertext{Layer I:}
\text{Byte Length} &= \left ( \left\lfloor \frac{12 \times \text{Bitrate}}{\text{Sample Rate}} \right\rfloor + \text{Pad} \right ) \times 4 \\
\intertext{Layer II/III:}
\text{Byte Length} &= \left\lfloor \frac{144 \times \text{Bitrate}}{\text{Sample Rate}} \right\rfloor + \text{Pad}
\end{align}
where the bitrate is given in bits per second and the division is rounded down.
For example, an MPEG-1 Layer III frame with a sampling rate of 44100,
a bitrate of 128kbps and a set pad bit is 418 bytes long, including the header.
\begin{equation}
\left\lfloor \frac{144 \times 128000}{44100} \right\rfloor + 1 = 417 + 1 = 418
\end{equation}

\clearpage

\subsection{the Xing Header}

An MP3 frame header contains the track's sampling rate
and number of channels, along with that frame's bitrate.
However, because MP3 files are little more than
concatenated MPEG frames, there is no obvious place to
store the track's total length.
Since the length of each frame is a constant number of samples,
one can calculate the track length by counting the number of frames.
This method is the most accurate but is also quite slow.

For MP3 files in which all frames have the same bitrate
(also known as constant bitrate, or CBR, files)
one can divide the total size of the file (minus any ID3 headers/footers)
by the bitrate to determine its length.
If an MP3 file has no Xing header in its first frame,
one can assume it is CBR.

An MP3 file that does contain a Xing header in its first frame
can be assumed to be variable bitrate, or VBR.
In that case, the bitrate of the first frame cannot be used as a
basis to calculate the length of the entire file.
Instead, one must use the information from the Xing header,
which contains that length.

All of the fields within a Xing header are big-endian.
\begin{figure}[h]
\includegraphics{figures/mp3/xing.pdf}
\end{figure}

\clearpage

\section{ID3v1 Tags}
ID3v1 tags are very simple metadata tags appended to an MP3 file.
All of the fields are fixed length, padded with NULLs if necessary,
and the text encoding is undefined.
There are two versions of ID3v1 tags.
ID3v1.1 has a track number field as a 1 byte value
at the end of the comment field.
If the second-to-last byte of the comment field is not NULL (0x00),
assume we're dealing with a classic ID3v1 tag without a
track number.
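\par
\noindent
The check above can be sketched as follows, assuming the comment
field occupies 30 bytes numbered $c_0$ through $c_{29}$
(these names are illustrative and do not appear in the figures):
\begin{equation*}
\text{tag version} =
\begin{cases}
\text{ID3v1.1, with track number } c_{29} & \text{if } c_{28} = \text{\texttt{0x00}} \\
\text{ID3v1} & \text{otherwise}
\end{cases}
\end{equation*}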

\subsection{ID3v1}

\begin{figure}[h]
\includegraphics{figures/mp3/id3v1.pdf}
\end{figure}

\subsection{ID3v1.1}

\begin{figure}[h]
\includegraphics{figures/mp3/id3v11.pdf}
\end{figure}

\clearpage

\subsection{ID3v1.1 Tag Example}
\begin{figure}[h]
  \includegraphics{figures/mp3/id3v11-example.pdf}
\end{figure}
\begin{table}[h]
  \begin{tabular}{rl}
    track title : & \texttt{some track name} \\
    artist name : & \texttt{artist's name} \\
    album name : & \texttt{album title} \\
    year : & \texttt{2012} \\
    comment : & \texttt{a lengthy comment field} \\
    track number : & \texttt{1} \\
    genre : & \texttt{0} \\
  \end{tabular}
\end{table}

\clearpage

\section{ID3v2 Tags}

The ID3v2 tag was invented to address the deficiencies in the original
ID3v1 tag.
ID3v2 comes in three similar but not entirely compatible variants:
ID3v2.2, ID3v2.3 and ID3v2.4.
All of its fields are big-endian.

\begin{figure}[h]
\includegraphics{figures/mp3/id3v2_stream.pdf}
\end{figure}
\par
\noindent
\VAR{major version} is 2 for ID3v2.2, 3 for ID3v2.3 and 4 for ID3v2.4.
\VAR{minor version} is always 0.
The four \VAR{size} fields are recombined as follows:
\begin{equation*}
  \text{ID3 Frames size} = (\text{size}_3 \times 2 ^ {21}) + (\text{size}_2 \times 2 ^ {14}) + (\text{size}_1 \times 2 ^ 7) + \text{size}_0
\end{equation*}
Splitting the field with 0 bits ensures that no size value
will appear to be an MP3 frame sync.
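\par
\noindent
As a worked illustration in the other direction
(the value 1000 is an arbitrary choice, not taken from a real file),
splitting a size of 1000 bytes into 7 bit fields gives:
\begin{align*}
\text{size}_1 &= \left\lfloor 1000 \div 2^7 \right\rfloor = 7 \\
\text{size}_0 &= 1000 - (7 \times 2^7) = 104
\end{align*}
with $\text{size}_3$ and $\text{size}_2$ both 0.
Because every stored byte has its top bit clear, none of them can be
\texttt{0xFF}, and the largest size this scheme can express is
$2^{28} - 1 = 268435455$ bytes.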

\clearpage

\subsection{ID3v2 Header Example}
\begin{figure}[h]
  \includegraphics{figures/mp3/id3v2_stream-example.pdf}
\end{figure}
\begin{table}[h]
  \begin{tabular}{rl}
    ID : & \texttt{"ID3"} \\
    major version : & \texttt{3} \\
    minor version : & \texttt{0} \\
    flags : & \texttt{0} \\
    $\text{size}_3$ : & \texttt{0} \\
    $\text{size}_2$ : & \texttt{0} \\
    $\text{size}_1$ : & \texttt{1} \\
    $\text{size}_0$ : & \texttt{21} \\
    ID3 Frames size : & $(0 \times 2 ^ {21}) + (0 \times 2 ^ {14}) + (1 \times 2 ^ 7) + 21 = 149$ bytes \\
  \end{tabular}
\end{table}
\par
\noindent
This indicates an ID3v2.3 tag with 149 bytes of ID3v2.3 frames.
Since there is no `total number of frames' field,
we must use the size field to determine when to stop reading
additional ID3 frames.

\clearpage

\subsection{ID3v2.2 Header}

\begin{figure}[h]
\includegraphics{figures/id3v22/header.pdf}
\end{figure}
\par
\noindent
The \VAR{unsync} and \VAR{compression} flags are normally unused.

\subsection{ID3v2.2 Frames}
\begin{figure}[h]
\includegraphics{figures/id3v22/frames.pdf}
\end{figure}
\par
\noindent
\VAR{frame ID} is a 3 byte ASCII string.
\VAR{frame size} is the length of the frame data,
not including its 6 byte header.

\clearpage

\subsubsection{COM Frame}
\begin{figure}[h]
\includegraphics{figures/id3v22/com.pdf}
\end{figure}
\par
\noindent
This frame supports a lengthy string of comment text.
\VAR{encoding} is 0 for Latin-1, 1 for UCS-2,
indicating the text encoding of the \VAR{short description}
and \VAR{comment text} fields.
\VAR{language} is a 3 byte ASCII string.
\VAR{short description} is a NULL-terminated string
containing a short description of the comment.
Note that for UCS-2 comments, the NULL terminator is 2 bytes.
The remainder of the frame is the comment text itself.
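\par
\noindent
To illustrate how these fields add up, consider a hypothetical
Latin-1 COM frame with an empty short description and a 12 byte comment.
Assuming the layout described above, its \VAR{frame size} would be:
\begin{equation*}
\underbrace{1}_{\text{encoding}} + \underbrace{3}_{\text{language}} + \underbrace{0 + 1}_{\text{description + NULL}} + \underbrace{12}_{\text{comment text}} = 17 \text{ bytes}
\end{equation*}
or 23 bytes in total once the 6 byte frame header is included.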

\subsubsection{COM Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v22/com-example.pdf}
\end{figure}
\begin{table}[h]
  \begin{tabular}{rl}
    encoding : & \texttt{0} (Latin-1) \\
    language : & \texttt{"eng"} \\
    short description : & \texttt{""} (empty string) \\
    comment text : & \texttt{"comment text"} \\
  \end{tabular}
\end{table}

\clearpage

\subsubsection{PIC Frame}
\begin{figure}[h]
\includegraphics{figures/id3v22/pic.pdf}
\end{figure}
\par
\noindent
This frame contains an embedded image file.
\VAR{image format} is a 3 byte string indicating the format of the image,
typically `JPG' for JPEG images or `PNG' for PNG images.
\VAR{description} is a Latin-1 or UCS-2 encoded NULL-terminated string,
depending on whether \VAR{encoding} is 0 or 1, respectively.
\VAR{picture type} is one of the following:
\begin{table}[h]
{\relsize{-1}
\begin{tabular}{|r|l||r|l|}
\hline
value & type & value & type \\
\hline
0 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
2 & other file icon & 3 & cover (front) \\
4 & cover (back) & 5 & leaflet page \\
6 & media (e.g.
label side of CD) & 7 & lead artist / lead performer / soloist \\
8 & artist / performer & 9 & conductor \\
10 & band / orchestra & 11 & composer \\
12 & lyricist / text writer & 13 & recording location \\
14 & during recording & 15 & during performance \\
16 & movie / video screen capture & 17 & a bright colored fish \\
18 & illustration & 19 & band / artist logotype \\
20 & publisher / studio logotype & & \\
\hline
\end{tabular}
\caption{PIC image types}
}
\end{table}

\clearpage

\subsubsection{PIC Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v22/pic-example.pdf}
\end{figure}
\begin{table}[h]
  \begin{tabular}{rl}
    encoding : & \texttt{0} (Latin-1) \\
    image format : & \texttt{"PNG"} \\
    image type : & \texttt{3} (front cover) \\
    description : & \texttt{"Description"} \\
\end{tabular}
\end{table}

\clearpage

\subsubsection{Text Frames}
\begin{figure}[h]
\includegraphics{figures/id3v22/t__.pdf}
\end{figure}
\par
\noindent
Frames whose ID starts with `T' are for textual information.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
\par
\noindent
{\relsize{-1}
\begin{tabular}{r|l}
frame ID & meaning \\
\hline
\texttt{TAL} & album / movie / show title \\
\texttt{TBP} & BPM (beats per minute) \\
\texttt{TCM} & composer \\
\texttt{TCO} & content type \\
\texttt{TCR} & copyright message \\
\texttt{TDA} & date \\
\texttt{TDY} & playlist delay \\
\texttt{TEN} & encoded by \\
\texttt{TFT} & file type \\
\texttt{TIM} & time \\
\texttt{TKE} & initial key \\
\texttt{TLA} & language(s) \\
\texttt{TLE} & length \\
\texttt{TMT} & media type \\
\texttt{TOA} & original artist(s) / performer(s) \\
\texttt{TOF} & original filename \\
\texttt{TOL} & original lyricist(s) / text writer(s) \\
\texttt{TOR} & original release year \\
\texttt{TOT} & original album / movie / show title \\
\texttt{TP1} & lead artist(s) / performer(s) / soloist(s) / performing group \\
\texttt{TP2} & band / orchestra / accompaniment \\
\texttt{TP3} & conductor / performer refinement \\
\texttt{TP4} & interpreted, remixed, or otherwise modified by \\
\texttt{TPA} & album number / part of a set \\
\texttt{TPB} & publisher \\
\texttt{TRC} & ISRC (International Standard Recording Code) \\
\texttt{TRD} & recording dates \\
\texttt{TRK} & track number / position in set \\
\texttt{TSI} & size \\
\texttt{TSS} & software / hardware and settings used for encoding \\
\texttt{TT1} & content group description \\
\texttt{TT2} & title / songname / content description \\
\texttt{TT3} & subtitle / description refinement \\
\texttt{TXT} & lyricist / text writer \\
\texttt{TYE} & year \\
\end{tabular}
}

\clearpage

The \texttt{TRK} and \texttt{TPA} numeric fields may be extended
with the ``/'' character, indicating the field's total number.
For example, a \texttt{TRK} of ``3/5'' means the track is number
3 out of a total of 5.

\subsubsection{Text Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v22/t__-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
frame ID : & \texttt{"TT2"} (title / song name / content description) \\
encoding : & \texttt{0} (Latin-1) \\
text : & \texttt{"some track name"}
\end{tabular}
\end{table}

\subsubsection{User-Defined Text Frame}
\begin{figure}[h]
\includegraphics{figures/id3v22/txx.pdf}
\end{figure}
\par
\noindent
This frame is for user-defined text information.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.

\clearpage

\subsubsection{URL Frames}
\begin{figure}[h]
  \includegraphics{figures/id3v22/w__.pdf}
\end{figure}
\par
\noindent
Frames whose ID begins with `W' contain URL links to external resources.
\par
\begin{table}[h]
  \begin{tabular}{|r|l|}
    \hline
    Frame ID & Meaning \\
    \hline
    \texttt{WAF} & official audio file webpage \\
    \texttt{WAR} & official artist / performer webpage \\
    \texttt{WAS} & official audio source webpage \\
    \texttt{WCM} & commercial information \\
    \texttt{WCP} & copyright / legal information \\
    \texttt{WPB} & publisher's official webpage \\
    \hline
  \end{tabular}
\end{table}
\begin{figure}[h]
  \includegraphics{figures/id3v22/wxx.pdf}
\end{figure}
\par
\noindent
This is a frame for user-defined links to external resources.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.

\clearpage

\subsubsection{ULT Frame}
\begin{figure}[h]
\includegraphics{figures/id3v22/ult.pdf}
\end{figure}
\par
\noindent
This frame is for unsynchronized (non-karaoke) lyrics text,
similar to a comment frame.
\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
\VAR{language} is a 3 byte ASCII string.
\VAR{short description} is a NULL-terminated string
containing a short description of the lyrics.
Note that for UCS-2 lyrics, the NULL terminator is 2 bytes.
The remainder of the frame is the lyrics text itself.

\clearpage

\subsubsection{GEO Frame}
\begin{figure}[h]
  \includegraphics{figures/id3v22/geo.pdf}
\end{figure}
\par
\noindent
This frame contains an embedded file (i.e. a ``general encapsulated object'').
\VAR{MIME Type} and \VAR{filename} are encoded as NULL-terminated
Latin-1 strings.
\VAR{description} is encoded as a Latin-1 or UCS-2 NULL-terminated
string, depending on whether \VAR{encoding} is 0 or 1, respectively.
The remainder of the frame is the file's binary data.

\clearpage

\subsection{ID3v2.3 Header}
\begin{figure}[h]
\includegraphics{figures/id3v23/header.pdf}
\end{figure}
\par
\noindent
The \VAR{unsync}, \VAR{extended}, \VAR{experimental} and \VAR{footer}
flags are normally unused.

\subsection{ID3v2.3 Frames}
\begin{figure}[h]
  \includegraphics{figures/id3v23/frames.pdf}
\end{figure}
\par
\noindent
\VAR{frame ID} is a 4 byte ASCII string.
\VAR{frame size} is the length of the frame data,
not including its 10 byte header.
The \VAR{tag alter}, \VAR{file alter}, \VAR{read only},
\VAR{compression}, \VAR{encryption} and \VAR{grouping} flags
are normally unused.

\clearpage

\subsubsection{APIC Frame}
\begin{figure}[h]
\includegraphics{figures/id3v23/apic.pdf}
\end{figure}
\par
\noindent
This frame contains an embedded image file.
\VAR{MIME type} is a NULL-terminated MIME type such as ``image/jpeg''
or ``image/png''.
\VAR{description} is a Latin-1 or UCS-2 encoded NULL-terminated string,
depending on whether \VAR{encoding} is 0 or 1, respectively.
\VAR{picture type} is one of the following:
\begin{table}[h]
{\relsize{-1}
\begin{tabular}{|r|l||r|l|}
\hline
value & type & value & type \\
\hline
0 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
2 & other file icon & 3 & cover (front) \\
4 & cover (back) & 5 & leaflet page \\
6 & media (e.g. label side of CD) & 7 & lead artist / lead performer / soloist \\
8 & artist / performer & 9 & conductor \\
10 & band / orchestra & 11 & composer \\
12 & lyricist / text writer & 13 & recording location \\
14 & during recording & 15 & during performance \\
16 & movie / video screen capture & 17 & a bright colored fish \\
18 & illustration & 19 & band / artist logotype \\
20 & publisher / studio logotype & & \\
\hline
\end{tabular}
\caption{APIC image types}
}
\end{table}

\clearpage

\subsubsection{APIC Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v23/apic-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
text encoding : & \texttt{0} (Latin-1) \\
MIME type : & \texttt{"image/png"} \\
picture type : & \texttt{3} (front cover) \\
description : & \texttt{"Description"} \\
\end{tabular}
\end{table}

\clearpage

\subsubsection{Text Frames}
\begin{figure}[h]
  \includegraphics{figures/id3v23/t___.pdf}
\end{figure}
\par
\noindent
Frames beginning with `T' are text frames.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.
\begin{table}[h]
  {\relsize{-2}
  \begin{tabular}{|c|l||c|l|}
    \hline
    ID & Meaning & ID & Meaning \\
    \hline
    \texttt{TALB} & album name &
    \texttt{TBPM} & beats-per-minute \\
    \texttt{TCOM} & composer &
    \texttt{TCON} & content type \\
    \texttt{TCOP} & copyright message &
    \texttt{TDAT} & date \\
    \texttt{TDLY} & playlist delay &
    \texttt{TENC} & encoded by \\
    \texttt{TEXT} & lyricist / text writer &
    \texttt{TFLT} & file type \\
    \texttt{TIME} & time &
    \texttt{TIT1} & content group description \\
    \texttt{TIT2} & title / songname / content description &
    \texttt{TIT3} & subtitle / description refinement \\
    \texttt{TKEY} & initial key &
    \texttt{TLAN} & language(s) \\
    \texttt{TLEN} & length &
    \texttt{TMED} & media type \\
    \texttt{TOAL} & original album / movie / show title &
    \texttt{TOFN} & original filename \\
    \texttt{TOLY} & original lyricist(s) / text writer(s) &
    \texttt{TOPE} & original artist(s) / performer(s) \\
    \texttt{TORY} & original release year &
    \texttt{TOWN} & file owner / licensee \\
    \texttt{TPE1} & lead performer(s) / soloist(s) &
    \texttt{TPE2} & band / orchestra / accompaniment \\
    \texttt{TPE3} & conductor / performer refinement &
    \texttt{TPE4} & interpreted, remixed, or otherwise modified by \\
    \texttt{TPOS} & part of a set &
    \texttt{TPUB} & publisher \\
    \texttt{TRCK} & track number / position in set &
    \texttt{TRDA} & recording dates \\
    \texttt{TRSN} & internet radio station name &
    \texttt{TRSO} & internet radio station owner \\
    \texttt{TSIZ} & size &
    \texttt{TSRC} & ISRC (International Standard Recording Code) \\
    \texttt{TSSE} & software / hardware and encoding settings &
    \texttt{TYER} & year \\
    \hline
  \end{tabular}
  }
\end{table}

\subsubsection{Text Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v23/t___-example.pdf}
\end{figure}
\par
\noindent
\begin{tabular}{rl}
frame type : & \texttt{"TIT2"} (title / song name / content description) \\
encoding : & \texttt{0} (Latin-1) \\
text : & \texttt{"Track Name"} \\
\end{tabular}

\clearpage

\subsubsection{User-Defined Text Frame}

\begin{figure}[h]
  \includegraphics{figures/id3v23/txxx.pdf}
\end{figure}
\par
\noindent
This frame is for user-defined text information.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.

\subsubsection{GEOB Frame}
\begin{figure}[h]
\includegraphics{figures/id3v23/geob.pdf}
\end{figure}
This frame contains an embedded file (i.e. a ``general encapsulated object'').
\VAR{MIME type} is a NULL-terminated Latin-1 string.
\VAR{filename} and \VAR{content description} are NULL-terminated
Latin-1 or UCS-2 encoded strings, depending on whether
\VAR{encoding} is 0 or 1, respectively.
\VAR{encapsulated object} is binary file data.

\clearpage

\subsubsection{COMM Frame}
\begin{figure}[h]
\includegraphics{figures/id3v23/comm.pdf}
\end{figure}
\par
\noindent
This frame supports a lengthy string of comment text.
\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
\VAR{language} is a 3 byte ASCII string.
\VAR{description} is a NULL-terminated string
containing a short description of the comment.
Note that for UCS-2 comments, the NULL terminator is 2 bytes.
The remainder of the frame is the comment text itself.
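\par
\noindent
As a rough illustration of how the encoding affects the frame size,
consider a hypothetical COMM frame with an empty description and a
12 character comment.
Assuming 1 byte per character for Latin-1, 2 bytes per character for UCS-2
and no byte order marks (each of which would add 2 bytes to its string),
the \VAR{frame size} works out to:
\begin{align*}
\text{Latin-1:} \quad & 1 + 3 + (0 + 1) + 12 = 17 \text{ bytes} \\
\text{UCS-2:} \quad & 1 + 3 + (0 + 2) + 24 = 30 \text{ bytes}
\end{align*}
The terms are, in order, the \VAR{encoding} byte, the \VAR{language} string,
the \VAR{description} with its NULL terminator, and the comment text.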

\subsubsection{COMM Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v23/comm-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
encoding : & \texttt{0} (Latin-1) \\
language : & \texttt{"eng"} \\
description : & \texttt{""} (empty string) \\
comment text : & \texttt{"Comment Text"} \\
\end{tabular}
\end{table}

\clearpage

\subsubsection{URL Frames}
\begin{figure}[h]
\includegraphics{figures/id3v23/w___.pdf}
\end{figure}
\par
\noindent
Frames beginning with `W' are URL links to external resources.
\par
\begin{table}[h]
\begin{tabular}{|c|l|}
\hline
ID & Meaning \\
\hline
\texttt{WCOM} & commercial information \\
\texttt{WCOP} & copyright / legal information \\
\texttt{WOAF} & official audio file webpage \\
\texttt{WOAR} & official artist/performer webpage \\
\texttt{WOAS} & official audio source webpage \\
\texttt{WORS} & official internet radio station homepage \\
\texttt{WPAY} & payment \\
\texttt{WPUB} & publisher's official webpage \\
\hline
\end{tabular}
\end{table}

\subsubsection{User-Defined URL Frame}

\begin{figure}[h]
\includegraphics{figures/id3v23/wxxx.pdf}
\end{figure}
\par
\noindent
This is a frame for user-defined links to external resources.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UCS-2 encoding.

\clearpage

\subsubsection{USLT Frame}
\begin{figure}[h]
  \includegraphics{figures/id3v23/uslt.pdf}
\end{figure}
\par
\noindent
This frame is for unsynchronized (non-karaoke) lyrics text,
similar to a comment frame.
\VAR{encoding} is 0 for Latin-1, 1 for UCS-2.
\VAR{language} is a 3 byte ASCII string.
\VAR{description} is a NULL-terminated string
containing a short description of the lyrics.
Note that for UCS-2 lyrics, the NULL terminator is 2 bytes.
The remainder of the frame is the lyrics text itself.

\clearpage

\subsection{ID3v2.4 Header}
\begin{figure}[h]
\includegraphics{figures/id3v24/header.pdf}
\end{figure}

\subsection{ID3v2.4 Frames}
\begin{figure}[h]
  \includegraphics{figures/id3v24/frames.pdf}
\end{figure}
\par
\noindent
The ID3v2.4 frame size field is also ``sync safe'', like the size field
in the ID3v2 header.
Each frame header also contains \VAR{tag alter}, \VAR{file alter}, \VAR{read only},
\VAR{grouping}, \VAR{compression}, \VAR{encryption}, \VAR{unsync}
and \VAR{data length} flags which are typically unused.

\clearpage

\subsubsection{APIC Frame}
\begin{figure}[h]
\includegraphics{figures/id3v24/apic.pdf}
\end{figure}
\par
\noindent
This frame contains an embedded image file.
\VAR{MIME type} is a NULL-terminated MIME type such as ``image/jpeg''
or ``image/png''.
\VAR{encoding} determines the encoding of the NULL-terminated
\VAR{description} field.
Valid encodings are \texttt{0} for Latin-1,
\texttt{1} for UTF-16,
\texttt{2} for UTF-16BE
and \texttt{3} for UTF-8.
\VAR{picture type} is one of the following:
\begin{table}[h]
{\relsize{-1}
\begin{tabular}{|r|l||r|l|}
\hline
value & type & value & type \\
\hline
0 & other & 1 & 32x32 pixels `file icon' (PNG only) \\
2 & other file icon & 3 & cover (front) \\
4 & cover (back) & 5 & leaflet page \\
6 & media (e.g.
label side of CD) & 7 & lead artist / lead performer / soloist \\
8 & artist / performer & 9 & conductor \\
10 & band / orchestra & 11 & composer \\
12 & lyricist / text writer & 13 & recording location \\
14 & during recording & 15 & during performance \\
16 & movie / video screen capture & 17 & a bright colored fish \\
18 & illustration & 19 & band / artist logotype \\
20 & publisher / studio logotype & & \\
\hline
\end{tabular}
\caption{APIC image types}
}
\end{table}

\clearpage

\subsubsection{APIC Frame Example}
\begin{figure}[h]
\includegraphics{figures/id3v24/apic-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
encoding : & \texttt{0} (Latin-1) \\
MIME type : & \texttt{"image/png"} \\
picture type : & \texttt{3} (front cover) \\
description : & \texttt{"Description"} \\
\end{tabular}
\end{table}
\par
\noindent
As with all ID3v2.4 frames, the \VAR{frame size} field contains
embedded 0 bits as follows:
\begin{figure}[h]
\includegraphics{figures/id3v24/size.pdf}
\end{figure}
\par
\noindent
which indicates the frame's data size is:
\begin{equation*}
(0 \times 2 ^ {21}) + (0 \times 2 ^ {14}) + (1 \times 2 ^ {7}) + 55 = 183 \text{ bytes}
\end{equation*}

\clearpage

\subsubsection{Text Frames}
\begin{figure}[h]
  \includegraphics{figures/id3v24/t___.pdf}
\end{figure}
\par
\noindent
Frames beginning with `T' are text frames.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
2 for UTF-16BE encoding and 3 for UTF-8 encoding.
\begin{table}[h]
  {\relsize{-2}
  \begin{tabular}{|c|l||c|l|}
    \hline
    ID & Meaning & ID & Meaning \\
    \hline
    \texttt{TALB} & album name &
    \texttt{TBPM} & beats-per-minute \\
    \texttt{TCOM} & composer &
    \texttt{TCON} & content type \\
    \texttt{TCOP} & copyright message &
    \texttt{TDAT} & date \\
    \texttt{TDEN} & encoding time &
    \texttt{TDLY} & playlist delay \\
    \texttt{TDOR} & original release time &
    \texttt{TDRC} & recording time \\
    \texttt{TDRL} & release time &
    \texttt{TDTG} & tagging time \\
    \texttt{TENC} & encoded by &
    \texttt{TEXT} & lyricist / text writer \\
    \texttt{TFLT} & file type &
    \texttt{TIPL} & involved people list \\
    \texttt{TIT1} & content group description &
    \texttt{TIT2} & title / songname / content description \\
    \texttt{TIT3} & subtitle / description refinement &
    \texttt{TKEY} & initial key \\
    \texttt{TLAN} & language(s) &
    \texttt{TLEN} & length \\
    \texttt{TMCL} & musician credits list &
    \texttt{TMED} & media type \\
    \texttt{TMOO} & mood &
    \texttt{TOAL} & original album / movie / show title \\
    \texttt{TOFN} & original filename &
    \texttt{TOLY} & original lyricist(s) / text writer(s) \\
    \texttt{TOPE} & original artist(s) / performer(s) &
    \texttt{TOWN} & file owner / licensee \\
    \texttt{TPE1} & lead performer(s) / soloist(s) &
    \texttt{TPE2} & band / orchestra / accompaniment \\
    \texttt{TPE3} & conductor / performer refinement &
    \texttt{TPE4} & interpreted, remixed, or otherwise modified by \\
    \texttt{TPOS} & part of a set &
    \texttt{TPRO} & produced notice \\
    \texttt{TPUB} & publisher &
    \texttt{TRCK} & track number / position in set \\
    \texttt{TRSN} & internet radio station name &
    \texttt{TRSO} & internet radio station owner \\
    \texttt{TSOA} & album sort order &
    \texttt{TSOP} & performer sort order \\
    \texttt{TSOT} & title sort order &
    \texttt{TSRC} & ISRC
(International Standard Recording Code) \\
    \texttt{TSSE} & software / hardware and encoding settings &
    \texttt{TSST} & set subtitle \\
    \hline
  \end{tabular}
  }
\end{table}

\clearpage

\subsubsection{Text Frame Example}
\begin{figure}[h]
  \includegraphics{figures/id3v24/t___-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
frame ID : & \texttt{"TIT2"} (title / song name / content description) \\
encoding : & \texttt{0} (Latin-1) \\
text : & \texttt{"Track Name"} \\
\end{tabular}
\end{table}

\subsubsection{User-Defined Text Frame}
\begin{figure}[h]
  \includegraphics{figures/id3v24/txxx.pdf}
\end{figure}
\par
\noindent
This frame is for user-defined text information.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
2 for UTF-16BE encoding and 3 for UTF-8 encoding.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.

\clearpage

\subsubsection{COMM Frame}
\begin{figure}[h]
\includegraphics{figures/id3v24/comm.pdf}
\end{figure}
\par
\noindent
This frame supports a lengthy string of comment text.
\VAR{encoding} determines the encoding of the NULL-terminated
\VAR{description} field and \VAR{comment text} field.
Valid encodings are \texttt{0} for Latin-1,
\texttt{1} for UTF-16,
\texttt{2} for UTF-16BE
and \texttt{3} for UTF-8.
\VAR{language} is a 3 byte ASCII string.

\subsubsection{COMM Frame Example}
\begin{figure}[h]
\includegraphics{figures/id3v24/comm-example.pdf}
\end{figure}
\begin{table}[h]
\begin{tabular}{rl}
encoding : & \texttt{0} (Latin-1) \\
language : & \texttt{"eng"} \\
description : & \texttt{""} (empty string) \\
comment text : & \texttt{"Comment Text"} \\
\end{tabular}
\end{table}

\clearpage

\subsubsection{URL Frames}
\begin{figure}[h]
\includegraphics{figures/id3v24/w___.pdf}
\end{figure}
\par
\noindent
Frames with IDs beginning with `W' are URL links to external resources.
\par
\begin{table}[h]
\begin{tabular}{|c|l|}
\hline
ID & Meaning \\
\hline
\texttt{WCOM} & commercial information \\
\texttt{WCOP} & copyright / legal information \\
\texttt{WOAF} & official audio file webpage \\
\texttt{WOAR} & official artist/performer webpage \\
\texttt{WOAS} & official audio source webpage \\
\texttt{WORS} & official internet radio station homepage \\
\texttt{WPAY} & payment \\
\texttt{WPUB} & publisher's official webpage \\
\hline
\end{tabular}
\end{table}

\clearpage

\subsubsection{User-Defined URL Frame}

\begin{figure}[h]
\includegraphics{figures/id3v24/wxxx.pdf}
\end{figure}
\par
\noindent
This is a frame for user-defined links to external resources.
\VAR{encoding} is 0 for Latin-1 encoding, 1 for UTF-16 encoding,
2 for UTF-16BE encoding and 3 for UTF-8 encoding.
\VAR{description} is a NULL-terminated string indicating
what the frame is for.

\clearpage

\subsubsection{GEOB Frame}
\begin{figure}[h]
  \includegraphics{figures/id3v24/geob.pdf}
\end{figure}
\par
\noindent
This frame contains an embedded file (i.e. a ``general encapsulated object'').
\VAR{MIME type} is a NULL-terminated Latin-1 string.
\VAR{filename} and \VAR{content description} are NULL-terminated
strings encoded as Latin-1, UTF-16, UTF-16BE or UTF-8,
depending on the \VAR{encoding} field.
\VAR{encapsulated object} is binary file data.

\clearpage

\subsubsection{USLT Frame}
\begin{figure}[h]
  \includegraphics{figures/id3v24/uslt.pdf}
\end{figure}
\par
\noindent
This frame is for unsynchronized (non-karaoke) lyrics text,
similar to a comment frame.
\VAR{language} is a 3 byte ASCII string.
\VAR{description} and \VAR{lyrics text} are NULL-terminated strings
encoded as Latin-1, UTF-16, UTF-16BE or UTF-8,
depending on the \VAR{encoding} field.