## README for file(1) Command and the libmagic(3) library ##

    @(#) $File: README,v 1.59 2019/09/19 01:04:01 christos Exp $

Mailing List: file@astron.com
Mailing List archives: http://mailman.astron.com/pipermail/file/
Bug tracker: http://bugs.astron.com/
E-mail: christos@astron.com
Build Status: https://travis-ci.org/file/file

Phone: Do not even think of telephoning me about this program. Send cash first!

This is Release 5.x of Ian Darwin's (copyright but distributable)
file(1) command, an implementation of the Unix File(1) command.
It knows the 'magic number' of several thousand file types.
This version is the standard "file" command for Linux,
*BSD, and other systems. (See "patchlevel.h" for the exact release number).

You can download the latest version of the original sources for file from:

    ftp://ftp.astron.com/pub/file/

A public read-only git repository of the same sources is available at:

    https://github.com/file/file

We are continuously being fuzzed by OSS-FUZZ:

    https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:file

The major changes for 5.x are CDF file parsing, indirect magic, name/use
(recursion) and an overhaul of MIME and ASCII encoding handling.

The major feature of 4.x is the refactoring of the code into a library,
and the re-write of the file command in terms of that library. The library
itself, libmagic, can be used by 3rd party programs that wish to identify
file types without having to fork() and exec() file. The prime contributor
for 4.0 was Mans Rullgard.

UNIX is a trademark of UNIX System Laboratories.

The prime contributor to Release 3.8 was Guy Harris, who put in megachanges
including byte-order independence.

The prime contributor to Release 3.0 was Christos Zoulas, who put
in hundreds of lines of source code changes, including his own
ANSIfication of the code (I liked my own ANSIfication better, but
his (__P()) is the "Berkeley standard" way of doing it, and I wanted UCB
to include the code...), his HP-like "indirection" (a feature of
the HP file command, I think), and his mods that finally got the
uncompress (-z) mode finished and working.

This release has compiled in numerous environments; see PORTING
for a list and problems.

This fine freeware file(1) follows the USG (System V) model of the file
command, rather than the Research (V7) version or the V7-derived 4.[23]
Berkeley one. That is, the file /etc/magic contains much of the ritual
information that is the source of this program's power. My version
knows a little more magic (including tar archives) than System V; the
/etc/magic parsing seems to be compatible with the (poorly documented)
System V /etc/magic format (with one exception; see the man page).

In addition, the /etc/magic file is built from a subdirectory
for easier(?) maintenance. I will act as a clearinghouse for
magic numbers assigned to all sorts of data files that
are in reasonable circulation. Send your magic numbers,
in magic(5) format please, to the maintainer, Christos Zoulas.

COPYING - read this first.
README - read this second (you are currently reading this file).
INSTALL - read on how to install
src/apprentice.c - parses /etc/magic to learn magic
src/apptype.c - used for OS/2 specific application type magic
src/ascmagic.c - third & last set of tests, based on hardwired assumptions.
src/asctime_r.c - replacement for OS's that don't have it.
src/asprintf.c - replacement for OS's that don't have it.
src/buffer.c - buffer handling functions.
src/cdf.[ch] - parser for Microsoft Compound Document Files
src/cdf_time.c - time converter for CDF.
src/compress.c - handles decompressing files to look inside.
src/ctime_r.c - replacement for OS's that don't have it.
src/der.[ch] - parser for Distinguished Encoding Rules
src/dprintf.c - replacement for OS's that don't have it.
src/elfclass.h - common code for elf 32/64.
src/encoding.c - handles unicode encodings
src/file.c - the main program
src/file.h - header file
src/file_opts.h - list of options
src/fmtcheck.c - replacement for OS's that don't have it.
src/fsmagic.c - first set of tests the program runs, based on filesystem info
src/funcs.c - utility functions
src/getline.c - replacement for OS's that don't have it.
src/getopt_long.c - replacement for OS's that don't have it.
src/gmtime_r.c - replacement for OS's that don't have it.
src/is_csv.c - knows about Comma Separated Value file format (RFC 4180).
src/is_json.c - knows about JavaScript Object Notation format (RFC 8259).
src/is_tar.c, tar.h - knows about Tape ARchive format (courtesy John Gilmore).
src/localtime_r.c - replacement for OS's that don't have it.
src/magic.h.in - source file for magic.h
src/mygetopt.h - replacement for OS's that don't have it.
src/magic.c - the libmagic api
src/names.h - header file for ascmagic.c
src/pread.c - replacement for OS's that don't have it.
src/print.c - print results, errors, warnings.
src/readcdf.c - CDF wrapper.
src/readelf.[ch] - Stand-alone elf parsing code.
src/softmagic.c - 2nd set of tests, based on /etc/magic
src/strcasestr.c - replacement for OS's that don't have it.
src/strlcat.c - replacement for OS's that don't have it.
src/strlcpy.c - replacement for OS's that don't have it.
src/strndup.c - replacement for OS's that don't have it.
src/tar.h - tar file definitions
src/vasprintf.c - for systems that don't have it.
doc/file.man - man page for the command
doc/magic.man - man page for the magic file, courtesy Guy Harris.
    Install as magic.4 on USG and magic.5 on V7 or Berkeley; cf Makefile.

Magdir - directory of /etc/magic pieces
------------------------------------------------------------------------------

If you submit a new magic entry please make sure you read the following
guidelines:

- The initial match is preferably at least 32 bits long, and is a _unique_ match
- If this is not feasible, use additional checks
- Matches of <= 16 bits are not accepted
- Delay printing strings as long as possible; don't print output too early
- Avoid printing arbitrary bytes as a string with printf, which can cause
  crashes and buffer overflows

- Provide complete information with the entry:
  * One-line short summary
  * Optional long description
  * File extension, if applicable
  * Full name and contact method (for discussion when the entry has a problem)
  * Further reference, such as documentation of the format

------------------------------------------------------------------------------

gpg for dummies:

$ gpg --verify file-X.YY.tar.gz.asc file-X.YY.tar.gz
gpg: assuming signed data in `file-X.YY.tar.gz'
gpg: Signature made WWW MMM DD HH:MM:SS YYYY ZZZ using DSA key ID KKKKKKKK

To download the key:

$ gpg --keyserver hkp://keys.gnupg.net --recv-keys KKKKKKKK

------------------------------------------------------------------------------


Parts of this software were developed at SoftQuad Inc., developers
of SGML/HTML/XML publishing software, in Toronto, Canada.
SoftQuad was swallowed up by Corel in 2002 and does not exist any longer.