| Name | Date | Size | #Lines | LOC |
| --- | --- | --- | --- | --- |
| .. | 03-May-2022 | - | | |
| dist/ | 03-May-2022 | - | | |
| doc/ | 03-May-2022 | - | 53 | 30 |
| m4/ | 29-Jan-2014 | - | 303 | 271 |
| man/ | 29-Jan-2014 | - | 2,633 | 2,290 |
| sample-hashes/ | 03-May-2022 | - | 14,021 | 14,010 |
| src/ | 03-May-2022 | - | 12,904 | 8,931 |
| tests/ | 03-May-2022 | - | 862,045 | 861,888 |
| .gitignore | 29-Jan-2014 | 366 | 31 | 28 |
| AUTHORS | 29-Jan-2014 | 97 | 4 | 2 |
| CONFIGURE_FC.sh | 29-Jan-2014 | 2.6 KiB | 79 | 62 |
| COPYING | 29-Jan-2014 | 18.6 KiB | 367 | 299 |
| ChangeLog | 29-Jan-2014 | 46.8 KiB | 1,460 | 798 |
| FILEFORMAT | 29-Jan-2014 | 2.1 KiB | 62 | 39 |
| INSTALL | 29-Jan-2014 | 9 KiB | 230 | 175 |
| Makefile.am | 29-Jan-2014 | 2.8 KiB | 106 | 89 |
| NEWS | 29-Jan-2014 | 17 KiB | 621 | 366 |
| README | 29-Jan-2014 | 0 | | |
| README.md | 29-Jan-2014 | 6.4 KiB | 178 | 140 |
| TODO | 29-Jan-2014 | 4 KiB | 84 | 75 |
| bootstrap.sh | 29-Jan-2014 | 133 | 8 | 7 |
| config.guess | 29-Jan-2014 | 40.4 KiB | 1,389 | 1,194 |
| config.sub | 29-Jan-2014 | 29.3 KiB | 1,490 | 1,349 |
| configure.ac | 29-Jan-2014 | 9 KiB | 271 | 227 |
| depcomp | 29-Jan-2014 | 11.8 KiB | 424 | 278 |
| install-sh | 29-Jan-2014 | 5.4 KiB | 252 | 153 |
| missing | 29-Jan-2014 | 6.7 KiB | 216 | 143 |
| mkinstalldirs | 29-Jan-2014 | 1.8 KiB | 100 | 72 |

README.md

This is md5deep, a set of cross-platform tools to compute hashes, or
message digests, for any number of files while optionally recursively
digging through the directory structure.  It can also take a list of known
hashes and display the filenames of input files whose hashes either do or
do not match any of the known hashes. This version supports MD5, SHA-1,
SHA-256, Tiger, and Whirlpool hashes.

See the file [NEWS](NEWS) for a list of changes between releases.

See the file [COPYING](COPYING) for information about the licensing for this program.

See the file [INSTALL](INSTALL) for (generic) compilation and installation
instructions. Here's the short version that should just work in many cases:

```shell
sh bootstrap.sh # runs autoconf, automake
./configure
make
make install
```

Note that you normally must be root to install to the default location.
The sudo command is helpful for doing so. You can specify an alternate
installation location using the --prefix option to the configure script.
For example, to install to /home/foo/bin, use:

>$ ./configure --prefix=/home/foo

There is complete documentation on how to use the program on the
project's homepage, [https://github.com/jessek/hashdeep](https://github.com/jessek/hashdeep)

## md5deep vs. hashdeep

For historical reasons, the program has different options and features
when run with the names "hashdeep" and "md5deep."

hashdeep has a feature called "audit" which:
> \* Can also use a list of known hashes to audit a set of FILES. Errors
>   are reported to standard error. If no FILES are specified, reads from
>   standard input.
>
> -a Audit mode. Each input file is compared against the set of knowns. An
>    audit is said to pass if each input file is matched against exactly
>    one file in set of knowns. Any collisions, new files, or missing files
>    will make the audit fail. Using this flag alone produces a message,
>    either "Audit passed" or "Audit Failed".
>
>    -v - prints the number of files in each category
>    -v -v = prints all discrepancies
>    -v -v -v = prints the results for every file examined and every known file.
>
> -k <file> - The -k option must be used to load the audit file

To perform an audit:
>  hashdeep -r dir  > /tmp/auditfile            # Generate the audit file
>  hashdeep -a -k /tmp/auditfile -r dir         # test the audit

Notice that the audit is performed with a standard hashdeep output
file. (Internally, the audit is computed as part of the hashing process.)

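The pass/fail rule quoted above can be sketched in a few lines of C++. This is a minimal illustration, not hashdeep's implementation: the `FileHash` and `AuditResult` types and the `audit()` function are hypothetical, each file is represented by a single hex digest, and collisions are not modeled (an input with no unmatched known left to pair with is simply counted as new).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one file name and its hex digest.
struct FileHash {
    std::string name;
    std::string hash;
};

// Hypothetical summary of an audit run.
struct AuditResult {
    std::size_t matched;    // inputs paired with exactly one known
    std::size_t new_files;  // inputs with no unmatched known left
    std::size_t missing;    // knowns that no input matched
    bool passed() const { return new_files == 0 && missing == 0; }
};

AuditResult audit(const std::vector<FileHash>& knowns,
                  const std::vector<FileHash>& inputs) {
    // digest -> number of knowns with that digest not yet matched
    std::map<std::string, std::size_t> unmatched;
    for (const FileHash& k : knowns) ++unmatched[k.hash];

    AuditResult r = {0, 0, 0};
    for (const FileHash& in : inputs) {
        std::map<std::string, std::size_t>::iterator it = unmatched.find(in.hash);
        if (it != unmatched.end() && it->second > 0) {
            --it->second;   // pair this input with one known
            ++r.matched;
        } else {
            ++r.new_files;  // nothing left to pair it with
        }
    }
    for (const auto& kv : unmatched) r.missing += kv.second;

    assert(r.matched + r.new_files == inputs.size());  // every input is classified
    return r;
}
```

In this sketch an audit passes only when `new_files` and `missing` are both zero, which mirrors the "matched against exactly one file" wording of the option description.
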
## Unicode Issues
Modern POSIX-based computer systems consider filenames to be a
sequence of bytes that are rendered as the application wishes. This
means that filenames typically contain ASCII but can contain UTF-8,
UTF-16, latin1, or even invalid Unicode codings.

Windows-based systems have one set of API calls for ASCII-based
filenames and another set for filenames encoded as UCS-2, which
"produces a fixed-length format by simply using the code point as the
16-bit code unit and produces exactly the same result as UTF-16 for
63,488 code points in the range 0-0xFFFF" according to
[Wikipedia](http://en.wikipedia.org/wiki/UTF-16/UCS-2). But Wikipedia disputes the
factual accuracy of this statement on the talk page. It's pretty clear
that nobody is entirely sure what Windows actually does, and Windows
itself may not be consistent.

Version 3 of this program addressed this issue by using the TCHAR
type to hold filenames on Windows and by refusing to print them,
printing a "?" instead. Version 4 of this program translates TCHAR
strings to std::string strings at the soonest opportunity using the
[Windows function WideCharToMultiByte](http://msdn.microsoft.com/en-us/library/dd374130%28v=vs.85%29.aspx). Flags
have been added to escape Unicode when it is printed.

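Escaping on output can be pictured with a small portable sketch. The function name `escape_utf8` and the `U+XXXX` output format are assumptions made for illustration, not a description of what the program's flags actually emit. The sketch copies ASCII bytes through, replaces each well-formed multi-byte UTF-8 sequence with its code point, and prints "?" for bytes it cannot decode. It does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Sketch only: replace each non-ASCII UTF-8 sequence with "U+XXXX".
// The output format is an assumption for illustration.
std::string escape_utf8(const std::string& s) {
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) {            // plain ASCII: copy unchanged
            out += s[i++];
            continue;
        }
        // The lead byte gives the sequence length and the first payload bits.
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }

        bool ok = (len != 0) && (i + len <= s.size());
        for (std::size_t j = 1; ok && j < len; ++j) {
            unsigned char cc = static_cast<unsigned char>(s[i + j]);
            if ((cc & 0xC0) != 0x80) ok = false;   // not a continuation byte
            else cp = (cp << 6) | (cc & 0x3F);
        }
        if (!ok) {                 // undecodable byte: print a placeholder
            out += '?';
            ++i;
            continue;
        }
        char buf[16];
        std::snprintf(buf, sizeof buf, "U+%04lX", cp);
        out += buf;
        i += len;
    }
    return out;
}
```
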
There is no way (apparently) on Windows to open a UTF-8 filename; it needs to be
converted back to a wide-character filename with MultiByteToWideChar.

Fortunately, we never really need to convert back.

Notice that on Windows the files hashed can have Unicode characters
but the file with the hashes must have an ASCII name.

COMPILING FOR WINDOWS:
> -D_UNICODE causes TCHAR to be defined as 'wchar_t'.

COMPILING FOR POSIX:
> -D_UNICODE is not defined, causing TCHAR to be defined as 'char'.

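The effect of the flag can be pictured as a compile-time switch. A minimal sketch, assuming nothing beyond the two definitions above: on Windows the real `TCHAR` comes from `<tchar.h>`, and the names `tchar_sketch`, `TSTR`, and `tstrlen` are hypothetical, chosen so that the fragment does not collide with the real headers.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Sketch of the switch that -D_UNICODE flips. All names are hypothetical.
#ifdef _UNICODE
typedef wchar_t tchar_sketch;                 // wide characters
#define TSTR(s) L##s                          // wide string literal
inline std::size_t tstrlen(const tchar_sketch* s) { return std::wcslen(s); }
#else
typedef char tchar_sketch;                    // narrow characters
#define TSTR(s) s                             // ordinary string literal
inline std::size_t tstrlen(const tchar_sketch* s) { return std::strlen(s); }
#endif
```

Code written against `tchar_sketch` and `TSTR` compiles either way, for example `tstrlen(TSTR("abc"))`, but every string function needs a matching pair of definitions.
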
Previously, win32 functions were controlled with #ifdef statements, like this:

```C
#ifdef _WIN32
  _wfullpath(d_name,fn,PATH_MAX);
#else
  if (NULL == realpath(fn,d_name))
    return TRUE;
#endif
```

There was also a file called tchar-local.h which actually changed the semantics
of functions on different platforms, with things like this:

```C
   #define  _tcsncpy   strncpy
   #define  _tstat_t   struct stat
```

This made the code very difficult to maintain.

With the 4.0 rewrite, we have replaced this code with C++ functions that return
objects where possible and that avoid the use of #defines. Platform-specific calls
such as realpath() are now wrapped in a single function, so the mainline code
does not call realpath() directly. You can see this in cycles.cpp:

```C
/* Return the canonicalized absolute pathname in UTF-8 on Windows and POSIX systems */
std::string get_realpath(const TCHAR *fn)
{
#ifdef _WIN32
    /*
     * expand a relative path to the full path.
     * http://msdn.microsoft.com/en-us/library/506720ff(v=vs.80).aspx
     */
    TCHAR absPath[PATH_MAX];
    /* _tfullpath is the generic-text form of _fullpath, so it matches TCHAR */
    if(_tfullpath(absPath,fn,PATH_MAX)==0) return "";
    return tchar_to_utf8(absPath);
#else
    char resolved_name[PATH_MAX];
    /* an empty string signals that the path could not be resolved */
    if(realpath(fn,resolved_name)==0) return "";
    return std::string(resolved_name);
#endif
}
```

You can install mingw and then simply configure with something like this:
>$ export PATH=$PATH:/usr/local/i386-mingw32-4.3.0/bin
>$ ./configure --host=i386-mingw32

## Hash Algorithm References

The MD5 algorithm is defined in RFC 1321:
http://www.ietf.org/rfc/rfc1321.txt

The SHA1 algorithm is defined in FIPS 180-1:
http://www.itl.nist.gov/fipspubs/fip180-1.htm

The SHA256 algorithm is defined in FIPS 180-2:
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf

The Tiger algorithm is defined at:
http://www.cs.technion.ac.il/~biham/Reports/Tiger/

The Whirlpool algorithm is defined at:
http://planeta.terra.com.br/informatica/paulobarreto/WhirlpoolPage.html

## Theory of Operation

* main.cpp
  * sets up the system
* dig.cpp
  * iterates through the individual directories
  * calls hash_file() in hash.cpp for each file to hash
* hash.cpp
  * performs the hashing of each file
* display.cpp
  * stores/displays the results

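The division of labor listed above can be sketched as a small C++17 program fragment. This is an illustration of the flow only, not the project's code. `dig()` and `hash_file()` borrow their names from the list, but their signatures here are assumptions, the display step is reduced to a callback, and 64-bit FNV-1a stands in for the real digests (it is not one of the hashes the tool supports).

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <functional>
#include <istream>

namespace fs = std::filesystem;

// Stand-in digest: 64-bit FNV-1a, used only to keep the sketch short.
std::uint64_t fnv1a(std::istream& in) {
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    char c;
    while (in.get(c)) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ULL;                   // FNV prime
    }
    return h;
}

// hash.cpp role: hash the contents of one file.
// An unreadable file yields the hash of empty input in this sketch.
std::uint64_t hash_file(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    return fnv1a(in);
}

// dig.cpp role: walk the tree and hash every regular file.
// display.cpp role: played by the callback, which receives each result.
void dig(const fs::path& root,
         const std::function<void(const fs::path&, std::uint64_t)>& display) {
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file()) {
            display(entry.path(), hash_file(entry.path()));
        }
    }
}
```

The main.cpp role then shrinks to parsing options and calling `dig()` with a callback that prints or stores each path and digest.
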