//! # Fingerprints
//!
//! This module implements change-tracking so that Cargo can know whether or
//! not something needs to be recompiled. A Cargo `Unit` can be either "dirty"
//! (needs to be recompiled) or "fresh" (it does not need to be recompiled).
//! There are several mechanisms that influence a Unit's freshness:
//!
//! - The `Fingerprint` is a hash, saved to the filesystem in the
//!   `.fingerprint` directory, that tracks information about the Unit. If the
//!   fingerprint is missing (such as the first time the unit is being
//!   compiled), then the unit is dirty. If any of the fingerprint fields
//!   change (like the name of the source file), then the Unit is considered
//!   dirty.
//!
//!   The `Fingerprint` also tracks the fingerprints of all its dependencies,
//!   so a change in a dependency will propagate the "dirty" status up.
//!
//! - Filesystem mtime tracking is also used to check if a unit is dirty.
//!   See the section below on "Mtime comparison" for more details. There
//!   are essentially two parts to mtime tracking:
//!
//!   1. The mtime of a Unit's output files is compared to the mtimes of all
//!      its dependencies' output files (see `check_filesystem`). If any
//!      output is missing, or is older than a dependency's output, then the
//!      unit is dirty.
//!   2. The mtime of a Unit's source files is compared to the mtime of its
//!      dep-info file in the fingerprint directory (see `find_stale_file`).
//!      The dep-info file is used as an anchor to know when the last build of
//!      the unit was done. See the "dep-info files" section below for more
//!      details. If any input files are missing, or are newer than the
//!      dep-info, then the unit is dirty.
//!
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
//! is notoriously imprecise and problematic. Only a small part of the
//! environment is captured. This is a balance of performance, simplicity, and
//! completeness. Sandboxing, hashing file contents, tracking every file
//! access, environment variable, and network operation would ensure more
//! reliable and reproducible builds at the cost of being complex, slow, and
//! platform-dependent.
//!
//! ## Fingerprints and Metadata
//!
//! The `Metadata` hash is a hash added to the output filenames to isolate
//! each unit. See the documentation in the `compilation_files` module for
//! more details. NOTE: Not all output files are isolated via filename hashes
//! (like dylibs). The fingerprint directory uses a hash, but sometimes units
//! share the same fingerprint directory (when they don't have Metadata) so
//! care should be taken to handle this!
//!
//! Fingerprints and Metadata are similar, and track some of the same things.
//! The Metadata contains information that is required to keep Units separate.
//! The Fingerprint includes additional information that should cause a
//! recompile without changing the output filenames. A comparison of what is
//! tracked:
//!
//! Value                                      | Fingerprint | Metadata
//! -------------------------------------------|-------------|----------
//! rustc                                      | ✓           | ✓
//! Profile                                    | ✓           | ✓
//! `cargo rustc` extra args                   | ✓           | ✓
//! CompileMode                                | ✓           | ✓
//! Target Name                                | ✓           | ✓
//! TargetKind (bin/lib/etc.)                  | ✓           | ✓
//! Enabled Features                           | ✓           | ✓
//! Immediate dependency’s hashes              | ✓[^1]       | ✓
//! CompileKind (host/target)                  | ✓           | ✓
//! __CARGO_DEFAULT_LIB_METADATA[^4]           |             | ✓
//! package_id                                 |             | ✓
//! authors, description, homepage, repo       | ✓           |
//! Target src path relative to ws             | ✓           |
//! Target flags (test/bench/for_host/edition) | ✓           |
//! -C incremental=… flag                      | ✓           |
//! mtime of sources                           | ✓[^3]       |
//! RUSTFLAGS/RUSTDOCFLAGS                     | ✓           |
//! LTO flags                                  | ✓           | ✓
//! config settings[^5]                        | ✓           |
//! is_std                                     |             | ✓
//!
//! [^1]: Build script and bin dependencies are not included.
//!
//! [^3]: See below for details on mtime tracking.
//!
//! [^4]: `__CARGO_DEFAULT_LIB_METADATA` is set by rustbuild to embed the
//!       release channel (bootstrap/stable/beta/nightly) in libstd.
//!
//! [^5]: Config settings that are not otherwise captured anywhere else.
//!       Currently, this is only `doc.extern-map`.
//!
//! When deciding what should go in the Metadata vs the Fingerprint, consider
//! that some files (like dylibs) do not have a hash in their filename. Thus,
//! if a value changes, only the fingerprint will detect the change (consider,
//! for example, swapping between different features). Fields that are only in
//! Metadata generally aren't relevant to the fingerprint because they
//! fundamentally change the output (like target vs host changes the directory
//! where it is emitted).
//!
//! ## Fingerprint files
//!
//! Fingerprint information is stored in the
//! `target/{debug,release}/.fingerprint/` directory. Each Unit is stored in a
//! separate directory. Each Unit directory contains:
//!
//! - A file with a 16 hex-digit hash. This is the Fingerprint hash, used for
//!   quick loading and comparison.
//! - A `.json` file that contains details about the Fingerprint. This is only
//!   used to log details about *why* a fingerprint is considered dirty.
//!   `CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build` can be
//!   used to display this log information.
//! - A "dep-info" file which is a translation of rustc's `*.d` dep-info files
//!   to a Cargo-specific format that tweaks file names and is optimized for
//!   reading quickly.
//! - An `invoked.timestamp` file whose filesystem mtime is updated every time
//!   the Unit is built. This is used for capturing the time when the build
//!   starts, to detect if files are changed in the middle of the build. See
//!   below for more details.
//!
//! Note that some units are a little different. A Unit for *running* a build
//! script or for `rustdoc` does not have a dep-info file (it's not
//! applicable). Build script `invoked.timestamp` files are in the build
//! output directory.
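//!
//! The quick comparison against the hash file can be pictured with the
//! following minimal sketch. The helper names `to_hex16` and
//! `stored_hash_matches` are hypothetical, and the exact hex encoding Cargo
//! writes may differ from the plain `{:016x}` formatting assumed here.
//!
//! ```rust
//! use std::fs;
//! use std::io;
//! use std::path::Path;
//!
//! /// Renders a hash as 16 hex digits (assumed encoding, for illustration).
//! fn to_hex16(hash: u64) -> String {
//!     format!("{:016x}", hash)
//! }
//!
//! /// Returns `Ok(true)` when the hash stored on disk equals `new_hash`.
//! /// A missing or unreadable file surfaces as an error, which a caller
//! /// would treat the same way as a dirty unit.
//! fn stored_hash_matches(hash_file: &Path, new_hash: u64) -> io::Result<bool> {
//!     let stored = fs::read_to_string(hash_file)?;
//!     Ok(stored.trim() == to_hex16(new_hash))
//! }
//! ```
//!
//! Only when this cheap check fails is there any need to consult the `.json`
//! file to explain *why* the unit is dirty.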
//!
//! ## Fingerprint calculation
//!
//! After the list of Units has been calculated, the Units are added to the
//! `JobQueue`. As each one is added, the fingerprint is calculated, and the
//! dirty/fresh status is recorded. A closure is used to update the fingerprint
//! on-disk when the Unit successfully finishes. The closure will recompute the
//! Fingerprint based on the updated information. If the Unit fails to compile,
//! the fingerprint is not updated.
//!
//! Fingerprints are cached in the `Context`. This makes computing
//! Fingerprints faster, but also is necessary for properly updating
//! dependency information. Since a Fingerprint holds the Fingerprints of all
//! of its dependencies as `Arc` clones, it automatically picks up updates to
//! its dependencies when it is recomputed.
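//!
//! The effect of sharing dependency fingerprints through `Arc` can be
//! illustrated with a pared-down sketch. `SketchFingerprint` is a
//! hypothetical stand-in, not the real `Fingerprint`: it keeps only a
//! `local` value and a list of dependencies, and it omits hash memoization.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::sync::{Arc, Mutex};
//!
//! /// Hypothetical, simplified stand-in for `Fingerprint`.
//! struct SketchFingerprint {
//!     /// Stands in for the `local` field, which may be replaced after a build.
//!     local: Mutex<String>,
//!     /// Shared handles to the fingerprints of dependencies.
//!     deps: Vec<Arc<SketchFingerprint>>,
//! }
//!
//! impl SketchFingerprint {
//!     /// Hashes this unit's own state together with every dependency's hash.
//!     fn hash_u64(&self) -> u64 {
//!         let mut hasher = DefaultHasher::new();
//!         self.local.lock().unwrap().hash(&mut hasher);
//!         for dep in &self.deps {
//!             dep.hash_u64().hash(&mut hasher);
//!         }
//!         hasher.finish()
//!     }
//! }
//! ```
//!
//! Because the parent holds an `Arc` to the same value that the dependency's
//! job later mutates, recomputing the parent's hash observes the change
//! without any explicit notification.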
//!
//! ### dep-info files
//!
//! Cargo passes the `--emit=dep-info` flag to `rustc` so that `rustc` will
//! generate a "dep info" file (with the `.d` extension). This is a
//! Makefile-like syntax that includes all of the source files used to build
//! the crate. This file is used by Cargo to know which files to check to see
//! if the crate will need to be rebuilt.
//!
//! After `rustc` exits successfully, Cargo will read the dep info file and
//! translate it into a binary format that is stored in the fingerprint
//! directory (`translate_dep_info`). The mtime of the fingerprint dep-info
//! file itself is used as the reference for comparing the source files to
//! determine if any of the source files have been modified (see below for
//! more detail). Note that Cargo parses the special `# env-dep:...` comments in
//! dep-info files to learn about environment variables that the rustc compile
//! depends on. Cargo then later uses this to trigger a recompile if a
//! referenced env var changes (even if the source didn't change).
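//!
//! As a rough illustration of what is read out of a `.d` file, here is a
//! minimal parsing sketch. The type `DepInfoSketch` and the function
//! `parse_dep_info` are hypothetical, the comment prefix is assumed to be
//! `# env-dep:`, and paths containing escaped spaces are deliberately not
//! handled.
//!
//! ```rust
//! /// What this sketch extracts from a rustc `.d` file.
//! #[derive(Debug, Default, PartialEq)]
//! struct DepInfoSketch {
//!     /// Files listed as prerequisites of any target, without duplicates.
//!     files: Vec<String>,
//!     /// Environment variables the compile read; `None` means "was unset".
//!     env: Vec<(String, Option<String>)>,
//! }
//!
//! fn parse_dep_info(contents: &str) -> DepInfoSketch {
//!     let mut out = DepInfoSketch::default();
//!     for line in contents.lines() {
//!         if let Some(rest) = line.strip_prefix("# env-dep:") {
//!             // `NAME=value` records a value, a bare `NAME` an unset variable.
//!             let mut parts = rest.splitn(2, '=');
//!             let name = parts.next().unwrap_or("").to_string();
//!             out.env.push((name, parts.next().map(str::to_string)));
//!         } else if line.starts_with('#') || line.trim().is_empty() {
//!             continue;
//!         } else if let Some((_target, deps)) = line.split_once(": ") {
//!             // `target: dep1 dep2`; lines such as `src/lib.rs:` have no deps.
//!             for dep in deps.split_whitespace() {
//!                 if !out.files.iter().any(|f| f == dep) {
//!                     out.files.push(dep.to_string());
//!                 }
//!             }
//!         }
//!     }
//!     out
//! }
//! ```
//!
//! The resulting file list is what gets mtime-checked on later builds, and
//! the environment list is what gets compared against the current
//! environment.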
//!
//! There is also a third dep-info file. Cargo will extend the file created by
//! rustc with some additional information and save this into the output
//! directory. This is intended for build system integration. See the
//! `output_depinfo` module for more detail.
//!
//! #### -Zbinary-dep-depinfo
//!
//! `rustc` has an experimental flag `-Zbinary-dep-depinfo`. This causes
//! `rustc` to include binary files (like rlibs) in the dep-info file. This is
//! primarily to support rustc development, so that Cargo can check the
//! implicit dependency to the standard library (which lives in the sysroot).
//! We want Cargo to recompile whenever the standard library rlib/dylibs
//! change, and this is a generic mechanism to make that work.
//!
//! ### Mtime comparison
//!
//! The use of modification timestamps is the most common way a unit will be
//! determined to be dirty or fresh between builds. There are many subtle
//! issues and edge cases with mtime comparisons. This gives a high-level
//! overview, but you'll need to read the code for the gritty details. Mtime
//! handling is different for different unit kinds. The different styles are
//! driven by the `Fingerprint.local` field, which is set based on the unit
//! kind.
//!
//! The status of whether or not the mtime is "stale" or "up-to-date" is
//! stored in `Fingerprint.fs_status`.
//!
//! Every unit compares the mtime of its newest output file with the mtimes
//! of the outputs of all its dependencies. If any output file is missing,
//! then the unit is stale. If any dependency is newer, the unit is stale.
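//!
//! The output comparison just described can be sketched as a pure function
//! over already-collected mtimes. `outputs_are_stale` is a hypothetical
//! helper, and treating a unit without any outputs as fresh is a choice made
//! for this sketch.
//!
//! ```rust
//! use std::time::SystemTime;
//!
//! /// Returns `true` when the unit must be considered stale.
//! /// `outputs` holds the mtime of each output file (`None` if it is missing).
//! /// `dep_outputs` holds the mtimes of the dependencies' output files.
//! fn outputs_are_stale(outputs: &[Option<SystemTime>], dep_outputs: &[SystemTime]) -> bool {
//!     let mut newest: Option<SystemTime> = None;
//!     for mtime in outputs {
//!         match mtime {
//!             // A missing output always makes the unit stale.
//!             None => return true,
//!             Some(t) => newest = Some(newest.map_or(*t, |n| n.max(*t))),
//!         }
//!     }
//!     let newest = match newest {
//!         Some(t) => t,
//!         // Nothing to compare against (sketch assumption: treat as fresh).
//!         None => return false,
//!     };
//!     // Stale as soon as one dependency output is newer than our newest output.
//!     dep_outputs.iter().any(|dep| *dep > newest)
//! }
//! ```
//!
//! Equal timestamps are not treated as stale here, which matters on
//! filesystems with coarse timestamp resolution.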
//!
//! #### Normal package mtime handling
//!
//! `LocalFingerprint::CheckDepInfo` is used for checking the mtime of
//! packages. It compares the mtime of the input files (the source files) to
//! the mtime of the dep-info file (which is written last after a build is
//! finished). If the dep-info is missing, the unit is stale (it has never
//! been built). The list of input files comes from the dep-info file. See the
//! section above for details on dep-info files.
//!
//! Also note that although registry and git packages use `CheckDepInfo`, none
//! of their source files are included in the dep-info (see
//! `translate_dep_info`), so for those kinds no mtime checking is done
//! (unless `-Zbinary-dep-depinfo` is used). Registry and git packages are
//! static, so there is no need to check anything.
//!
//! When a build is complete, the mtime of the dep-info file in the
//! fingerprint directory is modified to rewind it to the time when the build
//! started. This is done by creating an `invoked.timestamp` file when the
//! build starts to capture the start time. The mtime is rewound to the start
//! to handle the case where the user modifies a source file while a build is
//! running. Cargo can't know whether or not the file was included in the
//! build, so it takes a conservative approach of assuming the file was *not*
//! included, and it should be rebuilt during the next build.
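//!
//! To see why the rewind matters, consider this minimal sketch of the
//! source-versus-anchor comparison. `first_stale_input` is a hypothetical
//! helper operating on mtimes that were already read from disk.
//!
//! ```rust
//! use std::time::SystemTime;
//!
//! /// Returns the first input that is missing or newer than the dep-info
//! /// anchor. Each input is `(path, mtime)`, with `None` for a missing file.
//! fn first_stale_input<'a>(
//!     dep_info_mtime: SystemTime,
//!     inputs: &'a [(&'a str, Option<SystemTime>)],
//! ) -> Option<&'a str> {
//!     inputs.iter().find_map(|(path, mtime)| match mtime {
//!         // Not newer than the anchor: this input is fine.
//!         Some(t) if *t <= dep_info_mtime => None,
//!         // Missing, or modified after the anchor.
//!         _ => Some(*path),
//!     })
//! }
//! ```
//!
//! Suppose a build starts at time 100, a source file is edited at 105, and
//! the build finishes at 110. With the anchor left at 110 the edit would go
//! unnoticed, while an anchor rewound to 100 reports the edited file as stale
//! on the next build.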
//!
//! #### Rustdoc mtime handling
//!
//! Rustdoc does not emit a dep-info file, so Cargo currently has a relatively
//! simple system for detecting rebuilds. `LocalFingerprint::Precalculated` is
//! used for rustdoc units. For registry packages, this is the package
//! version. For git packages, it is the git hash. For path packages, it is
//! a string of the mtime of the newest file in the package.
//!
//! There are some known bugs with how this works, so it should be improved at
//! some point.
//!
//! #### Build script mtime handling
//!
//! Build script mtime handling runs in different modes. There is the "old
//! style" where the build script does not emit any `rerun-if` directives. In
//! this mode, Cargo will use `LocalFingerprint::Precalculated`. See the
//! "rustdoc" section above for how it works.
//!
//! In the new-style, each `rerun-if` directive is translated to the
//! corresponding `LocalFingerprint` variant. The `RerunIfChanged` variant
//! compares the mtime of the given filenames against the mtime of the
//! "output" file.
//!
//! Similar to normal units, the build script "output" file mtime is rewound
//! to the time just before the build script is executed to handle mid-build
//! modifications.
//!
//! ## Considerations for inclusion in a fingerprint
//!
//! Over time we've realized a few items which historically were included in
//! fingerprint hashings should not actually be included. Examples are:
//!
//! * Modification time values. We strive to never include a modification time
//!   inside a `Fingerprint` to get hashed into an actual value. While
//!   theoretically fine to do, in practice this causes issues with common
//!   applications like Docker. Docker, after a layer is built, will zero out
//!   the nanosecond part of all filesystem modification times. This means that
//!   the actual modification time is different for all build artifacts, which
//!   if we tracked the actual values of modification times would cause
//!   unnecessary recompiles. To fix this we instead only track paths which are
//!   relevant. These paths are checked dynamically to see if they're up to
//!   date, and the modification time doesn't make its way into the fingerprint
//!   hash.
//!
//! * Absolute path names. We strive to maintain a property where if you rename
//!   a project directory Cargo will continue to preserve all build artifacts
//!   and reuse the cache. This means that we can't ever hash an absolute path
//!   name. Instead we always hash relative path names and the "root" is passed
//!   in at runtime dynamically. Some of this is best effort, but the general
//!   idea is that we assume all accesses within a crate stay within that
//!   crate.
//!
//! These are pretty tricky to test for unfortunately, but we should have a good
//! test suite nowadays and lord knows Cargo gets enough testing in the wild!
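//!
//! The rule about absolute path names can be illustrated with a small sketch.
//! `hash_relative` is a hypothetical helper, and `DefaultHasher` merely
//! stands in for whatever stable hasher is actually used.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::path::Path;
//!
//! /// Hashes `path` relative to `root` when it lives under `root`, so that
//! /// the result does not depend on where the project directory is located.
//! fn hash_relative(root: &Path, path: &Path) -> u64 {
//!     // Fall back to the path as given if it is not under `root`.
//!     let relative = path.strip_prefix(root).unwrap_or(path);
//!     let mut hasher = DefaultHasher::new();
//!     relative.hash(&mut hasher);
//!     hasher.finish()
//! }
//! ```
//!
//! With this shape, `/home/a/proj/src/lib.rs` under root `/home/a/proj` and
//! `/tmp/renamed/src/lib.rs` under root `/tmp/renamed` hash identically,
//! which is the property that lets a renamed project keep its cache.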
//!
//! ## Build scripts
//!
//! The *running* of a build script (`CompileMode::RunCustomBuild`) is treated
//! significantly differently from all other Unit kinds. It has its own function
//! for calculating the Fingerprint (`calculate_run_custom_build`) and has some
//! unique considerations. It does not track the same information as a normal
//! Unit. The information tracked depends on the `rerun-if-changed` and
//! `rerun-if-env-changed` statements produced by the build script. If the
//! script does not emit either of these statements, the Fingerprint runs in
//! "old style" mode where an mtime change of *any* file in the package will
//! cause the build script to be re-run. Otherwise, the fingerprint *only*
//! tracks the individual "rerun-if" items listed by the build script.
//!
//! The "rerun-if" statements from a *previous* build are stored in the build
//! output directory in a file called `output`. Cargo parses this file when
//! the Unit for that build script is prepared for the `JobQueue`. The
//! Fingerprint code can then use that information to compute the Fingerprint
//! and compare against the old fingerprint hash.
//!
//! Care must be taken with build script Fingerprints because the
//! `Fingerprint::local` value may be changed after the build script runs
//! (such as if the build script adds or removes "rerun-if" items).
//!
//! Another complication is if a build script is overridden. In that case, the
//! fingerprint is the hash of the output of the override.
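//!
//! The choice between "old style" and explicit tracking described in this
//! section can be sketched as follows. `RerunMode` and `rerun_mode` are
//! hypothetical names, and the sketch assumes directives are printed one per
//! line with the `cargo:` prefix.
//!
//! ```rust
//! /// Hypothetical summary of what a build script asked Cargo to watch.
//! #[derive(Debug, PartialEq)]
//! enum RerunMode {
//!     /// No "rerun-if" statement was printed: any file change reruns it.
//!     OldStyle,
//!     /// Only the listed paths and environment variables are tracked.
//!     Explicit { paths: Vec<String>, env_vars: Vec<String> },
//! }
//!
//! fn rerun_mode(build_script_stdout: &str) -> RerunMode {
//!     let mut paths = Vec::new();
//!     let mut env_vars = Vec::new();
//!     for line in build_script_stdout.lines() {
//!         if let Some(p) = line.strip_prefix("cargo:rerun-if-changed=") {
//!             paths.push(p.to_string());
//!         } else if let Some(v) = line.strip_prefix("cargo:rerun-if-env-changed=") {
//!             env_vars.push(v.to_string());
//!         }
//!     }
//!     if paths.is_empty() && env_vars.is_empty() {
//!         RerunMode::OldStyle
//!     } else {
//!         RerunMode::Explicit { paths, env_vars }
//!     }
//! }
//! ```
//!
//! Since the mode is derived from the script's own output, it can differ
//! between two consecutive runs, which is why the `local` value has to be
//! regenerated after the script executes.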
//!
//! ## Special considerations
//!
//! Registry dependencies do not track the mtime of files. This is because
//! registry dependencies are not expected to change (if a new version is
//! used, the Package ID will change, causing a rebuild).
//!
//! Cargo currently partially works with Docker caching. When a Docker image
//! is built, it has normal mtime information. However, when a step is cached,
//! the nanosecond portion of every file's mtime is zeroed out. Currently this
//! works, but care must be taken for situations like these.
//!
//! HFS on macOS only supports 1 second timestamps. This causes a significant
//! number of problems, particularly with Cargo's testsuite which does rapid
//! builds in succession. Other filesystems have various degrees of
//! resolution.
//!
//! Various weird filesystems (such as network filesystems) also can cause
//! complications. Network filesystems may track the time on the server
//! (except when the time is set manually such as with
//! `filetime::set_file_times`). Not all filesystems support modifying the
//! mtime.
//!
//! See the `A-rebuild-detection` label on the issue tracker for more:
//! <https://github.com/rust-lang/cargo/issues?q=is%3Aissue+is%3Aopen+label%3AA-rebuild-detection>

use std::collections::hash_map::{Entry, HashMap};
use std::convert::TryInto;
use std::env;
use std::hash::{self, Hash, Hasher};
use std::path::{Path, PathBuf};
use std::str;
use std::sync::{Arc, Mutex};
use std::time::SystemTime;

use anyhow::{bail, format_err, Context as _};
use cargo_util::{paths, ProcessBuilder};
use filetime::FileTime;
use log::{debug, info};
use serde::de;
use serde::ser;
use serde::{Deserialize, Serialize};

use crate::core::compiler::unit_graph::UnitDep;
use crate::core::Package;
use crate::util;
use crate::util::errors::CargoResult;
use crate::util::interning::InternedString;
use crate::util::{internal, path_args, profile, StableHasher};
use crate::CARGO_ENV;

use super::custom_build::BuildDeps;
use super::job::{Job, Work};
use super::{BuildContext, Context, FileFlavor, Unit};

/// Determines if a `unit` is up-to-date, and if not prepares necessary work to
/// update the persisted fingerprint.
///
/// This function will inspect `unit`, calculate a fingerprint for it, and then
/// return an appropriate `Job` to run. The returned `Job` will be a noop if
/// `unit` is considered "fresh", or if it was previously built and cached.
/// Otherwise the `Job` returned will write out the true fingerprint to the
/// filesystem, to be executed after the unit's work has completed.
///
/// The `force` flag is a way to force the `Job` to be "dirty", or always
/// update the fingerprint. **Beware using this flag** because it does not
/// transitively propagate throughout the dependency graph, it only forces this
/// one unit which is very unlikely to be what you want unless you're
/// exclusively talking about top-level units.
pub fn prepare_target(cx: &mut Context<'_, '_>, unit: &Unit, force: bool) -> CargoResult<Job> {
    let _p = profile::start(format!(
        "fingerprint: {} / {}",
        unit.pkg.package_id(),
        unit.target.name()
    ));
    let bcx = cx.bcx;
    let loc = cx.files().fingerprint_file_path(unit, "");

    debug!("fingerprint at: {}", loc.display());

    // Figure out if this unit is up to date. After calculating the fingerprint
    // compare it to an old version, if any, and attempt to print diagnostic
    // information about failed comparisons to aid in debugging.
    let fingerprint = calculate(cx, unit)?;
    let mtime_on_use = cx.bcx.config.cli_unstable().mtime_on_use;
    let compare = compare_old_fingerprint(&loc, &*fingerprint, mtime_on_use);
    log_compare(unit, &compare);

    // If our comparison failed (i.e., we're going to trigger a rebuild of this
    // crate), then we also ensure the source of the crate passes all
    // verification checks before we build it.
    //
    // The `Source::verify` method is intended to allow sources to execute
    // pre-build checks to ensure that the relevant source code is all
    // up-to-date and as expected. This is currently used primarily for
    // directory sources which will use this hook to perform an integrity check
    // on all files in the source to ensure they haven't changed. If they have
    // changed then an error is issued.
    if compare.is_err() {
        let source_id = unit.pkg.package_id().source_id();
        let sources = bcx.packages.sources();
        let source = sources
            .get(source_id)
            .ok_or_else(|| internal("missing package source"))?;
        source.verify(unit.pkg.package_id())?;
    }

    if compare.is_ok() && !force {
        return Ok(Job::new_fresh());
    }

    // Clear out the old fingerprint file if it exists. This protects when
    // compilation is interrupted leaving a corrupt file. For example, a
    // project with a lib.rs and integration test (two units):
    //
    // 1. Build the library and integration test.
    // 2. Make a change to lib.rs (NOT the integration test).
    // 3. Build the integration test, hit Ctrl-C while linking. With gcc, this
    //    will leave behind an incomplete executable (zero size, or partially
    //    written). NOTE: The library builds successfully, it is the linking
    //    of the integration test that we are interrupting.
    // 4. Build the integration test again.
    //
    // Without the truncation below, step 3 would leave a valid fingerprint
    // on the disk. Step 4 would then think the integration test is "fresh"
    // because:
    //
    // - There is a valid fingerprint hash on disk (written in step 1).
    // - The mtime of the output file (the corrupt integration executable
    //   written in step 3) is newer than all of its dependencies.
    // - The mtime of the integration test fingerprint dep-info file (written
    //   in step 1) is newer than the integration test's source files, because
    //   we haven't modified any of its source files.
    //
    // But the executable is corrupt and needs to be rebuilt. Clearing the
    // fingerprint at step 3 ensures that Cargo never mistakes a partially
    // written output as up-to-date.
    if loc.exists() {
        // Truncate instead of delete so that compare_old_fingerprint will
        // still log the reason for the fingerprint failure instead of just
        // reporting "failed to read fingerprint" during the next build if
        // this build fails.
        paths::write(&loc, b"")?;
    }

    let write_fingerprint = if unit.mode.is_run_custom_build() {
        // For build scripts the `local` field of the fingerprint may change
        // while we're executing it. For example it could be in the legacy
        // "consider everything a dependency mode" and then we switch to "deps
        // are explicitly specified" mode.
        //
        // To handle this movement we need to regenerate the `local` field of a
        // build script's fingerprint after it's executed. We do this by
        // using the `build_script_local_fingerprints` function which returns a
        // thunk we can invoke on a foreign thread to calculate this.
        let build_script_outputs = Arc::clone(&cx.build_script_outputs);
        let metadata = cx.get_run_build_script_metadata(unit);
        let (gen_local, _overridden) = build_script_local_fingerprints(cx, unit);
        let output_path = cx.build_explicit_deps[unit].build_script_output.clone();
        Work::new(move |_| {
            let outputs = build_script_outputs.lock().unwrap();
            let output = outputs
                .get(metadata)
                .expect("output must exist after running");
            let deps = BuildDeps::new(&output_path, Some(output));

            // FIXME: it's basically buggy that we pass `None` to `call_box`
            // here. See documentation on `build_script_local_fingerprints`
            // below for more information. Despite this just try to proceed and
            // hobble along if it happens to return `Some`.
            if let Some(new_local) = (gen_local)(&deps, None)? {
                *fingerprint.local.lock().unwrap() = new_local;
            }

            write_fingerprint(&loc, &fingerprint)
        })
    } else {
        Work::new(move |_| write_fingerprint(&loc, &fingerprint))
    };

    Ok(Job::new_dirty(write_fingerprint))
}

/// Dependency edge information for fingerprints. This is generated for each
/// dependency and is stored in a `Fingerprint` below.
#[derive(Clone)]
struct DepFingerprint {
    /// The hash of the package id that this dependency points to
    pkg_id: u64,
    /// The crate name we're using for this dependency, which if we change we'll
    /// need to recompile!
    name: InternedString,
    /// Whether or not this dependency is flagged as a public dependency.
    public: bool,
    /// Whether or not this dependency is an rmeta dependency or a "full"
    /// dependency. In the case of an rmeta dependency our dependency edge only
    /// actually requires the rmeta from what we depend on, so when checking
    /// mtime information all files other than the rmeta can be ignored.
    only_requires_rmeta: bool,
    /// The dependency's fingerprint we recursively point to, containing all the
    /// other hash information we'd otherwise need.
    fingerprint: Arc<Fingerprint>,
}

/// A fingerprint can be considered to be a "short string" representing the
/// state of a world for a package.
///
/// If a fingerprint ever changes, then the package itself needs to be
/// recompiled. Inputs to the fingerprint include source code modifications,
/// compiler flags, compiler version, etc. This structure is not simply a
/// `String` due to the fact that some fingerprints cannot be calculated lazily.
///
/// Path sources, for example, use the mtime of the corresponding dep-info file
/// as a fingerprint (all source files must be modified *before* this mtime).
/// This dep-info file is not generated, however, until after the crate is
/// compiled. As a result, this structure can be thought of as a fingerprint
/// to-be. The actual value can be calculated via `hash_u64()`, but the operation
/// may fail as some files may not have been generated.
///
/// Note that dependencies are taken into account for fingerprints because rustc
/// requires that whenever an upstream crate is recompiled that all downstream
/// dependents are also recompiled. This is typically tracked through
/// `DependencyQueue`, but it also needs to be retained here because Cargo can
/// be interrupted while executing, losing the state of the `DependencyQueue`
/// graph.
#[derive(Serialize, Deserialize)]
pub struct Fingerprint {
    /// Hash of the version of `rustc` used.
    rustc: u64,
    /// Sorted list of cfg features enabled.
    features: String,
    /// Hash of the `Target` struct, including the target name,
    /// package-relative source path, edition, etc.
    target: u64,
    /// Hash of the `Profile`, `CompileMode`, and any extra flags passed via
    /// `cargo rustc` or `cargo rustdoc`.
    profile: u64,
    /// Hash of the path to the base source file. This is relative to the
    /// workspace root for path members, or absolute for other sources.
    path: u64,
    /// Fingerprints of dependencies.
    deps: Vec<DepFingerprint>,
    /// Information about the inputs that affect this Unit (such as source
    /// file mtimes or build script environment variables).
    local: Mutex<Vec<LocalFingerprint>>,
    /// Cached hash of the `Fingerprint` struct. Used to improve performance
    /// for hashing.
    #[serde(skip)]
    memoized_hash: Mutex<Option<u64>>,
    /// RUSTFLAGS/RUSTDOCFLAGS environment variable value (or config value).
    rustflags: Vec<String>,
    /// Hash of some metadata from the manifest, such as "authors", or
    /// "description", which are exposed as environment variables during
    /// compilation.
    metadata: u64,
    /// Hash of various config settings that change how things are compiled.
    config: u64,
    /// The rustc target. This is only relevant for `.json` files, otherwise
    /// the metadata hash segregates the units.
    compile_kind: u64,
    /// Description of whether the filesystem status for this unit is up to date
    /// or should be considered stale.
    #[serde(skip)]
    fs_status: FsStatus,
    /// Files, relative to `target_root`, that are produced by the step that
    /// this `Fingerprint` represents. This is used to detect when the whole
    /// fingerprint is out of date if this is missing, or if previous
    /// fingerprints' output files are regenerated and look newer than this one.
    #[serde(skip)]
    outputs: Vec<PathBuf>,
}

/// Indication of the status on the filesystem for a particular unit.
enum FsStatus {
    /// This unit is to be considered stale, even if hash information all
    /// matches. The filesystem inputs have changed (or are missing) and the
    /// unit needs to subsequently be recompiled.
    Stale,

    /// This unit is up-to-date. All outputs and their corresponding mtime are
    /// listed in the payload here for other dependencies to compare against.
    UpToDate { mtimes: HashMap<PathBuf, FileTime> },
}

impl FsStatus {
    fn up_to_date(&self) -> bool {
        match self {
            FsStatus::UpToDate { .. } => true,
            FsStatus::Stale => false,
        }
    }
}

impl Default for FsStatus {
    fn default() -> FsStatus {
        FsStatus::Stale
    }
}

impl Serialize for DepFingerprint {
serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error> where S: ser::Serializer,589     fn serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error>
590     where
591         S: ser::Serializer,
592     {
593         (
594             &self.pkg_id,
595             &self.name,
596             &self.public,
597             &self.fingerprint.hash_u64(),
598         )
599             .serialize(ser)
600     }
601 }
602 
603 impl<'de> Deserialize<'de> for DepFingerprint {
604     fn deserialize<D>(d: D) -> Result<DepFingerprint, D::Error>
605     where
606         D: de::Deserializer<'de>,
607     {
608         let (pkg_id, name, public, hash) = <(u64, String, bool, u64)>::deserialize(d)?;
609         Ok(DepFingerprint {
610             pkg_id,
611             name: InternedString::new(&name),
612             public,
613             fingerprint: Arc::new(Fingerprint {
614                 memoized_hash: Mutex::new(Some(hash)),
615                 ..Fingerprint::new()
616             }),
617             // This field is never read since it's only used in
618             // `check_filesystem` which isn't used by fingerprints loaded from
619             // disk.
620             only_requires_rmeta: false,
621         })
622     }
623 }
624 
625 /// A `LocalFingerprint` represents something that we use to detect direct
626 /// changes to a `Fingerprint`.
627 ///
628 /// This is where we track file information, env vars, etc. This
629 /// `LocalFingerprint` struct is hashed and if the hash changes will force a
630 /// recompile of any fingerprint it's included into. Note that the "local"
631 /// terminology comes from the fact that it only has to do with one crate, and
632 /// `Fingerprint` tracks the transitive propagation of fingerprint changes.
633 ///
634     /// Note that because this is hashed its contents are carefully managed. As
635     /// mentioned in the module docs above, we don't want to hash absolute paths or
636     /// mtime information.
637 ///
638 /// Also note that a `LocalFingerprint` is used in `check_filesystem` to detect
639 /// when the filesystem contains stale information (based on mtime currently).
640 /// The paths here don't change much between compilations but they're used as
641 /// inputs when we probe the filesystem looking at information.
642 #[derive(Debug, Serialize, Deserialize, Hash)]
643 enum LocalFingerprint {
644     /// This is a precalculated fingerprint which has an opaque string we just
645     /// hash as usual. This variant is primarily used for rustdoc where we
646     /// don't have a dep-info file to compare against.
647     ///
648     /// This is also used for build scripts with no `rerun-if-*` statements, but
649     /// that's overall a mistake and causes bugs in Cargo. We shouldn't use this
650     /// for build scripts.
651     Precalculated(String),
652 
653     /// This is used for crate compilations. The `dep_info` file is a relative
654     /// path anchored at `target_root(...)` to the dep-info file that Cargo
655     /// generates (which is a custom serialization after parsing rustc's own
656     /// `dep-info` output).
657     ///
658     /// The `dep_info` file, when present, also lists a number of other files
659     /// for us to look at. If any of those files are newer than this file then
660     /// we need to recompile.
661     CheckDepInfo { dep_info: PathBuf },
662 
663     /// This represents a nonempty set of `rerun-if-changed` annotations printed
664     /// out by a build script. The `output` file is a relative file anchored at
665     /// `target_root(...)` which is the actual output of the build script. That
666     /// output has already been parsed and the paths printed out via
667     /// `rerun-if-changed` are listed in `paths`. The `paths` field is relative
668     /// to `pkg.root()`.
669     ///
670     /// This is considered up-to-date if all of the `paths` are older than
671     /// `output`, otherwise we need to recompile.
672     RerunIfChanged {
673         output: PathBuf,
674         paths: Vec<PathBuf>,
675     },
676 
677     /// This represents a single `rerun-if-env-changed` annotation printed by a
678     /// build script. The exact env var and value are hashed here. There's no
679     /// filesystem dependence here, and if the values are changed the hash will
680     /// change forcing a recompile.
681     RerunIfEnvChanged { var: String, val: Option<String> },
682 }
683 
684 enum StaleItem {
685     MissingFile(PathBuf),
686     ChangedFile {
687         reference: PathBuf,
688         reference_mtime: FileTime,
689         stale: PathBuf,
690         stale_mtime: FileTime,
691     },
692     ChangedEnv {
693         var: String,
694         previous: Option<String>,
695         current: Option<String>,
696     },
697 }
698 
699 impl LocalFingerprint {
700     /// Checks dynamically at runtime if this `LocalFingerprint` has a stale
701     /// item inside of it.
702     ///
703     /// The main purpose of this function is to handle two different ways
704     /// fingerprints can be invalidated:
705     ///
706     /// * One is when a dependency listed in rustc's dep-info files is invalid.
707     ///   Note that these could either be env vars or files. We check both here.
708     ///
709     /// * Another is the `rerun-if-changed` directive from build scripts. This
710     ///   is where we'll find whether files have actually changed.
711     fn find_stale_item(
712         &self,
713         mtime_cache: &mut HashMap<PathBuf, FileTime>,
714         pkg_root: &Path,
715         target_root: &Path,
716         cargo_exe: &Path,
717     ) -> CargoResult<Option<StaleItem>> {
718         match self {
719             // We need to parse `dep_info` to learn about the crate's dependencies.
720             //
721             // For each env var we see if our current process's env var still
722             // matches, and for each file we see if any of them are newer than
723             // the `dep_info` file itself whose mtime represents the start of
724             // rustc.
725             LocalFingerprint::CheckDepInfo { dep_info } => {
726                 let dep_info = target_root.join(dep_info);
727                 let info = match parse_dep_info(pkg_root, target_root, &dep_info)? {
728                     Some(info) => info,
729                     None => return Ok(Some(StaleItem::MissingFile(dep_info))),
730                 };
731                 for (key, previous) in info.env.iter() {
732                     let current = if key == CARGO_ENV {
733                         Some(
734                             cargo_exe
735                                 .to_str()
736                                 .ok_or_else(|| {
737                                     format_err!(
738                                         "cargo exe path {} must be valid UTF-8",
739                                         cargo_exe.display()
740                                     )
741                                 })?
742                                 .to_string(),
743                         )
744                     } else {
745                         env::var(key).ok()
746                     };
747                     if current == *previous {
748                         continue;
749                     }
750                     return Ok(Some(StaleItem::ChangedEnv {
751                         var: key.clone(),
752                         previous: previous.clone(),
753                         current,
754                     }));
755                 }
756                 Ok(find_stale_file(mtime_cache, &dep_info, info.files.iter()))
757             }
758 
759             // We need to verify that no paths listed in `paths` are newer than
760             // the `output` path itself, or the last time the build script ran.
761             LocalFingerprint::RerunIfChanged { output, paths } => Ok(find_stale_file(
762                 mtime_cache,
763                 &target_root.join(output),
764                 paths.iter().map(|p| pkg_root.join(p)),
765             )),
766 
767             // These have no dependencies on the filesystem, and their values
768             // are included natively in the `Fingerprint` hash so there is
769             // nothing to check for here.
770             LocalFingerprint::RerunIfEnvChanged { .. } => Ok(None),
771             LocalFingerprint::Precalculated(..) => Ok(None),
772         }
773     }
774 
775     fn kind(&self) -> &'static str {
776         match self {
777             LocalFingerprint::Precalculated(..) => "precalculated",
778             LocalFingerprint::CheckDepInfo { .. } => "dep-info",
779             LocalFingerprint::RerunIfChanged { .. } => "rerun-if-changed",
780             LocalFingerprint::RerunIfEnvChanged { .. } => "rerun-if-env-changed",
781         }
782     }
783 }
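/*
The file-based variants above (`CheckDepInfo` and `RerunIfChanged`) both reduce
to the same question: is any watched path missing, or newer than a reference
file? The following is a minimal sketch of that check, not the real
`find_stale_file`. It assumes mtimes are plain `u64` seconds instead of
`FileTime`, and the names `Stale` and `first_stale` are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical, simplified stand-in for `StaleItem`.
#[derive(Debug, PartialEq)]
enum Stale {
    Missing(PathBuf),
    Changed { reference: PathBuf, stale: PathBuf },
}

/// Returns the first input that is missing or strictly newer than the
/// reference file. An mtime of `None` means the file does not exist.
fn first_stale(
    reference: &Path,
    reference_mtime: Option<u64>,
    inputs: &[(PathBuf, Option<u64>)],
) -> Option<Stale> {
    // Without a reference file there is nothing to anchor the comparison.
    let reference_mtime = match reference_mtime {
        Some(t) => t,
        None => return Some(Stale::Missing(reference.to_path_buf())),
    };
    for (path, mtime) in inputs {
        match mtime {
            None => return Some(Stale::Missing(path.clone())),
            // Assumption: only a strictly newer input counts as changed.
            Some(t) if *t > reference_mtime => {
                return Some(Stale::Changed {
                    reference: reference.to_path_buf(),
                    stale: path.clone(),
                })
            }
            Some(_) => {}
        }
    }
    None
}
```

Passing the mtimes in as data keeps the sketch free of filesystem access; a
real implementation would look them up and cache them, as `mtime_cache` does.
*/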
784 
785 impl Fingerprint {
786     fn new() -> Fingerprint {
787         Fingerprint {
788             rustc: 0,
789             target: 0,
790             profile: 0,
791             path: 0,
792             features: String::new(),
793             deps: Vec::new(),
794             local: Mutex::new(Vec::new()),
795             memoized_hash: Mutex::new(None),
796             rustflags: Vec::new(),
797             metadata: 0,
798             config: 0,
799             compile_kind: 0,
800             fs_status: FsStatus::Stale,
801             outputs: Vec::new(),
802         }
803     }
804 
805     /// For performance reasons fingerprints will memoize their own hash, but
806     /// there's also internal mutability with its `local` field which can
807     /// change, for example with build scripts, during a build.
808     ///
809     /// This method can be used to bust all memoized hashes just before a build
810     /// to ensure that after a build completes everything is up-to-date.
811     pub fn clear_memoized(&self) {
812         *self.memoized_hash.lock().unwrap() = None;
813     }
814 
815     fn hash_u64(&self) -> u64 {
816         if let Some(s) = *self.memoized_hash.lock().unwrap() {
817             return s;
818         }
819         let ret = util::hash_u64(self);
820         *self.memoized_hash.lock().unwrap() = Some(ret);
821         ret
822     }
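/*
The interaction between the memoized hash and the interior-mutable `local`
field can be illustrated in isolation. This is a hedged sketch, not the real
`Fingerprint`: the `Memo` type is hypothetical, and it uses the standard
library's `DefaultHasher` as a stand-in for `util::hash_u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical miniature of a fingerprint: one mutable field plus a cache.
struct Memo {
    local: Mutex<Vec<String>>,
    memoized_hash: Mutex<Option<u64>>,
}

impl Memo {
    fn new(local: Vec<String>) -> Memo {
        Memo {
            local: Mutex::new(local),
            memoized_hash: Mutex::new(None),
        }
    }

    /// Returns the cached hash if present, otherwise computes and stores it.
    fn hash_u64(&self) -> u64 {
        let mut cached = self.memoized_hash.lock().unwrap();
        if let Some(h) = *cached {
            return h;
        }
        let mut hasher = DefaultHasher::new();
        self.local.lock().unwrap().hash(&mut hasher);
        let h = hasher.finish();
        *cached = Some(h);
        h
    }

    /// Drops the cached value so the next `hash_u64` call recomputes it.
    fn clear_memoized(&self) {
        *self.memoized_hash.lock().unwrap() = None;
    }
}
```

In this sketch, mutating `local` after the first `hash_u64` call leaves the
cached value in place, so the hash only reflects the change once
`clear_memoized` has been called.
*/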
823 
824     /// Compares this fingerprint with an old version which was previously
825     /// serialized to filesystem.
826     ///
827     /// The purpose of this is exclusively to produce a diagnostic message
828     /// indicating why we're recompiling something. This function always returns
829     /// an error; it will never return success.
830     fn compare(&self, old: &Fingerprint) -> CargoResult<()> {
831         if self.rustc != old.rustc {
832             bail!("rust compiler has changed")
833         }
834         if self.features != old.features {
835             bail!(
836                 "features have changed: previously {}, now {}",
837                 old.features,
838                 self.features
839             )
840         }
841         if self.target != old.target {
842             bail!("target configuration has changed")
843         }
844         if self.path != old.path {
845             bail!("path to the source has changed")
846         }
847         if self.profile != old.profile {
848             bail!("profile configuration has changed")
849         }
850         if self.rustflags != old.rustflags {
851             bail!(
852                 "RUSTFLAGS has changed: previously {:?}, now {:?}",
853                 old.rustflags,
854                 self.rustflags
855             )
856         }
857         if self.metadata != old.metadata {
858             bail!("metadata changed")
859         }
860         if self.config != old.config {
861             bail!("configuration settings have changed")
862         }
863         if self.compile_kind != old.compile_kind {
864             bail!("compile kind (rustc target) changed")
865         }
866         let my_local = self.local.lock().unwrap();
867         let old_local = old.local.lock().unwrap();
868         if my_local.len() != old_local.len() {
869             bail!("local lens changed");
870         }
871         for (new, old) in my_local.iter().zip(old_local.iter()) {
872             match (new, old) {
873                 (LocalFingerprint::Precalculated(a), LocalFingerprint::Precalculated(b)) => {
874                     if a != b {
875                         bail!(
876                             "precalculated components have changed: previously {}, now {}",
877                             b,
878                             a
879                         )
880                     }
881                 }
882                 (
883                     LocalFingerprint::CheckDepInfo { dep_info: adep },
884                     LocalFingerprint::CheckDepInfo { dep_info: bdep },
885                 ) => {
886                     if adep != bdep {
887                         bail!(
888                             "dep info output changed: previously {:?}, now {:?}",
889                             bdep,
890                             adep
891                         )
892                     }
893                 }
894                 (
895                     LocalFingerprint::RerunIfChanged {
896                         output: aout,
897                         paths: apaths,
898                     },
899                     LocalFingerprint::RerunIfChanged {
900                         output: bout,
901                         paths: bpaths,
902                     },
903                 ) => {
904                     if aout != bout {
905                         bail!(
906                             "rerun-if-changed output changed: previously {:?}, now {:?}",
907                             bout,
908                             aout
909                         )
910                     }
911                     if apaths != bpaths {
912                         bail!(
913                             "rerun-if-changed paths changed: previously {:?}, now {:?}",
914                             bpaths,
915                             apaths,
916                         )
917                     }
918                 }
919                 (
920                     LocalFingerprint::RerunIfEnvChanged {
921                         var: akey,
922                         val: avalue,
923                     },
924                     LocalFingerprint::RerunIfEnvChanged {
925                         var: bkey,
926                         val: bvalue,
927                     },
928                 ) => {
929                     if *akey != *bkey {
930                         bail!("env vars changed: previously {}, now {}", bkey, akey);
931                     }
932                     if *avalue != *bvalue {
933                         bail!(
934                             "env var `{}` changed: previously {:?}, now {:?}",
935                             akey,
936                             bvalue,
937                             avalue
938                         )
939                     }
940                 }
941                 (a, b) => bail!(
942                     "local fingerprint type has changed ({} => {})",
943                     b.kind(),
944                     a.kind()
945                 ),
946             }
947         }
948 
949         if self.deps.len() != old.deps.len() {
950             bail!("number of dependencies has changed")
951         }
952         for (a, b) in self.deps.iter().zip(old.deps.iter()) {
953             if a.name != b.name {
954                 let e = format_err!("`{}` != `{}`", a.name, b.name)
955                     .context("unit dependency name changed");
956                 return Err(e);
957             }
958 
959             if a.fingerprint.hash_u64() != b.fingerprint.hash_u64() {
960                 let e = format_err!(
961                     "new ({}/{:x}) != old ({}/{:x})",
962                     a.name,
963                     a.fingerprint.hash_u64(),
964                     b.name,
965                     b.fingerprint.hash_u64()
966                 )
967                 .context("unit dependency information changed");
968                 return Err(e);
969             }
970         }
971 
972         if !self.fs_status.up_to_date() {
973             bail!("current filesystem status shows we're outdated");
974         }
975 
976         // This typically means some filesystem modifications happened or
977         // something transitive was odd. In general we should strive to provide
978         // a better error message than this, so if you see this message a lot it
979         // likely means this method needs to be updated!
980         bail!("two fingerprint comparison turned up nothing obvious");
981     }
982 
983     /// Dynamically inspect the local filesystem to update the `fs_status` field
984     /// of this `Fingerprint`.
985     ///
986     /// This function is used just after a `Fingerprint` is constructed to check
987     /// the local state of the filesystem and propagate any dirtiness from
988     /// dependencies up to this unit as well. This function assumes that the
989     /// unit starts out as `FsStatus::Stale` and then it will optionally switch
990     /// it to `UpToDate` if it can.
991     fn check_filesystem(
992         &mut self,
993         mtime_cache: &mut HashMap<PathBuf, FileTime>,
994         pkg_root: &Path,
995         target_root: &Path,
996         cargo_exe: &Path,
997     ) -> CargoResult<()> {
998         assert!(!self.fs_status.up_to_date());
999 
1000         let mut mtimes = HashMap::new();
1001 
1002         // Get the `mtime` of all outputs. Optionally update their mtime
1003         // afterwards based on the `mtime_on_use` flag. Afterwards we want the
1004         // maximum mtime as it's the one we'll be comparing to inputs and
1005         // dependencies.
1006         for output in self.outputs.iter() {
1007             let mtime = match paths::mtime(output) {
1008                 Ok(mtime) => mtime,
1009 
1010                 // This path failed to report its `mtime`. It probably doesn't
1011                 // exist, so leave ourselves as stale and bail out.
1012                 Err(e) => {
1013                     debug!("failed to get mtime of {:?}: {}", output, e);
1014                     return Ok(());
1015                 }
1016             };
1017             assert!(mtimes.insert(output.clone(), mtime).is_none());
1018         }
1019 
1020         let opt_max = mtimes.iter().max_by_key(|kv| kv.1);
1021         let (max_path, max_mtime) = match opt_max {
1022             Some(mtime) => mtime,
1023 
1024             // We had no output files. This means we're an overridden build
1025             // script and we're just always up to date because we aren't
1026             // watching the filesystem.
1027             None => {
1028                 self.fs_status = FsStatus::UpToDate { mtimes };
1029                 return Ok(());
1030             }
1031         };
1032         debug!(
1033             "max output mtime for {:?} is {:?} {}",
1034             pkg_root, max_path, max_mtime
1035         );
1036 
1037         for dep in self.deps.iter() {
1038             let dep_mtimes = match &dep.fingerprint.fs_status {
1039                 FsStatus::UpToDate { mtimes } => mtimes,
1040                 // If our dependency is stale, so are we, so bail out.
1041                 FsStatus::Stale => return Ok(()),
1042             };
1043 
1044             // If our dependency edge only requires the rmeta file to be present
1045             // then we only need to look at that one output file, otherwise we
1046             // need to consider all output files to see if we're out of date.
1047             let (dep_path, dep_mtime) = if dep.only_requires_rmeta {
1048                 dep_mtimes
1049                     .iter()
1050                     .find(|(path, _mtime)| {
1051                         path.extension().and_then(|s| s.to_str()) == Some("rmeta")
1052                     })
1053                     .expect("failed to find rmeta")
1054             } else {
1055                 match dep_mtimes.iter().max_by_key(|kv| kv.1) {
1056                     Some(dep_mtime) => dep_mtime,
1057                     // If our dependency is up to date and has no filesystem
1058                     // interactions, then we can move on to the next dependency.
1059                     None => continue,
1060                 }
1061             };
1062             debug!(
1063                 "max dep mtime for {:?} is {:?} {}",
1064                 pkg_root, dep_path, dep_mtime
1065             );
1066 
1067             // If the dependency is newer than our own output then it was
1068             // recompiled previously. We transitively become stale ourselves in
1069             // that case, so bail out.
1070             //
1071             // Note that this comparison should probably be `>=`, not `>`, but
1072             // for a discussion of why it's `>` see the discussion about #5918
1073             // below in `find_stale_file`.
1074             if dep_mtime > max_mtime {
1075                 info!(
1076                     "dependency on `{}` is newer than we are {} > {} {:?}",
1077                     dep.name, dep_mtime, max_mtime, pkg_root
1078                 );
1079                 return Ok(());
1080             }
1081         }
1082 
1083         // If we reached this far then all dependencies are up to date. Check
1084         // all our `LocalFingerprint` information to see if we have any stale
1085         // files for this package itself. If we do find something log a helpful
1086         // message and bail out so we stay stale.
1087         for local in self.local.get_mut().unwrap().iter() {
1088             if let Some(item) =
1089                 local.find_stale_item(mtime_cache, pkg_root, target_root, cargo_exe)?
1090             {
1091                 item.log();
1092                 return Ok(());
1093             }
1094         }
1095 
1096         // Everything was up to date! Record such.
1097         self.fs_status = FsStatus::UpToDate { mtimes };
1098         debug!("filesystem up-to-date {:?}", pkg_root);
1099 
1100         Ok(())
1101     }
1102 }
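/*
The dependency part of `check_filesystem` boils down to comparing the newest
output of this unit against the newest output of each dependency. Below is a
rough sketch of just that comparison. It assumes `u64` timestamps in place of
`FileTime`, the name `outputs_fresh` is hypothetical, and it leaves out the
stale-dependency, missing-output, rmeta-only, and `LocalFingerprint` checks.

```rust
/// Returns `true` when no dependency output is strictly newer than the
/// newest output of this unit.
///
/// `dep_max_mtimes` holds one entry per dependency: the newest mtime among
/// its outputs, or `None` if that dependency has no output files.
fn outputs_fresh(output_mtimes: &[u64], dep_max_mtimes: &[Option<u64>]) -> bool {
    // A unit without outputs has nothing on disk to compare against.
    let max_mtime = match output_mtimes.iter().max() {
        Some(m) => *m,
        None => return true,
    };
    for dep in dep_max_mtimes {
        match dep {
            // This dependency has no filesystem interaction; skip it.
            None => continue,
            // Strictly newer (`>`), mirroring the comparison above.
            Some(dep_mtime) if *dep_mtime > max_mtime => return false,
            Some(_) => {}
        }
    }
    true
}
```

Note that with `>`, a dependency whose mtime equals the unit's newest output
is still treated as fresh in this sketch.
*/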
1103 
1104 impl hash::Hash for Fingerprint {
1105     fn hash<H: Hasher>(&self, h: &mut H) {
1106         let Fingerprint {
1107             rustc,
1108             ref features,
1109             target,
1110             path,
1111             profile,
1112             ref deps,
1113             ref local,
1114             metadata,
1115             config,
1116             compile_kind,
1117             ref rustflags,
1118             ..
1119         } = *self;
1120         let local = local.lock().unwrap();
1121         (
1122             rustc,
1123             features,
1124             target,
1125             path,
1126             profile,
1127             &*local,
1128             metadata,
1129             config,
1130             compile_kind,
1131             rustflags,
1132         )
1133             .hash(h);
1134 
1135         h.write_usize(deps.len());
1136         for DepFingerprint {
1137             pkg_id,
1138             name,
1139             public,
1140             fingerprint,
1141             only_requires_rmeta: _, // static property, no need to hash
1142         } in deps
1143         {
1144             pkg_id.hash(h);
1145             name.hash(h);
1146             public.hash(h);
1147             // use memoized dep hashes to avoid exponential blowup
1148             h.write_u64(fingerprint.hash_u64());
1149         }
1150     }
1151 }
1152 
1153 impl DepFingerprint {
1154     fn new(cx: &mut Context<'_, '_>, parent: &Unit, dep: &UnitDep) -> CargoResult<DepFingerprint> {
1155         let fingerprint = calculate(cx, &dep.unit)?;
1156         // We need to be careful about what we hash here. We have a goal of
1157         // supporting renaming a project directory and not rebuilding
1158         // everything. To do that, however, we need to make sure that the cwd
1159         // doesn't make its way into any hashes, and one source of that is the
1160         // `SourceId` for `path` packages.
1161         //
1162         // We already have a requirement that `path` packages all have unique
1163         // names (sort of for this same reason), so if the package source is a
1164         // `path` then we just hash the name, but otherwise we hash the full
1165         // id as it won't change when the directory is renamed.
1166         let pkg_id = if dep.unit.pkg.package_id().source_id().is_path() {
1167             util::hash_u64(dep.unit.pkg.package_id().name())
1168         } else {
1169             util::hash_u64(dep.unit.pkg.package_id())
1170         };
1171 
1172         Ok(DepFingerprint {
1173             pkg_id,
1174             name: dep.extern_crate_name,
1175             public: dep.public,
1176             fingerprint,
1177             only_requires_rmeta: cx.only_requires_rmeta(parent, &dep.unit),
1178         })
1179     }
1180 }
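/*
The `pkg_id` rule in `DepFingerprint::new` (hash only the name for `path`
packages, the full id otherwise) can be shown with toy types. This is a
sketch under assumptions: `Source`, `PackageId`, and `dep_pkg_id` here are
hypothetical stand-ins, and `DefaultHasher` replaces `util::hash_u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical package source: a local directory or a registry name.
#[derive(Hash)]
enum Source {
    Path(String),
    Registry(String),
}

/// Hypothetical, heavily simplified package id.
#[derive(Hash)]
struct PackageId {
    name: String,
    version: String,
    source: Source,
}

fn hash_u64<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// For path packages only the name is hashed, so the directory the project
/// lives in never reaches the hash. Other sources hash the full id.
fn dep_pkg_id(id: &PackageId) -> u64 {
    match &id.source {
        Source::Path(_) => hash_u64(&id.name),
        Source::Registry(_) => hash_u64(id),
    }
}
```

With this rule, moving a path package to another directory leaves its
`dep_pkg_id` unchanged, which is the property the comment above is after.
*/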
1181 
1182 impl StaleItem {
1183     /// Use the `log` crate to log a hopefully helpful message in diagnosing
1184     /// what file is considered stale and why. This is intended to be used in
1185     /// conjunction with `CARGO_LOG` to determine why Cargo is recompiling
1186     /// something. Currently there's no user-facing usage of this other than
1187     /// that.
1188     fn log(&self) {
1189         match self {
1190             StaleItem::MissingFile(path) => {
1191                 info!("stale: missing {:?}", path);
1192             }
1193             StaleItem::ChangedFile {
1194                 reference,
1195                 reference_mtime,
1196                 stale,
1197                 stale_mtime,
1198             } => {
1199                 info!("stale: changed {:?}", stale);
1200                 info!("          (vs) {:?}", reference);
1201                 info!("               {:?} != {:?}", reference_mtime, stale_mtime);
1202             }
1203             StaleItem::ChangedEnv {
1204                 var,
1205                 previous,
1206                 current,
1207             } => {
1208                 info!("stale: changed env {:?}", var);
1209                 info!("       {:?} != {:?}", previous, current);
1210             }
1211         }
1212     }
1213 }
1214 
1215 /// Calculates the fingerprint for a `unit`.
1216 ///
1217 /// This fingerprint is used by Cargo to detect when any of the following happens:
1218 ///
1219 /// * A non-path package changes (changes version, changes revision, etc).
1220 /// * Any dependency changes
1221 /// * The compiler changes
1222 /// * The set of features a package is built with changes
1223 /// * The profile a target is compiled with changes (e.g., opt-level changes)
1224 /// * Any other compiler flags change that will affect the result
1225 ///
1226 /// Information like file modification time is only calculated for path
1227 /// dependencies.
1228 fn calculate(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Arc<Fingerprint>> {
1229     // This function is slammed quite a lot, so the result is memoized.
1230     if let Some(s) = cx.fingerprints.get(unit) {
1231         return Ok(Arc::clone(s));
1232     }
1233     let mut fingerprint = if unit.mode.is_run_custom_build() {
1234         calculate_run_custom_build(cx, unit)?
1235     } else if unit.mode.is_doc_test() {
1236         panic!("doc tests do not fingerprint");
1237     } else {
1238         calculate_normal(cx, unit)?
1239     };
1240 
1241     // After we've built the initial `Fingerprint`, be sure to update the
1242     // `fs_status` field of it.
1243     let target_root = target_root(cx);
1244     let cargo_exe = cx.bcx.config.cargo_exe()?;
1245     fingerprint.check_filesystem(
1246         &mut cx.mtime_cache,
1247         unit.pkg.root(),
1248         &target_root,
1249         cargo_exe,
1250     )?;
1251 
1252     let fingerprint = Arc::new(fingerprint);
1253     cx.fingerprints
1254         .insert(unit.clone(), Arc::clone(&fingerprint));
1255     Ok(fingerprint)
1256 }
1257 
1258 /// Calculate a fingerprint for a "normal" unit, or anything that's not a build
1259 /// script. This is an internal helper of `calculate`, don't call directly.
1260 fn calculate_normal(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Fingerprint> {
1261     // Recursively calculate the fingerprint for all of our dependencies.
1262     //
1263     // Skip fingerprints of binaries because they don't actually induce a
1264     // recompile, they're just dependencies in the sense that they need to be
1265     // built.
1266     //
1267     // Create Vec since mutable cx is needed in closure.
1268     let deps = Vec::from(cx.unit_deps(unit));
1269     let mut deps = deps
1270         .into_iter()
1271         .filter(|dep| !dep.unit.target.is_bin())
1272         .map(|dep| DepFingerprint::new(cx, unit, &dep))
1273         .collect::<CargoResult<Vec<_>>>()?;
1274     deps.sort_by(|a, b| a.pkg_id.cmp(&b.pkg_id));
1275 
1276     // Afterwards calculate our own fingerprint information.
1277     let target_root = target_root(cx);
1278     let local = if unit.mode.is_doc() {
1279         // rustdoc does not have dep-info files.
1280         let fingerprint = pkg_fingerprint(cx.bcx, &unit.pkg).with_context(|| {
1281             format!(
1282                 "failed to determine package fingerprint for documenting {}",
1283                 unit.pkg
1284             )
1285         })?;
1286         vec![LocalFingerprint::Precalculated(fingerprint)]
1287     } else {
1288         let dep_info = dep_info_loc(cx, unit);
1289         let dep_info = dep_info.strip_prefix(&target_root).unwrap().to_path_buf();
1290         vec![LocalFingerprint::CheckDepInfo { dep_info }]
1291     };
1292 
1293     // Figure out what the outputs of our unit are, and we'll be storing them
1294     // into the fingerprint as well.
1295     let outputs = cx
1296         .outputs(unit)?
1297         .iter()
1298         .filter(|output| !matches!(output.flavor, FileFlavor::DebugInfo | FileFlavor::Auxiliary))
1299         .map(|output| output.path.clone())
1300         .collect();
1301 
1302     // Fill out a bunch more information that we'll be tracking, typically
1303     // hashed to take up less space on disk as we just need to know when things
1304     // change.
1305     let extra_flags = if unit.mode.is_doc() {
1306         cx.bcx.rustdocflags_args(unit)
1307     } else {
1308         cx.bcx.rustflags_args(unit)
1309     }
1310     .to_vec();
1311 
1312     let profile_hash = util::hash_u64((
1313         &unit.profile,
1314         unit.mode,
1315         cx.bcx.extra_args_for(unit),
1316         cx.lto[unit],
1317     ));
1318     // Include metadata since it is exposed as environment variables.
1319     let m = unit.pkg.manifest().metadata();
1320     let metadata = util::hash_u64((&m.authors, &m.description, &m.homepage, &m.repository));
1321     let mut config = StableHasher::new();
1322     if let Some(linker) = cx.bcx.linker(unit.kind) {
1323         linker.hash(&mut config);
1324     }
1325     if unit.mode.is_doc() && cx.bcx.config.cli_unstable().rustdoc_map {
1326         if let Ok(map) = cx.bcx.config.doc_extern_map() {
1327             map.hash(&mut config);
1328         }
1329     }
1330     if let Some(allow_features) = &cx.bcx.config.cli_unstable().allow_features {
1331         allow_features.hash(&mut config);
1332     }
1333     let compile_kind = unit.kind.fingerprint_hash();
1334     Ok(Fingerprint {
1335         rustc: util::hash_u64(&cx.bcx.rustc().verbose_version),
1336         target: util::hash_u64(&unit.target),
1337         profile: profile_hash,
1338         // Note that .0 is hashed here, not .1 which is the cwd. That doesn't
1339         // actually affect the output artifact so there's no need to hash it.
1340         path: util::hash_u64(path_args(cx.bcx.ws, unit).0),
1341         features: format!("{:?}", unit.features),
1342         deps,
1343         local: Mutex::new(local),
1344         memoized_hash: Mutex::new(None),
1345         metadata,
1346         config: config.finish(),
1347         compile_kind,
1348         rustflags: extra_flags,
1349         fs_status: FsStatus::Stale,
1350         outputs,
1351     })
1352 }
1353 
1354 /// Calculate a fingerprint for an "execute a build script" unit. This is an
1355 /// internal helper of `calculate`; don't call it directly.
1356 fn calculate_run_custom_build(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Fingerprint> {
1357     assert!(unit.mode.is_run_custom_build());
1358     // Using the `BuildDeps` information we'll have previously parsed and
1359     // inserted into `build_explicit_deps`, build an initial snapshot of the
1360     // `LocalFingerprint` list for this build script. If we previously executed
1361     // the build script this means we'll be watching files and env vars.
1362     // Otherwise if we haven't previously executed it we'll just start watching
1363     // the whole crate.
1364     let (gen_local, overridden) = build_script_local_fingerprints(cx, unit);
1365     let deps = &cx.build_explicit_deps[unit];
1366     let local = (gen_local)(
1367         deps,
1368         Some(&|| {
1369             pkg_fingerprint(cx.bcx, &unit.pkg).with_context(|| {
1370                 format!(
1371                     "failed to determine package fingerprint for build script for {}",
1372                     unit.pkg
1373                 )
1374             })
1375         }),
1376     )?
1377     .unwrap();
1378     let output = deps.build_script_output.clone();
1379 
1380     // Include any dependencies of our execution, which is typically just the
1381     // compilation of the build script itself. (if the build script changes we
1382     // should be rerun!). Note though that if we're an overridden build script
1383     // we have no dependencies so no need to recurse in that case.
1384     let deps = if overridden {
1385         // Overridden build scripts don't need to track deps.
1386         vec![]
1387     } else {
1388         // Create Vec since mutable cx is needed in closure.
1389         let deps = Vec::from(cx.unit_deps(unit));
1390         deps.into_iter()
1391             .map(|dep| DepFingerprint::new(cx, unit, &dep))
1392             .collect::<CargoResult<Vec<_>>>()?
1393     };
1394 
1395     Ok(Fingerprint {
1396         local: Mutex::new(local),
1397         rustc: util::hash_u64(&cx.bcx.rustc().verbose_version),
1398         deps,
1399         outputs: if overridden { Vec::new() } else { vec![output] },
1400 
1401         // Most of the other info is blank here as we don't really include it
1402         // in the execution of the build script, but... this may be a latent
1403         // bug in Cargo.
1404         ..Fingerprint::new()
1405     })
1406 }
1407 
1408 /// Get ready to compute the `LocalFingerprint` values for a `RunCustomBuild`
1409 /// unit.
1410 ///
1411 /// This function has what is, on the surface, a seriously wonky interface.
1412 /// You'll call this function and it'll return a closure and a boolean. The
1413 /// boolean is pretty simple in that it indicates whether the `unit` has been
1414 /// overridden via `.cargo/config`. The closure is much more complicated.
1415 ///
1416 /// This closure is intended to capture any local state necessary to compute
1417 /// the `LocalFingerprint` values for this unit. It is `Send` and `'static` to
1418 /// be sent to other threads as well (such as when we're executing build
1419 /// scripts). That deduplication is the rationale for the closure at least.
1420 ///
1421 /// The arguments to the closure are a bit weirder, though, and I'll apologize
1422 /// in advance for the weirdness too. The first argument to the closure is a
1423 /// `&BuildDeps`. This is the parsed version of a build script, and when Cargo
1424 /// starts up this is cached from previous runs of a build script.  After a
1425 /// build script executes the output file is reparsed and passed in here.
1426 ///
1427 /// The second argument is the weirdest: it's *optionally* a closure to
1428 /// call `pkg_fingerprint` below. The `pkg_fingerprint` below requires access
1429 /// to the "source map" located in `Context`. That's very non-`'static` and
1430 /// non-`Send`, so it can't be used on other threads, such as when we invoke
1431 /// this after a build script has finished. The `Option` allows us to for sure
1432 /// calculate it on the main thread at the beginning, and then swallow the bug
1433 /// for now where a worker thread after a build script has finished doesn't
1434 /// have access. Ideally there would be no second argument or it would be more
1435 /// "first class" and not an `Option` but something that can be sent between
1436 /// threads. In any case, it's a bug for now.
1437 ///
1438 /// This isn't the greatest of interfaces, and if you have suggestions to
1439 /// improve it, please share them!
1440 ///
1441 /// FIXME(#6779) - see all the words above
1442 fn build_script_local_fingerprints(
1443     cx: &mut Context<'_, '_>,
1444     unit: &Unit,
1445 ) -> (
1446     Box<
1447         dyn FnOnce(
1448                 &BuildDeps,
1449                 Option<&dyn Fn() -> CargoResult<String>>,
1450             ) -> CargoResult<Option<Vec<LocalFingerprint>>>
1451             + Send,
1452     >,
1453     bool,
1454 ) {
1455     assert!(unit.mode.is_run_custom_build());
1456     // First up, if this build script is entirely overridden, then we just
1457     // return the hash of what we overrode it with. This is the easy case!
1458     if let Some(fingerprint) = build_script_override_fingerprint(cx, unit) {
1459         debug!("override local fingerprints deps {}", unit.pkg);
1460         return (
1461             Box::new(
1462                 move |_: &BuildDeps, _: Option<&dyn Fn() -> CargoResult<String>>| {
1463                     Ok(Some(vec![fingerprint]))
1464                 },
1465             ),
1466             true, // this is an overridden build script
1467         );
1468     }
1469 
1470     // ... Otherwise this is a "real" build script and we need to return a real
1471     // closure. Our returned closure classifies the build script based on
1472     // whether it prints `rerun-if-*`. If it *doesn't* print this it's where the
1473     // magical second argument comes into play, which fingerprints a whole
1474     // package. Remember that the fact that this is an `Option` is a bug, but a
1475     // longstanding bug, in Cargo. Recent refactorings just made it painfully
1476     // obvious.
1477     let pkg_root = unit.pkg.root().to_path_buf();
1478     let target_dir = target_root(cx);
1479     let calculate =
1480         move |deps: &BuildDeps, pkg_fingerprint: Option<&dyn Fn() -> CargoResult<String>>| {
1481             if deps.rerun_if_changed.is_empty() && deps.rerun_if_env_changed.is_empty() {
1482                 match pkg_fingerprint {
1483                     // FIXME: this is somewhat buggy with respect to docker and
1484                     // weird filesystems. The `Precalculated` variant
1485                     // constructed below will, for `path` dependencies, contain
1486                     // a stringified version of the mtime for the local crate.
1487                     // This violates one of the things we describe in this
1488                     // module's doc comment, never hashing mtimes. We should
1489                     // figure out a better scheme where a package fingerprint
1490                     // may be a string (like for a registry) or a list of files
1491                     // (like for a path dependency). That list of files would
1492                     // be stored here rather than their mtimes.
1493                     Some(f) => {
1494                         let s = f()?;
1495                         debug!(
1496                             "old local fingerprints deps {:?} precalculated={:?}",
1497                             pkg_root, s
1498                         );
1499                         return Ok(Some(vec![LocalFingerprint::Precalculated(s)]));
1500                     }
1501                     None => return Ok(None),
1502                 }
1503             }
1504 
1505             // Ok so now we're in "new mode" where we can have files listed as
1506             // dependencies as well as env vars listed as dependencies. Process
1507             // them all here.
1508             Ok(Some(local_fingerprints_deps(deps, &target_dir, &pkg_root)))
1509         };
1510 
1511     // Note that `false` == "not overridden"
1512     (Box::new(calculate), false)
1513 }
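
/*
Illustration only, not part of Cargo: a minimal sketch of the "boxed closure
plus overridden flag" shape described in the doc comment above, assuming
simplified stand-in types. `Deps`, `Local`, `SketchResult`, `Calc`, and
`local_fingerprints_sketch` are hypothetical names invented for this example;
the real code uses `BuildDeps`, `LocalFingerprint`, and `CargoResult`.

```rust
/// Hypothetical stand-in for `BuildDeps`.
pub struct Deps {
    pub rerun_if_changed: Vec<String>,
}

/// Hypothetical stand-in for `LocalFingerprint`.
#[derive(Debug, PartialEq)]
pub enum Local {
    Precalculated(String),
    RerunIfChanged(Vec<String>),
}

/// Hypothetical stand-in for `CargoResult`.
pub type SketchResult<T> = Result<T, String>;

/// `FnOnce` because it runs exactly once, `Send` so it can move to a worker
/// thread. The second argument is the optional package-fingerprint callback.
pub type Calc = Box<
    dyn FnOnce(&Deps, Option<&dyn Fn() -> SketchResult<String>>) -> SketchResult<Option<Vec<Local>>>
        + Send,
>;

pub fn local_fingerprints_sketch(override_hash: Option<String>) -> (Calc, bool) {
    if let Some(hash) = override_hash {
        // Overridden case: both arguments are ignored.
        let overridden: Calc = Box::new(
            move |_: &Deps,
                  _: Option<&dyn Fn() -> SketchResult<String>>|
                  -> SketchResult<Option<Vec<Local>>> {
                Ok(Some(vec![Local::Precalculated(hash)]))
            },
        );
        return (overridden, true);
    }

    let calculate: Calc = Box::new(
        |deps: &Deps,
         pkg_fingerprint: Option<&dyn Fn() -> SketchResult<String>>|
         -> SketchResult<Option<Vec<Local>>> {
            if deps.rerun_if_changed.is_empty() {
                // "Old mode": fingerprint the whole package if we are able to.
                return match pkg_fingerprint {
                    Some(f) => Ok(Some(vec![Local::Precalculated(f()?)])),
                    None => Ok(None),
                };
            }
            // "New mode": track only what the build script asked for.
            Ok(Some(vec![Local::RerunIfChanged(deps.rerun_if_changed.clone())]))
        },
    );
    (calculate, false)
}
```

A caller on the main thread would pass `Some(callback)`; a caller without
access to the package sources passes `None` and has to cope with `Ok(None)`.
*/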
1514 
1515 /// Create a `LocalFingerprint` for an overridden build script.
1516 /// Returns None if it is not overridden.
1517 fn build_script_override_fingerprint(
1518     cx: &mut Context<'_, '_>,
1519     unit: &Unit,
1520 ) -> Option<LocalFingerprint> {
1521     // Build script output is only populated at this stage when it is
1522     // overridden.
1523     let build_script_outputs = cx.build_script_outputs.lock().unwrap();
1524     let metadata = cx.get_run_build_script_metadata(unit);
1525     // Returns None if it is not overridden.
1526     let output = build_script_outputs.get(metadata)?;
1527     let s = format!(
1528         "overridden build state with hash: {}",
1529         util::hash_u64(output)
1530     );
1531     Some(LocalFingerprint::Precalculated(s))
1532 }
1533 
1534 /// Compute the `LocalFingerprint` values for a `RunCustomBuild` unit for
1535 /// non-overridden new-style build scripts only. This is only used when `deps`
1536 /// is already known to have a nonempty `rerun-if-*` somewhere.
1537 fn local_fingerprints_deps(
1538     deps: &BuildDeps,
1539     target_root: &Path,
1540     pkg_root: &Path,
1541 ) -> Vec<LocalFingerprint> {
1542     debug!("new local fingerprints deps {:?}", pkg_root);
1543     let mut local = Vec::new();
1544 
1545     if !deps.rerun_if_changed.is_empty() {
1546         // Note that, as the module comment above says, we are careful to never
1547         // store an absolute path in `LocalFingerprint`, so ensure that we strip
1548         // absolute prefixes from them.
1549         let output = deps
1550             .build_script_output
1551             .strip_prefix(target_root)
1552             .unwrap()
1553             .to_path_buf();
1554         let paths = deps
1555             .rerun_if_changed
1556             .iter()
1557             .map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
1558             .collect();
1559         local.push(LocalFingerprint::RerunIfChanged { output, paths });
1560     }
1561 
1562     for var in deps.rerun_if_env_changed.iter() {
1563         let val = env::var(var).ok();
1564         local.push(LocalFingerprint::RerunIfEnvChanged {
1565             var: var.clone(),
1566             val,
1567         });
1568     }
1569 
1570     local
1571 }
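
/*
Illustration only, not part of Cargo: a small sketch of the two idioms used in
the function above, assuming Unix-style paths. `relativize` and `snapshot_env`
are hypothetical helper names invented for this example.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Make `path` relative to `pkg_root` when it lives underneath it; otherwise
/// keep it unchanged. This mirrors `strip_prefix(..).unwrap_or(p)`.
pub fn relativize(pkg_root: &Path, path: &Path) -> PathBuf {
    path.strip_prefix(pkg_root).unwrap_or(path).to_path_buf()
}

/// Record the current value of an environment variable. `None` means the
/// variable is unset (or not valid Unicode), which is itself worth tracking.
pub fn snapshot_env(var: &str) -> (String, Option<String>) {
    (var.to_string(), env::var(var).ok())
}
```

Note that `strip_prefix` works on whole path components, so a sibling
directory that merely shares a textual prefix with `pkg_root` is left as is.
*/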
1572 
1573 fn write_fingerprint(loc: &Path, fingerprint: &Fingerprint) -> CargoResult<()> {
1574     debug_assert_ne!(fingerprint.rustc, 0);
1575     // `Fingerprint::new().rustc == 0`, so make sure it doesn't make it to the file system.
1576     // This is mostly so outside tools can reliably find out what Rust version this file is for,
1577     // as we can use the full hash.
1578     let hash = fingerprint.hash_u64();
1579     debug!("write fingerprint ({:x}) : {}", hash, loc.display());
1580     paths::write(loc, util::to_hex(hash).as_bytes())?;
1581 
1582     let json = serde_json::to_string(fingerprint).unwrap();
1583     if cfg!(debug_assertions) {
1584         let f: Fingerprint = serde_json::from_str(&json).unwrap();
1585         assert_eq!(f.hash_u64(), hash);
1586     }
1587     paths::write(&loc.with_extension("json"), json.as_bytes())?;
1588     Ok(())
1589 }
1590 
1591 /// Prepare for work when a package starts to build
1592 pub fn prepare_init(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<()> {
1593     let new1 = cx.files().fingerprint_dir(unit);
1594 
1595     // Doc tests have no output, thus no fingerprint.
1596     if !new1.exists() && !unit.mode.is_doc_test() {
1597         paths::create_dir_all(&new1)?;
1598     }
1599 
1600     Ok(())
1601 }
1602 
1603 /// Returns the location that the dep-info file will show up at for the `unit`
1604 /// specified.
1605 pub fn dep_info_loc(cx: &mut Context<'_, '_>, unit: &Unit) -> PathBuf {
1606     cx.files().fingerprint_file_path(unit, "dep-")
1607 }
1608 
1609 /// Returns the absolute path of the target directory.
1610 /// All paths are rewritten to be relative to this.
1611 fn target_root(cx: &Context<'_, '_>) -> PathBuf {
1612     cx.bcx.ws.target_dir().into_path_unlocked()
1613 }
1614 
1615 fn compare_old_fingerprint(
1616     loc: &Path,
1617     new_fingerprint: &Fingerprint,
1618     mtime_on_use: bool,
1619 ) -> CargoResult<()> {
1620     let old_fingerprint_short = paths::read(loc)?;
1621 
1622     if mtime_on_use {
1623         // update the mtime so other cleaners know we used it
1624         let t = FileTime::from_system_time(SystemTime::now());
1625         debug!("mtime-on-use forcing {:?} to {}", loc, t);
1626         paths::set_file_time_no_err(loc, t);
1627     }
1628 
1629     let new_hash = new_fingerprint.hash_u64();
1630 
1631     if util::to_hex(new_hash) == old_fingerprint_short && new_fingerprint.fs_status.up_to_date() {
1632         return Ok(());
1633     }
1634 
1635     let old_fingerprint_json = paths::read(&loc.with_extension("json"))?;
1636     let old_fingerprint: Fingerprint = serde_json::from_str(&old_fingerprint_json)
1637         .with_context(|| internal("failed to deserialize json"))?;
1638     // Fingerprint can be empty after a failed rebuild (see comment in prepare_target).
1639     if !old_fingerprint_short.is_empty() {
1640         debug_assert_eq!(
1641             util::to_hex(old_fingerprint.hash_u64()),
1642             old_fingerprint_short
1643         );
1644     }
1645     let result = new_fingerprint.compare(&old_fingerprint);
1646     assert!(result.is_err());
1647     result
1648 }
1649 
1650 fn log_compare(unit: &Unit, compare: &CargoResult<()>) {
1651     let ce = match compare {
1652         Ok(..) => return,
1653         Err(e) => e,
1654     };
1655     info!(
1656         "fingerprint error for {}/{:?}/{:?}",
1657         unit.pkg, unit.mode, unit.target,
1658     );
1659     info!("    err: {:?}", ce);
1660 }
1661 
1662 /// Parses Cargo's internal `EncodedDepInfo` structure that was previously
1663 /// serialized to disk.
1664 ///
1665 /// Note that this is not rustc's `*.d` files.
1666 ///
1667 /// Also note that rustc's `*.d` files are translated to Cargo-specific
1668 /// `EncodedDepInfo` files after compilations have finished in
1669 /// `translate_dep_info`.
1670 ///
1671 /// Returns `None` if the file is corrupt or couldn't be read from disk. This
1672 /// indicates that the crate should likely be rebuilt.
1673 pub fn parse_dep_info(
1674     pkg_root: &Path,
1675     target_root: &Path,
1676     dep_info: &Path,
1677 ) -> CargoResult<Option<RustcDepInfo>> {
1678     let data = match paths::read_bytes(dep_info) {
1679         Ok(data) => data,
1680         Err(_) => return Ok(None),
1681     };
1682     let info = match EncodedDepInfo::parse(&data) {
1683         Some(info) => info,
1684         None => {
1685             log::warn!("failed to parse cargo's dep-info at {:?}", dep_info);
1686             return Ok(None);
1687         }
1688     };
1689     let mut ret = RustcDepInfo::default();
1690     ret.env = info.env;
1691     for (ty, path) in info.files {
1692         let path = match ty {
1693             DepInfoPathType::PackageRootRelative => pkg_root.join(path),
1694             // N.B. path might be absolute here in which case the join will have no effect
1695             DepInfoPathType::TargetRootRelative => target_root.join(path),
1696         };
1697         ret.files.push(path);
1698     }
1699     Ok(Some(ret))
1700 }
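
/*
Illustration only, not part of Cargo: the "N.B." in the function above relies
on `Path::join` returning the argument unchanged when it is absolute. A tiny
sketch, assuming Unix-style paths; `resolve` is a hypothetical helper name
invented for this example.

```rust
use std::path::{Path, PathBuf};

/// Resolve a stored dep-info path against `root`. A relative path is appended
/// to `root`; an absolute path replaces it entirely.
pub fn resolve(root: &Path, stored: &Path) -> PathBuf {
    root.join(stored)
}
```

This is why a single `TargetRootRelative` variant can carry both paths inside
the target directory and absolute paths such as sysroot files.
*/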
1701 
1702 fn pkg_fingerprint(bcx: &BuildContext<'_, '_>, pkg: &Package) -> CargoResult<String> {
1703     let source_id = pkg.package_id().source_id();
1704     let sources = bcx.packages.sources();
1705 
1706     let source = sources
1707         .get(source_id)
1708         .ok_or_else(|| internal("missing package source"))?;
1709     source.fingerprint(pkg)
1710 }
1711 
1712 fn find_stale_file<I>(
1713     mtime_cache: &mut HashMap<PathBuf, FileTime>,
1714     reference: &Path,
1715     paths: I,
1716 ) -> Option<StaleItem>
1717 where
1718     I: IntoIterator,
1719     I::Item: AsRef<Path>,
1720 {
1721     let reference_mtime = match paths::mtime(reference) {
1722         Ok(mtime) => mtime,
1723         Err(..) => return Some(StaleItem::MissingFile(reference.to_path_buf())),
1724     };
1725 
1726     for path in paths {
1727         let path = path.as_ref();
1728         let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
1729             Entry::Occupied(o) => *o.get(),
1730             Entry::Vacant(v) => {
1731                 let mtime = match paths::mtime_recursive(path) {
1732                     Ok(mtime) => mtime,
1733                     Err(..) => return Some(StaleItem::MissingFile(path.to_path_buf())),
1734                 };
1735                 *v.insert(mtime)
1736             }
1737         };
1738 
1739         // TODO: fix #5918.
1740         // Note that equal mtimes should be considered "stale". For filesystems with
1741         // not much timestamp precision, like 1s, this would be a conservative approximation
1742         // to handle the case where a file is modified within the same second after
1743         // a build starts. We want to make sure that incremental rebuilds pick that up!
1744         //
1745         // For filesystems with nanosecond precision it's been seen in the wild that
1746         // its "nanosecond precision" isn't really nanosecond-accurate. It turns out that
1747         // kernels may cache the current time so files created at different times actually
1748         // list the same nanosecond precision. Some digging on #5919 picked up that the
1749         // kernel caches the current time between timer ticks, which could mean that if
1750         // a file is updated at most 10ms after a build starts then Cargo may not
1751         // pick up the build changes.
1752         //
1753         // All in all, an equality check here would be a conservative assumption that,
1754         // if equal, files were changed just after a previous build finished.
1755         // Unfortunately this became problematic when (in #6484) Cargo switched to more accurately
1756         // measuring the start time of builds.
1757         if path_mtime <= reference_mtime {
1758             continue;
1759         }
1760 
1761         return Some(StaleItem::ChangedFile {
1762             reference: reference.to_path_buf(),
1763             reference_mtime,
1764             stale: path.to_path_buf(),
1765             stale_mtime: path_mtime,
1766         });
1767     }
1768 
1769     debug!(
1770         "all paths up-to-date relative to {:?} mtime={}",
1771         reference, reference_mtime
1772     );
1773     None
1774 }
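
/*
Illustration only, not part of Cargo: a sketch of the comparison the function
above performs, assuming timestamps are plain integer ticks rather than
`FileTime` values. `first_stale` is a hypothetical name invented for this
example. As in the code above, an input whose mtime is equal to the reference
is treated as up to date; only a strictly newer input is reported.

```rust
/// Returns the name of the first input that is strictly newer than
/// `reference`, or `None` when every input is up to date.
pub fn first_stale<'a>(reference: u64, inputs: &[(&'a str, u64)]) -> Option<&'a str> {
    inputs
        .iter()
        .find(|(_, mtime)| *mtime > reference)
        .map(|(name, _)| *name)
}
```

With a coarse clock, an edit made in the same tick as the reference would be
missed by this check, which is the trade-off the long comment above describes.
*/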
1775 
1776 enum DepInfoPathType {
1777     // src/, e.g. src/lib.rs
1778     PackageRootRelative,
1779     // target/debug/deps/lib...
1780     // or an absolute path /.../sysroot/...
1781     TargetRootRelative,
1782 }
1783 
1784 /// Parses the dep-info file coming out of rustc into a Cargo-specific format.
1785 ///
1786 /// This function will parse `rustc_dep_info` as a makefile-style dep info to
1787 /// learn about all the files which a crate depends on. This is then
1788 /// re-serialized into the `cargo_dep_info` path in a Cargo-specific format.
1789 ///
1790 /// The `pkg_root` argument here is the absolute path to the directory
1791 /// containing `Cargo.toml` for this crate that was compiled. The paths listed
1792 /// in the rustc dep-info file may or may not be absolute but we'll want to
1793 /// consider all of them relative to the `root` specified.
1794 ///
1795 /// The `rustc_cwd` argument is the absolute path to the cwd of the compiler
1796 /// when it was invoked.
1797 ///
1798 /// If the `allow_package` argument is true, then package-relative paths are
1799 /// included. If it is false, then package-relative paths are skipped and
1800 /// ignored (typically used for registry or git dependencies where we assume
1801 /// the source never changes, and we don't want the cost of running `stat` on
1802 /// all those files). See the module-level docs for the note about
1803 /// `-Zbinary-dep-depinfo` for more details on why this is done.
1804 ///
1805 /// The serialized Cargo format will contain a list of files, all of which are
1806 /// relative if they're under `root`, or absolute if they're elsewhere.
1807 pub fn translate_dep_info(
1808     rustc_dep_info: &Path,
1809     cargo_dep_info: &Path,
1810     rustc_cwd: &Path,
1811     pkg_root: &Path,
1812     target_root: &Path,
1813     rustc_cmd: &ProcessBuilder,
1814     allow_package: bool,
1815 ) -> CargoResult<()> {
1816     let depinfo = parse_rustc_dep_info(rustc_dep_info)?;
1817 
1818     let target_root = target_root.canonicalize()?;
1819     let pkg_root = pkg_root.canonicalize()?;
1820     let mut on_disk_info = EncodedDepInfo::default();
1821     on_disk_info.env = depinfo.env;
1822 
1823     // This is a bit of a tricky statement, but here we're *removing* the
1824     // dependency on environment variables that were defined specifically for
1825     // the command itself. Environment variables returned by `get_envs` include
1826     // environment variables like:
1827     //
1828     // * `OUT_DIR` if applicable
1829     // * env vars added by a build script, if any
1830     //
1831     // The general idea here is that the dep info file tells us what, when
1832     // changed, should cause us to rebuild the crate. These environment
1833     // variables are synthesized by Cargo and/or the build script, and the
1834     // intention is that their values are tracked elsewhere for whether the
1835     // crate needs to be rebuilt.
1836     //
1837     // For example a build script says when it needs to be rerun and otherwise
1838     // it's assumed to produce the same output, so we're guaranteed that env
1839     // vars defined by the build script will always be the same unless the build
1840     // script itself reruns, in which case the crate will rerun anyway.
1841     //
1842     // For things like `OUT_DIR` it's a bit sketchy for now. Most of the time
1843     // that's used for code generation but this is technically buggy where if
1844     // you write a binary that does `println!("{}", env!("OUT_DIR"))` we won't
1845     // recompile that if you move the target directory. Hopefully that's not too
1846     // bad of an issue for now...
1847     //
1848     // This also includes `CARGO` since if the code is explicitly wanting to
1849     // know that path, it should be rebuilt if it changes. The CARGO path is
1850     // not tracked elsewhere in the fingerprint.
1851     on_disk_info
1852         .env
1853         .retain(|(key, _)| !rustc_cmd.get_envs().contains_key(key) || key == CARGO_ENV);
1854 
1855     for file in depinfo.files {
1856         // The path may be absolute or relative, canonical or not. Make sure
1857         // it is canonicalized so we are comparing the same kinds of paths.
1858         let abs_file = rustc_cwd.join(file);
1859         // If canonicalization fails, just use the abs path. There is currently
1860         // a bug where --remap-path-prefix is affecting .d files, causing them
1861         // to point to non-existent paths.
1862         let canon_file = abs_file.canonicalize().unwrap_or_else(|_| abs_file.clone());
1863 
1864         let (ty, path) = if let Ok(stripped) = canon_file.strip_prefix(&target_root) {
1865             (DepInfoPathType::TargetRootRelative, stripped)
1866         } else if let Ok(stripped) = canon_file.strip_prefix(&pkg_root) {
1867             if !allow_package {
1868                 continue;
1869             }
1870             (DepInfoPathType::PackageRootRelative, stripped)
1871         } else {
1872             // It's definitely not target root relative, but this is an absolute path (since it was
1873             // joined to rustc_cwd) and as such re-joining it later to the target root will have no
1874             // effect.
1875             (DepInfoPathType::TargetRootRelative, &*abs_file)
1876         };
1877         on_disk_info.files.push((ty, path.to_owned()));
1878     }
1879     paths::write(cargo_dep_info, on_disk_info.serialize()?)?;
1880     Ok(())
1881 }
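
/*
Illustration only, not part of Cargo: a sketch of the path classification done
in the loop above, assuming the paths were already made absolute and
canonical (the sketch never touches the filesystem). `Kind` and `classify` are
hypothetical names invented for this example.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
pub enum Kind {
    PackageRootRelative,
    TargetRootRelative,
}

/// Classify one dependency path. Returns `None` when the path is inside the
/// package and package-relative paths are not being tracked.
pub fn classify(
    file: &Path,
    target_root: &Path,
    pkg_root: &Path,
    allow_package: bool,
) -> Option<(Kind, PathBuf)> {
    // The target root is checked first because it usually sits inside the
    // package root.
    if let Ok(stripped) = file.strip_prefix(target_root) {
        Some((Kind::TargetRootRelative, stripped.to_path_buf()))
    } else if let Ok(stripped) = file.strip_prefix(pkg_root) {
        if allow_package {
            Some((Kind::PackageRootRelative, stripped.to_path_buf()))
        } else {
            None
        }
    } else {
        // Elsewhere on disk: keep the absolute path. Joining it to the target
        // root later leaves it unchanged.
        Some((Kind::TargetRootRelative, file.to_path_buf()))
    }
}
```
*/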
1882 
1883 #[derive(Default)]
1884 pub struct RustcDepInfo {
1885     /// The list of files that the main target in the dep-info file depends on.
1886     pub files: Vec<PathBuf>,
1887     /// The list of environment variables we found that the rustc compilation
1888     /// depends on.
1889     ///
1890     /// The first element of the pair is the name of the env var and the second
1891     /// item is the value. `Some` means that the env var was set, and `None`
1892     /// means that the env var wasn't actually set and the compilation depends
1893     /// on it not being set.
1894     pub env: Vec<(String, Option<String>)>,
1895 }
1896 
1897 // Same as `RustcDepInfo` except avoids absolute paths as much as possible to
1898 // allow moving around the target directory.
1899 //
1900 // This is also stored in an optimized format to make parsing it fast because
1901 // Cargo will read it for crates on all future compilations.
1902 #[derive(Default)]
1903 struct EncodedDepInfo {
1904     files: Vec<(DepInfoPathType, PathBuf)>,
1905     env: Vec<(String, Option<String>)>,
1906 }
1907 
1908 impl EncodedDepInfo {
1909     fn parse(mut bytes: &[u8]) -> Option<EncodedDepInfo> {
1910         let bytes = &mut bytes;
1911         let nfiles = read_usize(bytes)?;
1912         let mut files = Vec::with_capacity(nfiles);
1913         for _ in 0..nfiles {
1914             let ty = match read_u8(bytes)? {
1915                 0 => DepInfoPathType::PackageRootRelative,
1916                 1 => DepInfoPathType::TargetRootRelative,
1917                 _ => return None,
1918             };
1919             let bytes = read_bytes(bytes)?;
1920             files.push((ty, paths::bytes2path(bytes).ok()?));
1921         }
1922 
1923         let nenv = read_usize(bytes)?;
1924         let mut env = Vec::with_capacity(nenv);
1925         for _ in 0..nenv {
1926             let key = str::from_utf8(read_bytes(bytes)?).ok()?.to_string();
1927             let val = match read_u8(bytes)? {
1928                 0 => None,
1929                 1 => Some(str::from_utf8(read_bytes(bytes)?).ok()?.to_string()),
1930                 _ => return None,
1931             };
1932             env.push((key, val));
1933         }
1934         return Some(EncodedDepInfo { files, env });
1935 
1936         fn read_usize(bytes: &mut &[u8]) -> Option<usize> {
1937             let ret = bytes.get(..4)?;
1938             *bytes = &bytes[4..];
1939             Some(u32::from_le_bytes(ret.try_into().unwrap()) as usize)
1940         }
1941 
1942         fn read_u8(bytes: &mut &[u8]) -> Option<u8> {
1943             let ret = *bytes.get(0)?;
1944             *bytes = &bytes[1..];
1945             Some(ret)
1946         }
1947 
1948         fn read_bytes<'a>(bytes: &mut &'a [u8]) -> Option<&'a [u8]> {
1949             let n = read_usize(bytes)?;
1950             let ret = bytes.get(..n)?;
1951             *bytes = &bytes[n..];
1952             Some(ret)
1953         }
1954     }
1955 
1956     fn serialize(&self) -> CargoResult<Vec<u8>> {
1957         let mut ret = Vec::new();
1958         let dst = &mut ret;
1959         write_usize(dst, self.files.len());
1960         for (ty, file) in self.files.iter() {
1961             match ty {
1962                 DepInfoPathType::PackageRootRelative => dst.push(0),
1963                 DepInfoPathType::TargetRootRelative => dst.push(1),
1964             }
1965             write_bytes(dst, paths::path2bytes(file)?);
1966         }
1967 
1968         write_usize(dst, self.env.len());
1969         for (key, val) in self.env.iter() {
1970             write_bytes(dst, key);
1971             match val {
1972                 None => dst.push(0),
1973                 Some(val) => {
1974                     dst.push(1);
1975                     write_bytes(dst, val);
1976                 }
1977             }
1978         }
1979         return Ok(ret);
1980 
1981         fn write_bytes(dst: &mut Vec<u8>, val: impl AsRef<[u8]>) {
1982             let val = val.as_ref();
1983             write_usize(dst, val.len());
1984             dst.extend_from_slice(val);
1985         }
1986 
1987         fn write_usize(dst: &mut Vec<u8>, val: usize) {
1988             dst.extend(&u32::to_le_bytes(val as u32));
1989         }
1990     }
1991 }
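
/*
Illustration only, not part of Cargo: a sketch of the length-prefixed layout
used by `EncodedDepInfo` above (a 4-byte little-endian count, then each field
as a 4-byte little-endian length followed by its bytes). It is simplified to a
flat list of strings, without the path-type tag or the env section; `encode`
and `decode` are hypothetical names invented for this example.

```rust
fn write_u32(dst: &mut Vec<u8>, val: usize) {
    dst.extend_from_slice(&(val as u32).to_le_bytes());
}

/// Serialize `fields` as: count, then (length, bytes) for each field.
pub fn encode(fields: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    write_u32(&mut out, fields.len());
    for f in fields {
        write_u32(&mut out, f.len());
        out.extend_from_slice(f.as_bytes());
    }
    out
}

/// Read a little-endian u32 and advance the cursor past it.
fn read_u32(bytes: &mut &[u8]) -> Option<usize> {
    let head = bytes.get(..4)?;
    let n = u32::from_le_bytes([head[0], head[1], head[2], head[3]]) as usize;
    *bytes = &bytes[4..];
    Some(n)
}

/// Parse what `encode` wrote. Returns `None` on truncated or non-UTF-8 input.
pub fn decode(mut bytes: &[u8]) -> Option<Vec<String>> {
    let bytes = &mut bytes;
    let count = read_u32(bytes)?;
    let mut fields = Vec::new();
    for _ in 0..count {
        let len = read_u32(bytes)?;
        let raw = bytes.get(..len)?;
        fields.push(String::from_utf8(raw.to_vec()).ok()?);
        *bytes = &bytes[len..];
    }
    Some(fields)
}
```

The sketch grows its `Vec` on demand instead of preallocating from the stored
count, so a corrupt count cannot request a huge allocation up front; that is a
choice made for the example, not a description of the code above.
*/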
1992 
1993 /// Parse the `.d` dep-info file generated by rustc.
1994 pub fn parse_rustc_dep_info(rustc_dep_info: &Path) -> CargoResult<RustcDepInfo> {
1995     let contents = paths::read(rustc_dep_info)?;
1996     let mut ret = RustcDepInfo::default();
1997     let mut found_deps = false;
1998 
1999     for line in contents.lines() {
2000         if let Some(rest) = line.strip_prefix("# env-dep:") {
2001             let mut parts = rest.splitn(2, '=');
2002             let env_var = match parts.next() {
2003                 Some(s) => s,
2004                 None => continue,
2005             };
2006             let env_val = match parts.next() {
2007                 Some(s) => Some(unescape_env(s)?),
2008                 None => None,
2009             };
2010             ret.env.push((unescape_env(env_var)?, env_val));
2011         } else if let Some(pos) = line.find(": ") {
2012             if found_deps {
2013                 continue;
2014             }
2015             found_deps = true;
2016             let mut deps = line[pos + 2..].split_whitespace();
2017 
2018             while let Some(s) = deps.next() {
2019                 let mut file = s.to_string();
2020                 while file.ends_with('\\') {
2021                     file.pop();
2022                     file.push(' ');
2023                     file.push_str(deps.next().ok_or_else(|| {
2024                         internal("malformed dep-info format, trailing \\".to_string())
2025                     })?);
2026                 }
2027                 ret.files.push(file.into());
2028             }
2029         }
2030     }
2031     return Ok(ret);
2032 

    // rustc writes each env var name and value on a single line, so it has
    // to escape `\r` and `\n`. The escapes are backslash sequences such as
    // `\n`, which means that `\` itself also has to be escaped.
    fn unescape_env(s: &str) -> CargoResult<String> {
        let mut ret = String::with_capacity(s.len());
        let mut chars = s.chars();
        while let Some(c) = chars.next() {
            if c != '\\' {
                ret.push(c);
                continue;
            }
            match chars.next() {
                Some('\\') => ret.push('\\'),
                Some('n') => ret.push('\n'),
                Some('r') => ret.push('\r'),
                Some(c) => bail!("unknown escape character `{}`", c),
                None => bail!("unterminated escape character"),
            }
        }
        Ok(ret)
    }
}
