Hashes and Checksums¶
Makex uses hashes (or checksums) as part of the execution process to reduce the number of Tasks requiring re-execution.
Hashing Files¶
Makex creates and stores hashes of a Task's input and output files, and a hash of the Makex file in which the Task was defined.
Each file Makex acts on has a hash, which is a digest of its contents, and a fingerprint, which is its modification time concatenated with its size.
A hash is not valid by itself without a fingerprint.
A hash is regenerated if no hash has been generated for the file before, or if the file's current fingerprint doesn't match the fingerprint stored for that file.
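The regeneration rule above can be sketched in Python as follows. This is a minimal illustration, not Makex's actual code: the function names, the `stored` dictionary, and the exact fingerprint encoding (nanosecond modification time, a colon, then the size) are all assumptions made for the example.

```python
import hashlib
import os


def fingerprint(path):
    """Modification time concatenated with size."""
    st = os.stat(path)
    # The exact encoding and the ":" separator are assumptions.
    return f"{st.st_mtime_ns}:{st.st_size}"


def content_hash(path, algorithm="sha256"):
    """Digest of the file's contents, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def current_hash(path, stored):
    """Return the file's hash, regenerating it only when needed.

    `stored` is a hypothetical dict mapping path -> (hash, fingerprint).
    """
    fp = fingerprint(path)
    entry = stored.get(path)
    if entry is not None and entry[1] == fp:
        # Fingerprint matches: the stored hash is still valid.
        return entry[0]
    # No stored hash, or the fingerprint changed: regenerate.
    new_hash = content_hash(path)
    stored[path] = (new_hash, fp)
    return new_hash
```

Comparing fingerprints first means the file's contents are read only when its modification time or size has changed, which is far cheaper than rehashing every file on every run.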
Task Hashing¶
Makex also hashes Tasks themselves. Hashing a Task produces a unique and stable identifier based on:
The Task’s name.
The Task’s output path.
The Task’s required input files.
The unique and stable identifier of any of the Task’s requirements which may be Tasks themselves.
The Task’s Actions, and their arguments.
The Makex file in which the Task was defined. Note: Any change to this file will cause the Task to become stale.
Any environment variables used in the Makex file. Makex records which environment variables a Makex file uses.
If any of these change, the hash will change, and the Task will be re-executed.
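One way to combine the items listed above into a single identifier is sketched below. It is an assumption-laden illustration rather than Makex's implementation: the `task_hash` function, its parameter names, the JSON serialization, and the choice to sort input and requirement hashes are all hypothetical.

```python
import hashlib
import json


def task_hash(name, output_path, input_hashes, requirement_hashes,
              actions, makex_file_hash, environment):
    """Combine a Task's identifying parts into one stable hex digest.

    All parameter names are hypothetical; they mirror the list above.
    """
    payload = {
        "name": name,
        "output_path": output_path,
        # Hashes of the required input files; sorted so order is irrelevant.
        "inputs": sorted(input_hashes),
        # Identifiers of required Tasks, computed the same way.
        "requirements": sorted(requirement_hashes),
        # Actions and their arguments; their order is kept as given.
        "actions": actions,
        "makex_file": makex_file_hash,
        # Only the environment variables the Makex file uses.
        "environment": environment,
    }
    # sort_keys makes the serialization independent of dict ordering.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

In this sketch, calling `task_hash` twice with the same arguments yields the same digest, while changing any single argument (for example, one environment variable's value) yields a different one.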
Where hashes are stored¶
Typically, Makex stores hashes and fingerprints in the extended attributes of a file.
The filesystem of both the cache and workspace SHOULD have extended attributes support.
The following is a non-exhaustive list of filesystems which support extended attributes:
Linux (Most of them): ext2/3/4, XFS, ZFS, Btrfs…
Mac: HFS+…
Windows: FAT, HPFS, NTFS…
The attribute in which the hash and fingerprint are stored is named user.checksum.{type}, where {type} is one of sha256 or md5.
If a filesystem without extended attribute support is detected, Makex will fall back to storing hashes in a local database.
Note
Currently, this detection will fail if the filesystem is read-only.
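The storage scheme described in this section can be sketched with Python's `os.setxattr` and `os.getxattr`, which are available on Linux only. This is an illustration under stated assumptions, not Makex's code: the helper names are hypothetical, an in-memory dictionary stands in for the local database, and the stored value is treated as an opaque string because the exact encoding of the hash and fingerprint is not specified here.

```python
import os

# Hypothetical stand-in for the local database used as a fallback.
_fallback_db = {}


def _attribute_name(algorithm):
    # Follows the user.checksum.{type} naming described above.
    return f"user.checksum.{algorithm}"


def store_checksum(path, value, algorithm="sha256"):
    """Store `value` in an extended attribute, or in the fallback store."""
    try:
        os.setxattr(path, _attribute_name(algorithm), value.encode("utf-8"))
    except (AttributeError, OSError):
        # AttributeError: os.setxattr is unavailable on this platform.
        # OSError: the filesystem rejected the attribute.
        _fallback_db[(os.path.abspath(path), algorithm)] = value


def load_checksum(path, algorithm="sha256"):
    """Return the stored value, or None if nothing has been stored."""
    try:
        return os.getxattr(path, _attribute_name(algorithm)).decode("utf-8")
    except (AttributeError, OSError):
        # Missing attribute or no xattr support: consult the fallback store.
        return _fallback_db.get((os.path.abspath(path), algorithm))
```

Keeping the checksum in an extended attribute means it travels with the file itself, so no separate index has to be kept in sync when extended attributes are available.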