Visual Basic for Applications/File Hashing in VBA

Summary

 * This section contains code for making file hashes, that is, hashes of entire files.
 * Several algorithms are provided, with output options for base64 or hex. The VBA code below generates the digests for the MD5, SHA1, SHA2-256, SHA2-384, and SHA2-512 hashes.
 * The code is made for single files, but the code given on an adjacent page, Folder Hashing in VBA, can be used for recursive hash listings, again with a choice of hashes and output options.
 * String hash routines are given in another section.
 * In general these hashes do not make use of a seed value, but to illustrate the method, the code contains one such example (FileToSHA512SALT). Please note that its output differs from that of the SHA512Managed class. A note in that procedure covers the case where other salted (seeded) inputs are of interest.
 * The listed algorithms can hash any single file up to about 200 MB (megabytes) in length, beyond which an out-of-memory error is generated in GetFileBytes. Specific tests found that the hashes work well for a 200 MB zip file but fail for a 500 MB zip file; the exact break point is unclear. For files larger than 200 MB, other facilities exist.
 * Large-file hashing, say beyond 200 MB, is best done with other tools. Four such examples are mentioned here:
 * Microsoft's FCIV utility is a free download. It is a command-line application, capable of hashing both single files and whole folder trees. It handles large files with ease, but only for MD5 and SHA1 hashes. It sends both base64 and hex outputs to the screen but only base64 output to a file. Prepared files can be verified against any new run, but the results go only to the screen. It is a bit tricky to use, even with Microsoft's instructions, so the pages Running the FCIV Utility from VBA and File Checksum Integrity Verifier (FCIV) Examples might be of use to the novice. So far, Microsoft have not extended the utility to include contemporary algorithms.
 * PowerShell in Windows 8.1 and above can make large single-file hashes, using any of the MD5, SHA1, SHA256, SHA384, and SHA512 algorithms. It produces output on the screen only, though the output can also be piped to the clipboard for pasting as required. There are no simple options for hashing a folder or for output to an XML file. For completeness, an example of its use is given in File Checksum Integrity Verifier (FCIV) Examples. In Windows 10, hashes can also be obtained at the command prompt with certutil -hashfile FILEPATH MD5, where FILEPATH stands for the full path of the file, though size limitations are unclear. (Change MD5 to SHA1, SHA256, or SHA512, etc.)
 * An external application that can handle large files is MD5 and SHA Checksum Utility.  It is a stand-alone application, and a basic version is available as a free download.   It produces MD5, SHA1, SHA2/256, and SHA2/512 hashes for single files. The outputs are in HEX and are displayed together on a neat user interface. A more complex commercial version is also available.
 * FSUM Fast File Integrity Checker is another free, external application for command-line use. It resembles FCIV in many ways but includes up-to-date algorithms (MD2, MD4, MD5, SHA-1, SHA-2 (256, 384, 512), RIPEMD-160, PANAMA, TIGER, ADLER32, and CRC32). In addition to large-file hex hashes it can carry out flat or recursive folder hashes. The commands to enter are not identical to those of FCIV, but a text file is provided with examples of its use. The web page FSUM Fast File Integrity Checker has the download and other details, though the text file fails to mention that results can easily be piped to the clipboard with |clip. Although a graphical interface exists elsewhere, the command-line application has been found the most stable.
 * The permissions for files need to be considered when attempting hashing. Hashing has to access files to obtain the bytes that they contain. Although this does not involve actually running the files, some folder and file types might be found locked at run time. In fact, this type of access is the main difference between string hashing and file hashing. Whenever files are accessed, error handling tends to be needed. It is assumed here that users will add their own error handling, or that they will skip troublesome files before the hashing attempt. Users should know that the hash routines cannot handle an empty file; for example, a Notepad file that has been saved without any text in it. The GetFileBytes routine would raise an error, so a message and exit are produced if an empty file is encountered, as for a file in excess of 200 MB.
 * User files and folders have few restrictions. The empty-file problem apart, those who access user files in folders that they have made themselves will not usually have any problems. A recursive folder hashing module is given in another section of this series, Folder Hashing in VBA, which also contains notes on how to avoid virtual folder problems with music, video, and other Microsoft libraries.
 * Hashing is concerned only with the content of a file, and not its name or other file details. This means that duplicates of files under any name can be found by comparing their hashes. In secure systems with deliberately confusing file names, this means that a very long file list could be hashed until a searched-for hash value is found, rather than depending on a less secure file name to find it. Alternatively, files are sometimes named with their own hash value, so that hashing can reveal any error or illegal change. In such a case a hacker might change the file and then rename it with a corresponding hash, but without knowing the required hash algorithm or private string, the change will still be detected when the owner runs a hash verification.
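
The empty-file and size limits described in the summary can be tested before any hashing is attempted. The following is a minimal sketch only, not part of the page's listing: the function name IsHashableFile and the 200 MB constant are assumptions made for illustration, and the size is read with the Scripting.FileSystemObject rather than with GetFileBytes.

```vba
' Hypothetical helper (not from the page's listing): returns True only when
' the file exists, is not empty, and is within an assumed 200 MB ceiling.
Function IsHashableFile(ByVal sPath As String) As Boolean
    Const MAX_BYTES As Double = 200# * 1024# * 1024#   ' assumed limit
    Dim oFSO As Object
    Dim dSize As Double

    Set oFSO = CreateObject("Scripting.FileSystemObject")

    If Not oFSO.FileExists(sPath) Then
        MsgBox "File not found: " & sPath
        Exit Function
    End If

    ' Reading file details can fail for locked or restricted files
    On Error GoTo AccessFailed
    dSize = oFSO.GetFile(sPath).Size
    On Error GoTo 0

    If dSize = 0 Then
        MsgBox "File is empty and cannot be hashed: " & sPath
    ElseIf dSize > MAX_BYTES Then
        MsgBox "File exceeds the size limit for these routines: " & sPath
    Else
        IsHashableFile = True
    End If
    Exit Function

AccessFailed:
    MsgBox "Cannot read file details: " & sPath & vbCrLf & Err.Description
End Function
```

A calling procedure would test the path first, for example with If Not IsHashableFile(sPath) Then Exit Sub, and only then pass the path to a hash function.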

Code Listings
IMPORTANT. The hash routines were found to fail in a Windows 10, 64-bit Office setup. Subsequent checking revealed the solution: Windows must have the older .NET Framework 3.5 (which includes .NET 2.0 and 3.0) installed, and not only the .NET Framework 4.8 Advanced Services, which was the version already enabled in Turn Windows features on or off. When .NET Framework 3.5 was selected there, the routines worked perfectly.
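
A quick way to test for this condition is to try to create one of the hash classes and trap any failure. This is a sketch under stated assumptions: it supposes that the routines create the .NET classes through COM with CreateObject, and the procedure names HashClassesAvailable and TestHashClasses are hypothetical.

```vba
' Hypothetical check (not from the page's listing): tries to create one
' .NET hash class through COM and reports whether the attempt succeeded.
Function HashClassesAvailable() As Boolean
    Dim oSHA As Object

    On Error Resume Next
    Set oSHA = CreateObject("System.Security.Cryptography.SHA256Managed")
    HashClassesAvailable = (Err.Number = 0) And Not (oSHA Is Nothing)
    On Error GoTo 0
End Function

Sub TestHashClasses()
    If HashClassesAvailable() Then
        MsgBox "The hash class was created successfully."
    Else
        MsgBox "The hash class could not be created. Check that " & _
               ".NET Framework 3.5 is enabled in Windows features."
    End If
End Sub
```

Running TestHashClasses before the hash routines separates a missing framework from other causes of failure, such as a locked file.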

Modifications

 * Added default code for transfer of results to the clipboard, 11 Sep 2020
 * Set file selection dialog to open with all file types listed, 25 July 2019
 * Added file selection dialog, and file size limits, 17 Jun 2019

Using Built-in Windows Functions in VBA
The code to make hashes of STRINGS and for bulk file hashing is given elsewhere in this set. The panel below bears code that is virtually identical to that for strings but, with only slight modification, makes hashes of single whole FILES. The user provides a full path to the file, via a selection dialog, as the starting parameter. A parameter option allows a choice of hex or base-64 output. Functions are included for MD5, SHA1, SHA2-256, SHA2-384, and SHA2-512 hashes.

For frequent use, the selection dialog is most convenient, though the code contains a commented-out line for those who intend to type the file address into the procedure; simply comment out the line not needed.
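
For readers unfamiliar with the selection dialog, a file picker of this kind can be sketched as follows. This is an illustration only and may differ from the procedure in the listing: the function name PickFile is hypothetical, and it assumes a host application, such as Excel or Word, that provides Application.FileDialog.

```vba
' Hypothetical helper: returns the chosen full path, or "" if cancelled.
Function PickFile() As String
    With Application.FileDialog(3)          ' 3 = msoFileDialogFilePicker
        .Title = "Select a file to hash"
        .AllowMultiSelect = False
        .Filters.Clear
        .Filters.Add "All files", "*.*"     ' open with all file types listed
        If .Show = -1 Then PickFile = .SelectedItems(1)
    End With
End Function
```

The calling code would then check for an empty string before going any further, since that is what the function returns when the user cancels the dialog.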

In each case, coders can find the unmodified hash value in the bytes array, where it is held as eight-bit bytes, each a number from 0 to 255. The code that follows the filling of the bytes array decides which character set is used to present it. For hex output, using the characters 0-9 and A to F, each byte is split into two four-bit values, so the output has twice as many characters as the hash has bytes. For base-64 output, using mainly lower case letters, upper case letters, and digits, the bits are regrouped into six-bit values, one per output character. These two sets are the most useful here, since they consist of commonly used characters. The 128 and 256 character ASCII sets are too full of both exotic and non-printing characters to be useful. The bit count of each hash is constant, so the length of its output depends only on the chosen output type.
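
The relation between hash size and output length can be worked through in a few lines. The sketch below is illustrative and uses hypothetical names (BytesToHex, ShowOutputLengths). It converts a byte array to hex by hand, then prints the expected character counts for each hash, assuming that the base-64 output is padded with "=" and not counting any line breaks that an encoder might insert.

```vba
' Hypothetical helper: converts a byte array to an upper-case hex string.
Function BytesToHex(ByRef bytes() As Byte) As String
    Dim i As Long
    Dim sOut As String

    For i = LBound(bytes) To UBound(bytes)
        ' Each byte gives two hex characters; pad values below 16 with a zero
        sOut = sOut & Right$("0" & Hex$(bytes(i)), 2)
    Next i
    BytesToHex = sOut
End Function

Sub ShowOutputLengths()
    Dim vBits As Variant, vItem As Variant
    Dim nBytes As Long

    ' Bit counts for MD5, SHA1, SHA2-256, SHA2-384, and SHA2-512
    vBits = Array(128, 160, 256, 384, 512)

    For Each vItem In vBits
        nBytes = vItem \ 8
        ' Hex: two characters per byte. Base-64: four characters for
        ' every three bytes, rounded up to a whole group of four.
        Debug.Print vItem & " bits: " & nBytes * 2 & " hex chars, " & _
                    4 * ((nBytes + 2) \ 3) & " base64 chars"
    Next vItem
End Sub
```

For SHA2-256, for example, the 32 bytes of the hash give 64 hex characters or 44 base-64 characters.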

As a general point, the text of a message box cannot be selected for copying. If copying is needed, replace the message box with an input box, and set the output hash as the default value of the box; it can then be copied with ease. Alternatively, use the output of the Debug.Print method in the Immediate window. A procedure has been included to overwrite the clipboard with the results; if this is not intended, comment out the relevant line in the top procedure.
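
The input box substitution might look like the following minimal sketch. The procedure name ShowHashForCopying is hypothetical, and sHash stands for whatever string the chosen hash function returned.

```vba
' Hypothetical display routine: shows the hash as the default value of an
' input box, where it can be selected and copied, and also prints it.
Sub ShowHashForCopying(ByVal sHash As String)
    ' The return value of InputBox is not needed here, so it is discarded
    InputBox "Copy the hash from the box below:", "File Hash", sHash

    ' The same value is sent to the Immediate window
    Debug.Print sHash
End Sub
```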