markneustadt.com

A little tidbit about MD5 Checksums

Checksums are handy. According to Wikipedia:

A checksum or hash sum is a small-size datum from a block of digital data for the purpose of detecting errors which may have been introduced during its transmission or storage.

We find ourselves using checksums to verify that the file a developer has created is the same as what has been deployed.

What is interesting about the MD5 checksum is that the file name and date are not part of the calculation.

For a previous post, I created multiple files.  The files were supposed to be identical for the test to be valid, but each one was 24,000 lines long, which is too long to compare by eye.  Two of them had different names and different dates, yet the MD5 checksum of both files is the same.
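To see that only the contents feed the checksum, here is a minimal Python sketch using the standard hashlib module. The file names, the placeholder contents, and the helper names md5_hex and same_contents_demo are made up for illustration; they are not the files from the earlier post.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def md5_hex(path):
    """Return the MD5 digest of the file's bytes as 32 hex characters."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def same_contents_demo():
    """Write identical bytes to two differently named files and hash both."""
    contents = b"line of test data\n" * 24000  # placeholder contents
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "first_name.txt"    # hypothetical file names
        second = Path(tmp) / "second_name.txt"
        first.write_bytes(contents)
        second.write_bytes(contents)
        # Push the second file's timestamps one day into the past,
        # so the two files differ in both name and date.
        day_ago = first.stat().st_mtime - 86400
        os.utime(second, (day_ago, day_ago))
        return md5_hex(first), md5_hex(second)


if __name__ == "__main__":
    a, b = same_contents_demo()
    print(a == b)  # True: only the bytes are hashed
```

The function never passes the name or the timestamp to hashlib, only the bytes read from the file, which is why neither can affect the result.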

In Windows, you can use the command certutil to reveal the checksum.

C:\>certutil -hashfile "FileName.txt" MD5

That command shows the MD5 checksum for the file.  The checksum is 16 bytes long and is usually written as 32 hexadecimal digits.  If you run the command against two different files that have the same contents, the checksum will be the same.  Similarly, if you rename the file and run it again, the checksum will not change, because the contents of the file haven’t changed.

E:\Dev\Git>certutil -hashfile executetrycatch1000.txt MD5
MD5 hash of file executetrycatch1000.txt:
3c 06 af 0c e3 b8 50 ab 1e e9 9a 32 7a 39 b7 7e
CertUtil: -hashfile command completed successfully.

E:\Dev\Git>certutil -hashfile execute1000.txt MD5
MD5 hash of file execute1000.txt:
3c 06 af 0c e3 b8 50 ab 1e e9 9a 32 7a 39 b7 7e
CertUtil: -hashfile command completed successfully.

E:\Dev\Git>certutil -hashfile execute1000_renamed.txt MD5
MD5 hash of file execute1000_renamed.txt:
3c 06 af 0c e3 b8 50 ab 1e e9 9a 32 7a 39 b7 7e
CertUtil: -hashfile command completed successfully.

Here, I ran the command against two different files that should (and do) have the same contents.  Then I renamed the second file and ran the command against the renamed copy.  All three checksums came out the same.
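The same check can be scripted when certutil is not at hand. The following is a rough Python sketch, not a replacement for certutil: the function names md5_digest, spaced_hex, and files_match are my own, and the demo file names are hypothetical. It reads each file in chunks, so a large file does not need to fit in memory, and it formats the digest as space-separated hex pairs like the output above.

```python
import hashlib
import os
import tempfile


def md5_digest(path, chunk_size=65536):
    """Hash a file in chunks and return the raw 16-byte MD5 digest."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.digest()


def spaced_hex(digest):
    """Format a digest as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in digest)


def files_match(built_path, deployed_path):
    """Return True when both files have the same MD5 digest."""
    return md5_digest(built_path) == md5_digest(deployed_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        built = os.path.join(tmp, "built.txt")        # hypothetical names
        deployed = os.path.join(tmp, "deployed.txt")
        for name in (built, deployed):
            with open(name, "wb") as out:
                out.write(b"same contents\n")
        renamed = os.path.join(tmp, "deployed_renamed.txt")
        os.rename(deployed, renamed)
        print(spaced_hex(md5_digest(built)))
        print(files_match(built, renamed))  # True: renaming changes nothing
```

A matching MD5 is a good guard against accidental differences between a built file and a deployed one. MD5 is not collision resistant, so it should not be relied on to detect deliberate tampering.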
