Python MD5 Hashing Example

MD5, defined in RFC 1321, is a hash algorithm that turns an input of any length into a fixed-length 128-bit (16-byte) hash value.

Note
MD5 is not collision-resistant: two different inputs can be crafted to produce the same hash value. Read about the known MD5 vulnerabilities before relying on it.

In Python, we can use hashlib.md5() to generate an MD5 hash value from a string. Because hashlib works on bytes, the string must be encoded first.

1. Python MD5 Hashing


import hashlib

# hash a bytes literal directly
result = hashlib.md5(b"Hello MD5").hexdigest()
print(result)

# a str must be encoded to bytes before hashing
result = hashlib.md5("Hello MD5".encode("utf-8")).hexdigest()
print(result)

m = hashlib.md5(b"Hello MD5")
print(m.name)        # algorithm name
print(m.digest_size) # 16 bytes (128 bits)
print(m.digest())    # raw digest as bytes
print(m.hexdigest()) # digest as a hex string

Output

Terminal

e5dadf6524624f79c3127e247f04b548
e5dadf6524624f79c3127e247f04b548
md5
16
b'\xe5\xda\xdfe$bOy\xc3\x12~$\x7f\x04\xb5H'
e5dadf6524624f79c3127e247f04b548
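
The hash object does not need the whole input at once. As a minimal sketch (the variable names are illustrative only), feeding the same bytes through several update() calls should give the same digest as hashing them in a single call. The file checksum in the next section relies on this behavior.

```python
import hashlib

# one-shot: hash the whole input in a single call
one_shot = hashlib.md5(b"Hello MD5").hexdigest()

# incremental: feed the same bytes in two pieces
m = hashlib.md5()
m.update(b"Hello ")
m.update(b"MD5")
incremental = m.hexdigest()

print(one_shot == incremental)  # True
```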

2. Python MD5 File Checksum

Here is a text file

c:\\test\\readme.txt

Hello MD5

The following Python code reads the file in chunks and hashes its content with MD5.


import hashlib


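def md5checksum(fname):

    md5 = hashlib.md5()

    # open in binary mode; the with block closes the file when done
    with open(fname, "rb") as f:
        # read in 4096-byte chunks (the := operator requires Python 3.8+)
        while chunk := f.read(4096):
            md5.update(chunk)

    return md5.hexdigest()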


result = md5checksum("c:\\test\\readme.txt")
print(result)

Output

Terminal

e5dadf6524624f79c3127e247f04b548
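
The := operator in the loop above only exists in Python 3.8 and later. For older interpreters, one possible alternative (a sketch, not the only way) uses the two-argument form of iter(), which keeps calling f.read() until it returns an empty bytes object. The name md5checksum_compat, its chunk_size parameter, and the temporary file are made up for this illustration.

```python
import hashlib
import os
import tempfile


def md5checksum_compat(fname, chunk_size=4096):
    """Hash a file in chunks without the := operator (hypothetical helper)."""
    md5 = hashlib.md5()
    with open(fname, "rb") as f:
        # iter(callable, sentinel) stops when read() returns b"" at end of file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


# write a small temporary file so the example is self-contained
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hello MD5")
    path = tmp.name

file_hash = md5checksum_compat(path)
os.remove(path)

# the file hash should match hashing the same bytes in memory
print(file_hash == hashlib.md5(b"Hello MD5").hexdigest())  # True
```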

About Author

Founder of Mkyong.com; loves Java and open source.

Comments

operation420.net
1 year ago

Hello, what is the minimum Python version for this to work? I am using 3.6.8 and for the part:

while chunk := f.read(4096):

   while chunk := infile_obj.read(4096): md5.update(chunk)
                ^
SyntaxError: invalid syntax

What exactly does the := (colon-equal) part mean? I haven’t seen that before, although this works on newer versions of Python3.

Also, for the number (4096) in the .read call, does that number matter? I’ve previously calculated MD5SUMs in Python2 and the Syntax was different and the .read call used 8192 as an argument. Does changing this value affect the calculated checksum?

Thanks.

Can changing MD5 to SHA1, SHA256, SHA512 etc calculate those checksums with the same code?

Anonymous
1 year ago

:= is called the walrus operator (an assignment expression) and is available in Python 3.8+.

The number in the read call is the number of bytes to read at a time. It doesn’t affect the checksum, but it does affect the performance.

Yes, you can change the hash algorithm by changing the hashlib.md5() to hashlib.sha1() etc.
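
Both points can be checked with a short sketch. The helper checksum() and its parameters are hypothetical, introduced only for this illustration. It hashes an in-memory stream with different chunk sizes, then with other algorithm names passed to hashlib.new().

```python
import hashlib
import io


def checksum(stream, algorithm="md5", chunk_size=4096):
    """Hash a binary stream in chunks (hypothetical helper)."""
    h = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


data = b"Hello MD5" * 1000  # 9000 bytes, so it spans several chunks

# the chunk size changes how many reads happen, not the result
small = checksum(io.BytesIO(data), chunk_size=4096)
large = checksum(io.BytesIO(data), chunk_size=8192)
print(small == large)  # True

# the same loop works for other algorithms that hashlib provides
for name in ("sha1", "sha256", "sha512"):
    print(name, checksum(io.BytesIO(data), algorithm=name))
```

Only the read size and the algorithm name change between calls; the hashing loop itself stays the same.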