MD5 hash discrepancy between Python and PHP?

I'm trying to create a checksum of a binary file (flv/f4v, etc) to verify the contents of the file between the server and client computers. The application that's running on the client computer is python-based, while the server is using PHP.

PHP code is as follows:

$fh = fopen($filepath, 'rb');
$contents = fread($fh, filesize($filepath));
$checksum = md5(base64_encode($contents));
fclose($fh);

Python code is as follows:

import hashlib

def _get_md5(filepath):
    fh = open(filepath, 'rb')
    md5 = hashlib.md5()
    md5.update(fh.read().encode('base64'))
    checksum = md5.hexdigest()
    fh.close()
    return checksum

On the particular file I'm testing, the PHP and Python MD5 hash strings are as follows, respectively:

cfad0d835eb88e5342e843402cc42764
0a96e9cc3bb0354d783dfcb729248ce0

The server is running CentOS, while the client is a Mac OS X environment. I would greatly appreciate any help in understanding why the two are generating different hash results, or whether it is something I overlooked (I am relatively new to Python...). Thank you!

[post mortem: the problem was ultimately the difference between Python and PHP's base64 encoding varieties. MD5 works the same between the two scripting platforms (at least using .hexdigest() in Python).]

Answers


I would rather assume that the base64 implementations differ.

EDIT

PHP:

php -r 'var_dump(base64_encode(str_repeat("x", 10)));'
string(16) "eHh4eHh4eHh4eA=="

Python (Note the trailing newline):

>>> ("x" * 10).encode('base64')
'eHh4eHh4eHh4eA==\n'

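PHP and Python use different base64 flavors: Python's 'base64' codec follows the MIME convention, inserting a newline after every 76 characters of output and at the end, while PHP's base64_encode emits a single unbroken string.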
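
For readers on Python 3, where str.encode('base64') no longer exists, the same two flavors can be reproduced with the base64 module. This is only a sketch for comparison, not part of the original answer; the variable names are illustrative.

```python
import base64

data = b"x" * 10

# base64.b64encode emits no line breaks, like PHP's base64_encode().
plain = base64.b64encode(data)

# base64.encodebytes follows the MIME convention: a newline after every
# 76 output characters and one at the end, which is what Python 2's
# str.encode('base64') produced.
mime = base64.encodebytes(data)

print(plain)  # b'eHh4eHh4eHh4eA=='
print(mime)   # b'eHh4eHh4eHh4eA==\n'

# Longer input picks up embedded newlines as well in the MIME flavor.
long_mime = base64.encodebytes(b"x" * 100)
print(long_mime.count(b"\n"))  # 2
```

Because MD5 hashes every byte it is given, even the single trailing newline is enough to produce a completely different digest.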


The problem seems to be that you're base64-encoding the file data, which changes the structure of the binary data; in PHP, I believe it does not base64-encode the file.

Give this a go:

import hashlib

def md5_file(filename):
    # MD5 object
    crc = hashlib.md5()
    # File pointer object
    fp = open(filename, 'rb')

    # Loop over the file to update the hash checksum
    for i in fp:
        crc.update(i)

    # Close the resource
    fp.close()

    # Return the hash
    return crc.hexdigest()

and within PHP use md5_file and see if that works accordingly.

python taken from: http://www.php2python.com/wiki/function.md5-file/
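
Iterating a binary file "by line" works, but a large video file may contain very long runs without a newline byte. A variant that reads fixed-size chunks might look like the sketch below; the function name md5_file_chunked and the 64 KiB chunk size are assumptions chosen for illustration, not something from the linked page.

```python
import hashlib

def md5_file_chunked(filename, chunk_size=65536):
    """Return the hex MD5 digest of a file's raw bytes, read in chunks."""
    digest = hashlib.md5()
    with open(filename, "rb") as fp:
        # iter() with a sentinel stops once read() returns b"" at EOF.
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Since no base64 step is involved, the result should be comparable with PHP's md5_file() on the same file.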


Python appends a newline '\n' to the string when using .encode('base64'), so the input strings to the md5 function are different. This issue in the Python bug tracker explains it in detail. See below for the gist of it:

>>> import base64
>>> s='I am a string'
>>> s.encode('base64')
'SSBhbSBhIHN0cmluZw==\n'
>>> base64.b64encode(s)
'SSBhbSBhIHN0cmluZw=='
>>> s.encode('base64')== base64.b64encode(s)+'\n'
True
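
Applied to the question's function, that suggests encoding with base64.b64encode before hashing. The following is a hedged Python 3 sketch (the session above is Python 2); md5_of_base64 is a hypothetical name, and it assumes the PHP side keeps using md5(base64_encode($contents)).

```python
import base64
import hashlib

def md5_of_base64(filepath):
    """Hex MD5 of the file's base64 text, encoded without line breaks."""
    with open(filepath, "rb") as fh:
        contents = fh.read()
    # b64encode adds no newlines, so the bytes being hashed should match
    # what PHP's md5(base64_encode($contents)) sees.
    return hashlib.md5(base64.b64encode(contents)).hexdigest()
```

Like the original code, this reads the whole file into memory, which may be a concern for large video files.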
