Comparing generated executables for equivalence

I need to compare two executables and/or shared objects, compiled using the same compiler and flags, and verify that they have not changed. We work in a regulated environment, so for testing purposes it would be really useful to isolate exactly which parts of the executable have changed.

Comparing MD5 sums or other hashes doesn't work, because the headers contain information about the file.

Does anyone know of a program or way to verify that 2 files are executionally the same even if they were built at a different time?


An interesting question. I have a similar problem on Linux. Intrusion detection systems like OSSEC or tripwire may generate false positives if the hash sum of an executable changes all of a sudden. This may be nothing worse than the Linux "prelink" program patching the executable file for faster startups.

In order to compare two binaries (in the ELF format), one can use the "readelf" executable and then "diff" to compare outputs. I'm sure there are refined solutions, but without further ado, a poor man's comparator in Perl:

#!/usr/bin/perl
# Dump the ELF file header, the full readelf report and a hexdump of every
# section, so that the output of two runs can be compared with diff.

use strict;
use warnings;

my $exe = $ARGV[0];

if (!$exe) {
   die "Please give name of executable\n";
}
if (! -f $exe) {
   die "Executable $exe not found or not a file\n";
}
if (! (`file '$exe'` =~ /\bELF\b.*?\bexecutable\b/)) {
   die "file command says '$exe' is not an ELF executable\n";
}

# Identify sections in ELF

my @lines = pipeIt("readelf --wide --section-headers '$exe'");

my @sections = ();

for my $line (@lines) {
   if ($line =~ /^\s*\[\s*(\d+)\s*\]\s+(\S+)/) {
      my $secnum = $1;
      my $secnam = $2;
      print "Found section $secnum named $secnam\n";
      push @sections, $secnam;
   }
}

# Dump file header

@lines = pipeIt("readelf --file-header --wide '$exe'");
print @lines;

# Dump everything readelf reports (headers, symbols, relocations, ...)

@lines = pipeIt("readelf --all --wide '$exe'");
print @lines;

# Dump individual sections as hexdump

for my $section (@sections) {
   @lines = pipeIt("readelf --hex-dump='$section' --wide '$exe'");
   print @lines;
}

# Run a command and return its standard output as a list of lines
sub pipeIt {
   my($cmd) = @_;
   my $fh;
   open ($fh,"$cmd |") or die "Could not open pipe from command '$cmd': $!\n";
   my @lines = <$fh>;
   close $fh or die "Could not close pipe to command '$cmd': $!\n";
   return @lines;
}

Now you can run for example, on machine 1:

./ /usr/bin/curl > curl_machine1

And on machine 2:

./ /usr/bin/curl > curl_machine2

After having copy-pasted, SFTP-ed or NFS-ed (you don't use FTP, do you?) the files into the same filetree, compare the files:

diff --side-by-side --width=200 curl_machine1 curl_machine2 | less

In my case, differences exist in sections ".gnu.conflict", ".gnu.liblist", ".got.plt" and ".dynbss", which might be OK for a "prelink" intervention, but not in the code section, ".text", where a difference would be a Bad Sign.
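
If the side-by-side diff is too noisy, a small filter can reduce it to just the names of the sections that differ. The following is only a sketch: it assumes each hexdump in the two output files starts with the "Hex dump of section '<name>':" header that readelf prints, and the function names tag_lines and changed_sections are made up for this example.

```shell
# Prefix every hex-dump line with the name of the section it belongs to.
# Lines before the first "Hex dump of section" header are ignored.
tag_lines() {
    awk '
        /^Hex dump of section / { split($0, q, "\047"); name = q[2]; next }
        name != "" && NF > 0    { print name "\t" $0 }
    ' "$1"
}

# Print the names of the sections whose dumps differ between two dump files.
changed_sections() {
    left=$(mktemp) && right=$(mktemp) || return 2
    tag_lines "$1" > "$left"
    tag_lines "$2" > "$right"
    # diff prefixes changed lines with "< " or "> "; keep only the section tag
    diff "$left" "$right" |
        awk -F '\t' '/^[<>] / { sub(/^[<>] /, "", $1); print $1 }' |
        sort -u
    rm -f "$left" "$right"
}
```

Calling changed_sections curl_machine1 curl_machine2 would then print one section name per line, which makes it easy to see at a glance whether ".text" is among them.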

To follow up, here is what I came up with finally:

Instead of comparing the final executables & shared objects, we compared the .o files output before linking. We assumed that the linking process was sufficiently reproducible that this would be fine.

It works in some of our cases, where we have two builds in which we've made some small change that shouldn't affect the final code (for example, running a code pretty-printer), but it doesn't help us if we do not have the intermediate build output.
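
For anyone wanting to try the same thing, the comparison can be sketched roughly as below. This is not our actual tooling: the function name compare_objects and the two-directory layout are assumptions for illustration, and it only works if the object files themselves are byte-for-byte reproducible (no embedded build paths or dates).

```shell
# Compare every .o file under one build tree with its counterpart in another.
# Prints "MISSING <path>" or "DIFFERS <path>"; no output means all matched.
# Files that exist only in the second tree are not reported.
# Assumes file names contain no newlines.
compare_objects() {
    old_dir=$1
    new_dir=$2
    (cd "$old_dir" && find . -type f -name '*.o') | sort |
    while IFS= read -r rel; do
        rel=${rel#./}
        if [ ! -f "$new_dir/$rel" ]; then
            echo "MISSING $rel"
        elif ! cmp -s "$old_dir/$rel" "$new_dir/$rel"; then
            echo "DIFFERS $rel"
        fi
    done
}
```

Usage would be something like compare_objects build_before build_after, where both directory names are placeholders.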

You can compare the contents of RO and RW initialized sections by generating a binary file from the ELF file.

objcopy <elf_file> -O binary <binary_file>

Then compare the generated binary files to check whether they are identical, using diff, for example.

In my opinion, this is enough to guarantee you are generating the same executable.
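
Wrapped up as a small helper, the idea might look like the sketch below. The function name same_image and the OBJCOPY override (handy when a cross toolchain's objcopy has a different name) are assumptions of this example, not anything objcopy itself requires.

```shell
# Return 0 if the raw binary images of two ELF files are identical,
# 1 if they differ, and another non-zero status if objcopy fails.
same_image() {
    img1=$(mktemp) && img2=$(mktemp) || return 2
    "${OBJCOPY:-objcopy}" -O binary "$1" "$img1" &&
        "${OBJCOPY:-objcopy}" -O binary "$2" "$img2" &&
        cmp -s "$img1" "$img2"
    rc=$?
    rm -f "$img1" "$img2"
    return "$rc"
}
```

For example: same_image build1/app build2/app && echo "images match" (the paths are placeholders). Keep in mind that the binary output only covers the contents of loadable sections, so symbol tables and debug information are not part of the comparison.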

A few years back I had to do the same thing. We had to prove that we could rebuild the executable from source when given only a revision number, revision control repository, build tools, and build configuration. Note: If any of these change you may see a difference.

I remember there are some timestamps in the executable. The trick is to realize that the file is not just an opaque bunch of bytes that cannot be interpreted. The file has sections; most will not change, but there will be a section for the time of build (or some such thing).

I don't remember all the details, but the commands you will need are { objcopy, objdump, nm }, I think objdump would be the first to try.
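
As a starting point with objdump, something along these lines could be tried. It is a rough sketch from memory rather than what we actually ran: the function names disassemble and diff_code and the OBJDUMP override are made up for this example.

```shell
# Disassemble the code sections of an ELF file without raw instruction bytes.
disassemble() {
    # The banner line ("<file>:     file format ...") embeds the file name,
    # which would always differ between two files, so drop it.
    "${OBJDUMP:-objdump}" -d --no-show-raw-insn "$1" | grep -v 'file format'
}

# Show the differences between the disassembly of two files.
# Returns diff's exit status: 0 if identical, 1 if they differ.
diff_code() {
    asm1=$(mktemp) && asm2=$(mktemp) || return 2
    disassemble "$1" > "$asm1"
    disassemble "$2" > "$asm2"
    diff "$asm1" "$asm2"
    rc=$?
    rm -f "$asm1" "$asm2"
    return "$rc"
}
```

If the code is the same, diff_code prints nothing. Once a single instruction is added or removed, every later address shifts, so the diff gets noisy quickly; it is most useful as a yes/no check on the code sections.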

Hope this helps.
