As developers, we are used to see the difference between two (text) files via diff, git diff, or related tools. How to see the difference between two Excel files?

There are many options, and your choice would probably depend on what you mean by “difference”.

1) CSV + Meld

This is the easiest approach. Save both files as .csv, and use diff to compare them. Alternatively, you can use Meld, which is basically a UI for diff.

For big Excel files, or for two files with lots of differences, it’s usage is limited.

2) ExcelCompare

ExcelCompare is a sophisticated Java based tool with lots of options. I guess it’s most useful if you already have an idea what kind of difference to expect.

3) CSV + MySQL

This approach requires the most effort but comes with high flexibility.

First, save both Excel files in .csv format.

Then, create a scratch database in MySQL:

> create database excel_diff_tmp;

Next, create two tables which match your Excel file’s tables:

CREATE TABLE `sheet1` (
  id int(11) NOT NULL AUTO_INCREMENT,
  column_a varchar(255) DEFAULT NULL,
  column_b varchar(255) DEFAULT NULL,
  # ...
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;

Now, load both .csv files into the database using LOAD DATA LOCAL INFILE:

LOAD DATA LOCAL INFILE '/path/to/file1.csv'
INTO TABLE sheet1 
FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

You now have the full power of SQL at you hands to compare these two tables. You can do things like

  • Show rows which are in file 1, but not in file 2 (using all columns)
  • Show rows which are in file 1, but not in file 2 (using only some columns)
  • Show rows equal in both files
  • etc.

From MySQL Workbench, you could export the result back as a .csv file, which you can finally convert to Excel format.