Scanning Error Discovery!

I mentioned earlier that the scanned version of Aladore 1914 had 418 pages while the 1915 edition had 416 pages.  There weren’t any obvious differences between the front and back matter, which meant that somewhere in there the 1915 scan was missing two pages… scary!

To track down the difference, it would be easiest to have the two PDFs side-by-side or overlaid for a visual comparison.  A side-by-side view is a feature of many paid, full-featured PDF tools, such as Adobe Acrobat or Nitro Pro.  This still requires turning pages on each PDF separately, which can get annoying if the problem is a few hundred pages in!  I am not aware of any free PDF reader that has this view.  A paid option (with a free trial) that specializes in this task is DiffPDF.  However, as usual, I want to steer towards something as open as possible instead.

An open source alternative is (the quite similarly named) diff-pdf.  It is not very polished, but is pretty neat once you figure it out.  Basically, download the package and unzip it.  The program is run from the command line.  To make things simple, I put the two Aladore PDFs I wanted to compare into the diff-pdf directory.  I then ran the command:

[file path]\diff-pdf.exe --view aladore_1914_IA.pdf aladore_1915_IA.pdf

This analyses the listed files and starts up a GUI visualization of the two PDFs overlaid.  It looks like this:

Aladore title page in diff-pdf

Luckily, the discrepancy between the two scanned editions was easy to find: it was immediately obvious in diff-pdf when the editions got out of sync.  It turns out the 1915 scan is missing pages viii and ix, in the table of contents.  This is a typical scanning error: the pages probably stuck together, and the operator didn’t notice when turning the page.  At least it was in the table of contents and not the text itself.
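
For anyone who would rather not eyeball a few hundred pages, the same "where do they go out of sync" idea can be sketched in a few lines of Python.  This is only an illustration, not what diff-pdf does internally: it assumes you already have some per-page fingerprint for each scan (page text, an image hash, etc.), and the short lists of page labels below are hypothetical stand-ins for that data.  The function name is my own invention too.

```python
from difflib import SequenceMatcher


def find_missing_pages(reference, candidate):
    """Return (index, pages) for each run of pages found in `reference`
    but absent from `candidate`. `index` is the 0-based position in
    `reference` where the missing run starts."""
    matcher = SequenceMatcher(a=reference, b=candidate, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" means reference[i1:i2] has no counterpart in candidate.
        if tag == "delete":
            missing.append((i1, reference[i1:i2]))
    return missing


# Hypothetical page labels standing in for real per-page fingerprints.
scan_1914 = ["v", "vi", "vii", "viii", "ix", "x", "1", "2"]
scan_1915 = ["v", "vi", "vii", "x", "1", "2"]

print(find_missing_pages(scan_1914, scan_1915))
# → [(3, ['viii', 'ix'])]
```

With real scans the fingerprints would rarely match exactly between two different printings, so a visual tool like diff-pdf is still the more practical route; the sketch just shows the alignment logic.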

Nice to have that mystery solved!


