ImageMagick to Extract a PDF Page

Today I was working with a PDF that was 27,000 pages long (god help me).

I never tried to open that PDF. This is just the page count I got from a Python script I wrote to parse that PDF into a CSV file.

When the time came to spot-check the results of that Python script, I needed to compare some pages deep within the PDF with the output in the CSV file.

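I use the Zathura document viewer to view PDFs, but I was reasonably certain that it would choke on such a large document. Instead I extracted one page at a time using ImageMagick, which takes a page index in square brackets after the filename. The index is zero-based, so [2000] selects the 2,001st page.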

convert 'big.pdf[2000]' big-pg2000.pdf

Then I opened the generated PDF using Zathura.

zathura big-pg2000.pdf

I was able to compare that side-by-side with the generated CSV.
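For repeated spot checks, the extraction step can be wrapped in a small shell function. This is a minimal sketch, assuming a POSIX shell; `page_spec` and `extract_page` are hypothetical names made up for illustration, and `-density 150` is an assumed value chosen because ImageMagick rasterizes the page and its default resolution can be hard to read.

```shell
#!/bin/sh

# page_spec FILE PAGE
# Print ImageMagick's 'file[index]' selector for a 1-based page number.
# ImageMagick counts pages from 0, so page N is index N-1.
page_spec() {
    printf '%s[%d]\n' "$1" "$(( $2 - 1 ))"
}

# extract_page FILE PAGE OUT
# Hypothetical wrapper: write a single page of FILE to OUT.
# -density must come before the input file to affect how the PDF is read.
# On ImageMagick 7 the command is 'magick' rather than 'convert'.
extract_page() {
    convert -density 150 "$(page_spec "$1" "$2")" "$3"
}
```

With these defined, `extract_page big.pdf 2000 big-pg2000.pdf` writes the page a viewer would call page 2000, and the result can then be opened with `zathura big-pg2000.pdf` as before.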

Easy peasy.