Splitting PDFs with pyPdf

Posted by Wilmer van der Gaast on Fri 2011-06-03

A simple task, yet I couldn't find a quick command-line tool to do it with, apart from pdftk, which is 15MB of Java rubbish.

Instead, here are only 10 or so lines of Python. It was so fast I wasn't sure it had worked until I saw the results were there. Usage: split <prefix> <infiles...>. Multiple infiles are possible. The first argument is the filename prefix to use for all created files; pages are numbered continuously across all infiles, starting at 000.

CODE:

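import pyPdf
import sys

n = 0  # running page counter across all input files
for f in sys.argv[2:]:
    # PDFs are binary data, so open the files in "rb"/"wb" mode
    f = pyPdf.PdfFileReader(open(f, "rb"))
    for p in f.pages:
        of = pyPdf.PdfFileWriter()
        of.addPage(p)
        of.write(open("%s-%03d.pdf" % (sys.argv[1], n), "wb"))
        n += 1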

Don't pay attention to Serendipity screwing up the code layout. We all know it's rubbish, I just can't be arsed to migrate to something better. :-/
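
For anyone on Python 3, where pyPdf itself is no longer an option, here is a rough sketch of the same idea. It assumes the third-party pypdf package (a successor to pyPdf) is installed; the function names output_name and split_pdfs are made up for this example and are not part of any library. Treat it as a starting point rather than a tested replacement.

```python
import sys


def output_name(prefix, n):
    """Build the output filename for page number n, e.g. 'out-007.pdf'."""
    return "%s-%03d.pdf" % (prefix, n)


def split_pdfs(prefix, infiles):
    """Write every page of every input file to its own single-page PDF."""
    # Assumption: the third-party pypdf package is installed. It is imported
    # here so that output_name() stays usable without it.
    from pypdf import PdfReader, PdfWriter

    n = 0  # one counter across all input files, as in the script above
    for path in infiles:
        reader = PdfReader(path)
        for page in reader.pages:
            writer = PdfWriter()
            writer.add_page(page)
            # PDFs are binary data, so write in "wb" mode
            with open(output_name(prefix, n), "wb") as out:
                writer.write(out)
            n += 1
    return n  # number of pages written


if __name__ == "__main__" and len(sys.argv) > 2:
    split_pdfs(sys.argv[1], sys.argv[2:])
```

Called with the prefix out and two three-page input files, this should leave out-000.pdf through out-005.pdf in the current directory.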

Comments

Anonymous on Sun 2012-02-05:

Have a look at stapler: https://github.com/hellerbarde/stapler
