Skip to content

Splitting PDFs with pyPdf

A simple task, yet I couldn't find a quick cmdline to do it with, apart from pdftk, 15MB of Java rubbish.

Instead, here only 10 or so lines of Python. It was so fast I wasn't sure if it worked until I saw the results were there. Usage: split [prefix] [infiles...]. Multiple infiles possible. First argument is the filename prefix to use for all created files.

CODE:
import pyPdf
import sys

n = 0
for f in sys.argv[2:]:
f = pyPdf.PdfFileReader(open(f))
for p in f.pages:
of = pyPdf.PdfFileWriter()
of.addPage(p)
of.write(open("%s-%03d.pdf" % (sys.argv[1], n), "w"))
n += 1


Don't pay attention to Serendipity screwing up the code layout. We all know it's rubbish, I just can't be arsed to migrate to something better. :-/

Trackbacks

No Trackbacks

Comments

Display comments as Linear | Threaded

Anonymous on :

Have a look at stapler: https://github.com/hellerbarde/stapler

Add Comment

Enclosing asterisks marks text as bold (*word*), underscore are made via _word_.

To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly.
CAPTCHA

BBCode format allowed
Form options