Export Plone to PDFs

For many years, we at Niteo have run a private Plone installation that we call Intra, short for “intranet”. It holds our company-wide, non-project-specific documents, such as financial reports, internal newsletters, various guides, and internal policies.

But in 2017 we moved most of these documents to our public Handbook, so that the wider community can benefit from them and so that anyone considering one of our open positions can make a more informed decision.

Last week we held our bi-annual In-Real-Life meetup, where we discuss any and all aspects of our work. It turns out that everyone (including marketing!) has gotten so used to writing Markdown that they no longer want to use an in-browser editor. They also like the detailed history of changes that git provides.

We decided to sunset our Intra: move the remaining couple of documents to GitHub, then make searchable PDFs of the entire Intra for future reference. Wanna know how we did it? Read on!

Exporting an entire Plone site into a directory tree of PDFs

The way to do it is to open all Plone objects with a headless Chromium browser and use the Print-to-PDF feature:

  1. Spin up a local copy of your production Plone site.
  2. Go to portal_workflow, go through all workflows, and give all permissions to Anonymous.
  3. Still in portal_workflow, click Update portal security for your changes to apply. This will take some time, especially on bigger sites.
  4. Start bin/instance debug and run the following:
    >>> for brain in app.Plone.portal_catalog():
    ...     print(brain.getURL().split('http://nohost/Plone/')[1])
    ...
  5. Copy the printed paths into paths.txt, one per line.
  6. Use the export.py gist.
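For readers who want a feel for what such an export script involves, here is a minimal sketch. It is not the export.py gist itself, only an illustration of the idea under a few assumptions: the local site answers at http://localhost:8080/Plone, the browser binary is called chromium (on some systems it is chromium-browser or google-chrome), PDFs go into a directory named export, and the script runs under Python 3 outside of Plone. The helper names are made up for this example.

```python
import os
import subprocess

BASE_URL = "http://localhost:8080/Plone"  # assumption: address of the local copy
OUTPUT_DIR = "export"                     # assumption: where the PDFs should go
CHROMIUM = "chromium"                     # assumption: binary name on your system


def pdf_path(path, output_dir=OUTPUT_DIR):
    """Map a site-relative path like 'guides/onboarding' to
    '<output_dir>/guides/onboarding.pdf', mirroring the site tree."""
    parts = [p for p in path.strip("/").split("/") if p]
    return os.path.join(output_dir, *parts) + ".pdf"


def chromium_command(path, base_url=BASE_URL, output_dir=OUTPUT_DIR):
    """Build the headless Chromium command that prints one page to PDF."""
    target = os.path.abspath(pdf_path(path, output_dir))
    url = base_url.rstrip("/") + "/" + path.strip("/")
    return [
        CHROMIUM,
        "--headless",
        "--disable-gpu",
        "--print-to-pdf=" + target,
        url,
    ]


def export(paths_file="paths.txt"):
    """Print every path listed in paths_file (one per line) to its own PDF."""
    with open(paths_file) as f:
        paths = [line.strip() for line in f if line.strip()]
    for path in paths:
        target = pdf_path(path)
        # Create the parent directories so the PDF tree mirrors the site.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        returncode = subprocess.call(chromium_command(path))
        if returncode != 0:
            print("Failed to export %s (exit code %d)" % (path, returncode))


# Uncomment to run against a live local site:
# export("paths.txt")
```

Folders and their children do not collide in this layout: a folder at guides becomes export/guides.pdf, while its contents land in export/guides/.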

The result should look something like this:

Done!

A few parting words: I still believe that Plone is by far the best Web Content Management system out there. It is secure, it scales well and it has a fantastic community backing it. It’s just that we, Niteo, do not need a CMS right now. We are small and used to having everything in Markdown on GitHub. The moment we outgrow our current structure and need to re-introduce access control and similar features, Plone will be the first candidate. Probably in the flavor of Quaive, an Intranet platform built on top of Plone.