July 15, 2013

Produce & Publish Status Update 4

Produce & Publish Plone Client Connector extended with DOCX import

A recent discussion with a customer prompted me to rewrite the existing DOCX import into Plone (as it exists in the "old" Produce & Publish Authoring Environment) from scratch. The new functionality basically provides the following:

  • an import form that can be called in the context of a Plone folder object
  • the import form allows you to upload a Word DOCX document
  • the conversion process in the background sends the file to the Produce & Publish Server, where it is converted to XHTML using unoconv
  • the generated XHTML document is split into the contents and styles
  • the HTML-ish content is stored as a Plone Document, the style sheets are stored as Plone files, and images are stored as Plone images
  • a dedicated view renders the HTML-ish content together with the styles extracted from the DOCX document
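
The split step in the list above can be illustrated with a minimal Python 3 sketch. This is not the connector's actual code: the function name split_xhtml and the sample markup are hypothetical, and the sketch assumes the converted XHTML is well-formed XML without named entities such as &nbsp;.

```python
import xml.etree.ElementTree as ET


def split_xhtml(xhtml):
    """Split an XHTML string into (body_markup, list_of_style_sheets).

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xhtml)
    # Drop the XHTML namespace so tags serialize without xmlns attributes.
    for el in root.iter():
        el.tag = el.tag.split("}", 1)[-1]
    # Collect the text of every <style> element.
    styles = [(el.text or "").strip() for el in root.iter("style")]
    # Serialize the children of <body> as the content part.
    body = root.find("body")
    parts = [body.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in body)
    return "".join(parts).strip(), styles


# Made-up sample standing in for the server's conversion result.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Letter</title>'
    '<style type="text/css">p.address { font-weight: bold; }</style></head>'
    '<body><p class="address">Example Street 1</p><p>Hello</p></body>'
    '</html>'
)

content, styles = split_xhtml(SAMPLE)
print(content)
print(styles)
```

In a setup like the one described, the content string would go into the Plone Document and each entry of styles into its own Plone file.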

The conversion result actually looks reasonably good.

A nice side-effect: the names of DOCX styles ("Formatvorlage" in German) are preserved as CSS classes in the HTML. So when the original DOCX document contains a paragraph marked as "address", we will have <p class="address"> in the HTML. This makes it easy to apply global styles to common CSS classes, as used for example in a company-wide DOCX template.
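
To see which style names survived the conversion, and could therefore be targeted by a global style sheet, one could list the CSS classes found in the converted HTML. The following Python 3 sketch uses only the standard library. The names ClassCollector and css_classes and the sample markup are hypothetical and not part of Produce & Publish.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name used in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def css_classes(html):
    """Return the sorted list of class names found in html."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.classes)


# Made-up fragment standing in for a converted document.
found = css_classes(
    '<p class="address">Example Street 1</p><p class="quote big">Hi</p>'
)
print(found)  # ['address', 'big', 'quote']
```

A site-wide style sheet could then define rules such as p.address once for all imported documents.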