Overview of all blog items

Jun 25, 2015

The Docker way on dealing with "security"

RANT: The Docker developers are so serious about security

DOCKER - The next generation bullshit in IT

Docker recently launched the beta version of their new Docker image hub (hub-beta.docker.com), a registry for public and private Docker images.

How serious is Docker about security?

Completely unserious.

The first version of the new hub required a minimum password length of just one character. After people laughed at their password security policy, they raised it to four characters, and the minimum password length is now finally six characters. What's next? Seven chars? Eight chars? This is the best example that the Docker devs do not have the slightest idea about basic security. All other services have moved on to longer and more complex passwords and two-factor authentication, and these untalented Docker dev kids introduce a one-character minimum password policy for hub-beta.docker.com... find the problem yourself...


We are currently working on a better integration of Elasticsearch with Plone 4.3 and 5.0. The technical foundation of our work is the collective.elasticindex implementation by Infrae. While the collective.elasticindex add-on works nicely out of the box, we need to take the Elasticsearch integration of Plone a step further.

Project goals

  • support for Plone 4.3 and 5 
  • proper support for multi-lingual content
  • language-dependent analyzers in Elasticsearch (German text should be analyzed in ES with the German analyzer, etc.)
  • more flexible configuration of the ES indices
  • fast search feedback (find-as-you-type)
  • "real-time" filtering of search results by meta data (similar to extended search)
  • better highlighting support
  • easier customization of the search result list using Mustache
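
To illustrate the goal of language-dependent analyzers, here is a minimal sketch in Python. It is not the collective.elasticindex code: the function names and the language table are made up for illustration. The analyzer names `german`, `english`, `french` and `standard` are built-in Elasticsearch analyzers.

```python
# Hypothetical lookup table: content language code -> built-in ES analyzer name.
LANGUAGE_ANALYZERS = {
    "de": "german",
    "en": "english",
    "fr": "french",
}


def analyzer_for(language, default="standard"):
    """Return the analyzer name for a language code such as 'de' or 'de-DE'."""
    # Only the primary language subtag is used; unknown languages fall back
    # to the language-neutral default analyzer.
    return LANGUAGE_ANALYZERS.get(language.lower()[:2], default)


def field_mapping(language):
    """Build a mapping fragment for a full-text field in the given language."""
    # Assumption: "text" is the field type of recent Elasticsearch versions;
    # older 1.x releases used "string" instead.
    return {"type": "text", "analyzer": analyzer_for(language)}


if __name__ == "__main__":
    for lang in ("de", "en", "xx"):
        print(lang, field_mapping(lang))
```

With such a lookup, content in a language without a dedicated analyzer is still indexed, just without language-specific stemming.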

Right now we are moving fast, ripping out old code and replacing it with new code where needed. We hope to create an alpha release within the next few weeks that will run on Plone 4.3 and Plone 5.

 


May 22, 2015

XML-driven Plone portal "Onkopedia" finally online

Onkopedia is a medical guideline portal in the field of hematology and oncology. It is based on the Plone content management system and driven by an XML publishing workflow with the conversion from DOCX to XML/HTML and PDF.

I am pleased to announce the official (re)launch of the Onkopedia (www.onkopedia.com) portal after one year of hard conceptual and implementation work.

Onkopedia is a medical guideline portal in the field of hematology and oncology. It features official guidelines for the diagnosis and treatment of diseases. The guidelines and supplementary documents are grouped by audience:

  • Onkopedia for physicians
  • My Onkopedia for patients and their relatives
  • Onkopedia-P for caregivers

The Onkopedia project started in 2010 with a simple DOCX to HTML/PDF publishing workflow. After some years with a growing amount of content and new external requirements, it became obvious that an updated infrastructure and a new publishing workflow would be necessary. XML was the obvious choice as the document standard, and ZOPYX started in 2014 with the conceptual design and architecture of Onkopedia.

The new system features a complex but easy-to-use conversion workflow with a DOCX to HTML+XML conversion built on top of the c-rex.net platform by Practice Innovation. The XML to PDF conversion is based on the Produce & Publish system in combination with the PDFreactor converter by RealObjects. The Plone content management system was used as the implementation platform for the complete system in combination with the open-source XML database eXist-db version 2.2. The integration layer of Plone with eXist-db is available as the open-source project XML Director.

Reference

Project partners


May 18, 2015

CSS Paged Media workshop @XML London 2015

Join me at the XML London 2015 conference for a hands-on training on generating high-quality PDF documents from XML/HTML.

I will attend the XML London 2015 conference from June 5th to 7th and give a hands-on training on

CSS Paged Media and generating high-quality PDF documents from XML/HTML

The training will involve a lot of live coding in order to show you what you ask for and what you want to see.

The training material will evolve over the next two weeks in our public repository https://github.com/zopyx/css-paged-media-tutorial.

The training consists of two slots:

  • slot one will teach you the CSS Paged Media basics
  • slot two involves styling of a real world content document (HTML source with lots of chapters, images and tables)

Requirements

  • Participants must have either PrinceXML 10 or PDFreactor 7 installed on their systems. Both converters are available for free for private or evaluation purposes. The trainer can assist you with the installation on Mac or Linux (not on Windows), but please make sure that you install the converter before the tutorial in order to save time for the really cool stuff.
  • Participants must have basic skills in HTML and CSS.

Trainer

Andreas Jung has been working in the electronic publishing business for almost 20 years. Andreas is a Python & Plone freelancer, works on large internet and web applications and publishing solutions, and is the founder of the Produce & Publish and XML Director projects.

Further information on CSS Paged Media

 

 


Mar 31, 2015

New hands-on training "Generating high-quality PDF documents from XML and HTML using CSS Paged Media"

Our hands-on training "Generating high-quality PDF documents from XML and HTML using CSS Paged Media" teaches you to generate high-quality PDF print layouts with HTML or XML as input and Cascading Style Sheets for the definition of print layouts and styling.

Introduction

"CSS Paged Media" has turned into a serious solution for generating high-quality PDF documents from XML or HTML over the last few years. The advantages of the CSS Paged Media approach are obvious:

  • basic knowledge of XML/HTML and CSS is sufficient (you do not need to be an XML expert)
  • separation between content and layout/styling
  • easy to learn, easy to use
  • lower costs
  • higher flexibility 
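
To give an impression of how little is needed, here is a minimal sketch that assembles a page layout rule. Python is used only to build the stylesheet text; the function name `page_rule` and its default values are made up for illustration. The `@page` rule, the `size` and `margin` properties, the `@bottom-center` margin box and `counter(page)` come from CSS Paged Media; support for individual features may differ between converters.

```python
def page_rule(size="A4", margin="2cm", footer="counter(page)"):
    """Return a CSS @page rule with a page size, margins and a centered footer."""
    return (
        "@page {\n"
        f"  size: {size};\n"
        f"  margin: {margin};\n"
        "  @bottom-center {\n"
        f"    content: {footer};\n"
        "  }\n"
        "}\n"
    )


if __name__ == "__main__":
    # Print a stylesheet fragment for A5 pages with the page number in the footer.
    print(page_rule(size="A5", margin="15mm"))
```

The generated rule can be saved to a stylesheet and passed to the converter together with the HTML or XML source, which keeps the layout definition separate from the content.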

Use cases

  • text-oriented publications (books, newspapers, documentation etc.)
  • layout-oriented publications (flyers, brochures, web-to-print applications)

Contents

  • Introduction to CSS Paged Media
  • The region model of CSS Paged Media
  • Basic formatting
  • Multi-column layouts
  • Pagination
  • Images
  • Footnotes
  • Header and footer
  • Automatic table of contents generation

Requirements

  • basic knowledge of XML/HTML and Cascading Style Sheets
  • basic knowledge of typography

Software

  • PDFreactor 7.0 (alternative: PrinceXML 9)

Price and Location

Our hands-on trainings usually take place at your company or organization. We teach in small groups of up to three people. We offer an individual training based on your requirements, needs and the skills of your employees. The price depends on the number of training days and the location of the training. Contact us directly for further information and quotes.

Trainer

Andreas Jung has been working for more than 20 years in the field of electronic publishing and has developed several PDF generation solutions over the last ten years. Andreas Jung is chief developer and author of the Produce & Publish product family and founder of the XML Director project.


Mar 05, 2015

callas software GmbH releases pdfChip - a quick test

Quick test of a new PDF converter supporting CSS Paged Media.

We have been using CSS Paged Media converters like PrinceXML or PDFreactor for years for generating high-quality PDF documents from XML and HTML content. Today callas software GmbH released pdfChip, a PDF converter that follows the same CSS Paged Media approach. As an expert in this field I did some quick tests with the new converter, based on real-world content from customers that have been running our Produce & Publish solution in production for years.

Quick results

  • the feature set of pdfChip is comparable to where PrinceXML and PDFreactor were two or three years ago
  • PDF quality is OK but behind other tools due to missing features like built-in hyphenation
  • no built-in hyphenation support (except using JavaScript)
  • no multi-column support (documented)
  • no flexbox support (at least undocumented)
  • no support for page regions, named pages (at least undocumented)
  • no footnote support (at least undocumented)
  • does not seem to implement all the CSS (3) features that we see in decent versions of PrinceXML and PDFreactor (CSS dot leaders and hyphenation missing, no repeated table headers when a table spans multiple pages)
  • Poor documentation
  • Worth the money? Definitely NO. This product is completely overpriced. The smallest version, pdfChip S, costs 5000 EUR + VAT and allows you to generate documents with up to 25 (twenty-five) pages only! The next bigger version supports up to 250 pages per document for 10.000 EUR + VAT. Our customers' documents generally have between 1 and 250 pages... so 10.000 EUR is a huge investment. The unlimited version costs 25.000 EUR. Even Antenna House Formatter offers many more features and much better typographical quality for a better price. Our tools (PDFreactor, PrinceXML) are in the price range from 2.250 EUR to 3800 USD with almost no restrictions (except: requires a box with 4 or fewer CPUs). So you have to pay a three or four times higher price for a tool with fewer features and many restrictions? I think this is ridiculous.

Update

The statement "The Paged Media Module is currently not supported by pdfChip" makes it clear that pdfChip does not want to support the de facto standard for HTML/CSS-based publishing, or only to a certain degree. pdfChip appears half-baked, and essential features that have existed in other tools for years are missing. Unfortunately this product is also marketed as being superior to all other tools. This is not the case. There are better and cheaper alternatives.

Update (20.10.2015)

Even half a year after the first evaluation there is no visible progress. The product remains completely overpriced. Interestingly enough, half of the documentation covers barcodes... well, this software seems to be a very expensive barcode generator. As written in my original posting: there are better alternatives. We really hope that alternative projects like Vivliostyle catch up fast in order to provide better and cheaper alternatives. Well, PrinceXML and PDFreactor are obviously superior.

Feel free to contact us for CSS Paged Media consulting.

 


For two months I have been very busy finding a combination of Linux kernel, hardware and Linux distribution that is actually stable enough for running Docker in production. Only one out of eight combinations worked for me.

Feel the pain and frustration?

All tests were done with Docker 1.5.0 final without special configuration of the underlying storage driver (AUFS vs. device mapper).

Linux Distribution | Kernel | Hardware | Hoster | Comment | Status
Ubuntu 14.04 | 3.13 | Bare metal | Hosteurope | the only working combination (using AUFS) | WORKING
Ubuntu 14.04 | 3.13 | VM | Hosteurope | hoster-patched kernel in order to fit its OpenVZ virtualization | FAIL
OpenSuse 13.1 | 2.6 | Bare metal | self-hosted | kernel panics after some minutes during a Docker build | FAIL
OpenSuse 13.1 | 3.11 | Bare metal | self-hosted | kernel panics once or twice a day during Docker builds (possibly related to BTRFS crashes) | FAIL
CentOS 6.6 | 2.6 | VM | Hetzner | slow IO, long Docker builds (10 times slower than the same build run directly on the same VM), likely related to CentOS and/or the Docker device mapper (although supported by Docker) | FAIL
CentOS 7.0 | >3.10 | VM | Hetzner | slow IO, long Docker builds (10 times slower than the same build run directly on the same VM), likely related to CentOS and/or the Docker device mapper (although supported by Docker). Also: Docker did not play well with the 'firewalld' of CentOS. Reconfiguring firewalld caused a network loss for all Docker containers and the Docker daemon had to be restarted... a major fail. | FAIL
Ubuntu 14.04 | 3.13 | VM | Contabo | Docker builds much slower than directly on the same VM, but not as extreme as with CentOS; Docker container execution speed OK | partly WORKING
CentOS 7.0 | 3.13 | VM | Contabo | same problems as with other CentOS versions | FAIL

Conclusions:

  • CentOS is completely unusable for running Docker - at least with the default device-mapper 
  • OpenSuse 13.1 problems likely related to BTRFS issues (in combination with the device-mapper)
  • running Docker on virtual machines in general does not seem to make much sense
  • Ubuntu 14.04 on real hardware seems to be the only reasonable combination right now
  • Docker does not perform any reasonable runtime checks on the sanity of the Linux host (crashes and unrelated or meaningless error messages are the only thing you get from Docker)
  • The general attitude of the Docker devs: we-don't-care and works-for-us -> case closed
  • The monolithic design of Docker is broken. Restarting Docker - for whatever reason - implies a shutdown of all containers (with --restart you can at least have the containers started again when the Docker daemon restarts)
  • The Docker documentation lies about working and supported distros and kernels (see link above), and the Docker devs obviously do not care about fixing their documentation and, in particular, about testing Docker on different hardware and distros - apparently their testing procedures are broken from the ground up.
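
As a small illustration of the --restart workaround mentioned in the conclusions, the following sketch builds a `docker run` command line with a restart policy. The helper name and the image name `example/app` are made up for illustration. `--restart` is a real `docker run` option, and only the policies `no`, `on-failure` and `always` are handled here.

```python
# Restart policies handled by this sketch.
VALID_POLICIES = ("no", "on-failure", "always")


def docker_run_command(image, restart_policy="always"):
    """Return the argument list for starting a detached container with a restart policy."""
    if restart_policy not in VALID_POLICIES:
        raise ValueError(f"unsupported restart policy: {restart_policy!r}")
    # -d runs the container in the background; --restart tells the daemon
    # when to start the container again.
    return ["docker", "run", "-d", f"--restart={restart_policy}", image]


if __name__ == "__main__":
    # "example/app" is a placeholder image name.
    print(" ".join(docker_run_command("example/app")))
```

This only softens the problem: the containers still go down together with the daemon, they are merely started again afterwards.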

However, there is hope... the upcoming CoreOS Rocket runtime engine looks very promising, although Rocket is still in its early stages. At least Rocket already supports loading Docker images. On the other hand: the Rocket team submitted a pull request to Docker in order to achieve image compatibility between Docker and Rocket... but, typical of the ignorance and arrogance of the Docker devs, they don't give a shit and only care about their own thing. Unfortunately the Docker developers have been corrupted by too much venture capital and have become ignorant through the Docker hype.


Feb 20, 2015

XML Director 0.4.0 release/Newsletter #4

XML Director is a Plone-based XML content-management-system (framework) backed by eXist-db or BaseX.


Feb 11, 2015

MongoDB gate

Yesterday researchers at my home university, Universität des Saarlandes, published a report about 40.000 MongoDB servers around the world running on public ports and without authentication. This is kind of a nightmare. Disclosure of customer data, credit card numbers etc.... but who is to blame? Of course MongoDB is (technology-wise) a crappy database and it would be easy to blame MongoDB altogether.

There are only two minor problems with MongoDB here: 

  • the MongoDB daemon binds to all public IP addresses by default, depending on the distribution or download package. It is said that the standard installers bind to localhost only; however, the daemon distributed with the binary packages binds to 0.0.0.0 - BAD DESIGN DECISION
  • MongoDB does not require a password by default. So every MongoDB server is open without authentication by default - BAD DESIGN DECISION

However, there is no direct technical exploit in MongoDB responsible for the disclosure of private data - just bad design decisions (which have their impact here). Unfortunately the answer of MongoDB CTO Eliot Horowitz on this issue is cheap, weak and poor, and MongoDB does not seem to care about the implications of this report.

More important in this case is the human factor.

Obviously several thousand administrators are incompetent or incapable of performing very basic administration tasks like

  • configuring a daemon to bind to localhost or a private IP only
  • configuring a firewall
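
The first of these tasks is easy to check. The following sketch, using only the Python standard library, classifies a bind address as it might appear in a daemon configuration. The function names are made up for illustration, and this is not a MongoDB tool.

```python
import ipaddress


def classify_bind_address(address):
    """Classify a bind address as 'all interfaces', 'loopback', 'private' or 'public'."""
    ip = ipaddress.ip_address(address)
    # The order matters: 0.0.0.0 and 127.0.0.1 would otherwise also
    # match the more general "private" check.
    if ip.is_unspecified:
        return "all interfaces"
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"


def is_safe_bind(address):
    """Return True if the daemon would listen on localhost or a private IP only."""
    return classify_bind_address(address) in ("loopback", "private")


if __name__ == "__main__":
    for addr in ("0.0.0.0", "127.0.0.1", "10.0.0.5", "8.8.8.8"):
        print(addr, classify_bind_address(addr), is_safe_bind(addr))
```

A daemon bound to 0.0.0.0 listens on every interface of the machine, so without a firewall and without authentication it is reachable by anyone who can reach the host.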

My theory on this is that more and more untalented IT workers are in charge of dealing with technology issues, networking and programming aspects that are far beyond their horizon. This is not only a problem of MongoDB but can also be observed with other IT technology. Watching mailing lists, IRC, Stack Overflow and other related media over the last years has become a growing pain. The technology is getting more and more diverse and complex, but the intelligence and motivation of the "typical" IT worker seems to go down year by year. Yes, this is a typical Andreas Jung rant, but many tasks in software development and system administration should be left to people who know what they are doing. But many IT departments apparently do not care about competence, security and privacy (any more). Mistakes happen every day - even to experienced IT workers and experts. However, this report with 40.000 open MongoDB installations indicates some more fundamental problems in how IT security is handled in organizations: badly. And my recommendation: unmotivated and untalented script kiddies should keep their fingers away from security-critical infrastructure and components.