November 23, 2008

SQLStorage is dead - long live RDBMSStorage!

Archetypes-based content-types are traditionally tied to the ZODB as backend storage. It has always been hard making the content available through a RDBMS - the only solution for a long time has been SQLStorage. However the implementation and the behavior was more than flaky. Lately projects like collective.tin (Laurence Rowe) and Contentmirroring by Kapil appeared on the scene. The  major reasons for using a RDBMS backend are scalablity and efficiency: with a standard Plone/CMF installation you can create between 4-8 objects per second. Tests with a RDBMS backend have shown that creating the same objects within a database like Postgres is  slightly faster: up to 350(!) objects per second.

As a sunday-afternoon exercise I created a prototype of a new RDBMS-storage for Archetypes on top of SQLAlchemy - the preliminary name is RDBMSStorage. The database model uses table inheritance for storing the common metadata within a dublincore table. Content-type specific fields are stored in dedicated tables like document. SQLAlchemy allows us to map the Archetypes field types nicely to SQL datatypes. The code right now is only of prototype quality and for experiementing however the first results look promising.

Some stupid Archetypes behavior observed during the implementation:

  • Archetypes creates six new object instances when creating a new document through the portal factory
  • The get() method for fields like title, description or text are called up to eight times when viewing a document. That's pretty dumb and might lead to performance issues unless we use some smart caching.