August 4, 2006

TextIndexNG 3.1.13 coming up

I have been working on a new TextIndexNG3 release today that will basically contain two fixes:

  1. The phrase search could return false positives. It was implemented as a substring search on the list of word ids, which is encoded using the WidCode.py module of ZCTextIndex. Under certain circumstances an encoded substring could be found although it did not represent a valid search result.
  2. For large documents the encoded list of word ids might become long. In all older versions these strings were stored as values of an OOBTree. This might cause large transactions since the ZODB organizes its data in buckets. Imagine you have a bucket with 20 strings where each string has 1 MB of data. When you modify a single string, the complete bucket (roughly 20 MB) has to be written back to the ZODB. As a solution, each string is now wrapped in a persistent subobject that can be loaded and stored individually.
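
To illustrate the general failure mode behind the first fix, here is a minimal sketch. It does not use the real WidCode encoding; the comma-separated `encode` function below is a deliberately naive stand-in, and all function names are made up for this example. It only shows how a substring match on an encoded id list can succeed even though the ids themselves never occur as a phrase, and how comparing the ids element by element avoids that.

```python
def encode(wids):
    # Hypothetical toy encoding (NOT WidCode): ids joined by commas.
    return ",".join(str(w) for w in wids)


def substring_match(doc_wids, phrase_wids):
    # Phrase search as a plain substring test on the encoded strings.
    return encode(phrase_wids) in encode(doc_wids)


def sequence_match(doc_wids, phrase_wids):
    # Phrase search that compares whole word ids, position by position.
    n = len(phrase_wids)
    if n == 0:
        return False
    return any(
        doc_wids[i:i + n] == phrase_wids
        for i in range(len(doc_wids) - n + 1)
    )


doc = [11, 23, 5]      # word ids of an indexed document
phrase = [1, 2]        # word ids of the phrase being searched

false_positive = substring_match(doc, phrase)   # "1,2" occurs inside "11,23,5"
correct_result = sequence_match(doc, phrase)    # ids 1, 2 never occur in doc
```

With this toy encoding the substring test reports a hit for `[1, 2]` because the text `1,2` straddles the boundary between `11` and `23`, while the element-wise comparison correctly rejects it.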

Both errors were reported by Dieter Maurer and affect the textindexng.storage.Storage class implementation. Thanks!
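
As a rough picture of why the storage change matters, the following sketch simulates the write cost with the standard pickle module. This is not ZODB or TextIndexNG code: the dictionaries merely stand in for a bucket and for separately stored records, and every name here is an assumption made for illustration. The sizes are the ones from the example above (20 strings of 1 MB each).

```python
import pickle

STRING_SIZE = 1024 * 1024   # 1 MB per encoded word-id string
BUCKET_ITEMS = 20           # strings held by one bucket


def record_size(obj):
    # Bytes that would have to be written to store this object as one record.
    return len(pickle.dumps(obj))


# Old layout (simulated): the strings are stored directly as bucket values,
# so changing any one of them means rewriting the whole bucket.
inline_bucket = {docid: b"x" * STRING_SIZE for docid in range(BUCKET_ITEMS)}
inline_write = record_size(inline_bucket)

# New layout (simulated): each string lives in its own record and the
# bucket only holds a small reference to it.
wrapped_records = {
    ("wids", docid): b"x" * STRING_SIZE for docid in range(BUCKET_ITEMS)
}
reference_bucket = {docid: ("wids", docid) for docid in range(BUCKET_ITEMS)}

# Changing one document's string now rewrites only that single record.
wrapped_write = record_size(wrapped_records[("wids", 0)])
bucket_only = record_size(reference_bucket)
```

In this simulation `inline_write` is on the order of 20 MB, `wrapped_write` is on the order of 1 MB, and the reference-only bucket is tiny by comparison.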

TextIndexNG 3.1.13 is supposed to be released this weekend or early next week.