As the fts2 code has been developed, I've been keeping notes about additional modifications that could improve performance, allow further scaling, or make things smaller. Some of the ideas are sort of crazy. What follows is mostly a brain dump, and should not be taken as specific development paths.


store interior nodes inline

Currently, for various reasons, the leaf nodes must occupy a contiguous range of block ids. Thus, the interior nodes at height 1 are accumulated in memory, then flushed to the next range of blocks (and so on up the tree, though height 1 is obviously the most interesting). The memory footprint is N/C, with C (the interior fanout) being really rather large. The 64kdoc segment in the Enron corpus has only 85 interior blocks total, implying a memory footprint under 170k (roughly 2k per block). But the 1mdoc segment is going to need a couple of megabytes.

Since the blocks encode their height in the first byte, it would be easy to have segment merges skip interior nodes that were stored inline. This would have only a slight impact on merge performance, since interior blocks make up only perhaps half a percent of total blocks. Then interior nodes could just be flushed as needed.
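
A minimal sketch of what the merge-side skip might look like, assuming the convention above (the first byte of a block is its height, with 0 meaning leaf). readBlock() and mergeLeafBlock() are placeholder names invented for this note, not fts2 functions.

  #include <sqlite3.h>

  /* Placeholder declarations for illustration only; not fts2 APIs. */
  int readBlock(sqlite3_int64 iBlock, const unsigned char **ppData, int *pnData);
  int mergeLeafBlock(const unsigned char *pData, int nData);

  /* Walk a segment's block range, merging leaf nodes and skipping any
  ** interior nodes stored inline among them.  The height byte at the
  ** start of each block (0 == leaf) makes the skip cheap.
  */
  static int mergeSegmentRange(sqlite3_int64 iStart, sqlite3_int64 iEnd){
    sqlite3_int64 i;
    for(i=iStart; i<=iEnd; i++){
      const unsigned char *pData;
      int nData, rc;
      rc = readBlock(i, &pData, &nData);
      if( rc!=SQLITE_OK ) return rc;
      if( nData>0 && pData[0]>0 ) continue;  /* inline interior node: skip */
      rc = mergeLeafBlock(pData, nData);
      if( rc!=SQLITE_OK ) return rc;
    }
    return SQLITE_OK;
  }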

The problem is that interior nodes rely on their subtree nodes being contiguous. This would hold for the height 1 nodes (their leaf nodes would be contiguous), but the height 2 and above nodes would need to be encoded differently. The 64kdoc segment had 20k leaf nodes, implying that level-1 interior nodes would be about 230 blockids apart. Adding a delta-encoded blockid per term at level 2 and above would thus cost perhaps 2 bytes per term, which might reduce the fanout there from 230 to 180 or so. That doesn't seem too bad.
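
To make the 2-bytes-per-term estimate concrete, here is a sketch of delta-encoding a child blockid with a base-128 varint. putVarint() is a local helper written for this note, not the fts2 routine. With level-1 nodes around 230 blockids apart, most deltas need two varint bytes.

  #include <stdio.h>

  /* Little-endian base-128 varint; local helper for this sketch. */
  static int putVarint(unsigned char *p, unsigned long long v){
    int n = 0;
    while( v>=0x80 ){
      p[n++] = (unsigned char)(v & 0x7f) | 0x80;
      v >>= 7;
    }
    p[n++] = (unsigned char)v;
    return n;
  }

  /* A term in a level-2+ interior node would carry its child blockid as
  ** a delta against the previous term's child blockid.
  */
  static int encodeChildBlockid(unsigned char *p,
                                long long iPrev, long long iChild){
    return putVarint(p, (unsigned long long)(iChild - iPrev));
  }

  int main(void){
    unsigned char buf[10];
    /* Level-1 interior blockids roughly 230 apart, per the estimate above. */
    int n = encodeChildBlockid(buf, 104200, 104430);
    printf("delta of 230 encodes in %d byte(s)\n", n);  /* prints 2 */
    return 0;
  }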


smaller rowids in %_segments

The current system stores all blocks in a single %_segments table keyed by rowid. When we do a merge, the newly created blocks necessarily receive higher rowids than the blocks the merged data is coming from, and those source blocks are eventually deleted. Since the new segment uses higher rowids, the deleted rowids cannot be reclaimed.

This can be finessed by using one segment table for segments in odd levels, another for even levels, and merging as soon as a level is full. For instance, if we have enough level-0 segments in the even table, we merge to a level-1 segment in the odd table, then delete the level-0 segments from the even table. Since they were at the highest rowids in the even table, those rowids can be reclaimed when we start generating more level-0 segments. The "merging as soon as full" clause means that when we merge level-1 segments into a level-2 segment, there won't be any higher rowids in the even table for level-0 segments.
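
As a rough illustration of the even/odd routing, here is a sketch of how a writer might pick the destination table by level parity. The _even/_odd table names are invented for this note, not part of the current schema.

  #include <stdio.h>

  /* Hypothetical: choose which of the two segments tables a segment at
  ** a given level writes its blocks to.  Even levels (0, 2, ...) go to
  ** one table, odd levels (1, 3, ...) to the other, so a merge always
  ** deletes from the table opposite the one it is inserting into.
  */
  static const char *segmentsTableForLevel(int iLevel){
    return (iLevel%2)==0 ? "%_segments_even" : "%_segments_odd";
  }

  int main(void){
    char zSql[128];
    int iLevel = 1;  /* e.g. merging level-0 segments into a level-1 segment */
    snprintf(zSql, sizeof(zSql), "INSERT INTO '%s'(block) VALUES(?)",
             segmentsTableForLevel(iLevel));
    printf("%s\n", zSql);
    return 0;
  }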

This isn't very useful for segment encoding - we only refer to blockids in the segment directory and at the front of each interior node, so even a large segment will only have a couple hundred blockids encoded. It might be more useful for increasing the btree density of the %_segments table itself. In experiments with the Enron corpus, it looks like the impact is about a 1/10 drop in max(rowid), plus roughly another factor of two from splitting across the two tables.