Contains utility classes and interfaces for sequences (lists), arrays, and trees.

Features:

The iteration and position framework

A position points between two elements in its "containing" sequence, or it points before the first element (if any), or after the last. Sometimes, we loosely say that the position refers to or points to an element of the sequence; in that case we mean the position is immediately before the element.

An iterator is an object that has a current position, but that can be moved so it gets a new position. What needs to be emphasized is that all Sequence implementations. use the same SeqPosition class to implement positions and iteration. In other "collections frameworks" each sequence class has its corresponding iterator class, but in this framework you instead add the methods that handle iteration to the sequence class itself, extending the AbstractSequence class. A consequence of this is that if you have an unused SeqPosition object, you can use it to iterate over any Sequence you choose. This potentially saves memory allocation, which gains you the most when iterating over a nested sequence structure, like a tree of XML data.

We do this by requiring that any position be representable using a pair of an int and an Object. In other words, the state of an iterator is restricted to be an (int, Object) pair. Of course that is all you could need, since if you need more state you can always create a helper object, and store the extra state there. The assumption we make though is that for most Sequences, an (int, Object) would be enough, without creating any new objects (beyond, sometimes, the SeqPosition itself).

The int part of a position-state is normally named ipos, and the Object part of a position-state is named xpos. Neither of these values have any meaning in themselves, they only have meaning as interpreted by the specific Sequence. For example, ipos might be the offset of the position in the sequence, but it might also some "magic cookie". If a Sequence supports insertions and deletions, and you want positions to be automatically adjusted, then a position has to be a magic cookie rather than an offset. (A typical implementation is for such a position to be an index into a table of positions managed by the Sequence itself.) So a complete position is actually a (int, Object, AbstractSequence) triple, where the AbstractSequence component is the object that interprets the (int, Object) pair. Normally, the AbstractSequence part of a position triple is the Sequence "containing" the position, but it can be any AbstractSequence that implements the various methods that provide the semantics for the (int, Object) pair.

The PositionContainer interface is a more abstract version of SeqPosition, in that it can represents more than one position. For example, instead of an array of separately allocated SeqPosition objects, you might use some data structure that implements PositionContainer, which is abstractly a list of (int, Object, Sequence)-triples.

The consumer protocol

You typically use an iteration framework it process the elements of a sequence in such a way that the data consumer is active and in control, while the sequence itself (the data producer) is a passive object. The Consumer works the other way round: The data producer is active and in control, while the data consumer is passive, consuming elements when requested by the data producer. The Consumer is a more abstract variant of SAX's ContentHandler interface for processing "events"; however Consumer is intended for general value sequences, and is less tied to XML.


Per Bothner
Last modified: Sun Jun 17 10:32:21 CST 2001