A description of the evaluation process
This chapter describes the process by which the evaluator turns a text string representing an XPath expression into a sequence of items. This description should act as a guide for anyone interested in examining the code, and that includes the author (I'm writing it right now to help me check that I am being consistent throughout).
Conceptually, the process proceeds in a series of phases. In practice, there is some limited overlapping of these phases. To get a good picture of how an XPath evaluation is supposed to work, see XML Path Language (XPath) 2.0.
The steps taken by the Gobo XPath evaluation engine are as follows:
The input to the evaluation process is a text string, representing an XPath Expression, and a Context Item. First then, some definitions:
A sequence of one item is completely interchangeable with the item itself.
The class for a sequence is XM_XPATH_SEQUENCE_VALUE.
The class for an item is XM_XPATH_ITEM.
The class for a node is XM_XPATH_NODE.
The architecture supports multiple implementations of the data model's tree structure. The only implementations at present are the standard tree implementation and the tiny tree implementation. In these implementations, the classes for a node are XM_XPATH_TREE_NODE and XM_XPATH_TINY_NODE respectively.
A value can be regarded as a Sequence, although sometimes it is a sequence of length one, or even zero.
The class for a value is XM_XPATH_VALUE.
The class for an atomic value is XM_XPATH_ATOMIC_VALUE.
The class for an expression is XM_XPATH_EXPRESSION.
An expression is either a Value or an instance of XM_XPATH_COMPUTED_EXPRESSION.
The class for an iterator is XM_XPATH_SEQUENCE_ITERATOR.
XM_XPATH_EXPRESSION_FACTORY has a routine make_expression which takes a STRING (holding the text of the expression to be parsed) and an XM_XPATH_STATIC_CONTEXT. The result of calling make_expression is an optimized XM_XPATH_EXPRESSION in parsed_expression. If a parse error has occurred though, this will be Void. In this case is_parse_error will be set to True, and parsed_error_value will be set to an instance of XM_XPATH_ERROR_VALUE.
A side-effect is that functions and variables may well be bound in the static context.
Setting the debug-key "XPath expression factory" will cause make_expression to print a textual representation of the expression tree to the standard error stream, immediately after parsing is successful.
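The calling pattern for make_expression can be modelled in a few lines. What follows is a Python sketch, not the Gobo Eiffel code: the class ExpressionFactory and its toy parser (which accepts only integer literals) are hypothetical stand-ins, and only the feature names make_expression, parsed_expression, is_parse_error and parsed_error_value are taken from the description above.

```python
class ExpressionFactory:
    """Toy model of the command/query pattern used by make_expression.

    Assumption: the 'parser' accepts only integer literals, purely so the
    sketch can run. The real factory parses full XPath expressions.
    """

    def __init__(self):
        self.parsed_expression = None   # stays None (Void) unless parsing succeeds
        self.is_parse_error = False
        self.parsed_error_value = None  # set only when is_parse_error is True

    def make_expression(self, text, static_context):
        # A command: it returns nothing and reports through attributes.
        self.parsed_expression = None
        self.is_parse_error = False
        self.parsed_error_value = None
        try:
            self.parsed_expression = int(text.strip())
        except ValueError:
            self.is_parse_error = True
            self.parsed_error_value = "cannot parse: " + text


factory = ExpressionFactory()

factory.make_expression("42", static_context={})
good = (factory.is_parse_error, factory.parsed_expression)

factory.make_expression("4 +", static_context={})
bad = (factory.is_parse_error, factory.parsed_expression)
```

The point of the sketch is the order of the checks: the caller tests is_parse_error first, and reads parsed_expression only when no error was reported.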
If parsing is successful, XM_XPATH_EXPRESSION_FACTORY's make_expression routine goes on to call simplify on the expression. This performs context-independent optimizations on the expression and (recursively) its sub-expressions. Current may be marked in error, so the caller of simplify must test is_error; if it is True, the error can be accessed through error_value.
Note that if a simplification error occurs, make_expression treats it the same way as a parse error.
Setting the debug-key "XPath expression factory" will cause make_expression to print a textual representation of the simplified expression tree to the standard error stream, immediately after simplification.
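To make the idea of a recursive, context-independent simplification concrete, here is a small Python model. It is only an illustration under stated assumptions: the classes Expr, Literal and Divide are invented for the example, constant folding is assumed as a representative optimization, and returning the simplified expression from simplify is a Python convenience rather than a claim about how the Eiffel routine reports its result. Only the names simplify, is_error and error_value come from the text.

```python
class Expr:
    def __init__(self):
        self.is_error = False
        self.error_value = None

    def simplify(self):
        """Return the simplified form of this expression (possibly self)."""
        return self


class Literal(Expr):
    def __init__(self, value):
        super().__init__()
        self.value = value


class Divide(Expr):
    def __init__(self, left, right):
        super().__init__()
        self.left, self.right = left, right

    def simplify(self):
        # Simplify the sub-expressions first (the recursive part).
        self.left = self.left.simplify()
        self.right = self.right.simplify()
        for operand in (self.left, self.right):
            if operand.is_error:
                # An error found lower down marks this expression too.
                self.is_error, self.error_value = True, operand.error_value
                return self
        if isinstance(self.left, Literal) and isinstance(self.right, Literal):
            if self.right.value == 0:
                self.is_error = True
                self.error_value = "division by zero"
                return self
            # Both operands are constants: fold them without any context.
            return Literal(self.left.value / self.right.value)
        return self


folded = Divide(Literal(8), Divide(Literal(4), Literal(2))).simplify()
broken = Divide(Literal(1), Divide(Literal(1), Literal(0))).simplify()
```

After simplification, folded is a single Literal, while broken is marked in error, so a caller must test is_error before using the result.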
After the simplification process is complete (the picture here is itself simplified, as simplify may itself be called by later phases, especially if static analysis is unable to completely determine the type of an operand), the next phase is static analysis of the expression, to determine the types of all expressions. This is accomplished by calling analyze on the expression.
Analyze takes an XM_XPATH_STATIC_CONTEXT as its sole parameter.
It may change the static context (?? check this some time).
As a command, it is quite likely to change the expression in one of several ways. Replacement may be done, for instance, because static analysis can show that the expression is in fact a constant value. In that case the expression can be pre-evaluated.
Another reason for replacement is exactly the opposite: static analysis is unable to determine if the type or cardinality of the expression is correct or incorrect. In this case, the expression is wrapped in an XM_XPATH_ITEM_CHECKER or an XM_XPATH_CARDINALITY_CHECKER respectively. These classes postpone the checks until evaluation time.
Setting the debug-key "XPath evaluator" will cause evaluate to print a textual representation of the expression tree to the standard error stream, immediately after static analysis is successful.
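The second kind of replacement, wrapping an expression so that a check is postponed until evaluation time, can be sketched as follows. This is a Python model and not the Gobo classes: Sequence, ContextLookup and analyze_exactly_one are hypothetical names, the "exactly one item" rule is an assumed example of a cardinality requirement, and Python exceptions stand in for the library's error values.

```python
class Sequence:
    """An expression whose value is a fixed list of items (a constant)."""

    def __init__(self, items):
        self.items = list(items)

    def evaluate(self, context):
        return self.items


class ContextLookup:
    """An expression whose result is only known at evaluation time."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, context):
        return context[self.name]


class CardinalityChecker:
    """Wrapper that postpones an 'exactly one item' check until evaluation."""

    def __init__(self, base):
        self.base = base

    def evaluate(self, context):
        items = self.base.evaluate(context)
        if len(items) != 1:
            raise ValueError("expected exactly one item, got %d" % len(items))
        return items


def analyze_exactly_one(expression):
    """Return the expression to use where exactly one item is required."""
    if isinstance(expression, Sequence):
        # Statically decidable: check now, nothing is left for run time.
        if len(expression.items) != 1:
            raise ValueError("static error: wrong cardinality")
        return expression
    # Not decidable statically: wrap, so the check runs during evaluation.
    return CardinalityChecker(expression)


constant = Sequence(["a"])
unchanged = analyze_exactly_one(constant)
checked = analyze_exactly_one(ContextLookup("x"))
ok_result = checked.evaluate({"x": ["a"]})
```

Evaluating checked against a context in which x is empty raises the postponed error, whereas the constant case is settled entirely during analysis.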
If static analysis is successful, evaluate proceeds to the evaluation stage. XM_XPATH_EXPRESSION has no fewer than six routines for performing evaluation. All of them take an XM_XPATH_CONTEXT as sole parameter, though this may be Void on occasions. If it is not Void, then it is liable to be altered by any of these routines (as the context_item is liable to change), so none of them are pure functions.
TODO: pre-condition of analyzed - check if this is always checked for, as other evaluation routines cannot have it (because of cardinality-checker/item-checker)
This is a function, so it does not change Current (but may change the evaluation context).
The result of evaluation is set in last_evaluated_item. This is Void if Current evaluates without error to an empty sequence.
If an error is detected, then last_evaluated_item's is_error will be set to True. Callers of evaluate_item must check for this possibility (after first checking for Void). The error value can be accessed via error_value. A class XM_XPATH_INVALID_ITEM is available for returning an error.
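The two-step check that callers of evaluate_item must perform (first for Void, then for an error) looks like this in a small Python model. The classes Item and FirstOf and the helper describe are hypothetical, and None plays the role of Void; only evaluate_item, last_evaluated_item, is_error and error_value are names from the text.

```python
class Item:
    def __init__(self, value=None, error_value=None):
        self.value = value
        self.error_value = error_value

    @property
    def is_error(self):
        return self.error_value is not None


class FirstOf:
    """Toy expression: the first item of a named sequence in the context."""

    def __init__(self, name):
        self.name = name
        self.last_evaluated_item = None

    def evaluate_item(self, context):
        if self.name not in context:
            # An error is returned as an item that is marked in error.
            self.last_evaluated_item = Item(error_value="unknown: " + self.name)
        elif not context[self.name]:
            self.last_evaluated_item = None  # empty sequence, no error
        else:
            self.last_evaluated_item = Item(value=context[self.name][0])


def describe(expression, context):
    expression.evaluate_item(context)
    result = expression.last_evaluated_item
    if result is None:       # check for Void first
        return "empty"
    if result.is_error:      # only then check for an error
        return "error: " + result.error_value
    return "item: " + str(result.value)
```

Reversing the two checks would fail on an empty sequence, because there is no item on which to test is_error.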
XM_XPATH_SEQUENCE_ITERATOR[G -> XM_XPATH_ITEM] is modelled on DS_LIST_CURSOR, although it only has a subset of the features (before, after, off, item, start and forth). It also has a routine another, which produces another iterator that will iterate over the same items as the original would have initially done.
Another difference is that XM_XPATH_SEQUENCE_ITERATOR[G -> XM_XPATH_ITEM] does not iterate over a physical list object - the sequences are only conceptual. Usually the next XM_XPATH_ITEM will be evaluated (possibly changing the evaluation context as a side effect) when start/forth is called.
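A cursor of this style, with items computed only when start or forth is called, can be modelled in Python as below. This is a sketch rather than the Eiffel class: SequenceIterator and its make_items argument are hypothetical, and the preconditions shown as assertions are one plausible reading of the cursor interface, not a statement of the library's contracts.

```python
class SequenceIterator:
    """Cursor over a conceptual sequence; items are computed on demand."""

    def __init__(self, make_items):
        # make_items: zero-argument callable returning a fresh Python
        # iterator (a hypothetical stand-in for lazy evaluation).
        self._make_items = make_items
        self._source = None
        self._item = None
        self.before = True
        self.after = False

    @property
    def off(self):
        return self.before or self.after

    @property
    def item(self):
        assert not self.off, "no current item"
        return self._item

    def start(self):
        self._source = self._make_items()
        self.before = False
        self.after = False
        self._advance()

    def forth(self):
        assert not self.off, "call start first, and stop at after"
        self._advance()

    def _advance(self):
        # The next item is only computed here, never ahead of time.
        try:
            self._item = next(self._source)
        except StopIteration:
            self._item = None
            self.after = True

    def another(self):
        # A new cursor over the same items, positioned before the first.
        return SequenceIterator(self._make_items)


squares = SequenceIterator(lambda: (n * n for n in range(1, 4)))
collected = []
squares.start()
while not squares.after:
    collected.append(squares.item)
    squares.forth()

copy = squares.another()
```

The exhausted cursor squares stays at after, while copy begins again at before and will yield the same items once start is called.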
XM_XPATH_SEQUENCE_ITERATOR[G -> XM_XPATH_ITEM] is a deferred class. A number of descendants are available for actual use:
Classes that return an XM_XPATH_MAPPING_ITERATOR from their iterator routine need to implement XM_XPATH_MAPPING_FUNCTION. This has a single feature map, which returns an XM_XPATH_MAPPED_ITEM. This is either Void, or an XM_XPATH_ITEM, or an XM_XPATH_SEQUENCE_ITERATOR[G -> XM_XPATH_ITEM]. If map detects an error, then its return value must encapsulate an XM_XPATH_ITEM in error.
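The three possible results of map (nothing, a single item, or an iterator over several items) suggest how a mapping iterator flattens them into one sequence. The following Python sketch illustrates that idea only: mapping_iterator and words_of are hypothetical names, None stands for Void, Python iterators stand for sequence iterators, and error items are left out for brevity.

```python
from collections.abc import Iterator


def mapping_iterator(base_items, mapping_function):
    """Yield, in order, everything the mapping function produces."""
    for base in base_items:
        mapped = mapping_function(base)
        if mapped is None:
            continue                 # Void: this base item contributes nothing
        if isinstance(mapped, Iterator):
            yield from mapped        # an iterator: splice in all of its items
        else:
            yield mapped             # a single item


def words_of(line):
    """Example mapping function covering all three kinds of result."""
    words = line.split()
    if not words:
        return None
    if len(words) == 1:
        return words[0]
    return iter(words)


result = list(mapping_iterator(["a b", "", "c"], words_of))
```

Because mapping_iterator is a generator, the mapping function is applied to each base item only when the consumer asks for more, which matches the lazy behaviour described for sequence iterators.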
Copyright © 2004, Colin Adams and others mailto:colin@colina.demon.co.uk http://www.gobosoft.com Last Updated: Thursday, April 15th, 2004