Allow a temporal feature to be indexed in multiple temporal indices using different temporal attributes

Open rwgdrummer opened this issue 9 years ago • 0 comments

A feature with more than one temporal attribute (e.g. creation date, approval date) may be indexed in more than one temporal index, one per attribute. When a cost-based optimizer (CBO) is constructed, it can choose the best index based on the query.

Below pulled from a comment on an older ticket: I may want *multiple* temporal indices over different time attributes of a feature. For example, I may want one index on start and end date and another index on creation date. A query processor can inspect the query and determine which index to use based on which attributes are both available and most constraining. How is this achieved?

My thought is that we continue to use time descriptors with an optional index identifier. The identifier indicates which index to apply with these attributes. Our ability to name indices is critical, as we do not want the data indexed by different sets of attributes in the same index table. The data adapter would be armed with one or more time descriptors, one for each index.
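A rough sketch of what a time descriptor with an optional index identifier might look like, and how an adapter could hold one per index. Every name here (`TimeDescriptorSketch`, `Descriptor`, `Registry`, `forIndex`) is hypothetical and chosen for illustration; none of it is existing GeoWave API. A `null` index ID stands in for "no identifier given", i.e. the default descriptor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; none of these are existing GeoWave classes.
final class TimeDescriptorSketch {
    /** Names the time attribute(s) of a feature type and, optionally, the index they feed. */
    static final class Descriptor {
        final String indexId;   // null marks the default descriptor
        final String startAttr;
        final String endAttr;   // equals startAttr when the time is an instant

        Descriptor(String indexId, String startAttr, String endAttr) {
            this.indexId = indexId;
            this.startAttr = startAttr;
            this.endAttr = endAttr;
        }
    }

    /** What the data adapter would hold: at most one descriptor per index ID. */
    static final class Registry {
        private final Map<String, Descriptor> byIndexId = new HashMap<>();

        void add(Descriptor d) {
            // Two different attribute sets must never share one index table.
            if (byIndexId.containsKey(d.indexId)) {
                throw new IllegalArgumentException("descriptor already set for index " + d.indexId);
            }
            byIndexId.put(d.indexId, d);
        }

        /** Descriptor for the index, else the default (null key), else null. */
        Descriptor forIndex(String indexId) {
            Descriptor d = byIndexId.get(indexId);
            return d != null ? d : byIndexId.get(null);
        }
    }
}
```

Rejecting a second descriptor for the same index ID is one way to enforce the "one attribute set per index table" rule at registration time rather than at ingest.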

Armed with this, how does this change the rest of the code? When CQLQuery is asked for a Query (SpatialQuery), it returns one based on time constraints derived from the CQL and aligned with the time descriptor attributes (the method is `List getIndexConstraints( final NumericIndexStrategy indexStrategy )`). Change this method to accept an Index, not just the strategy, so the ID is present. Using the ID, look up the time descriptor that matches. If none matches, use a default (one provided without an ID, which is often the one inferred from the feature type). Using the specific time descriptor for the index, pull out the relevant constraints from the query. This moves some of the logic performed in the constructor into this method. It also forces the retention of the time descriptors in the instance (client side only) _unless_ we also change the getIndexConstraints method to accept the adapter as well. I think this is not a good idea, since time descriptors are specific to one type of adapter. Alternatively, we could generalize time descriptors into a concept used across different adapters: have an adapter return the fieldIds used by a given index (and its CommonIndexModel).
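The lookup-then-extract flow described above could be sketched roughly as follows. This is an illustration under assumptions, not the real CQLQuery: the class `IndexConstraintSketch`, its `Range` type, and the `describe`/`constrain` helpers are all made up, and a plain index ID string stands in for passing the whole Index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the query object; not the GeoWave CQLQuery.
final class IndexConstraintSketch {
    /** Closed time range in epoch milliseconds. */
    static final class Range {
        final long min;
        final long max;

        Range(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /** Time ranges parsed once from the CQL, keyed by attribute name. */
    private final Map<String, Range> rangesByAttribute = new HashMap<>();
    /** Time attributes per index ID; the null key holds the default descriptor. */
    private final Map<String, List<String>> attributesByIndexId = new HashMap<>();

    void constrain(String attribute, long min, long max) {
        rangesByAttribute.put(attribute, new Range(min, max));
    }

    void describe(String indexId, String... attributes) {
        attributesByIndexId.put(indexId, Arrays.asList(attributes));
    }

    /** Takes the index ID (standing in for the Index) instead of only the strategy. */
    List<Range> getIndexConstraints(String indexId) {
        List<String> attrs = attributesByIndexId.get(indexId);
        if (attrs == null) {
            attrs = attributesByIndexId.get(null); // fall back to the default descriptor
        }
        List<Range> constraints = new ArrayList<>();
        if (attrs == null) {
            return constraints; // no descriptor at all: nothing to constrain
        }
        for (String attr : attrs) {
            Range r = rangesByAttribute.get(attr);
            if (r != null) {
                constraints.add(r); // keep only ranges on attributes this index covers
            }
        }
        return constraints;
    }
}
```

An empty result means the query does not constrain that index at all, which is itself a useful signal when deciding between indices.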

What is really nice is that the query optimizer can choose which indices are presented to this method.

I need ExtractTimeVisitorFilter to handle 'or' differently, building up the hierarchy of pairs. This can be used by the CBO to see which constraints *constrain* the data the most and thus lead to the best index. Alternatively, I just repeat the use of the visitor for each set of time descriptors. This is more in line with what is done today. It would result in one 'parse' step per temporal index. Not desirable, but not horrible, since we are not likely to see more than two or three different temporal indices.

The QueryIndexHelper need not change much. It can be given sets of constraints and a time descriptor, providing a 'trimmed' range based on stats. With histogram stats, a realistic estimate of the number of constrained rows can aid in the final decision for the appropriate index.
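To make the histogram idea concrete, here is a minimal sketch of estimating constrained rows per index and picking the smallest. It assumes an equal-width histogram per time attribute and uniform distribution within a bin; `IndexChoiceSketch`, `Histogram`, `estimateRows`, and `cheapest` are hypothetical names, and the real statistics classes may look quite different.

```java
import java.util.Map;

// Hypothetical sketch; not the GeoWave statistics or QueryIndexHelper API.
final class IndexChoiceSketch {
    /** Equal-width histogram over [lo, hi) for one time attribute. */
    static final class Histogram {
        final long lo;
        final long hi;
        final long[] counts;

        Histogram(long lo, long hi, long[] counts) {
            this.lo = lo;
            this.hi = hi;
            this.counts = counts;
        }

        /** Estimated rows in [min, max), assuming values are uniform within each bin. */
        double estimateRows(long min, long max) {
            double width = (double) (hi - lo) / counts.length;
            double total = 0;
            for (int i = 0; i < counts.length; i++) {
                double binLo = lo + i * width;
                double binHi = binLo + width;
                double overlap = Math.min(binHi, max) - Math.max(binLo, min);
                if (overlap > 0) {
                    total += counts[i] * (overlap / width); // partial bins count proportionally
                }
            }
            return total;
        }
    }

    /** Returns the index ID with the fewest estimated rows, or null if the map is empty. */
    static String cheapest(Map<String, Double> estimatedRowsByIndexId) {
        String best = null;
        double bestRows = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : estimatedRowsByIndexId.entrySet()) {
            if (e.getValue() < bestRows) {
                best = e.getKey();
                bestRows = e.getValue();
            }
        }
        return best;
    }
}
```

With only two or three temporal indices expected, a linear scan over the estimates is more than enough.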

rwgdrummer • Mar 29 '16 18:03