|
|
|
Background
Range queries against multidimensional data arise in all relational
databases and in particular in data-warehousing and data-mining
applications. Together with our project partners SAP, TST and NEC we
identified critical performance problems of relational database systems for
this type of queries. In this project we investigate the UB-Tree, an index
structure for the efficient processing of multidimensional range queries,
which can easily be implemented as middleware above existing RDBMS, and
its performance for multidimensional range queries. The UB-Tree
is balanced and has all the guaranteed performance characteristics of
B-Trees, i.e., it requires linear space for storage and logarithmic time
for the basic operations of insertion, retrieval and deletion. Therefore
it can efficiently support OLTP. In addition the UB-Tree preserves
clustering of objects with respect to Cartesian distance. Therefore, it
shows its main strengths for multidimensional data and in particular for
relational data, if a tuple is simply interpreted as a point in
multidimensional space. It has very high potential for parallel
processing. A single UB-Tree can replace a large number of secondary
indexes and join indexes including foreign column join indexes (FCJ). For
updates this means that only one UB-Tree must be managed instead of several
secondary indexes. This reduces runtime and storage requirements
substantially. For retrieval the UB-Tree has multiplicative complexity with
respect to the relative size of the ranges for range queries, resulting in
a dramatic performance improvement over multiple secondary indexes which
have additive range query complexity. Furthermore, using the
Tetris-Algorithm the UB-Tree enables reading data in any arbitrary sort
order without the necessity of external sorting. Thus data need to be read
only once to perform most of the operations of the relational algebra, such
as ordering, grouping, aggregation, projection and joining. Therefore, the
UB-Tree can support OLAP very efficiently. It is useful for geometric
databases, data-warehousing and data-mining applications, but even more for
databases in general, where multiple secondary indexes on one relation or
FCJ-indexes to join several relations are widespread, which can all be
replaced by a single UB-Tree in-dex. Therefore, the difficult index
selection problem largely disappears and the UB-Tree offers the
potential to integrate OLAP with OLTP in the same processing environment.
Basic Concept of the UB-Tree
| |