Z-order vs R-tree, the sequel


Last time we came to the conclusion that efficient operation of a spatial index based on Z-order requires two things:

  • an efficient algorithm for obtaining subqueries (sub-intervals of key values)
  • low-level work with the B-tree

Both are dealt with under the cut.

Let's start with the more interesting part: the algorithm.


Split into subqueries


We will consider the two-dimensional algorithm (point indexing) because of its relative simplicity; however, it generalizes easily to higher dimensions.

For simplicity we will use non-negative integer coordinates. The coordinates are limited to 32 bits, so the value of an index key fits into a uint64.
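
To make the key construction concrete, here is a minimal sketch of the bit interleaving. It assumes x occupies the even bit positions and y the odd ones; the actual zcurve_val_from_xy function used later in the article may differ in details such as the coordinate order.

#include <stdint.h>

/* Spread the 32 bits of v over the even bit positions of a 64-bit word
   (the usual "interleave with magic constants" technique). */
static uint64_t
spread_bits(uint32_t v)
{
    uint64_t x = v;
    x = (x | (x << 16)) & 0x0000FFFF0000FFFFULL;
    x = (x | (x << 8))  & 0x00FF00FF00FF00FFULL;
    x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    x = (x | (x << 2))  & 0x3333333333333333ULL;
    x = (x | (x << 1))  & 0x5555555555555555ULL;
    return x;
}

/* z-value of a point: the bits of x and y interleaved into one 64-bit key. */
static uint64_t
zvalue_from_xy(uint32_t x, uint32_t y)
{
    return spread_bits(x) | (spread_bits(y) << 1);
}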

We need the following simple properties of z-values:

  1. Let a rectangle be marked out on the plane. Then among all points of the rectangle, the smallest z-value belongs to the lower left corner (let's call it “z”) and the largest to the upper right corner (call it “Z”). This property follows in an obvious way from how z-values are constructed.

  2. Every rectangle can be split uniquely into two rectangles (by a vertical or horizontal cut) so that all z-values of the first rectangle are less than all z-values of the second. This follows from the self-similarity of the Z-curve: the unit cell of four cells is divided in half, then each half is divided in half again, and the same happens at all levels of the hierarchy.

How should an extent be cut so as to preserve the continuity of the z-order traversal?

It follows from the self-similarity of the curve that we may cut only along a grid whose step is a power of two and whose nodes are anchored at the origin.

But which of the 64 available grids should we choose? It's pretty simple. The extent being cut must occupy more than one cell of the chosen grid, otherwise there is simply nothing to cut. On the other hand, along either coordinate its size may not exceed 2 cells, and along at least one coordinate it must be exactly 2 cells. If the extent being cut is 2 cells along both coordinates, we cut along the coordinate whose bit is the more significant in the construction of the Z-value.

How do we find that grid? Also easy. It suffices to compare the Z-values of the corners, z and Z: start comparing from the high-order bits and find the bit at which they first diverge. That bit determines the desired grid.

How do we make the cut? Here we need to recall how Z-values are constructed, namely that the bits of the x and y coordinates are interleaved. Therefore (see the code sketch after this list):

  • Let the difference between z and Z start at bit m.
  • The cut will be along the coordinate that bit m belongs to; it does not matter whether that is x or y, so let it be x here (for y everything is exactly the same).
  • From the extent (x0,y0,x1,y1) we need to obtain two: (x0,y0,x2-1,y1) and (x2,y0,x1,y1).
  • To obtain x2, it suffices to keep bit m of the x coordinate set to 1 and reset all the less significant bits of x to zero.
  • x2-1 is obtained by clearing bit m and setting all the less significant bits of x to 1.
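
Here is a minimal sketch of that cut, reusing the zvalue_from_xy helper from the sketch above. The Extent struct, the bit layout (x on the even key bits) and the GCC builtin are illustrative assumptions, not the extension's actual code.

#include <stdint.h>

typedef struct { uint32_t x0, y0, x1, y1; } Extent;

/* Split an extent into two halves along the most significant bit at which
   the corner z-values diverge. Assumes the corners differ, i.e. the extent
   spans more than one cell of the relevant grid. */
static void
split_extent(const Extent *ext, Extent *lo, Extent *hi)
{
    uint64_t z = zvalue_from_xy(ext->x0, ext->y0);   /* lower left corner  */
    uint64_t Z = zvalue_from_xy(ext->x1, ext->y1);   /* upper right corner */
    int      m = 63 - __builtin_clzll(z ^ Z);        /* highest differing key bit */
    int      b = m / 2;                              /* its position inside the coordinate */

    *lo = *hi = *ext;
    if (m % 2 == 0)                                  /* the bit belongs to x */
    {
        uint32_t x2 = (ext->x1 >> b) << b;           /* bit b kept, lower bits cleared */
        lo->x1 = x2 - 1;                             /* bit b cleared, lower bits set  */
        hi->x0 = x2;
    }
    else                                             /* the bit belongs to y */
    {
        uint32_t y2 = (ext->y1 >> b) << b;
        lo->y1 = y2 - 1;
        hi->y0 = y2;
    }
}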

So, what does the algorithm for generating subqueries look like? Pretty simple:

  1. Set up a queue of subqueries; initially the queue contains only one item, the desired extent.
  2. While the queue is not empty:

    1. Take the item from the top of the queue.
    2. If this subquery does not trigger the stopping criterion:

      1. Compute z and Z, the values for its corners.
      2. Compare z and Z and find the bit m at which we will cut.
      3. Obtain the two subqueries as described above.
      4. Put them into the queue, first the one with the larger Z-values, then the second.

This way of generating subqueries guarantees that the Z-values of the final subqueries (i.e. those that triggered the stopping criterion) only increase; we never have to go back.
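
For illustration, the driver loop might look like the following sketch. The stop and emit callbacks are placeholders for the stopping criterion discussed below and for whatever consumes a final subquery; split_extent and Extent are the sketches from above.

#include <stdbool.h>

/* Generate subqueries for one query extent (geometry only, no B-tree yet).
   The "queue" is in fact a LIFO stack, which is what yields ascending
   Z-order of the final subqueries. */
#define MAX_DEPTH 66   /* each level leaves at most one sibling pending; at most 64 key bits */

static void
generate_subqueries(Extent query,
                    bool (*stop)(const Extent *),
                    void (*emit)(const Extent *))
{
    Extent stack[MAX_DEPTH];
    int    top = 0;

    stack[top++] = query;                /* initially only the desired extent */
    while (top > 0)
    {
        Extent cur = stack[--top];       /* take the item from the top */
        Extent lo, hi;

        if (stop(&cur))                  /* final subquery: hand it over */
        {
            emit(&cur);
            continue;
        }
        split_extent(&cur, &lo, &hi);    /* cut at the highest differing bit */
        stack[top++] = hi;               /* first the one with larger Z-values */
        stack[top++] = lo;               /* this one will be processed next */
    }
}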

Stopping criteria


This is very important: if neglected, the generation of subqueries will continue until everything has been cut into unit squares.

It should be noted that when formulating such a criterion we cannot rely on the query parameters: area, geometric properties and so on. In real data the distribution of points can be very uneven, for example cities versus empty space, and the population of the queried areas is not known to us in advance.

So we need integration with the search tree of the index, which, as we recall, is a B-tree.

What does a subquery look like from the point of view of the index? It is an interval of data located on some number of pages. If the subquery is empty, the data interval is empty too, but that does not come for free: to find out that there is no data, we have to try to read it and descend from the root of the tree down to a leaf page. It may even happen that an empty subquery looks beyond the end of the tree.

In general, reading the data of a subquery consists of two phases:

  • Probing the tree. Starting from the root page, we look for a key less than or equal to the Z-value of the lower left corner of the subquery (z), and in this way descend to a leaf page. The output is a stack of pages plus the leaf page held “in the air”. On that page we look for an element greater than or equal to the required one.

  • Reading forward until the end of the subquery. In PostgreSQL the leaf pages of a B-tree are linked into a list; if they were not, then in order to get the next page we would have to climb back up the stack of pages and then descend again.

So, after the probe query we have in hand the leaf page on which our data is supposed to start. Different situations are possible:

  1. The element found is greater than the upper right corner of the subquery (Z), i.e. there is no data.
  2. The element found on the page is less than Z and the last element of the page is less than Z, i.e. the subquery starts on this page and ends somewhere further on.
  3. The element found on the page is less than Z and the last element is greater than Z, i.e. the entire subquery is located on this page.
  4. The element found on the page is less than Z and the last element of the page is equal to Z, i.e. the subquery starts on this page but may end on the following ones (several elements with the same coordinates), and possibly much further if there are many duplicates.

Option 1 requires no action. For option 2 the following stopping criterion seems natural: we keep splitting subqueries until we get option 3 or 4. Option 3 is obvious. In case 4 there may be few pages of data, but splitting the subquery is pointless, because the next page(s) can only contain data with a key equal to Z, and after the cut we would find ourselves in the same situation. To cope with this, it is enough to read the following data pages as long as the key equals Z. There may be none at all; in general, option 4 is a rather exotic case.
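
As a compact summary, the decision after a probe could be expressed like this (a sketch; first_val and last_val are illustrative names for the first element found on the leaf page and for its last element):

/* Possible outcomes after the probe has landed on a leaf page. */
typedef enum
{
    SUBQUERY_EMPTY,        /* case 1: no data for this subquery           */
    SUBQUERY_SPLIT_MORE,   /* case 2: keep splitting this subquery        */
    SUBQUERY_ON_ONE_PAGE,  /* case 3: read the rest of the page and stop  */
    SUBQUERY_ENDS_IN_DUPS  /* case 4: read on while the key equals Z      */
} ProbeOutcome;

static ProbeOutcome
classify_probe(uint64_t first_val, uint64_t last_val, uint64_t Z)
{
    if (first_val > Z)
        return SUBQUERY_EMPTY;
    if (last_val < Z)
        return SUBQUERY_SPLIT_MORE;
    if (last_val == Z)
        return SUBQUERY_ENDS_IN_DUPS;
    return SUBQUERY_ON_ONE_PAGE;
}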


B-tree


Low-level work with the B-tree presents no special difficulties, but you will have to create an extension. The general logic is as follows: register an SRF (set-returning function):

CREATE TYPE __ret_2d_lookup AS (c_tid TID, x integer, y integer);
CREATE FUNCTION zcurve_2d_lookup(text, integer, integer, integer, integer)
RETURNS SETOF __ret_2d_lookup
AS 'MODULE_PATHNAME'
LANGUAGE  C  IMMUTABLE STRICT;

It receives the index name and the extent, and returns a set of elements, each of which is a pointer to a table row (TID) plus the coordinates.

Access to the tree itself looks like this:

const char *relname; /* external parameter: the index name */
...
List *relname_list;
RangeVar *relvar;
Relation rel;
...
/* resolve the qualified name and open the index */
relname_list = stringToQualifiedNameList(relname);
relvar = makeRangeVarFromNameList(relname_list);
rel = indexOpen(relvar);
...
indexClose(rel);

Getting the root page:

int access = BT_READ;
Buffer buf;
...
buf = _bt_getroot(rel, access);

In general, the index search is done like an ordinary B-tree search (see postgresql/src/backend/access/nbtree/nbtsearch.c). The changes are related to the specifics of the key; perhaps one could do without them, it would just be a bit slower.
Search inside the page looks like this:

Page page;
BTPageOpaque opaque;
OffsetNumber low, high;
int32 result, cmpval;
Datum datum;
bool isNull;
...
page = BufferGetPage(buf);
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
low = P_FIRSTDATAKEY(opaque);
high = PageGetMaxOffsetNumber(page);
...
/* here goes a binary search over [low, high] */
...
/* for a leaf page, return the found value */
if (P_ISLEAF(opaque))
    return low;
/* for an internal page, the previous item */
return OffsetNumberPrev(low);
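
For reference, the elided binary search could be shaped roughly like this; a sketch that compares the raw uint64 keys extracted from the index tuples (start_val, the value being probed for, is an illustrative name):

/* Find the first item on the page whose key is >= start_val (a sketch). */
static OffsetNumber
page_lower_bound(Relation rel, Page page, uint64 start_val)
{
    TupleDesc itupdesc = RelationGetDescr(rel);
    BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    OffsetNumber low = P_FIRSTDATAKEY(opaque);
    OffsetNumber high = PageGetMaxOffsetNumber(page) + 1;

    while (low < high)
    {
        OffsetNumber mid = low + (high - low) / 2;
        IndexTuple itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, mid));
        bool isNull;
        Datum datum = index_getattr(itup, 1, itupdesc, &isNull);

        if ((uint64) DatumGetInt64(datum) < start_val)
            low = mid + 1;
        else
            high = mid;
    }
    return low;
}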

Retrieving an item from a page:

OffsetNumber offnum; 
Page page;
Relation rel;
TupleDesc itupdesc = RelationGetDescr(rel);
IndexTuple itup;
Datum datum;
bool isNull;
uint64 val;
...
itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offnum));
datum = index_getattr(itup, 1, itupdesc, &isNull);
val = DatumGetInt64(datum);



Final algorithm


  1. Set up a queue of subqueries; initially the queue contains only one item, the desired extent.
  2. While the queue is not empty:

    1. Take the item from the top of the queue.
    2. Run the probe query into the index for z (the lower left corner). To save work, the probe need not be performed every time, but only when the last value found (initially 0) is less than z; otherwise the current reading position already points at or beyond the start of this subquery.
    3. If the value found exceeds Z (the upper right corner), finish processing this subquery and return to step 2.
    4. Check the last element of the leaf B-tree page on which the probe stopped.
    5. If it is greater than or equal to Z, fetch the page elements, filter them for membership in the search extent and remember the resulting points.
    6. If it is equal to Z, additionally read the index forward until the key with value equal to Z is exhausted, and remember those points as well.
    7. Otherwise the last value of the page is less than Z:

      1. Compare z and Z and find the bit m at which we will cut.
      2. Obtain the two subqueries as described above.
      3. Put them into the queue, first the one with the larger Z-values, then the second.



Preliminary results


The title illustration of this article shows a real query split into subqueries together with the points found. For comparison, it shows the results found by the R-tree alongside those obtained with the algorithm described above. The picture dates from debugging times, and you can see that one point is missing.

But pictures are pictures, and we want to see a performance comparison. On our side there will be the same table as before:

create table test_points (x integer,y integer);
COPY test_points from '/home/.../data.csv';
create index zcurve_test_points on test_points(zcurve_val_from_xy(x, y));

And queries of the form:

select count(1) from zcurve_2d_lookup('zcurve_test_points', 500000,500000,501000,501000);

We compare against the R-tree as the de facto standard. Moreover, in contrast to the previous article, we need an “index only scan” on the R-tree, because our Z-index no longer refers back to the table.

create table test_pt as (select point(x,y) from test_points);
create index test_pt_idx on test_pt using gist (point);
vacuum test_pt;

On this data the query:

explain (analyze, buffers) select * from test_pt where 
point <@ box(
point(500000, 500000), 
point(510000, 510000));

produces:

 QUERY PLAN
---------------------------------------------------------------------------------------------
Index Only Scan using test_pt_idx on test_pt (cost=0.42..539.42 rows=10000 width=16) 
(actual time=0.075..0.531 rows=968 loops=1)
Index Cond: (point <@ '(510000,510000),(500000,500000)'::box)
Heap Fetches: 0
Buffers: shared hit=20
Planning time: 0.088 ms
Execution time: 0.648 ms
(6 rows)

as required.

A proper comparison:
Query type                  Index type   Time, ms     Shared reads   Shared hits
100X100 (~1 point)          R-tree       0.4*         1.8            3.8
                            Z-curve      0.34*        1.2            3.8
1000X1000 (~100 points)     R-tree       0.5...7**    6.2            4.9
                            Z-curve      0.41*        2.8            37
10000X10000 (~10,000 pts)   R-tree       4...150***   150            27
                            Z-curve      6.6****      43.7           2900

* averaged over a series of 100,000 queries
** averaged over series of different lengths, 10,000 and 100,000 queries
*** averaged over series of different lengths, 1,000 and 10,000 queries
**** averaged over a series of 10,000 queries

The measurements were carried out on a modest virtual machine with two cores and 4 GB of RAM, so the times are not absolute values, but the numbers of pages read can be trusted.


Insights


  • In any case, on small queries the Z-index works faster than the R-tree
  • and at the same time reads significantly fewer pages
  • The R-tree starts missing the cache en masse much earlier
  • The Z-index itself is four times smaller, so it uses the cache more efficiently
  • and, quite possibly, its misses also go past the disk cache of the host machine; otherwise such a difference is hard to explain


What's next


The two-dimensional point index considered here is suitable only as a proof of concept; in real life it is of little use.

Even for a three-dimensional index a 64-bit key is not enough (or only barely enough) for a useful resolution.

So, coming up next:

  • a transition to a different key type, numeric
  • a 3D variant, including points on a sphere
  • checking the feasibility of working with the Hilbert curve
  • full-scale measurements
  • a 4D variant (rectangles)
  • a 6D variant (rectangles on a sphere)
  • ...

PS: The source code is posted here under the BSD license; what is described in this article corresponds to the “raw 2D” branch.

PPS: The algorithm itself was developed back in the day (2004-2005) in collaboration with Artushina.

PPPS: Huge thanks to the guys from PostgresPro, thanks to whom I decided to implement this algorithm in PostgreSQL.

PPPPS: Continued here
