Explaining the inexplicable

Friends, we are happy to continue publishing interesting materials on various aspects of working with PostgreSQL. Today's translation opens a whole series of articles by Hubert Lubaczewski, which will surely appeal to a wide readership.



One of the first things a new DBA hears is "use EXPLAIN". And at the first attempt they are confronted with the incomprehensible:

                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=146.63..148.65 rows=808 width=138) (actual time=55.009..55.012 rows=71 loops=1)
   Sort Key: n.nspname, p.proname, (pg_get_function_arguments(p.oid))
   Sort Method: quicksort  Memory: 43kB
   ->  Hash Join  (cost=1.14..107.61 rows=808 width=138) (actual time=42.495..54.854 rows=71 loops=1)
         Hash Cond: (p.pronamespace = n.oid)
         ->  Seq Scan on pg_proc p  (cost=0.00..89.30 rows=808 width=78) (actual time=0.052..53.465 rows=2402 loops=1)
               Filter: pg_function_is_visible(oid)
         ->  Hash  (cost=1.09..1.09 rows=4 width=68) (actual time=0.011..0.011 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.09 rows=4 width=68) (actual time=0.005..0.007 rows=4 loops=1)
                     Filter: ((nspname <> 'pg_catalog'::name) AND (nspname <> 'information_schema'::name))

What could it mean?

It is useless to try to understand the explain above. Let's start with something simpler. But before that, there is one important thing I'd like you to understand:

PostgreSQL remembers

This means that PostgreSQL keeps certain meta-information (information about the information): row counts, numbers of distinct values, most common values, and so on. For large tables this information is based on a random sample, but overall Postgres really does know a lot about our data.
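You can peek at this meta-information yourself. A minimal sketch using the system catalogs pg_class and pg_stats (the table name test here is just a placeholder):

-- planner-level estimates kept for every table
SELECT reltuples, relpages FROM pg_class WHERE relname = 'test';

-- per-column statistics gathered by ANALYZE
SELECT attname, n_distinct, most_common_vals
  FROM pg_stats
 WHERE tablename = 'test';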

So let's look at a simple query and its explain:

$ explain select * from test where i = 1;
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on test  (cost=0.00..40.00 rows=12 width=4)
   Filter: (i = 1)
(2 rows)

The query is quite simple and, I think, requires no further comment.

In explain output, the first line, and every line starting with "->", is an operation. The remaining lines are supplementary information for the operation above them.

In our case there is only one operation – a sequential scan of the table test.

There is also additional information about the filter.

A sequential scan means that PostgreSQL will "open" the table data and read it. Theoretically it can filter out (discard) rows, but in general it is prepared to read and return the whole table.

Why "prepared"? I'll explain in a minute.

So, the Seq Scan line tells us that we are scanning the table sequentially. And that the table is called "test" (though here lies one of the bigger gotchas of explain – it doesn't show the schema name, which has bitten me more than once).
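To see why the missing schema name can bite, here is a small sketch (the schema names a and b are made up for illustration):

create schema a; create table a.test (i int4);
create schema b; create table b.test (i int4);
set search_path = a, b;
explain select * from test;
-- the plan says only "Seq Scan on test" – nothing tells you whether it is a.test or b.test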

And what are those numbers in parentheses after the operation?

I want to ask you a question. You have this table:

                                  Table "public.t"
   Column    |  Type   |                   Modifiers
-------------+---------+------------------------------------------------
 id          | integer | not null default nextval('t_id_seq'::regclass)
 some_column | integer |
 something   | text    |
Indexes:
    "t_pkey" PRIMARY KEY, btree (id)
    "q" btree (some_column)

Given the definition of this table and the query:

SELECT * FROM t where some_column = 123;

What do you think is the best way to execute this query? Sequentially scan the table or use the index?

If your answer is "of course, use the index, since the column has an index, so it will be faster", then I ask: what about the situation where the table has only one row, and that row contains some_column equal to 123?

To do a sequential scan, I need to read only one table page (8192 bytes), and I get the row. To use the index, I need to read a page from the index to check whether the table has rows matching the criteria, and then read a page from the table.
In the end – twice as much work!

You could say: "Of course, but we're talking about very small tables, so speed doesn't matter." Fine. Then let's imagine a table with 10 billion rows, every one of which has some_column = 123. The index is of no help here, and in reality it would seriously worsen the situation.

Of course, if you have a million rows and only one of them has some_column = 123, an index scan will be the best solution.

Thus, it is impossible to say in general whether a given query should use an index, or whether it will use one. To understand that, more information is needed. And this leads us to a simple conclusion: depending on the situation, one method of retrieving data will be better or worse than another.
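You can watch Postgres make this call on its own. A minimal sketch, with made-up table and index names (exact plans may differ between versions and configurations):

create table skewed (some_column int4);
insert into skewed select 123 from generate_series(1,100000);
insert into skewed select i from generate_series(1,10) i;
create index skewed_idx on skewed (some_column);
analyze skewed;

explain select * from skewed where some_column = 123; -- nearly every row matches: expect a Seq Scan
explain select * from skewed where some_column = 5;   -- a handful of rows match: expect the index to be used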

PostgreSQL (up to a point) checks all possible scenarios. It knows how many rows you have and how many rows (most likely) match the given criteria, so it can make pretty smart decisions.

But how are these decisions made? That is exactly what the first set of numbers in explain shows. This is the cost.

Some people think these costs are estimated in seconds. They are not. Their unit is "fetching one page in sequential order". That is, they estimate both time and resource usage.

In postgresql.conf you may have noticed these settings:

seq_page_cost = 1.0            # measured on an arbitrary scale
random_page_cost = 4.0         # same scale as above
cpu_tuple_cost = 0.01          # same scale as above
cpu_index_tuple_cost = 0.005   # same scale as above
cpu_operator_cost = 0.0025     # same scale as above

That is, we can even change the cost of reading a sequential page. These options set the costs that PostgreSQL expects different methods of executing the same query to incur.
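These settings can also be changed per session, which makes it easy to experiment with how they influence planning. A minimal sketch, reusing the test table from the first example (the value 1.1 is arbitrary; it is often suggested for SSD-backed storage):

SET random_page_cost = 1.1;   -- tell the planner random reads are almost as cheap as sequential ones
EXPLAIN SELECT * FROM test WHERE i = 1;
RESET random_page_cost;       -- back to the configured value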

So, for example, let's create a simple table with 1000 rows, some text, and an index:

create table test (id serial primary key, some_text text);
CREATE TABLE

insert into test (some_text) select 'whatever' from generate_series(1,1000);
INSERT 0 1000

Running explain with a condition on id gives us the following:

explain select * from test where id = 50;
                               QUERY PLAN
-----------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=0.28..8.29 rows=1 width=36)
   Index Cond: (id = 50)
(2 rows)

What if we told Postgres that an index scan cannot be used under any circumstances?
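The original doesn't show the command used; the standard way to do this is the enable_indexscan planner setting (a real GUC, settable per session):

set enable_indexscan = false;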

explain select * from test where id = 50;
                               QUERY PLAN
------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4.28..8.30 rows=1 width=13)
   Recheck Cond: (id = 50)
   ->  Bitmap Index Scan on test_pkey  (cost=0.00..4.28 rows=1 width=0)
         Index Cond: (id = 50)
(4 rows)

And let's disable that one too:
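Again, presumably via the corresponding planner setting:

set enable_bitmapscan = false;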

explain select * from test where id = 50;
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on test  (cost=0.00..18.50 rows=1 width=13)
   Filter: (id = 50)
(2 rows)

OK, now let's put them next to each other:

Index Scan using test_pkey on test  (cost=0.28..8.29 rows=1 width=36)
Bitmap Heap Scan on test  (cost=4.28..8.30 rows=1 width=13)
Seq Scan on test  (cost=0.00..18.50 rows=1 width=13)

By default, Postgres uses the index scan. Why? Simple – in this case it is the cheapest option. It costs 8.29, while the bitmap heap scan (whatever that is) would cost 8.30, and the seq scan 18.5.
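As a sanity check, the seq scan's 18.50 can be reconstructed from the settings listed earlier. For a plain seq scan with a filter, the estimate is pages read plus per-row CPU costs; assuming our 1000-row table occupies 6 pages (an assumption you can verify via relpages in pg_class):

total cost = relpages * seq_page_cost
           + reltuples * cpu_tuple_cost     -- processing each row
           + reltuples * cpu_operator_cost  -- applying the "id = 50" filter
           = 6 * 1.0 + 1000 * 0.01 + 1000 * 0.0025
           = 18.50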

OK, but cost is given as two numbers: number..number. What are they, and why have I only talked about the second number? If we went by the first number, the seq scan would be the winner, since its first cost is zero, while the index scan has 0.28 and the bitmap heap scan already has 4.28.

The cost is displayed as a range (number..number) because it shows both the cost of starting the operation and the cost of returning all rows (where "all" means all rows returned by this operation, not all rows in the table).

What is the startup cost? For a seq scan, all you have to do is read a page and return rows. That's it. But, for example, to sort a dataset you have to read all of the data and actually sort it before you can even think about returning the first row. This is well illustrated by the following explain:
                            QUERY PLAN
-------------------------------------------------------------------
 Sort  (cost=22.88..23.61 rows=292 width=202)
   Sort Key: relfilenode
   ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=202)
(3 rows)
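(The original doesn't show the query behind this plan; judging by the Sort Key, it was presumably something like the following.)

explain select * from pg_class order by relfilenode;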

Note that the startup cost for Sort is 22.88, while the total cost is just 23.61. So returning rows from the Sort is cheap in terms of cost, but sorting them is not.

The next piece of information in explain is "rows". This is the approximate number of rows that, according to PostgreSQL, the operation can return (it might return fewer, for example when a LIMIT cuts it short). This is also very important for some operations – joins, for instance. Joining two tables that hold 20 rows in total can be done in any number of ways, and it more or less doesn't matter which. But when you join a million-row table to a billion-row table, the way the join is done matters a great deal (I don't mean inner join / left join here, but rather hash join, nested loop, merge join – if you don't know what this is about, don't worry, I'll explain later).

Of course, this number can be estimated wrongly – for many reasons. Sometimes it doesn't matter, and sometimes it does. But we'll talk about bad estimates later.

The last piece of information is width. This is PostgreSQL's estimate of how many bytes, on average, a single row returned by this operation contains. For example:

explain select * from pg_class;
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=202)
(1 row)

explain select relname, relkind from pg_class;
                        QUERY PLAN
------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=65)
(1 row)

As you can see, limiting the number of fields changed the width, and consequently the amount of data that has to be passed through the query execution.

And now, pay attention, the most important piece of information: explain output is a tree. The upper node requires data from the nodes beneath it.

Let's go back to the plan from the beginning of the article.

It has 5 operations: sort, hash join, seq scan, hash, and seq scan again. PostgreSQL executes the top operation – the sort – which in turn runs the one beneath it (the hash join) and gets data from it. The hash join, to return data to the sort, has to run the seq scan (on pg_proc) and the hash (#4). And finally, the hash, to return its data, has to run the seq scan on pg_namespace.

It is important to understand that some operations can return data immediately and, more importantly, gradually. Seq Scan, for example. And some cannot. Here, for instance, we see that Hash (#4) has a startup cost equal to the total ("return all rows") cost of its sub-operation, the seq scan. This means that before the hash operation can return even a single row, it has to read all the rows from its sub-operation.

This matter of returning rows gradually becomes especially important when you start writing functions. Consider this function:

CREATE OR REPLACE FUNCTION public.test()
 RETURNS SETOF integer
 LANGUAGE plpgsql
AS $function$
declare
    i int4;
begin
    for i in 1..3 loop
        return next i;
        perform pg_sleep(1);
    end loop;
    return;
end;
$function$;

If you don't understand it, don't worry. The function returns 3 rows, each containing a single integer – 1, 2, and 3. The important part is that it sleeps for 1 second after returning each row.

This means that if I do this:

select * from test();

I'll have to wait 3 seconds for the results.

But how long will we have to wait for results in this scenario:

select * from test() limit 1;

Let's see:

\timing
Timing is on.

select * from test() limit 1;
 test
------
    1
(1 row)

Time: 3005.334 ms

The same 3 seconds. Why? Because PL/pgSQL (and most, if not all, PL/* languages) cannot return partial results. It looks like it can, with "return next", but in fact all the results are stored in a buffer and returned together when the function execution ends.
On the other hand, "normal" operations can usually return partial data. This can be seen by running some trivial operation, such as a sequential scan, on a relatively large table:

create table t as
    select i as id,
           repeat('depesz', 100)::text as payload
    from generate_series(1,1000000) i;

Explain on this table shows:

explain analyze select * from t;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..185834.82 rows=10250082 width=36) (actual time=0.015..232.380 rows=1000000 loops=1)
 Total runtime: 269.666 ms
(2 rows)

explain analyze select * from t limit 1;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.02 rows=1 width=36) (actual time=0.003..0.003 rows=1 loops=1)
   ->  Seq Scan on t  (cost=0.00..185834.82 rows=10250082 width=36) (actual time=0.003..0.003 rows=1 loops=1)
 Total runtime: 0.016 ms
(3 rows)

(please look only at "Total runtime: .." for now)

As you can see, the sequential scan finished very quickly – as soon as the LIMIT's appetite was satisfied with exactly 1 row.

Note that even here the costs (which are not the best criterion for comparing queries) show that the top node (the seq scan in the first query, the limit in the second) has very different total costs for returning all rows: 185834.82 versus 0.02.

So, the first 4 numbers for any operation (the two cost estimates, the row count, and the width) are estimates. They may be correct, or they may not be.

The other 4 numbers, which you get when you run "EXPLAIN ANALYZE query" or "EXPLAIN (ANALYZE on) query", show the real results.
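(One caution worth adding, a standard caveat rather than something from the original text: ANALYZE actually executes the query. So for data-modifying statements you may want a transaction you can roll back:)

BEGIN;
EXPLAIN ANALYZE DELETE FROM test WHERE id < 100;
ROLLBACK;  -- the delete is undone; only the plan and timings remain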

It is still shown as a range, but this time it is actual time. This is how much time PostgreSQL spent working on the operation (on average, because it could have run the same operation many times). And just like cost, the time is a range: the time to begin the operation and the time to return all the data. Let's check this plan:

$ explain analyze select * from t limit 100;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9.33 rows=100 width=608) (actual time=0.008..0.152 rows=100 loops=1)
   ->  Seq Scan on t  (cost=0.00..93333.86 rows=999986 width=608) (actual time=0.007..0.133 rows=100 loops=1)
 Total runtime: 0.181 ms
(3 rows)

As you can see, the startup time for the Limit operation is 0.008 (the units here are milliseconds). This is because the Seq Scan (which the Limit called to get its data) took 0.007 ms to return the first row, and then another 0.001 ms was spent on processing within the Limit itself.

Then (after the first row was returned), the Limit kept getting data from the Seq Scan until it had received 100 rows. At that point it stopped the sequential scan (this happened 0.133 ms after the query started) and finished itself another 0.019 ms later.

The actual rows count, as the name suggests, shows how many rows (on average) this operation returned. And loops shows how many times the operation was executed in total.

When would an operation be executed more than once? For example, in some joins or with subqueries. The plan would look something like the one below.
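(The original post embeds a real plan here via explain.depesz.com; since it is not reproduced in this translation, here is a hypothetical plan of the same shape, with numbers invented purely to match the discussion that follows.)

                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..3.07 rows=2 width=8) (actual time=0.183..0.350 rows=2 loops=1)
   Join Filter: (a.id = b.id)
   ->  Seq Scan on a  (cost=0.00..1.02 rows=2 width=4) (actual time=0.010..0.012 rows=2 loops=1)
   ->  Seq Scan on b  (cost=0.00..1.01 rows=1 width=4) (actual time=0.155..0.160 rows=1 loops=2)
 Total runtime: 0.421 ms
(5 rows)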

Note that the third operation (the seq scan on b) shows loops=2. This means the seq scan was executed twice, returned on average 1 row, and took on average 0.160 ms to complete. So the total time spent in this particular operation is 2 * 0.160 ms = 0.32 ms (this is what the exclusive/inclusive columns on explain.depesz.com show).

Very often, poor query performance comes down to an operation being looped over many times – some subquery or scan executed over and over. For example, as in the real-world plan the original post links to.

(Of course, this doesn't mean Postgres is at fault for choosing such a plan. Perhaps there simply were no other options, or the other options were estimated as even more expensive.)

In that linked example, note that although the actual time of operation #3 is only 0.003 ms, the operation was executed more than 26,000 times, which in the end added up to almost 79 ms of total time.

I think that exhausts the theory needed to read explains. Most likely you still don't know what the various operations and other pieces of information mean, but at least you now know what the numbers mean, and what the difference is between explain (which shows costs in abstract units, based on statistics from random samples) and explain analyze (which shows the actual times, row counts, and run times in units that let you compare different queries).

As always, I'm afraid I've missed plenty of things that are important but didn't catch my eye, or (even worse) that I consider "obvious". If you feel something is missing, please let me know and I'll try to fill in the gaps as soon as possible.

I also want to add that I plan to develop this text into 2-3 more publications, which will cover:

  • what operations exist, how they work, and what to expect when you see them in explain output;
  • what statistics are, how Pg gathers them, how to view them, and how to get the most out of them.


I hope this article was useful for you. Subscribe and stay tuned – the translation of the next article in the series will appear on the blog soon! As usual, feel free to leave feedback and suggestions; we'll include the most interesting ones in the upcoming PG Day'16 Russia! :)
Article based on information from habrahabr.ru
