Bug #680
slow sql queries with postgres backend and prepared statements
| Status: | Feedback | Start date: | 04/22/2009 | |
|---|---|---|---|---|
| Priority: | Normal | Due date: | ||
| Assignee: | EgS | % Done: | 0% |
|
| Category: | Quassel Core | |||
| Target version: | - | |||
| Version: | 0.5-pre | OS: | Linux |
Description
When prepared statements with postgresql those queries will be optimized at creation time. This can lead to suboptimal execution patterns. In this particular case fetching the backlog takes ~500ms per bufferid which leads to a timeout when connecting with a client to the core.
Reproducible: unknown (always on my system)
Steps to Reproduce:
1. connect to database with pqsl
2.
PREPARE quassel_3 AS SELECT messageid, time, type, flags, sender, message FROM backlog JOIN sender ON backlog.senderid = sender.senderid WHERE bufferid = $1 ORDER BY messageid DESC LIMIT $2;
3.
EXPLAIN ANALYZE EXECUTE quassel_3(13,1000);
Actual Results:
Limit (cost=0.00..2953.41 rows=1244 width=100) (actual time=52.738..481.666 rows=656 loops=1)
-> Nested Loop (cost=0.00..29522.22 rows=12435 width=100) (actual time=52.731..480.869 rows=656 loops=1)
-> Index Scan Backward using backlog_pkey on backlog (cost=0.00..24116.20 rows=12435 width=59) (actual time=52.601..473.359 rows=656 loops=1)
Filter: (bufferid = $1)
-> Index Scan using sender_pkey on sender (cost=0.00..0.42 rows=1 width=49) (actual time=0.006..0.008 rows=1 loops=656)
Index Cond: (sender.senderid = backlog.senderid)
Total runtime: 497.364 ms
(7 rows)
Expected Results:
Limit (cost=0.00..3659.73 rows=1000 width=100) (actual time=8.370..27.696 rows=656 loops=1)
-> Nested Loop (cost=0.00..18020.51 rows=4924 width=100) (actual time=8.366..26.780 rows=656 loops=1)
-> Index Scan using backlog_bufferid_idx on backlog (cost=0.00..14749.11 rows=4924 width=59) (actual time=8.327..17.198 rows=656 loops=1)
Index Cond: (bufferid = 13)
-> Index Scan using sender_pkey on sender (cost=0.00..0.65 rows=1 width=49) (actual time=0.008..0.010 rows=1 loops=656)
Index Cond: (sender.senderid = backlog.senderid)
Total runtime: 28.243 ms
(7 rows)
Workaround:
1. Create a function that dynamically calls the SQL query in order to force query optimization at runtime:
CREATE TYPE select_messages_result AS (messageid integer, time timestamp without time zone, type integer, flags integer, sender varchar(128), message text);
CREATE OR REPLACE FUNCTION select_messages(integer, integer)
RETURNS SETOF select_messages_result
STABLE
AS $$
DECLARE result select_messages_result;
BEGIN
FOR result IN EXECUTE 'SELECT messageid, time, type, flags, sender, message
FROM backlog
JOIN sender ON backlog.senderid = sender.senderid
WHERE bufferid = ' || quote_literal($1) || '
ORDER BY messageid DESC
LIMIT ' || quote_literal($2)
LOOP
RETURN NEXT result;
END LOOP;
END;
$$ LANGUAGE plpgsql;
2. Call function from the prepared statement
SELECT * from select_messages($1, $2)
Related files:
src/core/SQL/PostgreSQL/14/select_messages.sql
History
#1 Updated by EgS over 2 years ago
- Status changed from New to Rejected
This is rather a bug in Postgres than in Quassel. The error is, that Postgres doesn't pick up the proper index which is - for such a simple query - pretty surprising.
The benefit of prepared queries is exactly to avoid query planing for the same query over and over again. The proposed solution is to discard the benefits of prepared queries by introducing overhead via a new pgsql function...
If you want to force query planing for each executed query, the solution would be to simply waive prepared queries and go for regular - non prepared - queries.
#2 Updated by EgS over 2 years ago
- Status changed from Rejected to Resolved
Seems to be fixed in git:
http://bugs.quassel-irc.org/repositories/revision/1/4bb04fc5758f566e58d560d6f8c510c5cb038486
#3 Updated by m4yer over 2 years ago
Postgresql-8.4 seems to bring huge advantages (in connection with the latest patches).
Maybe a improvment in query planing (or so^^)
#4 Updated by sph over 2 years ago
PostgreSQL 8.4 fixes this bug indeed.
#5 Updated by teo over 2 years ago
- Status changed from Resolved to Feedback
This doesn't seem to be fixed yet, in Postgres 8.4.1, as confirmed by al and myself:
psql (8.4.1)
Type "help" for help.
quassel=> PREPARE quassel_3 AS SELECT messageid, time, type, flags, sender, message FROM backlog JOIN sender ON backlog.senderid = sender.senderid WHERE bufferid = $1 ORDER BY messageid DESC LIMIT $2;
PREPARE
quassel=> EXPLAIN ANALYZE EXECUTE quassel_3(13,1000);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..73077.69 rows=7300 width=100) (actual time=10354.128..10354.128 rows=0 loops=1)
-> Nested Loop (cost=0.00..730826.97 rows=73005 width=100) (actual time=10354.123..10354.123 rows=0 loops=1)
-> Index Scan Backward using backlog_pkey on backlog (cost=0.00..244924.08 rows=73005 width=60) (actual time=10354.118..10354.118 rows=0 loops=1)
Filter: (bufferid = $1)
-> Index Scan using sender_pkey on sender (cost=0.00..6.64 rows=1 width=48) (never executed)
Index Cond: (sender.senderid = backlog.senderid)
Total runtime: 10354.341 ms
(7 rows)
#6 Updated by APTX about 2 years ago
I would suggest lowering random_page_cost described: http://www.postgresql.org/docs/8.4/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
Some more info about how and why changing this constant can help: http://archives.postgresql.org/pgsql-performance/2004-02/msg00274.php
I also see no reason why you shouldn't prefer to do index scans on a database like quassels as quassel only has very large or very small tables, so the slowdown in using an index on a small table would be negligible compared to the speedup I got in that one query (from 4.5 seconds to under 1 ms)
#7 Updated by amiconn over 1 year ago
This problem is still reproducable with PostgreSQL 8.4.3, both on Debian x86 and x86_64. Lowering random_page_cost to 2.0 as suggested above does help (the default is 4.0). Just set this as a user variable for the database user 'quassel' (or whatever you called it):
ALTER ROLE quassel SET random_page_cost='2';
#8 Updated by sdancer over 1 year ago
With 8.4.4, I get:
quassel=# EXPLAIN ANALYZE EXECUTE quassel_3(1,10);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=83250.25..83285.42 rows=14068 width=103) (actual time=23129.867..23129.893 rows=10 loops=1)
-> Sort (cost=83250.25..83601.96 rows=140683 width=103) (actual time=23129.863..23129.872 rows=10 loops=1)
Sort Key: backlog.messageid
Sort Method: top-N heapsort Memory: 19kB
-> Hash Left Join (cost=6385.04..55827.38 rows=140683 width=103) (actual time=1736.794..18053.591 rows=2551568 loops=1)
Hash Cond: (backlog.senderid = sender.senderid)
-> Bitmap Heap Scan on backlog (cost=2646.93..45593.47 rows=140683 width=61) (actual time=1489.377..7584.482 rows=2551568 loops=1)
Recheck Cond: (bufferid = $1)
-> Bitmap Index Scan on backlog_bufferid_idx (cost=0.00..2611.76 rows=140683 width=0) (actual time=1470.950..1470.950 rows=2551568 loops=1)
Index Cond: (bufferid = $1)
-> Hash (cost=1781.27..1781.27 rows=89827 width=50) (actual time=247.141..247.141 rows=88872 loops=1)
-> Seq Scan on sender (cost=0.00..1781.27 rows=89827 width=50) (actual time=0.016..85.011 rows=88872 loops=1)
Total runtime: 23131.169 ms
(13 rows)
That's on a very limited virtual host.
#9 Updated by amiconn over 1 year ago
The bug suddenly came back even at random_page_cost=2 (PostgreSQL 8.4.4 on Debian x86_64, quasselcore 0.7-beta1+11 git-f16f4c5). Lowering random_page_cost further (random_page_cost=1) fixed it for now.
This really needs some in-depth analysis, given that other users see even different slow query plans.
#10 Updated by amiconn over 1 year ago
The bug still exists with default settings and PostgreSQL 9.0.0 (tested on Debian 32 bit, quasselcore 0.8-pre+19 git-fdec4a8). Lowering random_page_cost helps as usual.