A moderator reported that moving topics took a long time in production and our sandbox. I just tried it and got 70 seconds through the GUI! It didn’t used to take this long. I’m comparing our development sandbox a few months ago to our development sandbox today. Clearly something has changed. The question is what!
I heard a couple people liked reading about the thought process for troubleshooting slow queries. If you aren’t familiar with what a database explain is used for, see that entry before reading this one.
This time the path to success was a little more meandering. I’ve left the dead ends in here so you get a feel for the actual thought process and not a sanitized version. if you are a student, this is a decent feel for what problems look like in the “real” world.
|
Summary
Two orders of magnitude performance improvement on the database query and a web action that isn’t painful to wait for. I’m happy with that. Not only is it faster, but we uncovered a better business definition for the query.
I also learned the last time we tested a move on our sandbox server was before we tested migrating large quantities of data. Making it an invalid test.
Supplemental materials
Original query
SELECT t.*, p.user_id AS last_user_id, p.post_time, p.attach AS attach, f.forum_name \ FROM jforum_topics t, jforum_posts p, jforum_forums f \ WHERE p.post_id = t.topic_last_post_id \ AND p.need_moderate = 0 \ and p.forum_id = f.forum_id \ ORDER BY topic_views DESC \ LIMIT ?
Original explain
Row # QUERY PLAN
1 Limit (cost=253457.29..253457.39 rows=40 width=132)
2 -> Sort (cost=253457.29..254467.05 rows=403905 width=132)
3 Sort Key: t.topic_views
4 -> Hash Join (cost=26807.80..136006.35 rows=403905 width=132)
5 Hash Cond: (“outer”.forum_id = “inner”.forum_id)
6 -> Hash Join (cost=26707.81..129847.79 rows=403905 width=113)
7 Hash Cond: (“outer”.post_id = “inner”.topic_last_post_id)
8 -> Seq Scan on jforum_posts p (cost=0.00..50770.74 rows=1775019 width=24)
9 Filter: (need_moderate = 0)
10 -> Hash (cost=19386.05..19386.05 rows=403905 width=93)
11 -> Seq Scan on jforum_topics t (cost=0.00..19386.05 rows=403905 width=93)
12 -> Hash (cost=99.79..99.79 rows=79 width=27)
13 -> Seq Scan on jforum_forums f (cost=0.00..99.79 rows=79 width=27)
Final query
SELECT t.*, p.user_id AS last_user_id, p.post_time, p.attach AS attach, f.forum_name \ FROM jforum_topics t, jforum_posts p, jforum_forums f \ WHERE p.post_time >= ? \ and p.post_id = t.topic_last_post_id \ AND p.need_moderate = 0 \ and p.forum_id = f.forum_id \ ORDER BY topic_replies DESC \ LIMIT ?
Final explain:
Row # QUERY PLAN
1 Limit (cost=1774.18..1774.23 rows=20 width=132)
2 -> Sort (cost=1774.18..1774.28 rows=40 width=132)
3 Sort Key: t.topic_replies
4 -> Hash Join (cost=99.99..1773.12 rows=40 width=132)
5 Hash Cond: (“outer”.forum_id = “inner”.forum_id)
6 -> Nested Loop (cost=0.00..1672.53 rows=40 width=113)
7 -> Index Scan using idx_posts_time on jforum_posts p (cost=0.00..678.22 rows=177 width=24)
8 Index Cond: (post_time >= ‘2009-01-05 00:00:00’::timestamp without time zone)
9 Filter: (need_moderate = 0)
10 -> Index Scan using idx_topics_lp on jforum_topics t (cost=0.00..5.61 rows=1 width=93)
11 Index Cond: (“outer”.post_id = t.topic_last_post_id)
12 -> Hash (cost=99.79..99.79 rows=79 width=27)
13 -> Seq Scan on jforum_forums f (cost=0.00..99.79 rows=79 width=27)