I had a requirement to return the thread with the most replies in each forum at JavaRanch‘s Coderanch forums. In Postgresql 8.4, this would be very easy – just use the window functions. Unfortunately, we aren’t on Postgresql 8.4 yet. The other common pattern is something like
select stuff from mytable t1 where date = (select max(date) from mytable t2 where t1.key= t2.key)
This doesn’t work well for me either because the value I am comparing (number of posts is dynamic.) I decided to use a stored procedure to “simplify” things. I’ve written some stored procedures in Postgresql to do updates before and stored procedures in other languages to do queries so this didn’t seem like a huge task.
Postgresql calls them stored functions, so let’s proceed. First you need to create a type to represent a row that gets returned by the stored function.
CREATE TYPE highlighted_topic_per_forum_holder AS (last_user_id integer, post_time timestamp without time zone, <other columns go here>);
Then you create the stored procedure. The outer loop goes through each forum. The inner loop is the SQL that finds the post with the most replies that was posted to in some time period. It uses a nested query with a limit to return only 1 thread per forum. The rest of the SQL adds the relevant data. See postgresql and JDBC for why it takes varchar rather than timestamp.
CREATE or replace FUNCTION highlighted_topic_per_forum(character varying) RETURNS SETOF highlighted_topic_per_forum_holder AS $BODY$ DECLARE forum RECORD; r highlighted_topic_per_forum_holder%rowtype; BEGIN for forum in EXECUTE 'select forum_id from jforum_forums where categories_id in (1,7)' loop for r in EXECUTE 'select p.user_id AS last_user_id, p.post_time, p.attach AS attach, t.* ' || 'from jforum_topics t, jforum_posts p, ' || '(select topic_id, count(*) from jforum_posts ' || ' where post_time >= date '' ' || $1 || ' '' ' || ' and forum_id = ' || forum.forum_id || ' AND need_moderate = 0 ' || ' group by topic_id order by 2 desc limit 1) nested ' || ' where p.topic_id = nested.topic_id ' || ' and p.post_id = t.topic_last_post_id ' || ' order by post_time desc' loop return next r; end loop; end loop; return; END; $BODY$ LANGUAGE 'plpgsql';
Pingback: postgresql and jdbc | Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky
Pingback: upgrading from postgres 8.3 to 8.4 | Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky
Turns out I didn’t need postgres 8.4. The “distinct on” clause does what I needed.