Thursday, August 5, 2010

Dynamic column ordering and paging in PL/SQL

Update 09/30/2010: There was a problem with how I was doing the dynamic ordering.  I have resolved this and updated the post.

Update 10/08/2013: Oracle 12c finally has support for sane paging without jumping through all of these hoops. Until I get a chance to update this post you can read more at http://www.oracle-base.com/articles/12c/row-limiting-clause-for-top-n-queries-12cr1.php

The following demonstrates how a PL/SQL function or procedure can take in parameters, and dynamically sort based on those parameters.
Let's start with an example.  This example accepts two parameters, a VARCHAR2 that will be the column we want to sort on, and a REF CURSOR.  The REF CURSOR is an out variables that the calling program can use to iterate through to get the result set.

TYPE cursortype IS REF CURSOR;
PROCEDURE sort_example
(
  sort_column VARCHAR2,
  result_set OUT cursortype
) AS

BEGIN

  OPEN result_set FOR
    SELECT column_one,
      column_two
      FROM example_table
      ORDER BY DECODE
      (
        UPPER(sort_column),
        'COLUMN_ONE',
        column_one,
        'COLUMN_TWO',
        column_two,
        column_one
      );

EXCEPTION
  WHEN OTHERS THEN
    NULL;//Real error handling should go in this exception block.


END;

The DECODE statement examines the sort_column passed in and matches it to an actual column in the table(s) that we are selecting from.  If the variable passed in is not in our list of columns then it will default, in this example, to column_one.
Using this technique we have to define, up front, all of the columns that we are going to allow the calling procedure to order by.  If we want to allow ordering by any column, we must go through and list them out one-by-one.
It is also a good idea to add some extra validation, beyond the DECODE clause, to make sure that sort_column is an actual column in the table.  In our real packages that do this, we have an out variable named message.  If an invalid column name is passed in we notify the calling program by sticking a message to that effect in our message variable.
To make our procedure even more robust, it is nice to have the ability to decide if the column should be sorted in ascending or descending order.
This gets a little tricker, and to accomplish we need to understand how ROWNUM works in PL/SQL.  When a query is executed, Oracle assigns each returned result a pseudo-column nmaed ROWNUM, with the first returned row having a ROWNUM of 1, the second 2, etc.
Through a subquery, and ROWNUM, we can dynamically sort in either ascending or descending order.

TYPE cursortype IS REF CURSOR;
PROCEDURE sort_example
  (
    sort_column VARCHAR2,
    sort_direction VARCHAR2,
    result_set OUT cursortype
  ) AS


BEGIN

  OPEN result_set FOR
  SELECT *
    FROM
    (
      SELECT column_one,
        column_two
        FROM example_table


        ORDER BY DECODE
        (
          UPPER(sort_column),
          'COLUMN_ONE',
          column_one,
          'COLUMN_TWO',
          column_two,
          column_one
        )
    )
    ORDER BY
      CASE
        WHEN UPPER(sort_direction) = 'DESC' THEN
          ROWNUM * -1
        ELSE
          ROWNUM
      END;

EXCEPTION
  WHEN OTHERS THEN
    NULL;//Real error handling should go in this exception block.

END;

Here we have our original query, initially sorting on the passed in column, that is subqueried and then sorted again, either on the rownum, or the opposite of rownum.  If the calling program passed in 'DESC' for the sort_direction, then we sort descending, if anything else is passed in we sort ascending.

We can take this maddening query a step further, utilizing rownum again, to handle paging.  We use this in our applications to push paging out of our interface code, and onto the database.  It is much quicker for our database to only return ten or so results we want to show on each page, then to return the whole set of data, and have our PHP application parse through the data.
So here is our final procedure, with dynamic sorting and paging.

TYPE cursortype IS REF CURSOR;
PROCEDURE sort_example
  (
    sort_column VARCHAR2,
    sort_direction VARCHAR2,
    start_row NUMBER,
    end_row NUMBER,
    result_set OUT cursortype
  ) AS

BEGIN

  IF start_row < 1 THEN
    start_row = 1
  END IF;
  IF start_row > end_row THEN
    end_row = start_row;
  ELSIF end_row IS NULL THEN
    end_row = start_row + 10;
  END IF;


  OPEN result_set FOR
  SELECT *
    FROM
    (
      SELECT ROWNUM rnum,
        column_one,
        column_two
        FROM
        (
          SELECT *
            FROM
            (
              SELECT column_one,
                column_two
                FROM example_table


                ORDER BY DECODE
                (
                  UPPER(sort_column),
                  'COLUMN_ONE',
                  column_one,
                  'COLUMN_TWO',
                  column_two,
                  column_one
                )
            )
            ORDER BY
              CASE
                WHEN UPPER(sort_direction) = 'DESC' THEN
                  ROWNUM * -1
                ELSE
                  ROWNUM
              END
        )
    )
    WHERE rnum >= start_result
      AND rnum <= end_result
    ORDER BY rnum;

EXCEPTION
  WHEN OTHERS THEN
    NULL;//Real error handling should go in this exception block.

END;

Yes, this one is a beast.  Starting at the child-most query, we have our SELECT statement, ordering the data by the column passed in.  In the parent query we then use ROWNUM, and the sort_direction argument to sort either ascending or descending.  We subquery these results, capturing the ROWNUM into an aliased column named rnum, so that in it's parent query we can refer to it's ROWNUM, and not some other ROWNUM from any other part of the query.  In the parent-most query we use our new rnum column to get only the data from the rows asked for.
Before you do any querying, we check the validity of the start_result and end_result arguments to make sure they actually make sense.  If they don't then we do some manual patching.  You may want to handle this differently, such as throwing an exception.  It is really up to you.  We have a DEFAULT_PAGING environmental variable we store in our database.  When end_result is NULL, then we set it to start_result plus our DEFAULT_PAGING variable.
You may look at this and think it is complicated, and probably performs very poorly.  In our testing, as long as your inner-most query has the proper indexes and constraints, the performance is actually pretty good.  In our environment, it is definitely much faster than having our PHP parse through all the results, sort and order the whole thing, and then cut out only the rows we actually want, especially when we have result sets that are well over 100,000 rows long.
There is one final step we take, so that our interface application knows how many results there are in the whole query, so that it knows how many pages it can expect there to be.  This example won't show it, but we actually check for a cached rowcount first, and we have all sorts of logic to try to guarantee integrity of those results, but I'm not going to get into all that here.







TYPE cursortype IS REF CURSOR;
PROCEDURE sort_example
  (
    sort_column VARCHAR2,
    sort_direction VARCHAR2,
    start_row NUMBER,
    end_row NUMBER,
    result_set OUT cursortype,
    rowcount OUT NUMBER
  ) AS

BEGIN

  IF start_row < 1 THEN
    start_row = 1
  END IF;
  IF start_row > end_row THEN
    end_row = start_row;
  ELSIF end_row IS NULL THEN
    end_row = start_row + 10;
  END IF;

  OPEN result_set FOR
  SELECT *
    FROM
    (
      SELECT ROWNUM rnum,
        column_one,
        column_two
        FROM
        (
          SELECT *
            FROM
            (
              SELECT column_one,
                column_two
                FROM example_table


                ORDER BY DECODE
                (
                  UPPER(sort_column),
                  'COLUMN_ONE',
                  column_one,
                  'COLUMN_TWO',
                  column_two,
                  column_one
                )
            )
            ORDER BY
              CASE
                WHEN UPPER(sort_direction) = 'DESC' THEN
                  ROWNUM * -1
                ELSE
                  ROWNUM
              END
        )
    )
    WHERE rnum >= start_result
      AND rnum <= end_result
    ORDER BY rnum;

  SELECT COUNT(*)
    INTO rowcount
    FROM example_table;

EXCEPTION
  WHEN OTHERS THEN
    NULL;//Real error handling should go in this exception block.

END;

With any luck this will help someone out.  With better luck, someone will drop by and offer some tips on how to do this even better.