Quick discussions on any Oracle topic I feel like covering, and anything else going on in my life I care to share.

Monday, March 22, 2010

Dynamic Method 4 – Fetching a Variable Number of Columns in PL/SQL

Oracle defines four methods for performing dynamic SQL, DML and DDL statements. These are:
  • A non-query statement (e.g., “ALTER”, “UPDATE”) with no host/bind variables
  • A non-query statement with a known number of bind variables
  • A query statement with a known number of SELECT clause items and input bind variables
  • A query statement with an unknown number of SELECT clause itemas and/or input bind variables
  • Precompiler languages like Pro*C allow for ANSI-compatible and (the antiquated) non-ANSI methods for implementing SQL/DML/DDL to any degree of dynamism. Both styles generally require a very good understanding of the methodology, as well as moderate to advanced programming skill.

    PL/SQL offers two distinct and non-interchangeable methods of performing dynamic SQL: Package DBMS_SQL and the language feature EXECUTE IMMEDIATE. Both approaches have their pros and cons, as well as limitations. In this article, I show a technique for dynamically executing queries that return a variable number of columns using EXECUTE IMMEDIATE.

    In order to return varying sets of columns at runtime, we have to decide where to put them. We need a data structure that can easily handle one or more column values returned for each row, across multiple rows. In Oracle9iR2 and above, we can use a collection of collections. Here, I define two schema-level collections.
    DROP TYPE nt_table_row; -- drop 1st because of dependency
    -- stores one or column values (a single row)
    CREATE OR REPLACE
    TYPE nt_column_value IS TABLE OF VARCHAR2(4000);
    /
    -- stores one or more rows
    CREATE OR REPLACE
    TYPE nt_table_row IS TABLE OF nt_column_value;
    /
    The intent is for nt_column_value to capture each column’s value in each successive array element, like this:
    Figure 1, Mapping columns to array elements

    Then each one of these is attached to an nt_table_row element, like this:
    Figure 2, Storing a list of column values as rows

    The goal is to build a dynamic statement that fetches each column in the SELECT clause into each successive element of nt_column_value, then assign a collection of column values to each successive element of nt_table_row to represent each row.
    The collection nt_column_value is always loaded with the same number of columns for each row fetched, so we can extend it to the correct number of elements once and use it over and over. When it is assigned to one of the nt_table_row elements, a copy of it is stored. We can then overwrite the nt_column_value values with the next row fetched, and store those in the next element.

    If I was to prototype this as a typical anonymous block, it might look something like this:
    declare
    cursor cur is
    select
    column1
    ,column2
    ,column3
    ...
    ,columnN
    from some_table
    ;
    arr_row nt_table_row := nt_table_row();
    arr_column_value nt_column_value := nt_column_value();
    begin
    arr_column_value.extend(num_cols); /* always this many columns */
    open cur;
    loop
    fetch cur into
    arr_column_value(1)
    ,arr_column_value(2)
    ,arr_column_value(3)
    ...
    ,arr_column_value(N);
    exit when cur%notfound;
    arr_row.extend;
    arr_row(arr_row.count) := arr_column_value;
    end loop;
    close cur;
    end;
    This isn’t bad, but the looping and fetching and element-at-a-time extension doesn’t seem very efficient. I’d prefer to do bulk fetching, if possible. Fortunately, I can have the query return an nt_column_value collection directly for each row. The SELECT clause becomes:
    select
    nt_column_value(
    column1
    ,column2
    ,column3
    ...
    ,column
    )
    Then I can bulk collect the cursor directly into an nt_table_row collection:
      open cur;
    fetch cur bulk collect into arr_row;
    close cur;
    And I get the collection extension and assignment for free.

    Ultimately, this anonymous block has to be done dynamically. I need to generalize it into a template form. I use placeholders to indicate sections of code that has to be replaced at runtime. As a convention, I use a tilde (~) followed by a single character for each placeholder. Things like column lists, table name and other integral components must be “hard-coded” into the statement; you can’t use bind variables for those. So now my dynamic block looks like this:
      v_stmt VARCHAR2(32767) :=
    'declare
    cursor cur is
    select
    nt_column_value(~c)
    from ~t
    ;
    arr_row nt_table_row := nt_table_row();
    begin
    open cur;
    fetch cur bulk collect into arr_row;
    close cur;
    :a := arr_row;
    end;';
    The placeholder ~c is the column list. The placeholder for the table is ~t. I replace them with the REPLACE() function:
      -- build dynamic block
    v_stmt := REPLACE(v_stmt, '~c', v_column_list);
    v_stmt := REPLACE(v_stmt, '~t', v_table_name);
    where the v_column_list and v_table_name strings have yet to be constructed.

    I want to actually return the collection to the calling context, so the bind variable :a is needed to provide the interface. When calling with Execute Immediate, the collection is returned via an OUT bind variable:
      EXECUTE IMMEDIATE v_stmt
    USING OUT arr_row;
    In order to construct the column list, I need to go to the Data Dictionary and look them up for the given table. I prefer to provide each bind value as a cursor parameter because this programming style localizes the scope of the parameter name right where it’s used. If I don’t pass it in, I don’t reference it in my cursor declaration.

    For purposes of illustration, let’s just stick with the basic data types. Other data types, such as LONG or LOB columns, require special handling, particularly if they contain values over 32K. There are other unusual data types I don’t want to consider at this time, such as object types, timestamps, etc. My cursor looks like this:
      -- ignore object types, LONG, LOB's etc.
    CURSOR cur_columns?(p_table_name user_tab_columns.table_name%TYPE)
    IS
    SELECT utc.column_name, utc.data_type
    FROM user_tab_columns utc
    WHERE utc.table_name = p_table_name
    AND utc.data_type IN ('VARCHAR2','CHAR','DATE','NUMBER','FLOAT')
    ORDER BY utc.column_id
    ;
    I typically declare cursors up front, rather than embed them implicitly directly into the executable code, because I get more flexibility with explicit cursors. I can use them anywhere an implicit cursor would work, and then some. I have much more control with explicit cursors. I can reuse them easily. I can still fetch from them one at a time or do bulk fetches. About the only time I use implicit cursors is when I expect them to return exactly one row, or it’s a DML statement.

    I need to declare some variables to hold the pieces of my lists:
    v_sep VARCHAR2(1);   -- column name separator
    v_column_expr VARCHAR2(60); -- a single column expression
    v_column_list VARCHAR2(32767); -- the delimited list of columns
    v_data_type_list VARCHAR2(32767); -- delimited list of data types
    v_date_format VARCHAR2(22) := 'DD-MON-YYYY'; -- whatever you want
    Now I can assemble the lists. I want each column expression to always return a string, since that’s what I’m fetching them into. It’s always best to do data type conversions explicitly. This is especially true for DATE columns, where the return string is strongly dependent on session-specific settings like nls_date_format.
      -- build lists for dynamic block
    FOR rec IN cur_columns(UPPER(v_table_name)) LOOP
    v_column_expr :=
    (CASE
    WHEN rec.data_type IN ('VARCHAR2','CHAR') THEN
    rec.column_name
    WHEN rec.data_type = 'DATE' THEN
    'to_char(' || rec.column_name || ',c_date_format)'
    WHEN rec.data_type IN ('NUMBER','FLOAT') THEN
    'to_char(' || rec.column_name || ')'
    ELSE rec.column_name
    END)
    ;
    v_column_list := v_column_list || v_sep || v_column_expr;
    v_data_type_list := v_data_type_list || v_sep || rec.data_type;
    v_sep := ',';
    END LOOP;
    c_date_format is a constant string that is going to be declared in my dynamic block. The actual value for it will be passed in via a bind variable, so that the caller can set it as desired.

    In 9iR2, I tried putting the CASE expression directly into the v_column_list concatenation and assignment, but ran into a known bug that corrupted the string (and was fixed in 10g). That’s the only reason I used the intermediate variable v_column_expr.

    I put all of this together into a prototype as a proof of concept. I typically do this prior to embarking on building packages, to make sure that what I’m attempting will work. I have a couple of parallel arrays, arr_column_name and arr_data_type, to hold the column names and data types respectively, for display purposes. The variable v_limit is used simply to restrict the number of rows returned from the dynamic statement. Dynamically building a WHERE clause presents many more challenges than I will deal with here.
    DECLARE
    -- ignore object types, LONG, LOB's etc.
    CURSOR cur_columns(p_table_name user_tab_columns.table_name%TYPE)
    IS
    SELECT utc.column_name, utc.data_type
    FROM user_tab_columns utc
    WHERE utc.table_name = p_table_name
    AND utc.data_type IN ('VARCHAR2','CHAR','DATE','NUMBER','FLOAT')
    ORDER BY utc.column_id
    ;
    v_sep VARCHAR2(1);
    v_column_list VARCHAR2(32767);
    v_date_format VARCHAR2(22) := 'DD-MON-YYYY'; -- whatever you want
    v_limit PLS_INTEGER := 10; -- for illustration, limit #rows returned

    arr_row nt_table_row := nt_table_row(); -- 2D array
    -- parallel arrays for display usage
    arr_column_name nt_column_value := nt_column_value();
    arr_data_type nt_column_value := nt_column_value();
    v_table_name user_tables.table_name%TYPE := 'dept';
    v_stmt VARCHAR2(32767) :=
    'declare
    c_date_format constant varchar2(22) := :f;
    c_num_rows constant pls_integer := :l;
    cursor cur is
    select
    nt_column_value(~c)
    from ~t
    where rownum <= c_num_rows
    ;
    arr_row nt_table_row := nt_table_row();
    begin
    put_long_line(

    ''c_date_format=['' || c_date_format ||
    ''] c_num_rows=['' || c_num_rows ||
    '']'');
    open cur;

    fetch cur bulk collect into arr_row;
    close cur;

    :a := arr_row;
    end;';
    BEGIN
    -- build lists for dynamic block
    FOR rec IN cur_columns(UPPER(v_table_name)) LOOP
    -- v_column_expr :=
    v_column_list := v_column_list || v_sep ||
    (CASE
    WHEN rec.data_type IN ('VARCHAR2','CHAR') THEN
    rec.column_name
    WHEN rec.data_type = 'DATE' THEN
    'to_char(' || rec.column_name || ',c_date_format)'
    WHEN rec.data_type IN ('NUMBER','FLOAT') THEN
    'to_char(' || rec.column_name || ')'
    ELSE rec.column_name
    END)
    ;
    -- these are just for display purposes
    arr_column_name.EXTEND;
    arr_column_name(arr_column_name.COUNT) := rec.column_name;
    arr_data_type.EXTEND;
    arr_data_type(arr_data_type.COUNT) := rec.data_type;
    v_sep := ',';
    END LOOP;

    IF (v_column_list IS NOT NULL) THEN
    -- build dynamic block
    v_stmt := REPLACE(v_stmt, '~c', v_column_list);
    v_stmt := REPLACE(v_stmt, '~t', v_table_name);

    put_long_line(v_stmt);
    EXECUTE IMMEDIATE v_stmt
    USING
    v_date_format
    ,v_limit
    ,OUT arr_row
    ;
    put_long_line('Display results...');
    IF (arr_row.count > 0) THEN
    FOR i IN arr_row.FIRST .. arr_row.LAST LOOP
    put_long_line('Row #' || i || ':');
    FOR j IN arr_row(i).FIRST .. arr_row(i).LAST LOOP
    put_long_line(
    arr_column_name(j) || '(' ||
    arr_data_type(j) || ')=[' ||
    arr_row(i)(j) || ']'
    );
    END LOOP;
    END LOOP;
    ELSE
    put_long_line('No data found in table ' || v_table_name);
    END IF;
    ELSE
    put_long_line('Table not found: [' || v_table_name || ']');
    END IF;
    END;
    /
    Executing this program in the SCOTT schema returns the following output to the console. The dynamically constructed anonymous block is displayed, followed by the rows returned:

    No comments:

    Post a Comment