Tuesday, May 17, 2011

Enabling Oracle's table level constraints in parallel

In DW environment often fact tables are inordinately large. They can be easily be several billions and at times hundreds of billions especially in cases where the fact tables contain most granular data. Creating foreign key constraints for such fact tables can be quite a challenge.

For daily data load this issue can be handled easily assuming the fact table is partitioned on day or any other time based columns. Hopefully daily data processing doesn't need to touch data partitions relating to previous days. In that case incoming daily data is loaded into a non-partitioned table having identical structure as that of the fact table. We go ahead and enable the foreign key constraints in that table and then exchange partition with main table including the constraints.

Recently I faced a different problem where I had to drop the existing foreign key constraints from a fact table because the product master table which it was pointing to had to undergo significant data patch. After that operation is over I needed to recreate the foreign key constraints back. The fact table is question had several billions of records. A simple statement like the following ran for hours and didn't complete. The statement ran as single thread.

ALTER TABLE click_fact ADD CONSTRAINT click_fact_prod_fk FOREIGN KEY (product_sid) REFERENCES product_dim(product_sid) ;

After reading through the documentation for Oracle 11.2 and several other articles, here is what I did to exploit Oracle's parallelism:

ALTER TABLE click_fact ADD CONSTRAINT click_fact_product_fk FOREIGN KEY (product_sid) REFERENCES product_dim (product_sid) ENABLE NOVALIDATE; -- enables constraint but does not validate existing data

ALTER SESSION ENABLE PARALLEL DDL;
ALTER TABLE click_fact PARALLEL ;
ALTER TABLE click_fact MODIFY CONSTRAINT click_fact_product_fk VALIDATE ;
ALTER TABLE click_fact NOPARALLEL ;

In TOAD's session browser I saw the following query running in parallel while the constraint is getting validated. It is a very prudent way to use outer join to validate the constraint.

SELECT                                                 /*+ all_rows ordered */
      a.rowid,
       :1,
       :2,
       :3
FROM  CLICK_FACT a,
            PRODUCT_DIM b
WHERE  (A.PRODUCT_SID IS NOT NULL)
AND       (B.PRODUCT_SID(+) = A.PRODUCT_SID)
AND      (B.PRODUCT_SID IS NULL)

4 comments:

  1. Good Post; I looked for this at several places but could not find a solution. I found it here; appreciate it!
    Thanks
    -Ifte Mohammed

    ReplyDelete
  2. Yep...its good post. This works.

    Thanks
    Jitesh

    ReplyDelete
  3. Good insights, and a nice post. Thanks for saving me an hour!

    ReplyDelete
  4. Same here.. This post helped in figuring out the solution.
    Thanks for sharing.

    ReplyDelete