Looking for something? Try here..

Tuesday, July 31, 2018

Filtering date with GoldenGate

Today, while trying to break a big table to chunks to improve GoldenGate initial loading, I was trying to utilize the partitioning feature already available with the table which was based on time. Table is weekly partitioned, so if I try to filter rows based on partition, I can get data to be extracted easily by partition pruning. I have to use classic extract as the database version we are upgrading is 11.2.0.2 which doesn't support integrated extract.

So technical environment is as follows
==> [aiacl00378] [oracce] [DVCCE1DB] $ggsci -v

Oracle GoldenGate Command Interpreter for Oracle
Version 12.3.0.1.4 OGGCORE_12.3.0.1.0_PLATFORMS_180415.0359_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 15 2018 21:16:09

Copyright (C) 1995, 2018, Oracle and/or its affiliates. All rights reserved.


==> [aiacl00378] [oracce] [DVCCE1DB] $uname -a
Linux aiacl00378 2.6.32-642.3.1.el6.x86_64 #1 SMP Sun Jun 26 18:16:44 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
As per Oracle support note Doc ID 1458966.1, if you are trying to filter records based on date, the following can be used.

TABLE {source table},
FILTER (@COMPUTE (@DATE("JUL", "YYYY-MM-DD", creation_time) >=
@DATE("JUL", "YYYY-MM-DD", "2012-01-01")) > 0);

My extract parameter file w.r.to the above is as below. 
EXTRACT IEXT1
setenv NLS_CHARACTERSET="AL32UTF8"
setenv NLS_LANG="AMERICAN_AMERICA.AL32UTF8"
setenv ORACLE_SID="DV1DB"
USERID ggs_dba, PASSWORD ggs_dba
RMTHOST ********, MGRPORT 12000
RMTFILE /u01/app/oracle/product/gg12.3/dirdat/c1, megabytes 2000, PURGE
table ccedba.CUSTOMER, filter(@compute(@date('YYYY-MM-DD HH:MI:SS','DD-MMM-YY HH:MI:SS',creation_date) < @date('YYYY-MM-DD HH:MI:SS','DD-MMM-YY HH:MI:SS','22-JUN-18 0
0:00:00'))>0);

This above extract when started didn't extract any records.
Excerpt from report file the extract process generated:

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************


Report at 2018-07-30 07:37:11 (activity since 2018-07-30 07:37:02)

Output to /u01/app/oracle/product/gg12.3/dirdat/c1:

No records extracted.

The database had a handful of records falling under this filter criteria.
SQL> SQL> select count(*) from ccedba.customer
  2  where creation_date < to_date('2018-06-22 00:00:00', 'yyyy-mm-dd hh24:mi:ss');

  COUNT(*)
----------
      3444

SQL> 
After trying different ways to make the above work, I came across another support note Doc ID 966966.1 where it talks about default date formats in both Oracle and GoldenGate.

So why not to use the direct format of date where GoldenGate defaults to so that the records can be filtered? And yes, it worked like a charm.
The parameter file looks like the below.
EXTRACT IEXT1
setenv NLS_CHARACTERSET="AL32UTF8"
setenv NLS_LANG="AMERICAN_AMERICA.AL32UTF8"
setenv ORACLE_SID="DV1DB"
USERID ggs_dba, PASSWORD ggs_dba
RMTHOST ********, MGRPORT 12000
RMTFILE /u01/app/oracle/product/gg12.3/dirdat/c1, megabytes 2000, PURGE
table ccedba.customer, WHERE (CREATION_DATE < '2018-06-22 00:00:00');

Excerpt from the report file generated for the above parameter file.
018-07-30 09:02:03  INFO    OGG-01478  Output file /u01/app/oracle/product/gg12.3/dirdat/c1 is using format RELEASE 12.3.

2018-07-30 09:02:03  INFO    OGG-02911  Processing table CCEDBA.CUSTOMER.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************


Report at 2018-07-30 09:02:06 (activity since 2018-07-30 09:01:58)

Output to /u01/app/oracle/product/gg12.3/dirdat/c1:

From Table CCEDBA.CUSTOMER:
       #                   inserts:      3444
       #                   updates:         0
       #                   deletes:         0
       #                  discards:         0


Look at the where clause for the table extraction. It's just too easy as I only made the date format as 'YYYY-MM-DD HH:MI:SS'.
I'm not sure whether this same approach works with GoldenGate versions below 12.3 as I didn't test for the same.

Note:
I have used options such as FILTER, SQLPREDICATE, WHERE caluses with different combinations of date format before arriving to this simple solution.

Happy Filtering!!!

7 comments:

  1. Thanks much Selvakumar for the post. I was also looking for same solution and tried multiple things but nothing worked.

    ReplyDelete
  2. Hi Selvakumar, Thanks for the post.
    I'm new to Goldengate, my scenario is slightly different, In source I've date column in integer format where date is stored as days (1900-01-01 - 31-03-2021 = 44284) so todays date will be stored as 44284 in database. Below sql works for a column, but when the same logic has to be applied for more than 100 columns and it hit goldengate and db performance.


    TABLE OGG.DATE_TRNFR, TARGET OGG.DATE_TRNFR, SQLEXEC (ID logic1, QUERY "select cast((varchar_format((date('1900-01-01') + ( ? - 1 ) days),'YYYYMMDD')) as int) DATE1 from OGG.DATE_TRNFR", PARAMS (P1 = DATE_COL)), COLMAP (USEDEFAULTS, DATE_COL = @IF(DATE_COL <= 0,@COLSTAT(NULL),@IF(DATE_COL > 72685,99991231,@getval(date1.DATE_COL)))); --- if value is less than 0 will print as "NULL" and greater than 72685 will print static date 99991231 (need not to be in actual date format) and for between values 1 to 72684, will execute the sqlexec.

    Target is not a database for me, I'm using OGG Bigdata and writing data in json format/file. I need to avoid sqlexec and do the reverse function of above logic in goldengate, do you think this can be achieved in Goldengate without sql logic?

    Thanks in advance

    ReplyDelete
    Replies
    1. Hi Sathish, If I understand correctly you don't want GG to do the conversions part. What I can think of is to create Virtual columns for the date columns with the conversion you need. This will be one time deployment where you will have data stored in proper format for you to just fetch using GG without any additional conversion.
      You have to use the FETCHCOLS option to extract the data from virtual column as by default virtual columns are not logged.

      Thanks!

      Delete
    2. Again, I'm not sure of the DB version you are working on. Virtual columns are feature available from 11gR1.

      Thanks!

      Delete
  3. how about filtering based on sysdate is is possible to filter with current date .. my requirment is
    When OGG encounters a delete record, it should only process those where the Date_EFF > Current_Date and sw = blank and Ind = blank.

    for sw and ind column i will pass @null in where condition but how to tackle the current data in where condition

    ReplyDelete
    Replies
    1. Hi,

      Sysdate might return date in the database format. What you can do is convert this sysdate to GG format and provide as input (Current_Date). That should work.

      Thanks!

      Delete