We had a complaint from the application team that their database was running slow after the quarterly patching. We had patched many databases, and only this one was reporting slowness. The environment is ExaCC and the database is a container database (running 19.16) with multiple PDBs, so the initial thought was that the patch itself couldn't be the reason. We stepped in to investigate what was causing the slowness.
As a first step, we pulled an AWR report for the time frame in which the application ran its DB operations.
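For reference, the report can be generated with the standard AWR script or directly from DBMS_WORKLOAD_REPOSITORY; the DBID, instance number and snapshot IDs below are placeholders, not the ones we actually used.

SQL> -- interactive script shipped with the database (prompts for snapshots)
SQL> @?/rdbms/admin/awrrpt.sql
SQL>
SQL> -- or pull the text report directly; 1234567890 = DBID, 1 = instance number,
SQL> -- 101/102 = begin/end snapshot IDs (all placeholders)
SQL> select output
  2  from table(dbms_workload_repository.awr_report_text(1234567890, 1, 101, 102));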
We can see from the report that most of the waits are Cluster-related, and topping the list are gc cr multi block mixed, gc cr block 2-way and gc buffer busy acquire.
This suggests some query is running very frequently and repeatedly accessing the same one or two table blocks. We needed to figure out which query was running, which table or index it was accessing, how often it executed, and what its elapsed time per execution was. That information would let us move forward.
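A quick way to confirm which statements are behind the Cluster waits is an ASH-style drill-down; a minimal sketch (the time window below is just a placeholder):

SQL> -- top SQL by Cluster-class wait samples in the problem window
SQL> select sql_id, event, count(*) as samples
  2  from   dba_hist_active_sess_history
  3  where  wait_class = 'Cluster'
  4  and    sample_time > sysdate - interval '2' hour   -- placeholder window
  5  group  by sql_id, event
  6  order  by samples desc
  7  fetch  first 10 rows only;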
Since the waits are Cluster-related, let's look at the SQL ordered by Cluster Wait Time (Global) section under SQL Statistics.
SELECT /* chk_journal_reg */ rowid FROM "MDSYS"."SDO_TXN_JOURNAL_REG" WHERE sid = :1 AND sdo_txn_idx_id = :2 AND indexptniden = :3 AND operation = :4 AND rid = :5

SELECT /* sel_dist_upd */ ROWID, ....., ins_no, del_no FROM "MDSYS"."SDO_DIST_METADATA_TABLE" WHERE sdo_index_owner = :own AND sdo_index_name = :nam AND sdo_partition_name = :ptn FOR UPDATE
OK, these two queries look simple and not complicated, so what is causing these buffer busy issues? The investigation went in the following directions:
Check the size of the tables: they are quite small, so table size is not the issue.
Check the execution plan of the first query with the high cluster waits: this gave us the clue. Although a primary key is defined on all five columns (SID, SDO_TXN_IDX_ID, INDEXPTNIDEN, OPERATION, RID), the query took an INDEX STORAGE FAST FULL SCAN path, which traverses all the index blocks unnecessarily. The query took around 4 seconds per execution. Sketches of both checks are shown below.
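A rough sketch of both checks (the object names come from the statements above, and the sql_id is the one that appears in the plan-history output further below; the exact queries we ran may have differed):

SQL> -- how big are the table and its primary key index?
SQL> select segment_name, segment_type, round(bytes/1024/1024, 1) as mb
  2  from   dba_segments
  3  where  owner = 'MDSYS'
  4  and    segment_name in ('SDO_TXN_JOURNAL_REG', 'SYS_C0014253');
SQL>
SQL> -- full execution plans captured in AWR for the problem statement
SQL> select * from table(dbms_xplan.display_awr('fw4q2s9j3bckt'));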
I hinted the query to use the already available index (/*+ index(SDO_TXN_JOURNAL_REG SYS_C0014253) */) via a SQL profile, and this resulted in a much better plan and execution time: it dropped to milliseconds. Remember that this query executed around 53k times, so that makes a big difference to application performance. While checking for any plan change, I found that the index path had been used right up until the patch was applied, after which the plan flipped:
SQL> select distinct sql_id, plan_hash_value, id, operation, options
  2* from dba_hist_sql_plan where sql_id='fw4q2s9j3bckt' order by plan_hash_value, id;

SQL_ID              PLAN_HASH_VALUE    ID OPERATION           OPTIONS
________________ __________________ _____ ___________________ _________________________
fw4q2s9j3bckt            1765869626     0 SELECT STATEMENT
fw4q2s9j3bckt            1765869626     1 INDEX               STORAGE FAST FULL SCAN
fw4q2s9j3bckt            2134037127     0 SELECT STATEMENT
fw4q2s9j3bckt            2134037127     1 INDEX               UNIQUE SCAN

SQL>
The reason for the slowness:
Running an MV refresh using the DBMS_SCHEDULER package causes Spatial to use the SDO_TXN_JOURNAL_REG transaction journal table instead of the SDO_TXN_JOURNAL_GTT transaction journal table. Because the REG table is a regular shared table rather than a global temporary table, every refresh now hits the same blocks across instances, which is what showed up as the gc buffer busy and gc cr waits.
So now the solution is to revert the SQL profile we created and apply the Spatial bundle bug fix Patch 34725493, or to apply RU 19.17.
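Reverting the profile itself is a one-liner; the profile name below is hypothetical, so look it up first:

SQL> -- find the profile that was created for the statement
SQL> select name, status, sql_text from dba_sql_profiles;
SQL>
SQL> -- drop it (the name shown here is hypothetical)
SQL> exec dbms_sqltune.drop_sql_profile(name => 'PROFILE_fw4q2s9j3bckt')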
Since we had already applied another one-off patch for a previous Spatial issue, we couldn't apply this bug fix patch as it conflicted with that one-off. We raised an SR with Oracle to get a merge patch for the two conflicting patches, but Oracle couldn't commit to when the merge patch would be ready.
As we needed to get the production database back to normal as soon as possible, we had to roll back the entire 19.16 RU and return to the pre-patch state (RU 19.15). The application then verified that all the jobs ran without issues, and we no longer saw any cluster-related waits after the rollback. We rolled back rather than moving forward because 19.17 had yet to be tested in our lower environments before being promoted to production.
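As a quick sanity check, the RU level can be confirmed from inside the database; after a rollback the rolled-back RU typically shows an additional ROLLBACK entry here:

SQL> select patch_id, action, status, description
  2  from   dba_registry_sqlpatch
  3  order  by action_time;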
Interestingly, Oracle maintains a note listing the patches that need to be applied on top of each RU, under What Is The Latest Spatial Patch Bundle for 12c and Higher Databases? (Doc ID 2514624.1). So every time we apply an RU to the database, we need to refer to this note and apply the Spatial patch bundle listed for that RU as well.
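A related sanity check after each RU plus Spatial bundle is to confirm that the Spatial component is still valid and at the expected version:

SQL> select comp_id, comp_name, version, status
  2  from   dba_registry
  3  where  comp_id = 'SDO';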
This investigation was not only a troubleshooting exercise but also a chance to learn about the Oracle Spatial component: its known bugs, the separate Spatial patch bundle, and so on.
Happy Troubleshooting and Patching...!!!