I have started a transaction in my bank account and money got deducted but the screen doesn't get me the confirmation of the transaction and ended with a time out error. There could be multiple reason and one of the reason being one of the node on which the session was working crashed.
How many of you encountered this situation?
Now, if the bank/application once recovered and connected to the database (on a surviving node) and there is no details of the previous transaction available to the new session just got connected. What happens to the data? Would it get committed?
If the application tries to run the same previous transaction, money might be deducted twice. If the application doesn't retry, your transaction might not be carried out and you might miss some important last date of bill payment
So, how does Oracle handles this?
In my last post, I explained about how Transparent Application Failover works and how it would be a great advantage for applications that mostly reads the database.
As a continuation of that post, we will see about the Transparent Application Continuity on what is it and how does it benefit applications utilizing this feature of Oracle.
Okay! We should know about Transaction Guard and Application Continuity before we dive into Transparent Application Continuity. Let's see them in brief to understand what they are..
Transaction Guard:
Transaction Guard provides a generic tool for applications to use for at-most-once execution in case of planned and unplanned outages. Applications use the logical transaction ID to determine the outcome of the last transaction open in a database session following an outage.2. Set the property COMMIT_OUTCOME=TRUE in the service to determine COMMIT is accessible and to get the commit outcome of the last transaction. RETENTION_TIMEOUT can be increased if necessary (default is 24 hrs)
3. Ensure that FAN is configured with ONS to talk with 12c database clients
srvctl modify service -db cdbrac -service orabliss_taf -commit_outcome TRUE
[oracle@ol819-rac2 bin]$ srvctl modify service -db cdbrac -service orabliss_taf -commit_outcome TRUE [oracle@ol819-rac2 bin]$ srvctl config service -db cdbrac -service orabliss_taf |egrep "Commit|Retention" Commit Outcome: true Retention: 86400 seconds [oracle@ol819-rac2 bin]$
A connection pool is a cache of database connection objects. The objects represent physical database connections that can be used by an application to connect to a database. At run time, the application requests a connection from the pool which reuses the connection objects and reduces new connection object creation. The pool also leverages many high availability and performance features available through an Oracle RAC database. These features include Fast Connection Failover (FCF), Run-time connection Load Balancing (RLB), and Connection Affinity.
So for Application Continuity to be enabled in the Oracle RAC, I have to modify my service as below.
srvctl modify service -db cdbrac -service orabliss_taf -failover_restore LEVEL1 -failoverdelay 30 -failoverretry 5 -failovertype TRANSACTION -replay_init_time 1800 -notification TRUE -drain_timeout 300 -stopoption IMMEDIATE
- Use a supported client. SQLPLUS 19c or later is supported and we will be using this in our demo. Few other clients are Oracle JDBC Replay Driver 18c or later, UCP 18c or later, OCI Session Pool 19c or later, etc.
- If using connection pool, connections should be returned to the connection pool
- Set FAILOVER_RESTORE to AUTO
- Enable mutables use in application by providing KEEP grants to Sequences and date time to users. Oracle 19c automatically KEEPs the mutables for sql.
- SIDE EFFECTS (like an extenal action such as sending emails, transfer files, etc) are disabled.
srvctl modify service -db cdbrac -service orabliss_taf -failover_restore AUTO -failoverdelay 30 -failoverretry 5 -failovertype AUTO -replay_init_time 1800 -notification TRUE -drain_timeout 300 -stopoption IMMEDIATE
- Connect to database using SQLPLUS with oracle_taf without Transaction Guard enabled (without -commit_outcome=TRUE)
- Perform a DML and check ltxid_trans table from a different SYS session
- Shut down instance running DML and observe error
- Modify oracle_taf to enable Transaction Guard and connect to database using the service
- Perform DML and check ltxid_trans table from a different session
- Modify oracle_taf to enable Transparent Application Continuity
- Connect to database using the TAC enabled service
- Perform DML and shut down instance running DML and observe the results
[oracle@ol819-rac1 ~]$ sqlplus dummy/dummy@orabliss_taf <<< Removed Banner >>> dummy@node1> select count(*) from test_table; COUNT(*) ---------- 8199536 dummy@node1> delete from test_table where object_id=57732; 112 rows deleted. dummy@node1> commit; Commit complete. dummy@node1>Check ltxid_trans table
[oracle@ol819-rac1 ~]$ sql / as sysdba <<< Removed Banner >>> SQL> @/home/oracle/ccon Enter value for container: orabliss old:alter session set container=&container new:alter session set container=orabliss Session altered. SQL> SQL> select * from ltxid_trans; no rows selected SQL>I'll now trigger a delete on the table and while the delete is running midway, I'll shut the OS (node 2) abruptly to create a crash scenario.
sys@node1> @/home/oracle/sessions.sql INST_ID SID SERIAL SQL_ID Username Seconds in Wait Command Machine OS User Status Module __________ ______ _________ _________ ___________ __________________ __________ _________________________ __________ ___________ _____________________________________________ 2 424 1281 DUMMY ol819-rac1.selvapc.com oracle inactive SQL*Plus sys@node1>
dummy@node1> select sys_context('userenv','sid') from dual; SYS_CONTEXT('USERENV','SID') -------------------------------------------------------------------------------- 424 dummy@node1> delete from test_table where rownum<500000; 499999 rows deleted. dummy@node1> <<< Shut down node 2 abruptly to create crash scenario >>> dummy@node1> commit; commit * ERROR at line 1: ORA-25405: transaction status unknown dummy@node1> select program,sid,failover_type,failover_method,failed_over from v$session where sid=sys_context('userenv', 'sid'); PROGRAM SID FAILOVER_TYPE FAILOVER_M FAI ------------------------------------------------ ---------- ------------- ---------- --- sqlplus@ol819-rac1.selvapc.com (TNS V1-V3) 460 SELECT BASIC YES dummy@node1> select count(*) from test_table; COUNT(*) ---------- 8199424 dummy@node1>We can see the transaction didn't commit even though the session got failed over to the other instance.
dummy@node1> select sys_context('userenv','sid') from dual; SYS_CONTEXT('USERENV','SID') -------------------------------------------------------------------------------- 430 dummy@node1> insert into test_table select * from test_table;Session 430 from node 1 got connected to node 2 and started an insert statement. While the statement is under process, I shutdown the node 2 abruptly.
sys@node1> @/home/oracle/sessions.sql INST_ID SID SERIAL SQL_ID Username Seconds in Wait Command Machine OS User Status Module __________ ______ _________ _________ ___________ __________________ __________ _________________________ __________ ___________ ___________ 2 430 10274 DUMMY ol819-rac1.selvapc.com oracle inactive SQL*Plus <<< Shut down node 2 here... >>> sys@node1> @/home/oracle/sessions.sql no rows selectedAfter sometime, when I cancelled the session I got the below error
dummy@node1> insert into test_table select * from test_table; ^Cinsert into test_table select * from test_table * ERROR at line 1: ORA-25408: can not safely replay call dummy@node1>This is because the fact that Transparent Application Failover (TAF) cannot handle transactions that include data manipulation language (DML). In this case, we need to log out of application, reconnect and rerun the transaction. We can see the session got failedover to instance 1
dummy@node1> select program,sid,failover_type,failover_method,failed_over from v$session where sid=sys_context('userenv', 'sid'); PROGRAM SID FAILOVER_TYPE FAILOVER_M FAI ------------------------------------------------ ---------- ------------- ---------- --- sqlplus@ol819-rac1.selvapc.com (TNS V1-V3) 175 SELECT BASIC YES dummy@node1>
[oracle@ol819-rac1 ~]$ srvctl modify service -db cdbrac -service orabliss_taf -commit_outcome TRUE [oracle@ol819-rac1 ~]$ srvctl config service -db cdbrac -service orabliss_taf |egrep "Commit|Retention" Commit Outcome: true Retention: 86400 seconds [oracle@ol819-rac1 ~]$ sqlplus dummy/dummy@orabliss_taf dummy@node1> select sys_context('userenv','sid') from dual; SYS_CONTEXT('USERENV','SID') -------------------------------------------------------------------------------- 423 dummy@node1> delete from test_table where rownum<500000; 499999 rows deleted. dummy@node1> commit; Commit complete. dummy@node1>
SYS@node2> select * from ltxid_trans; MAJ_VERSION MIN_VERSION INST_ID DB_ID SESSION_GUID TXN_UID COMMIT_NO START_DATE SERVICE_ID STATE FLAGS REQ_FLAGS ERROR_CODE ______________ ______________ __________ _____________ ___________________________________ __________ ____________ _________________________________________ _____________ ________ ________ ____________ _____________ 1 2 2 2960387100 F8B741BDAFB62A60E053CA38A8C0014E 111 0 07-APR-23 08.33.32.185503000 AM +05:30 1 1 0 0 0 SYS@node2>Now, as we have enabled transaction guard once the transaction is committed, we can see the ltxid_trans table is updated with the session and transaction information. This information will be used for replaying the transaction during failover
dummy@node1> insert into test_table select * from test_table; 7199426 rows created. <<< Shutdown OS abruptly >>> dummy@node1> commit; commit * ERROR at line 1: ORA-25402: transaction must roll back dummy@node1>Now, as the final phase of our testing we will enhance the orabliss_taf to enable Transparent Application Continuity (TAC) and observe the results
[oracle@ol819-rac1 ~]$ srvctl modify service -db cdbrac -service orabliss_taf -failover_restore AUTO -failovertype AUTO -failoverdelay 30 -failoverretry 5 -replay_init_time 1800 -notification TRUE -drain_timeout 300 -stopoption IMMEDIATE [oracle@ol819-rac1 ~]$ [oracle@ol819-rac1 ~]$ sqlplus dummy/dummy@orabliss_taf ... ... dummy@node1> select sys_context('userenv','sid') from dual; SYS_CONTEXT('USERENV','SID') -------------------------------------------------------------------------------- 232 dummy@node1> insert into test_table select * from test_table; <<< Shut down node 2 while insert is running >>> 14898851 rows created. dummy@node1> dummy@node1> commit; Commit complete. dummy@node1> select sys_context('userenv','sid') from dual; SYS_CONTEXT('USERENV','SID') -------------------------------------------------------------------------------- 42 dummy@node1> select program,sid,failover_type,failover_method,failed_over from v$session where sid=sys_context('userenv', 'sid'); PROGRAM SID FAILOVER_TYPE FAILOVER_M FAI ------------------------------------------------ ---------- ------------- ---------- --- sqlplus@ol819-rac1.selvapc.com (TNS V1-V3) 42 AUTO BASIC YES dummy@node1>Though I have introduced the crash in between the DML statement, the session failed over to instance 1 and completed the insert statement without any issues.
SYS@node1> @/home/oracle/sessions.sql INST_ID SID SERIAL SQL_ID Username Seconds in Wait Command Machine OS User Status Module __________ ______ _________ ________________ ___________ __________________ __________ _________________________ __________ ___________ ___________ 1 24 7850 DUMMY ol819-rac1.selvapc.com oracle inactive SQL*Plus 2 232 62564 19uk9dbxy0zbr DUMMY 3 INSERT ol819-rac1.selvapc.com oracle active SQL*Plus SYS@node1> @/home/oracle/sessions.sql INST_ID SID SERIAL SQL_ID Username Seconds in Wait Command Machine OS User Status Module __________ ______ _________ ________________ ___________ __________________ __________ _________________________ __________ ___________ ___________ 1 24 7850 DUMMY ol819-rac1.selvapc.com oracle inactive SQL*Plus 1 42 53960 19uk9dbxy0zbr DUMMY 2 INSERT ol819-rac1.selvapc.com oracle active SQL*Plus SYS@node1>We can see the session 232 was running the INSERT statement against INST_ID 2 and then failed over to INST_ID 1 with SID 42.
Happy TACing...!!!
nice share. worth a read .thanks Selvakumar
ReplyDelete