Oracle Tips and Techs: March 2023

I was asked a question a few days ago whether DMLs will be failed over to surviving instance if one of the instance in a Oracle RAC is crashed.

I took sometime to get a proper answer to this question as I was not sure how DMLs were handled in RAC. All I know until that time was "Select" queries will be failed over to another instance. This mechanism is called Transparent Application Failover or TAF which is in Oracle from a long time since Oracle Parallel Server (OPS) introduced in Oracle 8i.

Transparent Application Failover

When an instance fails, TAF creates a new session on the surviving node or when the SELECT mode is configured, the sessions are replayed on the surviving node. For example, let us assume we are querying a table with a million records and when the query is in midway and node fails, the session will be created on the other surviving node to the state where the query is started and the SELECT is replayed in the other node to provide output to the end user/application. This new session creation on the other node is transparent to the end user/application.

FAN and ONS

Fast Application Notification (FAN) is a critical component in all of the High Availability solutions where continuous service and continuous connections built into RAC are extended to applications and application servers. When a state of database service changes such as down or unresponsive, the new status is posted to interested subscribers via FAN events. More on FAN can be found here

Oracle Notification Services (ONS) is installed as part of installation of Oracle Grid Infrastructure or Oracle dataguard installation. ONS is responsible for propagating FAN events to all other ONS daemons it is registered with. There is one ONS daemon started per cluster node. ONS facilitates FAN and also the load balancing feature of Oracle RAC for workload management.

Session Failover without TAF

Let's see an example of a session's behavior when we connect via the default service. This service is basic service created to connect to the database.

My TNS entry is as below

orabliss =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = ol819-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orabliss.selvapc.com)
)
)

I have created a table TEST_TABLE under DUMMY user in the database so I'll perform the full table select and simulate the crash by turning off the node where the session is connected.

The select statement takes ~ 50 seconds to complete and I will be terminating the node while the query is running.

Session details:

You can see I have connected to INST_ID 2 (ol819-rac2) from Machine ol819-rac1

First normal query execution:

SQL> select * from test_table;

=== Suppressed query output ===

440,813 rows selected.

Elapsed: 00:00:50.744

Now I'll be introducing a crash by turning off node 2 after firing the select query again and monitor the existing session from node 1:

...
...
...
OWNER
--------------------------------------------------------------------------------------------------------------------------------
OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
SUBOBJECT_NAME                                                                                                                    OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE             CREATED   LAST_DDL_
-------------------------------------------------------------------------------------------------------------------------------- ---------- -------------- ----------------------- --------- ---------
TIMESTAMP           STATUS  T G S  NAMESPACE EDITION_NAME                                                                                                                     SHARING            E O A
------------------- ------- - - - ---------- -------------------------------------------------------------------------------------------------------------------------------- ------------------ - - -
DEFAULT_COLLATION                                                                                    D S CREATED_APPID CREATED_VSNID MODIFIED_APPID MODIFIED_VSNID
---------------------------------------------------------------------------------------------------- - - ------------- ------------- -------------- --------------
2019-04-17:01:42:19 VALID   N N N          1                                                                                                                                  METADATA LINK        Y N
                                                                                                     N N

ERROR:
ORA-03135: connection lost contact
Process ID: 55738
Session ID: 599 Serial number: 54452



29370 rows selected.

SQL>

Now, once the node is down the session running the select query encountered an error ORA-03135. Note the query fetched only ~29k records of total ~ 440k records before failure.

Configuring TAF

TAF is configured by creating a service on RAC and assigning failover parameters as shown below. Do not use the default service that is created while creating a container database or PDB as this service will be running on all the instances and running always on a RAC system. You should not alter or modify this service. This default service cannot be disabled, relocated or restricted so it doesn't provide high availability. Instead create a specific service with unique identifier to connect to the intended database with the intended properties.

From version 11.2 onwards we can create services using srvctl command.

srvctl add service -db cdbrac -pdb orabliss -service orabliss_taf -tafpolicy BASIC -failovertype SELECT -clbgoal LONG -preferred cdbrac1,cdbrac2

[oracle@ol819-rac1 ~]$ srvctl add service -db cdbrac -pdb orabliss -service orabliss_taf -tafpolicy BASIC -failovertype SELECT -clbgoal LONG -preferred cdbrac1,cdbrac2
[oracle@ol819-rac1 ~]$ srvctl start service -db cdbrac -service orabliss_taf
[oracle@ol819-rac1 ~]$ srvctl status service -db cdbrac -service orabliss_taf
Service orabliss_taf is running on instance(s) cdbrac1,cdbrac2
[oracle@ol819-rac1 ~]$

We now see the details of the services from the database as below

-tafpolicy accepts only BASIC from version 11.2 meaning the failover occurs only at the time of failure and there is no pre-connect option available.

-failovertype of SELECT enables the read on other surviving node once there is a crash on the session's existing node. Other option for TAF is SESSION where upon node failure the session is connected on the surviving instance but does not continue the SELECT.

Now, my TNS entry is as below using the newly created service

orabliss_taf =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = ol819-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orabliss_taf.selvapc.com)
)
)

I'm connecting to the DB using the orabliss_taf service and simulate the crash as before.

SQL> select program,sid,failover_type,failover_method,failed_over from v$session where sid=sys_context('userenv', 'sid');

PROGRAM                                                 SID FAILOVER_TYPE FAILOVER_M FAI
------------------------------------------------ ---------- ------------- ---------- ---
sqlplus@ol819-rac1.selvapc.com (TNS V1-V3)              606 SELECT        BASIC      NO

Elapsed: 00:00:00.00
SQL> select * from test_table;

=== Suppressed query output ===

=== Introduced the crash while query is running midway ===

440813 rows selected.

Elapsed: 00:08:01.75
SQL>
SQL> select program,sid,failover_type,failover_method,failed_over from v$session where sid=sys_context('userenv', 'sid');

PROGRAM                                                 SID FAILOVER_TYPE FAILOVER_M FAI
------------------------------------------------ ---------- ------------- ---------- ---
sqlplus@ol819-rac1.selvapc.com (TNS V1-V3)               26 SELECT        BASIC      YES

Elapsed: 00:00:00.06
SQL>

While the original session was created with SID=606, the session failed over to node 1 and have a new SID=26. We can see the FAILED_OVER column also changed to YES.

The query also took a significantly longer time of ~ 8 minutes against ~ 50 seconds as the failover happens to the other node and then the select has to start again from the beginning after the instance recovery is completed. So this 8 minute is a combined time of all the failover activities along with the query execution time where the session waits for the instance recovery as well.

We can see the alert log will register the below details indicating the instance recovery due to node 2 crash. This is just to check whether instance recovery is completed properly

Reconfiguration started (old inc 4, new inc 6)
List of instances (total 1) :
 1
Dead instances (total 1) :
 2
My inst 1
publish big name space -  dead or down/up instance detected, invalidate domain 0
 Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
* dead instance detected - domain 2 invalid = TRUE, need cdb-level instance recovery
* dead instance detected - domain 3 invalid = TRUE, need cdb-level instance recovery
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
...
...
 Post SMON to start 1st pass IR
2023-03-18T10:11:45.794262+05:30
CDB IR excluding pdb 2 which was cleanly closed.
2023-03-18T10:11:45.794383+05:30
Instance recovery: looking for dead threads
2023-03-18T10:11:45.812848+05:30
Beginning instance recovery of 1 threads
...
...
...
Started redo application at
 Thread 2: logseq 83, block 157204, offset 0
...
...
...
2023-03-18T10:11:46.025646+05:30
Redo thread 2 internally disabled at seq 84 (SMON)
CDB instance recovery complete: pdb 2 valid 1 (flags x10, pdb flags x40080)
CDB instance recovery complete: pdb 3 valid 1 (flags x10, pdb flags x40080)
CDB instance recovery complete: pdb 0 valid 1 (flags x10, pdb flags x80)
...
...

So, the Transparent Appliation Failover (TAF) option is great for applications which reads a lot as they don't have to worry about DB getting down as the services will get the work done transparent to the application.

Now, what if a batch job runs in the database performing millions of DML and then the node is crashed? This could happen and mostly it's not by software error but sometimes due to patching window where applications are not appraised of the schedule, issues with underlying OS or storage causing DB node to crash, etc,. What would be the condition of the database sessions? Will the DMLs continue to run on the other surviving node? Do we have any automated mechanism to take care of this situation for us?

Let's see about that in the next post.

References:

Oracle Notification Client and FAN

Document 460982.1 (oracle.com)

Server Control Utility Reference (oracle.com)

Happy TAFing...!!!