Looking for something? Try here..

Monday, January 17, 2022

Wrong ORA error during PDB cloning!

 Last week, while trying to clone a PDB under a CDB on one cluster (non-prod) to another PDB under different CDB on a different cluster (prod), I encountered a series of errors where the first 2 are what someone can expect but the last error is something I couldn't figure out how it got generated. Let's take a look into it in detail


Set up: 

  • Updated tnsnames.ora file on all the nodes of the cluster
  • Created db link from Prod to non prod CDB to connect to pdb
  • Test the connection

QA01=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(Host = scanname.example.com)(Port = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = QA01_SVC.example.com)
      )
    )

So far, so good. We are able to connect to the PDB via DB link and hence we can progress to clone the PDB from source CDB to target CDB.
SQL> CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx;
CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx
*
ERROR at line 1:
ORA-17628: Oracle error 46659 returned by remote Oracle server
ORA-46659: master keys for the given PDB not found
 
With reference to MOS Doc id: 2778618.1, to resolve the above error when executing CREATE PLUGGABLE statement add " including shared key" and the end of statement.
Here are some series of errors that takes place..
SQL> CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx including shared key;
CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx including shared key
*
ERROR at line 1:
ORA-65169: error encountered while attempting to copy file
+DATAC3/CDB01/CE3C246E805F6F3AE053D8A1C30A498C/DATAFILE/libraryd.2661.1085857629
ORA-17627: ORA-12170: TNS:Connect timeout occurred
ORA-17629: Cannot connect to the remote database server
  
Cool, now we at least have a different error than the above. From alert log.. 
Fatal NI connect error 12170, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=scanname.example.com)(Port=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=QA01_SVC.example.com)(CID=(PROGRAM=oracle)(HOST=pd07)(USER=oracle))))

  VERSION INFORMATION:
        TNS for Linux: Version 19.0.0.0.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
  Version 19.12.0.0.0
  Time: 12-JAN-2022 18:21:25
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505

TNS-00505: Operation timed out
    nt secondary err code: 0
    nt OS err code: 0
2022-01-12T18:21:25.948274+00:00
Errors in file /u02/app/oracle/diag/rdbms/***/****/trace/****_p000_82225.trc:
ORA-17627: ORA-12170: TNS:Connect timeout occurred
ORA-17629: Cannot connect to the remote database server
 
There are no nt secondary err code which means this is purely timeout from the database. We have not set any timeout value and the default INBOUND_CONNECT_TIMEOUT of 60 seconds kicked in and the connection timed out. No issues with this. All cool. 
Just after receiving the timeout, I made the second attempt. 
SQL> CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx including shared key;
CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx including shared key
*
ERROR at line 1:
ORA-65169: error encountered while attempting to copy file
+DATAC3/CDB01/CE3C246E805F6F3AE053D8A1C30A498C/DATAFILE/ddtbsd.2636.1085857609
ORA-17627: ORA-01017: invalid username/password; logon denied
ORA-17629: Cannot connect to the remote database server
 
This attempt turns out to be weird as the same command just a minute later throws out a totally weird error of ORA-01017. If we have got the error in the first attempt, I might have suspected the password provided and rechecked the password but now the password which worked properly while testing is giving us ORA-01017. 
Alert log says.. 
2022-01-12T18:23:27.300138+00:00
Errors in file /u02/app/oracle/diag/rdbms/***/***/trace/***_p004_82257.trc:
ORA-17627: ORA-01017: invalid username/password; logon denied
ORA-17629: Cannot connect to the remote database server
2022-01-12T18:24:26.097107+00:00
Thread 7 advanced to log sequence 94 (LGWR switch),  current SCN: 37359630199069
  Current log# 26 seq# 94 mem# 0: +DATAC4/CDB0101/ONLINELOG/group_26.2116.1092461287
2022-01-12T18:24:29.260711+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/libraryd.2124.1093803079.
2022-01-12T18:24:29.287000+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/users.2133.1093803079.
2022-01-12T18:24:29.408203+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/pltbsd.2046.1093803079.
2022-01-12T18:24:29.514934+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/pptbsx.2050.1093803079.
2022-01-12T18:24:29.547221+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/cltbsd.2126.1093803079.
2022-01-12T18:24:29.558701+00:00
Deleted file +DATAC4/CDB01/D566C47DE983511DE05396A1C30A5D5E/DATAFILE/dmtbsx.2131.1093803079.
2022-01-12T18:24:29.610720+00:00
**************************************************************
Undo Create of Pluggable Database PROD01 with pdb id - 3.
**************************************************************
Checking the trace file associated with the error.. There are multiple session id with the same process id, so taking a deeper look reveals the below.
...
... trimmed o/p ...

*** 2022-01-10T16:28:38.119788+00:00 (CDB$ROOT(1))
*** SESSION ID:(1188.31010) 2022-01-10T16:28:38.119813+00:00
*** CLIENT ID:() 2022-01-10T16:28:38.119816+00:00
*** SERVICE NAME:(pdb01) 2022-01-10T16:28:38.119819+00:00
*** MODULE NAME:(sqlplus@pd07 (TNS V1-V3)) 2022-01-10T16:28:38.119823+00:00
*** ACTION NAME:() 2022-01-10T16:28:38.119826+00:00
*** CLIENT DRIVER:() 2022-01-10T16:28:38.119829+00:00
*** CONTAINER ID:(1) 2022-01-10T16:28:38.119832+00:00

OSSIPC:
IPCDAT DESTROY QP for qp 0x18b96e48 ep 0x18b96cd0 drain start time 146888533
OSSIPC:IPCDAT DESTROY QP for qp 0x18b96e48 ep 0x18b96cd0 drain complete num reqs drained 0 drain end time 146888534 drain time 1 msec time slept 0

... trimmed o/p ...

*** 2022-01-12T17:47:46.797793+00:00 ((6))
*** SESSION ID:(1191.49246) 2022-01-12T17:47:46.799247+00:00
*** SERVICE NAME:() 2022-01-12T17:47:46.799261+00:00
*** MODULE NAME:(sqlplus@pd07 (TNS V1-V3)) 2022-01-12T17:47:46.799264+00:00
*** ACTION NAME:() 2022-01-12T17:47:46.799268+00:00
*** CONTAINER ID:(6) 2022-01-12T17:47:46.799274+00:00

Using key:
 kcbtek structure 0x7ffd3b4751e0
 utsn: 0x10030000001e (4099/30), alg: 0 keystate: 0 inv?: 0 usesalt?: 0 enctbs?: 0 obf?: 0 keyver: 4294967295 fbkey?: 0 fullenc?: 0 frn?: 0 rcv?: 0 skipchk?: 0 use_for_dec?: 0
 encrypted key: 0000000000000000000000000000000000000000000000000000000000000000
 mklocact 0 mkloc 0, mkid: 00000000000000000000000000000000 kcl: [0x7f344f3a15b8,0x7f3446627b30] kverl: [0x7f3446627af0,0x7f3446627af0]
kpdbfRekeyFileBlock: decrypting block file 100 block 56737 (<6/30, 0xdda1) failed with error 3.  Continue with operation..
kcbtse_encdec_tbsblk: WARNING cannot decrypt encrypted block since a valie key is not found. <4099/30, 0xdda2 (afn 100 block 56738)>

... trimmed o/p ...

 encrypted key: 0000000000000000000000000000000000000000000000000000000000000000
 mklocact 0 mkloc 0, mkid: 00000000000000000000000000000000 kcl: [0x7f344f3a15b8,0x7f3446627b30] kverl: [0x7f3446627af0,0x7f3446627af0]
kpdbfRekeyFileBlock: decrypting block file 100 block 109895 (<6/30, 0x1ad47) failed with error 3.  Continue with operation..
OCI error val is 1017 and errmsg is 'ORA-01017: invalid username/password; logon denied

*** 2022-01-12T18:23:27.299715+00:00 (CDB$ROOT(1))
*** SESSION ID:(1191.47408) 2022-01-12T18:23:27.299745+00:00
*** SERVICE NAME:(SYS$USERS) 2022-01-12T18:23:27.299749+00:00
*** MODULE NAME:(sqlplus@pd07 (TNS V1-V3)) 2022-01-12T18:23:27.299753+00:00
*** ACTION NAME:() 2022-01-12T18:23:27.299757+00:00
*** CONTAINER ID:(1) 2022-01-12T18:23:27.299760+00:00

'
ORA-17627: ORA-01017: invalid username/password; logon denied
<error barrier> at 0x7ffd3b473998 placed ksrpc.c@5166
ORA-17629: Cannot connect to the remote database server
<error barrier> at 0x7ffd3b47cf80 placed kpoodr.c@237
OSSIPC:
IPCDAT DESTROY QP for qp 0x18bc3ce8 ep 0x18b50e90 drain start time 472374487

*** 2022-01-14T13:03:35.743519+00:00 (CDB$ROOT(1))
*** SESSION ID:(1195.64486) 2022-01-14T13:03:35.743542+00:00
*** SERVICE NAME:() 2022-01-14T13:03:35.743548+00:00
*** MODULE NAME:() 2022-01-14T13:03:35.743552+00:00
*** ACTION NAME:() 2022-01-14T13:03:35.743557+00:00

OSSIPC:IPCDAT DESTROY QP for qp 0x18bc3ce8 ep 0x18b50e90 drain complete num reqs drained 0 drain end time 472374494 drain time 7 msec time slept 0
OSSIPC:
IPCDAT DESTROY QP for qp 0x18b5f038 ep 0x18b5ee10 drain start time 472374537
OSSIPC:IPCDAT DESTROY QP for qp 0x18b5f038 ep 0x18b5ee10 drain complete num reqs drained 0 drain end time 472374537 drain time 0 msec time slept 0
Process termination requested for pid 82257 [source = rdbms], [info = 2] [request issued by pid: 199810, uid: 1001]
The trace file is updated with Timestamp when a new SESSION ID is logged with the same process id. We can see that there is already a session started at 2022-01-12T17:47:46 (line 19) with SESSION ID:(1191.49246) and our actual session of interest is logged at 2022-01-12T18:23:27 (line 41) with SESSION ID:(1191.47408). Match this timestamp with the timestamp of alert log when we issued the clone command and also see the serial# differs with both the sessions. 
Interestingly we can see the single quote (') started at line 39 to quote the error message but it didn't complete there and ended at line 48 after our clone session (SESSION ID:(1191.47408)) is established which then throws the error from the previous session on the current session. This might be a bug from Oracle software but I didn't raise a case with Oracle.
We can also see the clone is half way done before the error was encountered with the line "Undo Create of Pluggable Database PROD01 with pdb id - 3". All the files that were cloned are getting deleted once the error is encountered so I think this is definitely not an issue with invalid username/password

I didn't do anything after the error I received and left the session as it is as it was late night for me. Next day in the same session, I issued the same exact command
SQL> CREATE PLUGGABLE DATABASE PROD01 from QA01@QA01 keystore identified by xxx including shared key;

Pluggable database created.
This time in the 3rd attempt, the pluggable database created without any issues with the same command issued in the previous 2 attempts. 
So, when Oracle throws out weird error it's wise to investigate the alert log and associated trace file to see if the error thrown is genuine. This doesn't happen frequently but in my case it happened. 

So, is this a bug? I'm not sure as I didn't raise a case with Oracle as I could get past the error and complete my task of cloning. 

References: 

Happy troubleshooting...!!!