Thursday, December 29, 2022

Troubleshooting Kerberos and Oracle CMU configuration

 In one of my previous posts, I have shared complete steps to configure Oracle Centrally Managed Users (CMU) which was completely done in my personal lab environment. Though I have encountered few hiccups, I was able to set it up without big issues. 

Now I'm trying to perform a proof of concept of the same set up in my work environment where I was stuck with several different issues and crossing them one after the other. As I was working with Oracle support and trying to iron out all issues, I learnt a few things related to AD, LDAP, etc.,.


In this post, I'll share the troubleshooting steps for the issues encountered so that it could help fellow DBAs to set up in their environment. 

The following support note will be of much help but its very limited in identifying the exact issue. We can use this note as starting point to dig deeper. 

Kerberos Troubleshooting Guide (Doc ID 185897.1)

Ok, let's start with the different troubleshooting steps. This is not step by step approach and any step can be used any time as per need basis

1. Tracing:

We might just get random errors like ORA-12638, ORA-12631, etc., but it lead you to nowhere. So need to trace the session in order to work with Oracle support and provide them with what they are looking for. So how do you trace the session? 

We need to set the below parameters in sqlnet.ora in both database server (if server trace is needed) and client machine (usually this will be mandatory to have). 

TRACE_UNIQUE_SERVER = ON
TRACE_LEVEL_SERVER = 16
TRACE_DIRECTORY_SERVER = /oracle/ABC/19.0.0/network/log
TRACE_FILE_SERVER = Server_Trace
TRACE_TIMESTAMP_SERVER = ON
DIAG_ADR_ENABLED=OFF
TRACE_UNIQUE_CLIENT = ON
TRACE_LEVEL_CLIENT = 16
TRACE_DIRECTORY_CLIENT = /oracle/ABC/19.0.0/network/log
TRACE_FILE_CLIENT = Client_Trace
TRACE_TIMESTAMP_CLIENT = ON
DIAG_ADR_ENABLED=OFF
Once tracing is completed, we can switch the parameter TRACE_LEVEL_SERVER or TRACE_LEVEL_CLIENT to OFF which will stop tracing of sessions. Make sure you just turn on and turn off quickly once you test your connection as this is instance wide and all the sessions will be traced filling up your log location/disk quickly. 

You can identify the trace file by looking into the sqlnet.log file, just giving a sample log content from the sqlnet.log
***********************************************************************
Fatal NI connect error 12631, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xxxxx.com)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ABC)(CID=(PROGRAM=sqlplus)(HOST=xxxxx)(USER=oracle))))

  VERSION INFORMATION:
        TNS for Linux: Version 19.0.0.0.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
  Version 19.16.0.0.0
  Time: 23-DEC-2022 19:34:37
  Tracing to file: /oracle/ABC/19.0.0/network/log/client_trace_3655062.trc
  Tns error struct:
    ns main err code: 12631
    TNS-12631: Username retrieval failed
    ns secondary err code: 0
	
***********************************************************************
The logfile is indicated for the error that we have received. We can provide this log file to Oracle support for further review

2. Generating krb5 trace:

Below commands would help generate krb5 trace where we can trace actions related to okinit 

a) Set krb5_trace parameter

export KRB5_TRACE=<path>/krb5trace.trc
echo $KRB5_TRACE

b) On the database server run

$ okdstry
$ okinit <xxx>
$ oklist
$ sqlplus /@<connect_identifier>

This will generate krb5trace.trc file which can be uploaded to support for their analysis

3. Use of same case for principal name:

While creating SPN for database and generating keytab file, make sure you are using small case for creation and configuration 

I have used 
ktpass -princ ORACLE/linux-8.selvapc.com@ORABLISS.com -pass Qwerty123# -mapuser oserv@orabliss.com -crypto ALL -ptype KRB5_NT_PRINCIPAL -out Z:\linux-8-database.keytab

This resulted in error in my PoC environment. 

Instead use small case for database SPN
ktpass -princ oracle/linux-8.selvapc.com@ORABLISS.com -pass Qwerty123# -mapuser oserv@orabliss.com -crypto ALL -ptype KRB5_NT_PRINCIPAL -out Z:\linux-8-database.keytab  

sqlnet.ora should also have principal name in small case. 

SQLNET.AUTHENTICATION_KERBEROS5_SERVICE=oracle

4. ldapbind:

The ldapbind command-line tool enables you to see whether you can authenticate a client to a server. When we set up CMU and if we are getting error, we can check whether the client machine is getting authenticated with the server by using the command as below.

ldapbind -h <AD Host> -p 389 -D <value of ORACLE.SECURITY.DN> -w <value of ORACLE.SECURITY.PASSWORD>
ldapbind -h <AD Host> -p 389 -D <value of ORACLE.SECURITY.USERNAME> -w <value of ORACLE.SECURITY.PASSWORD>
ldapbind -h <AD Host> -p 636 -U 2 -W "file:<wallet location>" -P <wallet password> -D <value of ORACLE.SECURITY.DN> -w <value of ORACLE.SECURITY.PASSWORD>
ldapbind -h <AD Host> -p 636 -U 2 -W "file:<wallet location>" -P <wallet password> -D <value of ORACLE.SECURITY.USERNAME> -w <value of ORACLE.SECURITY.PASSWORD>

The first 2 test will test authentication with TCP and the next 2 will test authentication with SSL which we would have already set up in CMU. 

Example given below: 
$ ldapbind -h xxx.xxx.example.com -p 389 -D "CN=SV-oraservice,OU=Service,OU=Accounts,DC=xxx,DC=xxx,DC=example,DC=com" -w '%*9&nnh!!GFulh!G'
bind successful
$ ldapbind -h xxx.xxx.example.com -p 389 -D SV-oraservice -w '%*9&nnh!!GFulh!G'
bind successful
$ ldapbind -h xxx.xxx.example.com -p 636 -U 2 -W "file:/oracle/grid/19.0.0/network/admin/cmu_wallet" -P W9&nnhYT# -D "CN=SV-oraservice,OU=Service,OU=Accounts,DC=xxx,DC=xxx,DC=example,DC=com" -w '%*9&nnh!!GFulh!G'
bind successful
$ ldapbind -h xxx.xxx.example.com -p 636 -U 2 -W "file:/oracle/grid/19.0.0/network/admin/cmu_wallet" -P W9&nnhYT# -D SV-oraservice -w '%*9&nnh!!GFulh!G'
bind successful
$
If there is any issue with the connectivity, we will be getting a different error. For eg, let me try with a wrong password for the service account..
$ ldapbind -h xxx.xxx.example.com -p 389 -D "CN=SV-oraservice,OU=Service,OU=Accounts,DC=xxx,DC=xxx,DC=example,DC=com" -w 'wrongpass'
ldap_bind: Invalid credentials
ldap_bind: additional info: 80090308: LdapErr: DSID-0C090439, comment: AcceptSecurityContext error, data 52e, v4563
$
You can see when I provided a wrong password, I got the Invalid credentials error as expected. 

5. LDAP force sync

When there is change in the AD side such as password change for the service account, Oracle sometimes won't sync up with the changes and will throw error. We need to force sync Oracle with the LDAP services using the below parameters. 
Note: This is needed only when you doubt that Oracle is not synced up with AD/LDAP 
SQL> alter system set "_ldap_config_force_sync_up"=TRUE;

System altered.

SQL> alter system set "_ldap_config_force_sync_up"=FALSE;

System altered.

SQL> exit

$ sqlplus /@<connect string> 
 
We just need to set _ldap_config_force_sync_up TRUE and FALSE and retry the connection to the database. 

6. Viewing available certificates in AD server

Usually certificates are provided by issuing authority or if it's internal as in this case, AD team will provide you with the certificates. In my case, if I need to check what are the certificates I might need from AD team for the configuration to work, we can check using the openssl -showcert command. 

The command is as follows 
openssl s_client -connect <AD host>:636 -showcerts
$ orapki wallet display -wallet .
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Oracle Secret Store entries:
ORACLE.SECURITY.DN
ORACLE.SECURITY.PASSWORD
ORACLE.SECURITY.USERNAME
Trusted Certificates:
Subject: CN=EXAMPLE MSPKI XXX Root CA
$
$ openssl s_client -connect abcdef12.xxx.xxx.example.com:636 -showcerts
CONNECTED(00000003)
depth=1 DC = com, DC = example, DC = xxx, DC = xxx, CN = qasmime95
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 CN = ABCDEF12.xxx.xxx.example.com
verify return:1
---
Certificate chain
 0 s:CN = ABCDEF12.xxx.xxx.example.com
   i:DC = com, DC = example, DC = xxx, DC = xxx, CN = qasmime95
-----BEGIN CERTIFICATE-----
MIIFuDCCBKCgAwIBAgITbwAABIUbCx91qS38VQAAAAAEhTANBgkqhkiG9w0BAQsF
ADB2MRMwEQYKCZImiZPyLGQBGRYDY29tMRkwFwYKCZImiZPyLGQBGRYJaG9uZXl3
...
...
...
ZA8ptBPzUXSgSNI+5wfh3F58Tfoi8Ig07ryyR4HD3WKGDE2DDSmMdnZ8oB2cBnuJ
T1RQK8+16GKI1bQFoNRfbwZZXmOjlIJiESZvUg==
-----END CERTIFICATE-----
 1 s:DC = com, DC = example, DC = xxx, DC = xxx, CN = qasmime95
   i:CN = EXAMPLE QMSPKI XXX Root CA
-----BEGIN CERTIFICATE-----
MIIGJDCCBAygAwIBAgITfQAAAAvG9D//4L986AAAAAAACzANBgkqhkiG9w0BAQsF
ADAsMSowKAYDVQQDEyFIb25leXdlbGwgUU1TUEtJIEludGVybmFsIFJvb3QgQ0Ew
...
...
...
ZNaJmBGIBZsLM6IrUw/WGAdNdPQUbcXMBdlSGY5DaGYMGNjnoOqPHQ==
-----END CERTIFICATE-----
---
Server certificate
subject=CN = ABCDEF12.xxx.xxx.example.com

issuer=DC = com, DC = example, DC = xxx, DC = xxx, CN = qasmime95
...
...
 
$
This check is required if you suspect that the provider team has given a prod certificate in place of non prod certificate or vice versa. You can see from above that the wallet has MSPKI XXX Root CA whereas the AD server has QMSPKI XXX Root CA. So we have an incorrect certificate in the wallet and hence it needs to be removed and proper certificate needs to be imported. 
Note: O/p of the command has been trimmed to include only needed content. 

7. Check your account availability

I know this can be an odd situation as to check our account availability but its better to check whether the account that you are trying to access is created/available in the AD and is binded to the service account. We can use ldapsearch command to check for the account existence with all the account details 

Command is as below 
ldapsearch -R -h <AD host> -p 389 -b "DC=xxxx,DC=xx,DC=com" -D <ad super user> -w "<ad user password>" "sAMAccountName=<AD user>"
$ ldapsearch -R -h abcdef12.xxxxx.xx.example.com -p 389 -b "DC=xxxxx,DC=xx,DC=example,DC=com" -D SV-oraservice -w 'password' "sAMAccountName=h368225"
CN=Selvakumar Nagulan,OU=Accounts,OU=Tier1,OU=Admin,DC=xxxxx,DC=xx,DC=example,DC=com
objectClass=top
objectClass=person
objectClass=organizationalPerson
objectClass=user
cn=Selvakumar Nagulan
...
...
 
$
Note that we use our service account as super user to check whether our DB user is properly bind to super user or not. You will get a blank response if the user is not bind or not configured. 



So that's all for now as I was able to successfully configure CMU after working with Oracle support with 2 different support tickets and internal AD team spanning over a month. You might connect with me on any issues regarding setting up CMU in your organization and I would be happy to help with my experiences troubleshooting the same. 

References: 

Oracle Centrally Managed Users (CMU) configuration

Happy Troubleshooting!!! 
 
 
 
 
 
 
 
 

2 comments: