Looking for something? Try here..

Monday, December 20, 2021

Oracle Application/session tracing methods - 3

 This is a continuation of the previous posts regarding how to trace an individual session as discussed in the below link

Oracle Application/session tracing methods - 1

and the methods of how to trace the sessions at application or database level as discussed in the below link

Oracle Application/session tracing methods - 2

You might see when we start tracing at application level, we get a bunch of trace files as there are multiple session involved. For eg., when we trace using a module, multiple sessions can run queries involving same module. In such cases, we can't go dig each and every trace file to investigate the issue. It would be much easier if we have a single consolidated trace file so that we can have a better understand of the issue we are investigating. Also, the trace file needs to be in human readable format rather some bunch of random random numbers here and there which only the computer can understand. 


In this post we are going to look at how to consolidate multiple trace files as well how to make these trace files human readable using the below tools

  • trcsess
  • tkprof

trcsess

We can use the trcsess to consolidate trace files based on several criteria such as 

  • Session id
  • Client id
  • Service name
  • Action name
  • Module name

Let's see how to use this tool with an example. In my previous post, when I mentioned about tracing session with service name, I used Swingbench to generate load using 4 session. The session traces are as below

..
..
-rw-r-----. 1 oracle oinstall  4954796 Dec 15 00:25 orcl_ora_19415.trm
-rw-r-----. 1 oracle oinstall 44587093 Dec 15 00:25 orcl_ora_19415.trc
-rw-r-----. 1 oracle oinstall  4233304 Dec 15 00:25 orcl_ora_19413.trm
-rw-r-----. 1 oracle oinstall 37791460 Dec 15 00:25 orcl_ora_19413.trc
-rw-r-----. 1 oracle oinstall  4911273 Dec 15 00:25 orcl_ora_19411.trm
-rw-r-----. 1 oracle oinstall 44277819 Dec 15 00:25 orcl_ora_19411.trc
-rw-r-----. 1 oracle oinstall  3892558 Dec 15 00:25 orcl_ora_19409.trm
-rw-r-----. 1 oracle oinstall 34574211 Dec 15 00:25 orcl_ora_19409.trc
..
..
These 4 traces can be consolidated into a single trace file as below

Syntax: trcsess [output=] [session=] [clientid=] [service=] [action=] [module=]
[oracle@linux75-2 trace]$
[oracle@linux75-2 trace]$ trcsess output=service_orcl_trace.trc service=ORCL orcl_ora_19415.trc orcl_ora_19413.trc orcl_ora_19411.trc orcl_ora_19409.trc
[oracle@linux75-2 trace]$
[oracle@linux75-2 trace]$ ls -lrt service_orcl_trace.trc
-rw-r--r--. 1 oracle oinstall 161234967 Dec 19 23:56 service_orcl_trace.trc
[oracle@linux75-2 trace]$ head -20 service_orcl_trace.trc
*** [ Unix process pid: 19415 ]
*** 2021-12-15T00:25:01.309981+05:30
*** [ Unix process pid: 19413 ]
*** 2021-12-15T00:25:01.309867+05:30
*** [ Unix process pid: 19409 ]
*** 2021-12-15T00:25:01.309928+05:30
*** [ Unix process pid: 19415 ]
*** 2021-12-15T00:25:01.309986+05:30
*** [ Unix process pid: 19413 ]
*** 2021-12-15T00:25:01.309871+05:30
*** [ Unix process pid: 19409 ]
*** 2021-12-15T00:25:01.309933+05:30
*** [ Unix process pid: 19415 ]
*** 2021-12-15T00:25:01.309990+05:30
*** CLIENT DRIVER:() 2021-12-15T00:25:01.309999+05:30

WAIT #0: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=11623002622
WAIT #0: nam='SQL*Net message from client' ela= 111757 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=11623121966
WAIT #0: nam='latch: shared pool' ela= 230 address=1613615352 number=551 why=1787423120 obj#=-1 tim=11623122364
WAIT #0: nam='library cache load lock' ela= 13162 object address=1757868440 lock address=1759806192 100*mask+namespace=5177347 obj#=-1 tim=11623135636
[oracle@linux75-2 trace]$ 
You can notice that I'm providing the trace files for the trcsess to look into for trace details, otherwise trcsess will look into all the trace files inside the trace directory. 
Also, once the trace files are consolidated, the trace file shows the pid of all the sessions traced in the beginning. 

tkprof

The raw trace file as I mentioned in the beginning of this post is in Oracle proprietary format like below. 
..
..
*** 2021-12-15T00:25:01.309990+05:30
*** CLIENT DRIVER:() 2021-12-15T00:25:01.309999+05:30

WAIT #0: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=11623002622
WAIT #0: nam='SQL*Net message from client' ela= 111757 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=11623121966
WAIT #0: nam='latch: shared pool' ela= 230 address=1613615352 number=551 why=1787423120 obj#=-1 tim=11623122364
WAIT #0: nam='library cache load lock' ela= 13162 object address=1757868440 lock address=1759806192 100*mask+namespace=5177347 obj#=-1 tim=11623135636
WAIT #139804731097848: nam='cursor: pin S wait on X' ela= 2567 idn=3873422482 value=1735166787584 where=21474836480 obj#=-1 tim=11623138951
=====================
PARSING IN CURSOR #139804731097848 len=82 dep=1 uid=0 oct=3 lid=0 tim=11623139023 hv=3873422482 ad='71383950' sqlid='0k8522rmdzg4k'
select privilege# from sysauth$ where (grantee#=:1 or grantee#=1) and privilege#>0
END OF STMT
PARSE #139804731097848:c=379,e=3165,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2057665657,tim=11623139020
EXEC #139804731097848:c=161,e=161,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2057665657,tim=11623139260
FETCH #139804731097848:c=74,e=74,p=0,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=2057665657,tim=11623139366
WAIT #139804731068608: nam='cursor: pin S wait on X' ela= 3282 idn=3008674554 value=1735166787584 where=21474836480 obj#=-1 tim=11623142719
=====================
PARSING IN CURSOR #139804731068608 len=226 dep=1 uid=0 oct=3 lid=0 tim=11623142772 hv=3008674554 ad='71378ba8' sqlid='5dqz0hqtp9fru'
select /*+ connect_by_filtering index(sysauth$ i_sysauth1) */ privilege#, bitand(nvl(option$, 0), 72), grantee#, level from sysauth$ connect by grantee#=prior privilege# and privilege#>0 start with grantee#=:1 and privilege#>0
END OF STMT
PARSE #139804731068608:c=169,e=3381,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1227530427,tim=11623142771
WAIT #139804731068608: nam='PGA memory operation' ela= 8 p1=131072 p2=2 p3=0 obj#=-1 tim=11623142861
EXEC #139804731068608:c=118,e=118,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1227530427,tim=11623142960
WAIT #139804731068608: nam='PGA memory operation' ela= 7 p1=65536 p2=1 p3=0 obj#=-1 tim=11623143087
FETCH #139804731068608:c=209,e=209,p=0,cr=5,cu=0,mis=0,r=1,dep=1,og=4,plh=1227530427,tim=11623143182
FETCH #139804731068608:c=5,e=5,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1227530427,tim=11623143213
STAT #139804731068608 id=1 cnt=1 pid=0 pos=1 obj=0 op='CONNECT BY WITH FILTERING (cr=5 pr=0 pw=0 str=1 time=215 us)'
STAT #139804731068608 id=2 cnt=1 pid=1 pos=1 obj=144 op='TABLE ACCESS BY INDEX ROWID BATCHED SYSAUTH$ (cr=3 pr=0 pw=0 str=1 time=29 us cost=3 size=20 card=2)'
STAT #139804731068608 id=3 cnt=1 pid=2 pos=1 obj=147 op='INDEX RANGE SCAN I_SYSAUTH1 (cr=2 pr=0 pw=0 str=1 time=6 us cost=2 size=0 card=2)'
STAT #139804731068608 id=4 cnt=0 pid=1 pos=2 obj=0 op='HASH JOIN  (cr=2 pr=0 pw=0 str=1 time=125 us cost=7 size=115 card=5)'
..
..
This is partially understandable but wouldn't it be nice to see the trace file in a more readable and easily understandable format? 
For this we have tkprof (transient kernel profiler) provided by Oracle. tkprof accepts a trace file as input and produces a formatted output file and there a few options that we can make use of. Explain plan for the captured queries can also be made available as an option. Usage is as below
[oracle@linux75-2 trace]$ tkprof
Usage: tkprof tracefile outputfile [explain= ] [table= ]
              [print= ] [insert= ] [sys= ] [sort= ]
  table=schema.tablename   Use 'schema.tablename' with 'explain=' option.
  explain=user/password    Connect to ORACLE and issue EXPLAIN PLAN.
  print=integer    List only the first 'integer' SQL statements.
  pdbtrace=user/password   Connect to ORACLE to retrieve SQL trace records.
  aggregate=yes|no
  insert=filename  List SQL statements and data inside INSERT statements.
  sys=no           TKPROF does not list SQL statements run as user SYS.
  record=filename  Record non-recursive statements found in the trace file.
  waits=yes|no     Record summary for any wait events found in the trace file.
  sort=option      Set of zero or more of the following sort options:
    prscnt  number of times parse was called
    prscpu  cpu time parsing
    prsela  elapsed time parsing
    prsdsk  number of disk reads during parse
    prsqry  number of buffers for consistent read during parse
    prscu   number of buffers for current read during parse
    prsmis  number of misses in library cache during parse
    execnt  number of execute was called
    execpu  cpu time spent executing
    exeela  elapsed time executing
    exedsk  number of disk reads during execute
    exeqry  number of buffers for consistent read during execute
    execu   number of buffers for current read during execute
    exerow  number of rows processed during execute
    exemis  number of library cache misses during execute
    fchcnt  number of times fetch was called
    fchcpu  cpu time spent fetching
    fchela  elapsed time fetching
    fchdsk  number of disk reads during fetch
    fchqry  number of buffers for consistent read during fetch
    fchcu   number of buffers for current read during fetch
    fchrow  number of rows fetched
    userid  userid of user that parsed the cursor

[oracle@linux75-2 trace]$ tkprof service_orcl_trace.trc service_orcl_trace.out explain='sys as sysdba' table=sys.tkprof_table sys=no sort=exeela,execpu waits=yes

TKPROF: Release 12.2.0.1.0 - Development on Mon Dec 20 00:25:40 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.


password =
[oracle@linux75-2 trace]$ 
I have provided service_orcl_trace.trc file as input to tkprof to convert the file to service_orcl_trace.out file with explain plan for the queries along with their wait time statistics. 
I don't want to see the recursive calls made by sys and hence mentioned sys=no and would like to sort the output of the trace file with queries execute elapsed time and cpu time spent executing. 

Sample output produced by tkprof will look like below
 
********************************************************************************

SQL ID: 982zxphp8ht6c Plan Hash: 1666523684

select product_id, product_name, product_description, category_id,
  weight_class, supplier_id, product_status, list_price, min_price,
  catalog_url
from
  product_information where product_id = :1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse    95536      0.57       0.91          0          0          0           0
Execute  95536      1.00       1.58          0          0          0           0
Fetch    95536      0.80       1.32         28     286608          0       95536
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   286608      2.38       3.81         28     286608          0       95536

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 89
Number of plan statistics captured: 6

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  TABLE ACCESS BY INDEX ROWID PRODUCT_INFORMATION (cr=3 pr=1 pw=0 time=543 us starts=1 cost=2 size=176 card=1)
         1          1          1   INDEX UNIQUE SCAN PRODUCT_INFORMATION_PK (cr=2 pr=0 pw=0 time=244 us starts=1 cost=1 size=0 card=1)(object id 24503)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   95536        0.00          0.16
  db file sequential read                        28        0.00          0.02
  SQL*Net message from client                 95536        0.01         24.82
  cursor: pin S                                  15        0.00          0.01
  latch: cache buffers chains                     1        0.00          0.00
********************************************************************************
 
The trace file now looks neat and easy to understand which can be interpreted and investigated for performance issues. 
A simple interpretation would be this specific select query has been run 95536 times with 28 blocks read for the disk or physical i/o and made 286608 logical i/o ti fetch 95536 records. This means the query is bringing 1 records per execution. Total elapsed time is 3.81 seconds with 2.38 seconds spent in cpu. The investigation can go for different queries to pin point the issues with the help of this tkprof trace file report. 

So we conclude our tracing methods with this and I hope it is easy to follow and understand. 

References

What is TRCSESS and How to use it ? (Doc ID 280543.1)
Pic courtesy: www.oracle.com

Happy tracing...!!!

No comments:

Post a Comment