We have had some difficulty correlating the IO statistics presented
by Oracle (V$FILESTAT.PHYRDS,V$FILESTAT.PHYWRTS) with IO transfers
by device available within UNIX (using local uaiostat, iostat enhanced).
Specifically, Oracle is reporting physical read rates significantly
higher than the IO rates on the devices. My speculation on this is that
some requests are being satisfied by UBC which doesn't really correspond
with some of the things I've been told about Oracle's behaviour.
To try and get a handle on this myself I ran a benchmark last Sunday
when I had the system to myself, running two tests which would be
read IO intensive after rebooting the system and the controllers:
analyze table ... validate structure cascade;
(6 million row table, ran for 18 minutes)
full database export to /dev/null (ran for 1 hour 26 minutes).
While running these tests I had background vmstat, vmubc, uaiostat,
dbx shows on vm_perfsum, oracle filestats, and hsz40 dstat all set
up for periodic reporting.
Summary tables of the benchmark results follow immediately, and
my questions for anybody in the know after that.
Table#1: Oracle physical read+write and uaiostat transfers comparison
-------
oracle
phyR+W
Oracle Oracle Oracle uaiostat uaiostat
Domain Files phyRead phyWrite IO's transfers delta
------ ----- ------- -------- ------ --------- --------
d1f101 1242 n/a
*1 d1f102 16 50155 2475 52630 11841 40789
d1f200 8 30411 186 30597 30305 292
d1f300 8 2359 424 2783 3618 -835
*2 d1f400 25 118309 2365 120674 68580 52094
d2f101 33 n/a
d2f200 6 27465 1936 29401 29270 131
d2f300 16 49123 702 49825 51526 -1701
d2f400 22 n/a
rz25 22090 n/a
------ ----- ------- -------- ------ --------- --------
Grand: 79 277822 8088 285910 218527 n/a
Note: Oracle was reported over 7888 seconds and
iostat was reported over 8081 seconds, so minor variances
are tolerable since the intervals did not exactly overlap.
Four of six domains corollate Oracle vs. uaiostat.
Table#1a: Domain d1f102 summary:
--------
Domain Mnt Tablespace FID phyRead phyWrite
------ --- ----- --- ------- --------
*1a d1f102 /u03 AUDITS 071 42666 63
d1f102 /u03 COMDATA 016 1148 85
d1f102 /u03 COMDATA 035 44 5
d1f102 /u03 DBH_DATA 028 448 0
d1f102 /u03 GENDATA 015 886 2288
d1f102 /u03 POSDATA 014 296 10
d1f102 /u03 POSDATA 065 459 0
d1f102 /u03 RESDATA 041 1 0
d1f102 /u03 SATDATA 013 141 0
d1f102 /u03 TASDATA 017 3753 24
d1f102 /u03 TEMP 026 313 0
------ --- ----- --- ------- --------
d1f102 Total: 16 50155 2475
Table#2: uaiostat transfers and hsz40 dstat read/write command comparison
-------
hsz40 hsz40
dstat dstat /
uaiostat reads + uaiostat
Domain transfers writes ratio
------ --------- ------ -----
d2f101 33 33 1.00
*3 d2f200 29270 40960 1.40
*4 d2f300 51523 73566 1.43
d2f400 22 22 1.00
rz25 20637 20870 1.01
Note: hsz40 dstat did not report write commands (showed as ********)
for either d2f200 or d2f400, total number is reads only.
Table#3: ubc activity (test#1 ran 10:28-10:56, test#2 ran 11:01-12:27)
-------
Interval
Interval Interval vpf_ubc pct UBC
*5 vpf_ubc vpf_ubc lookup lookup size
Time hit lookups hits hits (mb)
---- ----- ------- ------ ------ -----
10:23:08 36.5
10:38:12 0 946 103 10.89% 36.4
10:53:15 8108 215655 11872 5.51% 603.6
11:08:20 3409 127944 28467 22.25% 603.6
11:23:24 5518 183949 18478 10.05% 603.6
11:38:28 4076 191763 18435 9.61% 603.6
11:53:33 4814 193637 18525 9.57% 603.6
12:08:37 5525 183309 18391 10.03% 603.6
12:23:41 4483 176855 18599 10.52% 603.6
12:38:45 39 4079 142 3.48% 603.6
-------- ----- ------- ------ ------
*6 Total: 35972 1278137 133012 10.41%
Questions/Comments:
------------------
*1 Domain d1f102 Oracle rates do not correspond to the UNIX uaiostat.
This domain has Oracle files only.
*1a However, d1f102 does contain the Oracle audit tablespace which
would have been active independent of the tests (particularly
with periodic offload procedures).
My speculation is many of the physical reads on this domain were
satisfied by UBC even though they were not satisfied within
Oracle's buffers.
Can anybody offer an opinion?
*2 Domain d1f400 also holds ORACLE_HOME.
As such, many IO's would not have been reflected within Oracle.
*3 Does anybody know why hsz dstat failed to report the
write command rates?
*4 For the two active domains on the monitored hsz40's, the HSZ
read command rate significantly exceeded the uaiostat transfer rate.
During many of the intensive 5 minute reporting intervals
the ratio was close to exactly 1.50 hsz read commands to
UNIX transfers. Likewise, if you subtract out the Oracle write
from the uaiostat transfers the ratio also approaches 1.50 for
the entire period.
My only guess is the hsz40 'read-ahead' algorithm may have been
invoked on these domains since the tests primarily reflect
serial reads through large data files.
Can anybody offer an opinion?
*5 Does anybody know where there is a good discussion
(short of reading the UNIX source code) on the vpf_ubc* fields?
*6 The values reported by vmubc were meaningless,
Table#3 came from vm_perfsum.
Can anybody offer an opinion on vmubc and UBC usage analysis,
specifically as it relates to AdvFS?
#7 Does anybody know of any good reference material on breaking
out IO performance components (Oracle, UNIX, AdvFS, UBC, kzpsa,
hsz40, devices) along the entire path?
#8 The benchmark was not perfectly isolated to my tests, an
auto-start background process was also executing (which
accounts for the non-log/non-audit write activity which
occurred... relatively low volume).
Does anybody know of good io benchmarking materials for
Oracle under Digital UNIX?
(I would be happy to sign non-disclosures if necessary).
#9 Any other comments on method and/or configuration are
welcome.
I will, of course, summarize. You can expect that to be a week or
two depending on how long it takes to come up with explanations.
If there is interest, I would be willing to post the complete
benchmark results and detail tables on our web page. I could also
be coerced into a Fall DECUS presentation if anatomy of IO's
(UNIX+Oracle+Physical) is of interest to a number of others...
I've found the reference materials on this extremely sparse
(particularly for relating the components).
Environment Summary:
-------------------
Digital UNIX v3.2g, AdvFS on all filesystems, ubc-maxpercent=10/min=5
Oracle 7.2.3, shared_pool_size=1gb, db_block_buffers=30k, db_block_size=8192
8400 6gb memory 1 redundant pair HSZ40 v3.0-3, 1 redundant pair HSZ40 v2.7-2
all active domains in 6*disk raidsets (mix of rz28|rz29).
_____________________________________________________________________
Kurt Carlson, University of Alaska SOIS/TS, (907)474-6266
sxkac_at_alaska.edu 910 Yukon Drive #105.63, Fairbanks, AK 99775-6200
Received on Thu Apr 03 1997 - 08:32:01 NZST