Introduction
In recent years, processor speed and disk capacity have vastly
increased. These innovations only go so far, however, since it
still takes too long to access data on disk. As you know too well,
I/O is the major bottleneck in VSE systems. When a job or application
puts demands on the I/O subsystem, you get sluggish response times
and long run times. The answer, of course, is to minimize I/O operations.
Disk caching provides the easiest and most cost-effective solution
to this problem. Caching will reduce the amount of time spent
waiting for an I/O to complete, resulting in improved on-line
response time and reduced batch execution times.
Caching
In an un-cached system, an I/O request must first pass through the operating
system, which initiates the I/O operation. The controller processes the
request from the operating system and directs it to the correct
device. The device then positions itself and begins the transfer
of data to the controller. Eventually, the data is stored in processor
storage and an interrupt notifies the system that the transfer
has completed. By using a hardware cache, the controller saves
the Most Recently Used (MRU) data in local storage. This eliminates
the need to communicate with the device at all, provided the
requested data is in the cache. Removing the device from the loop
usually results in savings of 60-80% in the time it takes to service
an I/O request. Put another way, an un-cached I/O request is roughly 3-5
times slower than a cached request. Software caching provides
even greater performance improvements because it removes the I/O
subsystem completely from the loop. When an I/O request is made,
the system first checks the local cache and uses the data stored
there, if available. Removing external hardware completely from
the loop results in savings approaching 100% in the time it takes
to service an I/O request. Even allowing for differing processor
speeds, a request serviced by a software cache would be measured
in micro-seconds, while an un-cached I/O request would be measured
in milli-seconds.
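The lookup described above can be sketched in a few lines of Python. This is only an illustration of the check-the-cache-first idea, not the logic of any particular caching product. The class name, the read_from_disk callback, and the least-recently-used eviction rule are all assumptions made for the example.

```python
from collections import OrderedDict


class SoftwareCache:
    """Illustrative in-storage cache that is checked before any real I/O."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # maximum number of cached blocks
        self.read_from_disk = read_from_disk  # hypothetical slow path (real I/O)
        self.blocks = OrderedDict()           # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.blocks:
            # Cache hit: no I/O subsystem involvement at all.
            self.hits += 1
            self.blocks.move_to_end(key)      # mark as most recently used
            return self.blocks[key]
        # Cache miss: perform the real I/O, then remember the result.
        self.misses += 1
        data = self.read_from_disk(key)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data
```

Only the misses reach the I/O subsystem, so the ratio of hits to total requests determines how much of the millisecond-range I/O time is replaced by microsecond-range storage access.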
VSE Caching Considerations
VSAM Catalogs
VSAM catalogs are almost ideal cache candidates. They have
all the best cache characteristics: small datasets, very high
read-to-write ratios, and excellent locality-of-reference. Hit
rates of 99% are typical, resulting in sub-millisecond average
access times for reading catalog data. This has a dramatic effect
on the time it takes to process an OPEN or CLOSE request for a
VSAM dataset, generally reducing the OPEN/CLOSE time by 60-75%.
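To see why a 99% hit rate gives a sub-millisecond average, the average access time can be computed as a weighted mix of hit time and miss time. The sketch below assumes a cache hit costs 0.05 ms and a real disk access costs 18 ms. Both values are illustrative assumptions, not measurements.

```python
def average_access_ms(hit_rate, hit_ms, miss_ms):
    """Average access time when a fraction hit_rate of reads come from the cache."""
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


# Assumed figures: 0.05 ms per cache hit, 18 ms per real disk access.
catalog_average = average_access_ms(0.99, 0.05, 18.0)
print(f"average catalog read: {catalog_average:.4f} ms")
```

With these assumed figures the average is about 0.23 ms, and nearly all of it comes from the 1% of requests that still go to disk.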
VSAM Share Options 4 Files
The advantage of VSAM Share Options 4 datasets is that multiple
applications can update the dataset at the same time. The disadvantage
is that performance is very poor. Each read (GET) and write (PUT)
results in an I/O request and no buffering is performed by VSAM.
Caching Share Options 4 datasets allows the caching product to
perform the buffering function and improves performance so that
Share Options 4 datasets have performance characteristics similar
to those of Share Options 2 datasets. This improvement yields dramatic
reductions in on-line response time and batch run times of applications
requiring Share Options 4 datasets. No changes to the system,
JCL or applications are required.
VSAM On-line Using Local Shared Resources (LSR)
CICS can be the single most important application run at many
installations. Improving the performance of I/O operations within
the CICS partition can cause dramatic reductions in online response
time. Due to storage constraints, however, it is often difficult
to get VSAM to perform well in a CICS environment.
Using VSAM Local Shared Resources (LSR) reduces the amount of storage needed by sharing buffer resources among many datasets. Because VSAM searches its buffers sequentially, very high CPU overhead results when allocating large numbers of buffers or using LSR for high-activity datasets.
This problem limits the number of buffers that can be allocated to an LSR pool and limits the datasets sharing the pool to those with low and medium usage. In addition, datasets with split activity or used in browse mode (sequentially) do not perform well using shared resources.
For datasets using LSR, caching effectively adds control interval (CI) read-ahead (due to the use of full track I/O operations) and extends the size of the LSR pools by the size of the cache being used.
VSAM On-line Using Non-Shared Resources (NSR)
Due to the limitations of VSAM LSR, high usage datasets, datasets
with split activity and datasets accessed sequentially will perform
much better using Non-Shared Resources (NSR). Because of CICS
storage limitations and multiple concurrent accesses of datasets,
it is often impossible to allocate enough buffers to let VSAM
perform read-ahead by itself.
Caching cures these problems. When VSAM reads one control interval (CI) of data, the cache manager reads the track into the cache and returns the requested CI. The remaining CIs are now in the cache and are available when requested.
The high-speed software cache provided by the caching product becomes a huge buffer pool that dramatically improves VSAM performance.
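The full-track read-ahead described above can be illustrated with a small Python sketch. It assumes CIs are numbered consecutively and that a fixed number of CIs fit on each track. The class name and the read_track callback are hypothetical and stand in for whatever the caching product actually does.

```python
class TrackCache:
    """Illustrative read-ahead cache: one miss brings in the whole track."""

    def __init__(self, read_track, cis_per_track):
        self.read_track = read_track        # hypothetical: returns every CI on a track
        self.cis_per_track = cis_per_track  # assumed fixed number of CIs per track
        self.tracks = {}                    # track number -> list of CIs
        self.disk_reads = 0

    def read_ci(self, ci_number):
        # Work out which track holds the CI and where it sits on that track.
        track, offset = divmod(ci_number, self.cis_per_track)
        if track not in self.tracks:
            # One full-track I/O; the neighbouring CIs come along for free.
            self.tracks[track] = self.read_track(track)
            self.disk_reads += 1
        return self.tracks[track][offset]
```

Reading eight consecutive CIs with four CIs per track would cost two real I/O operations in this sketch instead of eight, which is the read-ahead effect that VSAM cannot achieve when buffers are scarce.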
VSAM Batch
While VSAM is a very versatile access method allowing a great
variety of processing options, the only processing option VSAM
truly excels at is sequential. When reading a dataset sequentially,
VSAM will try to use double buffered, overlapped I/O operations
to read as much data as possible with a minimum number of I/O
requests. Given enough buffer space, VSAM can read many tracks
or even cylinders of data with a single I/O operation. Therefore,
caching a dataset that already has more than 60K of VSAM buffer space generally
results in little improvement and may even cause the application
to run more slowly, because the caching product can increase
I/O activity instead of decreasing it. This rule applies only
to VSAM datasets accessed in sequential mode. If the dataset is
Share Options 4, using LSR, accessed direct, using user buffering,
etc., caching will still provide good benefit.
Databases
Databases excel at relating various pieces of information
together and allowing users easy access to the information. However,
they do not perform I/O operations very efficiently, and they perform
a lot of them. Caching the various database tables and work spaces
can cut the run time of some batch programs by more than 70%,
although run time reductions of 15-30% are more typical. Decisions
about which tables and spaces to cache must be made dynamically
because some tables cache well when accessed by some applications
and cache poorly when accessed by others.
VSE Libraries
Loading phases from libraries actually presents three sources
of problems. The first problem is the directory search process
used to locate phases. The search process requires a minimum of
one I/O operation to each lib.sublib in the phase search
chain until the phase has been located. The second problem is
loading the phase into storage. Loading the phase is done with
a single I/O operation unless the phase is fragmented. Many installations
never de-fragment their libraries so fragmented phases are very
common. The third problem is that most phases are loaded twice.
The first load request extracts the directory information so that
storage can be allocated while the second load request actually
loads the phase. This results in the directory search process
being invoked twice to load the same phase.
Applications like ICCF, LIBR, LNKEDT (and many third-party software packages) are impacted by the design of VSE libraries. These applications use VSE libraries for data storage and, due to the structure of VSE libraries, they tend to perform very poorly.
The poor design of VSE libraries makes them very good cache candidates. VSE libraries use 1024-byte (1K) physical records, resulting in 33 records/track (varying depending on the device type used). Reading 33 records at 18ms per record takes 0.594 seconds, yet reading the whole track (using a read track command) takes only 0.028 seconds. So, caching a library can reduce the time required to access data by up to 95%.
Caching VSE libraries results in dramatic improvements in the performance of the applications using them.
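The track-read arithmetic quoted above can be checked with a short Python calculation. It uses only the figures given in the text (33 records per track, 18 ms per record, 28 ms per full-track read). The variable names are chosen for this example.

```python
RECORDS_PER_TRACK = 33   # 1K physical records per track, from the text
MS_PER_RECORD_IO = 18    # time for one single-record I/O, from the text
MS_PER_TRACK_READ = 28   # time for one full-track read, from the text

# Reading the track one record at a time versus in a single operation.
record_by_record_ms = RECORDS_PER_TRACK * MS_PER_RECORD_IO
reduction = 1 - MS_PER_TRACK_READ / record_by_record_ms

print(f"record by record: {record_by_record_ms} ms")
print(f"full track read:  {MS_PER_TRACK_READ} ms")
print(f"time saved:       {reduction:.1%}")
```

The record-by-record total is 594 ms, so a single 28 ms track read saves a little over 95% of the elapsed time. The figure applies when every record on the track is actually needed.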
VTOCs
The Volume Table Of Contents (VTOC) of a DASD volume can be
a very active dataset. While locating an existing entry in the
VTOC requires only a single I/O operation, creating a new entry
can take hundreds of I/O operations. The Common VTOC Handler (CVH)
routines actually read the VTOC sequentially twice: the first
time to locate an empty VTOC entry and the second time to update
the selected VTOC entry. DASD space management software adds an
additional level of overhead that (in some cases) requires the
VTOC to be read a third time. Caching the VTOC reduces the number
of I/O operations required from hundreds to as few as two. This
can result in dramatic improvements in performance, especially
for VTOCs of volumes containing large numbers of dynamic sequential
work files.
Label Areas
The DOS.LABEL.FILE can be one of the most active datasets
on a VSE system. Yet, label areas located on Count-Key-Data (CKD)
or Extended-Count-Key-Data (ECKD) devices have very poor performance
characteristics. CKD/ECKD label areas use unblocked keyed records
resulting in poor disk space utilization. In addition, each file
being opened may require searching up to four different areas
within the label dataset before locating the record needed. Since
each unsuccessful search must be followed by a sense to determine
that a no-record-found condition occurred, up to seven I/O operations
may be required to locate the specific label. Fixed Block Architecture
(FBA) label areas have better characteristics because records
are blocked and have no keys. Caching the label area (or moving
it to a virtual FBA device) can reduce the total number of I/O
operations done on a system by 10% or more.
Shared DASD
To avoid integrity problems, caching products cannot automatically
cache data on shared DASD. You can, however, fully use a caching
product while ensuring data integrity. For a dataset used by multiple
systems, simply specify that it be write-protected and never cached
on non-production systems, and let the caching product cache it
normally on the production system. This strategy works well for
VSE libraries where updates can be done from a central or production
system. Of course, in the real world, while the volume may be shared
between multiple systems, most datasets are used by one application
at a time. Informing the caching product of these datasets allows
normal cache function on a shared device.
VSE/ESA 2.1 and the Turbo Dispatcher
The amount of time your system spends running non-parallel
work units is critical to effective use of multiple CPUs. Typical
examples of non-parallel work units are most system services and
key 0 programs (such as VSE supervisor services, POWER and VTAM).
While VSE may be able to support up to 10 CPUs, only one CPU can
process non-parallel work units at a time. Vendor products tend
to make extensive use of key 0 programming, increasing the number
of non-parallel work units that must be run. The best way to decrease
the number of non-parallel work units is to reduce the number
of I/O operations and supervisor service requests (SVCs). Sounds
easy, right? Less CPU is used to cache an I/O request than to actually
perform the I/O operation. In addition, caching the I/O request
requires only one non-parallel work unit while actually performing
the I/O operation requires at least three (and maybe more). Caching
usually reduces the number of I/O operations by 35-65%, so the
non-parallel work unit and CPU savings add up.
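A rough Python estimate shows how these savings add up. It assumes exactly three non-parallel work units per real I/O operation (the text says "at least three", so this is a lower bound) and one per cached request. The function name and the 1,000-request workload are made up for the example.

```python
UNITS_PER_REAL_IO = 3    # lower bound taken from "at least three"
UNITS_PER_CACHED_IO = 1  # from the text


def non_parallel_units(total_requests, cached_requests):
    """Estimate non-parallel work units for a mix of real and cached I/O requests."""
    real_requests = total_requests - cached_requests
    return (real_requests * UNITS_PER_REAL_IO
            + cached_requests * UNITS_PER_CACHED_IO)


baseline = non_parallel_units(1000, 0)
low_end = non_parallel_units(1000, 350)   # 35% of requests satisfied from cache
high_end = non_parallel_units(1000, 650)  # 65% of requests satisfied from cache
print(baseline, low_end, high_end)
```

Under these assumptions, 1,000 I/O requests cost 3,000 non-parallel work units without caching, 2,300 at a 35% reduction, and 1,700 at a 65% reduction.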
Dramatic Performance Improvements
Deferred Hardware Expenses
Caching reduces the stress on your current I/O subsystem hardware.
Typical users see a 35-65% reduction in the actual number of I/O
operations to their DASD subsystem. This means that operations
actually performed are processed more efficiently because the
channels, control units and devices are less busy and are able
to respond with greater efficiency. Therefore, because existing
hardware usage is reduced, planned upgrades can often be deferred
or even eliminated.
Improved On-line Performance
CICS is often the single most important application used at
an installation. Improving response time can be very difficult.
Caching dramatically improves CICS performance by reducing the
amount of time each transaction must wait for I/O operations to
complete. This improvement typically results in a 35% reduction
in on-line response time.
In addition, installations using ICCF transactions for operation
and program development functions notice an immediate and dramatic
improvement in the productivity of these users.
Improved Batch Performance
Many installations would like to reduce the duration of their
nightly batch runs. Caching will usually result in a 10-20% reduction
in the duration of nightly batch runs. While each individual job
is different, some job executions are reduced by 10 minutes, others
5 minutes, still others 15 minutes. The result is a 10-hour nightly
batch run that now takes only 8.5 hours.
Moral of the Story
Caching results in dramatic performance gains, allowing your
CPU to process more data in less time than ever before. Improved
online response time and reduced batch execution time increase
system productivity and reduce the overall cost of running your
computer operation.
Copyright 1995 by Barnard Software, Inc.