oltpsetup

OVERVIEW
PROJECTS
DIRECTORIES
PREPARATION
EXPERIMENTS
PARAMETERS
BURSTS
CONNECTION STORMS
24H SIMULATION
VERIFICATION
REMOTE EXECUTION
REMOTE REPORT REPLICATION
SAVING PASSWORDS IN KEYRING
UNDOCUMENTED PARAMETERS
BUGS
COPYRIGHT
SEE ALSO

OVERVIEW

This page shows how to setup and configure the oltp workload that is included with the RWP*Load Simulator

PROJECTS

On the system, that will be your application server, you will be running one or several projects that typically differ in how and to which database they connect. Each project must have a short name such as dbone, dbtwo, etc. For the rest of the discussion on this page, the project name key is used.

DIRECTORIES

In order to run the oltp workload, you need three directories:

A working directory where you keep the files that typically get modified while you use the oltp workload. Each project will be identified by its name and there will be two files associated with each project; these files must be stored in this directory. It is recommended that you create a directory separately from your home directory, but you can very well have multiple projects in the same directory as each project normally just needs those two files.

A results directory that will be filled with sub-directories and files when you are using the oltp workload. You should have one directory per project, and you should prepare sufficient space for up to 0.5MB for each single run you perform. You normally don’t need to inspect or modify any of the files created.

An awr directory that will be filled with sub-directories, html files, graphs, etc while you perform the runs. This directory must be visible externally via a browser, and it must be writable by your login user. If you e.g. use Apache as your web server, the directory typically must exist somewhere under /var/www/html. You should plan to have sufficient space for about 2MB plus 2MB per database instance for each single run. If your webserver runs on a different system than your runs, the directory can very well be made available using NFS or similar. In addition to providing the name of the directory, you should also provide the full URL that can be used to browse to that directory.

PREPARATION

From your RWP*Load Simulator distribution, make copies of the two files from the oltp directory, oltp.env and oltp.rwl to your working directory, and make the copies named after your project. If your project it named key, the files must therefore be named key.env and key.rwl

The file key.env must be modified such that it sets the environment variable RWLOLTP_NAME to your project name such as key. If needed, you may also set environment variables like TNS_ADMIN, TWO_TASK, etc. It is also recommended to set RWLOADSIM_PATH set (include) your project working directory.

EXPERIMENTS

The oltp workload is implemented using a combination of shell and rwl scripts. If you want to experiment with changes to the rwl scripts, you can create a copy in your working directory of any of the scripts found in the oltp directory of your rwloadsim distribution. Doing so implies the script in your working directory will be used in stead of the one in your rwloadsim distribution, and you can therefore make those changes you may find useful. If you do so, it is recommended that you add a printline command in the beginning of your modified script to inform about the use of an edited rwl file.

PARAMETERS

Your projects parameter file such as key.rwl contains all settings that define your project including database credentials, connect strings, workload size, etc. Some settings are compulsory, others have defaults that you should initially not change.

The following lists all compulsory parameters.

awrdirectory

Sets the name of the directory where awr, html, graphs, etc will be stored. This is the directory mentioned above that is visible externally from a browser.

awr_url_prefix

Sets the URL prefix, that can be used from a browser to browse the awr directory. A sample contents is https://www.example.org/oltpruns/key which typically will have / and some other file or directory names appended to it. At present, it is only used when creating daily reports using oltpday(2rwl).

resultsdir

Sets the name of the results directory as explained above.

normal_connect
pool_connect
batch_connect

Sets the connect strings either in URL style syntax such as //host/service or as names in tnsnames.ora that will be used when the oltp workload is executing its ordinary transactions. The first is used in the few cases where pooling is not in effect, the latter are used for the vast majority of cases where the transactions use pooling, pool_connect for the typically short transactions, batch_connect for the more batch oriented transactions.

In most cases, the three can be set to the same value.

If your database under test is an autonomous database, you should choose the connection names that have _tp at the end.

pool_type
batch_type

Normally the vast majority of transactions use session pools as described above. You can alternatively run either of the two (short and more batch oriented) using either dedicated or drcp connections to the database by setting either of these two parameters to the respective value.

Note that setting either to drcp implies each worker thread will use its own separate DRCP connection to the database, and that no ordinary session pool will be created. In this case, you must include :pooled in URL style connect string or (server=pooled) in a tnsnames.ora entry; rwloadsim does not verify this is done.

To achieve the typical DRCP use case where DRCP is used together with the session pool for either, you should keep the respective parameter at its default setting implying a session pool is used in rwloadsim, but additionally have :pooled (or server=pooled in tnsnames.ora).

The settings for DRCP can also be used with cman/tdm, i.e. PRCP.

Do not set either parameter to any other value.

rwl_min_pool
rwl_max_pool
rwl_pool_release

When pool_type is set to its default value of "sessionpool", these three parameters control the size of the normal pool in each process. The two former set the smallest and highest number of database sessions in the pool, and the last parameter sets the timeout after which idle sessions in the pool are released by disconnecting from the database.

The default values of respectively 3, 10 and 5.0s are normally appropriate.

max_sessionwait

This is the maximum time in seconds a worker thread will wait for a session to be made available or to be created in the pool. The time may typically expire if the pool is already at its maximum size and all other workers spend long time before returning their session to the pool. This will imply the transaction will be dropped with the drop count visible in the persec table with a vname of ’call_failure’.

The default value is 30.

cruser_connect

Is set to the connect string that will be used when schemas are created and filled with data. In most cases, it can be the same as the normal_connect, but if your database is an autonomous database, using the name with _tpurgent at the end may make data fill faster as it allows parallel processing.

cruser_username
cruser_password

Set these to a username respectively password that will be used for schema creation combined with the cruser_connect connect string. The typical username is "system" or "admin" in case of autonomous databases.

system_connect
system_username
system_password

These will be used during actual execution of your runs to primarily run queries against v$ tables etc. The typical username is "system" or "admin" in case of autonomous databases. You can normally set them to the same values as used for schema creation, although the connect string should be the one ending in _tp in case of an autonomous database.

Unless the next three parameters are provided, the credentials will also be used for gathering awr snapshots.

sysawr_connect
sysawr_username
sysawr_password

By default, awr snapshots and reporting will use the credentials provided by the three previous parameters. But if your database is a PDB, and if you have access to the root container, you should set these parameters to the credentials for the root rather than for the PDB. Doing so will make the awr reports more comprehensive as they include data from all PDBs.

If you (as recommended) have a separate database for your RWP*Load Simulator repository, set these:

results_in_test := 0;
results_connect
results_username
results_password

Setting results_in_test to zero implies the repository will not be created in your database under test. In that case, the other three parameters must be set to point to your already created repository database.

If you want to perform some quick tests and do not have an existing RWP*Load Simulator repository, you can set these parameters:

results_in_test := 1;
results_connect
results_username
results_password
rwloadsim_dir

Setting results_in_test to non-zero implies the repository will be created in your database under test unless one has already been created there. The three following parameters will be used to create the repository schema and later to access it. The rwloadsim_dir parameter must be set to be the full path-name to the admin directory of your installation of rwloadsim.

During execution, the oltp workload needs five different database schemas for which you must provide usernames and passwords. This is done using these ten parameters:

rwl_aw1_username rwl_aw1_password
rwl_aw2_username rwl_aw2_password
rwl_oe_username rwl_oe_password
rwl_run1_username rwl_run1_password
rwl_run2_username rwl_run2_password

The usernames can be any allowed schema names as long as the schemas not already exists. It is, however, important that the names for the last two can be identified using a like condition, so these two are typically named RWLRUN1 and RWLRUN2, as the like condition RWLRUN_ can be used to identify those. The usernames must be written in upper case. The passwords must be chosen to match any password restrictions of your database.

rwl_run_like

Set to the like condition that identifies the last two schema names mentioned above. The like condition must be written in uppercase.

default_tablespace

The three first schemas mentioned above will contain actual data, and they must therefore have a proper default tablespace configured. This parameter names that tablespace. A very rough size estimate is that you will initially need up to 10GB, which mostly depends on the size of your aw_cols tables. If you are going to make many runs, size can grow to 100GB or beyond.

The next three parameters control how big the aw_cols table, how skew the access to it is, and how many rows are read by each query. If the table is almost as large as (or even larger then) your buffer cache, you can get massive reads if there is no skew in access to the table. The purpose of the settings is to be able to adjust the actual amount of physical reads in your database under test. Se the oltp.rwl file for further information.

aw_cols_256MB_doublings

The table is created by first filling it to roughly 256MB and then doubling it a number of times; this parameter sets the number of times it gets doubled from the initial 256MB. The default is 4 resulting in an aw_cols table of an approximate size of 4GB.

aw_cols_skew

If not set, all rows in the aw_cols table will be uniformly queried. If you set a value, the rows actually queried will be skewly distributed, the higher the value, the more skew distribution. With higher skew, the buffer cache gets more effective. Typical values are in the range between 5 and several hundred, but you will normally need to experiment, in particular if your aw_cols table is as large as or ever larger than your buffer cache.

Note that if your aw_cols is so large that it will not fit in your buffer cache, you cannot completely prevent reads even with very high values for aw_cols_skew, so the awr report will be likely to report that your SGA is undersized. It is debatable if this is a bug or a feature of the oltp workload, and ideas for a better scheme are welcome.

aw_cols_query_factor

This parameter sets the average number of rows being read in each aw_cols query. Due to the extremely poor index clustering factor, these rows are likely to be distributed on the same amount of database blocks. The default value is 50, which is a good starting point for experiments.

The following parameters impact the generated output and graphs.

rwl_title

Set to a string that will be put on all html files, graphs, etc that are generated.

rwl_daily_html

Set to name the file that will contain daily overviews when using the oltpday(2rwl) utility; it should name a html file.

rwl_heading

Set to a string that will be the heading for the daily reports, generated using oltpday(2rwl).

The following parameters are used to control the actual workload, and they have reasonable default values. Note that only some parameters are documented here, there rest are discussed in the oltp.rwl file that you copied to a file named after your project such as key.rwl.

ratefactor

This is a double value (default 0.2) that sets the overall workload. You can very approximately assume that with the default value together with the default value for threadcount, making runs with one process will require in the order of 0.2 database CPU seconds per second.

rategradient
rategradientstep

These are double values, and if the first set set, the overall workload increases gradually during a run. If the value of rategradient is 0 (or is unset), the arrival rate will be constant during the run; this is the default behavior. If the value is set higher than 0 and at most to 1, the arrival rate will initially be lower than specified via ratefactor, half way through the run it will be what ratefactor specifies and at the finish of the run, it will be higher. Effectively, the rate will gradually increase from 100*rategradient percent below normal to 100*rategradient percent above normal. Values outside the range [0;1] will be adjusted to either 0 or 1.

If the second parameter is set to a value of 10 or higher, the increase will be stepwise rather than gradual and the increase will take place every rategradientstep seconds.

threadcount

This is an integer value (default 50) that sets the number of worker threads per process. If you want to experiment with, as an example, twice the load, you can do so by doubling either of ratefactor, threadcount, or the number of processes.

time_out

When time_out is set to an actual value of 0 or more, any transaction that is delayed by more than the value of time_out seconds will be dropped in stead of being actually executed. When a transaction is dropped, no database session will be acquired, and the transaction is not executed. In stead, the drop count will be increased by 1. This is only likely to happen when the database is under pressure and when the actual time to get a database session and execute a transaction is higher than the inverse of the arrival rate. The drop count is visible in the persec table with a vname of ’call_failure’.

Note that this is different from the max_sessionwait parameter which sets the maximum time a worker will wait for a session.

The following parameters with names starting with fix_ have effect when the execution is fixed at starttime, i.e. when external control is not used.

fix_logoffrate

This is a double value that can be set to increase the logon activity by causing session release to also disconnect. By default, this does not happen, but if set to a value in the range ]0;1], the value sets the proportion of sessions doing the make_order transaction that will disconnect during session release. The parameter only has effect when the normal pool is a session pool.

fix_cursorleak

This is a double value that can be set to make the application leak cursors at a certain rate. By default, this does not happen, but if set to a value in the range ]0;1], the value sets the proportion of cursors that will be leaked. If you want to experiment with the, start with very low values such as 0.001 and be prepared that you will receive ORA-01000 errors.

fix_hardparse

This is a double value that can be set to make hard parse take place. By default, no hard parse takes place, but if the parameter is set to a value higher than zero, hard parse will take place. The valid range is [0;1].

orders_hashcount

If your database does not have partitioning, set this to 0, otherwise leave it at the default. This must be done before you run oltpcreate(2rwl).

runplot_width

If you are using -g or -G to show running graphs, they will by default show the latest 60s of activity, unless this parameter is set to a different value.

gnuplotjs

This parameter is no longer used. However, if you are using gnuplot version 4.6 you need to make the contents of /usr/share/gnuplot/4.6/js/* available at usr/share/gnuplot/4.6/js/ seen from the root of your http daemon.

globalawr

If you set this integer to 1, a global awr report will be created.

BURSTS

You can include four different types of "bursts" in your run; one does a burst using a higher average transaction arrival rate, the other does a burst using a higher number of worker threads, the third does an activity burst by changing (logged on) sessions from being idle to being active, and the last is an implicit burst being the result of a period with reduced session pool sizes.

For the first, you must provide a factor, by which the arrival rate increases, and for the second, you must provide the count of extra worker threads. For both, you provide the time in seconds from the beginning of the run when the burst should start and the length of the burst in seconds.

To include a burst with higher arrival rate, include these parameters:

burst_start
burst_length
burst_factor
burst_repeat
burst_function

The two first parameters set the time to start and length of the burst, and the third parameter is the factor that is applied to the arrival rate during the burst. If burst_repeat is set to a value larger than the sum of burst_start and burst_length, the burst will be repeated very burst_repeat seconds. By default, the burst is not repeated.

You can additionally declare a function named burst_function with the following shape:

function burst_function(double runsec)
return double
is
# your code here
return 1.0;
end burst_function;

If the function is declared, it will be given runseconds() as its argument and it should return a factor that will be used as a multiplier to the ratefactor otherwise in use. If you want the multiplier to depend on the actual runtime, you can use the variable runperiod which contains the value of the -r argument to oltprun. The sample code shown here will have no impact as it always returns the value 1.0. Note that this is an experimental feature and that it may be changed in a later release.

To include a burst with extra worker threads, include these parameters:

wburst_start
wburst_length
wburst_count

The two first parameters set the time to start and length of the burst, and the third parameter is the number of extra worker threads during the burst. To prevent all processes from increasing the worker thread count at the same time, you can include procnumber in your calculation for wburst_start. As an example, if you set wburst_start to 200+procnumber, the increase in worker threads will be at 200s for process 1, 201s for process 2, etc.

All parameters are double except wburst_count that is an integer, and you can very well include both types of burst.

The third type of burst is provided by specifying a point in time during the run where the activity changes. All workers are connected (assuming no pooling is in place), but before that point in time, only some workers are active. When the point in time for the change is reached, all workers get active, which will cause an activity burst, which results in e.g. many cursors being (soft) parsed at the same time. The four parameters controlling this are:

aburst_workers
aburst_pre_factor
aburst_post_factor
aburst_start

In the period until aburst_start, only aburst_workers do actual work while the remaining (i.e. threadcount-aburst_workers) are connected but are idle. The arrival rate in the active workers during this period is a factor aburst_pre_factor higher than normal. If the value aburst_pre_factor*aburst_workes/threadcount is 1.0, the effective arrival rate during this period is the same as it would have been without an activity burst. In the period after aburst_start, all workers are active, in the arrival rate of these is the normal arrival rate multiplied by aburst_post_factor. The default for both factors is 1.0.

The last type of bursts comes due to a backlog of transactions that are delayed during a period with reduced sessions in the normal pool. It only has effect when using session pools.

To include a temporary delay by reduced session counts, include there parameters:

delay_start
delay_length
delay_min_pool
delay_max_pool

The two first parameters (double) set the time to start and the length of the delay, and the two latter parameter (integer) are the min and max number of sessions in the normal pool during the delay. If delay_length is zero, no delay will take place.

CONNECTION STORMS

You can emulate connection storms starting a certain timestamp during the run. The storm can come from two souces:

	1.		Having a subset of worker threads that block a session while otherwise doing nothing. At the same time, an other set of worker threads will get active and as a consequence the session pool will increase its size, while the expected arrival rate stays the same.
	2.		Making a number of database logons that are kept open but without doing any actual activity.

The following parameters control connection storms:

cstorm_start
cstorm_period
cstorm_threads

cstorm_xtra_start
cstorm_xtra_period
cstorm_xtra_logons
cstorm_xtra_rate

cstorm_xtra_username
cstorm_xtra_password
cstorm_xtra_connect

The first parameter sets the time when the first type of connection storm will start. The next two parameters sets for how long the block will last, and the count of threads that will block a session. If cstorm_threads has its default value of zero, there will be no storm from blocking sessions.

The cstorm_xtra_start and cstorm_xtra_period similarly control when the extra connections (that do not do any actual work) will initiate and for how long they will last. The default values are the same as for the start and period of the blocking sessions. The next two parameters set the number of extra logons to establish, and the rate in logons per second. If cstorm_xtra_logons has its default value of zero, no extra logons will be started.

By default, these extra connections use the rwl_run1_username, rwl_run1_password credentials with the normal_connect connect string. You can overwrite these three by providing the last three parameters mentioned above. Note that if the extra connection storm is to a different database than your normal connections, various database statistics will not include those from the extra connection storm.

You can combine the connection storm with the first type of burst mentioned above.

Note that this is an experimental feature that may change in a future release.

24H SIMULATION

You can use oltprun to simulate a full day workload with variation over time (day/night) and with inclusion of variout types of "issues" at intervals during the run. This is enabled by including

s24hour := 1;

in your key.rwl file together with several other parameters that all have names beginning with s24h_. The most important ones that all are of type double are:

s24h_high_beg
s24h_high_end
s24h_load_hi
s24h_load_lo

The first are used to set the range of the "high" period, simulating daytime with high activity. They are multipliers to the runtime (provided using the -r option to oltprun) and set the beginning and end of the simulated daytime. The s24h_high_beg values should be less than the s24_high_end value that should be less than 1.0. Default values are 0.30 and 0.85 implying the high period will be from 30% to 85% into the run.

The latter two parameers are factors that are applied to the overall loadfactor during respectively the high ("daytime") and low ("nighttime") periods. Default values are 0.2 and 1.0 implying the low period will be 20% of the load during the high period.

24h simulation is considered an experimental feature and the full documentation on all parameters is currently only available in the oltp.rwl file that you copy to your key.rwl file. You can very well combine 24h simulation with bursts.

VERIFICATION

To verify the directories and files are setup properly, run oltpverify -d.

To verify the settings for system users are correct, run oltpverify -s.

To verify you can connect to a repository, run oltpverify -r.

To create all schemas, run oltpcreate; to drop all schemas expect the repository, run oltpdrop.

To verify schemas are properly created, run oltpverify -u.

REMOTE EXECUTION

Normally, all the background processes started by e.g. oltprun are started on the same single host. If that host is running out of CPU resources, you can prepare distribution of the processes to multiple servers. This is done by declaring a procedure named rwlcall taking a single integer argument. The argument provided will be a process number in the range from 2 until the value of the -n argument to oltprun. The process with process number 1 will always be started on the same host where oltprun is executed. The procedure must output exactly two lines that are setting sh variables named rwlcall and rwlcallargs. Those two will be used to start the actual background process. A sample declaration that will distribute the workload to three different hosts is the following:

procedure rwlcall(integer prno)
is
printline "rwlcall=’ssh’";
printf "rwlcallargs=’-n -tt appsrv%02d.domain.tld /path/to/remoterwloadsim.bash’\n"
, prno%3+1;
end rwlcall;

A template for the bash script remoterwloadsim.bash is found in the oltp directory of your rwloadsim distribution. It should be copied to all remote systems and modifed according to the instructions in the file, setting the proper environment to execute rwloadsim remotely. Note that remote execution is an experimental feature that may change in a later release.

If the procedure is not declared, the following will be used

printline "rwlcall=’rwloadsim’";
printline "rwlcallargs=’’";

implying rwloadsim will be started locally. This is always used for process number 1, as it must run locally.

REMOTE REPORT REPLICATION

When rwloadsim is running on a machine that cannot host a webservice, it may be useful to replicate the report data to a remote machine. This can be achieved by setting the awr_replication_cmd parameter to a custom command that replicates the data after the report is generated. By default, awr_replication_cmd is not set, so no replication occurs.

If awr_replication_cmd is defined, it will be executed each time a report is generated, allowing automated replication to the specified remote machine. The command typically uses rsync to securely copy the report data. Below is an example of how to configure awr_replication_cmd with rsync:

rsync -avz -e "ssh -i /path/to/identity" \
--rsync-path="mkdir -p /path/to/replicate${subdir}${runnumber} \
&& rsync" $awrdirectory${subdir}${runnumber} \
user@host:/path/to/replicate${subdir}${runnumber}

In this example:

- -avz options enable archive mode (preserving file permissions, timestamps, etc.), verbose output, and compression during transfer.

- -e specifies the SSH command, here with a custom identity file (/path/to/identity).

- --rsync-path is used to ensure that the remote directory structure exists by creating the necessary directories with mkdir -p before starting the transfer.

- The path variables $subdir and $runnumber dynamically set the subdirectory and run number for replication, creating an organized directory structure on the remote machine.

This approach ensures that the report data is replicated to the specified path on the remote host, facilitating centralized data access even when rwloadsim is executed on a machine without webservice capabilities.

SAVING PASSWORDS IN KEYRING

If you have gnome keyring available, you can use that to save passwords rather than having them in your $RWLOLTP_NAME.rwl file. To do so, enter the needed passwords using secret-tool store with the attributes connect and username for the connect string and usernames respectively. The sample oltp.rwl file has a function called keyring_password that will use secret-tool to lookup the password. To use it, you need to uncomment the relevant lines in your $RWLOLTP_NAME.rwl file.

The primary use of this is to safely show your $RWLOLTP_NAME.rwl file in e.g. Zoom calls or in screen dumps. Note that passwords will still be visible if you call oltpparams with the --human option or if somebody can log in to your account and your gnome keyring.

UNDOCUMENTED PARAMETERS

There are some extra parameters, that you can set in either your key.rwl file or using -i or -d on the oltprun command line. To get documentation for these, you need to look at the parameters.rwl or oltp.rwl file in the oltp directory.

BUGS

All oltp commands accept -H to get help, many also accept --help, some also -h for that purpose while -h is used to provide actual argument values for a few commands. For rwloadsim itself, -h or --help is the option to output help while -H is used to output help from useroptions. The best way to get command syntax in a consistent way is to use rwlman; the second best is to try with a --help option, which works in all frequently used commands. This is confusing.

Some oltp commands accept both single letter and gnu style long options, which is also confusing.