Apache PL/SQL Gateway OWA Module (mod_owa)
Doug McMahon
May 8, 2023
Version 2.11.17
Copyright (c) 1999-2023 Oracle Corporation, All rights reserved.
This module is a demonstration and example of the use of Oracle technology.
It is made available for use at your own risk with no warranty of any kind.
Read the open-source license to make sure you understand the terms of use.
The module has been approved for export from the USA; the ECCN is EAR99,
NLR, with reference #01143.
*******************
Release Information
*******************
Description
===========
mod_owa is an open-source implementation of the PL/SQL gateway available
in Oracle's OAS/OWS product. The gateway is able to call PL/SQL procedures
in a manner similar to the way Java servlets are supported, with the
Oracle database acting as the "virtual machine". Arguments are marshalled
through SQL*Net to your procedures, and responses returned the same way.
HTTP header elements are made available via a CGI-like mechanism.
The original version of mod_owa was written by Alvydas Gelzinis
(alvydas@kada.lt) and Oksana Kulikova (oksana@kada.lt). My version is
a rewrite, adding functionality omitted by the original version, fixing
several bugs, and adding some performance improvements. The motivation
for this was to provide access to OWA-based content through the Apache
web server at a time when Oracle lacked an Apache strategy. Edward Jiang
(Edward.Jiang@oracle.com) contributed the file caching and range transfer
support.
Oracle has now released a Java version of the PL/SQL gateway, as part of
ORDS:
http://www.oracle.com/technetwork/developer-tools/rest-data-services/overview/index.html
This servlet is professional-quality, well-tested, and officially
supported. Although it's not open source, it's freely downloadable
and should work in your choice of Java application server. It is
100% compliant with the gateway spec (including the NLS/multi-byte
support and document upload/download) and supports additional functionality
needed by Apex. I encourage anyone considering the mod_owa gateway to
look at this alternative.
Oracle has its own Apache PL/SQL gateway module, modplsql, but this
module is being desupported. Though functionally similar, modplsql
and mod_owa share no source code. modplsql and the scripts for the
database (PL/SQL) side of the gateway are bundled with the Oracle database,
including the XE version, and with the Oracle application server. The
modplsql code, absent the Apache linkage, is also now built directly into
the database listener, allowing you to run PL/SQL applications directly
against the database without Apache.
Please note that mod_owa doesn't support numerous features that have been
added to modplsql, and therefore doesn't support some Oracle PL/SQL
applications, likely including Apex. A particular area of incompatibility
is file uploading and downloading, as documented in later sections. Some
users have, for various reasons, been required to avoid using the Oracle
HTTP listener (Oracle's Apache) and the Oracle application server. Note
that even in such cases, it should be possible to use mod_proxy to direct
PL/SQL traffic from your Apache site into the built-in (EPG) listener. This
article may be of help in configuring the EPG, either for direct use instead
of Apache, or as a back-end to a proxying Apache:
http://www.oracle-base.com/articles/10g/dbms_epg_10gR2.php
It is assumed that you've already obtained and installed the OWA PL/SQL
procedures supplied by Oracle with OAS, iAS, Portal, and several other
products. That code has not been approved for open-source license so I
can't make it available here. You can, of course, implement your own
OWA work-alike. An incomplete sample of one is included in the test
scripts, in file modowa.sql.
The module requires an Oracle client installation to run. Specificially,
it requires a so-called ORACLE_HOME area to be installed, complete with
the Required Support Files (RSF) including the OCI client library.
Versions of the module exist for Apache 2.4, 2.2, and 2.0. The older
1.3 version is deprecated. This documentation is oriented towards the
2.x versions, though 1.3 is discussed for the sake of those with older
releases. The code was designed for Oracle 8.1.x, and is currently
built with the Oracle 11g client libraries. It contains calls that won't
be satisfied by the oldest 8.0 versions of the client libraries.
A few new features have been added that have not been well-tested and
that may not be generally useful. Use them with caution. In particular,
the new OwaAlternate modes supporting request timing and logging are
very specialized to the user who requested the feature, and may require
you to change owalog.c if you want to use it. The support for WebDB
file upload and download and Apex-compatible PL/SQL calling modes is
also experimental.
If you have problems or questions, you can e-mail at Doug.McMahon@oracle.com.
Improvements and Enhancements
=============================
* Return OK for blank pages from PL/SQL if OwaHttp is configured.
* Add OwaDadName directive.
* Treat PATCH as a REST method for OwaHttp.
* OwaReject does exact matches for PACKAGE.PROCEDURE strings.
* Allow up to 32512 bytes wide for array bindings.
* OSX port by John T. Chung of Nyquest Consulting, LLC.
* Strip carriage returns and newlines from outbound headers.
* Use streaming read/write loop for large REST bodies.
* Make 64-bit LOB interfaces the default.
* Pass client IP address from owad if available.
* Support ETag, Expires, and Last-Modified headers for OwaDocTable downloads.
* Add REF cursor return mode.
* Added OwaHttp and OwaDav to support REST operations.
* Added Apache 2.4 version (same as 2.2 version).
* Add WDB_GATEWAY_LOGOUT cookie support for Basic authentication.
* Add OwaContentType to allow different defaults for responses.
* Add OwaCharsize directive to work around PL/SQL gateway bug.
* Add support for external authentication via Oracle Wallet or OS.
* Hack OwaLDAP to allow for CGI environment supplied basic authentication.
* Hack OwaLDAP to allow for NT authentication via mod_auth_sspi.
* Allow OwaStart to send a redirect.
* Add new OwaAlternate modes to support logging (experimental).
* Add OwaWait and OwaOptimizer directives.
* Add OwaDocTable directive and partial support for WebDB document API.
* Add OwaLDAP directive.
* Add errmsg to OwaSqlError URI.
* Added Apache 2.2 version.
* Added synonym support and caching to describe logic.
* Support 2-argument flexible argument mode.
* Add the OwaSession directive.
* Disallow punctuation and spaces in argument names.
* Add OwaSqlError directive.
* Add support for transfer of unprocessed POST data as RAW.
* Add new ERROR diagnostic flag.
* Allow OwaPool to be thread-limited with THREADS.
* Add OwaEnv directive.
* Add ability to store/retrieve files in file system.
* Add OwaReject directive.
* Add support for LocationMatch (regular expressions).
* Round array sizes and buffer widths to increase cursor sharing.
* Add TIMING diagnostic.
* Add GET_PAGE_RAW support and RAW binding mode.
* Add OwaReset for advanced package reset modes.
* Add support for NCHAR/NVARCHAR binding modes with OwaUnicode.
* Add time zone and date formatting directives (for Oracle 9i).
* Support native DB logins with oracle_realm directive.
* Add relaxed describe mode and describe schema.
* Add hooks for before/after procedures.
* Updated Apache 2.0 implementation to 2.0.16 beta.
* Reimplement latches using pthread mutexes.
* Delay RESET_PACKAGE call until after request is handled.
* Use pointer-array binding for flexible argument values.
* Add SQL statement diagnostics triggered by POOL flag.
* Get realm from OWA_SEC and generate Basic challenge.
* Pass Basic authentication header to OWA_SEC.
* Remove parameter and collection binding limits.
* Allow additional header elements to be returned for document downloads.
* Support Location = "/" and document_path = "/".
* Log SQL errors to Apache error log.
* Added "chunking" of oversized arguments in flexible calling mode.
* Added oracle_proc directive.
* Support redirects from POST requests.
* Pass Authentication header through to OWA CGI.
* Added backport to 7.3, added LOBs and describe to 8.0 backport.
* Added a connection pool cleanup thread.
* Added support for range-based file downloads.
* Added support for file-system caches.
* Added directive to set default start page.
* Added support for using alternate OWA implemenations.
* Improved handling of multi-byte character sets for content
uploads and downloads.
* Support BFILEs for content downloads.
* Support binary content returns for single-byte character sets.
* Added support for promotion of singleton arguments to array binding.
* Added support for file uploads and downloads through LOBs.
* Added support for legacy file up/download through LONGs.
* Ifdef'ed code to enable builds against OCI 8.0 (leaves out the LOB
support and the NLS check).
* Switched connection pool to semaphore-based model.
* Improved directive processing to eliminate need for quotes around
multi-valued attributes (e.g. oracle_diag).
* Added a parameter for DAD character set; send character set where
possible on HTTP response headers.
* Added the ability to recover from a database shutdown/restart that
invalidates connections in the pool.
* Add OwaHeader directive.
Bugs Found/Fixed
================
* Ensure OCI resources are freed when database close operations fail.
* Fix a bad bug resizing arrays for multi-part/form-data submissions.
* Pass OwaStart procedure as the CGI variable PATH_INFO.
* Add space for trailing null after 32512 bytes.
* Fix missing mutex for multi-threaded connection pool.
* Fix separator scan for BIG-5, GBK, and Shift-JIS character sets.
* Strip trailing suffix on parameter names for multipart requests
when in Apex mode.
* Truncate oversized strings to 32k when passing them to CGIPOST.
* Fix a bug with flexible argument handling in Apex mode.
* Get correct media type for REST requests.
* Report OS username on external authentication failure.
* Fix the buffer resize in Apex argument stream reader.
* Check file-system permission problems after connect failure
with status 1804 or 12705.
* Fix a bug with multi-file uploads when not using flex arguments.
* Fix questionable initialization for temporary connection members.
* Recompute max field names in chunked mode flexible arguments calls.
* Allow PL/SQL to return permanent redirections.
* Treat logon warnings as success (e.g. password expiry warning).
* Fix a bug with 0-argument redescribes to flex-argument calls.
* Better support filename-driven WPG_DOCLOAD file downloads.
* Allow Content-Disposition for OwaDocTable file downloads.
* Fixed array-fetch bug in REF cursors.
* Fixed handling of 0-argument arrays for flex args calling mode.
* Special processing for Windows file separator with multi-byte character sets.
* Skip OCI LOB operation for WPG-style uploads of 0-length files.
* Fix flexible argument mode for WPG-style file uploads.
* Fix thread handles for Unix to accomodate unsigned values.
* Fix a crash with config pointer deref in mutex init for Apache 2.x.
* Fix handling of Basic auth session reuse.
* Fix describe of PUBLIC synonyms.
* Fix OwaAdmin, broken in a previous change.
* Guard cleanup thread agains race conditions during shutdown.
* Allow flexible argument calls to POST via multipart/form-data.
* Add argument chunking to file upload to avoid 4000 byte OCI limit.
* Limit size of PL/SQL string buffers to 32512 bytes per PL/SQL maximum.
* Change describe processing to skip signature mismatches in normal mode.
* Fixed redirect processing for Apache 2.0.
* Merge URL arguments for POST requests including file uploads.
* Fix bind width of reserved argument for file uploads.
* Avoid deadlock with 0-length content in POST requests.
* Switch POST request reads to standard Apache APIs.
* Unescape argument names in case of unusual characters.
* Suppress SQL diagnostic page for failures during reset/disconnect.
* Always truncate LOBs after write (in case of a shorter update).
* Fix missing child callback functions for Apache 2.0.
* Allow pages to be returned with Content-Length of 0.
* Use SCRIPT_NAME instead of REQUEST_URI when fixing Apache CGI problems
(for compatibility with mod_rewrite).
* Fix broken logic in parse of OwaReset directive.
* Skip additional header elements in file uploads (for Mozilla).
* Fix incorrect swap of database character set and client character set.
* Prevent whitespace or punctuation in procedure/command names.
* Fix missing DAD character-set conversion for flexible argument mode.
* Fix cleanup-thread mutex initialization problem.
* Workaround for database crashes due to USER_ID/PASSWORD assignments
(see plsqlbug.htm).
* Create PATH_INFO from SCRIPT_NAME when not found in CGI environment.
* Make INIT_CGI_ENV call last to avoid shadow-process crash.
* Add work-around for OCI 0-width collection bind bug when doing
array promotion.
* Fixed bug with piecewise LONG file uploads/downloads.
* Fixed bug with boundary scan logic for large uploads.
* Work around keepalive problem with OCI bequeath connections by
marking Apache socket as "close on exec" on Unix.
* Free temporary LOB after document downloads.
* Fix bug with 0 array arguments to flexible calling mode.
* Correct error-page and header-only return processing.
* Rewrite SCRIPT_NAME and PATH_INFO environment variables for
structured Location names (e.g. "/virtual/path").
* Fix several problems with LONG and LONG RAW uploads, including the
inability to upload zero-length files.
* Prevent crash when initial database connect fails.
* Prevent crash when ARGS diagnostic used with large arguments/arrays.
* Workaround for OCI 0-width collection binding bug.
* Workaround for keepalive-related browser hangs (details at
modowabug.htm).
* Problem with misparsed headers (due to change in how header lines
are returned with the newer OWAs).
* Connection pool lockup on Windows when closed/reopened (due to omission of
semaphore reset).
* Invalid state for reusable connection if first use was for LOB
download (due to omission of parse for OWA.GET_PAGE cursor).
* Crash when displaying error messages from temporary connections
(due to unintialized error message buffer pointer).
* Intermittent spin in Windows WriteFile routine due to incorrect parameters
used to open the diagnostic file.
* Failure to read entire content for large POST requests.
* Fix behavior of OwaNLS to prioritize it over NLS_LANG.
General Release Notes
=====================
mod_owa is packed in two separate bundles, windows_all.zip for the Windows
platform, and unix_all.tgz for Linux and other Unix platforms. Both bundles
have files in a similar directory structure, described below. Note that
any ASCII-based file (e.g. source files and scripts) will be in the
carriage-control format native to the platform (i.e. with newline separators
for Unixes and carriage-return + line-feed for Windows).
src C source code for mod_owa.
doc Documentation files, including the license.
apache13 Apache 1.3 binaries and make file. The unix version is built
for Linux/X86 64-bit. This version is obsolete as of 2.10.0.
apache20 Apache 2.0 binaries and make file, also Linux/X86 64-bit.
There is no Windows version.
apache22 Apache 2.2 binaries and make file, also Linux/X86 64-bit.
The Windows version is build for Vista 64-bit.
apache24 Apache 2.4 binaries and make file, also Linux/X86 64-bit.
The Windows version is build for Vista 64-bit.
classes Binary class files for the servlet implementation
(very out of date).
java Java source files for the servlet implementation
(very out of date).
modtest Source and binaries for the simple mod_test module.
owad Source and binaries for the stand-alone owad server.
test SQL scripts and test pages for the tests and demos.
makefiles Legacy make files for different Unix platforms
(unix version only).
Windows Release Notes
=====================
Unpacking unzip windows_all.zip
Building If necessary, you can rebuild from source provided you have
the Oracle, Apache, and VC++ headers and link libraries.
The modowa.mak file has these located in places particular
to my development machine, so you will probably have to edit
the file for your environment. After this, nmake modowa.mak.
Installing Copy mod_owa.dll to /modules.
To configure, add to httpd.conf
LoadModule owa_module modules/mod_owa.dll
AddModule modowa.c
The AddModule directive may or may not be needed. It should
be implicit in the LoadModule; however, if you've used the
ClearModuleList directive, you may need to reactivate mod_owa
with the AddModule command shown. Apache 1.3 will give a
warning if AddModule was unnecessary, but will otherwise
start up correctly.
Note that the pre-built binaries are built using the Vista SDK
on Windows 7 with Oracle 12c.
Running Restart Apache. Note that the Oracle environment will be
assumed to be set in the context of the user starting
the web listener and/or from the registry. It is important
that ORACLE_HOME be set because the module must know where
the Oracle client environment is located. If you are
serving from a local database, ORACLE_SID should also be
set. Other Oracle context is taken from the environment
and/or registry as well, including important NLS settings
such as NLS_LANG.
Note: the name of the module has been changed from ApacheModuleOwa.dll
to mod_owa.dll. You can rename the file if you like; otherwise, be
sure your configuration file's LoadModule command matches the new name.
Unix Release Notes
==================
Unpacking tar xfz unix_all.tgz
Building If necessary, you can rebuild from source provided you have
the Oracle and Apache headers and the Oracle link libraries.
The modowa.mk file was written for Linux and has these
located in places particular to my development machine, so
you will probably have to edit the file for your environment.
After this, make -kf modowa.mk.
The binaries are for 64-bit x86 Linux and built with Oracle 11g.
I am no longer able to supply binaries for other Unix platforms;
please see the note below.
Installing Copy mod_owa.so to /modules (or libexec).
To configure, add to httpd.conf
LoadModule owa_module modules/mod_owa.so
AddModule modowa.c
(The module file name is mod_owa.sl on HP/UX.)
The AddModule directive may or may not be needed. It should
be implicit in the LoadModule; however, if you've used the
ClearModuleList directive, you may need to reactivate mod_owa
with the AddModule command shown. Apache 1.3 will give a
warning if AddModule was unnecessary, but will otherwise
start up correctly.
Note that the pre-built binaries are build on Fedora Core 17
(3.3 kernel) for Linux x86, using Oracle 11g libraries.
Unfortunately, Linux is fussy about the version numbers
on the OCI library, libclntsh.so, so you will probably
have to relink if you're using another database version.
I have found that libraries built in the RedHat 7.2 GCC
environment won't load into the executable verion of Apache
shipped with it, because of the posix threads problem discussed
below; replacing the shipped version with a rebuilt listener
solved the problem. Also, please note that I create symbolic
links from /usr/lib to the Oracle dynamic libraries, in particular
libclntsh.so, so that Apache will run on boot without
LD_LIBRARY_PATH being set.
Running Restart Apache. Note that the Oracle environment will be
assumed to be set in the context of the user starting
the web listener. It is important that ORACLE_HOME be
set because the module must know where the Oracle client
environment is located. Also, since the module is linked
dynamically you must ensure that the Oracle libclntsh.so
you linked with is on your LD_LIBRARY_PATH. If you are
serving from a local database and not using SQL*Net,
ORACLE_SID should also be set. Other Oracle context is
taken from the environment as well, including important
NLS settings such as NLS_LANG.
I am no longer able to supply binaries for HP/UX, AIX, or Solaris, because
I no longer have access to machines and environments where I can do the
builds. Users that need support for these should rebuild the code from
source using the makefiles supplied in the unix_all.tgz as a starting
point.
Special note: posix threads, httpd, and libclntsh.so
After upgrading to 8.1.6 and to RedHat 6.1, I discovered that mod_owa.so
would no longer load into httpd. The httpd simply fails to start,
without issuing any useful error message. I believe this problem is
due to the fact that Oracle's libclntsh.so now depends on the posix
threads library, libpthread.so, which must be linked with the main
executable, httpd, with the flag "-lpthread". If this is not done,
then when httpd attempts to load mod_owa.so, and hence libclntsh.so,
the loader will error out because it can't do a dynamic load (via the
dlopen() system call) of libpthread.so. Replacing the shipped Apache
with a rebuilt listener solved the problem; I edited the Makefile
produced by Apache's "configure" script to make sure LDFLAGS1 is
set to "-lpthread". A quick way to check this is to use the "ldd"
command to see what bindings your httpd is built with - make sure
libpthread.so is one of them!
Note: this may no longer be a problem on modern Apache builds, which
link to libpthread out of the box.
If you suspect you have this problem, I've included a very simple module,
mod_test.so, which does little more than initialize the OCI (and, by
omitting the "-DWITH_OCI" compile flag and the "-lclntsh" link flag,
not even that much). If you can't get mod_test.so to load, try building
it without the OCI dependency, and see if it loads. If it does, you
have this problem. If it still doesn't load, then you have some other
incompatibility between the build environment you're using to build
mod_test.so and the build environment that produced your httpd. Either
way, the only answer I have is to rebuild Apache from source to see
if that clears the problem.
Oracle ran into this problem with the first version of iAS, and we are
now going to ship the httpd that comes with iAS built with -lpthread
on all platforms.
HP/UX may also need to have libcl linked into httpd (e.g. "-lcl").
Older versions of Oracle on Linux (e.g. 8.0.x) did not seem to have
a dependency on libpthread. mod_owa was originally developed on
8.0.x, and so I didn't encounter this problem, or even understand it,
until after the upgrade to 8i.
OSX Release Notes
=================
An OSX binary can be produced by following the procedure for Unix;
in place of modowa.mk, use the make file osx.mk found in the makefiles
sub-directory. The OSX build was contributed by John T. Chung of
Nyquest Consulting. Please read his more detailed guide to building
the module for OSX, found in osx_notes.txt in the doc sub-directory.
Compiling, Linking, and Testing
===============================
If you need to rebuild mod_owa for any reason, you should be aware of the
following compilation flags:
WIN32 Compiles the code for Windows 32-bit.
WIN64 Compiles the code for Windows 64-bit.
UNIX Compiles the code for Unix versions.
OSX Compiles the code for OSX.
OLD_LOBS Use the old LOB read/write interfaces (2 Gbyte limit).
OVERSIZED_LOBS Use new LOB read/write interfaces (10g or higher).
OLD_MAX_STRING Limits PL/SQL array bindings to be 4000 bytes wide.
NO_FILE_CACHE Disables the file caching code (the file caching code
is not entirely portable).
NO_THREADS Disables the connection-pool cleanup thread code (which
is not entirely portable).
NO_SOCKETS Disables network/socket-related code (which is not
entirely portable).
APACHE20 Builds the Apache 2.0 linkage; the resulting module will
run under Apache 2.0 but not under 1.3.x.
APACHE22 Builds the Apache 2.2 linkage; the resulting module will
run only under Apache 2.2.
APACHE24 Builds the Apache 2.4 linkage; the resulting module will
run only under Apache 2.4.
The source files are:
modowa.h Header for everything
modowa.c Main linkage to Apache
owahand.c Core of the request handler
owaplsql.c Basic PL/SQL linkage
owadoc.c Document upload/download
owacache.c File system and shared-memory cache
owasql.c OCI-based functions
owanls.c Oracle-related character set functions
owautil.c String functions and misc. utilities
owafile.c OS-dependent I/O routines
owalog.c Apache-dependent logging code
A few very simple test procedures are supplied with mod_owa, in the form
of some HTML test forms and SQL scripts to create database test objects:
modowa.sql Creates minimal equivalent of OWA server-side.
Creates some simple packages that mimic the functionality
of the OWA sufficiently for mod_owa to make requests and
get page responses.
tables.sql Creates document storage tables used by demodoc.sql.
Creates user table needed by demoauth.sql. Also creates
the OWA_ARRAY object type needed by Java and used by the
other scripts.
owatest.sql Very basic test procedures for use with owatest.htm.
demodoc.sql A demonstration package for document storage and
retrieval.
demoalt.sql A demonstration/test package for using an OWA alternative.
demoauth.sql A demonstration/test package for Basic authentication.
owatest.htm Test form to invoke procedures in owatest.sql.
Note that the scripts contain references to SQL object types, LOBs, and
NCHARs, which you might have to remove to get them to run against an older
database server.
32-bit Builds
=============
I no longer provide 32-bit builds of mod_owa for Linux or Windows.
Users who need a 32-bit build should do so from source files.
On a transitional basis, the new windows_all.zip contains 32-bit
builds for Apache 2.4, in folder apache24_32, and for Apache 2.2,
in folder apache22_32.
LOB Size Restrictions
=====================
mod_owa wasn't designed to handle very large file-like LOB transfers.
Transfers up to 10s of megabytes in size aren't a problem, but sizes
into gigabytes are well beyond what was envisioned when the PL/SQL
gateway was developed.
The OCI client interface on which mod_owa relied for years still use 32-bit
signed integers and therefore imposes a 2 gigabyte upper limit on a LOB.
As of Oracle 10g, a newer interface became available that uses 64-bit
sizes; mod_owa now uses this newer version by default. As this breaks
compatibility with 9g and earlier, users who need the older interfaces
must rebuild the module with the OLD_LOBS flag set. (32-bit binaries for
Windows will continue to use the old interfaces.)
There is no point in enabling the larger LOB limits on 32-bit builds of
mod_owa because the module would be unable to process them in any case.
File uploads suffer from another drawback. mod_owa wasn't designed to
support Apex-style calls and file uploads, and so that functionality had
to be retrofitted to mod_owa's architecture. Since mod_owa performs its
argument gathering phase prior to pinning an OCI connection for the
request, this means that any files in the body have to be buffered in
memory before they can be streamed to LOBs via OCI. Until this is fixed,
assume any uploads are going to need an amount of memory equivalent to
the total request size, which is very problematic if gigabyte-sized
objects are transmitted.
As a practical matter it's unclear what PL/SQL and LOB application would
realistically need to support such objects. Oracle LOBs are transactional,
which is both their advantage and a potential weakness. For very large
objects (say, a 5 gigabyte movie) the need to keep an OCI connection
pinned during the entire upload process, plus the need to tie up an Apache
worker process or thread, seems not to scale very well. Another problem:
if a network glitch or other system error occurs during the transfer, the
entire operation gets rolled back. For these types of applications it seems
as if a torrent-like (fault-tolerant, multi-stream) approach would work
better. Ideally an application could transfer blocks of the large object
in arbitrary order, with multiple server-side workers cooperating to stream
blocks into a slowly reassembled copy, and with re-transmits for lost blocks,
errors, etc. Oracle LOBs can't work that way - only one logical database
transaction can have a LOB open for writting.
Oracle 7.3 and 8i Versions
==========================
Code path for these older versions of the Oracle client libraries has
been removed from the module.
Clock Rollover in 2038
======================
The Unix system clock measures time in seconds since January 1, 1970.
This clock will run past the maximum value of a 32-bit signed integer and
will wrap around, appearing to be a large negative integer, on or about
January 19, 2038. mod_owa is potentially vulnerable to this problem,
although because it uses 32-bit unsigned integers, mod_owa's internal
timestamp values may still work until approximately 2116. mod_owa uses
Unix-like timestamps internally even on Windows. Any problem would most
likely affect the file-system caching feature, because of the way timestamps
are reported from that layer. Other areas of the code that consult the
system clock (such as logging, header generation, and timeouts) have
been switched to 64-bit timestamps.
Troubleshooting
===============
I suggest starting by reading the information in this document
carefully. There's a good chance your problem is covered somewhere
already, and even the contents of this section assume you already
know concepts discussed later in this document. That said, here are
some common errors with mod_owa installations:
OWA_CUSTOM.AUTHORIZE
--------------------
Oracle ships the OWA_CUSTOM (and, in older OWAs, the OWA_INIT) package
with a default implementation of AUTHORIZE that always returns false.
If you've set OwaAuth in httpd.conf, mod_owa will call this routine
and it will effectively disable any calls to your procedures. The
symptom is usually an HTTP 404 error (URL not found). Remove the
OwaAuth directive or change AUTHORIZE to return true if you don't
want security. Otherwise, be sure to replace AUTHORIZE with a correct
security check.
Flexible arguments
------------------
The old OAS apparently had a feature whereby it would call a procedure
with the "flexible argument mode" despite the lack of the required "!"
prefix on the URL. It had some sort of heuristic where it was able to
figure out that this should be done, probably after a call built with
the signature set by the OWA specification silently failed. mod_owa
doesn't have this feature, and it can't be added because there is no
foolproof way to figure out that this is required (I can think of
several ambiguous cases), and because the architecture of mod_owa
simply can't support it. The usual symptom of this problem is a 400
("bad request") error (or a SQL error page if the diagnostic is enabled).
A quick fix is to specify such procedures with OwaFlex directives,
though it would be better to use the specified "!" prefix, or code
overloads of the procedure in question to match the call signatures
actually expected.
OWA gives PL/SQL errors with multi-byte databases
-------------------------------------------------
This is due to an incorrect implementation of the page buffer in the HTF
package (pubht.sql and privht.sql). The problem arises because SQL and
PL/SQL data buffers measure their maximum lengths in bytes, whereas all
string operations, e.g. SUBSTR, LENGTH, etc., measure in characters. As
a result, the HTBUF_LEN limit of 255 is not enforced properly with
multi-byte character sets. I am trying to get Oracle to fix this by
using the byte-based equivalents SUBSTRB, LENGTHB, etc. in the underlying
code. Meanwhile, the simple fix is to use the OwaCharsize directive to
reduce this limit based on your worst-case character size. For most
multi-byte character sets, this will be 2 bytes per character, so the
limit would be 127. For old UTF8 Unicode, it's 3 bytes per character,
meaning the limit should be 85. For the newer AL32UTF8 Unicode, it's
4 bytes per character, and the limit should be 63. For a small number
of Asian character sets it's also 4 bytes per character (you can recognize
them because they will have a "32" in the Oracle character set name,
e.g. "ZHT32EUC"), and the limit should be 63. OwaCharsize can be used
to set the number of bytes per character N, which will cause mod_owa to
force the buffer length down to floor(255/N). If in doubt, setting
OwaCharsize to 4 is generally safest, although it may reduce performance.
2000-byte CGI value limit in pubowa.sql
---------------------------------------
In older versions of the OWA, the tables that hold CGI name/value pairs
in PL/SQL were limited to 2000 bytes. When passing the CGI values to
PL/SQL, mod_owa truncates large values to the maximum size allowed by
the OCI, 3999 bytes. It is therefore possible, with an older OWA, to
get a PL/SQL "value or numeric" error from a string buffer overflow.
The simplest solution is to change the definition of vc_arr in
pubowa.sql and increase the limit to 4000, matching more modern
versions of OWA. I believe this will also work on older databases
since PL/SQL has always had a 32K maximum for string size. However,
it's possible that the network code won't support the 4000 maximum. If
you believe you have such a problem, try using one of the backport
versions of mod_owa (for example, the Oracle 8.0 version), since
these impose the lower 2K limit.
32512-byte limit for PL/SQL variables
-------------------------------------
In the original version of OCI 8, the driver limited the array binding mode
to arrays of standard SQL VARCHAR2 columns. This meant that they could be no
larger than 2000 bytes each. This limit was increased to 4000 when SQL was
improved to allow such columns, but it was still well short of the 32767
maximum size of a PL/SQL string variable. Furthermore, OCI network code
had a limit of 32512 bytes on any bind variable, which meant that even
simple scalars had to be limited to this lower value. It now appears that
the 4000-byte limit has been corrected, allowing arrays to be up to 32512
bytes wide. This is relevant for collection-bound arguments to PL/SQL
procedures, including the flexible argument mode. I've enhanced mod_owa
to allow for this new maximum, but this may be incompatible with very old
versions of OCI. If you get errors resulting from this, you'll have to
rebuild the module with the older limit of 4000, using the OLD_MAX_STRING
flag. The very latest versions of OCI appear to allow up to 32767 bytes
for both scalars and arrays. I may upgrade to this new limit in a future
release, but have chosen not to do so at this time because of compatibility
concerns.
Database crashes passing Basic authentication
---------------------------------------------
This is due to a bug in PL/SQL related to the use of RESET_PACKAGE.
The symptom can be repeated challenges for re-authentication even
after you've logged in, or it can take the form of "Internal Server
Error"s. You will generally see many failures with Oracle status
code 3113 in the mod_owa log file. The bug occurs intermittently
but always when executing the two assignment statements for
OWA.USER_ID and OWA.PASSWORD, which are required to support the
OWA_SEC functions. Unfortunately there is no API by which
the security-related package globals can be set, and mod_owa is
obliged to use RESET_PACKAGE. mod_owa runs the two assignment
statements as a separate PL/SQL operation, and this seems to
eliminate the crash in most, but not all, cases. If you believe
you have this problem, try using the SETSEC mode of the OwaAlternate
flag (described below). (Oracle has fixed this bug in 8.1.7,
the bug number was 1244988.)
OWA_SEC functions don't work
----------------------------
Users moving from OWS may find that their user authentication code
doesn't work without some modification, because fundamentally
mod_owa is an Apache driver for the OWA, not a replacement for
OWS. In particular, the functions in OWA_SEC are not fully
supported. The function OWA_SEC.SET_AUTHORIZATION_SCHEME is
irrelevant; mod_owa passes all requests to your PL/SQL code and
expects you to perform any necessary authentication operations.
If you are using Basic authentication, you need to arrange for
a challenge to be sent to browsers that haven't logged into your
site; this can be done from your AUTHORIZE function or directly
from each procedure, under conditions you determine. Apache
will pass any Basic authentication information back to you in
the HTTP headers. Please review the section on Basic authentication
later in this document.
Characters garbled or incorrectly converted
-------------------------------------------
This is most likely due to a misconfiguration or misunderstanding
of how character sets are handled between the browser and Apache,
and/or between mod_owa and the database. The quick checklist is
to verify proper setting of the NLS_LANG environment variable in
the environment used to launch Apache, verify proper setting of
the OwaCharset (if it's needed to override NLS_LANG), verify
proper setting of the browser character set, and verify the database
character set by querying the value of parameter 'NLS_CHARACTERSET'
from V$NLS_PARAMETERS using sqlplus. If none of this helps, read
the section on character set handling later in this document.
httpd won't load mod_owa on Unix
--------------------------------
Common reasons for this include missing the OCI library on the
LD_LIBRARY_PATH and mixing incompatible versions of the module and
the OCI library. The module needs to match the OCI library major
version number, and you can't use a 64-bit version of mod_owa
against a 32-bit OCI (and vice-versa). Thus, if mod_owa.so depends
on 64-bit libclntsh.so.11, it won't run against any libclntsh.so.12
(unless a symlink is provided), nor against a 32-bit libclntsh.so.11.
Use the "ldd" command, or use "readelf -d -h" on mod_owa.so, httpd,
or libclntsh.so to examine dependencies and the build type. Also,
ensure that httpd was linked against -lpthread, which is now required
by Oracle's OCI. In the past, older versions of Apache did not link
against -lpthread, and so the module would fail to load giving no useful
error message. Please review the release notes above for Unix systems.
Apache unable to load mod_owa on Windows
----------------------------------------
The most common reasons for this are incorrect placement of the
mod_owa.dll file versus the LoadModule directive, or a missing
or incorrect version of the Oracle libraries (e.g. OCI). Make sure
the path in LoadModule corresponds to the path that Apache is displaying
in the error message and that this path is valid. If the error mentions
that "the specified procedure could not be found" this may mean that
the correct OCI library cannot be located. At least one user with
multiple Oracle homes on the same Windows system encountered this
problem because his 8.0.x OCI library was found first, ahead of the
8.1.x version. Note that setting ORACLE_HOME will not change the order
in which Windows will search directories for dynamic libraries. Either
make a global change to the path or use the 8.0.6 backport version of
the mod_owa.dll library. Finally, note that if you attempt to load
a 64-bit version of the module in a 32-bit Apache, or run against a
32-bit OCI library, you'll get similar errors. Check the involved
files with the DUMPBIN utility, specifically check HTTPD.EXE,
OCI.DLL, and MOD_OWA.DLL with DUMPBIN/HEADERS and verify that the
first header value (for machine) is consistent across them.
Apache 1.3 gives warning/errors loading mod_owa
-----------------------------------------------
There are unfortunately two variants of Apache 1.3, the standard
version, and the SSL version. The SSL version has the so-called
EAPI module extensions. To signal modules that were compiled
with the EAPI extensions, there is a new magic number that gets
compiled into the code. The SSL version will load a module that
it thinks is compiled the old way, but will give you a warning
because the end of the module structure may be missing some
necessary pointer structures, and may cause a crash. The standard
version will not load an EAPI module, however; it will give you
an error message claiming the module is not an Apache module.
mod_owa can be compiled in several ways; it can be compiled
without EAPI, it can be compiled with the EAPI extensions but
using the old magic number, and it can be compiled exclusively
for EAPI. By default, on Linux and Windows I build it with the
EAPI extensions (thus avoiding the possibility of a crash) but
using the old magic number (thus allowing it to work fine in
the standard Apache). If you have an Apache with mod_ssl and
you need to get rid of the warning message, rebuild the code
using the EAPI and EAPI_ONLY flags.
Silent failures during RESET_PACKAGE
------------------------------------
After processing an HTTP request by calling your PL/SQL code,
and getting/relaying the results from GET_PAGE, mod_owa must
call the RESET_PACKAGE procedure to clear out the PL/SQL
state for the next request. This is the only SQL operation
performed by mod_owa after it sends the results to the browser;
because of this, any error that occurs at this point can't be
displayed to the browser (even with the SQL diagnostic enabled).
However, with the SQL diagnostic, mod_owa always writes the
final SQL status to the mod_owa log file, so it should be possible
to see if failures have/are occuring by consulting that file.
Failures at this point are very unlikely, but at least one user
has encountered a failure here due to a mis-configuration
(using the OwaReset directive to cause mod_owa to call one of
the newer reset modes against a database that did not support
them).
Oracle environment variables
----------------------------
It's very important that ORACLE_HOME be set to point to the location
where your Oracle software is installed. It's also critical that
the directory containing libclntsh.so be on your LD_LIBRARY_PATH.
Other important environment variables to look at are NLS_LANG
(if you are trying to run a particular locale) and ORACLE_SID (if
you are not using SQL*Net to connect to your database). A good
way to check the Oracle setup is to run sqlplus from the environment
where you launch httpd and connect to the database using the
identical connect string specified in httpd.conf for mod_owa.
Note that on Unix it's vital that areas of the ORACLE_HOME be
readable by the Apache "run-as" user; some users have run into
problems because areas of their ORACLE_HOME were not readable
by the "nobody" user they set Apache up to run as (check the User
and Group directives in httpd.conf). A very common symptom of this
class of configuration problem is to get an -1 error on OCI connect.
Another symptom is an ORA-01804 on connect. One way that users
sometimes accidentally break the environment variables is with
mod_env.
OCI connection issues
---------------------
The OCI connect operation can fail for any number of reasons.
The primary ones are not having the Oracle environment variables
set properly, lack of file-system access to ORACLE_HOME from the
Apache run-time user, incorrect settings for TNSNAMES.ORA, or
lack of access to the Oracle Wallet. To help diagnose some of
these issues, you can set environment variable MODOWA_DUMP_ENV
to a file path/name, and on connect failures, mod_owa will dump
information (mainly the OS environment variables) to the
specified file.
Firewall disconnects
--------------------
Some users have reported that firewall or other networking infrastructure
sometimes kills idle sqlnet connections without notifying the OCI that
the socket connection to the database is no longer valid. This leads to
sqlnet "hangs" when an idle (but now dead) connection is drawn from a
connection pool. I'm not aware of any solution to this type of problem
that I'm able to add to mod_owa. If OCI calls the socket layer to
transmit SQL to the database, and doesn't get an error code from the
operating system, it will then wait for a response from the database.
If the message didn't get through, it will hang waiting for a response
that will never come. I suspect that either the firewall or the OS is
is not configured properly when this type of failure occurs. You might
be able to avoid the problem by setting SQLNET.EXPIRE_TIME on the
database server side to a value that cleanly closes outstanding
connections at an interval that's more frequent than the firewall
kills them. Another work-around is to set OwaPool to 0 to avoid
mod_owa connection reuse entirely.
SELinux issues
--------------
It's important to make sure Oracle's OCI library and associated
files in ORACLE_HOME are readable by the Apache worker processes.
Many users running SELinux run into problems because they forget
to do this. A common symptom is to get file-system privilege
errors when attempting to start Apache, often when loading
libclntsh.so (the very first Oracle-supplied file to be accessed).
Ubuntu issues
-------------
Some users have reported problems getting the module to run properly
with Apache on Ubuntu. The symptom is a failure to start the OCI
systems, sometimes with a report of an error ORA-01804. These errors
are due to the inability of OCI to read flat-files from the ORACLE_HOME
area. Specifically, there are timezone and character set files that
need to be loaded before the OCI can start up. In some cases, users
have reported that they needed to set the ORACLE_HOME (and other)
environment variables in /etc/apache2/envvars.
GLIBC compatibility
-------------------
Some users have reported problems getting the module to run properly
due to GLIBC incompatibilities. Specifically, if their installation has
an older version of GLIBC, mod_owa's dependency on a newer version can
lead to unhelpful error messages such as "GLIBC_2.14 not found". If
you can't upgrade your GLIBC installation, the simplest solution is
to rebuild the module from source code.
OwaLog diagnostic file is not written
-------------------------------------
This problem is almost always due to a file system security problem.
Apache worker processes can be set up to run as a different user/group
than the user who starts Apache, via settings in httpd.conf. If this
"run-as" user doesn't have write permission on the directory where the
log file needs to go, mod_owa can't create it. And if the "run-as"
user doesn't have write permission on the file, mod_owa can't write
messages into it.
Database connection exhaustion with Apache 2
--------------------------------------------
Apache 2.0 and beyond have all but desupported child/worker-process
cleanup. Thus, there is no opportunity for mod_owa to close any
pooled database connections when a worker goes down. Unfortunately,
this doesn't just mean that you might lose connections when a crash
occurs, because it can happen if the Apache manager process decides
to shut down a worker as load subsides on the system. Connection
loss thus appears to be a fact of life now with Apache. Either
disable pooling completely (using a pool size of 0, or via the
THREADS diagnostic flag), or ensure that your database server
checks for and kills inactive database processes (the server side
of the dead connections).
HP/UX issues
------------
The biggest single problem is that Oracle's libclntsh.sl, on which
mod_owa depends, was shipped with some undefined symbols. The HP/UX
loader will fail to start Apache but typically gives misleading
error messages about the cause. Be sure that you've got the patched
version of this library from Oracle before trying to use mod_owa.
Another common source of trouble on HP/UX is to mix an incompatible
combination of 32-bit and 64-bit executables and shared libraries.
Use the HP/UX-specific "chatr" function to check httpd, mod_owa.sl,
libclntsh.sl, and other key libraries. Also, make sure httpd was
linked with libpthread and libcl (chatr will tell you this, too).
You can add -lcl and -lpthread to the Makefile that relinks httpd
and the relinked version will work.
*************
Configuration
*************
Locations
=========
The main task involves setting up "locations" that point into the
OWA module from within httpd.conf. For example:
AllowOverride None
Options None
SetHandler owa_handler
OwaUserid owa/owa
OwaNLS WE8ISO8859P1
OwaAuth OWA_INIT
OwaDiag COMMAND ARGS CGIENV POOL SQL MEMORY
OwaLog "/usr/local/apache/logs/mod_owa.log"
OwaPool 20
OwaStart "doc_pkg.homepage"
OwaDocProc "doc_pkg.readfile"
OwaDocPath docs
OwaUploadMax 10M
OwaCharset "iso-8859-1"
order deny,allow
allow from all
Each location can be thought of as "mounting" a distinct database and
database context. In the above example, URLs will be directed into the
PL/SQL gateway when they have the form:
http://mymachine.../owa/procedurename...
With locations, you can set up separate "mounts" for different databases, or
multiple "mounts" to the same database with different NLS language settings,
etc. Use of patterns (LocationMatch directives, e.g. "/owa/[^/]*") is
supported (see the later section on LocationMatch support). Use of
structured locations e.g. "/owa/subdir", is supported with special-case
code. Use of "/" as a location (directing all web traffic to mod_owa)
is also supported with special-case code. The mod_owa-specific
directives in this Location example are described in next section.
A common practice is to place all the OWA-related Locations and other
directives, including the LoadModule directive, in a separate .conf file
(e.g. modowa.conf) and then include this file into httpd.conf with
something like:
include modowa.conf
Directives
==========
mod_owa has a number of directives that are intended for use within a Location
directive and that establish settings solely for that Location. There are
also two optional directives that appear outside the scope of any location
to specify global (Apache-wide) settings.
When I started working on mod_owa, I was initially just building extensions
to the original version, and I followed the naming convention established by
the original authors. Later, after the rewrite, I found myself adding a
great many new directives, but felt obliged to continue using this naming
convention. In the latest versions of mod_owa, I've switched to a
standards-conforming naming convention where every directive is prefixed
with "Owa". The old names are still supported for backward compatibility,
but will be dropped as of the Apache 2.0 release. Users are encouraged to
adopt the new names in their .conf files.
Summary of directives:
Per-Location Directives
2.0 Directive 1.3 Equivalent Short Description
--------------------------------------------------
OwaUserid oracle_userid database connect string
OwaNLS oracle_nls language/territory (& OCI char set)
OwaAuth oracle_ver choose AUTHORIZE function
OwaDiag oracle_diag diagnostic flags
OwaLog oracle_log diagnostic logging file
OwaDescribe oracle_describe describe mode
OwaPool oracle_pool size of connection pool
OwaWait oracle_wait specify maximum timeout or abort
OwaOptimizer oracle_opt set optimizer mode
OwaStart oracle_start procedure for default start page
OwaBefore oracle_before procedure to run before all procedures
OwaAfter oracle_after procedure to run after all procedures
OwaProc oracle_proc site-wide procedure for all calls
OwaRealm oracle_realm use database login for Basic authentication
OwaAlternate oracle_alt name of alternate OWA implementation
OwaUnicode oracle_uni enable NCHAR/NVARCHAR binding modes
OwaReset oracle_rset change RESET_PACKAGE mode
OwaAdmin oracle_admin restrict admin functions to IP addresses
OwaRound oracle_round round bind variable sizes
OwaSqlError oracle_error URL for showing SQL errors
OwaSession oracle_ses cookie name for session/connection binding
OwaDocProc document_proc document-read procedure
OwaDocPath document_path document-download prefix (LOBs)
OwaDocLong document_long document-download prefix (legacy)
OwaDocFile document_file document storage directory
OwaDocGen document_gen dynamic document prefix
OwaDocLobs document_lobs control document LOB bindings
OwaDocTable document_table enable WebDB document upload/download
OwaUploadMax upload_max maximum size of content upload
OwaCharset dad_charset IANA character set for location
OwaBindset dad_bindset Oracle character set for non-UTF8 requests
OwaDateFmt dad_datefmt format mask for dates and timestamps
OwaTZ dad_tz time zone for location
OwaCache oracle_cache caching directory and location
OwaFlex oracle_flex use flexible arguments for procedure
OwaReject oracle_reject disallow prefixes
OwaEnv oracle_env add name/value pairs to environment
OwaHeader oracle_header add name/value pairs to environment
OwaCharsize oracle_charsize set maximum bytes/character for DB buffer
OwaContentType oracle_ctype set default content type for responses
OwaHttp oracle_http set level of HTTP method support
OwaDav oracle_dav set handler for REST and DAV operations
OwaRefXml refcur_xml set REF cursor XML tags
OwaDadName dad_name set DAD_NAME CGI parameter
OwaLDAP oracle_ldap convert basic auth user/pass to DB user/pass
Global Directives
2.0 Directive 1.3 Equivalent Short Description
------------------------------------------------------
OwaSharedMemory owa_shared_memory size of shared memory segment
OwaSharedThread owa_shared_thread cleanup thread polling interval
OwaFileRoot n/a file system root for static content
(works for owad only)
These directives are described in more detail in the table below. In each
case the new (2.0) directive name is given first, and the old directive
name is given second. Very complex directives have additional information
provided in later sections of this document.
Directive Description
-----------------------------------------------------------------------------
OwaUserid username/password@database for the data server to run
oracle_userid the PL/SQL requests. This parameter is required. Note
that the database name is a SQL*Net V2 locator; if the
database name is omitted, the connection will be through
the TWO_TASK or ORACLE_SID taken from the startup
environment. mod_owa supports OCI's external authentication
mode, which allows you to get the database connect
information from the Oracle Wallet or other external
source - this is triggered by supplying a blank connect
string (simply a "/", or "/@database").
OwaNLS An Oracle NLS_LANG setting of the form
oracle_nls _.
This parameter is not required. If omitted, the
language and territory from the startup environment
are used. Unlike the Oracle environment variable,
mod_owa provides some flexibility to omit portions of
this string. Examples of valid strings:
.
_
_.
You may not specify the territory without language.
You may not specify the language by itself. If language
is specified, then all connections through this Location
will have an ALTER SESSION SET NLS_LANGUAGE='...'
executed prior to commencing operations. A similar
command is run for territory, if present. The character
set is used to deduce an ISO character set return type
for HTTP responses coming from mod_owa. It should match
the character set actually used by the OCI, e.g. the
one taken from the startup environment of Apache.
Otherwise, this character set will be used in preference
to the one from NLS_LANG.
OwaAuth The way in which OWA handles authorization callbacks
oracle_auth has changed in a recent release. Previously, it would
call OWA_INIT.AUTHORIZE and then execute your request on
a TRUE return. Newer versions call OWA_CUSTOM.AUTHORIZE.
This optional parameter lets you control which of these
to call. Make sure it matches your actual deployment
within the database! If you omit this parameter, then
no check is made, all procedure calls are just executed
directly. You can also add the extra keyword PACKAGE, as in
OwaAuth "OWA_CUSTOM PACKAGE"
If present, this signals mod_owa to build
.AUTHORIZE
as the authorization code. PACKAGE may be used with any
of the other modes, as well as by itself. If the called
procedure is not part of a package, the base mode is
used instead. This parameter also supports an empty
string to signal a call to an AUTHORIZE procedure that
is not part of a package, as in
OwaAuth " " PACKAGE
OwaDiag This is a set of flags, described below, that enable
oracle_diag various diagnostic modes built into mod_owa. The
parameter is optional. Note that if you set this you
might create a security issue, since some of the
diagnostics show the actual SQL on error conditions.
OwaLog A pathname to a file where mod_owa diagnostics should
oracle_log be written. It's best to specify a full path. I use
"/usr/local/apache/logs/mod_owa.log" on Linux, and
"D:\apps\apache\logs\mod_owa.log" on Windows. This
parameter is optional; if not specified, the default
is "mod_owa.log" (but note that it's only written to if
you turn on one or more of the diagnostics).
OwaDescribe This optional parameter allow you to specify how
oracle_describe mod_owa handles argument-bind failures. It consists
of a mode parameter and/or a schema name. The allowable
mode values are STRICT, NORMAL, and RELAXED. NORMAL is
the default and is compatible with previous versions of
mod_owa. The operation of this parameter is described
in a later section.
OwaPool This governs the size of the connection pool for the
oracle_pool module. Each Location has its own pool, so all
connections are identical within a pool (e.g. they
all point to the same underlying Oracle schema and
they all have the same NLS settings). On Unix this
parameter has no effect (unless you set it to 0);
because the Unix version of Apache isn't multi-threaded,
the pool will always have at most one connection. The
parameter is not required, and if not specified the default
is either 1 (Unix) or 10 (Windows). The maximum pool size
you can specify is 255. Note that if a request is processed
and the pool is exhausted, a connection will be created
and destroyed to service that request, rather than blocking
on the availability of a connection from the pool. The
pool is initially empty and is filled up to the maximum
size only by user requests that fail to find an available
connection in the pool; thus, to reach 10 connections in
the pool, the server would have to at some point have been
processing 10 simultaneous requests in different threads.
Optionally, you can supply the value THREADS in place of
a numeric value, and the pool will be limited only by
the number of threads in the process (up to 255).
OwaWait This governs the number of milliseconds mod_owa will wait
oracle_wait for a connection from the pool. By default, mod_owa waits
100 milliseconds. After this, mod_owa will normally create
a temporary connection. However, by specifying the optional
ABORT flag, you can cause mod_owa to immediately fail a
request after the timeout, without creating a temporary
connection. Example:
OwaWait 200 ABORT
OwaOptimizer Sets the optimizer mode for OCI connections. mod_owa will
oracle_opt issue this call immediately after connecting:
alter session set OPTIMIZER_MODE=
OwaStart This optional parameter allows you to specify the name of
oracle_start the default PL/SQL procedure to be run when no script
name follows the location of the OWA in a URL. The
specified procedure is called with no arguments. If not
specified, mod_owa calls the OwaDocProc (if available)
with "/" as the requested file. If the procedure name is
preceded with an "!", mod_owa sends a client-side
redirect to the specified procedure.
OwaBefore This optional parameter allows you to specify the name of
oracle_before a PL/SQL procedure that should be run just before the
procedure requested. The procedure is called with no
arguments.
OwaAfter This optional parameter allows you to specify the name of a
oracle_after PL/SQL procedure that should be run just after the procedure
requested. The procedure is called with no arguments.
OwaProc This optional parameter allows you to specify the name of a
oracle_proc PL/SQL procedure to be substituted for all requests to the
Location. The procedure will always be called with the
flexible argument mode. If this parameter is set, it
takes precedence over OwaStart. The document download
parameters, if set, take precedence over this parameter.
OwaRealm This optional parameter signals mod_owa to use a new
oracle_realm database session on every request, and to log the user
in to the database using the username and password
specified in the Basic authentication header. If no
Basic authentication header is present, the user will
be challenged using the realm specified by this directive.
Note that use of this directive partially disables
connection pooling/reuse for the location in question;
connections are reused but a new session must be created
on the connection for every request (adding significant
overhead).
OwaAlternate This optional parameter allows you to specify the name of
oracle_alt a PL/SQL package that provides an alternate implementation
of OWA-like functionality. You may also specify flags
that support enhanced functionality not available in OWA.
The operation of this parameter is described in a later
section.
OwaUnicode This optional parameter allows you to enable special
oracle_uni Unicode binding modes for PL/SQL procedures call by
mod_owa. If set to "USER", all calls to user procedures,
including document action procedures, will be made using
NCHAR/NVARCHAR bindings instead of the normal CHAR/VARCHAR
binding. If set to "RAW", all calls to user procedures
will be made using RAW bindings. If set to "FULL", all
procedure calls including those for the OWA itself will
be bound as Unicode. The operation of this parameter is
more fully described in a later section.
OwaReset This optional parameter allows you to change the way
oracle_rset mod_owa performs the RESET_PACKAGE operation. The
default setting is "NORMAL", which uses the DBMS_SESSION
RESET_PACKAGE interface, compatible with older databases.
Newer installations (8.1.7.2 and higher) are encouraged
to use "FULL", which calls the new MODIFY_PACKAGE_STATE
interface but is functionally equivalent, or "LAZY",
which also uses MODIFY_PACKAGE_STATE but merely marks
packages for as-needed reinitialization. Users who have
the most recent version of the OWA code and whose PL/SQL
code is stateless/re-entrant can consider using "INIT",
which simply calls HTP.INIT to reset the OWA itself,
but leaves user package state undisturbed.
OwaAdmin This optional parameter allows you to restrict the use
oracle_admin of the control/admin functions to a particular range of
IP addresses. It is specified as an IP address prefix
plus a mask. By default, mod_owa uses 255.255.255.255
for the mask, effectively disallowing control operations
from all clients.
OwaRound This optional parameter allows you to control the rounding
oracle_round of bind variables used as arguments to your procedures.
Its use is described in a later section of this document.
OwaSqlError This optional parameter allows you to specify a URL
oracle_error to be called when mod_owa encounters SQL errors executing
the action procedure for a request. mod_owa will issue
a redirect response to the browser to the URL indicated,
adding two or three arguments:
proc=&errcode=&errmsg=
The first argument is the name of the procedure that
mod_owa was trying to execute, the second argument is
the OCI error code, and the third argument is the
OCI message buffer contents (if available). Additional
arguments may be added as necessary in the future, so
it's best if you design the handler to accept that. The
handler need not be another mod_owa page.
OwaSession This optional parameter allows you to specify a cookie
oracle_ses name that mod_owa will take to carry a session identifier.
mod_owa will associate the value of this cookie, if
available, with the database connection that it uses
to service the request. When attempting to find a
connection from the pool, mod_owa will first try to
reuse a connection bearing the same session identifier.
The session identifier must match exactly (including
case), and must be valid as a cookie value per HTTP
(meaning it should be plain ASCII and not use certain
punctuation values). Note that this parameter doesn't
have any beneficial effect if Apache is running in a mode
where worker processes are single-threaded. This parameter
can be of use if your PL/SQL attempts to cache data
for a user session to improve performance. It can also
be used in conjunction with OwaRealm to increase the
chances that a database session-create operation can
be avoided for subsequent requests from the same user;
the Basic authentication is still checked, just in case.
The model is still assumed to be stateless, so session
identifiers need not be unique or secure, and sessions
need not be explicitly ended. (Most applications will,
for other reasons, want to use unique, securely random
session identifiers anyway, and to have some sort of
session cleanup process independent of Apache.) There
is no guarantee that subsequent requests will be serviced
by the same session, even in the case of a multi-threaded
Apache; any connection returned to the pool may still be
used to service any request, as necessary. Use of this
directive will cause mod_owa to prefer a fresh or
unsessioned connection to reuse of a connection already
associated with a different session.
OwaDocProc This is the name of a procedure to be called whenever
document_proc a document request is to be handled. Its operation is
described below.
OwaDocPath This is a prefix that signals that the document procedure
document_path should be invoked on the specified URI to initiate a
mime-typed download operation. Its behavior is also
described below.
OwaDocLong If set, this parameter signals that the document reader
document_long for paths with this prefix should use LONG or LONG RAW
operation, as described below.
OwaDocFile If set, this parameter signals that the document reader
document_file for paths with the OwaDocLong prefix should use the file
system for storage, as described below.
OwaDocGen If set, this parameter signals that the document reader
document_gen for paths with this prefix will generate dynamic content
to be returned by GET_PAGE, as described below.
OwaDocLobs If set, this parameter defines which LOB types to bind
document_lobs for document upload and download operations. The arguments
are built in a pre-determined order, so specifying
LOB types that occur later in the sequence automatically
produces the binding for all earlier types. Its behavior
is described in more detail below. If not set, the default
is "NCHAR", which binds BLOB, CLOB, and NCLOB (compatible
with earlier versions of mod_owa). Note that BFILEs
cannot be bound for write, so specifying BFILE causes
that binding to be used for reads only.
OwaDocTable If set, this parameter causes mod_owa to use the WebDB
document_table interface for document upload and download operations.
This directive should specify a table name for the document
table, and a column name for the BLOB content type. Only
BLOB operations are supported. Example:
OwaDocTable WWV_FLOW_FILE_OBJECTS$ BLOB_CONTENT
OwaUploadMax If set, this parameter limits the total size of a content
upload_max upload to be no greater than the value specified. A
"K" or "M" can be appended to a value to signal kilobytes
or megabytes.
OwaCharset This optional parameter specifies the character set to use
dad_charset for client interactions, because it's not possible to
reliably determine this from browser HTTP requests. Data
to/from the database will be transferred using this
character set, overriding the NLS_LANG character set
for this Apache location. Note that the DAD character
set is specified using IANA's nomenclature, not Oracle's.
OwaBindset This optional parameter specifies the Oracle character set
dad_bindset to use for binding arguments from requests that contain
non-UTF8 byte sequences. Its operation is more fully
described in a later section. Note that this character
set is specified using Oracle's nomenclature, not IANA's.
OwaDateFmt This optional parameter specifies the default date format
dad_datefmt to use for conversion of strings to/from the Oracle
DATE data type. Normally, the environment variable
NLS_DATE_FORMAT should be used in conjunction with
NLS_LANG to control this value, however this provides
a crude means of overriding the territory-specific
value on a per-Location basis. The date format may
contain spaces if enclosed in quotes, however mod_owa
will truncate the format mask at the first such space
found (the assumption is that this is the beginning of
a time field). Against an Oracle 9i server, the string
up to any second space character is used to set the
NLS_TIMESTAMP_FORMAT and the full string is used to set
the NLS_TIMESTAMP_TZ_FORMAT. Again, the preferred means
for setting all these values is via environment variables.
However, since the 9i values will not be understood by
8.x clients, this parameter provides an alternate way
to set them and is the primary expected usage.
OwaTZ This optional parameter specifies a time zone for the
dad_tz Location. It works only with 9i servers; against older
servers, it silently fails (and slightly increases the
time required to create a new connection). The value
would normally be set using the environment variable
ORA_SDTZ, but that variable would then apply to all
Locations, and moreover is only understood by 9i clients.
This parameter allows you to specify different time zones
for different Locations under the same Apache instance,
and also allows you to set the time zone for 8.x clients
(against a 9i server).
OwaCache This is an optional parameter that can appear multiple
oracle_cache times for any Location. It specifies a mapping for
document storage between the logical (URL) names used
by your PL/SQL code and a physical (file system)
directory. The operation of this flag is described
in the section on file system caching.
OwaFlex This is an optional parameter that can appear multiple
oracle_flex times for any Location. It specifies the name of a
procedure (or package.procedure) that should be handled
with the flexible argument convention, even if the leading
"!" is not present in the URI. Optionally, the package/
procedure name can have a prepended "~" to indicate that
the procedure should be called in the 2-argument mode.
Optionally, the package/procedure name can have a prepended
"@" to indicate that the procedure returns data via a REF
cursor. This feature is included primarily for compatibility
with OAS, which apparently is able to do this automatically
by describing procedures after the standard call convention
fails. If you use this directive, put the most
frequently-used procedures first because mod_owa does a
linear search against them.
OwaReject This is an optional parameter that can appear multiple
oracle_reject times for any Location. It specifies a string prefix
for packages/procedures to be disallowed by mod_owa.
For example,
OwaReject DBMS
disallows direct calls to all the DBMS_xxxx packages through
the mod_owa interface. You can specify a full package name
by ending the string with a dot, e.g.
OwaReject "DBMS_SQL."
You can limit rejection to exact matches by specifying a
full package-dot-procedure string, e.g.
OwaReject "MY_PACKAGE.EXACT_PROCEDURE"
The matches are not case-sensitive.
OwaEnv This is an optional parameter that can appear multiple
oracle_env times for any Location. It specifies an environment
variable name and value to be passed to PL/SQL in the
CGI context. For example, "OwaEnv MYVAR myval" adds an
environment variable "MYVAR" with value "myval" to the
CGI environment passed to PL/SQL by mod_owa. Note that
OwaEnv has no effect on the environment used by the
Apache worker processes.
OwaHeader This is an optional parameter that can appear multiple
oracle_header times for any Location. It specifies a request header
name and a CGI environment variable name to map it to.
For example, "OwaHeader X-FOO-BAR FOOBAR" adds an
environment variable "FOOBAR" with the value of the
header X-FOO-BAR, if present on the request.
OwaLDAP This optional directive specifies a procedure for
oracle_ldap converting a basic authentication username/password
and/or session cookie to a database username/password
suitable for connection. It works in conjunction with
OwaRealm and/or OwaSession and is described below. Note
that use of this directive may add significant overhead
to request handling, since a special database connection
must be created to make the call to your procedure.
OwaCharsize This optional directive specifies the maximum number of
oracle_charsize bytes needed to represent a character in the database
character set. The default is 1, valid for single-byte
character sets. For any multi-byte set, the PL/SQL gateway
may overflow unless the correct maximum is set here. For
example, to use Unicode as your database character set,
this parameter should be set to 4.
OwaContentType This optional directive specifies the default content type
oracle_ctype for response pages. The default is "text/html". This
directive can also be used to set a different default
content type for errors. Example setting the default
for all content to "text/plain":
OwaContentType text/plain
Example setting the default for errors to "text/plain"
and the default for normal response to "text/html":
OwaContentType "text/html" "text/plain"
OwaHttp This optional directive specifies the level of HTTP
oracle_http protocol support. The default is "NORMAL", which supports
only GET and POST, and is compatible with previous releases
of mod_owa. "REST" adds support for the PUT, DELETE and
PATCH methods. "DAV" is unsupported at this time. If
"NORMAL" is configured, blank pages from the gateway
are no longer automatically given status 404 (not found),
a behavior of older versions of mod_owa and still the
default if OwaHttp is not set.
OwaDav This optional directive specifies a procedure to handle
oracle_dav REST and DAV requests.
OwaRefXml This directive can be used to override the tags used when
refcur_xml a REF cursor returns XML content. It can be used to override
the root tag, the row tag, and the namespace prefix and URI.
Example:
OwaRefXml employees emp hr "http://mycorp.com/hr"
OwaDadName This directive sets the DAD_NAME CGI variable.
dad_name
Directive Description
-----------------------------------------------------------------------------
OwaSharedMemory Controls the size and usage thresholds for
owa_shared_memory mod_owa's shared memory segment. By default
mod_owa doesn't create one. The exact syntax
and effect of this parameter are described later
in the section on shared memory caching.
OwaSharedThread Controls the connection pool cleanup thread.
owa_shared_thread By default, mod_owa doesn't create one. If
set to a non-zero value, this parameter causes
mod_owa to create a thread with the specified
poll interval.
LocationMatch
=============
Apache's LocationMatch directive allows you to set up a Location where
a variety of incoming URLs will be routed based on regular expression
matching. Unfortunately, after processing the match, Apache will
split the incoming URL at the first "/" found, regardless of the template
used, such that the SCRIPT_NAME and PATH_INFO variables will not be as
expected by the PL/SQL gateway. (In fact, this is true even with regular
Location directives that contain "/"s.) For example, here is how Apache
processes this Location and URL:
Location: /owa/subdir
URI: /owa/subdir/mypackage.myprocedure
Apache returns:
SCRIPT_NAME = /owa
PATH_INFO = /subdir/mypackage.myprocedure
Now consider this LocationMatch example:
LocationMatch: /owa/(cat|dog|horse)
URI: /owa/horse/mypackage.myprocedure
Apache returns:
SCRIPT_NAME = /owa
PATH_INFO = /horse/mypackage.myprocedure
In both cases, Apache has split the URI at the first "/" it encounters,
despite what would actually be matched by the Location or LocationMatch
directive.
mod_owa contains logic to detect and correct this condition before the
CGI environment is passed to PL/SQL. This logic works well for the
Location case. In the LocationMatch case, mod_owa uses Apache's
regular expression engine to reparse the URI after first reassembling
it from SCRIPT_NAME and PATH_INFO.
When matching against the regular expression, mod_owa must still break
the SCRIPT_NAME and PATH_INFO at a "/". Any LocationMatch directive
that does not have a trailing "/" will have one appended by mod_owa for
this purpose. The regular expression processor will attempt to apply
each rule as many times as possible, and this can lead to consumption
of too much of the URI string for poorly-formulated expressions. For
example, consider the effect of a wildcard match on a document-like URI:
LocationMatch: /owa/.*
URI: /owa/horse/docs/images/photo.gif
mod_owa produces:
SCRIPT_NAME = /owa/horse/docs/images
PATH_INFO = /photo.gif
(the implied trailing "/" added by mod_owa forces the wildcard to stop at
"/photo.gif"). This was probably not the desired split, and instead, this
formulation should have been used:
LocationMatch: /owa/[^/]*
URI: /owa/horse/docs/images/photo.gif
mod_owa produces:
SCRIPT_NAME = /owa/horse
PATH_INFO = /docs/images/photo.gif
This code is still very experimental and subject to change. In particular
the assumption of a trailing "/" is still open to question.
External Authentication
=======================
Oracle OCI allows mod_owa to connect using "external authentication".
Examples include using the operating system user and using the Oracle Wallet.
External authentication is triggered by supplying a blank connect string, e.g.
OwaUserid /
or
OwaUserid /@YOURDB
Follow the instructions in your Oracle documentation to set up external
authentication. If set up properly, you should be able to connect to the
database with the sqlplus tool using the same connect string.
This link may be helpful if using the Oracle Wallet as the external store:
http://docs.oracle.com/cd/B19306_01/network.102/b14266/cnctslsh.htm
Please note that on Windows, Apache may run as a service and may therefore
run as a user other than the user you normally log into Windows as. For
instance, as user SYSTEM. This may cause external authentication to fail
unless you configure your windows service to run as the correct operating
system user. For more information refer to the Apache documentation.
***************
Basic Operation
***************
How the SQL works
=================
Request setup
The arguments arrive via either the GET or POST methods in a single
tokenized string. GET passes them in the URL, while POST makes them
available in the content body (and therefore can accomodate much
larger data sets). The arguments are parsed and unescaped by
searching through the argument string, assumed to be in the following
format:
=&=...
The un-escaping is done after each name/value pair is parsed off.
Note that for POST requests, any arguments found in the URL are merged
with the arguments from the content body. URL arguments are processed
first.
Request execution
0. Create database connection
This is generally done only if a previous connection is not
available in the pool for reuse. (More on the connection pool
below.) On creation, the connection is set to the NLS parameters
(if specified) with the following commands:
alter session set NLS_LANGUAGE='AMERICAN'
^^^^^^^^
Specified language goes here
alter session set NLS_TERRITORY='AMERICA'
^^^^^^^
Specified territory goes here
alter session set TIME_ZONE='US/Eastern'
^^^^^^^^^^
Specified time zone goes here
alter session set NLS_DATE_FORMAT='DD/MM/YYYY'
^^^^^^^^^^
Specified format mask goes here
alter session set NLS_TIMESTAMP_FORMAT='YYYY/MM/DD HH24:MI:SS'
^^^^^^^^^^^^^^^^^^^^^
Specified format mask goes here
alter session set NLS_TIMESTAMP_TZ_FORMAT='YYYY/MM/DD HH24:MI TZH:TZM'
^^^^^^^^^^^^^^^^^^^^^^^^^^
Specified format mask goes here
1. Initialize PL/SQL state
This operation is required because the OWA leaves state from previous
request executions lying around after a return. A better design
would make it unnecessary, and probably be much more efficient.
For now, all PL/SQL package state must be zeroed on every request,
and this is done with the following PL/SQL anonymous block:
begin DBMS_SESSION.RESET_PACKAGE; end;
Note that this command is always the same on every request. mod_owa
will keep a statement handle open for this operation and reuse it
whenever the connection is reused. (This is sometimes called a
database "cursor".) It's the first of five statement handles that
are held on every connection in the connection pool.
+----------------------------------------------------------------+
| In the latest version of mod_owa, I've moved this operation |
| to the end, so it now occurs after the final GET_PAGE or |
| other content transfer occurs. In effect, I've gone from |
| a model where the connection is cleaned up prior to every |
| request to a model where the connection is cleaned up after |
| every request. I did this to improve perceived performance, |
| because the cost of the RESET_PACKAGE operation will now |
| be incurred after the user has received his/her content, |
| not while he/she is waiting for it. The difference is |
| largely theoretical: so far I haven't noticed any actual |
| improvement (but then, I don't have a high-volume site where |
| RESET_PACKAGE becomes costly, either). |
+----------------------------------------------------------------+
Via the OwaAlternate directive, you can switch off this operation
via a directive such as:
OwaAlternate MY_OWA KEEPSTATE
This can greatly decrease the load on your database for high-volume
sites. To use this mode, you must ensure that all of your
application's PL/SQL code is re-entrant, meaning that it leaves
no information in PL/SQL package globals that would prevent the
servicing of a subsequent HTTP request for a different client.
As an example, code that performed a one-time initialization of
globals for the particular user involved in a request would not
be re-entrant, whereas similar code that kept such information in
a global cache that was resettable (or that could hold the
information for multiple users) could be re-entrant. Unfortunately
the code of the OWA itself is not yet re-entrant, so this requires
that you supply your own OWA-equivalent implementation.
Short of disabling the reset operation entirely, you can use the
OwaReset flag to alter the nature of the reset call performed, as
follows:
NORMAL The default, calls DBMS_SESSION.RESET_PACKAGE.
FULL Calls DBMS_SESSION.MODIFY_PACKAGE_STATE(1). This mode is
recommended for more recent database versions because
RESET_PACKAGE is being deprecated. It is functionally
equivalent to RESET_PACKAGE.
LAZY Calls DBMS_SESSION.MODIFY_PACKAGE_STATE(2). This avoids
flushing the cursor cache for PL/SQL static SQL statements,
and defers the work of actual package reinitialization until
a package is actually called. This mode is recommended if
you have a newer version of the database.
INIT This mode assumes that your PL/SQL code is re-entrant, and
doesn't need to be reset. The full reset is avoided
completely; instead, HTP.INIT is called to reset the state
of the OWA itself.
2. Set up the CGI environment
mod_owa must pass the name/value pairs from the CGI environment
into the OWA before it can execute your procedure, so that their
values will be available to your code. This is done by executing
the following PL/SQL anonymous block:
begin OWA.INIT_CGI_ENV(:ecount, :namarr, :valarr); end;
The arguments are as follows:
ecount Number of name/value pairs for the CGI environment
namarr PL/SQL table of records with a list of VARCHAR2 names
valarr PL/SQL table of records with a list of VARCHAR2 values
Note that the last two arguments to the procedure are actually
arrays, while the first is a scalar binding.
This statement is always the same on every call, though the
values in the binding buffers are obviously different. mod_owa
keeps this statement handle open so that it can reuse it
whenever the connection is reused. This is the second of five
statement handles held on pooled connections.
3. Pass basic authentication information
mod_owa will pass the username and password from any Basic
authentication header received by Apache using the following
PL/SQL anonymous block:
begin
OWA.USER_ID := :usr;
OWA.PASSWORD := :pwd;
end;
Like the previous statements, this statement is always the same
on every call, but with different values bound for username and
password. mod_owa keeps this statement handle open so that it
can reuse it whenever the connection is reused. This is the
third of five statement handles held on pooled connections.
These assignments could (and should) be executed as part of the
call to INIT_CGI_ENV, but a bug in PL/SQL has forced me to do
this in a separate operation.
4. Execute the requested procedure
This is the only truly dynamic SQL in the entire process. The code
that must be executed is normally of the following form:
begin
if then
( => :B1, => :B2, ...);
commit;
else
:realm := OWA.PROTECTION_REALM;
end if;
end;
The authorization call has three forms. In older versions of
OWA, it was OWA_INIT.AUTHORIZE(), while in newer versions it's
OWA_CUSTOM.AUTHORIZE(). If PACKAGE is specified for the Location,
the check is .AUTHORIZE(). mod_owa will build one
of these based on what you tell it in the httpd.conf Location.
It will also simply skip the check if that's what you specify.
The essential part of the call is of course the procedure and
argument bindings. There are four supported modes of argument
passing:
named arguments The arguments are bound by naming the binding
explicitly, so that the call is position-
independent.
positional arguments Similar to the first form, but the names are
skipped and therefore argument order matters.
flexible arguments The arguments are passed as arrays of name/
value pairs (e.g. argc/argv style).
raw post data The data is not interpreted in any way,
but is passed as a single raw parameter.
The procedure name is taken from the URL, it's the name appearing
right after the "mount point". For example,
http://mymachine/owa/foobar?a=dog&b=cat
is a call to PL/SQL procedure "foobar" (with two arguments). This
call requests the default passing mode, which is named arguments,
and would produce something like the following:
foobar(a=>:B1,b=>:B2);
Bind variables are used for the arguments so that the internal
cursor sharing mechanisms of the database can match this with
other requests for the same function.
Special characters in the procedure name signal which type of argument
passing to use. To get positional argument passing, precede the
procedure name with a circumflex as in:
http://mymachine/owa/^foobar?a=dog&b=cat
which results in:
foobar(:B1,:B2);
To get flexible argument passing, precede it with an exclamation mark:
http://mymachine/owa/!foobar?a=dog&b=cat
which always results in the same call regardless of the number of
arguments:
foobar(:B1,:B2,:B3,:B4);
:B1 is the number of name/value pairs
:B2 is an array of parameter names, in this case the array ['a','b']
:B3 is an array of values, in this case ['dog','cat']
:B4 is an array reserved for future use
Note that OAS supported another flexible argument mode, which
mod_owa will try if the 4-argument mode fails (and if you have
not used the STRICT setting for OwaDescribe):
foobar(:B2,:B3);
:B2 is an array of parameter names, in this case the array ['a','b']
:B3 is an array of values, in this case ['dog','cat']
To get the raw argument data without any escaping or interpretation
by Apache and/or mod_owa, precede the call with a tilde, as in:
http://mymachine/owa/~foobar
This mode is not part of the official OWA specification, and is
described in a later section.
The BLOB allows your routine to return a raw content response; if
you return a non-null result for this, mod_owa will skip the normal
GET_PAGE phase and use the contents of the BLOB as the response.
Binding of arguments is done as a series of string values sized
based on the actual data. In a request it is possible to pass
multiple elements with the same name (typically this happens with
POST requests where form elements are repeated on multiple lines).
For example:
http://mymachine/owa/foobar?n=2&a=dog&a=cat
seems to provide two values for "a". mod_owa treats this as an
array-bound parameter to the PL/SQL routine, and will pass it
in as a table of records of VARCHAR type. Thus, the above URL
results in:
foobar(n=>:B1,a=>:B2);
where :B1 is a scalar string "2" and :B2 is the array ['dog','cat'].
Special notes on bind modes:
* Unless you pass multiple values for an argument name, mod_owa
will assume it should be bound as a scalar, not an array. The
OWA spec doesn't support such singleton bindings because there's
no good way for the gateway to determine what the binding mode
should be. The suggested work-around is to add a hidden field
to your form with a blank value to ensure that the array bound
quantity is seen as such by the gateway. If the normal call fails,
mod_owa will use OCIDescribeAny() and will attempt to figure out
if any scalar bindings should be "promoted" to singleton array
bindings. This adds overhead to the call and I don't recommend
relying on the feature, but it can't hurt since the alternative
is issuing an error message.
* In certain circumstances, browsers may generate form input arguments
that are structured in nature. For example, consider the following
form:
which, when the mouse is clicked inside the image item, results in
this transmission to the web server:
myprocedure?mybutton.x=123&mybutton.y=321
This is not a problem in the flexible argument mode, where mod_owa
will simply bind {"mybutton.x", "mybutton.y"} in the names array
and {123, 321} in the values array. However, in the other
argument modes, particularly the named argument mode, this presents
a problem because "." is not typically valid in the argument name.
In response to this situation, mod_owa will build an array-bound
argument with the name of the structure prefix, e.g.
myprocedure(mybutton=>:B1);
where :B1 is bound to an array containing the structure elements
in the order received. Unfortunately, there is no way to tell
which is "x" and which is "y" except positionally in the array,
but this binding is required for compatibility with Oracle's
implementation of the OWA. I advise users to use the flexible
argument mode instead.
* Please note that the bind variables are only guaranteed to be
suitable for IN parameters, never for IN OUT parameters. If you
try to call a procedure passing an argument to an IN OUT parameter,
the OCI may crash.
The final clause of the statement retrieves a protection realm
which can optionally be used to drive HTTP Basic authentication.
This value is retrieved only if the AUTHORIZE function fails, and
it is not bound if authorization is disabled. You can optionally
set this value directly (or via OWA_SEC.SET_PROTECTION_REALM),
and this will cause mod_owa to skip the GET_PAGE operation
(described below) and send back a Basic authentication challenge
for the realm provided. (This mode of operation is provided for
compatibility with OWS.) If you do not set this value, mod_owa
proceeds with the GET_PAGE operation and assumes that you've
generated an appropriate error or login page during the AUTHORIZE
call.
The procedure call should result in the creation of HTTP content
for transmission back to the client. This call is therefore usually
the most time-consuming part of handling a request. Another
statement handle is needed for this operation, though because the
SQL changes from request to request, mod_owa must reparse the
statement each time (however, by using bind variables, mod_owa
expects to hit the data server's shared cursor cache most of the
time anyway). This is the fourth the five statement handles
held for pooled connections.
5. Get and return data
After the execution, mod_owa is ready to retrieve the HTML output
data, which is held in PL/SQL global memory in a table of records.
This is done by repeated calls to the following PL/SQL anonymous
block:
begin OWA.GET_PAGE(:linearr, :nlines); end;
The first argument is an OUT-only table of records, holding up
to 256 lines of output, while the second argument is an IN/OUT
count of the number of lines available/returned by the call.
This is the fifth statement handle held on every connection in
the pool, and, like the first three, it never changes, so it's
kept parsed and ready for re-execution on subsequent requests.
It's called repeatedly to get the output until the number of
lines returned is less than 256, signalling the end of the HTTP
result.
The output returned is streamed to the Apache request response.
However, it is necessary to parse it for an HTTP header, since
OWA unfortunately doesn't separate the header return from the
content (as demanded by Apache itself). Header elements are
of the form
:
Of course, some lines of content may also have this form. The
header is separated from the content by a blank line, so mod_owa
looks for this blank line when examining the return from GET_PAGE.
Unfortunately the OWA doesn't always return a header, and since
content can also have blank lines, there is no absolutely reliable
way to separate the two. mod_owa uses a heuristic technique that
seems to work well. In cases where OWA does return a header, it
always returns a CONTENT-TYPE: tag. mod_owa looks for this tag
anywhere in the first 256 lines it fetches, and, if found, assumes
there's a header and proceeds to process all lines as header lines
until it sees the blank-line separator. If, however, it doesn't
find this tag, then mod_owa assumes there's no header and it's all
content.
Header elements include the CONTENT-TYPE tag, cookies, etc. They
are all intercepted and merged with the request response header.
After the header is exhausted, the remaining content is simply
streamed to the request output.
Under certain special conditions, the OWA can return a header with
no content. Moreover, these headers don't have a CONTENT-TYPE marker
(because, of course, they have no content!). The three that I am
aware of are the redirection header (signalled by the lone LOCATION:
tag), the refresh header (signalled by the lone REFRESH: tag), and
the authentication challenge (signalled by the WWW-AUTHENTICATE: tag).
mod_owa has special-cased code to handle all these cases.
Binary content types should, in general, not be returned via this
interface. It is possible in most cases to convey such content back
via the file download interface (discussed below). The main problem
with binary content types is simply that the OWA.GET_PAGE interface,
as well as all PL/SQL-side processing done by the OWA, is
character-based, exposing attempts to transmit binary data to the
following problems:
1. Binary values representing invalid character codes can cause
trouble in the PL/SQL code of the OWA, which performs SQL
operations such as LENGTH, SUBSTR, etc., on the data. This
is especially problematic for multi-byte character sets, where
not all binary sequences represent valid character values.
2. Character data passed through the OCI is subject to character-set
conversion from the data server to the client -- such conversions
can change the byte values and corrupt the binary data stream.
3. Binary null byte values can cause code in both PL/SQL and C to
incorrectly believe the end of a character string has been
encountered.
Despite these problems, some users still return binary data through
the OWA interface, using the CHR() function to set specific byte
values. This will generally work under the following conditions:
* The character set of the data server is a single-byte set, thus
avoiding the first problem listed above in all known cases.
* The character set of the Apache client (e.g. the OwaCharset or
NLS_LANG character set for mod_owa) is identical to that of the
data server, thus avoiding the second problem listed above.
* (mod_owa itself avoids the third problem by using a length-returning
OCI bind mode, and using the length instead of searching for a null
terminator.)
File Upload/Download
====================
The mod_owa version of this functionality is still experimental (more so
than the rest of the code). There are still numerous open issues,
particularly relating to uploads and character-set issues for CLOBs.
The HTTP standard for file uploads is particularly toxic, and my
understanding of it is constantly being proven incorrect; as a result,
the quality of mod_owa's upload parsing code is mediocre at best. These
issues are noted in the descriptions that follow.
It was not possible to exactly mirror the specification of this feature
in Oracle's version of the gateway, mainly because that specification
required the installation and use of WebDB-related PL/SQL facilities
that aren't supportable by mod_owa. The mod_owa version of this functionality
attempts to provide a relatively simple, entirely programmatic interface,
by means of which programmers can "wrap" their own tables, access control
functions, etc.
A very limited capability that may be compatible with WebDB's file upload
and download interface is available via the OwaDocTable directive. This
is limited to a single interface table and column, which must be a BLOB
column only. That capability may be sufficient to run Apex, though I am
unable to verify it.
File download
-------------
This is by far the simpler and more robust portion of the implementation,
particularly if you use only the BLOB (binary) files. The functionality
is governed by seven parameters:
OwaDocProc a procedure to be called whenever a document
is being requested
OwaDocPath a path prefix which signals that a request
should be routed to the document procedure
OwaDocLong a path prefix which signals that a request
should be routed to the document procedure
but use LONG or LONG RAW processing logic
OwaDocFile a path on the file system to use for document
storage and retrieval, instead of the LONG or
LONG RAW method
OwaDocGen a path prefix which signals that a request
should be routed to the document procedure
but that the content will be dynamically
generated and returned through GET_PAGE
OwaDocLobs specifies which of the LOB handles to bind
for read and write requests (see below)
OwaDocTable overrides mod_owa's upload/download functionality
and uses the table and column specified.
I will describe the operation of OwaDocLong, OwaDocFile, and OwaDocGen
in the next sections, so what follows here pertains to the LOB-based
interface.
When a URI is received prefixed by the OwaDocPath, the entire URI
(minus any query string) is considered to be a logical file path, and
this file path is passed to the OwaDocProc, along with any other
arguments specified by the query string. For example, consider the
following URI:
http://mymachine/owa/docs/a/b/c/doc.gif?arg1=1&arg2=2
^^^^
OwaDocPath
http://mymachine/owa This prefix routes the request into mod_owa
(/owa is the Location)
/docs/a/b/c/doc.gif This is the logical file path that will be
sent to the OwaDocProc
arg1=1&arg2=2 Any remaining arguments follow the logical
file path in the arrays
To keep the interface standard despite the possibility that the arguments
may not be pre-determined, the call is always made using the flexible
argument mode (described previously). In addition, 2-5 extra arguments
are added, so the call becomes similar to the following:
begin
(:B1,:B2,:B3,:B4,:B5,:B6,:B7,:B8,:B9);
commit;
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an output string for the document mime type
:B6 is an output BLOB locator for binary documents
:B7 is an output CLOB locator for textual documents
:B8 is an output NCLOB locator for Unicode textual documents
:B9 is an output BFILE locator for binary file-based content
The first element in the array of values will contain the logical file
path (it's also available in the CGI environment). Any other arguments
found in the query string will follow this one in the arrays.
For example, with the URI shown previously, the following arguments would
be passed as inputs to the OwaDocProc:
B1 => 3
B2 => {"document_name", "arg1", "arg2"}
B3 => {"/docs/a/b/c/doc.gif", "1", "2" }
B4 => {"", "", "" }
The call is made without the usual authorization check being built into
the anonymous block. The presumption is that any necessary access
control is done by this procedure. If authorization fails and/or the
document cannot be found, or if any other failure occurs, return NULLs
for all LOB locators to signal an error to mod_owa. The module will
then assume that the OwaDocProc has written a response page in the
normal manner and will use OWA.GET_PAGE in the normal manner to
retrieve and display it.
If the call succeeds, then return a mime type string (if possible) and
return a LOB locator selected from the underlying storage table. The
mime type string can be up to about 4000 bytes long. The procedure
should return a non-null value in one of the LOB handles, and nulls
in the others (if multiple non-null LOBs are returned, mod_owa will
use the first one it finds in sequence starting from the BLOB, and
ignore the others).
The setting of OwaDocLobs governs which of these arguments mod_owa
will bind. The settings are as follows:
BIN Binds only the BLOB argument (6 arguments total).
CHAR Binds the BLOB and CLOB arguments (7 arguments total).
NCHAR Binds the BLOB, CLOB, and NCLOB arguments (8 arguments total;
this mode is the default and is compatible with APIs coded for
older versions of mod_owa).
FILE Binds all four LOB types, including BFILE.
If you do not return a mime type, mod_owa will assume a type of
"application/octet-stream" for binary LOBs and "text/plain" for
character LOBs.
+----------------------------------------------------------------+
| Also, mod_owa will attempt to determine the mime type based |
| on the file extension, if it can find one. Right now this |
| is a hack because I can't find any native Apache API for this, |
| so there's a hard-coded table in mod_owa for some of the more |
| common extensions. If you want to be sure to avoid this, |
| return a mime type from the OwaDocProc. |
+----------------------------------------------------------------+
BLOBs (and BFILEs) will be streamed down in a straightforward fashion.
For CLOBs and NCLOBs, mod_owa will convert the content to the DAD
character set in a manner similar to the way it handles OWA.GET_PAGE
requests. Because LOB semantics work in characters, it is not always
possible to determine the "Content-Length" for a character-based download.
mod_owa will use the character length of the LOB as the content length
only if you are running a single-byte character set (such as "iso-8859-1").
The semantics of the mime type argument have been extended to allow
your code to return additional header elements for the document, up
to 4000 bytes total. To return such elements, separate them from the
mime type (and each other) with newlines. If there is no mime type
being returned, you must still begin the buffer with a newline. The
last header element need not be terminated with a newline. As an
example:
text/html; charset=iso8859-1\n
Last-Modified: xxx\n
If-Modified-Since: xxx
File download with LONG or LONG RAW
-----------------------------------
This optional mode is signalled by any document path whose prefix matches
the value set for OwaDocLong. The operation is in most respects
similar to that for the LOB read, so in this section I will describe only
the differences. The main difference is that the read procedure call is
built as follows:
begin
(:B1,:B2,:B3,:B4,:B5,:B6,:B7,:B8);
commit;
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an output string for the document mime type
:B6 is an output string for returning a SQL select statement
:B7 is an output string for returning a SQL bind variable string value
:B8 is an output string for returning the 1-character value of a RAW flag
Note that the same OwaDocProc name will be used for this mode; the
implementation can either overload the function (recommended) or make
this the only mode of operation by setting OwaDocLong = OwaDocPath
in the configuration.
The function is expected to return a SQL statement suitable for use by
mod_owa to fetch back the LONG or LONG RAW value. The SQL statement
should be a SELECT statement with a single column return value of type
LONG or LONG RAW. The statement may optionally contain a single bind
variable, typically the value for the primary key of the underlying
table, or a ROWID. If the SQL statement requires this bind, return a
value for it in the appropriate argument, otherwise return a NULL value.
The SQL statement can be up to about 32512 bytes in length (the PL/SQL
maximum). The bind value can be up to about 2000 characters long. The
last argument is a flag to indicate whether the returned column is a
character-based LONG or a binary LONG RAW type. Return a NULL for
the character-based type, otherwise return any single-character value
(typically, a 'Y'). The mime type argument is used in a manner similar
to that for the LOB case; as with LOBs, the mime type can be up to
about 4000 bytes long.
As an example, the SQL statement you return should be of the following
form:
select LONG_CONTENT from MYDOCUMENTS where KEY_COLUMN = :B1;
The module will run the statement you specify, with the bind variable
(if any), and perform a piecewise fetch of the result, streaming this
back to the requestor. To signal an access control failure or other
error, return a null for the SQL statement; as with LOB-based downloads,
mod_owa will then assume that your code has written an OWA response page
in the normal fashion, and attempt to display it using OWA.GET_PAGE. The
returned contents can be up to 2 gigabytes in length.
As with LOBs, LONG RAWs work without any character set issues because
they are transferred entirely in binary. (If a return character set is
needed, it can be attached to the mime type value you return, as described
earlier.) For LONGs, the transfer mode causes the character data to be
converted to the character set of the Apache server prior to transfer, so
no special handling is performed.
+------------------------------------------------------------------+
| There may, however, be issues with the way the piecewise |
| functions count lengths; it's not clear whether they always |
| count in bytes, which the current code assumes, or whether |
| some return lengths are in characters, in which case these |
| transfers will have the same issues as CLOBs. Because of the |
| way the code is written, any single-byte character set should |
| work OK, without the data stripping done for CLOBs. |
| |
| It might be more flexible to implement this using the REF CURSOR |
| mechanism. With this mechanism, PL/SQL can return a statement |
| in an already-executed state, such that mod_owa need only run |
| the piecewise fetch to stream the result. Unfortunately, this |
| technique wouldn't work on older versions of the database (the |
| main target of this legacy-mode feature). Also, REF CURSORs |
| can't be used for INSERT/UPDATE operations, so mod_owa's support |
| for uploads to LONG or LONG RAW targets would be forced to have |
| an interface similar to this one anyway. |
+------------------------------------------------------------------+
Unlike LOBs, Oracle's LONG and LONG RAW types don't provide any means
for determining the length of the content, so the above linkage
unfortunately doesn't support the transmission of a Content-Length for
the request. For these legacy types, mod_owa has two special modes
of operation, governed by additional parameters on the OwaDocLobs
directive:
* LONG_RETURN_LENGTH
If set this mode changes the PL/SQL call mod_owa generates to
contain 9 arguments, as follows:
(:B1,:B2,:B3,:B4,:B5,:B6,:B7,:B8,:B9);
The difference is in the last two arguments; the RAW flag is still
the last argument, so it moves from B8 to B9, while argument B8 is
an output binding that allows you to return the content length
(it should be a numeric value suitable for an integer binding).
This interface is useful if as a side-effect of processing the
document lookup you already have the content length readily
available, or if the content length is not stored in the same table
as the content itself.
* LONG_FETCH_LENGTH
If set, this mode causes mod_owa to bind the SQL statement you
return with an additional select-list argument for the content-length.
The PL/SQL portion of the interface is call-compatible with the normal
operating mode (8 arguments). The SQL you return, however, must look
as follows:
select LONG_CONTENT, CONTENT_LENGTH
from MYDOCUMENTS where KEY_COLUMN = :B1;
As before, the exact column names and table names are unimportant,
only the position matters; the long column appears first, followed
by the length, which should be a numeric (integer) value. Also
as before, the bind variable is optional.
This mode is preferable if the length is not computed as a side-effect
of the PL/SQL OwaDocProc call, but is readily available as
a column in the same table as the long content.
You can specify this setting in combination with the LOB-related
settings for OwaDocLobs, as in:
OwaDocLobs CHAR LONG_FETCH_LENGTH
File download using the file system
-----------------------------------
This mode is an override of the legacy LONG and LONG RAW mode. It is
signalled by setting OwaDocFile. The operation is in most respects
similar to that for the LONG reads, so in this section I will describe only
the differences. The main difference is that the read procedure call is
built as follows:
begin
(:B1,:B2,:B3,:B4,:B5,:B6);
commit;
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an output string for the document mime type
:B6 is an output string for returning a file system path
Note that the same OwaDocProc name will be used for this mode; the
implementation can either overload the function (recommended) or make
this the only mode of operation by setting OwaDocLong = OwaDocPath
in the configuration. Also note that you must set OwaDocLong, since
that parameter is still needed to distinguish this read mode from the
standard LOB-based mode.
The function is expected to return a file path name that mod_owa can
use to retreive the contents of the document. mod_owa will prepend
the directory from OwaDocFile - mod_owa expects that the path name
returned by the procedure begins with a path separator (e.g. "/")
and that the path in OwaDocFile does not end with such a separator.
Unlike the LONG/LONG RAW mode, there is no distinction between binary
and text data with file system operation.
Dynamically-generated virtual files
-----------------------------------
If the path prefix matches the value of the optional parameter
OwaDocGen, then this signals mod_owa to call your OwaDocProc
with just the normal flexible arguments, and no output arguments,
as shown here:
begin
(:B1,:B2,:B3,:B4);
commit;
end;
You then write the content using the normal OWA facilities, and mod_owa
will retrieve it with GET_PAGE as per any normal request. The main
difference versus a standard request is that the URL transmitted by
the browser appears to be a normal file path (unless you allow
arguments).
It is possible to use temporary LOBs to achieve a similar effect
using the LOB-based transfer mode described earlier. However,
there is no similar concept for LONG or LONG RAW types, so this
method is the only way to do dynamic virtual files on an older
(pre-8i) data server,
Another possible use for this directive is for streaming generation
of the content via an alternate OWA interface. With Oracle's OWA,
you must complete all of the output operations before returning to
mod_owa, because once the GET_PAGE loop begins, you will not have
any further opportunity to generate content. However, with an
alternate OWA, you could simply save the page-generation context
during the call to OwaDocProc, and then generate portions of
content on demand during the GET_PAGE phase.
If you have set the optional parameter OwaDocTable, mod_owa disables the
above functionality as well as the normal functionality of OwaDocPath,
and instead uses an Apex-style interface for virtual files. The OwaDocProc
is called with no arguments. Your procedure is expected to get the URI
path from the OWA's CGI variable interface. Your procedure is expected
to return a standard response either via GET_PAGE or via WPG_DOCLOAD.
File upload
-----------
A file upload operation is triggered whenever the CONTENT_TYPE for a
request is "multipart/form-data". The module assumes that the incoming
stream contains a series of arguments and file streams delimited by the
"boundary" string generated. This is typically accomplished by adding
the enctype attribute to the form tag in the page generating the upload
operation, as shown here:
Note that upload processing takes precedence over any alternative
handling that might be implied by other mod_owa directives. For
example, if the incoming URL matches the structure for a document
download, the document download procedure will not be called.
Unfortunately, HTTP mixes both the form arguments and the file contents
into a single stream. It provides no lengths for individual portions
of the stream, and mixes binary information with textual information. In
short, it's a mess, difficult to parse reliably. An example of an
incoming stream might be as follows:
-------------------------1234567890\r\n
Content-Disposition: form-data; name="field1_name"\r\n
\r\n
\r\n
-------------------------1234567890\r\n
Content-Disposition: form-data; name="field2_name"; filename="foo.doc"\r\n
Content-Type: mime/type\r\n
...other optional header elements such as Content-Encoding...\r\n
\r\n
...file contents...\r\n
-------------------------1234567890--
The incoming stream will be parsed in the order elements are received. The
stream may contain both normal arguments and files. To avoid the need to
buffer the entire stream before calling the action procedure, mod_owa will
call the action procedure whenever a file is encountered in the stream,
passing whatever arguments it has accumulated to that point. Users would
therefore be well-advised to ensure that all non-file arguments in the
form appear before the file input types.
The action procedure will be called once per file found in the stream,
plus one additional time to generate a response page to the request.
The action procedure will be called even if the file input element is
blank. The module builds the call to the action procedure using the
flexible argument model, with additional arguments for LOB handles, as
shown here:
begin
(:B1,:B2,:B3,:B4,:B5,:B6,:B7,:B8);
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an input/output string for the document mime type
:B6 is an output BLOB locator for binary documents
:B7 is an output CLOB locator for textual documents
:B8 is an output NCLOB locator for Unicode textual documents
The action procedure should return a valid LOB handle in one of the LOB
output arguments. If nulls are returned for all of them, mod_owa assumes
that an authorization problem or other error has occurred, and will skip
that file in the incoming stream. Otherwise, mod_owa will stream the file
to the first non-null LOB handle in sequence and commit the transaction.
The LOB arguments bound are also subject to the value of OwaDocLobs,
as described in the section on file downloads. The only difference is
that a BFILE locator is never bound, so the FILE setting causes only
8 arguments to be built for writes (this is because Oracle doesn't
support writes to BFILEs).
Note the following differences versus the OwaDocProc described for
reading files, and versus other calling modes:
* No authorization check - like the file read case, the writer is
assumed to do any necessary checking.
* No commit after the call; this is because the LOB must be returned
in a writable state, so mod_owa will do the commit from OCI after
writing data to the LOB.
* An IN OUT argument for the mime type; mod_owa will pass the value of
the "Content-type" (if any); the return value is ignored.
* Output arguments for up to three types of LOB.
The name/value argument arrays will contain every argument found in
the content stream to that point. The first element in the values
array will always be the value of "OwaDocPath" (to support programmers
that want to use a single implementation of the code against multiple
mod_owa locations), and the second element in the array will always
be the name of the file being uploaded (if not null). The name of the
input item for the file is found in the second location in the names
array. Any other arguments available at the time the file was encountered
are passed in the rest of these arrays. Note again that if more non-file
arguments are found later in the content stream, they will be appended
to the array in order and passed to the action procedure on any subsequent
calls. Also note that because the call is in flexible argument mode,
the maximum width of an argument value is limited to 32512 bytes
(4000 or as low as 2000 on older versions of Oracle), and the value
will be truncated if necessary.
The action procedure is responsible for parsing the input file name
(if necessary), creating whatever row is required in the underlying
storage table, and returning a LOB handle suitable for writing. If
an update operation is required, the action procedure should select
the desired row "for update". Typical SQL operations would look
similar to the following:
select CONTENT into plob
from MYDOCUMENTS where DOCNAME = FILEPATH for update;
insert into MYDOCUMENTS (DOCNAME, CONTENT)
values (FILEPATH, empty_blob())
returning CONTENT into plob;
The action procedure should also record whatever information it needs
to generate a response page, but should not generate a response page
until the final call is made. When all incoming files have been
processed, mod_owa will make one final call to the action procedure
with empty strings for both the FILEPATH and the name of the input
item (meaning the second elements of both the names and values arrays
will be empty). This condition, which can't occur normally, signals
the action procedure that the operation is complete, and requests the
generation of a response page. Also, mod_owa will append to the
argument vector a set of "content_length" arguments, one for each file,
that will pass the number of bytes (or characters, for CLOB and NCLOB
types) written for each file, in sequence. For example, in a simple
case where a single file has been uploaded and there are no other
form arguments:
:B1 => 3
:B2 => {"document_path", "", "content_length"}
:B3 => {"docs", "", "123456" }
:B4 => {"", "", "" }
You should typically not need the content lengths for LOB-based transfers
because the LOB length can be obtained by your PL/SQL procedure by using
DBMS_LOB.GETLENGTH(). The main reason for passing the lengths is to
support the older LONG and LONG RAW transfers, where no equivalent SQL
function exists.
Note that because the final-call statement being used does not contain
a commit, any commit operation required as a result of this call is the
responsibility of the action procedure (mod_owa has at this point already
committed each of the LOB write or update operations).
If a value is specified for OwaUploadMax, mod_owa will check to make sure
the Content-Length for the upload request is less than or equal to this
value before proceeding. This is designed to ensure that the module does
not get consumed with large uploads. mod_owa can only check the total
content size prior to reading the stream; it's not possible to limit
the size of individual files without reading them. If the limit is
found to be violated, mod_owa will not attempt to read any content;
instead, it will build the "final" call to the action procedure, passing
the value for "upload_max" in the first array argument (instead of
"document_path", "document_long", or "document_file").
+------------------------------------------------------------------+
| File upload operations for binary content targets (BLOBs) do |
| not present any issues for mod_owa, but character data destined |
| for CLOBs or NCLOBs raises issues with regard to character-set |
| handling that are not completely addressed in mod_owa. The main |
| problem is that Oracle's OCI interfaces for character LOBs work |
| in character units whereas all of the HTTP interfaces work in |
| bytes. This creates a risk when doing upload operations against |
| multi-byte installations because fractional characters may be |
| present at the end of buffer boundaries during large uploads. |
| Apache lacks any interfaces for determining character boundaries |
| against the range of supported character sets. To get around |
| this problem, mod_owa will attempt to determine the character |
| boundaries using a simple built-in routine. I believe this |
| routine to be implemented correctly for the UTF8 Unicode |
| character set. I have also implemented (but not tested) a few |
| other popular Asian character sets including SJIS, JEUC, and |
| BIG5. Finally there is a default implementation that assumes |
| any upper-ASCII value (in the range 128-255) is the leader of a |
| double-byte pair. |
| |
| A further issue with regard to file upload operations is |
| the reliability of parsing the incoming stream (which, |
| unfortunately, mixes textual and binary information) and |
| reliably closing the LOB streams (I've run into some bugs in the |
| OCI interfaces and had to work around them in peculiar ways). |
+------------------------------------------------------------------+
File upload with LONG or LONG RAW
---------------------------------
As with LOBs, a file upload operation is triggered whenever the CONTENT_TYPE
for a request is "multipart/form-data". The optional LONG-based mode is
triggered whenever the action procedure name is preceded with a tilde, as in:
The content stream parsing and input argument handling are identical
to that described for LOB-based uploads, except that the value of
"document_long" is passed in place of "document_path" as the first
array element. The action procedure call differs in that the output
arguments resemble those of the LONG-based read procedure interface:
begin
(:B1,:B2,:B3,:B4,:B5,:B6,:B7,:B8);
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an input/output string for the document mime type
:B6 is an output string for returning a SQL update or insert statement
:B7 is an output string for returning a SQL bind variable string value
:B8 is an output string for returning the 1-character value of a RAW flag
The operation is expected to create a SQL statement which, when run by
mod_owa, allows a piecewise LONG or LONG RAW insert to be performed. A
typical statement would appear as follows:
update MYDOCUMENTS set LONG_CONTENT = :B1 where KEY_COLUMN = :B2;
Typically, the action procedure should locate or create a row first,
and return a ROWID or other key column value in the bind buffer string
output argument. However, if desired, the entire INSERT can be passed
back to mod_owa as in:
insert into MYDOCUMENTS (LONG_CONTENT, KEY_COLUMN) values (:B1, :B2);
Note that because mod_owa is expecting only two bind variables, any other
values to be inserted would have to be added to the INSERT statement as
constants (sub-optimal because the SQL cursor thus produced won't be
shared).
You can also optionally omit the second bind value by returning a null
for the bind string output argument (again, this is sub-optimal because
the SQL statement won't be shareable).
The module will commit the entire operation after the piecewise writes
are completed. Up to 2 gigabytes can be uploaded.
As with the read interface, the SQL statement can be up to 32512 bytes,
the bind value can be up to 2000 characters, and the LONG RAW signal flag
argument should be a single byte (or null).
As with the LOB interface, a final call will be made to the action
procedure with nulls for the second location of the name/value arrays,
a condition that can't occur normally. This signals the end of the
upload operation and allows your procedure to generate a response page
that will be processed by OWA.GET_PAGE in the usual fashion.
File upload using the file system
---------------------------------
This mode is an override of the legacy LONG and LONG RAW mode. It is
signalled by setting OwaDocFile. The operation is in most respects
similar to that for the LONG uploads. A file upload operation is
triggered whenever the CONTENT_TYPE for a request is "multipart/form-data".
The file-based mode is triggered whenever the action procedure name is
preceded with a tilde, as in:
Note that this is the same trigger used for the legacy LONG and LONG RAW
modes; if OwaDocFile is set, mod_owa will use the file-based mode and
not the legacy mode, so you can't use both modes in a single Location.
The content stream parsing and input argument handling are identical
to that described for LOB-based uploads, except that the value of
"document_long" is passed in place of "document_path" as the first
array element. The action procedure call differs in that the output
arguments resemble those of the file-based read procedure interface:
begin
(:B1,:B2,:B3,:B4,:B5,:B6);
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an input/output string for the document mime type
:B6 is an output string for returning a file system path
The operation is expected to return a file system path where mod_owa can
write the data. mod_owa will prepend the directory from OwaDocFile -
mod_owa expects that the path name returned by the procedure begins with
a path separator (e.g. "/") and that the path in OwaDocFile does
not end with such a separator. mod_owa will create any intermediate
directory levels necessary, though the final portion of the path will
always be a normal file.
File upload/download using a document table and column
------------------------------------------------------
This is enabled via the OwaDocTable directive, as in this example:
OwaDocTable WWV_FLOW_FILE_OBJECTS$ BLOB_CONTENT
mod_owa will upload files to the table (WWV_FLOW_FILE_OBJECTS$
in the above example), generating a name for the file based on
original file name but containing a unique prefix as in:
0123456789ABCDEF/original_file.xml
mod_owa will append this block of code to all procedure calls to
allow any procedure to signal a file download via the WebDB
interface:
if (WPG_DOCLOAD.IS_FILE_DOWNLOAD) then
WPG_DOCLOAD.GET_DOWNLOAD_FILE(:p_doc_info);
WPG_DOCLOAD.GET_DOWNLOAD_BLOB(:p_blob);
end if;
In this mode, mod_owa treats multipart/form-data POST requests as
regular procedure calls, except for any file arguments. The file
arguments are written to the table and the name assigned is passed
in lieu of the file itself.
mod_owa can only support the BLOB column type - any other columns
won't be used. A sample table is:
create table DEMO_DOCLOAD
(
NAME varchar2(256) not null,
MIME_TYPE varchar2(128) ,
DOC_SIZE number ,
DAD_CHARSET varchar2(128) ,
LAST_UPDATED date ,
CONTENT_TYPE varchar2(128) ,
BLOB_CONTENT BLOB ,
constraint DEMO_DOCLOAD_PK primary key (NAME)
);
Note that when returning a file via WPG_DOCLOAD.DOWNLOAD_FILE, you
should use the BLOB method of returning your content, rather than
attempting to pass a file name. The latter overload may not work
if the document table isn't known to the WPG_DOCLOAD package.
If you set OwaDocTable, the behavior of OwaDocProc with respect to
matching OwaDocGen or OwaDocPath changes to the Apex style calling mode.
Basic Authentication
====================
mod_owa has logic to support the Basic authentication mode in a manner
that is roughly compatible with that of OWS. Look again at the call
that mod_owa will build for your procedure:
begin
if then
...
commit;
else
:realm := OWA.PROTECTION_REALM;
end if;
end;
This gives your code the opportunity, in the AUTHORIZE procedure, to
accept or reject the user's credentials by returning TRUE or FALSE. If
you return TRUE, the procedure is called normally, and GET_PAGE is
used to retrieve the result. If you return FALSE, then the procedure
is not called, and mod_owa will still call GET_PAGE to get your
response, which could be a login page if you are doing your own
authentication system, or a Basic authentication challenge if you want
to use that scheme.
An example of issuing a Basic challenge is as follows:
OWA_UTIL.MIME_HEADER('text/html', FALSE);
HTP.PRN('WWW-Authenticate: Basic realm="myrealm"');
OWA_UTIL.HTTP_HEADER_CLOSE;
...
return(FALSE);
This allows you complete control over the header that is returned; mainly,
this would allow you to set cookies if that's useful to your signon
process. The other way would be to let mod_owa do this for you:
OWA_SEC.SET_PROTECTION_REALM('myrealm');
...
return(FALSE);
This call sets the value of OWA.PROTECTION_REALM (actually, you could
just set it yourself if you like), and if you look at the "else"
clause above you will see that mod_owa will read this value back. If
mod_owa sees that this value is non-null, then it will skip the GET_PAGE
operation and will generate the Basic challenge for you, using the realm
that you have specified. (This is for compatibility with OWS.)
Apache returns the Basic authentication header in HTTP_AUTHORIZATION
in the raw base-64-encoded form, and mod_owa will pass this along
with the rest of the CGI environment. To provide minimal compatibility
with OWS, mod_owa will also decode/parse this header and assign the
username and password values to the appropriate global variables in
OWA, so that OWA_SEC.GET_USER_ID() and OWA_SEC.GET_PASSWORD() will
work. No other form of external authentication will set these variables,
however. If you look again at the call mod_owa builds for CGI:
begin
OWA.INIT_CGI_ENV(:ecount, :namarr, :valarr);
OWA.USER_ID := :usr;
OWA.PASSWORD := :pwd;
end;
you can see that this will set the values for OWA.USER_ID and OWA.PASSWORD,
which are simply returned by the OWA_SEC functions. You can also obtain
the raw HTTP header element with:
AUTHSTRING varchar2(2000);
...
AUTHSTRING := OWA_UTIL.GET_CGI_ENV('HTTP_AUTHORIZATION');
Note that the browser will return the Basic header with every request,
and you must check it on every request. Remember that the PL/SQL
environment is stateless and you cannot be sure of keeping state between
calls from a client (unless you set a cookie). Thus, every request must
be treated as if it were the first request. A few tips:
* Check the username/password on every request, and if it's successful,
return TRUE. If you like, you can set a cookie, but remember that now
you must also check this cookie on every return request, too, and you
will need special logic to set the cookie on the first success.
* It's a bad idea to generate error pages in response to invalid
usernames or passwords. Instead, treat such conditions the same as if
it's the first time and there is no username or password; otherwise,
users who make typing errors will not have any way to correct them
during the login process. If you use a cookie to time out the session,
be sure to treat this just the same, and issue a new challenge.
If you use the OwaRealm directive, mod_owa will take over the responsibility
of logging the user in, using database authentication. You should therefore
either disable further authorization checks for that location, or arrange
to return TRUE from any AUTHORIZE procedures you may have; a simple way
to do this is to check that the OWA.USER_ID is identical to the current
database USER, as in:
if (upper(OWA_SEC.GET_USER_ID()) = USER) then
return(TRUE);
end if;
If you end OwaRealm with the special string "{TIME}", mod_owa will replace
the string with a generated timestamp string. This attempts to emulate a
WebDB work-around for the inability to force a browser to clear/expire
the Basic authentication credential. On logout, applications can set
the special cookie WDB_GATEWAY_LOGOUT to "YES" to force mod_owa to
re-authenticate the next request from the client. mod_owa will clear that
cookie (set it to "NO") on the first re-authentication, and the challenge
will seem (to the browser) to be a new realm due to the appended timestamp.
Example:
OwaRealm myrealm+{TIME}
transmits:
WWW-Authenticate: Basic realm="myrealm+2012-12-20T12:34:56.999999"
Use of the OwaSession directive doesn't have any effect on Basic authentication
processing. It causes mod_owa to strongly prefer to reuse a database
connection from the pool that has previously been used for requests from
the same user, based on a cookie value. It is not safe to assume that the
database connection is in fact the same one used previously. It is also
not secure to assume that because the session identifier matched the
connection pool, the user has legitimately been authenticated. OwaSession
should be considered a performance-related feature only.
Use of the OwaLDAP directive alters the behavior of OwaRealm and/or
OwaSession. OwaLDAP specifies a procedure to receive the contents of
the basic authentication header and/or the contents of the session
cookie specified by OwaSession. mod_owa will open a temporary database
connection and build a call to the procedure specified in the OwaLDAP
directive, of the following form:
begin (:username, :password, :session); end;
The username and password arguments are IN/OUT, while the session is IN
only. All three are filled in with values as/if available on the
request. On output, your procedure is expected to fill in the username
and password arguments with a valid database username/password pair,
or with empty strings if some sort of validation failure occurs. mod_owa
will then close the temporary database connection and will use the returned
values instead of any values that came with the request's basic authentication
header. This supports allowing the user to sign in to basic authentication
using credentials from another system such as LDAP, and yet still connect
to a native database account corresponding to the authenticated user.
The temporary database connection uses the OwaUserid connect string,
so it must be valid for this feature to work. To reduce the inefficiency
of constantly opening/closing the temporary connection, mod_owa will
cache the results of calling your routine, matching first on the
session cookie, and next on basic authentication username/password.
I do not recommend using OwaLDAP. A more robust method of logging in
via LDAP would be to build a login page using either another technology
such as Java, or using a mod_owa page designed for that purpose. The
result should be a secure session cookie which can be checked in your
AUTHORIZE procedure. Individual database accounts should not be used.
This feature exists because some users still want to rely on database
native security (e.g. database roles) for their application security.
If OwaLDAP is set to a string containing a "/", the above logic
is not used, and mod_owa instead obtain the username and password from
CGI environment variables, or the supplied string(s). This form of
OwaLDAP is
OwaLDAP /
For example:
OwaLDAP HTTP_SM_USER/HardcodedUniversalPassword
mod_owa will attempt to extract values from the variables with the
specified names, then use those values as if they had come from a
Basic Authentication header. If a variable is not found, the string
is used "as-is". The purpose of this mode is to allow users
to be authenticated by an external trusted authority that supplies
credentials to the HTTP request after validating it.
+----------------------------------------------------------------+
| This has some fairly obvious security flaws. If you cannot |
| absolutely guarantee that the CGI environment variables aren't |
| being forged and sent from the client, an attacker could send |
| requests with forged headers regarding the username/password |
| and launch a password-guessing attack on the database. Worse |
| still, if the password is hard-coded as in the example above, |
| the user might simply supply a valid username in a forged |
| header without ever having to know or guess the password. |
+----------------------------------------------------------------+
If OwaLDAP is set to "*", the above logic is not used, and mod_owa will
process basic authentication normally, but with one exception: it will
not override the value of the CGI REMOTE_USER parameter with the
username taken from the basic authentication header. This is done to
allow for an alternate means of authentication, wherein the request is
first processed by another module, and the basic authentication header
is set artificially. This is a cheesy hack!
Raw Request Processing
======================
In some special situations, programmers have needed access to the
raw bytes of a POST request, without these bytes being parsed and
unescaped by mod_owa as procedure arguments. To support this
type of code, mod_owa has been extended to include a calling mode
not found in the offical gateway specification. This raw request
calling mode is indicated by preceding the procedure with a tilde:
http://mymachine/owa/~foobar
which mod_owa will call as:
foobar(:B1,:B2,:B3);
:B1 is a raw buffer containing the unprocessed POST data or QUERY_STRING
:B2 is an output string for the response mime type
:B3 is an output BLOB locator for a binary response
Up to 32512 bytes of raw data can be passed in this manner. The two output
arguments allow you to return a RAW response directly from your procedure.
If you return a non-null result for the BLOB, mod_owa will skip the normal
GET_PAGE phase and use the contents of the BLOB as the response. In that
event, mod_owa will take the document mime type from the second argument
(additional header values can also be passed in this buffer - see the
previous section on file downloads for information on how to do that).
If you are running the Oracle 7.3 build of mod_owa, the call will have
only the first argument containing the RAW input data, because LOBs
aren't supported. You must therefore return your response via GET_PAGE
or a GET_PAGE alternate (see later section).
Error Status Returns
====================
The PL/SQL gateway allows you to return an HTTP error status code via a
call to OWA_UTIL.STATUS_LINE. Example:
OWA_UTIL.STATUS_LINE(404, Your_Key||' not found', TRUE);
In general, mod_owa allows Apache to handle the generation of an error page
by passing the status code back to Apache. How it gets handled depends on
how you configure Apache. By default, Apache typically generates HTML error
pages with boilerplate text. For example, it commonly generates an HTML page
of the following form in response to a 404:
Not Found
The requested URL /xxxxx was not found on this server.
When you allow Apache to handle the errors in this manner, it effectively
rewrites them; your client will receive a status code of 200 and an HTML
content page, instead of the original error code and message.
mod_owa allows you to avoid this in most cases by supplying your own error
page. The minimum requirement is that you supply a Content-Type header,
for example:
OWA_UTIL.MIME_HEADER('text/plain', FALSE);
OWA_UTIL.STATUS_LINE(404, Your_Key||' not found', TRUE);
You can optionally render your own error page following the headers. mod_owa
will take either the presence of a Content-Type header or the presence of a
content page as an indication that your PL/SQL wishes to control the error
response. REST applications in particular may need to do this.
*******
Options
*******
Diagnostics
===========
The OwaDiag flags allow control over built-in diagnostic features.
Most of these cause logging of information to the mod_owa log file,
though a few cause enhanced (but potentially unsecure) HTML output
to be returned when error conditions occur. Any or all of the flags
can appear together.
COMMAND print the (PL/SQL) command that's going to be run
ARGS print all of the arguments, bind variable names, and values
COOKIES print in-bound cookies as they appear in the request header
CGIENV print the entire CGI environment (shows the cookies, too)
POOL print the connection pool operations as they occur
HEADER print everything being sent to the OWA in the request header
RESPONSE print everything being returned in the response header
CONTENT print everything being returned as content to the request
TIMING print timestamps near pool/SQL operations
SQL display an HTML error page for any Oracle error encountered
(normally, an HTTP error code is returned)
MEMORY display an HTML error page for any failure of memory allocation
(normally, an HTTP error code is returned)
THREADS force the use of a new OCI connection on each request
(in effect, disables connection reuse/pooling)
ERROR log SQL errors (error code and statement)
The diagnostic file was designed to be used for diagnosing problems with
a mod_owa application or deployment. It wasn't designed to be used as a
general access log for production operation. There could be security issues
with leaving diagnostic flags on for a production operation. There could also
be performance issues related to locking of the file if multiple worker
processes attempt to write to it simultaneously.
Control
=======
A few special hooks are built into mod_owa to enable control over
the connection pool. These allow a user, through HTTP, to examine,
close, and reopen the pool for any Location. This is done by
a special convention: in the portion of the URL that is used for
the PL/SQL procedure name, any name ending in an "!" mark is assumed
to be a special control directive. There are currently seven such
directives supported:
SHOWPOOL! Prints statistics about the connections in the pool.
CLOSEPOOL! Closes all connections in the pool and freezes it so
that all subsequent requests must create and destroy
connections each time. Generally useful when you want
to shut down the database without shutting down Apache;
otherwise, the database won't shut down as long as there
are active connections (forcing you to "shutdown abort").
CLEARPOOL! Clears all "old" connections from the pool by closing
them. Useful if you want or need to trigger cleanup
operations externally (for example, if the cleanup
thread feature doesn't work on your platform).
OPENPOOL! Reopens the pool so that subsequent requests will create
and then reuse connections from the pool.
SHOWCACHE! Shows information about all files in all dynamic caches.
CLEARCACHE! Clears all "old" files from any dynamic file-system caches.
Useful if you want or need to trigger cleanup operations
externally (for example, if the cleanup thread feature
doesn't work on your platform).
AUTHENTICATE! Tests the authorization mechanism by forcing the system
to challenge the user with a Basic authentication dialog
if he hasn't already responded to one, then displays the
result in the response page.
Note that access to these functions could represent a security problem.
The current implementation requires that the database password be
supplied with the request, for example:
http://mymachine/owa/showpool!?owa
You can use the OwaAdmin directive to restrict use of control functions
to a range of IP addresses; for example:
OwaAdmin 123.45.67.0 255.255.255.0
restricts access to machines on the subnet 123.45.67.xxx. It is possible
to use this mechanism to restrict access to a single machine if desired
(subject to trusting the relevant portion of your network configuration).
It is also possible to use this flag with invalid values to effectively
disable the control functions, for example by setting 255.255.255.255 as
the subnet value and 0.0.0.0 as the mask (because, of course, any bits
that are 0s in the mask must likewise be 0s in the IP address value, too).
Also, note that the pool functions really only work for NT; on Unix,
there effectively is no pool because each worker process has its own
dedicated connection.
File System Caching
===================
When using the file download capabilities, some users have found that
caching content in the file system where Apache is running helps improve
their performance. mod_owa supports this with the OwaCache directive,
which has the following form:
OwaCache [lifespan]
This specifies a mapping between a logical name as seen in a URL (and as
seen by your PL/SQL code) and a physical file-system name. For example:
OwaCache /docs/images /usr/local/apache/htdocs/image_cache "2 hours"
causes mod_owa to translate a URL such as
http://localhost/owa/docs/images/foobar.gif
into
/usr/local/apache/htdocs/image_cache/foobar.gif
You can specify multiple OwaCache directives in any Location. There
are two main types of file-system caching supported
* static caches
This assumes that you have arranged to push the content from the
database out to the native file system through some external means.
mod_owa will only read from a static cache, never write to it.
The static cache is assumed to have a directory structure that
parallels the structure found in the database.
* dynamic caches
mod_owa will use the specified physical location as a read/write
cache for files matching the pattern.
The setting for lifespan governs which of these caching modes you will
get for any particular OwaCache. If you omit the lifespan, mod_owa
will use static caching mode. If you specify a lifespan, mod_owa will
use the dynamic mode. By default lifespan quantities are in seconds;
mod_owa will recognize other units such as "minutes", "hours", and "days"
(you only need to specify the first letter, thus "10d" equals 10 days).
A lifespan of 0 is treated as infinite.
In dynamic caching, mod_owa will compare the timestamp on the file
with the current system time and will use the file if it's within
the specified lifespan. If a matching file is not found or is
out-of-date, mod_owa will get the file from PL/SQL and (if possible)
generate (or re-generate) the file during the content download
operation. Because mod_owa cannot assume anything about the
directory structure for a dynamic cache, mod_owa will create any
levels of subdirectories needed by the path. For example, with the
preceding OwaCache directive:
http://localhost/owa/docs/images/gifs/foobar.gif
would be translated to
/usr/local/apache/htdocs/image_cache/gifs/foobar.gif
In static caching, the path is built identically but directories are never
created by mod_owa.
In either static or dynamic caching, the physical path will be
built to conform to the underlying operating system (OS). Thus,
on Windows the above translated path might be:
D:\apache\htdocs\image_cache\gifs\foobar.gif
Some additional notes about file system caching:
* You must make sure that the physical path you specify in the
OwaCache directive is compatible with the underlying OS. Also,
do not specify a trailing directory separator on either the logical
or physical names.
* In both static and dynamic caching, if the file being referenced is
a directory, mod_owa ignores it and just passes the request on to
PL/SQL (mod_owa does not have any way to read a directory, and even
if it did it has no way to know how your application would want the
data formatted for display).
* For dynamic caching, mod_owa needs to be able to determine the mime
type of the file from its path name (using the extension). mod_owa
will check this before writing a file to the cache; if the extension
doesn't match the mime type you've returned, the file won't be cached.
* If your file URL contains any arguments, mod_owa will bypass the
cache completely and go directly to PL/SQL.
* mod_owa will match the logical name to the longest path prefix it can
find; thus, if both "/docs/foo/bar" and "/docs/foo" are specified
as logical names, "/docs/foo/bar/file.txt" will be matched to the
longer prefix (but not to, say, "/docs/foo/b", which is treated
as a different directory).
* mod_owa uses a case-sensitive comparison when matching the logical
names (because it can't know if your underlying database file store
is case-sensitive or not, and this is the more conservative assumption).
It also preserves case when assembling the physical name, which
can have implications when the cache is later read (if the underlying
file system is case-sensitive, then it's possible a later read will
miss the cache even if the database file store is not case-sensitive).
* mod_owa won't write files to the cache that exceed 1 megabyte. For
file reads, mod_owa will use shared memory-mapped I/O for files below
this same threshold, and normal file I/O routines above the limit.
* The whole point of the file-system cache is to avoid calling PL/SQL;
this implies that any authorization code you would normally run won't
get executed if the cache is hit. For this reason, you should probably
cache only unsecured documents (or is there a way to use native Apache
authorization to secure such content?).
* The cache will slowly fill up with files, so you should arrange for
a background job to clean them up periodically (or use mod_owa's cleanup
thread if it works on your platform).
To make the file system cache work, mod_owa uses a number of OS-specific
interfaces that may make it somewhat less portable. The main functions
used are file open/close, generation of temporary file names, file stat,
and memory-mapped files.
I must note that Oracle has several technologies in iAS that are vastly
superior to this technique, particularly the WebCache, a memory-based
cache that can cache all forms of static content from the site (not just
PL/SQL-based), as well as many forms of dynamic content.
Cleanup Thread
==============
During periods of peak load, the number of connections in the pool (on NT)
will ramp up to a high-water mark. When the system is not under stress,
it might be a good idea to close unnecessary connections.
If you specify a non-zero value for the global directive owa_shared_thread,
mod_owa will launch a thread immediately after a worker process starts
up, and this thread will remain running for the life of that process. The
value you specify is taken to be the poll interval for the thread, in
seconds. You may optionally include units, provided they appear together
with the lone parameter to this directive, or appear in quotes. mod_owa
will recognize units such as "minutes", "hours", and "days" (you only need
to specify the first letter, thus "10d" equals 10 days). As an example:
OwaSharedThread "1 hour"
The cleanup thread sleeps for the poll interval and then, upon awakening,
checks each of your Locations to see if there are any connections that
have not been used for the specified interval. If it finds such connections,
it closes them. After performing the check, the cleanup thread goes
back to sleep for the poll interval again.
If you are using dynamic file system caching, the cleanup thread will
also attempt to remove any files that are too old. The cleanup thread
will remove any file that is older than the expiration time set in
the OwaCache command that created the dynamic cache, and is older
than the cleanup thread's poll interval.
This feature was designed for Windows. It works on Unix, but since
Apache uses a multiprocess architecture on Unix, it means that you have
one cleanup thread per worker process. On Linux, this is particularly
heavyweight because threads are implemented as cloned processes. Another
problem with this feature on Unix is that it requires the use of system
semaphores, which also have issues. If you intend to use this feature,
especially on Unix, I urge you to review the section on system internals
near the end of this document.
+----------------------------------------------------------------+
| Apache 1.3 wasn't designed to be multithreaded on Unix. Of |
| the Unix variants, I have used this feature successfully on |
| Linux and Solaris. Apache 2.0 and 2.2 support threading with |
| some MPMs. However, the prefork MPM is single-threaded only |
| and the shutdown signal handling code is not thread-safe. I |
| have experienced some crashes during shutdowns because of |
| this. The crashes aren't harmful, but they're annoying, and |
| if you can't live with them, don't use the cleanup thread with |
| prefork. |
+----------------------------------------------------------------+
Shared Memory
=============
If you specify a non-zero value for the global directive OwaSharedMemory,
mod_owa will create a shared memory segment for caching pages and (on
Unix) for inter-process communications. The segment is created at
module load time, which on Unix means after the directives have been
processed but before the children are spawned (mod_owa relies on the
order of operations here so that the children inherit handles to the
shared memory area and the associated semaphore -- otherwise, mod_owa
would have to create a named shared area, or use a memory-mapped file,
both of which would complicate matters greatly). The value you specify
is taken to be the size of the desired shared segment in bytes. You may
optionally include units, provided they appear together with the lone
parameter to this directive, or appear in quotes. mod_owa will recognize
units such as k (kilobytes) and m (megabytes). As an example:
OwaSharedMemory 2M
On Unix, mod_owa takes advantage of the shared memory segment to
track the state of database connections globally across all running
worker processes, so that the SHOWPOOL! pseudo-command will provide
an accurate picture of the resources in use by a site.
On Windows, this feature could have been implemented as a normal
memory region common to all threads (in other words, only the mutex
securing the shared area is needed). I've gone ahead and used an
actual shared segment in anticipation of the Apache 2.0 architecture,
which will combine multi-process and multi-threading on all platforms.
If you intend to use this feature, especially on Unix, I urge you to
review the section on system internals near the end of this document.
+----------------------------------------------------------------+
| I have used this feature successfully on Windows, Linux, and |
| Solaris. In the future, this feature should be used for a |
| memory-based page cache, but I haven't implemented that yet. |
+----------------------------------------------------------------+
Unicode Modes
=============
All strings are bound to PL/SQL procedures using the DAD character set
(or the character set specified in NLS_LANG). The database server must
convert these strings to the server character set before passing them
to the procedure (and from the server character set when returning
them). However, there are two server character sets, the normal
character set (used by CHAR and VARCHAR types) and the national character
set (used by the NCHAR and NVARCHAR types). With Oracle 9i, the national
character set is always Unicode-based. If you need to call a procedure
that takes Unicode arguments, you may not want to take a (potentially
lossy) conversion through the CHAR/VARCHAR character set. Moreover,
you may want to manage the character set conversions yourself, supporting
multiple client character sets from the same Location. mod_owa allows
you to do this with the OwaUnicode directive, which has three settings:
USER This causes mod_owa to generate NCHAR-based bindings for all
calls to your user procedures, including the document action
procedures. Calls to standard OWA functions continue to use
CHAR/VARCHAR bindings.
RAW This causes mod_owa to generate RAW bindings for all calls to
your user procedures, including the document action procedures.
Calls to standard OWA functions continue to use CHAR/VARCHAR
bindings. Also, mod_owa's own output binds continue to use
CHAR/VARCHAR bindings.
FULL This includes user procedures by default, and in addition binds
all calls to OWA procedures (e.g. INIT_CGI_ENV, GET_PAGE, etc.)
using NCHAR-based bindings. Obviously, you must have a modified
or alternate OWA implementation to take advantage of this feature.
Note that in any case the mod_owa side of the binding always uses the
DAD (or NLS_LANG) character set. The main use cases for this feature
are to avoid lossy conversions
1) when running utf-8 on the mod_owa side against a database that for
whatever reason cannot use multibyte UTF8 as the CHAR/VARCHAR set,
2) when running different DAD character sets from multiple Apache servers
and/or locations against a database that for whatever reason cannot
use multibyte UTF8 as the CHAR/VARCHAR set, and
3) when serving clients with multiple character sets from the same
Location (DAD).
Note that for RAW outputs, you will need to set the OwaAlternate directive
appropriately.
Describe Modes
==============
mod_owa has the ability to run a describe operation on your procedure
whenever binding parameters using the standard gateway rules fails. This
describe operation is costly and I recommend avoiding it. It should
never be necessary with correctly-written PL/SQL code. Describe modes
are controlled by the OwaDescribe parameter, which consists of a
mode and an optional schema name, as follows:
OwaDescribe [schema name]
The allowable values for mode are as follows:
STRICT No redescribe is performed; the binding failure is reported
immediately. This mode is recommended for debugging and/or for
highest performance.
NORMAL This is the default (if the OwaDescribe directive is
omitted), and is compatible with previous versions of mod_owa.
If binding fails, a describe is done and an attempt is made
to promote scalar arguments to singleton collections where
necessary to satisfy the procedure signature.
RELAXED This is the most forgiving mode, and is somewhat compatible
with the behavior of OAS. Not only may scalars be promoted
to collections, but collections may be downgraded to scalars
(discarding all but the first value) and unmatched arguments
whose names don't match those of the procedure will simply be
discarded.
The most common situation where a describe may be needed is to bind an
argument as a singleton array, when normally this would be considered
a scalar value. This is the situation that NORMAL is designed for. I
must note that this can always be avoided, either by adding null-valued
hidden form elements with the same name as the argument in question to
ensure that mod_owa treats it as a collection binding, or by overloading
the procedure in question with an additional entry point that has a scalar
argument of the same name in place of the collection (and that typically
would just create a singleton table of strings and pass it to the original
implementation).
In general, use of describe requires that you be using the standard
name-based binding mode. mod_owa will attempt to describe positional
statements but must then rely on positionally matching the arguments
to the order returned by describe. In this situation RELAXED is not
useful, because it's impossible to tell which bindings do not have
valid matches, so for positional bindings RELAXED has no effect beyond
NORMAL. Describes are never done for flexible-argument procedures,
because they should be unnecessary. However, since there are two
variants of the flexible-argument calling convention, the 2-argument
mode will be tried if the 4-argument mode fails.
I caution all users that the describe operation may fail to yield a
valid result. In general, it is very hard to match signatures with a
PL/SQL function because PL/SQL allows arguments to have default values
(which means they need not be supplied) and because it allows overloading
of a function name within a package. In cases of overloading, mod_owa
is biased towards selecting the first function it finds (typically, it
will continue past the first function only if it cannot find any
variable that's a candidate for promotion).
You can also use this directive to specify a "describe schema". This
can be useful in situations where you are separating the mod_owa run-time
schema from the actual schema containing your database tables and
procedures. In this circumstance, describes may fail if you use synonyms
rather than prefixing your package names with the actual schema containing
them (because describe returns a description of the synonym, not of the
actual procedure). mod_owa will attempt to re-describe through the
synonym, so in most circumstances this optional parameter is no longer
useful. It's still supported for backward compatibility, and for
situations where the synonym translation doesn't work correctly due
to the awkwardness of OCI's describe API. If you specify a describe
schema, mod_owa will append this string, with appropriate separator,
to any package or procedure that it needs to describe, unless mod_owa
can determine with certainty that there is already a schema specified. Note
that there is one ambiguous case where a schema-prefixed procedure (with no
package portion) will be mistaken as a package.
In order to make describe operations less costly, mod_owa can cache
the results of doing a describe such that they can be used on
subsequent calls. This saves not only the describe overhead, but
the cost of failing the original statement. The cache is very
primitive. Any described procedure will be put into the cache
just once; therefore, overloading could be defeated by activating
caching (overloading is not officially supported by the gateway
standard, but mod_owa normally does its best). The cache is only
capable of doing scalar promotion to collection binding mode (which
is the main reason applications tend to rely on describe). The
cache uses a simple linear search and probably won't scale well
past a few 100s of procedures. Finally, the cache is never flushed,
so if you are doing development and changing procedure signatures,
you may need to bounce your Apache listener if you change something
that only worked via the describe cache. Describe caching is activated
by passing a "*" for the describe schema, as in:
OwaDescribe NORMAL *
Bind Rounding
=============
In an effort to reduce work on the server, the Oracle database attempts
to reuse SQL statement execution plans by maintaining a pool of
recently-executed statements. This pool is checked whenever a new
statement is parsed. The matching algorithm requires an exact textual
match of the SQL statement itself, as well as a match of the data types
of all arguments to the statement.
PL/SQL statements created by mod_owa are subject to this mechanism. In
particular, it is hoped that frequently-called procedures will have
identical SQL text and bind variable types, and therefore be able to
take advantage of statement sharing. Based on your Location directives,
mod_owa will always build the same SQL text for a given procedural
request. Also, mod_owa will always bind all arguments as strings (or
as collections of strings). Unfortunately, this is not always sufficient
to ensure that statements will be shared, because the size of the
bind parameters (in bytes) and of the collections (in elements) are
also relevant factors in the matching algorithm.
To increase the likelihood of sharing statements, mod_owa has the
ability to round the byte widths of scalar parameters to multiples
of a specified value, and to round the element counts in collections
to multiples of another specified value. Control of these rounding
units is via the OwaRound directive:
OwaRound
For example:
OwaRound 100 20
will cause mod_owa to round all strings up to 100-byte boundaries, e.g.
100, 200, etc., and will cause mod_owa to round collection sizes up to
20-element boundaries, e.g. 20 elements, 40 elements, etc.
Note that while use of this parameter will not change the data that your
procedures see as input, it will increase the amount of memory required
for binding the statement on the server side (in particular, collections
will be rounded to boundaries that are a multiple of both, thus in the
above example the smallest collection would be 20 elements x 100 bytes
or 2000 bytes). Thus, using this parameter trades increased memory
consumption for improved sharing (and therefore speed and scalability).
Besides higher memory consumption, use of this parameter has another
drawback when used with incorrectly-coded PL/SQL procedures. Normally,
procedures called by the PL/SQL gateway should not contain OUT
parameters. If they do, then conceivably the procedure could attempt
to write back information into the bound values (which never makes sense,
since in the best case the gateway will simply discard the values). This
writeback could overflow the buffers allocated; without rounding, this
results in an Oracle error, but if rounding is used, it could result in
a buffer overflow inside the OCI, and therefore crash mod_owa (and
Apache). You should make sure that none of your procedures use OUT
parameters before using this feature.
If not specified, the default behavior is to do no rounding; this is
compatible with older versions of mod_owa.
UTF8 Validation and Failover
============================
HTTP requests from the most commonly-used browsers unfortunately do
not contain a header which enables the server to detect the character
set in which the request is formatted. This forces web site developers
to make assumptions about the browser character set. As described
elsewhere in this document, mod_owa allows you to set up different
Locations with different assumed character sets using the OwaCharset
directive.
In the limited case where the server's character set (set either through
NLS_LANG or via OwaCharset) is a Unicode equivalent (e.g. "AL32UTF8"),
mod_owa can be instructed, via the OwaBindset directive, to check that all
incoming data contains only valid Unicode byte sequences. If OwaBindset
is specified, mod_owa will make this check on every request. If any
string data is found to contain invalid bytes, then mod_owa will use
the Oracle character set specified in OwaBindset as the character set
bound via OCI for marshalling of the procedure arguments.
The check adds overhead to every request, whether or not invalid bytes
are actually found. The check stops as soon as any invalid bytes are
found. The bind override is applied only to inbound bindings for
normal procedure calls. It has no effect on document upload/download
operations. Outputs continue to be produced in UTF8 (so the response
page sent by the Location will still bear the "utf-8" character set).
The intended use of this feature is to enable you to support two
populations of users from a single Location, one population using
UTF8 and another using a single-byte Western European character set.
This check is not foolproof. It will always detect invalid byte
sequences, but it is possible for requests that aren't in UTF8 to
nevertheless have byte sequences that appear to be valid. The
chances that this will occur vary by character set and by the
amount of content (the more content, the less likely it will be
that the byte sequences will appear valid). Specifically with
respect to the commonly-used single-byte Western European character
sets ("WE8MSWIN1252" and "WE8ISO8859P1"), it's very likely that the
check will be able to catch the invalid bytes and rebind them. The
feature is less likely to work reliably with other multi-byte character
sets, or with single-byte character sets for scripts such as Greek,
Cyrillic, or Arabic. The feature will not work correctly if the
Location is not running UTF8 or AL32UTF8; OwaBindset doesn't check
this, though, so you should do so before using it.
Apex (HTML-DB) support
======================
Many users have asked about running Apex using mod_owa instead of
modplsql. It's not clear whether mod_owa can support Apex because
Apex may rely on calling/binding modes that aren't supported by
mod_owa. Also, Apex relies on the WebDB file upload/download
interface to receive and deliver content via a documents table. I've
made some effort to partially resolve these issues:
* Calls via the multipart/form-data content type
Apex relies on these for many of its normal procedure calls, whereas
mod_owa sees that content type as exclusively triggering a file upload
operation.
* Uploads to a document table, and downloads from either an in-line
binding from WPG_DOCLOAD or from the document table.
Specifying the OwaDocTable argument disables mod_owa's file upload
interface in favor of WebDB's. multipart/form-data calls will be brought
into memory and prepared as procedure arguments. Any file arguments are
written to the document table and the system-assigned name of the file is
passed to the procedure in place of the original argument value.
REF cursor returns
==================
This experimental feature is activated by using the OwaFlex directive and
prepending an "@" sign on the package/procedure name. Example:
OwaFlex @YOUR_PACKAGE.YOUR_PROCEDURE
mod_owa will call your procedure with two extra OUT arguments, the first
of type SYS_REFCURSOR and the second of type VARCHAR2:
begin
YOUR_PACKAGE.YOUR_PROCEDURE(:B1,:B2,:B3,:B4,:B5,:B6);
end;
:B1 is an input with the number of name/value pairs
:B2 is an input array of parameter names
:B3 is an input array of parameter values
:B4 is an input array reserved for future use
:B5 is an output of type SYS_REFCURSOR
:B6 is an output of type VARCHAR2
Your procedure should return any HTML headers, including the content type
and request status, using the usual OWA facilities. If you need to return
anything other than a standard success response, render it in the usual
way and do not return a REF cursor. mod_owa will pick up your headers
via the standard GET_PAGE operation, and, if necessary, any error page.
However, if you return the standard success response, and a content type
that is not text/html, mod_owa will expect an open cursor to be returned in
the fifth argument, and mod_owa will iterate over this cursor streaming a
response body to the client.
mod_owa will look at the content type you return to decide how to render
the results. If the content type is text/html, mod_owa assumes you didn't
return a REF cursor and expects the rest of your content to be returned by
GET_PAGE. Otherwise, mod_owa expects a valid REF cursor to be returned.
If no content type is returned, mod_owa defaults to using text/plain as the
content type. If your content type isn't one of the types listed below,
mod_owa will process your data as text/plain (but still leave the content
type you set, e.g. text/richtext). Supported content types:
* text/plain
* text/csv
* text/xml
* application/json
mod_owa accepts two types of cursor return pattern:
* Cursor returns multiple columns with no LOBs
* Cursor returns a single column of type CLOB
For the multi-column return, mod_owa behaves as follows:
text/plain The data rows are rendered by concatenating the columns
into lines, and placing a newline at the end of each
row. No escaping or formatting of any kind is done.
text/csv The column names are rendered on a header line,
followed by the data rows. All values are enclosed
in double-quotes and escaped as necessary. The names
and values are separated by commas, and a newline is
placed at the end of each row.
text/xml A single XML document with as the
root element is returned. Each row is rendered
as a element within the collection. Each
column is rendered as an element, with the column
name transformed into an XML tag name, and the
data value as a text node within the element. Entities
are escaped as necessary.
application/json A JSON array is returned with one array element per
row. Each row is a JSON object, with members matching
the column names. The names and values are escaped
as necessary. Note that this content type works
properly only if the mod_owa character set is Unicode.
Note that mod_owa picks up all column values as string bindings, including
any numbers, dates, timestamps, RAWs, etc. The default formatting implied
by your NLS environment settings will be used. To control the formatting
in your application, perform a TO_CHAR in your SQL statement.
If a CLOB is returned, the CLOB is expected to contain a pre-rendered
"row" in one of the above formats. mod_owa makes no attempt to verify
this, but simply streams the contents to the client. For text/csv,
the rows are presumed to be CSV rows (the cursor may return a header row
or not, as desired, but mod_owa won't know or care that this was done).
For text/xml, mod_owa assumes the content represents a whole element,
and uses it instead of creating a element; effectively, each
LOB is concatenated within the stream underneath the
element. For application/json, mod_owa assumes the content represents a
valid JSON object suitable for use as an array element; the LOB contents
are concatenated with comma delimiters. In all cases, with CLOB returns,
any necessary escaping and formatting is assumed to have been done on
the server.
The optional OwaRefXml directive can be used to override the XML tags
used for the root and row elements, including specification of an XML
namespace prefix and URI. Note that if a namespace is specified for the
multi-column rendering mode, it will be used for all column tags as well.
The last argument of type VARCHAR2 can be used to return content type
specific information to the renderer. This is used by both the JSON renderer
and the XML render, in different ways. (For other content types, it's
reserved for future use.)
The JSON renderer uses the value as the name of a wrapper field for your
collection return. It's useful to protect your site against a JSON scripting
attack against unprotected browers. Return a non-null string to use as the
name of the array that will contain your rows.
The XML renderer uses the value to override OwaRefXml, or in lieu of it.
You can return a whitespace-delimited string containing the root and row
tag information, along with the namespace prefix and URI, in a form similar
to the OwaRefXml directive. Note that since spaces are used as the delimiter
each space is significant (this allows you to return blanks for some values.
This code path is new and relatively untested. In particular error handling
of bad REF cursors is not well understood, nor are the implications of
cursor creation in response to requests. Badly-behaved applications or
error conditions may result in cursor "leaks". The adequacy of the escaping is
also not known. Use of the CLOB method has memory/scalability implications
if the CLOBs are temporary. Once again, error conditions may lead to "leaks"
of temporary LOB objects. Finally, if the server uses the CLOB method,
the entire burden of accurately rendering a safe response rests with
the server-side code.
REST operations
===============
By default, mod_owa only handles GET and POST requests. With the
optional OwaHttp parameter, this can be expanded to include PUT and
DELETE requests, or even the full set of DAV requests.
* NORMAL The mod_owa default, compatible with earlier releases.
* REST Enables mod_owa to handle PUT, DELETE and PATCH
to support REST services.
* DAV Enables all DAV operations (not yet completed).
If the request contains a content body that cannot be interpreted as
HTML form arguments, and the content body is not of zero length, mod_owa
will treat it as a file upload. This is enabled only if REST or DAV is
specified. The body is delivered as an extra parameter, MODOWA$CONTENT_BODY.
For now, this works only with the Apex-style document upload (OwaDocTable).
The HTTP method can be obtained from the CGI variable "REQUEST_METHOD".
A related parameter is OwaDav, which allows you to configure a procedure
to handle REST (or, eventually, DAV) requests. If configured, this
parameter takes precedence over OwaDocProc (or other file-like path
handlers). The configured procedure is called in the 4-parameter
flexible argument mode. The REST path (if any) is delivered as an
extra parameter MODOWA$REST_PATH.
Optionally, if you precede your package/procedure name in OwaDav with
an "@" sign, mod_owa will call it in the 6-argument REF cursor mode
described earlier.
Alternate OWA Implementations
=============================
For various reasons, some mod_owa users have wanted to write their
own version of the PL/SQL side of the gateway interface, in effect
replacing the OWA packages with their own implementations of similar
APIs, possibly with enhanced or specialized functionality. Typical
reasons for using an alternate implementation include:
* Generating and returning RAW content (Oracle's OWA only supports
character data for dynamic pages).
* Keeping PL/SQL state between HTTP requests (Oracle's OWA
is not designed to support this sort of reentrancy).
* Doing on-the-fly content generation during streaming (Oracle's
OWA requires that you generate your page into its internal
buffer before data transfer starts).
mod_owa supports alternate mechanisms via the OwaAlternate directive.
When this directive is set, it specifies the name of an alternate
package to use for the basic functions of INIT_CGI_ENV, AUTHORIZE,
and GET_PAGE. When the package is specified, mod_owa will build
begin .INIT_CGI_ENV(:ecount, :namarr, :valarr); end;
for the environment variable transfer,
begin
.USER_ID := :usr;
.PASSWORD := :pwd;
end;
for the Basic authentication transfer,
begin
if (.AUTHORIZE) then
...
else
:realm := .PROTECTION_REALM;
end if;
end;
for the authorization check (unless you've disabled authorization or
requested the per-PACKAGE check), and
begin .GET_PAGE(:linearr, :nlines); end;
for the final page fetch loop.
In the special case where "OWA" is passed as the package name, no change
to the above calls is made (this is so that you can use some of the
additional parameters below in conjunction with Oracle's official OWA).
You may also specify one or more additional parameters to switch on
customized functionality:
* KEEPSTATE
When this flag is included, it suppresses the generation of the
DBMS_SESSION.RESET_PACKAGE call, thus enabling your code to keep
state between requests. This is useful mainly for caching and other
similar optimizations. Note that because of Apache's pooled process/thread
architecture for workers, there is no guarantee that subsequent requests
from the same client will be routed to the same database session and hence
the same PL/SQL package state. Also, note that your OWA alternative
and all of your other PL/SQL code should be written so as to be re-entrant
(Oracle's OWA is not re-entrant). Use of this flag takes precedence
over the setting of OwaReset.
* WITHRAW
When this flag is included, it causes mod_owa to generate an alternate
call to GET_PAGE where the content is returned in two parts:
1. the header elements, plus the first block of character content
(if any), as a single string of up to 32512 bytes, and
2. the first block of raw content (if any) as a single buffer up to
32512 bytes.
There is also a flag to indicate if more data is available than was
transferred (return null on the last content transfer, any other value
if more content is still waiting for transfer).
The call is generated as follows:
begin .GET_PAGE(:cbuffer, :rbuffer, :moreflag); end;
The header elements should be concatenated with newlines as separators
(mod_owa will pass in a newline with the value of :cbuffer, which is
an IN/OUT parameter). If returning character-typed content, an
additional newline should be appended before starting to append the
content values. For binary content, return the first block in the
RAW buffer. For all subsequent transfers, place the next chunk of
the content into either the cbuffer or rbuffer as appropriate, setting
the unused buffer value to null. Do not return content in both buffers
(except during the first call, when a header may be returned with the
first block of raw data); if you do, mod_owa will use the raw data in
preference to the character data. To signal the end of the stream,
return nulls in both buffers and/or set the flag to null.
* USELOBS
When this flag is included, it causes mod_owa to generate an alternate
call to GET_PAGE where the content is returned in three parts:
1. the header elements, as a single string of up to 32512 bytes,
2. a binary LOB, for binary content return types, and
3. a character LOB, for character content return types.
The call is generated as follows:
begin .GET_PAGE(:header, :pblob, :pclob); end;
The header elements should be contatenated with newlines as separators
(mod_owa will pass in a newline with the value of :header, which is an
IN/OUT parameter). Only one of the two LOBs should be returned with
content, the other should be set to null. Note that this flag
overrides WITHRAW if both are present.
* GETRAW
When this flag is included, it causes mod_owa to generate a call
that supports raw transfers, as follows:
begin .GET_PAGE_RAW(:rawarr, :nraw); end;
This call works just like the GET_PAGE interface, except that the content
is returned in an array of raws rather than an array of strings, each up
to 256 bytes in length. The value of nraw is set on input to be the
maximum number of rows to be returned, and on output should be the actual
number of rows returned. As with the standard GET_PAGE, this is called
until no more rows are available.
Note that even textual content is returned via this interface; this
allows you to return pages in alternate character sets and perform
content-length calculations within the server code.
The OWA requires that the gateway module signal that raw transfers will
be done, so that during the time that your procedure is writting the
page, it can buffer the data in raw form. To signal this, mod_owa
will alter the security-passing statement to appear as follows:
OWA.SET_USER_ID(:usr);
OWA.SET_PASSWORD(:pwd);
OWA.ENABLE_RAW_MODE;
Oracle will include the new GET_PAGE_RAW interface in a future
release of the official OWA, making this the preferred technique for
extending the OWA for raw support. As such, you should consider the
previously-described WITHRAW and USELOBS modes obsolete. This flag
takes precedence over the others if inconsistent settings are present.
* CACHE
When this flag is included, it causes mod_owa to generate a special
check after the call to your routine. The operation of this check
is described in the next section. In conjuction with file-system
caching, this flag allows your alternate implementation to direct
mod_owa to cache files on the file system. In fact, you can do
it with Oracle's OWA as well -- just specify "OWA" as the package
name, as in:
OwaAlternate OWA CACHE
* SETSEC
When this flag is included, it causes mod_owa to pass the Basic
authentication information via function calls, rather than via
direct assignments to package global variables. Thus, where it
would normally use
OWA.USER_ID := :usr;
OWA.PASSWORD := :pwd;
it will instead use
OWA.SET_USER_ID(:usr);
OWA.SET_PASSWORD(:pwd);
This was designed to work around a crash in PL/SQL that relates to
the use of the direct assignment statements after a RESET_PACKAGE.
You can use this with Oracle's OWA, but only if you've modified the
code (in pubowa.sql and privowa.sql) to include the necessary
functions (a future release of the code will include them). Note
that if GETRAW is used, SETSEC is assumed, since the form of the
statement will use the new function calls.
* NOMERGE
When this flag is included, it causes mod_owa to skip merging of
URL parameters with content arguments when processing POST requests.
This mode is compatible with a bug found in older versions of mod_owa.
* CGITIME
When this flag is included, it causes mod_owa to build an extra
call into the CGI environment-passing statement. The complete
call is then:
begin
.INIT_CGI_ENV(:ecount, :namarr, :valarr);
.INIT_CGI_EXTRA(:rtime, :postargs);
end;
The second call is added to pass the elapsed time in milliseconds
since the request was received by Apache. The postargs parameter
is an empty string unless CGIPOST is set.
* CGIPOST
When this flag is included, it causes mod_owa to build an extra
call into the CGI environment-passing statement, similar to the
statement used for CGITIME, but included up to 32k of the post
arguments in the postargs parameter. If the statement is not a
POST, an empty string is passed.
* IPADDR
When this flag is included, it causes mod_owa to build extra
lines into the CGI environment-passing statement that sets the
value of OWA.IP_ADDRESS if a client IP address is available.
* LOGGING
When this flag is included, it causes mod_owa to build an extra
call into the GET_PAGE content retrieval statement. The extra
call is used to return logging information to mod_owa from the
PL/SQL environment. The LOGGING flag is incompatible with the
USELOBS, GETRAW, and WITHRAW return modes. The complete call is:
begin
.GET_PAGE(:linearr, :nlines);
.GET_LOGS(:logs, :ilogs);
end;
Though the statement is run repeatedly and makes both calls each
time, only the logging content returned on the first invocation is
used by mod_owa.
The flags must follow the package name, separated by spaces. For example:
OwaAlternate MY_OWA_PKG KEEPSTATE USELOBS
Caching Dynamic Pages
=====================
Since mod_owa has all the infrastructure necessary to cache content to
the file system, I've added a special mode of operation to allow you
to use mod_owa to cache your dynamically-generated pages as well. This
mode is enabled by setting the value "CACHE" in the OwaAlternate
directive (remember, it must follow a package name, even if the package
is "OWA" itself).
When this mode is enabled, it changes the call that mod_owa generates
to execute your procedure, causing mod_owa to add an additional call
to check for a cache request, as follows:
begin
if then
.( => :B1, ...);
.CHECK_CACHE(:O1,:O2,:O3,:O4);
commit;
else
:realm := OWA.PROTECTION_REALM;
end if;
end;
:O1 is an output string where you can return a logical document URI
:O2 is an output string where you can return a string to be checksummed
:O3 is an output binding for the page mime type
:O4 is an output binding for the cache lifespan (in seconds)
Before the call, mod_owa will set an environment variable, "MODOWA_PAGE_CACHE",
to "enabled" and pass this in the CGI environment, where your procedure can
look for it to decide if it should attempt to hit the cache. If it does
decide to hit the cache, then you should arrange for the routine CHECK_CACHE
to return a non-null value for the logical document URI. You should also
arrange for CHECK_CACHE to return the mime type that your procedure would
have returned had it actually run. You can optionally also arrange for
CHECK_CACHE to return a string to be checksummed and appended to the
document URI (more on that later). You can also optionally return a
lifespan which will take precedence over the setting in OwaCache
(for this document only).
Note that the CHECK_CACHE call is generated in such a way as to be
in the same package as your procedure. In this way, you can easily
arrange for the procedure to return results through CHECK_CACHE by
using PL/SQL package global variables. If the original operation
was not a packaged procedure, then page caching is not supported.
The URI you generate should be a logical document path that matches
the logical path name of one of the caches you specified with OwaCache,
for example "/docs/mycache" (refer to the earlier description of
logical and physical locations and the OwaCache directive). You
cannot use a logical location that's mapped as a static cache,
however, because mod_owa will never attempt to write to a location
designated as static. If a valid logical location is found to match,
mod_owa will translate the logical name to a physical file name, look
for a file with the correct name, and check the lifespan versus the
specified expiration time you return from CHECK_CACHE. If the file
is found, that's it, mod_owa returns it using the file download
mechanism described earlier.
If mod_owa doesn't find the file in the cache, then it will re-execute
the same statement shown above, with the expectation that this time
your procedure will actually produce some content for return via
GET_PAGE. (For this reason, your procedure should set some sort of
package-level flag to indicate to itself that it's already been called
once in the current context, and that it should not make yet another
attempt to hit the cache.) mod_owa will keep the physical name from
the previous call, so on this second call you needn't return anything
(other than nulls) from CHECK_CACHE.
To summarize the process:
if (OWA_UTIL.GET_CGI_ENV('MODOWA_PAGE_CACHE') = 'enabled') then
if (FIRST_CALL) then
...generate values for URI, checksum, mime type, and lifespan...
...set them in package globals so that CHECK_CACHE returns them...
FIRST_CALL := false;
return;
end if;
end if;
...generate your content normally...
Note that you can completely avoid the cache mechanism in any circumstance
desired by arranging for CHECK_CACHE to return a null for the URI argument,
thus:
if (OWA_UTIL.GET_CGI_ENV('MODOWA_PAGE_CACHE') = 'enabled') then
...set globals so CHECK_CACHE returns null URI...
end if;
...generate your content normally...
The keys to success with caching these dynamic pages are:
1. proper selection of the lifespan, and
2. proper uniqueness of the physical file name generated.
mod_owa can't do either of these things for you, since it has no way to
know which aspects of the request (arguments, cookies, etc.) are relevant
to include when generating the file name, and which are not, nor does it
have any way to know when the cached content should expire. In general,
you should strive to put some sort of unique identifier into the base URI,
perhaps combining the procedure name with some key argument value, e.g. if
ORDERS.GET_ORDER is called for order 101, generating:
/docs/mycache/orders$get_order$101$.htm
as the URI is helpful, though perhaps not sufficient. One problem you
may run into is that because file system names are limited, you might
not be able to pack enough information into the URI and still have a
valid file name. To mitigate this problem, you can return up to 32512
bytes of information from any source desired as the second argument of
CHECK_CACHE, and mod_owa will run CRC32 on the data and append an
8-digit hexadecimal string to your URI to help ensure uniqueness. If
a checksum is inserted, mod_owa will place it just ahead of any file
extension that's present, e.g. in the previous example you might get
/docs/mycache/orders$get_order$101$12345678.htm"
As noted earlier, Oracle has many technologies that are far more capable
than this one.
Chunking of large POST arguments
================================
The OCI imposes a width limit of 4000 bytes for collection bindings. In
most cases, you should be able to design your PL/SQL interfaces so that
large values (up to 32512 bytes) are passed as scalars. This is not the
case if you are using the flexible argument passing mode, however.
In situations where you are using the flexible argument calling mode,
either directly by prefixing the call with "!" or indirectly
via the OwaFlex directive, mod_owa will break any argument value
larger than this 4000 byte limit into "chunks". mod_owa will signal
this to your code by appending a "chunk number" to the parameter names.
For example:
given: MYPARAMETER=<6000 byte string>
mod_owa generates: ["MYPARAMETER.1", "MYPARAMETER.2" ]
[<3999 byte chunk>, <2001 byte chunk>]
Note that each chunk is slightly smaller than 4000 to allow space for the
string's trailing null terminator.
Please note that this mode of operation is invoked only when using the
flexible argument calling mode from a normal request; it is not used
for collection bindings that are created for other calling modes, nor is
it used when performing the flexible argument call for document operations.
This mode of operation may not respect multi-byte character boundaries.
Some logic has been added to attempt to break the chunks on even character
boundaries, but hasn't been tested and probably won't work with all
character sets. Oracle's character-handling logic has real trouble
with strings containing odd bytes (fractional characters), so beware!
This feature is here primarily because the alternative is to generate
an error.
On modern versions of the OCI, the array width limit is now 32512, so chunking
may not be relevant anymore. mod_owa will still run the above chunking logic,
but it will do so for values larger than the 32k limit.
**********************
Additional Information
**********************
Connection Pool
===============
Creating a database connection involves a login-like operation that can
cost significant time if the database is located on a machine other than
that of the Apache server (and even if on the same machine). Also,
parsing statement handles is somewhat costly especially when they
involve PL/SQL anonymous blocks. To avoid these costs, mod_owa keeps
the database connections and cursors it uses to process a request, and
reuses them on subsequent requests.
On single-threaded implementations (currently all 1.3 Unix ports including
Linux and Solaris), there is no point in having more than one connection
per Location, because any particular httpd worker process can only
be servicing a single user request at a time. So, in these cases, the
connection "pool" has a single entry, which is filled on the first
successful request and thereafter reused for every other request.
On multi-threaded implementations (currently only Windows for 1.3, but
any platform for 2.0), the process may have to handle multiple requests
at once. During virtually the entire time a request is being processed,
a database connection will be in use (because the database side of the
operations is where all the time goes). Since it would be undesirable
to have user requests single-thread through a solitary database connection,
a pool of reusable connections is needed. Each Location gets such a pool
with up to 255 available "slots". Initially the pool will be empty; the
first sucessful request for a PL/SQL call will fill the first slot with a
reusable connection. Afterwards, this connection will be reused unless
it is in use (locked) by another thread, in which case the next slot
is examined/filled, and so on. The maximum pool size is tunable per
Location in the httpd.conf file. Each pool has a mutex (sometimes
called a "latch") to ensure that the process of obtaining a connection
from the pool is orderly. The code keeps the time window during which
a thread must be latched as small as possible, since this represents a
potential bottleneck. Also, each pool has a semaphore that is decremented
whenever a connection is drawn from the pool, and incremented when
connections are returned to the pool. In situations where the pool is
at maximum capacity and every connection is in use, threads requiring a
connection will be blocked and queued for the first available resource.
The blocking mechanism is timed out after 1/2 second since at that point
it's probably faster to create a new connection for one-time use.
A special value can be supplied for OwaPool to enable operation without
the use of a semaphore, "THREADS". If this keyword is supplied, mod_owa
will use 255 for the maximum number of connections in the pool, and
additionally will skip creation and use of any semaphore for that
location. This mode can be useful in situations where you are certain
that the number of threads cannot exceed the number of connections in
the pool (in other words, that the pool size is thread-count limited
and not actually limited by OwaPool). This mode is particularly useful
when running with Apache 2.0 using the "prefork" mode, since it avoids
the unnecessary creation of a semaphore for a pool of only 1 connection.
It's also useful when running any multithreaded Apache configuration
where the number of threads has been limited to a value less than or
equal to the desired pool size (note: do not use this flag if you
are going to set the number of threads to a value greater than 255!).
The connection pool normally uses a first-available search strategy,
minimizing the time spent in the latched state. If the OwaSession
directive is used, and the specified cookie is seen, mod_owa uses
a more complex three-pass algorithm to find a connection, preferring
first a connection with a session identifier matching the cookie
value, then an unsessioned connection or pool slot, and finally
reuse of any available connection even if marked with a different
session identifier.
Security Considerations
=======================
No particular effort has been made to make mod_owa secure, and
mod_owa cannot ensure that a user-written PL/SQL application is
secure. Developers of any web application are well advised to
become conversant in common web vulnerabilities and secure application
coding and deployment practices. The list below notes a few issues
that have occurred to me.
* Hiding .conf files
Since database passwords are present in the Location directives,
you need to make sure that not only are httpd.conf and its
dependencies secure against writes, they should be secure against
read by unprivileged users. A better practice is to use external
database authentication via the Oracle Wallet or OS.
* Diagnostic file
If you turn on diagnostics, lots of potentially sensitive information
may get written to the log file. The log file must be writable by
the user that Apache is running the worker processes as, too.
* Use of https
mod_owa works with SSL and https on Apache 2.x. I am not able to test
mod_owa with SSL on Apache 1.3, but several users have reported that
it works fine.
* Control functions
The control functions create a potential exposure based on how you
convey the password that enables them. If this is passed in the
clear in either the URL or in POST parameters, you risk exposing
the database password. Probably you should just avoid using the
control functions completely except during development; at a minimum
you should use https when issuing control requests.
* Control displays
The control displays show potentially sensitive information; another
reason to avoid them, use https, or restrict access to machines you
control (and whose IP addresses are local to your network and cannot
be spoofed).
* Control password
Because the control functions use the database password as the key
to the location, they're vulnerable to attack by sending repeated
requests with generated passwords in the URL. You should make sure
that your database password is not something that will be easily
discovered by such a brute-force technique, and/or restrict the
IP addresses that are allowed to perform control functions.
* Protecting database procedures
Through mod_owa, any URL could conceivably attempt to run any
procedure in your database account, or even procedures from the
SYS account (e.g. DBMS_ and UTL_ packages, the packages of the
OWA_ itself, etc.). If this is a concern, you need to protect
these procedures through use of the AUTHORIZE function. The
simplest technique is to specify the PACKAGE authorization mode,
and put this function into only those packages you wish to expose.
(Protect regular procedures by coding the OWA_CUSTOM.AUTHORIZE or
OWA_INIT.AUTHORIZE to disallow all other operations.) Another
technique would be to check the procedure about to be executed
in OWA_CUSTOM.AUTHORIZE or OWA_INIT.AUTHORIZE by looking at the
CGI, and allow only those procedures you desire. The OwaReject
directive can be used to exclude packages/procedures by prefix,
though this requires you to list every class of procedures to be
rejected; it is not as easy to be certain your code is secure using
this technique.
* Protecting database tables
It can be useful to separate the schema containing your application
objects (tables and procedures) from the schema against which
you will run mod_owa. This way, the user that mod_owa connects
as can be prevented from any direct access to your tables and other
database objects, except through APIs that are granted to that
user. Synonyms can be used to make the procedures runnable without
schema prefixes. An even cleaner technique would separate the
user-interface logic from any data access logic, such that the
user-interface logic could be installed in the mod_owa schema
without needing synonyms.
* Trojan horses via US7ASCII
Oracle's character-set conversion logic has a peculiar behavior when
8-bit data is passed from a client that's supposed to be using a 7-bit
character set (such as US7ASCII, the default if you don't set NLS_LANG).
The behavior is that strings are bit-stripped down to 7-bit codes. This
provides an opportunity for a false "procedure" to slip past the normal
checks against spaces, punctuation, etc. that are done by the gateway
when building a statement. The database will see the code as ASCII
characters, folded down from the 8-bit range. It might be possible to
get malicious code fragments through the gateway in this manner. I
strongly recommend always running an 8-bit or higher character set to
prevent this (in addition to checking procedure names in the AUTHORIZE
procedure, of course).
* SQL injection
Although mod_owa uses bind variables for all arguments to your
procedures, the package.procedure name itself is taken from the
input URL. Moreover, because the gateway is case-insensitive,
mod_owa isn't able to surround the name components in double quotes
to protect against malicious code. Instead, mod_owa strictly limits
the characters that can be used in the package and procedure name
portion of the URL, and truncates it at the first disallowed
character. In practice, this means that package and procedure
names are limited to the alpha-numeric characters, the underscore,
pound sign, dollar sign, and period.
* Caches
The file system cache, and, eventually, the shared memory segment,
are potential security holes. These areas are read/written by
the user that Apache runs as, and are no more secure than that
user. The shared memory segment is especially vulnerable. Either
don't use them or use them only for publically-viewable information.
Also, make sure that the file system directories used for caching
can't be written by unprivileged users, otherwise someone could
substitute fake content for real content.
* Basic authentication
This venerable web standard isn't terribly secure, and most sites that
really care about security have their own login page and session-management
logic. mod_owa doesn't particularly make the security of Basic worse,
but it doesn't improve it, either. mod_owa will pass the value of
the basic authentication header to PL/SQL in the CGI environment;
this header is base-64-encoded but is not encrypted in any way.
* OCI traffic
You may want to take steps (via Oracle's technology) to ensure the
security of your OCI network traffic.
* WebDB file upload/download table
The WebDB style of file transfer could have security issues if not handled
carefully. On download, for example, it seems as if a file staged in the
table could be visible to another user/session if that user was able
to guess the name/key to the table and type it into his/her URL. The
document download procedure will have to guard against this by checking
every request, and in addition it would seem wise to generate secure
random keys. Even better would be to avoid the table completely and
always return content via the WPG_DOCLOAD interface. On upload, there
are similar problems; the inbound file is staged to the document table
prior to calling the action procedure. mod_owa will prepend a random
value to the file name as part of the key, but there is no assurance
that this is completely unguessable. The called procedure should ideally
rename the staged file to a secure name, or move it to another table, as
part of the overall transaction associated with the request.
* Cross-site scripting
All web applications that present dynamic content are potentially
vulnerable to users who put scripts or tags into their data with
the intent of having the application play this back to a browser
and reveal sensitive information available on the browser. A simple
example is a short piece of JavaScript that extracts a session cookie
from DOM and then forms and sends an HTTP request to another web site
where the attacker can capture the cookie value. Your PL/SQL application
should guard against this by properly escaping any responses. Note that
the ability to upload files with content types, and the ability to
download them, creates the potential for users to upload HTML files
and JavaScript sources that could be sent back to the browsers of
other users. Consider disabling file upload/download, limiting its
use to administrators, and limiting files to whitelisted content types.
Scalability Considerations
==========================
Little analysis has been done regarding the scalability of Apache with
mod_owa. For what they're worth, here are some observations made by
myself and other users (many thanks to Nate Neely for helping to flush
out these issues):
* DBMS_SESSION.RESET_PACKAGE
Flushing the PL/SQL package state on every request is costly of
database resources, and it also defeats attempts to cache results
in PL/SQL for performance. If you have a recent version of the
database (8.1.7.2 or higher), you can use the OwaReset "LAZY"
mode to defer the reset work until actually needed for a particular
package. This mode also eliminates the needless flushing of cached
cursors for PL/SQL static SQL statements. An even more efficient
mode of operation is to eliminate state flushing entirely, which
you can do via the KEEPSTATE option of the OwaAlternate directive.
This requires that all PL/SQL code (including yours) be re-entrant.
Since Oracle's OWA is not re-entrant, this assumes that you have
your own re-entrant OWA implementation. The latest version of Oracle's
OWA can be reset independently using HTP.INIT, so if your PL/SQL
code is re-entrant, you can set the "INIT" mode of OwaReset to
cause flushing of only the OWA packages.
* Statement sharing
The Oracle database maintains a list of statements and attempts to
reuse them if they are re-executed later. Oracle's matching criteria
require that the statement texts be identical, and also require that
the bind variable data types match. Unfortunately, the type match
for strings is sensitive to the exact byte size specified, and for
collection bindings is also sensitive to the number of elements in the
collection. Since mod_owa builds statements dynamically, the odds of
hitting the shared cache are low, and it may fill up with near-duplicates
and lead to contention for the latch on the shared area. Consider using
the OwaRound directive to reduce the effect of this.
* Connection exhaustion on Unix
Some users of mod_owa on Unix have reported that mod_owa's connection
pooling (really, connection reusing) feature has caused exhaustion of
available connections against their data server. This could happen if,
for example, the number of Apache worker processes exceeds the number
of database connections available, and each worker has serviced an OWA
request. Unfortunately, because Apache is not multithreaded on Unix,
there isn't a good way to handle this problem. As far as I can tell
Apache worker processes are undifferentiated, so any particular request
might get handled by any of the workers; thus, attempting to limit the
number of active connections to a number lower than the number of workers
will eventually result in a set of workers that are holding connections
for reuse while other workers are unable to get a connection with which
to service an OWA request. If you must set the number of workers to a
value larger than the available database connections, you will have to
disable the connection reuse feature, by setting OwaPool to 0 in
httpd.conf. Even this doesn't guarantee safety, since it's still possible
that a number of requests for OWA service exceeding the data server's
connection limit might arrive simultaneously. Forwarding OWA-related
traffic via proxy to a dedicated Apache server with worker count
limited by available connections is another possibile solution.
* Connection thrashing on Unix
As load increases on an Apache site, Apache spins up additional worker
processes to handle it. If this load includes OWA traffic, these workers
will acquire database connections, which mod_owa will attempt to hang
on to for reuse. Once load subsides, Apache is very aggressive at
tearing down the additional workers; this process begins within seconds
and completes within a minute or so. The effect is to tear down all
the database connections for those workers, too. If the load on your
site is bursty in nature, this could be a problem. One possible solution
is to set the worker process floor (MaxSpareServers in httpd.conf) to a
high value, perhaps even to the same level as the worker limit (MaxClients
in httpd.conf).
* Consumption of semaphores on Unix
When a multithreaded Apache becomes available on Unix (Apache 2.0),
mod_owa needs a semaphore to control access to the connection pool
for each Location. Thus, if there are 10 workers each with 2 Locations,
that's a total of 20 semaphores potentially needed. OSes may limit
the total number of semaphores allowed; if you run out, mod_owa falls
back to serializing database operations by creating and destroying a
connection for each OWA request (basically, the behavior coded by the
original authors). The latest release of mod_owa uses process-bounded
pthread condition variables to eliminate the dependency on semaphores,
and to ensure cleanup on process exit; however, the underlying OS
may implement this using semaphores anyway.
* Loss of database connections
If Apache workers are not shut down cleanly, it is possible to "lose"
system resources that were being used by those processes. The chief danger
is severing connections to the database when a hard "kill -9" (or
the NT equivalent) is done on Apache. The database has its own worker
processes (called "shadow" processes) that are created whenever a client
(such as a mod_owa-based Apache) connects. Unless the shadow was doing
a network operation at the time Apache was killed (the normal state is
they're just hanging on a socket waiting for requests), it won't know that
the client has been shut down. Eventually Oracle may reclaim an inactive
shadow, though DBAs typically set this timeout to a large value (or not
at all). Eventually the OS networking code may deliver an error on the
socket, but don't hold your breath for that, either. For this reason, avoid
using kill scripts (such as RedHat's /etc/rc.d/init.d/httpd) to shut down
your server; use "apachectl stop" instead. Note that older versions of
mod_owa did not close database connections cleanly on Apache exit (it was
documented but perhaps overlooked), because I had not found a means by
which to do so reliably. The current 1.3 version will close all database
connections (except perhaps a long-running in-flight transaction) when
a process undergoes an orderly shutdown. However, thanks to changes in
termination processing for Apache 2, this no longer works reliably.
* Loss of system resources
In addition to losing database connections, it's possible to lose system
resources such as shared memory segments and semaphores due to an unclean
shutdown (and, on some platforms, despite a clean shutdown). mod_owa
uses these resources on Unix only if you use the shared memory and/or
cleanup thread options; otherwise, there's no issue. Shared memory is
particularly problematic because Apache 1.3 doesn't deliver an event
whereby mod_owa can close down the shared segment in an orderly fashion.
On Linux and HP/UX, it gets around this problem by marking the segment for
deletion immediately after attaching it, thus ensuring that the segment is
reclaimed when the last attached process exits. However, this is not
possible on Solaris, so the segment is "lost" each time the server is
started and stopped (you can reclaim it manually with the ipcs and ipcrm
utilities, but it's annoying). A semaphore is used for latching the
shared memory area, too, and this semaphore is also lost every time the
server is cycled. Semaphores can also be reclaimed manually with ipcs
and ipcrm.
* Windows
Many of the above issues aren't much of a problem on Windows, because
there Apache uses only one worker and multiple threads. This architecture
ensures that your total number of connections can be limited to the pool
size, despite a larger number of worker threads (under stress, some OWA
requests may be queued waiting for connections from the pool). This
architecture also means you won't be creating and destroying connections
as worker processes are created and destroyed. This architecture also
means that per-process system resources such as mutexes and semaphores
will be limited in number. Thus, the only real issue is to ensure that
resources and database connections are cleaned up, which they will be
if you perform an orderly shut down.
Character Set Handling
======================
A full discussion of character-set handling via HTTP and by Oracle is
beyond the scope of this document. In this section I'll try to explain
the basic theory of operation regarding how character data is handled
by mod_owa and Oracle, so that it may serve as an aid to users trying
to diagnose character-set-related problems.
Oracle character sets
---------------------
Oracle has a long list of supported character sets, each of which
is given a unique name. You'll need to consult your Oracle documentation
for more information on them. Some commonly-used character sets
include US7ASCII (standard 7-bit ASCII), WE8ISO8859P1 (which is
the ISO Latin-1 set commonly used in Western Europe and the Americas),
and AL32UTF8 (which is multi-byte Unicode). Note that these names may
or may not have mappings to the so-called IANA character sets
that HTTP uses. For example, the IANA set "iso-8859-1" is the
same as WE8ISO8859P1.
Database character sets
-----------------------
Every database has two character sets that are determined at the
time the database is created. These character sets are the
database character set and the national character set. The
database character set is the character set used by all CHAR,
VARCHAR, and VARCHAR2 columns and PL/SQL buffers. The national
character set is the character set of the NCHAR, NVARCHAR, etc.
types. For brevity, I will call these the VARCHAR set and the
NCHAR set, respectively. For the purposes of the OWA, only the
VARCHAR set is relevant, because all the PL/SQL interfaces use
VARCHAR2 data types (unless you use the OwaUnicode directive
described previously). The VARCHAR set is the character set that
all string data will be stored on disk in, and it's the character
set that all string data is stored in while in memory (for example,
in a buffer within a PL/SQL function. If you're uncertain what
character sets the database is using, you can find out by selecting
them from V$NLS_PARAMETERS with this command:
select PARAMETER||'='||VALUE from V$NLS_PARAMETERS
where PARAMETER in ('NLS_CHARACTERSET','NLS_NCHAR_CHARACTERSET');
The database character sets are, in a very real sense, "baked in".
Once you create a database with certain sets, you're stuck with them,
because (obviously) you will have row data stored in the database
("bits on disk") that are in that character set, whose meaning would
be jeapordized by after-the-fact changes to the character set used.
Client character set
--------------------
Every OCI program is, from the perspective of the database, a
client. sqlplus is a database client, and so is Apache with
mod_owa installed. While the database sets are fixed after
creation, clients can run any character set they like. The Oracle
OCI code determines the client character set by looking for the
value of NLS_LANG, which it gets either from the environment
or (on Windows) from the registry. The NLS_LANG contains a
specification for your desired language, territory, and
character set. If you do not specify an NLS_LANG, Oracle
assumes you are running US7ASCII. This can be a source of much
trouble, as we'll see shortly.
Network conversions
-------------------
The networking layers below the Oracle OCI allow clients to run
their own character sets by converting client string data to/from
the database character set. These layers perform no conversion,
and hence are most efficient, when client and server are running
the same character set. As an example, suppose that a database
was running the UTF8 character set, while a client was running
WE8ISO8859P1. The OCI will, effectively, convert from the Latin-1
to the Unicode set whenever the client transmits strings, including
SQL statements and procedure arguments, and will convert results
back from Unicode to Latin-1 for any outputs.
It is also possible, when binding an argument with the OCI, to
override the default character set with another one, effectively
marking that argument as being in a character set other than that
specified by the NLS_LANG.
mod_owa DAD character set
-------------------------
All SQL statements built and transmitted by mod_owa will be bound
with the default NLS_LANG character set. However, if you specify
an OwaCharset, and this character set is different from that of the
NLS_LANG, mod_owa will use the per-argument OCI bind mode just
described to ensure that the string in question is interpreted in
a manner consistent with the DAD chararcter set, not the NLS_LANG
set. mod_owa provides you with this capability so that you can
have multiple Locations within your Apache server that each use
and expect a different character set (the reason you might want to
do this will be clear in the next section).
Note that the OwaCharset never affects the SQL, only the input
and output arguments (the OCI does not provide a way to override
the NLS_LANG character set for SQL - in practice this isn't needed
anyway since procedure names should not contain unusal characters).
Also note that the DAD character set is always specified using the
IANA character set name. The reason for this will be clear in the
next section.
Browser character set
---------------------
Just as character set conversions may be necessary between Apache/mod_owa
and the database, it may be necessary to perform conversions between
the working character set of the middle tier and a browser. For
downloaded documents or generated pages, the HTTP protocol allows
the server to attach a character set specifier as part of the content
type, so that the browser will know what character set the document is
in. The character set specifier must be an IANA name. mod_owa will
do this automatically whenever it can, using the OwaCharset you specify,
or an OwaCharset inferred from the NLS_LANG (if you've omitted it). Note
that it's not possible for the OWA to do this in PL/SQL, because at
that stage the database doesn't know what character set the data is
eventually going to be converted to in the client (Apache/mod_owa).
In theory, the browser will make sense of the response even if it's not
in the "default" character set of the browser. For example, if you
send back a document and mark it as utf-8, the browser will commonly
render it properly, even if it contains (for example) Chinese, despite
the fact that your browser default is some other character set (like
iso-8859-1). Thus, there is a relatively reliable way to ensure that
transmitted data is interpreted correctly by the browser.
Unfortunately, there is no reliable way to be sure of the browser's
desired character set for a GET or POST request. The HTTP protocol
provides for a way to signal this to the server, but none of the
browsers has properly implemented this. Instead, they send a list
of character sets that they will accept a response in, which gives
the server no information about what character set the transmitted
data is in. This could be a serious problem for a POST request
containing form data.
To get around this last problem, mod_owa relies on the OwaCharset.
mod_owa simply assumes that all data from your browser sent to the
Location in question will be in the OwaCharset. This is a clumsy
workaround but so far no one's come up with anything better. You
can see how this might be inconvenient for a site that has one set
of users desiring to run Latin-1 while another set wants to run
(say) a Japanese set. The workaround for this is to have different
Locations, one for the Latin-1 users and one for the Japanese users.
However, since by definition the NLS_LANG can only be set to one value,
in this circumstance it is useful that mod_owa allows you, on a per
Location basis, to override the NLS_LANG with a Location-specific
OwaCharset. If not for this feature, you'd be forced to run different
Apache instances on different ports or different machines so that
each could have a different NLS_LANG setting (fortunately, they can
still all use the same database per the earlier discussion).
Non-byte-unique character sets
A character set that is ASCII-compatible is considered a byte-unique
character set. UTF-8 Unicode is byte-unique, as are certain Asian
character sets. Some character sets are non-byte-unique, such as the
Shift-JIS (SJIS) variants. These character sets use byte values from
the ASCII range as follower bytes in double-byte pairs. mod_owa
likely cannot support such character sets due to internal parsers
that scan for byte valued delimiters such as file path separators,
double-quote character delimiters, etc.
There are many ways to get into trouble with character sets, and I'm
only going to mention a few of the more common ones here.
Browser character set doesn't match DAD
---------------------------------------
This one should be obvious -- if your browser is running some
character set that doesn't match the OwaCharset, mod_owa won't
know it and will just bind the raw bytes for transmission to
the database. It might even seem to work, since when you fetch
it back out, assuming there was not a conversion required between
Apache and the database itself, it will likely look just like it
did when it went in. However, a user with a correctly-configured
browser will, on accessing that same data, most likely see the
raw bytes as being entirely different characters. Worse yet,
if Apache and the database are not running the same character set,
the OCI will attempt to convert your inbound data to the database
character set, but will assume (incorrectly) that it's coming from
the DAD character set, not your browser set.
As an example, suppose a user is running an Eastern European
browser character set, and uploads data via a form to an Apache
Location that is set to iso-8859-1. Suppose further that the
actual database is UTF8 Unicode. mod_owa will take the data
from the browser, bind it as Latin-1, and it will get converted
incorrectly to Unicode. The reverse conversion may or may not
result in the same raw binary values; if it does not, then the
browser could see strange values reflected back.
NLS_LANG not set correctly
--------------------------
This one is subtle, because if you don't set NLS_LANG, Oracle
behaves as if you had set it to AMERICAN_AMERICA.US7ASCII by
default. Suppose that you start up Apache without setting
NLS_LANG, and neglect to specify an explicit OwaCharset. Then
you attempt to access the Location from a browser running
iso-8859-1. Because you did not specify an OwaCharset, mod_owa
will use the NLS_LANG character set as the OwaCharset, in this
case US7ASCII. Now you are in the same situation as the previous
example where the browser and DAD sets are mismatched!
Please be aware that despite some similarities, the OwaNLS
directive cannot be used to avoid setting NLS_LANG. Oracle packs
a lot of information into NLS_LANG, including two attributes
(language and territory) that can be set at run-time on a per-session
basis, and one (the character set) that is global to the entire
process and cannot be changed once the program starts. The OwaNLS
directive allows you to tell mod_owa to set the language and territory,
but it does not direct mod_owa to establish the character set,
because this is not possible. The only reason OwaNLS allows
you to specify the character set at all is because in older
versions of the OCI (8.0.x), there is no API by which mod_owa can
determine the NLS_LANG character set, and so you must use the
OwaNLS directive to inform mod_owa of your character set choice
(but not to set it).
Incorrect database character set
--------------------------------
In theory, the OCI will ensure that all data passing to/from
the database is converted to the database character set. However,
what if the database character set isn't the one you think it is?
In the past, many Oracle customers have mistakenly created their
databases and forgotten to specify the character set that they
really want. They then use the database, via sqlplus, as if it
were in the character set that they want, but with an incorrect
NLS_LANG that happens to match the incorrect database character
set. This is a recipe for real trouble.
A classic example of this is to mistakenly create the database
using the US7ASCII character set, when in fact the user intends
to store 8-bit data. This was, unfortunately, the Oracle default
(until recently when it was changed to WE8ISO8859P1) if you failed
to specify a character set at creation time. Users would then
compound this error by neglecting to set NLS_LANG, and would run
sqlplus to create stored procedures with accented characters or
populate table data with accented data. You might think this would
fail, but remember that the OCI will skip any character-set conversion
in the particular case where the client and server character sets
match. Because, in this example, both appear to be running US7ASCII,
the invalid 8-bit data is allowed in. It even seems to work, as
long as you consistently keep forgetting to properly set the
NLS_LANG. The trouble comes later, when a program like mod_owa
is used, and the client character set is properly specified. For
example, again assuming you neglect to set NLS_LANG, but you do
specify the DAD character set to make sure some set of your users
get Latin-1, you will get very strange results as the database
attempts to "convert" the 8-bit data that it thinks should be
US7ASCII into the WE8ISO8859P1 character set.
Very similar problems can arise if you use any character set on both
client and server but enter data as if it were in another character
set. For example, suppose the database were created with Latin-1
and the client's NLS_LANG was likewise set to Latin-1, but the
user was (by whatever means) entering Chinese in the Unicode character
set. This might appear to work for that user, but in reality the
database is storing away the raw bytes of the Unicode strings as if
they were just odd strings of European characters!
Unescaped URLs
--------------
Programmers generating dynamic pages sometimes forget that URLs cannot
contain non-ASCII characters, and even some ASCII characters, unless
they are escaped. For example, the Latin-1 character y-diaeresis,
which is byte value 255, should not appear directly in a URL, but
should be escaped as %FF. Many ASCII characters require similar
escaping, and in fact the exact escaping needed depends on which
portion of the URL is being escaped. For example, spaces are not
valid in the URL, but should be escaped as %20. If a URL is going
to contain arguments, the arguments may need different escaping.
Some browsers will nevertheless attempt to use an invalid URL, and
will simply escape the URL prior to transmitting it. IE will do this,
for example, though it's common for IE to do this using the utf-8
character set (the browser has to make some assumption about what
character set the invalid byte values are in). If the code that
generated the URL used Latin-1, obviously there's going to be trouble
as the byte codes are misinterpreted.
Variants of this problem sometimes arise when JavaScript code attempts
to assign default values to form fields or build URLs based on
strings that were not escaped but that appear in the code, because
again the browser must make an assumption about the character set
of the JavaScript code itself, and this is not necessarily the same
assumption it makes about the page or form field values.
Lossy conversions
-----------------
Even when character sets are properly configured all along the path
from browser to mid-tier to database, it's still the case that when
converting between character sets, some loss of fidelity may result.
For example, no amount of configuration correctness will help the
Russian user that uploads Cyrillic data into a database that's
running WE8ISO8859P1. In some cases, the losses may be subtle and
go unnoticed for a while. This would be most likely to occur
between different Asian character sets, which have many but not all
of the same character code points.
Two modes of lossy conversion are worth special mention because they
very commonly result from misconfigurations where the NLS_LANG is not
specified, and hence US7ASCII is used as the default. If you use
US7ASCII as the client character set, and read out from the database
accented data in a character set such as WE8ISO8859P1, Oracle will
do its best to down-convert the data as it's sent through OCI. The
effect of this is that accents appear to be stripped off as (for example)
a y-diaeresis is converted to a plain y, etc. Going the other direction,
if you bind 8-bit data through to the OCI, then because these byte values
do not represent valid US7ASCII characters, the OCI doesn't know what to
do with them, and bit-strips them down to 7 bits, thus the a-diaeresis
becomes the letter d, the e-diaeresis becomes a k, the o-diaeresis a v,
etc.
Another form of lossy conversion can occur as a result of string
expansion or contraction. For example, an accented German word
takes more bytes to represent in multibyte UTF8 Unicode than it
does in Latin-1. If a word of maximum size (versus a hypothetical
database column) were entered into a form field, it could appear
to be of valid length and yet still overflow the width of either
a PL/SQL buffer or a database column all of which are specified
in bytes, not characters. The symptoms of such an error could
range from database overflow errors to application-level
truncations. The amount of exposure to this problem depends
entirely on the character sets and languages involved.
There are some additional considerations for Oracle 9i and beyond.
NCHAR and character-semantic VARCHARs
-------------------------------------
In Oracle 9i, there will be several significant changes to the way
string data is handled by the database. One is that the NCHAR
set will now always be Unicode. Thus, a programmer writing code
that he or she desires to run with Unicode semantics regardless of
customer deployment can now do so reliably. Another is that
the database will offer the ability to specify column widths
and PL/SQL buffer sizes using character storage units (of course
specification in bytes will always be supported, too, for backward
compatibility).
RAW transfer modes
------------------
Oracle is considering adding support for raw transfers to the PL/SQL
gateway (see the previous descriptions of the OwaUnicode RAW binding
mode and the OwaAlternate GETRAW setting). If this is done, it will
allow you to receive arguments to your procedures in raw form, giving
you control over the character set conversion. It will also provide
you a means to return content in any desired character set, regardless
of the database or mod_owa character set, by allowing results to be
transferred back in raw form. (As a side-effect, you will also
finally be able to generate raw content types, for example content of
mime type image/gif, etc.) If this support is added, it will mean
that in any situation where the target character set doesn't match
the database character set, the HTP functions will convert the content
to the target character set and return it to mod_owa as raw data.
New AL32UTF8 character set
--------------------------
Oracle's UTF8 character set was created prior to the adoption of the
extensions that take Unicode beyond its 16-bit origins. As a consequence,
the Oracle character set known as UTF8 is incapable of handling the
new 4-byte sequences used to encode the extended characters (those
beyond U+FFFF). Compatibility considerations required Oracle to
support the official Unicode standard with a new character set; in
Oracle 9i, this is the character set known as AL32UTF8. If you have
a 9i client environment in which to run Apache, you can use this
character set against 9i databases. It's settable globally via
NLS_LANG, and mod_owa will be able to make use of it if set in this
manner.
One caveat, however, is that it is not possible to set mod_owa to
use this character set at the level of a specific Location (via the
OwaCharset directive). OwaCharset is specified in terms of IANA
names, and, again for compatibility reasons, mod_owa is forced to
map the IANA "utf-8" to UTF8 (otherwise, mod_owa might not work
in 8.x environments). In practice this isn't really a problem;
just set AL32UTF8 as the NLS_LANG character set, and use OwaCharset
for all Locations that do not use AL32UTF8.
Limits
======
Please be aware of the following size limits of mod_owa:
maximum argument size
---------------------
A scalar argument can be up to 32512 bytes in size (an OCI maximum
close too, but not quite as large as, the PL/SQL maximum of 32767).
However, when present in an array, older versions of OCI may limit
the size of an individual argument to 4000 bytes. Also, note that
this can result in a very large memory structure (4000 * array_size)
being required. It's best to limit array arguments to 256 bytes
each if the number of elements in the array is going to be large.
maximum number of arguments
---------------------------
When the flexible argument mode is used for file operations, all
arguments are passed in a single array, and there is a limit of
256 total elements. Since mod_owa adds some arguments of its own,
it's best to limit the number of arguments to 250 or less.
maximum return SQL statement length
-----------------------------------
32512 bytes, the PL/SQL maximum. Character-set issues might cause
expansion of strings when transferred to mod_owa, so either stay
well below this limit or ensure that your SQL is mostly ASCII.
maximum return SQL bind length
------------------------------
The buffer is 4000 bytes in size. However, due to character-set
conversion issues, it's best to limit returned bind values to 2000
characters unless you're sure the characters are ASCII.
maximum return mime type length
-------------------------------
256 characters.
maximum upload/download size
----------------------------
2 gigabytes per file for downloads, 2 gigabytes total (across all
files being transferred) for uploads.
maximum file size for cache
---------------------------
2 gigabytes per file.
maximum connection pool size
----------------------------
255 connections per Location on NT, 1 per Location per worker
process on Unix.
System Internals
================
This section is mainly for programmers and diagnosticians who need to
know about the process architecture of Apache, and how mod_owa uses
mutexes/semaphores to implement the connection pool, the cleanup
thread, and the shared memory segment.
Apache Process Architecture
Windows
-------
On Windows, Apache seems to run two processes, a manager and a worker.
The manager spawns the worker and then seems to monitor its health
thereafter. The worker spawns a number of threads and handles web
requests in parallel up to the number of threads operating.
The directives are therefore processed twice, once by the manager
and then again by the worker. The module initializer is also called
twice. Only the so-called child initializer is called solely in the
worker process; it's run once before anything else happens.
+----------------------------------------------------------------+
| There seems to be one other process involved in the early |
| stages but I've not managed to track that one down yet - it |
| doesn't seem important, though. |
+----------------------------------------------------------------+
Unix
----
On Unix, Apache also seems to have two initial processes, but it
then uses the second process as a "template" to create worker
processes by fork()ing them.
The directives are processed twice, and the module initializer is
also called twice. Their effects are inherited by the workers
that are cloned from the template process. After each worker is
spawned, its child initializer is run before anything else happens.
Apache 2.0 and 2.2
------------------
With Apache 2, on both Unix and NT, Apache will allow you to run in
any of several modes, including prefork (equivalent to the old Unix
architecture of single-threaded worker-processes) and perchild
(a fixed number of worker processes, each of which has a variable
number of threads).
Critical Events
Location Creation
-----------------
mod_owa simply allocates the structure and initializes it. There is
one "blank" location that gets created by Apache when the module is
loaded -- this location seems to be the place where global directives
(outside the scope of any Location) are sent. The effects of this
are inherited by all worker processes, so code here should do
nothing more than allocate memory. On NT, mod_owa creates a mutex
for the Location's connection pool here; if it did not, then at a
later point when attempting to do the first-time initialization,
it might be possible for two threads to attempt the first-time
initialization simultaneously. On Unix, creating the connection-pool
latch is generally not needed, since the workers are normally
single-threaded. However, if a cleanup thread is going to be run,
a connection-pool latch will be needed, and unfortunately this is a
bad place to create that latch, since then all workers would be forced
to share one underlying mutex after being fork()ed, even though
there's no possibility of memory collisions between them because
they each get a separate copy of the memory involved. So on Unix,
any connection-pool latch needed will get created later.
Module Configuration
--------------------
Again, mod_owa simply allocates the structure and initializes it.
There is no guarantee that this event will be delivered prior to
or after the Locations are processed. Also, there is no way to
obtain a handle to any locations, nor (in the Location creation
event) is there any way to obtain this handle. mod_owa needs to
keep a linked list of Locations chained off of this structure to
support cleanup processing; since the timing is uncertain and the
pointers are unavailable, mod_owa does this elsewhere.
Module Initialization
---------------------
This event occurs once for all workers, and occurs after all the
directives have been processed. As a consequence, this is really the
only place mod_owa can create the shared memory segment (if requested)
such that all workers will inherit the handles (you can't do it during
module configuration because the directives may not have been processed
yet, so you may not know that you need to do anything). However,
because this event occurs before the workers are fork()ed, it's the
wrong place (on Unix, that is) to do something like spawn the cleanup
thread.
Child Initialization
--------------------
This event occurs once for each worker (that means on NT it occurs
just once). This is the correct place to spawn a cleanup thread if
one is needed. It's also the place to create the latch that controls
access to the configuration structure's linked list of locations (so
that the cleanup thread can gain control of the list while it runs
through the locations). Since the cleanup thread needs the latch,
the latch must be created first.
First Request for Location
--------------------------
For each worker process, the first request that accesses a location
provides mod_owa with this implied event. mod_owa uses this event to
add the location in question to the linked list. On NT, this operation
is always latched through the connection-pool latch for the location,
ensuring that only one thread will ever run this event, and that
the location is only added to the linked list once. On Unix, this
event is used to create the connection-pool latch, since we couldn't
create it earlier. The Unix version creates it only if there's a
cleanup thread running (otherwise there's no need for the latch). A
deadlock is avoided here because the latch is created before
adding the location to the linked list, ensuring that the latch exists
before the cleanup thread can possibly be aware of the location. On
both Unix and NT, the actual linked-list portion of this operation
is also latched through the configuration structure latch, which always
exists on NT, and exists on Unix if and only if a cleanup thread is
running.
Cleanup Thread Loop
-------------------
When the cleanup thread awakens, it takes the configuration structure
latch so that it can loop through the linked list of locations.
During this short time no new locations can be added to the list.
The code holds the latch only for the time required to get the
pointer to the first location on the list. Since new locations are
always added to the head of the list, the latch can safely be
released and new locations can be added (if necessary) while the
cleanup thread runs. For each location, the cleanup thread takes
control of the connection pool latch for that location, runs through
the connections to close aged ones, then releases the latch and moves
on to the next location. When it's finished, it goes back to sleep.
Child Process Exit
------------------
This happens just once on NT, once per worker on Unix. At this
time, if there is a cleanup thread, the global interval (on the
configuration context) is set to zero so that if it awakens during
this operation, it will immediately exit. An attempt is made to
obtain the configuration structure latch, which if successful then
allows this code to loop through all the locations and close their
connection pools cleanly. After this, the cleanup thread is killed
explicitly to make sure it's gone. Note: this doesn't run
reliably in Apache 2.0.
Server Shutdown
---------------
This event is only available in Apache 2.0. It occurs once during
the shutdown of the Apache server, and provides mod_owa an opportunity
to clean up global resources such as shared memory. Since this event
is not available in Apache 1.3, these resources (if used) are lost when
the server shuts down, and must be reclaimed manually.
Latches
Shared Memory Latch
-------------------
This latch is created once and shared by all worker process that
inherit from the creating process. It is used to secure operations
against shared memory. For connection-pool statistics, each
unique combination of a process and a location will self-assign
a slot within the relevant region of memory. Since this slot
is unique, the latch is used only during the time that the
assignment is done; thereafter, the process is free to write to
its assigned slots any time. Because this latch spans process
boundaries, the implementation uses a semaphore.
Configuration Latch
-------------------
This latch controls access to the module configuration structure.
The primary use is to secure access to the linked list when
adding a location that is being used for the first time. After
that, only the cleanup thread should be hitting this latch
(which means that normally, there will never be contention for it).
This latch is always created on NT, but only created for Unix
processes if a cleanup thread is to be run (on Apache 2.0, this
latch is always created, since threading is possible on any platform).
This latch is implemented using a mutex.
Location Latch
--------------
This is a location-level latch controlling access to the connection
pool for a mod_owa location. It's always created on NT; on Unix,
it's created by each worker process during the first-request processing
event, and then only if a cleanup thread is running (on Apache
2.0 this latch will be created as soon as the Location is used).
This latch is implemented using a mutex.
Connection Pool Semaphore
-------------------------
This is a location-level gate controlling resource usage from the
connection pool. It's currently used only on NT, since on Unix
there can be at most one connection in the pool (it's used on Unix
for Apache 2.0). The Unix implementation uses a pthread mutex and
condition variable, while the NT version uses a semaphore.
Remaining Work
==============
Here are a few notes on outstanding bugs and incompletely-implemented
features:
Intermittent browser hangs against Unix server
----------------------------------------------
I've traced this problem to an odd interaction between the Apache
keepalive processing logic and the handling of Oracle shadow
processes when direct database connections are being used. Details
are at modowabug.htm. For now, the problem is worked around by
marking the Apache socket connection with the close-on-exec flag,
so that it's not carried over into any Oracle shadow process that
might be spawned. This is a good work-around but unfortunately
it won't work under Apache 2.0 (revisit this issue).
Shadow-process crashes related to use of RESET_PACKAGE
------------------------------------------------------
This is a flat-out bug in PL/SQL, but it causes the statement used
by mod_owa to pass the Basic authentication information into OWA
to crash the shadow process with a 3113 error. The only robust
work-around requires modifying the OWA PL/SQL package to include
APIs for setting the username/password; the crash seems related to
the fact that, lacking such APIs, mod_owa must directly assign their
values using an anonymous block. Oracle's official OWA will include
these APIs in a future release.
Worker process grows large over time
------------------------------------
Hard to pin down because, except for the OCI calls, mod_owa uses the
request pool for all allocations and assumes Apache will clean things
up. I do not believe this is an actual memory leak, though it's
possible OCI-related memory is not being cleaned up. It's also
possible that, because mod_owa needs some relatively large blocks,
poor implementations of the storage allocator could result in
fragmentation-related heap growth. On Unix, where Apache is
single-threaded, mod_owa attempts to reuse a static buffer to
reduce/eliminate this.
Mime types for file uploads/downloads
-------------------------------------
Right now, mod_owa relys on the upload action procedure to do whatever is
necessary to compute a mime type for files being stored. On download, it
assumes that the read procedure will supply the mime type, and if it
doesn't get one has a hard-coded set of file extensions that it checks
to infer the mime type. It would be much better if Apache provided an
interface to derive mime types from file extensions, but I can't find one.
Character set issues for LOBs/LONGs
-----------------------------------
As described above, there are some problems with character versus byte
semantics that aren't fully sorted out. Mainly this is because I lack
any support for determining character boundaries in byte-streams when
using multi-byte character sets, except for the UTF8 Unicode set for
which there is a simple algorithm (included in mod_owa but untested).
No way to know browser character set
------------------------------------
The whole DAD character set system is a hack; it would be much much
better if there was some way to know what character set a browser was
transmitting content in, and conversely what character set it would like
the response in.
Support for non-byte-unique character sets
------------------------------------------
mod_owa's support for multi-byte character sets is limited in a few
areas. One of these is in the procedure name portion of the URL,
where the string is assumed to be in a byte-unique ASCII-based
character set. A byte-unique set is one where multi-byte characters
are always represented by sequences of bytes that have the 8th bit
set. Non-byte-unique sets may set the 8th bit only for the leading
byte, while the following byte may resemble an ASCII character. When
mod_owa scans the string to parse it for a ".", or to eliminate
invalid characters such as spaces, it does so using simple string
functions that are not aware of multi-byte character boundaries.
Many multi-byte sets are byte-unique, including UTF8. A major
non-byte-unique set is Shift-JIS, a Japanese set common on windows.
Note that you can still use your non-byte-unique set for arguments
via the OwaCharset, the only restriction is on procedure names,
which are interpreted using the NLS_LANG character set.
Bugs in OCI8
------------
A few minor bugs in OCI8 required rather awkward work-arounds. One is
that for array-bound variables, the string buffers must be at least two
bytes wide, even if the array consists entirely of one-byte "empty"
strings (e.g. the unused "reserved" argument to the flexible parameter
mode). Another is that LOB data transfers must be closed with an
OCI_LAST_PIECE flag, yet during a streaming operation mod_owa doesn't
necessarily know that a block being transferred is the last block it
will see. Unfortunately the OCI won't let you write a 0-length block,
so when this situation occurs (only for large files, and then only about
1 in a 1000 times) mod_owa has to write an extra byte to the LOB and then
truncate the LOB back to the correct size.
End of content upload
---------------------
mod_owa uses ap_bread() to read content from the incoming stream.
Unfortunately ap_bread() will return with content whenever it's
available, rather than returning only when it's satisfied the requested
number of bytes or seen the end of the incoming stream. This means that
mod_owa is forced to assume that the CONTENT_LENGTH is correct and continue
calling ap_bread() until either that number of bytes has been received,
or until an error (-1) is returned. Another related problem is
interpretation of 0-length return values. In theory this could occur
if mod_owa is reading faster than data is available on the pipe (likely),
and is therefore not an error. However, some users have reported that
they get 0-length returns when a user aborts an in-progress POST
operation from his/her browser. mod_owa is therefore forced to treat
the 0-length return as if it were an error, to avoid an infinite loop.
Initializing OCI might fail after startup
-----------------------------------------
The only way for a module to stop the Apache startup is to exit from the
module startup procedure. Unfortunately this procedure is only called
once, whereas the OCI initialization must be done for each child process
(on Unix). mod_owa must therefore perform the OCI initialization from the
child startup procedure, where, if a failure occurs, there is no
opportunity to abort startup (instead, an error message is silently
written to the error log). Also, because the child startup is called
multiple times on NT from within the same process, a kludgey flag is used
to ensure that the OCI is initialized exactly once per worker process.
Open-source version of PL/SQL side
----------------------------------
The PL/SQL side of the gateway is not open-source. It would be best
if an equivalent were built and bundled with mod_owa. The code in
the sample file modowa.sql has a bare-minimum implementation, but
a tremendous amount of functionality is still missing.
IIS and Netscape versions
-------------------------
The code is architected so that all the Apache-specific linkage is
encapsulated (in modowa.c). It should be possible to implement a
replacement linkage for the Microsoft or Netscape web servers.
Morten Braten has independently implemented a PL/SQL gateway for
IIS in C#. It's called the Thoth Gateway:
https://github.com/mortenbra/thoth-gateway
Shared memory cache
-------------------
The shared memory segment was added to support a memory-based cache to
augment the file-system cache. The feature isn't finished yet, in fact
I don't even really have a design.
Cleanup shared resources
------------------------
As noted, Apache doesn't give me an event where the shared memory segment
and shared semaphore can be reclaimed. This is a problem only on Apache
1.3 on Unix. Another problem is that a superfluous shared resource is
created by the Apache starter process, because the module initialization
event doesn't allow me to distinguish the start of the real worker
template process from the start of the manager.
OWS-compatible authentication
-----------------------------
There are some nuances of OWS that aren't mimiced well by mod_owa with
respect to basic authentication. There is a question about how to convey
the values for username and password in HTTP_AUTHORIZATION. Should they
be base-64 decoded? Should they be assigned to OWA.USER_ID and OWA.PASSWORD
as part of the anonymous block? Should the Apache check_auth and/or
check_user_id callback functions be added? Should the setting of the
OWA_SEC mode be sensed in some way? Should the false return of an
authorization check be examined by mod_owa? Currently, handling any/all
of these issues are left to the client PL/SQL code for maximum
flexibility. The interfaces in OWA_SEC won't work without some code
that you'd have to have in your PL/SQL.
URL unescaping issues
---------------------
The character set used to escape arguments in URLs is difficult to
determine accurately. The standard calls for iso-8859-1, though browsers
will often use utf-8 instead. Under some circumstances, including
situations where a GET method is used for a form submission, the browser
may use the page character set, or the user-selected default character
set. URLs constructed by JavaScript code may use utf-8. Finally,
Microsoft IE encodes some values as UCS2 with %uXXXX, which it's not
clear Apache's native unescape functions will properly handle. It's a
mess. For now, mod_owa assumes that the URL parameters are in the same
character set as post arguments, which will generally be true for
submission operations. For other links on a page, the application
should probably pre-escape them on the server when generating them.
The JavaScript issue will remain a problem, though - applications that
need to do that should probably just run exclusively in utf-8.
Administration interfaces
-------------------------
There should be some built-in code to generate pages that support
administration of the httpd.conf files remotely via a web browser.
DAV services
------------
It would be very useful if the code were extended to provide hooks for
supporting some level of the Web-DAV standard. This would require
picking up a dependency on an XML parser, though I believe I could
use Oracle's which is public and should be available in the same
library that supplies OCI.
Also, the OPTIONS verb should probably be supported as a REST method.
Stateful sessions
-----------------
Some users have requested that mod_owa support stateful applications,
wherein database connections would be bound to particular user
sessions rather than pools. One reason I didn't implement this
originally was that Apache's architecture on Unix is multi-process.
In Apache 2.0, the architecture flexibly supports multi-process/
multi-threaded deployments, even on Windows. Database sessions
cannot be shared across process boundaries, and Apache provides
no mechanism for ensuring that any particular request from a client
will get routed to the same worker process. Still, in the limited
case where a deployment was single-process, multi-threaded (all 1.3
Windows deployments, and potentially any 2.0 deployment that was
configured that way), it could probably be supported, likely by
adopting a model similar to that of Java servlet sessioning
(cookie-based).
OAS-compliant document operations
---------------------------------
Partial support for WebDB-style document uploads and downloads
has been added. Only the BLOB mode is supported, which may be
sufficient to run Apex (HTML-DB). These operations run to and
from a documents table that can be specified with OwaDocTable.
Alias Procedure
---------------
mod_owa doesn't implement this, though the combination of either OwaDocGen or
OwaDocPath along with OwaDocProc or OwaDocGen provides similar functionality.