JavaScript User Defined Functions (UDFs)

Introduction

One way to extend the database with user-provided functionality is through user defined functions (UDF for short). These functions can occur at different places in the statement execution. Let us consider a SELECT query as an example. Assume the query has the form SELECT <selector> FROM <table> WHERE <filter>. A UDF can occur either in the selector or in the filter expressions. If a UDF occurs in a selector, it will be called for every row in the result of the query. A UDF in a filter, however, might also be called on rows that will not be in the result. In the remainder of this document we show how to write UDFs in JavaScript that can be used in either part.

A Simple Example

Our running example will be the use of a validation and email normalization system. Luckily we only have to write minimal code for this since there is already an npm package called validator that can perform these tasks for us.

Dependency Setup

We first install the dependency validator from npm in the current folder. To not destroy any existing JavaScript projects we recommend using an empty folder to start from. Then use

$ npm install validator

/home/oracle/udf
└── validator@8.1.0 

npm WARN enoent ENOENT: no such file or directory, open
'/home/oracle/udf/package.json'
npm WARN udf No description
npm WARN udf No repository field.
npm WARN udf No README data
npm WARN udf No license field.

to download and install the validator module (this requires network access). Note that we are not building a full-fledged npm module here, which would require us to run npm init before and then save the dependency in package.json. Our deployment tool does not require this; the presence of the module at the time of deployment is sufficient. The warnings about missing package.json file can therefore safely be ignored.

JavaScript Code

Now we write the code for our UDF. We start with an empty file called simple.js. This file will be the basis for the JavaScript module we will later deploy in the database.

We start with requiring the validator module from npm. To do this, we add the line

var validator = require('validator');

to simple.js. This line already gives us almost all of our data processing logic we need.

We only have to slightly adapt the validator API to return consistent types. This is not enforced by JavaScript but required by the statically typed Oracle Database SQL. The normalizeEmail function, for example, returns either a normalized email address or false:

> require('validator').normalizeEmail('me@oracle.com');
me@oracle.com
> require('validator').normalizeEmail('Not an email address');
false

There are some automatic conversions, but those would transform the boolean value false into the VARCHAR2 FALSE which is not a valid email address. So we add a wrapper function that implements our desired semantics: if validator.normalizeEmail returns false, the wrapper function returns the JavaScript null value. In the database, null maps to the SQL NULL indicating the absence of a value. To do this, we append to simple.js the function:

function normalizeEmail(input) {
  var normalized = validator.normalizeEmail(input);
  return normalized ? normalized : null;
}

Note how we reuse the previously imported validator npm module. All work is done by the normalizeEmail function from the validator package. Our wrapper only sanitizes the output.

Now that we have a wrapper function for normalizeEmail implemented in simple.js, we need to export this function and the validator function isEmail such that Oracle Database MLE can pick them up. We don't need a wrapper for the isEmail function since it already has the correct semantics. Also, for all but the first argument, it has sensible defaults that we can reuse. In general, Oracle Database MLE accepts valid JavaScript files that may include an arbitrary number of function definitions and statements. However, only functions that are explicitly exported will be picked up by Oracle Database MLE and can be used in SQL statements.

To export the functions, we simply add the following assignments to the end of simple.js:

module.exports.normalizeEmail = normalizeEmail;
module.exports.isEmail = validator.isEmail;

Note that both, module and exports are not defined in the source file. This is not a problem since the loader used to load the JavaScript code into Oracle Database MLE will define them. Also note that this is just one way (aka the CommonJS way) to export functions in Oracle Database MLE. Oracle Database MLE supports all forms of JavaScript module specifications that can be loaded by a UMD-compatible module loader can be used. Due to its simplicity we recommend to use the CommonJS way of exporting modules.

With these modifications we conclude our work on simple.js. Now we have coded the functionality, but we still need to tell the deployment tool which functions to export and what call specifications to create.

Call Specifications using TypeScript

To declare the call specifications, we create the TypeScript declaration file simple.d.ts. In this file, we add TypeScript signatures for all exported functions:

export function normalizeEmail(input : string) : string;
export function isEmail(input : string) : number;

The first line exports the normalizeEmail function. This function takes a character string (TypeScript type string) as input and produces a character string. Note that the name of the input parameter (input in our example) does not matter. We could as well have chosen any other parameter name (e.g. foo) as long as the file is a valid TypeScript declaration file.

The second line exports the isEmail function. This function takes a character string as input and produces a number. We choose the type number as result since Oracle Database MLE does not support the boolean type of JavaScript and TypeScript natively and would apply some implicit conversions if we specified it. So to make things clearer, we specified number and can look at the implicit conversions applied by Oracle Database MLE. Thus, if an email is valid, we get the value 1, and 0 otherwise.

Deployment

With the JavaScript source and the call specifications in the same folder we installed the validator package in, we have everything we need to deploy our JavaScript extension into the database. To do this, we need to know the database connection string used to connect to the database and the schema that should own the extension. Assuming we have a database with global name DBML running on the current machine and listening on port 1521, we use

$ dbjs deploy simple.js -u scott -p tiger -c localhost:1521/DBML

to deploy our script into the scott schema.

Creating Test Data

To call the UDFs we first need some data. We will use SQL*Plus to connect to the database and create this test data. To connect with SQLPlus, start a command line and type

$ sqlplus scott/tiger

to connect to the database with the scott schema.

You should see a command prompt starting with SQL>. Here, we can issue commands to interact with the database. We now create a table, mails, with a single column (email) that contains text data. Some of the column values should be valid email addresses, others not.

SQL> CREATE TABLE mails (email VARCHAR2(100));
SQL> INSERT INTO mails VALUES ('some.body@googlemail.com');
SQL> INSERT INTO mails VALUES ('not really an email');
SQL> INSERT INTO mails VALUES ('somebody@gmail.com');
SQL> INSERT INTO mails VALUES ('.@googlemail.com');

Executing a Filter

Now that we have our test data we can execute our first JavaScript UDF. The goal is to filter the table to remove every row that does not represent an email address. To do this, connect again to the database via SQL*Plus. Then issue

SQL> SELECT email FROM mails WHERE simple.isEmail(email) = 1;

In this query, we use the isEmail function we deployed with dbjs previously as a predicate in the WHERE clause such that the query returns only rows that contain valid email addresses. Recall that we cannot directly use this function as a filter, but still have to compare the result against a value. In our case, we rely on the implicit conversion from the boolean value from isEmail to a number (see conversion documentation for details of type conversion). This will give us the value 1 for every input that returns true.

In this example, the isEmail function is called for every row of the table. Note, however, that the values might appear in no particular order and the UDF should not rely on any specific order. Furthermore, the UDF should be purely functional and not keep any state between consecutive calls.

Executing a Selector

Now that we have seen how to execute a JavaScript UDF in a filter, let us employ a JavaScript UDF in the SELECT clause of a query. We again connect to the database using SQL*Plus and execute the following query:

SQL> SELECT simple.normalizeEmail(email) FROM mails;

This query calls the normalizeEmail function we implemented before. Since we don't specify a filter, the function will be called for every row in the table. Note that we will not receive a normalized email address for every row in table mails. Function normalizeEmail will return NULL for all entries that are not valid email addresses to indicate a missing value. Alternatively, we could again use the isEmail function in the WHERE clause to restrict the result to valid email addresses that can be normalized.

With this function we could generate a view containing only valid normalized email addresses. This task is left to the reader as an exercise.

NULL-awareness

What happens if we pass a SQL NULL to a UDF defined in JavaScript? In this case, the input value that a JavaScript UDF receives as an argument will be the JavaScript null value. In a JavaScript UDF, this gives us the ability to deal with values that are missing in the database.

The functions of the validator package that we use are not prepared to deal with null input values. So we will have to update our code. In JavaScript, we can take advantage of the fact that null evaluates to false. So if we want to give a default value, we can simply use a disjunction with a default value. Now, for our example, we could choose as default value any string that does not represent a valid email address. We choose the empty string and change our code to:

var validator = require('validator');

function normalizeEmail(input) {
  var normalized = validator.normalizeEmail(input || '');
  return normalized ? normalized : null;
}

function isEmail(input) {
  return validator.isEmail(input || '');
}

module.exports.normalizeEmail = normalizeEmail;
module.exports.isEmail = isEmail;

Note that we introduced a wrapper for the isEmail function since it now does not only pass the input value to validator.isEmail.

We can now deploy the modified script to the database as before:

$ dbjs deploy simple.js -u scott -p tiger -c localhost:1521/DBML

Note that we don't have to change the TypeScript declaration file since the call specifications stay the same. After deploying the modified script, we can connect to the database with SQL*Plus. To test our modifications, we should add a NULL value to our table:

SQL> INSERT INTO mails VALUES (NULL);

Now, the queries mentioned above will work just fine, even with the NULL value.

Dealing with NULL values is of course application dependent and whether to use a default value like done above, or branching on the input value is up to the developer. NULL values are a valid part of many relational database schemas. You should write your UDFs such that they can deal with NULL values without failure.

Simplification: TypeScript

As with stored procedures, we can simplify development and include both our code and the declarations of the call specifications in one file using TypeScript. We can even reuse existing typescript code as long as the types used in the declarations are usable by Oracle Database MLE. Next, we repeat our email validation example that we have used so far - this time, however, using TypeScript instead of JavaScript.

We start by installing the TypeScript version of the validator package:

$ npm install @types/validator

Note that the declaration for the validator.normalizeEmail function in the @types/validator TypeScript package specifies the return type as string |false. We have already discussed this in our pure JavaScript example: validator.normalizeEmail returns false for strings that cannot be normalized. Again, we add a simple wrapper function that returns either a string containing a normalized email address or null:

import * as validator from "validator";

namespace myExports {
  export function isEmail(input : string) : number {
    return validator.isEmail(input || '') ? 1 : 0;
  }
  export function normalizeEmail(input : string) : string {
    var normalized = validator.normalizeEmail(input || '');
    return normalized ? normalized : null;
  }
}
export = myExports;

This script should be saved as tssimple.ts. It first imports the validator TypeScript package. Next, it creates a namespace to bundle our exported functions. These functions are then implemented like before. The difference here is that we already provide the types along with the implementation. Finally we export the namespace.

To deploy it into the scott schema using the same connection specification used in the JavaScript example above, we use

$ dbjs deploy tssimple.ts -u scott -p tiger -c localhost:1521/DBML

Note that we do not need an additional TypeScript declaration file since the call specifications are derived from the types that are already declared in the sources.

A Note on Callouts in UDFs

Since a UDF is executed in the context of a transaction, callouts that use DML statements cannot be used directly. Instead, autonomous transactions have to be used. See the callout documentation for details.