(Copyright Philip Quaife 1999)
DRAFT Version 1.2
Lack of a common access method to disparate SQL databases is a stumbling block to the availability of generic
database applications in the net community. As each application developer has coded specifically for a target database,
they have almost uniformly created a set of procedures to manage the underlying database structure to free the coding
task of repetitive low level database calls. This sort of interface library is the basis of an abstraction layer. This draft
looks at defining a standard set of calls to SQL DBMS's to allow the application to function on a larger base of
databases than was considered at the time of its development. This standard is hereafter called the Database Abstraction
Standard (DAS).
While most TCL applications are designed for one DBMS, the code is often made available to others that may have a
different DBMS. This standard will enhance the portability of such code as well as reducing the time taken to port the
application if the backend DBMS is changed..
The DAS requires that for each distinct DBMS an interface layer will be required to convert native TCL DBMS
function calls into those specified in the standard.
The abstraction standard has to fulfill two distinct roles. One is to allow the application to interface with the underlying DBMS in a consistent manner without limiting the facilities provided. Its job is to translate TCL procedure calls into native database library calls and format the returned data into the structure as per the standard. The second is to hide the differences that each DBMS supplier has put into its interpretation or extension to the SQL standard. To this end this requires a standard way of accessing SQL return information as well as some rewriting of the SQL statements passes to the DAS processor.
Both the DDL as well as DML constructs will have to be handled by the DAS. Dependant on how much variation there
is , the DAS may require separate functions to handle the DDL as parsing the SQL statements may not provide enough
information to translate to all common DBMS syntaxes.
The DAS for any given DBMS will be split into three parts, these are :
1. Interface Functions.
2. DML functions.
3. SQL rewriting
Only items 1 and 2 are considered part of the core standard. Item 3 is to allow developers to use the full power of the
DBMS that they are developing the application for, but allow the easier migration to other DBMS systems by
dynamically converting non ANSI standard parts of queries into ones usable in the new DBMS system. By
incorporating rewriting into the standard it is hoped that other less portable methods of working with cross platform
development are not used. Every driver must support the rewriting specifications but is allowed to pass the request to
the dbms loader for execution. A Full driver will leverage off the dbms loader but will implement it's own rewriting
routes for emulation of functionality not native to itself.
Under the standard all DDL functions are to be accessed from DML statements. Any driver than has seperate DBMS
api functions for DDL functions will have to implement query parsing and translation to work correctly.
The implementation of the native database abstraction standard will be through an intermediatory package This will be
called the DBMS loader. This package will provide a catalog of available native database interface packages that meet
this standard. A TCL Call to connect to a specific database will be routed via this package to the appropriate database handler.
The standard will allow multiple native DBMS to be accessed concurrently by the one TCL application.
The DAS will be implemented as a widget command type interface to TCL, to allow object technologies to be used for
the underlying DBMS interface. This also allows called widget procedures to be made directly to the underlying
interface without having to go through a level of indirection via the DBMS loader.
The following describes the API for the dbms loader which is used for creating and maintaining data sources and driver
information.
Package dbms
The dbms loader package will support the following commands.
design | application driver | ||||
Register driver as the type used for design of the application. This is used to control the rewrite engine where rewriting is dependant on the source and destination drivers. Application names are unique long names assigned by the developer. This is used by the loader to set write rules in effect when an application uses a data source that uses a different driver than the one it was designed for. | |||||
add | [driver / server ] name | ||||
Registers a new server or driver, or changes the parameters for an existing server. Any options specified for the driver will be used as defaults for connections created with the driver. Order of prefference is connection, driver, server. | |||||
-Driver | Set the driver type | ||||
-Package | tcl script file or package name | ||||
-Command | Driver command name for connections | ||||
-Init | Initialisation command to call after loading driver | ||||
-host | set the default host | ||||
-user | set the default user | ||||
-pass | set the default password | ||||
-db | sets the default database if possible. | ||||
drivers | Returns a list of known DBMS drivers. Well known names are: | ||||
Msql | Informix | Empress | Ingres | ||
MySql | Postgres | XBase | RDB | ||
Sybase | Solid | ODBC | Progress | ||
SQLServer | DB2 | Oracle | AdabasD | ||
servers | Returns a list of DBMS servers. This uses information registered in an file and provides a link to the underlying database driver. This is similar to the ODBC ini file. | ||||
option | [server | driver | connection] name [option [value]] | ||||
Queries or sets a server,driver or conenction option. If option is specified then value for that option is returned. If value is specified the option is set to the new value. Options can be created by the application for holding connection required state or static data. A request to return the value of an option that does not exists will return an empty list with no error. The options are specific for a driver, some of the common options may be: | |||||
-user | Set default user name | ||||
-pass | Set default password | ||||
-host | Sets default host for connections | ||||
-db | Default database | ||||
-autocommit | Sets state of autocommit for new connections | ||||
-cursor | Set default cursor type | ||||
-cursortype | Set default cursor type | ||||
-rowsetsize | Sets cursor rowset size | ||||
-nullvalue | Sets null value return string | ||||
-forcecount | Sets forcecount for driver. The driver will execute a count(*) on the current query to ascertain the number of rows returned. | ||||
connect | server connection [option ...] | ||||
Makes a connection to a server and creates a widget connection to interact with the DBMS. Options specified override defaults set in the driver or server. Default options are copied from the server and driver default options before command line options are applied. | |||||
drop | connection | ||||
Drops a connection to a driver. Calls the connection drop procedure to close the driver connection then frees DAS resources. If connection is all then all connections will be dropped. | |||||
rewrite | options | ||||
See rewrite section below. |
Notes
Options that begin with lowercase 'a' to 'z' are considered global and are copied from Server and Driver settings to a
new connection.
The loader will load the appropriate database driver and call the initialisation fuction the first time a connection is made
using that database driver. After a connection is made, the connection object will call the driver directly without
intervention of the loader. Options available to the driver and connections should be set in the global array DBMS.
These options should be in lowercase and begin with a minus sign as per the command line equivalent. The options
should be configured prefixed with the text 'def,drv,{driver},'. The options can then be picked up from the array by the
driver configurator and applied/removed from each server as desired. Options that do not begin with a lower case a - z
are not replicated for each connection and can be equated to global or driver specific options. These can be set/unset on
a per server basis.
The dbms driver package will support the following utility functions:
quote | type data | |
Performs ANSI quoting of ANSI data types. Unknown field types are returned as an empy set | ||
parse | array {arglist} | |
Parses parameters and stores them as associative entries in the array array. Arguments are stored as array(arg0) to array(argn) based on the order they are found in the list. Options are stored as array(option). | ||
regexpfunc | funcname args# | |
generates the regexp for matching a function in the form of func(arg1,arg2 ...,argn). The number of args in the expression must be specified. The result can be used in a regex replace using \n to match the arguments in the original function. | ||
rewrite | options | |
The dbms loader supports the full rewrite specification. A driver may call the dbms loader to perform rewriting but is responsible for handling stored procedure emulation. See rewrite section below for details. | ||
sqlparse | query array | |
Parses a sql statement and fill in array with result. Only parses DML querys.
Returns: | ||
array(cmd) | DML command | |
array(fields) | Field list | |
aray(tables) | Table list | |
array(where) | where clause if used | |
array(having) | having clause if used | |
array(order) | order by if used | |
array(group) | group by if used | |
array(columns) | columns if specified | |
array(values) | values if specified | |
array(sub) | sub query of insert command | |
array(union) | rest of query after union |
Package dbms_[Driver]
Each driver package will create a tcl command in response to the connect command from the dbms loader. This
command will store the state information for one connection to the DBMS. The underlying driver may use seperate
statements on one physical connection for each of the tcl connection commands but each must be independant of each other.
The driver tcl command must support all of the following commands:
Command | Description | |
use | Changes default database. This command is ignored if driver only supports one database for a connection. Returns the currently selected database if no database specified. | |
Database | New database. | |
transact | Maintains transaction information on connection. | |
begin | Starts transaction | |
commit | Commits pending transactions on this connection | |
rollback | Rolls back all uncommitted transactions on this connection | |
transact commands are ignored and return true if transactions are not supported on the driver. | ||
info | Return information on current result set. | |
-columns | attributes | |
Column info in result set returned as a list of attributes. Common attributes are : name, table, user , type , len, precision, maxlen, nullable, ispkey, isfkey. Attributes not known are returned as empty values. | ||
eg:,{ {colname coltable colfield coltype collen colmaxlen ...} {...} } | ||
-cursor | Name of open cursor. | |
-colcount | Number of columns in result set. | |
-rowcount | Numer of rows available in current result set. Returns -1 if rowcount is unknown. | |
open | Execute SQL statement. Result set is left pending and can be fetched by the fetch command. | |
-sql | qry | |
Assigns new sql query to this connection. If not specified the last sql statement passed to the prepare command used for the open. | ||
-params | params | |
Executes sql statement with parameter substitution. Parameters are replaced by the driver if the DBMS does not support parameter substitution. | ||
-cursor | name | |
Use a cursor name for this statement. The driver will use curosr of next suitable type if requested type is not available. See cursor section below. | ||
-cursortype | {type modifier} | |
type one of : forward, scroll, static | ||
modifier is typically one of : update , insensitive, keyset | ||
-prepare | {arg type ...} | |
prepares sql statement for execution but does not execute it. Records argument types for following arguments. This is needed so that quoting can be done in the driver for drivers that do not support parameter substitution. | ||
-proc | name | |
Calls stored procedure rather than sql statement. This option must be supported by all drivers. Procedure calls must be handled by call rewriting by driver if not implemented natively. | ||
exec | Execute sql statement. Result set is closed at the end of the command. The result set can be saved with one of the options -in*. See fetch for details. | |
-invar | var | |
Store result as a list in var. | ||
-inarray | Array | |
Store result as Associative entries in array. Only the first row is saved. | ||
-sql | qry | |
Assigns new sql query to this connection. If not specified the last sql statement is used for the open. | ||
-params | params | |
Parameters to substitute into query. | ||
-proc | name | |
Calls stored procedure rather than sql statement. | ||
-loop | See fetch command for details. Used as a shortcut for open, fetch -loop, close. | |
-nonselect | Query is not a select statement. This option exists for drivers that use a seperate api function for non select queries when no result set can be returned and using a select api call would result in an error. | |
fetch | Fetches next row from the result set. By default row is returned as a list | |
-row n | Fetch absolute row number n. | |
-byrows | Fetches entire result set as a list of rows. | |
-bycols | Fetches entire remaining result set as a list of columns. | |
-invar | var | |
Store result as a list in var. | ||
-inarray | Array | |
Store result as Associative entries in array. Only the first row is saved. | ||
-status | var | |
Fetch status is saved in the variable var. Value should be 1 for success, 0 for no more rows, negative for error. | ||
-loop | command arguments | |
Execute command over each row in the result set. Command is parsed two additional arguments arguments. The first is the connection handle, the second a flag indicating one of Begin, Data, End. This allows the routine to perform initialising and cleanup actions before and after the data records. Use -invar or -inarray to save row into a variable for use in the routine. Data returned from the command is saved and returned to the caller when the dataset is consumed. A typical use for this command would be converting a result set to an html table. | ||
capabilities | Returns info on underlying DMBS system as named value pairs.. | |
transact | 1 If transactions are supported | |
procs | 1 if stored procedures are native | |
subselect | 1 if sub selects are supported | |
*selectinto | 1 if select into table is supported | |
ansijoin | 1 if ansi sql2 join syntax is supported | |
fieldtypes | returns list of field types DMBS supports | |
usertables | Database supports tables with the same name for different users. This requires specifying Database.User.Table for table references. | |
dbms | Multiple or single. Set to Multiple if selects can query data in Tables accross databases. | |
option | option value ... option value | |
Retrieves or sets connection options. Upcalls the dbms driver to do option processing. May also rewrite options and set driver specific settings in response to input options. Option tracing should be set in the driver initialisation for options that must change the state of the connection. | ||
-ctype | type | |
Set default cursor type for open result sets. | ||
-autocommit | set to 1 to enable autocommit for this connection. Defaults to 1 | |
-maxrows | n | |
Limit result set to n rows at most. | ||
-countrows | If set then the rowcount command always returns the number of rows in a resultset even for select queries. | |
-forcecount | Forces the driver to perform a count(*) on the last query in response to a rowcount command if it does not support row counts automatically. | |
Note: Options specific to a DBMS may be added to this list of options. Other DBMS are to ignore, emulate or rewrite options that are not used by itself. | ||
rewrite | Controls SQL rewriting. See Rewrite section in this spec. | |
close | Explicitly closes any pending result set. | |
lastinsertid | Returns last insert id on this connection. | |
seek | Repositions data pointer in result set. Returns 0 if seek cannot be made.All drivers will support the rewind function andreopen the result set or reissue the query if necessary. | |
-rewind | moves to the begining of the result set. | |
-forward | n | |
move forward n rows | ||
-back | n | |
move backward n rows. | ||
-row | n | |
Reposition to row n | ||
quote | data | |
Return the argument quoted for the field type indicated. If type is not specified varchar is used. All special characters in the data are escaped as appropriate and quote characters are applied if necessary. | ||
-type | field | |
Quote as for a field type of field | ||
-escape | Returns the escape for enclosing Quote characters in strings. Ie \' or ''' | |
-character | Returns the quote character used for this field. | |
-notnull | treat the text NULL as ascii text not the NULL operator. Normally the text NULL wil not be quoted in varchar and date datatypes.. | |
dict | Return information on DBMS dictonary objects in system. All drivers are expected to return information for databases, tables, columns. | |
options | Returns list of options that the DBMS has dictionary items for, | |
A list of one or more of the following: users, databases, tables, columns, keys, indexes, views, triggers. | ||
users | Returns a list of DMBS users. | |
databases | Returns list of databases in system. | |
tables | db | |
Returns list of tables in database db. | ||
columns | db table attributes | |
Returns list of colum names and column attributes. Attributes not recognised by the driver are returned as empty values. | ||
primary | db table | |
Returns list of primary key fields in table. | ||
foreign | db table | |
Returns list of foreign keys in table | ||
indexes | db table | |
Returns list of indexes on table | ||
views | database | |
Returns list of views for database | ||
triggers | db table | |
List of trigger names for table. | ||
blob | reads or writes a blob entry for a record. | |
-infile | filename | |
File to use for blob data | ||
-invar | var | |
Uses tcl variable var for storing and retrieving blob data. | ||
write | Read database object into file or variable. Default option if not specified | |
read | Reads database object from file or variable into row. | |
-id
-column -table -rowid |
rowid | |
Specifies the Identifier for blob. If not specified current row is used for the blob id. | ||
-size | returns or sets the size of the blob object. |
All valid commands and options that are unimplemented by the driver must return no data and generate no error.
When a query is executed with a cursor, the driver will wrap the sql command in what ever syntax is used to create a
cursor. This will usually be "declare name as cursor for ..." syntax. The cursor will be opened so that the columns
names etc are available prior to a fetch being issued. A database that does not support cursors must emulate the action
by creating another database connection handle and using that for subsequent calls to fetch. Multiple open cursors are
available on the one connection and calls to fetch must pass the cursor name to specify which cursor to use. There will
always be a maximum of one result set available to fetch at any one time. Databases that emulate cursors must allow
fetching from multiple cursors. Close for the cursor must be called prior to reissuing a a open for the same cursor name.
The driver will delete the cursor reference when a close request is made to the cursor.
When using cursors the application should not put any "cursor" specific syntax into the sql statements. It is up to the
driver to manage cursors as appriopriate for the underlying DBMS.
All drivers will support rule rewriting. A Basic rewriting engine will be part of the DBMS Loader module. The load
functions can be called from the driver to perform rewriting if not implemented in the driver proper. To help out drivers
the dbms loader has support for basic sql rewriting.. The driver is responsible for checking that the connection has the
rewrite option set and if not supporting rewriting directly must call the dbms loader to perform the sql rewriting.
There are three parts of the rewriting engine that must be supported. They are:
1. SQL query rewriting. Including stored procedures.
2. Field type replacement.
3. Option replacement.
The following ReWrite Syntax is to be supported by both the driver and dbms loader.:
sql | Parse sql statements | |
-proc | name | |
Call tcl procedure and use returned string as replacement. String may contain \n sub strings from the matched string. The procedure is passed three parameters. The first is the conection handle, the second is the regexp to match, and the third is any replacement regexp specified. | ||
-inclass | class | |
Restricts rule to matching sql queries of one of the following classes: Select Create Insert Update DDL | ||
-inscope | {(first)(scope)(rest)} | |
Searches for scope in regexp and restricts replacement to the match found. The matches for first and rest are left unchanged. | ||
-reqmatch | regexp | |
The regular expression to match in the sql statement | ||
-reqsub | regexpsub | |
The replacement string including regexp sub parts with \n notation. | ||
-fordrivers | {fromdriver todriver} | |
Restrict rule to servers that have a designed driver of fromdriver and a executing on todriver. | ||
field | Replace field types. Used for quote matching rules. | |
-field | fieldtype | |
Specifies field type to override | ||
-newfield | fieldtype | |
Specifies new field type to use. | ||
-proc | name | |
Calls proc and uses returned result as new fields type. Proc is passed the connection name and the fieldtype as parameters. | ||
option | Replace option values for the connection option command. | |
-proc | name | |
Calls proc and uses returned result as the procedure name and parameters.. Proc is passed the connection name , the name of the procedure called and all the parameters. | ||
-callproc | name | |
Calls proc which returns a driver handle to a result set to use as the results from the procedure. | ||
-name | procname | |
A regexp that matches the name of a procedure. | ||
-newname | procname | |
The replacement name for the procedure to be called. | ||
-args | {{arg# regexp regexpsub} {arg# regexp regexpsub} {arg# regexp regexpsub}} | |
A List containing the argument order to use in the new query and a list of substitution regexp to apply on each argument. |