TCL Database Abstraction Standard

(Copyright Philip Quaife 1999)

DRAFT Version 1.2



Abstract

  
Lack of a common access method to disparate SQL databases is a stumbling block to the availability of generic database applications in the net community. As each application developer has coded specifically for a target database, they have almost uniformly created a set of procedures to manage the underlying database structure to free the coding task of repetitive low level database calls. This sort of interface library is the basis of an abstraction layer. This draft looks at defining a standard set of calls to SQL DBMS's to allow the application to function on a larger base of databases than was considered at the time of its development. This standard is hereafter called the Database Abstraction Standard (DAS).

While most TCL applications are designed for one DBMS, the code is often made available to others that may have a different DBMS. This standard will enhance the portability of such code as well as reducing the time taken to port the application if the backend DBMS is changed..
 

Design Rational

  
The DAS requires that for each distinct DBMS an interface layer will be required to convert native TCL DBMS function calls into those specified in the standard.

The abstraction standard has to fulfill two distinct roles. One is to allow the application to interface with the underlying DBMS in a consistent manner without limiting the facilities provided. Its job is to translate TCL procedure calls into native database library calls and format the returned data into the structure as per the standard. The second is to hide the differences that each DBMS supplier has put into its interpretation or extension to the SQL standard. To this end this requires a standard way of accessing SQL return information as well as some rewriting of the SQL statements passes to the DAS processor.

Both the DDL as well as DML constructs will have to be handled by the DAS. Dependant on how much variation there is , the DAS may require separate functions to handle the DDL as parsing the SQL statements may not provide enough information to translate to all common DBMS syntaxes.
 

Specification

  
 
 The DAS for any given DBMS will be split into three parts, these are :
 

1. Interface Functions.

2. DML functions.

3. SQL rewriting
 

Only items 1 and 2 are considered part of the core standard. Item 3 is to allow developers to use the full power of the DBMS that they are developing the application for, but allow the easier migration to other DBMS systems by dynamically converting non ANSI standard parts of queries into ones usable in the new DBMS system. By incorporating rewriting into the standard it is hoped that other less portable methods of working with cross platform development are not used. Every driver must support the rewriting specifications but is allowed to pass the request to the dbms loader for execution. A Full driver will leverage off the dbms loader but will implement it's own rewriting routes for emulation of functionality not native to itself.
 

DDL Functions

  
Under the standard all DDL functions are to be accessed from DML statements. Any driver than has seperate DBMS api functions for DDL functions will have to implement query parsing and translation to work correctly.
 

Implementation

  
The implementation of the native database abstraction standard will be through an intermediatory package This will be called the DBMS loader. This package will provide a catalog of available native database interface packages that meet this standard. A TCL Call to connect to a specific database will be routed via this package to the appropriate database handler.
 

The standard will allow multiple native DBMS to be accessed concurrently by the one TCL application.
 

The DAS will be implemented as a widget command type interface to TCL, to allow object technologies to be used for the underlying DBMS interface. This also allows called widget procedures to be made directly to the underlying interface without having to go through a level of indirection via the DBMS loader.
 

API

The following describes the API for the dbms loader which is used for creating and maintaining data sources and driver information.
 

DBMS Loader

Package dbms
 

The dbms loader package will support the following commands.
 
design application driver
Register driver as the type used for design of the application. This is used to control the rewrite engine where rewriting is dependant on the source and destination drivers. Application names are unique long names assigned by the developer. This is used by the loader to set write rules in effect when an application uses a data source that uses a different driver than the one it was designed for.
add [driver / server ] name
Registers a new server or driver, or changes the parameters for an existing server. Any options specified for the driver will be used as defaults for connections created with the driver. Order of prefference is connection, driver, server.
-Driver Set the driver type 
-Package tcl script file or package name
-Command Driver command name for connections
-Init Initialisation command to call after loading driver
-host set the default host
-user set the default user
-pass set the default password
-db sets the default database if possible.
drivers Returns a list of known DBMS drivers. Well known names are:
Msql Informix Empress Ingres
MySql Postgres XBase RDB
Sybase Solid ODBC Progress
SQLServer DB2 Oracle AdabasD
servers Returns a list of DBMS servers. This uses information registered in an file and provides a link to the underlying database driver. This is similar to the ODBC ini file.
option [server | driver | connection] name [option [value]]
Queries or sets a server,driver or conenction option. If option is specified then value for that option is returned. If value is specified the option is set to the new value. Options can be created by the application for holding connection required state or static data. A request to return the value of an option that does not exists will return an empty list with no error. The options are specific for a driver, some of the common options may be:
-user Set default user name
-pass Set default password
-host Sets default host for connections
-db Default database
-autocommit Sets state of autocommit for new connections
-cursor Set default cursor type
-cursortype Set default cursor type
-rowsetsize Sets cursor rowset size
-nullvalue Sets null value return string
-forcecount Sets forcecount for driver. The driver will execute a count(*) on the current query to ascertain the number of rows returned.
connect server connection [option ...]
Makes a connection to a server and creates a widget connection to interact with the DBMS. Options specified override defaults set in the driver or server. Default options are copied from the server and driver default options before command line options are applied.
drop connection
Drops a connection to a driver. Calls the connection drop procedure to close the driver connection then frees DAS resources. If connection is all then all connections will be dropped.
rewrite options
See rewrite section below.

Notes
Options that begin with lowercase 'a' to 'z' are considered global and are copied from Server and Driver settings to a new connection.


 

Driver Interface



The loader will load the appropriate database driver and call the initialisation fuction the first time a connection is made using that database driver. After a connection is made, the connection object will call the driver directly without intervention of the loader. Options available to the driver and connections should be set in the global array DBMS. These options should be in lowercase and begin with a minus sign as per the command line equivalent. The options should be configured prefixed with the text 'def,drv,{driver},'. The options can then be picked up from the array by the driver configurator and applied/removed from each server as desired. Options that do not begin with a lower case a - z are not replicated for each connection and can be equated to global or driver specific options. These can be set/unset on a per server basis.
 

Utility Functions

  
 
 The dbms driver package will support the following utility functions:
  
 
quote type data
Performs ANSI quoting of ANSI data types. Unknown field types are returned as an empy set
parse array {arglist}
Parses parameters and stores them as associative entries in the array array. Arguments are stored as array(arg0) to array(argn) based on the order they are found in the list. Options are stored as array(option).
regexpfunc funcname args#
generates the regexp for matching a function in the form of func(arg1,arg2 ...,argn). The number of args in the expression must be specified. The result can be used in a regex replace using \n to match the arguments in the original function.
rewrite options
The dbms loader supports the full rewrite specification. A driver may call the dbms loader to perform rewriting but is responsible for handling stored procedure emulation. See rewrite section below for details.
sqlparse query array
Parses a sql statement and fill in array with result. Only parses DML querys. 

Returns:

array(cmd) DML command
array(fields) Field list
aray(tables) Table list
array(where) where clause if used
array(having) having clause if used
array(order) order by if used
array(group) group by if used
array(columns) columns if specified
array(values) values if specified
array(sub) sub query of insert command
array(union) rest of query after union

 
 

DBMS Syntax

Package dbms_[Driver]
 

Each driver package will create a tcl command in response to the connect command from the dbms loader. This command will store the state information for one connection to the DBMS. The underlying driver may use seperate statements on one physical connection for each of the tcl connection commands but each must be independant of each other.
 

The driver tcl command must support all of the following commands:
 
Command Description
use Changes default database. This command is ignored if driver only supports one database for a connection. Returns the currently selected database if no database specified.
Database New database.
transact Maintains transaction information on connection.
begin Starts transaction
commit Commits pending transactions on this connection
rollback Rolls back all uncommitted transactions on this connection
transact commands are ignored and return true if transactions are not supported on the driver.
info Return information on current result set.
-columns attributes
Column info in result set returned as a list of attributes. Common attributes are : name, table, user , type , len, precision, maxlen, nullable, ispkey, isfkey. Attributes not known are returned as empty values.
eg:,{ {colname coltable colfield coltype collen colmaxlen ...} {...} }
-cursor Name of open cursor.
-colcount Number of columns in result set.
-rowcount Numer of rows available in current result set. Returns -1 if rowcount is unknown.
open Execute SQL statement. Result set is left pending and can be fetched by the fetch command.
-sql qry
Assigns new sql query to this connection. If not specified the last sql statement passed to the prepare command used for the open.
-params params
Executes sql statement with parameter substitution. Parameters are replaced by the driver if the DBMS does not support parameter substitution.
-cursor name 
Use a cursor name for this statement. The driver will use curosr of next suitable type if requested type is not available. See cursor section below.
-cursortype {type modifier}
type one of : forward, scroll, static
modifier is typically one of : update , insensitive, keyset
-prepare 

{arg type ...}
prepares sql statement for execution but does not execute it. Records argument types for following arguments. This is needed so that quoting can be done in the driver for drivers that do not support parameter substitution.
-proc name
Calls stored procedure rather than sql statement. This option must be supported by all drivers. Procedure calls must be handled by call rewriting by driver if not implemented natively.
exec Execute sql statement. Result set is closed at the end of the command. The result set can be saved with one of the options -in*. See fetch for details.
-invar  var
Store result as a list in var.
-inarray  Array
Store result as Associative entries in array. Only the first row is saved.
-sql qry
Assigns new sql query to this connection. If not specified the last sql statement is used for the open.
-params params
Parameters to substitute into query.
-proc name
Calls stored procedure rather than sql statement.
-loop See fetch command for details. Used as a shortcut for open, fetch -loop, close.
-nonselect Query is not a select statement. This option exists for drivers that use a seperate api function for non select queries when no result set can be returned and using a select api call would result in an error.
fetch Fetches next row from the result set. By default row is returned as a list
-row n Fetch absolute row number n.
-byrows Fetches entire result set as a list of rows.
-bycols Fetches entire remaining result set as a list of columns.
-invar  var
Store result as a list in var.
-inarray  Array
Store result as Associative entries in array. Only the first row is saved.
-status  var
Fetch status is saved in the variable var. Value should be 1 for success, 0 for no more rows, negative for error.
-loop  command arguments
Execute command over each row in the result set. Command is parsed two additional arguments arguments. The first is the connection handle, the second a flag indicating one of Begin, Data, End. This allows the routine to perform initialising and cleanup actions before and after the data records. Use -invar or -inarray to save row into a variable for use in the routine. Data returned from the command is saved and returned to the caller when the dataset is consumed. A typical use for this command would be converting a result set to an html table.
capabilities Returns info on underlying DMBS system as named value pairs..
transact 1 If transactions are supported
procs 1 if stored procedures are native
subselect 1 if sub selects are supported
*selectinto 1 if select into table is supported
ansijoin 1 if ansi sql2 join syntax is supported
fieldtypes returns list of field types DMBS supports
usertables Database supports tables with the same name for different users. This requires specifying Database.User.Table for table references.
dbms Multiple or single. Set to Multiple if selects can query data in Tables accross databases.
option option value ... option value
Retrieves or sets connection options. Upcalls the dbms driver to do option processing. May also rewrite options and set driver specific settings in response to input options. Option tracing should be set in the driver initialisation for options that must change the state of the connection.
-ctype  type
Set default cursor type for open result sets.
-autocommit set to 1 to enable autocommit for this connection. Defaults to 1
-maxrows n
Limit result set to n rows at most.
-countrows If set then the rowcount command always returns the number of rows in a resultset even for select queries.
-forcecount Forces the driver to perform a count(*) on the last query in response to a rowcount command if it does not support row counts automatically.
Note: Options specific to a DBMS may be added to this list of options. Other DBMS are to ignore, emulate or rewrite options that are not used by itself.
rewrite Controls SQL rewriting. See Rewrite section in this spec.
close Explicitly closes any pending result set.
lastinsertid Returns last insert id on this connection.
seek Repositions data pointer in result set. Returns 0 if seek cannot be made.All drivers will support the rewind function andreopen the result set or reissue the query if necessary.
-rewind moves to the begining of the result set.
-forward n
move forward n rows
-back n
move backward n rows.
-row n
Reposition to row n
quote data
Return the argument quoted for the field type indicated. If type is not specified varchar is used. All special characters in the data are escaped as appropriate and quote characters are applied if necessary. 
-type field
Quote as for a field type of field
-escape Returns the escape for enclosing Quote characters in strings. Ie \' or '''
-character Returns the quote character used for this field.
-notnull treat the text NULL as ascii text not the NULL operator. Normally the text NULL wil not be quoted in varchar and date datatypes..
dict Return information on DBMS dictonary objects in system. All drivers are expected to return information for databases, tables, columns.
options Returns list of options that the DBMS has dictionary items for,
A list of one or more of the following: users, databases, tables, columns, keys, indexes, views, triggers.
users Returns a list of DMBS users.
databases Returns list of databases in system.
tables db
Returns list of tables in database db.
columns  db table attributes
Returns list of colum names and column attributes. Attributes not recognised by the driver are returned as empty values.
primary  db table
Returns list of primary key fields in table.
foreign db table
Returns list of foreign keys in table
indexes db table
Returns list of indexes on table
views database
Returns list of views for database
triggers db table
List of trigger names for table.
blob reads or writes a blob entry for a record.
-infile filename
File to use for blob data
-invar var
Uses tcl variable var for storing and retrieving blob data.
write Read database object into file or variable. Default option if not specified
read Reads database object from file or variable into row. 
-id

-column

-table

-rowid

rowid
Specifies the Identifier for blob. If not specified current row is used for the blob id.
-size returns or sets the size of the blob object.

All valid commands and options that are unimplemented by the driver must return no data and generate no error.



Cursors

  
 
When a query is executed with a cursor, the driver will wrap the sql command in what ever syntax is used to create a cursor. This will usually be "declare name as cursor for ..." syntax. The cursor will be opened so that the columns names etc are available prior to a fetch being issued. A database that does not support cursors must emulate the action by creating another database connection handle and using that for subsequent calls to fetch. Multiple open cursors are available on the one connection and calls to fetch must pass the cursor name to specify which cursor to use. There will always be a maximum of one result set available to fetch at any one time. Databases that emulate cursors must allow fetching from multiple cursors. Close for the cursor must be called prior to reissuing a a open for the same cursor name. The driver will delete the cursor reference when a close request is made to the cursor.

When using cursors the application should not put any "cursor" specific syntax into the sql statements. It is up to the driver to manage cursors as appriopriate for the underlying DBMS.
 
  

ReWriting Rules.

All drivers will support rule rewriting. A Basic rewriting engine will be part of the DBMS Loader module. The load functions can be called from the driver to perform rewriting if not implemented in the driver proper. To help out drivers the dbms loader has support for basic sql rewriting.. The driver is responsible for checking that the connection has the rewrite option set and if not supporting rewriting directly must call the dbms loader to perform the sql rewriting.
 

There are three parts of the rewriting engine that must be supported. They are:

1. SQL query rewriting. Including stored procedures.

2. Field type replacement.

3. Option replacement.
 

The following ReWrite Syntax is to be supported by both the driver and dbms loader.:
 
sql Parse sql statements
-proc name
Call tcl procedure and use returned string as replacement. String may contain \n sub strings from the matched string. The procedure is passed three parameters. The first is the conection handle, the second is the regexp to match, and the third is any replacement regexp specified.
-inclass class
Restricts rule to matching sql queries of one of the following classes: Select Create Insert Update DDL
-inscope {(first)(scope)(rest)}
Searches for scope in regexp and restricts replacement to the match found. The matches for first and rest are left unchanged.
-reqmatch regexp
The regular expression to match in the sql statement
-reqsub regexpsub
The replacement string including regexp sub parts with \n notation.
-fordrivers {fromdriver todriver}
Restrict rule to servers that have a designed driver of fromdriver and a executing on todriver.
field Replace field types. Used for quote matching rules.
-field fieldtype
Specifies field type to override
-newfield fieldtype
Specifies new field type to use.
-proc name
Calls proc and uses returned result as new fields type. Proc is passed the connection name and the fieldtype as parameters.
option Replace option values for the connection option command.
-proc name
Calls proc and uses returned result as the procedure name and parameters.. Proc is passed the connection name , the name of the procedure called and all the parameters.
-callproc name
Calls proc which returns a driver handle to a result set to use as the results from the procedure.
-name procname
A regexp that matches the name of a procedure.
-newname procname
The replacement name for the procedure to be called.
-args {{arg# regexp regexpsub} {arg# regexp regexpsub} {arg# regexp regexpsub}}
A List containing the argument order to use in the new query and a list of substitution regexp to apply on each argument.


Back To TCL index