
Database Support in Inferno


D. Knudsen
DMTS Inferno OS Development Team
(dknudsen@lucent.com)
Lucent Technologies
Bell Labs Innovations
600 Mountain Ave.
Murray Hill, NJ 07974

Summary

Inferno is a network operating system that provides a software infrastructure for distributed network applications. Inferno allows any application written in the Limbo programming language to run across multiple platforms and networks under the Dis virtual machine. Inferno provides an elegant file-like interface to resources and services that allows the dynamic construction of a user name space. An Inferno application can access the resources and services in its name space even though they may be distributed throughout the network. The Styx component of Inferno provides transparent communications over a variety of networks with strong security capabilities built in.

The world of databases is diverse, even within the subset of databases that support access via Structured Query Language (SQL). Inferno can provide a standard generic interface to this subset of databases by leveraging its strengths in networking, dynamic name spaces, and security.

The R1.0 release of Inferno includes a database Application Programming Interface (API) and a remote database service program that allow developers to design and implement database applications that will be upward compatible with later releases of Inferno.

The API is embodied in a Dis loadable module that talks to the remote server running on the database platform. The server in turn talks to the database.



Introduction

Application developers will benefit from Inferno's unique capabilities:

With Inferno, database applications have new and extended uses under a broader set of networked devices. Such uses include personalized services, electronic commerce, data narrowcasting, etc.

The DB module

What the database application developer needs is a simple, generic application programming interface (API) that permits standardized access to (possibly remote) databases on non-Inferno platforms. Inferno provides this support, in the form of two distinct pieces. The first piece is a Dis loadable module, DB, written in Limbo, that provides an API that can be used to access any database supporting SQL. The second piece of software provided for Inferno support of databases is the server.

The developer accesses the DB module in the standard way in Limbo (see Chapter 6, Limbo Language Definition, in the Inferno User's Guide for details on Limbo syntax and semantics):

     include "db.m";
     :
     db := load DB DB->PATH;

The DB module consists of one function, open, and one abstract data type (adt), DB_Handle. The function is called as follows:

    dh : ref DB->DB_Handle;
    (dh, errs) = open(addr, user, password, dbname);

where the first line declares a variable holding a reference to an instance of the adt, and the second calls the function to get an actual instance. (If the function fails, errs contains the details of the failure.) The open function illustrates the Limbo capability of having a function return multiple values in the form of a tuple. This is a typical use: if the function succeeds, the first element of the tuple is non-nil and the second is nil; if it fails, the reverse is true.

addr is a string of the form netaddr!service, where netaddr is the network address of the database server, for instance db.server.lucent.com, and service is conventionally infdb, the Inferno database service name. The user and password strings are required by many DBMSs in order to control access to the data in the database named dbname (also a string). The returned value dh refers to an instance of the abstract data type, DB_Handle, which includes the functions needed to manipulate the opened database:

     (status, errmsg) = dh.SQL(sql_command);
     numcols = dh.columns();
     row = dh.nextRow();
     (status, field) = dh.read(col_number);
     count = dh.write(param_position, field);
     col_name = dh.columnTitle(col_number);

The first of these is used to send a request, in Structured Query Language (SQL), to the server. If the request produces a result set (typically the outcome of a select request), then columns returns the number of columns in the result set, nextRow steps through the rows of the result set one at a time, and columnTitle and read return the column names and the values in the current row, respectively. The column names are returned as strings, while the field values are returned as array of byte, thus allowing the retrieval of binary data (all other data types, including numeric and date types, are returned as strings).
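
As an illustration, the calls described above might be combined as in the following sketch of a complete Limbo program. It is not taken from the Inferno distribution. The account, database, and table names are invented for the example, and several conventions are assumptions that this document does not state: that a status of 0 from SQL means success, that nextRow returns a positive value while rows remain, and that columns are numbered from 1.

```limbo
implement Listemp;

include "sys.m";
	sys: Sys;
include "draw.m";
include "db.m";
	db: DB;

Listemp: module
{
	init: fn(nil: ref Draw->Context, nil: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	db = load DB DB->PATH;
	if(db == nil){
		sys->print("cannot load %s\n", DB->PATH);
		return;
	}

	# Hypothetical account and database name; the error value is
	# discarded because this sketch does not depend on its type.
	(dh, nil) := db->open("db.server.lucent.com!infdb", "guest", "secret", "personnel");
	if(dh == nil){
		sys->print("open failed\n");
		return;
	}

	# Assumption: a status of 0 means the request succeeded.
	(status, nil) := dh.SQL("select name, phone from employees");
	if(status != 0){
		sys->print("query failed\n");
		return;
	}

	# Print the column names as a header line.
	# Assumption: columns are numbered from 1.
	ncols := dh.columns();
	c: int;
	for(c = 1; c <= ncols; c++)
		sys->print("%s\t", dh.columnTitle(c));
	sys->print("\n");

	# Assumption: nextRow returns a positive value while rows remain.
	while(dh.nextRow() > 0){
		for(c = 1; c <= ncols; c++){
			# Field values arrive as array of byte; the status
			# element is ignored in this sketch.
			(nil, field) := dh.read(c);
			sys->print("%s\t", string field);
		}
		sys->print("\n");
	}
}
```

Converting each field with string is only appropriate for text, numeric, and date columns; a binary column would be kept as an array of byte.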

The write function is used only if the application must insert or update binary data fields; it is needed because the corresponding SQL request cannot contain binary data, only placeholders representing the actual data. The param_position argument is an integer (1, 2, etc.) referring to the first, second, etc., placeholder in a subsequent SQL request. The field parameter is a byte array, to permit any kind of binary data to be passed.
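
A hedged sketch of how write might be paired with the request that follows it is shown here. The helper name storepicture, the table and column names, and the use of ? as the placeholder (the ODBC parameter marker) are assumptions made for illustration, as is the reading of a negative count or a nonzero status as failure. Note also that Release 1.0 does not support updating binary fields, as described under Inferno 1.0.

```limbo
implement Storepic;

include "sys.m";
include "db.m";

Storepic: module
{
	storepicture: fn(dh: ref DB->DB_Handle, id: int, image: array of byte): int;
};

# storepicture is a hypothetical helper, not part of the DB module.
storepicture(dh: ref DB->DB_Handle, id: int, image: array of byte): int
{
	# Supply the binary value for the first placeholder before
	# sending the request that refers to it.
	# Assumption: a negative count signals failure.
	if(dh.write(1, image) < 0)
		return -1;

	# Assumption: the placeholder is written as "?".
	(status, nil) := dh.SQL("update pictures set image = ? where id = " + string id);
	return status;
}
```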

Database application developers can build more complex and specialized functions out of these generic and lower-level functions.

Module details

The DB module makes direct use of Inferno's name space concepts to implement its simple design. In particular, the open function opens a network connection to the specified service at the specified network address. By mounting the connection on a directory in the local name space, the remote end is set to "serve" files in that directory. Following the open, the served files support the functions of the API or, if the programmer prefers, can be read from or written to directly via file descriptors in the DB_Handle adt.

Closely paralleling the API, the served files are: ctl, SQL, cols, row, entry, and title. The ctl file is used only to send the user name, password, and database name to the remote end. The SQL file is used to write SQL requests to the far side, and the remaining files are used to implement the remaining APIs. For instance, to implement the read function, the specified column number is written to the cols file, and then the entry file is read, up to an end-of-file, to get the value of the field in that column of the current row.
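
The read sequence just described might look like the following when written directly against file descriptors. This is a sketch, not the DB module's source: the helper name readfield is invented, the cols and entry arguments stand for descriptors already open on the served files of those names, and sending the column number as decimal text is an assumption about the format the server expects.

```limbo
implement Readfield;

include "sys.m";
	sys: Sys;

Readfield: module
{
	readfield: fn(cols, entry: ref Sys->FD, col: int): array of byte;
};

# readfield is a hypothetical helper, not part of the DB module.
readfield(cols, entry: ref Sys->FD, col: int): array of byte
{
	if(sys == nil)
		sys = load Sys Sys->PATH;

	# Select the column by writing its number to the cols file.
	# Assumption: the number is sent as decimal text.
	num := array of byte sys->sprint("%d", col);
	if(sys->write(cols, num, len num) != len num)
		return nil;

	# Read the entry file up to end-of-file, accumulating the value.
	# A read error also ends the loop; this sketch does not
	# distinguish it from end-of-file.
	field := array[0] of byte;
	buf := array[8192] of byte;
	while((n := sys->read(entry, buf, len buf)) > 0){
		grown := array[len field + n] of byte;
		grown[0:] = field;
		grown[len field:] = buf[0:n];
		field = grown;
	}
	return field;
}
```

Reading in a loop rather than with a single call matters for large binary fields, which may not fit in one buffer.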

The advantage of this implementation is that the data being sent back and forth from client to server is encapsulated in Styx packets, so the security features of the Styx protocol may be used.

The Server Module

The second piece of software provided for Inferno support of databases is the server. The application programmer does not interface directly with the server, which hides the complexity of retrieving data from real databases. This complexity is because:

The second bullet arises from the historic interplay between the desire of vendors for product differentiation and the desire of users for standardization. Structured Query Language (SQL) is an early, and continuing, effort to standardize the syntax and semantics of messages between database applications and DBMSs. The majority of commercial DBMSs now support SQL.

Unfortunately, SQL is an evolving standard. The original standard was adopted in 1986; it was followed by SQL 89 and the current SQL 92, and another revision is in the works. While the changes are upward-compatible, different releases of the various vendors' DBMSs support different standards.

In addition, there are several programmatic interfaces to SQL, including static and dynamic embedded SQL and the call level interface (CLI). Embedded SQL, in both its forms, is a technique for embedding SQL statements within other languages, calling for the use of a preprocessor. In the static version the SQL statements are fixed at compile time, while in the dynamic version they can be constructed at run time. A CLI is a set of APIs supported by a function library.

Finally, vendors have not hesitated to provide proprietary extensions to SQL and to their CLIs.

A common instance of a CLI is the Open Database Connectivity (ODBC) interface, running on various Microsoft Windows platforms. ODBC is not tied to any particular DBMS, but depends on DBMS vendors to provide a driver, in the form of a dynamically loadable library that supports a prescribed set of APIs. Since there are now such drivers for most major DBMSs, ODBC provides "almost universal" database access, as long as one sticks to the lowest conformance levels for both ODBC and SQL.

Vendors have also lately begun to supply servers to allow databases supported by their DBMS to be accessed by client applications over a network. Again unfortunately, the protocols used by these servers are proprietary.

The net result of all this is that database applications are connected to the database via some opaque glue supplied by the DBMS vendor, and are typically written in C, perhaps requiring a preprocessing step before compilation.

For Inferno support of a database, the generic architecture is:

There is a hosted Inferno on the database platform to serve the special files representing the database to the DB module. There are quotes around "standard" in the Database App. box because even though this application is the same for all databases under this DBMS on this platform, it must be different for different DBMSs, and must be ported to each platform where support for a particular DBMS/database is needed. (The DBMS/database could actually be on some other platform; the details of the connection from the database application to it are hidden by vendor code bound into the application.)

This standard method for accessing a database from Inferno involves several different software processes and interprocess communication methods. There are ways to improve the efficiency of database access where large amounts of data are transferred for each transaction, and the Inferno team is actively pursuing them. But since all the changes will be to the remote access piece and will not affect the APIs, applications written today against the DB API will not even need to be recompiled to run when these improvements are put in place.

Inferno 1.0

Release 1.0 of Inferno includes a DB module with the complete API described above, except that it doesn't have the ability to update binary data fields.

The "standard" database application included in Inferno Release 1.0 is an ODBC application that runs on Windows 95 and Windows NT and talks to the DB module (no intermediate hosted Inferno is needed). This program, called infdb, will support simultaneous connections from multiple Inferno processes to the same, or different, databases.

This combination should be sufficient to allow a developer to write trial applications. Future releases of Inferno may include new versions of the DB module and additional database server programs (to improve efficiency and security, for instance), but applications written to the current API should not need to be changed.