Database Language SQL

SQL is a popular relational database language first standardized in 1986 by the American National Standards Institute (ANSI). Since then, it has been formally adopted as an International Standard by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It has also been adopted as a Federal Information Processing Standard (FIPS) for the U.S. federal government.

Database Language SQL is under continual development by the above mentioned standardization bodies. The most recent published version was in 1992, a 580 page specification published by ANSI as American National Standard X3.135-1992 and by ISO/IEC as International Standard 9075:1992. The two specifications are word-for-word identical. Both versions are available in hard copy only from ANSI (Sales telephone: +1-212-642-4900). Further extensions and enhancements are under development (see SQL ENHANCEMENTS below).

The FIPS SQL specification is published as FIPS PUB 127-2; it points to the ANSI specification by reference, presents four FIPS SQL conformance levels, and identifies additional FIPS SQL requirements for flagging extensions, for documentation of supported features, and for character set support.

The SQL standard is very popular with a large and increasing number of conforming implementations. It is, or soon will be, the basis of definition for a majority of Federal databases and database applications involving structured data.

SCOPE of FIPS SQL

A database language standard specifies the semantics of various components of a database management system (DBMS). In particular, it defines the structures and operations of a data model implemented by the DBMS, as well as other components that support data definition, data access, security, programming language interface, and data administration. The SQL standard specifies data definition, data manipulation, and other associated facilities of a DBMS that supports the relational data model.

A database language standard is appropriate for all database applications where data will be shared with other applications, where the life of the application is longer than the life of current equipment, or where the application is to be understood and maintained by programmers other than the original ones.

DESCRIPTION of SQL

The basic structure of the relational model is a table, consisting of rows and columns. Data definition includes declaring the name of each table to be included in a database, the names and data types of all columns of each table, constraints on the values in and among columns, and the granting of table manipulation privileges to prospective users. Tables can be accessed by inserting new rows, deleting or updating existing rows, or selecting rows that satisfy a given search condition for output. Tables can be manipulated to produce new tables by Cartesian products, unions, intersections, joins on matching columns, or projections on given columns.

SQL data manipulation operations may be invoked through a cursor or through a general query specification. The language includes all arithmetic operations, predicates for comparison and string matching, universal and existential quantifiers, summary operations for max/min or count/sum, and GROUP BY and HAVING clause to partition tables by groups. Transaction management is achieved through COMMIT and ROLLBACK statements.

The standard provides language facilities for defining application specific views of the data. Each view is the specification of database operations that would produce a desired table. The viewed table is then materialized at application execution time.

The SQL standard provides a Module Language for interface to other languages. Each SQL statement may be packaged as a procedure that can be called and have parameters passed to it from an external language. A cursor mechanism provides row-at-a-time access from languages that can only handle one row of a table at one time.

Access control is provided by GRANT and REVOKE statements. Each prospective user must be explicitly granted the privilege to access a specific table or view using a specific statement.

The SQL Integrity Enhancement facility offers additional tools for referential integrity, CHECK constraint clauses, and DEFAULT clauses. Referential integrity allows specification of primary and foreign keys with the requirement that no foreign key row may be inserted or updated unless a matching primary key row exists. Check clauses allow specification of inter-column constraints to be maintained by the database system. Default clauses provide optional default values for missing data.

The Embedded SQL specification provides SQL interface to programming languages, specifically Ada, C, COBOL, FORTRAN, MUMPS, Pascal, and PL/I. Applications may thereby integrate program control structures with SQL data manipulation capabilities. The Embedded SQL syntax is just a shorthand for an explicit SQL Module accessed from a standard conforming programming language.

SQL-92 significantly increases the size of the original 1986 standard to include a schema manipulation language for modifying or altering schemas, schema information tables to make schema definitions accessible to users, new facilities for dynamic creation of SQL statements, and new data types and domains. Other new SQL-92 features include outer join, cascade update and delete referential actions, set algebra on tables, transaction consistency levels, scrolled cursors ,deferred constraint checking, and greatly expanded exception reporting. SQL-92 also removes a number of restrictions in order to make the language more flexible and orthogonal.

The following textbooks identify and explain all of the features in the SQL-92 specification:

   Jim Melton and Alan R. Simon
   Understanding the New SQL: A Complete Guide
   Morgan-Kauffman Publishers, San Mateo, CA 94403 U.S.A.
   October 1992
 
   C.J. Date with Hugh Darwen
   A Guide to the SQL Standard, 3rd Edition
   Addison-Wesley Publishing Company, Reading, MA 01867 U.S. A.
   October 1992
 
   Stephen Cannan and Gerard Otten
   SQL - The Standard Handbook
   McGraw-Hill Book Company Europe, Berkshire, SL6 2QL, England
   November 1992

GENERAL USE of FIPS SQL

The purpose of a database language standard is to provide portability of database definitions and database application programs among conforming implementations. Use of a database language standard is appropriate in all cases where there is to be some interchange of database information between systems. The schema definition language may be used to interchange database definitions and application specific views. A data manipulation language provides the data operations that make it possible to interchange complete application programs.

The relational data model, and thereby the SQL standard, is appropriate for database applications requiring flexibility in the data structures and access paths of the database. It is desirable both for applications under production control and when there is a substantial need for ad hoc data manipulation by end users who are not computer professionals.

The SQL Schema Definition Language is particularly appropriate for describing tables of information that may be transferred among various data management applications. Used with the Remote Database Access (RDA) standard, it is possible to exchange data occurrences in a standard manner and for conforming systems to interoperate.

SQL ENHANCEMENTS

Development is currently underway to enhance SQL into a computationally complete language for the definition and management of persistent, complex objects. This includes: generalization and specialization hierarchies, multiple inheritance, user defined data types, triggers and assertions, support for knowledge based systems, recursive query expressions, and additional data administration tools. It also includes the specification of abstract data types (ADTs), object identifiers, methods, inheritance, polymorphism, encapsulation, and all of the other facilities normally associated with object data management. An SQL3 FTP directory contains papers from individual authors giving descriptions and status of various proposed SQL enhancements.

In 1993, the ANSI and ISO development committees decided to split future SQL development into a multi-part standard. The Parts, as of December 1995, are:

Part 1: Framework. A non-technical description of how the document is structured.

Part 2: Foundation. The core specification, including all of the new ADT and Object SQL features; currently over 800 pages.

Part 3: SQL/CLI. The call level interface. A version dependent only on SQL-92 was published in 1995 as ISO/IEC 9075-3:1995. A follow-on, providing support for new features in other Parts of SQL is under development.

Part 4: SQL/PSM. The stored procedures specification, including computational completeness. Currently being processed for DIS Ballot.

Part 5: SQL/Bindings. The Dynamic SQL and Embedded SQL bindings taken from SQL-92. No active new work at this time, although C++ and Java interfaces are under discussion.

Part 6: SQL/XA. An SQL specialization of the popular XA Interface developed by X/Open (see below).

Part 7: SQL/Temporal. A newly approved SQL subproject to develop enhanced facilities for temporal data management using SQL.

SQL Part 2: SQL Foundation

The SQL Foundation includes all of the new SQL Abstract Data Type (ADT) facilities. Some features in the proposed new Part 2 are very stable, and ready for immediate standardization, whereas other features are undergoing active evolution and so are somewhat unstable. This Part will soon be split into a piece that can be progressed "early" versus a piece that can wait until "later".

The early piece could be progressed to CD in 1996 with final adoption in 1998. All of the user-defined VALUE ADTs, subtype families, Assertions, Triggers, Savepoints, and Cursor extensions easily made it into the "early" piece as everyone agrees they are stable specifications. The OBJECT ADT facility is on the edge between "early" and "later" and the WITH OID VISIBLE alternative is currently in the "later" piece. All of the collection type stuff (i.e. lists, sets, arrays, multisets) are also on the edge because their specification is still incomplete.

SQL Part 3: Call Level Interface (SQL/CLI)

The SQL call level interface is a requirement for third-party software developers who produce "shrink-wrapped" software for use on personal computers and workstations. They do not wish to use a Module processor or an Embedded SQL preprocessor binding style because they do not wish to distribute any source code with the products they sell to individual users. Instead they desire a services call interface to SQL data repositories that can be invoked from the calling environment provided by the host operating system. The calls to the SQL data repository can then be embedded in the object code just like calls to any other system service.

The Call Level Interface is an alternative mechanism for executing SQL statements. The SQL/CLI consists of a number of routines that:

-- Allocate and deallocate resources.

-- Control connections to SQL-servers.

-- Execute SQL statements using mechanisms similar to Dynamic SQL.

-- Obtain diagnostic information.

-- Control transaction termination.

The CLI ExecDirect routine and the CLI Prepare routine each support an input character string parameter identified as StatementText. If P is the value of StatementText, then P shall satisfy the following restrictions:

1. P shall conform to the Format, Syntax Rules, and Access Rules for a "preparable statement" as specified in Subclause 17.6, "prepare statement", of the SQL'92 standard.

2. P shall not be a "commit statement" or a "rollback statement".

3. P shall abide by the Leveling Rules of the level of SQL support claimed by the SQL/ERI Server.

The SQL/CLI specification is intended to support CLI routines embedded into both pointer-based programming languages and non-pointer-based programming languages.

In particular, CLI routines may be embedded into any one of the following standard programming languages: Ada, C, COBOL, Fortran, MUMPS, Pascal, and PL/I.

An SQL/ERI Server will indicate which of those languages it supports.

SQL/CLI contains a number of features to make it fully compatible with SQL-92 (e.g. handling of international character sets, scrolled cursor support, and support for dynamic SQL), and "concise" routines have been added to make it backward compatible with MicroSoft's ODBC and the early SQL Access and X/Open Snapshot specification.

The Editor will produce DIS text by mid-October, 1994, with the expectation that the DIS Ballot can be completed by May 1995. If all goes well, there should be a published ISO/IEC SQL/CLI standard in late 1995.

SQL3 Part 4: Persistent Stored Modules (SQL/PSM)

This part includes all of the new "programming language flow of control" features in SQL, e.g. assignment, variables, looping, branching, procedure block structure, procedure and function calls, exception handling, etc. It provides a very powerful facility for "stored procedures", that is, SQL procedures stored and optimized at the server that can be invoked from a client with a single call.

This is a very important requirement in client/server environments.

An advantage of the PSM specification is that non-SQL data repositories will now be able to present themselves to an SQL Environment as conforming to a Minimal level of SQL data manipulation, while at the same time offering their "value- added" facilities as SQL-callable functions and procedures. This should be an important stimulant to get legacy systems and other non-SQL data repositories, such as Document Management Systems and Geographic Information Systems, to claim conformance to one of the SQL/ERI profiles defined in the FIPS PUB for SQL Environments (see SQL ENVIRONMENTS below).

The SQL/PSM specification will undergo its 1st CD Ballot this Fall, with likely progression to DIS no later than July 1995. It should be formally adopted as an ISO/IEC standard in calendar year 1996.

SQL3 Part 6: SQL XA Interface Specialization (SQL/XA)

This specification would standardize an application program interface (API) between a global Transaction Manager and an SQL Resource Manager. It would standardize the function calls, based upon the semantics of ISO/IEC 10026, "Distributed Transaction Processing", that an SQL Resource Manager would have to support for two-phase commit. The base document is derived from an X/Open publication, with X/Open permission, that specifies explicit input and output parameters and semantics, in terms of SQL data types, for the following functions: xa_close, xa_commit, xa_complete, xa_end, xa_forget, xa_open, xa_prepare, xa_recover, xa_rollback, and xa_start.

The current schedule for SQL/XA calls for CD registration in 1996, with formal adoption as an ISO/IEC standard in 1998.

SQL MULTIMEDIA (SQL/MM)

A new ISO/IEC international standardization project for development of an SQL class library for multimedia applications was approved in early 1993. This new standardization activity, named SQL Multimedia (SQL/MM), will specify packages of SQL abstract data type (ADT) definitions using the facilities for ADT specification and invocation provided in the emerging SQL3 specification. SQL/MM intends to standardize class libraries for science and engineering, full-text and document processing, and methods for the management of multimedia objects such as image, sound, animation, music, and video. It will likely provide an SQL language binding for multimedia objects defined by other JTC1 standardization bodies (e.g. SC18 for documents, SC24 for images, and SC29 for photographs and motion pictures).

The Project Plan for SQL/MM indicates that it will be a multi-part standard consisting of an evolving number of parts. Part 1 will be a Framework that specifies how the other parts are to be constructed. Each of the other parts will be devoted to a specific SQL application package. The following SQL/MM Part structure exists as of December 1995:

Part 1: Framework. A non-technical description of how the document is structured.

Part 2: FullText. Methods and ADTs for text data processing. Only minimal content at present.

Part 3: Spatial. Methods and ADTs for spatial data management. About 125+ pages with active contributions from Spatial Data experts from 3 national bodies.

Part 4: Gen Purpose Facilities. Methods and ADTs for complex numbers, trig and exponential functions, vectors, sets, etc. Currently about 85 pages.

SQL ENVIRONMENTS

An SQL environment is an integrated data processing environment in which heterogeneous products, all supporting some aspect of the FIPS SQL standard, FIPS PUB 127-2, are able to communicate with one another and provide shared access to data and data operations and methods under appropriate security, integrity, and access control mechanisms. Some components in an SQL environment will be full-function SQL implementations that conform to an entire level of FIPS SQL and support all of its required clauses for schema definition, data manipulation, transaction management, integrity constraints, access control, and schema information. Other components in an SQL environment may be specialized data repositories, or graphical user interfaces and report writers, that support selected portions of the SQL standard and thereby provide a degree of integration between themselves and other products in the same SQL environment. The intent is to provide a high level of control over a diverse collection of legacy or specialized data resources. An SQL environment allows an organization to obtain many of the advantages of SQL without requiring a large, complex, and error-prone conversion effort; instead, the organization can evolve, in a controlled manner, to a new environment.

FIPS PUB 193, the FIPS PUB for SQL Environments, is the beginning of a continuing effort to define appropriate conformance profiles that can be used by both vendors and users to specify exact requirements for how various products fit into an SQL environment. The emphasis is to specify general purpose, SQL external repository interface (SQL/ERI) profiles for non-SQL data repositories. These profiles specify how a subset of the SQL standard can be used to provide limited SQL access to legacy databases, or to support SQL gateways to specialized data managers such as Geographic Information Systems (GIS), full-text document management systems, or object database management systems. All of the profiles specified therein are for server-side products, that is, products that control persistent data and provide an interface for user access to that data. Subsequent versions of the FIPS PUB for SQL Environments may specify SQL profiles for client-side products, that is, products that access data and then present that data in graphical or report-writer style to an end user, or process the data in some other way on behalf of the end user.

Many applications require a logically integrated database of diverse data (e.g. documents, graphics, spatial data, alphanumeric records, complex objects, images, voice, video) stored in geographically separated data banks under the management and control of heterogeneous data management systems. An over-riding requirement is that these various data managers be able to communicate with one another and provide shared access to data and data operations and methods under appropriate security, integrity, and access control mechanisms. Much of this source data may be stored in simple file systems, legacy data management systems, or very specialized data repositories that satisfy only a small percentage of these data management requirements. The objective of an SQL environment is to logically integrate these diverse data repositories "as if" they were under the control of a single SQL data manager. User presentation tools, such as graphical user interfaces or report writers, can then use this SQL interface to collect data from various sources, merge it together under ad hoc join conditions, and present it to the user in a pleasing graphical format.

A properly functioning SQL environment will use the SQL language to describe this data using standardized facilities, integrate it into a single federated collection, enforce any integrity or access control constraints, and make it available as a logical whole to sophisticated user productivity or presentation tools. These client-side tools can then use the full power and flexibility of SQL for data retrieval and manipulation. The underlying data managers may implement non-relational data models and thus may have difficulty supporting SQL requirements for nested subqueries, multi-table joins, derived columns in a Select list, referential integrity, or other relational model features. On the other hand, they may offer advanced features of other data models that are rarely supported by relational implementations. With emerging features in the SQL language for user-defined abstract data types (ADTs), stored procedures, encapsulation, polymorphism, and other object management facilities, these diverse data repositories can be described as specialized SQL repositories and accessed using already standardized SQL binding alternatives.

With this approach, SQL may prove to be as successful as an integrator of heterogeneous data repositories as it has been as a language interface to the relational data model. The SQL language can meet these integration objectives if non-SQL implementations provide a "simple" SQL interface to their data and services, and if full-function SQL implementations use that simple interface to provide full-function services to end user tools. The FIPS PUB 193 for SQL Environments specifies standard profiles for such "simple" SQL interfaces, thereby making it easier to specify and support the desired integration.

SQL and RDA provide the basis for standardized communication. An SQL external repository interface (SQL/ERI) makes it possible for non-SQL data repositories to share their data with user applications. With emerging SQL3 enhancements for object-oriented and knowledge-based data management and emerging RDA extensions for distributed database, the ERI can evolve to support "seamless" data integration.
 

return to SQL Table of Contents


For questions on SQL and SQL related standards, contact Len Gallagher (lgallagher@nist.gov).