USENET/GOOGLE groups: comp.lang.cobol
Structured Query Language - Can it RELATE to COBOL?
In 1970, E. F. Codd, a researcher at IBM, published a RELATIONAL model of data which would eliminate redundant data and the huge overheads involved in maintaining and synchronizing multiple copies of it. His IBM colleague C. J. Date later did much to explain and promote the model. Despite strong resistance at first, and some very poor and inefficient early attempts to implement the model in software, by the 1990s the generally accepted repository for storing important information was the Relational Database.



COBOL people are familiar with the use of tables, defined through the OCCURS clause. The "tables" that make up a Relational database are something different: each one represents a "relation" between a key and its data, and each row is made up of columns which relate to a particular primary key. Although COBOL people will see a "row" as a "record", and the columns which make it up as "fields", this is not strictly accurate (although it is certainly a good enough starting point to help you get the idea of relational tables).

In Relational terminology, elements that "occur" are called "Repeating Groups". The way that COBOL deals with them is to assign a fixed area of memory (even if you use "OCCURS ... DEPENDING ON", the maximum amount of memory is still allocated), but Relational Database Management Systems (Oracle, SQL Server, Access, DB2 etc.) do not work with a fixed number of occurrences inside a row. Instead, a COBOL "table" within a record definition must be separated out to a new DB table, with one row per occurrence, where it can grow to whatever size it needs to. (See diagram, above.)

This process of moving repeating groups out to separate attached tables is part of the "normalization" of the Relational Database (RDB). "Normalization" is the process of organizing your data so that it conforms with the Relational Model. A properly normalized RDB has no redundant data, just as the mathematical purity of the Relational Model promised.
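
As a rough sketch of moving a repeating group out to its own table, the example below uses Python's built-in sqlite3 module purely as a convenient way to run SQL. The record layout and the table and column names (customer_order, order_line) are invented for illustration and do not come from the tutorial pack. A COBOL record with an "OCCURS 10 TIMES" group of order lines becomes a base table plus an attached table holding one row per occurrence.

```python
import sqlite3

# Hypothetical COBOL record being normalized (illustrative only):
#   01 ORDER-REC.
#      05 ORDER-NO        PIC 9(6).
#      05 CUSTOMER-NAME   PIC X(30).
#      05 ORDER-LINE OCCURS 10 TIMES.
#         10 ITEM-CODE    PIC X(8).
#         10 QUANTITY     PIC 9(4).

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: the non-repeating part of the record.
    CREATE TABLE customer_order (
        order_no      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    -- Attached table: one row per occurrence of ORDER-LINE.
    CREATE TABLE order_line (
        order_no  INTEGER NOT NULL REFERENCES customer_order(order_no),
        line_no   INTEGER NOT NULL,
        item_code TEXT    NOT NULL,
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (order_no, line_no)
    );
""")

conn.execute("INSERT INTO customer_order VALUES (1001, 'ACME LTD')")

# Twelve lines: more than the OCCURS 10 limit would have allowed.
lines = [(1001, n, f"ITEM{n:04d}", n) for n in range(1, 13)]
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?, ?)", lines)
conn.commit()

line_count = conn.execute(
    "SELECT COUNT(*) FROM order_line WHERE order_no = ?", (1001,)
).fetchone()[0]
print(line_count)
```

The attached table has no built-in upper limit on the number of lines per order; any limit becomes a business rule rather than a storage layout decision.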

Here's a quick summary of some RDB jargon you are likely to encounter:

TERM                    Simple explanation
CONSTRAINT              A rule declared on a table; here, one that defines an explicit relationship between tables. See CASCADE.
REFERENTIAL INTEGRITY   Guarantees that every FOREIGN KEY value in an attached table matches an existing row in the table it refers to, so that constrained tables can be treated as a single unit.
FOREIGN KEY             The key in an attached table which links it back to the base table, or to another table it is constrained with.
PRIMARY KEY             The unique identifier for a given table row. It may become a FOREIGN KEY in linked tables.
CASCADE                 The base and attached tables are constrained on one or more keys, and if you delete or update the base relation, any attached relations are also deleted or updated.
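
The jargon above can be seen working together in a small sketch. It again uses Python's sqlite3 module only as a way of running SQL, and the table, column and constraint names are hypothetical. Note that SQLite needs foreign key enforcement switched on with a PRAGMA; that detail is specific to SQLite and is not something the original text discusses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are only enforced when this is on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE customer_order (
        order_no INTEGER PRIMARY KEY            -- PRIMARY KEY of the base table
    );
    CREATE TABLE order_line (
        order_no INTEGER NOT NULL,
        line_no  INTEGER NOT NULL,
        PRIMARY KEY (order_no, line_no),
        CONSTRAINT fk_line_order                -- CONSTRAINT linking the tables
            FOREIGN KEY (order_no)              -- FOREIGN KEY back to the base
            REFERENCES customer_order(order_no)
            ON DELETE CASCADE                   -- CASCADE deletes to the lines
    );
    INSERT INTO customer_order VALUES (1001);
    INSERT INTO order_line VALUES (1001, 1);
    INSERT INTO order_line VALUES (1001, 2);
""")

# REFERENTIAL INTEGRITY: a line for a non-existent order is refused.
try:
    conn.execute("INSERT INTO order_line VALUES (9999, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# CASCADE: deleting the base row removes its attached rows as well.
conn.execute("DELETE FROM customer_order WHERE order_no = 1001")
conn.commit()
remaining_lines = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
print(orphan_rejected, remaining_lines)
```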


Apart from providing random and sequential access to discrete or linked data (which you could do with your indexed or relative files anyway), Relational Database Management Systems carry out a number of other very useful functions that help to ease the load for the programmer.

Transaction control gives you all-or-nothing updates: a transaction which processes against a number of tables can have its updates held as pending until the successful end of the transaction, when they are all applied together, and other users do not see the half-finished state in the meantime. If the transaction aborts or is cancelled, the database will remain in the state it was in before the transaction started. COMMIT and ROLLBACK can be issued explicitly under program control, or applied automatically when transactions are run.
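
A minimal sketch of COMMIT and ROLLBACK under program control follows, using Python's sqlite3 module as the SQL host. The account table, the CHECK rule and the transfer function are assumptions made up for this illustration. The credit is deliberately applied before the debit so that a failed debit shows the earlier, pending update being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        acct_no INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, from_acct, to_acct, amount):
    """Move amount between accounts; both updates apply or neither does."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acct_no = ?",
            (amount, to_acct),
        )
        # Violates the CHECK rule if from_acct would go negative.
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acct_no = ?",
            (amount, from_acct),
        )
        conn.commit()      # apply both pending updates together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # discard the pending credit as well
        return False


first_ok = transfer(conn, 1, 2, 30)     # succeeds
second_ok = transfer(conn, 1, 2, 500)   # fails and is rolled back
balances = dict(conn.execute("SELECT acct_no, balance FROM account"))
print(first_ok, second_ok, balances)
```

In embedded SQL the same decisions would typically be expressed with EXEC SQL COMMIT and EXEC SQL ROLLBACK statements after testing SQLCODE, though the exact form depends on your precompiler.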

By using a separate subsystem to manipulate data it is possible to do performance tuning in the subsystem, rather than in the application.
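
One way to picture tuning in the subsystem is adding an index: the application's query text does not change, only the way the database chooses to run it. The sketch below assumes a hypothetical customer table and index name, and uses SQLite's EXPLAIN QUERY PLAN, which is specific to SQLite; other RDBMSs have their own tools for inspecting access paths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, surname TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(n, f"NAME{n % 100:03d}") for n in range(1, 1001)],
)
conn.commit()

# The application's query: identical before and after tuning.
QUERY = "SELECT cust_no FROM customer WHERE surname = ?"


def run_query(conn):
    return sorted(row[0] for row in conn.execute(QUERY, ("NAME042",)))


def query_plan(conn):
    # The last column of each plan row holds SQLite's description text.
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("NAME042",)).fetchall()
    return " ".join(row[-1] for row in rows)


result_before = run_query(conn)
plan_before = query_plan(conn)

# Tuning happens in the database, not in the application code.
conn.execute("CREATE INDEX idx_customer_surname ON customer (surname)")

result_after = run_query(conn)
plan_after = query_plan(conn)

print(plan_before)
print(plan_after)
```

The results are the same in both runs; only the plan changes, from a scan of the whole table to a search through the new index.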

PRIMA ESQL for COBOL tutorial document
There is a document that covers the use of embedded SQL (ESQL) with COBOL and shows how to include SQL in your programs so you can manipulate RDBs. A separate Appendix describes and explains Normalization in simple language. DOWNLOAD the package by clicking the icon; you can READ the normalization document by clicking here...



RDBs and SQL
  • Where can I get more details?
    Download the tutorial pack from the icon at the bottom of this page as a starting point. Once you have the basic ideas you will know what to search the Web for to get information for your specific compiler.
  • What if I'm really stuck?
    Post a description of your problem, along with the compiler and version you are using, the name of the RDBMS, and samples of the code you've tried, to comp.lang.cobol. (If you don't have a newsreader for Usenet, click the link at the top of this page and you can access it through GOOGLE groups.)

    Remember that the people in the newsgroup do not HAVE to respond. Be courteous and acknowledge any help you receive. There are a number of highly skilled and experienced people who frequent this group, and, for the most part, they are very happy to help as long as you can show you had a shot at it yourself.
  • So what do I need to know?
    You will need to get familiar with at least the basic SQL commands. The command set is subdivided into 3 categories:

    1. DDL - Data Definition Language (sometimes referred to as "schema")
    2. DML - Data Manipulation Language (used to access and manipulate data)
    3. Admin - commands used by Administrators to control the RDBMS (often called DCL, Data Control Language).

    The following are important:
    • SELECT (DML)
    • INSERT (DML)
    • UPDATE (DML)
    • DELETE (DML)
    • CREATE (DDL)
    • ALTER (DDL)
    • DROP (DDL)
    • DECLARE CURSOR (DML)
    • OPEN (DML)
    • FETCH (DML)
    • CLOSE (DML)
    • GRANT (Admin)
    • REVOKE (Admin)


    Some implementations of ESQL do not allow Schema (DDL) or Admin commands.
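
To give a feel for the commands listed above, here is a short sketch that runs most of them against an in-memory SQLite database from Python. The part table and its contents are made up for the example. Python's cursor object stands in for the DECLARE CURSOR / OPEN / FETCH / CLOSE sequence you would code in ESQL, and GRANT and REVOKE are left out because SQLite has no user accounts to apply them to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: CREATE and ALTER
conn.execute("CREATE TABLE part (part_no INTEGER PRIMARY KEY, descr TEXT)")
conn.execute("ALTER TABLE part ADD COLUMN price INTEGER DEFAULT 0")

# DML: INSERT, UPDATE, DELETE
conn.executemany(
    "INSERT INTO part (part_no, descr, price) VALUES (?, ?, ?)",
    [(1, "WIDGET", 250), (2, "SPROCKET", 400), (3, "GASKET", 75)],
)
conn.execute("UPDATE part SET price = price + 25 WHERE part_no = ?", (1,))
conn.execute("DELETE FROM part WHERE part_no = ?", (3,))
conn.commit()

# Cursor processing: roughly DECLARE CURSOR + OPEN ...
cur = conn.cursor()
cur.execute("SELECT part_no, descr, price FROM part ORDER BY part_no")

# ... FETCH one row at a time until no more rows are returned ...
fetched = []
while True:
    row = cur.fetchone()
    if row is None:      # comparable to testing for SQLCODE 100 in ESQL
        break
    fetched.append(row)

# ... and CLOSE.
cur.close()

# DDL: DROP
conn.execute("DROP TABLE part")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(fetched)
```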