Development

The current implementation of the idl tool is written in the C programming language and relies on the stated prerequisites. A Flex-based lexical analyser produces tokens that are consumed by a Bison-generated parser whose rules call program functions to build structures and generate output.

  1. Main Program
  2. Configuration
  3. Lexical Analysis
    1. Operating Modes
  4. Parsing
    1. Token and Rule Result Values
    2. Value Copying
    3. Rules and Structures
  5. Interface Structure
  6. Code Generation
    1. Compound Interfaces
    2. Individual Interfaces
    3. Includes and Headers
    4. Interface Definitions
      1. Client Definitions
      2. Server Definitions
    5. Templates and Output
    6. Servers
    7. Dispatchers and Handlers
    8. Parameters and Members
    9. Message Structures and Access
    10. Summaries

Main Program

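The main.c file contains the main function whose responsibilities, as far as they are covered by the sections below, include the following:

  1. Interpreting program arguments and options, recording the result as configuration state
  2. Calling begin_compound_output when compound interface generation has been requested
  3. Invoking the parser (yyparse) to process the input
  4. Calling end_compound_output to close the files involved in compound output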

Various helper functions are also defined.

Configuration

The config.c and config.h files define configuration state for the program as it processes input files, with the latter defining the nature of the state and the former recording the actual state set by the main program when interpreting program arguments and options. Thus, this configuration can be considered to be a concise, validated form of those arguments and options.

Lexical Analysis

The idl.lex file defines a scanner or lexical analyser for the language used to describe interfaces. For the most part, it merely matches sequences of characters and produces token identifiers, with no additional information being provided for tokens having a fixed form (typically keywords and symbols).

Tokens with varying input forms (identifiers, numbers, and so on) also have a value associated with them indicating such things as the precise sequence of matching characters or the numeric equivalent of those characters. Such details are stored using members of the value type, referenced via the yylval variable. The token value type is itself defined in the parser definition.

Operating Modes

The scanner has two distinct operating modes: the default mode, in which tokens are matched, and the comment mode, in which comment text is consumed.

When a comment start indicator (/*) is encountered in the default mode, the mode is switched to the comment mode and all input is consumed until the comment end indicator (*/) is encountered, at which point the default mode is selected again.

A special indicator is used to declare the comment mode in the scanner file:

%x comment

All rules applicable in this mode are prefixed with <comment> in the file to distinguish them from rules applicable in the default mode.
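
The effect of switching between the two modes can be illustrated outside Flex with a small hand-written C routine. This is only a sketch of the behaviour described above, not code from idl.lex: the names strip_comments, MODE_DEFAULT and MODE_COMMENT are hypothetical, and the routine simply discards comment text rather than producing tokens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode names mirroring the default and comment modes. */
enum scan_mode { MODE_DEFAULT, MODE_COMMENT };

/* Copy input to output, discarding anything between comment indicators.
   The output buffer must be at least as large as the input, including its
   terminator. Returns the number of characters written. */
size_t strip_comments(const char *in, char *out)
{
    enum scan_mode mode = MODE_DEFAULT;
    size_t n = 0;

    assert(in != NULL && out != NULL);

    while (*in != '\0')
    {
        if (mode == MODE_DEFAULT && in[0] == '/' && in[1] == '*')
        {
            mode = MODE_COMMENT;    /* comment start: switch mode */
            in += 2;
        }
        else if (mode == MODE_COMMENT && in[0] == '*' && in[1] == '/')
        {
            mode = MODE_DEFAULT;    /* comment end: back to the default mode */
            in += 2;
        }
        else
        {
            if (mode == MODE_DEFAULT)
                out[n++] = *in;     /* only default-mode input is passed on */
            in++;
        }
    }

    out[n] = '\0';
    return n;
}
```

In the scanner itself, the two branches that change the mode correspond to rules in the default mode and in the <comment> mode respectively.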

Parsing

The idl.y file defines a parser for the interface description language. It starts by attempting to satisfy the file rule, matching statements until the end of input, invoking the write_files function to generate the configured output from the tool.

As each rule is evaluated, tokens are consumed from the scanner and operations are performed to build up a structure describing the input. Where a rule cannot be evaluated successfully, the included yyerror function emits a message and parsing halts.

(Error handling and reporting could be improved.)

Token and Rule Result Values

For interoperability with the scanner or lexical analyser, the parser defines the nature of the values associated with tokens using the %union declaration. These include a str (string) interpretation and a num (numeric) interpretation of a value.

Since token values are propagated between parser rules, the %union declaration is augmented with other interpretations that are employed by the rules. Consequently, rules can obtain values for tokens that were produced by the scanner (such as numbers and strings) but then incorporate them into other kinds of values or structures, passing them on to other rules. These other rules can treat such propagated values in the same way as those produced directly by the scanner.

For example, the include rule obtains a value associated with a header filename. This filename is associated with the HEADER token and its value is interpreted as a string. However, the rule needs to prepare a structure that incorporates the filename and that can be referenced by other structures. To achieve this, a %union member is defined for the structure type concerned:

%union {
  long num;
  char *str;
  struct include inc;
  ...
}

The member name, inc, is then associated with the include rule:

%type <inc> include

With this, the rule can be considered to be working to populate a value of the indicated type (struct include), and where other rules reference the result of the rule, they will be able to recognise the type of this result.
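
In plain C terms, the arrangement might be pictured as follows. This is a sketch under assumptions: the layout of struct include is guessed from the members mentioned in this document, the union is given the hypothetical name value, and make_include is a hypothetical stand-in for the action of the include rule. Copying of the filename, discussed in the next section, is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: this document mentions only a filename and a tail. */
struct include
{
    char *filename;
    struct include *tail;
};

/* Plain C counterpart of the %union declaration shown above, with the
   elided members omitted. */
union value
{
    long num;
    char *str;
    struct include inc;
};

/* Hypothetical stand-in for the include rule's action: take the string
   value of the HEADER token and yield a value read via the inc member. */
union value make_include(union value header)
{
    union value result;

    assert(header.str != NULL);

    result.inc.filename = header.str;
    result.inc.tail = NULL;
    return result;
}
```

A rule referencing the result of the include rule would then read the inc member, just as the include rule read the str member of the token value.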

Value Copying

When rules obtain values to store them, it is necessary to copy the obtained values because these values may not be allocated in any permanent sense: they may only be available at a particular point during the scanning of input, and any attempt to reference them later may yield an invalid value. Consequently, a copy function and a suite of convenience macros (defined in parser.h) allocate memory for new values.

(Currently, the management of allocated memory is deficient in that such memory is not deallocated. However, since the program is intended to have a limited running time and handle limited numbers of input files, no effort has been directed towards tidying up this allocated memory.)
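
A copy function with convenience macros might look something like the following. This is a minimal sketch, not the contents of parser.h: the signature of copy and the macro names COPY_STRING and COPY_STRUCT are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic helper: duplicate size bytes in newly allocated
   memory. Returns NULL if the allocation fails. */
void *copy(const void *src, size_t size)
{
    void *dst;

    assert(src != NULL);

    dst = malloc(size);
    if (dst != NULL)
        memcpy(dst, src, size);
    return dst;
}

/* Hypothetical convenience macros built on the helper. */
#define COPY_STRING(s)      ((char *) copy((s), strlen(s) + 1))
#define COPY_STRUCT(T, v)   ((T *) copy(&(v), sizeof(T)))
```

With such a helper, a rule can keep a string that remains valid after the scanner has moved on and reused its own buffer.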

Rules and Structures

Generally, the structures built using the result values reflect the structure of the rules describing the interface description language. However, the form of rules, necessary as it is for parsing, is not entirely optimal for a generated structure. Consider the following rule:

attributes : attribute SEP attributes
           | attribute
           ;

Consider a pair of attributes:

first,second

With the attribute rule matching each identifier, and with the attributes rule incorporating a single attribute value within a structure referencing other attributes, the following structure would emerge:

attributes
  attribute
    "first" ...
  attributes
    attribute
      "second" ...

A more natural structure would instead employ a linked list of attributes:

attribute
  "first" ...
  tail
    attribute
      "second" ...
      tail

To achieve this, a tail member is defined in structures, and instead of wrapping results in new structures at each level of the rule hierarchy, results are effectively combined by having one result reference another via the tail member, thus linking together collections of results.
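
The linking can be sketched in C as follows. The structure is simplified to the members needed for the example, and link_attributes and count_attributes are hypothetical names rather than functions from the tool.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified attribute structure: only the members needed here. */
struct attribute
{
    const char *attribute;
    struct attribute *tail;
};

/* Hypothetical counterpart of the action for "attribute SEP attributes":
   the first result refers to the remaining results via its tail member. */
struct attribute *link_attributes(struct attribute *first,
                                  struct attribute *rest)
{
    assert(first != NULL);

    first->tail = rest;
    return first;
}

/* Count the attributes in a list by following tail members. */
size_t count_attributes(const struct attribute *attr)
{
    size_t n = 0;

    for (; attr != NULL; attr = attr->tail)
        n++;
    return n;
}
```

For the input first,second, the result for second becomes the tail of the result for first, and a consumer of the list needs only a simple loop to visit each attribute.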

Interface Structure

The types.h file defines the structural elements of interfaces prepared during the processing of the input files. A hierarchy of structure types is defined as follows:

  interface: name, signatures, attributes, includes, tail
  signature: qualifier, operation, parameters, attributes, tail
  attribute: attribute, identifiers, tail
  include: filename, tail
  parameter: specifier, class, identifiers, tail
  identifier: identifier, tail

The nature of the hierarchy should reflect the conceptual form of the input. It should be noted that header file information (represented by include structures) is associated with interface information (represented by interface structures). This arrangement merely attempts to indicate the header file declarations that preceded specific interface declarations, but the two different types of information should arguably be grouped within a file-oriented structure.

Code Generation

The program.c and program.h files define the functions that coordinate the generation of program code. It is in program.c that files are opened for writing (using the get_output_file function provided by common.c), and two principal functions are involved in initiating the population of these files: begin_compound_output for compound interfaces and write_files for individual interfaces.

These functions are described in more detail with regard to the topics of compound and individual interfaces.

Meanwhile, a selection of writer functions are employed to generate code for the different structures employed within interface descriptions. Since output may be generated for different kinds of files, the following concepts are employed to parameterise the output appropriately:

Through the use of roles, the same fundamental information can be expressed in different ways by the same routine.
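
As an illustration of the idea only, the following sketch has a single routine express the same operation signature in two ways depending on a role. The role names, the format_signature function and its parameters are all hypothetical and do not reflect the tool's actual writer functions, which are not shown in this document.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical roles: the same signature is written either as a declaration
   (for a header) or as the opening of a definition (for a source file). */
enum signature_role { DECLARATION_ROLE, DEFINITION_ROLE };

/* Format a signature for an operation taking no parameters. Returns the
   length of the formatted text, as reported by snprintf. */
int format_signature(char *out, size_t size, const char *name,
                     enum signature_role role)
{
    assert(out != NULL && name != NULL);

    return snprintf(out, size, "long %s(void)%s", name,
                    role == DECLARATION_ROLE ? ";" : "\n{");
}
```

The caller chooses the role according to the kind of file being populated, while the details of the signature are formatted in one place.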

Compound Interfaces

The begin_compound_output function is called by the main program when compound interface generation has been requested. It produces extra output that references and augments output produced for individual interfaces.

Various details of individual interfaces are incorporated into the compound interface output. To achieve this, once the begin_compound_output function has been called, individual interface output is generated. During this activity, the write_compound_output function is called for each individual interface to insert details of that interface into the appropriate place within the compound interface output.

The end_compound_output function ultimately closes the files involved, either through being invoked by the main program or upon a failure condition.

The following diagram summarises the general function organisation involved.

[Diagram: function organisation for compound interface output. Functions shown: main, yyparse, write_files, write_interfaces, begin_compound_output, end_compound_output, write_compound_dispatch_include, write_compound_output, write_handler_signature, write_dispatcher_signature, write_dispatcher_cases, write_compound_interface, write_include. Output files shown: ..._server.{c,cc,h}, ..._interface.h, ..._interfaces.h, ..._interface_type.h.]

Individual Interfaces

The write_files function coordinates the generation of individual interface output.

[Diagram: function organisation for individual interface output. Functions shown: main, yyparse, write_files, write_interfaces, write_client_interface, write_dispatcher, write_dispatcher_signature, write_functions, write_handler_signature, write_include, write_interface_definition, write_signatures. Output files shown: ..._client.{c,cc,h}, ..._server.{c,cc,h}, ..._interface.h.]

Includes and Headers

The includes.c and includes.h files provide support for writing #include statements in C and C++ programs, with the write_includes function traversing an include structure list to emit a list of statements.
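
The traversal might be sketched as follows. The real signature of write_includes is not given in this document, so the function here is a hypothetical stand-in named sketch_write_includes, and the filename is assumed to carry its own delimiters (angle brackets or quotes).

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout of the include structure: a filename and a tail. */
struct include
{
    const char *filename;
    struct include *tail;
};

/* Emit one #include statement per list element by following tail members.
   Returns the number of statements written. */
int sketch_write_includes(const struct include *inc, FILE *fp)
{
    int n = 0;

    assert(fp != NULL);

    for (; inc != NULL; inc = inc->tail)
    {
        fprintf(fp, "#include %s\n", inc->filename);
        n++;
    }
    return n;
}
```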

Interface Definitions

The interface.c and interface.h files provide the write_interface_definition function, which is concerned with generating a description of an interface in two different contexts: client and server. Some generated files, such as the file of the form <interface>_interface.h, are employed in both contexts.

Client Definitions

For client use, some types are defined in a header file of the form <interface>_client.h.

C programs employ a collection of function signatures for the different operations provided by the interface plus a declaration of an interface object retaining references to those functions as members.

C++ programs employ a class declaration containing the operations as methods and with internal state referencing the component providing the operations.

Server Definitions

For server use, some types are defined in a header file of the form <interface>_server.h.

C programs employ declarations of the interface object type whose members provide a table of references to operation functions. A reference type is defined as a way of referencing the component providing the operations, and an object type is defined that bundles this opaque reference with a reference to the interface details. (Such an object is populated with a pointer to a component and a pointer to a table of concrete functions providing the appropriate operations.)
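
The relationship between these types can be sketched in C as follows. Everything here is hypothetical: the counter interface, its increment operation and all type and function names are invented for illustration and do not reproduce the naming or layout of the generated headers.

```c
#include <assert.h>
#include <stddef.h>

/* Reference type: an opaque way of referring to the component. */
typedef struct { void *ptr; } ref_counter;

/* Interface type: a table of references to operation functions. */
typedef struct
{
    long (*increment)(ref_counter self, long amount);
} iface_counter;

/* Object type: bundles the reference with the interface details. */
typedef struct
{
    ref_counter ref;
    const iface_counter *iface;
} counter;

/* A concrete component and its implementation of the operation. */
struct my_counter { long total; };

static long my_counter_increment(ref_counter self, long amount)
{
    struct my_counter *c = self.ptr;

    assert(c != NULL);

    c->total += amount;
    return c->total;
}

/* Table of concrete functions for this component. */
static const iface_counter my_counter_iface = { my_counter_increment };

/* Populate an object with pointers to the component and to the table. */
counter make_counter(struct my_counter *c)
{
    counter obj;

    obj.ref.ptr = c;
    obj.iface = &my_counter_iface;
    return obj;
}
```

Code holding such an object can invoke an operation as obj.iface->increment(obj.ref, 5) without knowing the concrete type of the component.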

C++ programs provide the rather more straightforward class declaration containing the operations to be implemented by a component. (The implementation is provided by deriving from the generated class and defining concrete methods.)

Both C and C++ output contains additional details describing operation codes and message structures (incoming and outgoing) for the different operations.

Templates and Output

Servers

Dispatchers and Handlers

Parameters and Members

Message Structures and Access

Summaries