Development

The current implementation of the idl tool employs the stated prerequisites, with the software itself written in the C programming language. A Flex-based lexical analyser produces tokens that are consumed by a Bison-generated parser whose rules call program functions to build structures and generate output.

Main Program
Configuration
Lexical Analysis
Operating Modes
Parsing
Interface Structure
Code Generation

Main Program

The main.c file contains the main function whose responsibilities include the following:

Processing program arguments and options using C library getopt functionality, configuring the program behaviour and output.
Input file access and output coordination.

Configuration

The config.c and config.h files define configuration state for the program as it processes input files, with the latter defining the nature of the state and the former recording the actual state set by the main program when interpreting program arguments and options. Thus, this configuration can be considered to be a concise, validated form of those arguments and options.

Lexical Analysis

The idl.lex file defines a scanner or lexical analyser for the language used to describe interfaces. For the most part, it merely matches sequences of characters and produces token identifiers, with no additional information being provided for tokens having a fixed form (typically keywords and symbols).

Tokens with varying input forms (identifiers, numbers, and so on) also have a value associated with them indicating such things as the precise sequence of matching characters or the numeric equivalent of those characters. Such details are stored using members of the value type, referenced via the yylval variable. The token value type is itself defined in the parser definition.
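The association of values with tokens can be sketched in plain C. This is not the Flex scanner itself: the token identifiers, the reduced value type, the stand-in yylval variable and the make_token function are assumptions made for illustration, mimicking what a scanner action does when it has matched some text.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token identifiers; Bison would normally generate these. */
enum token { IDENTIFIER = 258, NUMBER = 259 };

/* Reduced form of the token value type, with only str and num members. */
union token_value {
    long num;
    char *str;
};

/* Stand-in for the yylval variable shared by the scanner and parser. */
union token_value yylval;

/* Mimic a scanner action: given the matched text, set the appropriate
   member of yylval and return the token identifier. */
int make_token(char *text)
{
    if (isdigit((unsigned char) text[0])) {
        yylval.num = strtol(text, NULL, 0);   /* numeric equivalent */
        return NUMBER;
    }

    yylval.str = text;   /* precise sequence of matching characters */
    return IDENTIFIER;
}
```

Note that the str member here merely points at the matched text, which in a real scanner is only valid until the next token is matched.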

Operating Modes

There are two distinct operating modes of the scanner:

The default mode recognising most elements of the language.
A comment mode consuming the text found within comment delimiters.

When a comment start indicator (/*) is encountered in the default mode, the mode is switched to the comment mode and all input is consumed until the comment end indicator (*/) is encountered, at which point the default mode is selected again.

A special indicator is used to declare the comment mode in the scanner file:

%x comment

All rules applicable in this mode are prefixed with <comment> in the file to distinguish them from rules applicable in the default mode.
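The effect of switching between the two modes can be shown with a hand-written state machine in C. This is a substitute for the Flex mechanism, not a reproduction of it: the mode enumeration and the strip_comments function are hypothetical and exist only to illustrate the mode transitions.

```c
#include <stddef.h>

enum mode { MODE_DEFAULT, MODE_COMMENT };

/* Copy input to out, omitting comment delimiters and the text between them.
   The out buffer must be at least as large as the input string.
   Returns the number of characters written, excluding the terminator. */
size_t strip_comments(const char *in, char *out)
{
    enum mode mode = MODE_DEFAULT;
    size_t n = 0;

    while (*in != '\0') {
        if (mode == MODE_DEFAULT) {
            if (in[0] == '/' && in[1] == '*') {
                mode = MODE_COMMENT;      /* comment start: switch mode */
                in += 2;
            } else {
                out[n++] = *in++;         /* ordinary input is retained */
            }
        } else {
            if (in[0] == '*' && in[1] == '/') {
                mode = MODE_DEFAULT;      /* comment end: switch back */
                in += 2;
            } else {
                in++;                     /* comment text is consumed */
            }
        }
    }

    out[n] = '\0';
    return n;
}
```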

Parsing

The idl.y file defines a parser for the interface description language. It starts by attempting to satisfy the file rule, matching statements until the end of input, and then invokes the write_files function to generate the configured output from the tool.

As each rule is evaluated, tokens are consumed from the scanner and operations are performed to build up a structure describing the input. Where a rule cannot be evaluated successfully, the included yyerror function emits a message and parsing halts.

(Error handling and reporting could be improved.)

Token and Rule Result Values

For interoperability with the scanner or lexical analyser, the parser defines the nature of the values associated with tokens using the %union declaration, these including a str (string) interpretation and a num (numeric) interpretation of a value.

Since token values are propagated between parser rules, the %union declaration is augmented with other interpretations that are employed by the rules. Consequently, rules can obtain values for tokens that were produced by the scanner (such as numbers and strings) but then incorporate them into other kinds of values or structures, passing them on to other rules. These other rules can treat such propagated values in the same way as those produced directly by the scanner.

For example, the include rule obtains a value associated with a header filename. This filename is associated with the PHEADER and QHEADER tokens and its value is interpreted as a string. However, the rule needs to prepare a structure that incorporates the filename and that can be referenced by other structures. To achieve this, a %union member is defined for the structure type concerned:

%union {
  long num;
  char *str;
  struct include inc;
  ...
}

The member name, inc, is then associated with the include rule:

%type <inc> include

With this, the rule can be considered to be working to populate a value of the indicated type (struct include), and where other rules reference the result of the rule, they will be able to recognise the type of this result.

Value Copying

When rules obtain values to store them, it is necessary to copy the obtained values because these values may not be allocated in any permanent sense: they may only be available at a particular point during the scanning of input, and any attempt to reference them later may yield an invalid value. Consequently, a copy function and a suite of convenience macros (defined in parser.h) allocate memory for new values.
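A copy function with convenience macros might take the following form. The names copy_value, COPY and COPY_STR are hypothetical, as is the reduced struct include; the actual definitions in parser.h may well differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical copy function: duplicate size bytes in newly allocated memory.
   Returns NULL if the allocation fails. */
void *copy_value(const void *src, size_t size)
{
    void *dst = malloc(size);

    if (dst != NULL)
        memcpy(dst, src, size);
    return dst;
}

/* Hypothetical convenience macros in the spirit of those in parser.h. */
#define COPY(value)     copy_value(&(value), sizeof(value))
#define COPY_STR(text)  copy_value((text), strlen(text) + 1)

/* Reduced structure for illustration only. */
struct include {
    char *filename;
};
```

A rule obtaining a filename would copy the string before storing it, so that later changes to the scanner's buffer do not affect the stored value. As with the tool itself, nothing in this sketch frees the copies.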

(Currently, the management of allocated memory is deficient in that such memory is not deallocated. However, since the program is intended to have a limited running time and handle limited numbers of input files, no effort has been directed towards tidying up this allocated memory.)

Rules and Structures

Generally, the structures built using the result values reflect the structure of the rules describing the interface description language. However, the form of rules, necessary as it is for parsing, is not entirely optimal for a generated structure. Consider the following rule:

attributes : attribute SEP attributes
           | attribute
           ;

Consider a pair of attributes:

first,second

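With the attribute rule matching each identifier, and with the attributes rule incorporating a single attribute value within a structure referencing other attributes, a nested structure would emerge: an attributes structure holding first would reference another attributes structure, which in turn would hold second.

A more natural structure would instead employ a linked list of attributes, with first referencing second directly.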

To achieve this, a tail member is defined in structures, and instead of wrapping results in new structures at each level of the rule hierarchy, results are effectively combined by having one result reference another via the tail member, thus linking together collections of results.
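The linking of results via a tail member can be sketched as follows. The struct attribute layout and the link_attributes and count_attributes functions are assumptions for illustration; only the tail member name comes from the description above.

```c
#include <stddef.h>

/* Reduced attribute structure with the tail member linking to the next. */
struct attribute {
    const char *name;
    struct attribute *tail;   /* next attribute, or NULL at the end */
};

/* What an action for "attribute SEP attributes" might do: rather than
   wrapping both results in a new structure, link the first to the rest. */
struct attribute *link_attributes(struct attribute *first,
                                  struct attribute *rest)
{
    first->tail = rest;
    return first;
}

/* Traverse the list via the tail members, counting the attributes. */
size_t count_attributes(const struct attribute *attr)
{
    size_t n = 0;

    for (; attr != NULL; attr = attr->tail)
        n++;
    return n;
}
```

For the input first,second, the result of the inner attributes rule (second) becomes the tail of the result of the attribute rule (first), yielding a two-element list.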

Interface Structure

The types.h file defines the structural elements of interfaces prepared during the processing of the input files. A hierarchy of structure types is defined as follows:

The nature of the hierarchy should reflect the conceptual form of the input. It should be noted that imported file and header file information (represented by import and include structures respectively) is associated with interface information (represented by interface structures). This arrangement merely attempts to indicate the import and header file declarations that preceded specific interface declarations, but the two different types of information should arguably be grouped within a file-oriented structure.

Code Generation

The program.c and program.h files define the functions that coordinate the generation of program code. It is in program.c that files are opened for writing (using the get_output_file function provided by common.c), and the principal function involved in initiating the population of these files is the write_files function. This function then invokes write_interfaces to coordinate the code generation for each of the interfaces described by the input.

Meanwhile, a selection of writer functions is employed to generate code for the different structures employed within interface descriptions. Since output may be generated for different kinds of files, the following concepts are employed to parameterise the output appropriately:

Component roles indicate the nature of the component employing the generated code: client or server
Function roles indicate the characteristics of the function being written: general or completion
Parameter roles indicate the context of an identifier at a location in the generated code: signature (formal function parameter), structure (member), invocation (actual function parameter)

Through the use of roles, the same fundamental information can be expressed in different ways by the same routine.
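To illustrate how one routine can express the same information in different ways, here is a minimal sketch of a parameter writer driven by a parameter role. The enumeration names and the format_parameter function are hypothetical, and the output goes to a buffer rather than a file to keep the example small.

```c
#include <stdio.h>

/* Hypothetical enumeration of the parameter roles described above. */
enum parameter_role { SIGNATURE_ROLE, STRUCTURE_ROLE, INVOCATION_ROLE };

/* Format one parameter according to its role.
   Returns the number of characters needed, as reported by snprintf. */
int format_parameter(char *buf, size_t size, enum parameter_role role,
                     const char *type, const char *name)
{
    switch (role) {
    case SIGNATURE_ROLE:    /* formal function parameter: "int count" */
        return snprintf(buf, size, "%s %s", type, name);
    case STRUCTURE_ROLE:    /* structure member: "int count;" */
        return snprintf(buf, size, "%s %s;", type, name);
    case INVOCATION_ROLE:   /* actual function parameter: "count" */
        return snprintf(buf, size, "%s", name);
    }
    return -1;
}
```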

Interfaces

The write_interfaces function, invoked by write_files, coordinates output generation for each interface.

Imports and Parsing

The imports.c file coordinates the parsing and code generation activities. Upon running the tool, the import_file function is invoked by the main function (in main.c) for each input file.

The import_file function invokes the parser and, given a successfully parsed result, then seeks to resolve any import statements mentioned in the input. The resolve_imports function iterates over any such imports, which are represented by import structures, calling the import_file function to obtain another parsing result structure for each import. A reference to this structure is stored in the import structure concerned, thus establishing inter-file relationships. Such importing is done recursively.

Where compound interfaces are defined in the input, these composing or combining other interfaces, the identity of such base interfaces needs to be determined. This is done by the populate_bases function which searches the parsing results from the location of each compound interface whose base interfaces are to be resolved, traversing interface definitions and imports. When determined, an interface definition will then be directly referenced by the base interface reference associated with the compound interface.

Includes and Headers

The includes.c file provides support for writing #include statements in C and C++ programs, with the write_includes function traversing an include structure list to emit a list of statements.
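A traversal of this kind might look as follows. The write_includes name is taken from the description above, but its signature, the struct include layout, and the assumption that each filename retains its <...> or "..." delimiters are all guesses made for the sake of the sketch.

```c
#include <stdio.h>

/* Reduced include structure; the filename is assumed to retain its
   delimiters, such as <stdio.h> or "local.h". */
struct include {
    const char *filename;
    struct include *tail;   /* next include, or NULL at the end */
};

/* Emit one #include statement for each element of the list. */
void write_includes(struct include *inc, FILE *fp)
{
    for (; inc != NULL; inc = inc->tail)
        fprintf(fp, "#include %s\n", inc->filename);
}
```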

Interface Definitions

The interface.c file provides the write_interface_definition function, which is concerned with generating a description of an interface in two different contexts: client and server. Some generated files are employed in both contexts, such as the file of the form <interface>_interface.h.

Client Definitions

For client use, some types are defined for client programs to use in a header file of the form <interface>_client.h.

C programs employ a collection of function signatures for the different operations provided by the interface plus a declaration of an interface object retaining references to those functions as members.

C++ programs employ a class declaration containing the operations as methods and with internal state referencing the component providing the operations.

Server Definitions

For server use, some types are defined for server programs to use in a header file of the form <interface>_server.h.

C programs employ declarations of the interface object type whose members provide a table of references to operation functions. A reference type is defined as a way of referencing the component providing the operations, and an object type is defined that bundles this opaque reference with a reference to the interface details. (Such an object is populated with a pointer to a component and a pointer to a table of concrete functions providing the appropriate operations.)
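The relationship between the reference type, the interface object type and the object type can be sketched for a hypothetical Counter interface with a single increment operation. All of the names below (ref_Counter, iface_Counter, Counter, counter_increment and so on) are invented for illustration and do not claim to match the tool's generated output.

```c
/* Reference type: an opaque way of referring to the component. */
typedef struct {
    void *ptr;
} ref_Counter;

/* Interface object type: a table of references to operation functions. */
typedef struct {
    long (*increment)(ref_Counter self, int amount, int *result);
} iface_Counter;

/* Object type: bundles the opaque reference with the interface details. */
typedef struct {
    ref_Counter ref;
    const iface_Counter *iface;
} Counter;

/* A concrete component and its implementation of the operation. */
struct counter_state {
    int value;
};

static long counter_increment(ref_Counter self, int amount, int *result)
{
    struct counter_state *state = self.ptr;

    state->value += amount;
    *result = state->value;
    return 0;   /* assumed convention: zero indicates success */
}

/* Table of concrete functions providing the operations. */
static const iface_Counter counter_iface = { counter_increment };
```

An object is then populated with a pointer to a struct counter_state and a pointer to counter_iface, and an operation is invoked as obj.iface->increment(obj.ref, amount, &result).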

C++ programs provide the rather more straightforward class declaration containing the operations to be implemented by a component. (The implementation is provided by deriving from the generated class and defining concrete methods.)

Both C and C++ output contains additional details describing operation codes and message structures (incoming and outgoing) for the different operations.

Templates and Output

The strings used to generate output are provided in templates.h. Many of these strings contain placeholders in the form of output specifiers such as %s and %d used by the fprintf function. A more sophisticated approach would involve the use of a template language, which would potentially simplify this tool's code and make the output generation mechanisms somewhat clearer.

Clients

The client.c file is concerned with generating wrapper functions that present the operations exposed by an interface to a client program. These wrapper functions take any parameters supplied to them and populate a message to be sent to the server providing the implementation of the interface. They then send the message to the server, interpret the reply, and set output parameters appropriately.

The form of client wrapper functions is similar to that of server wrapper functions, and various common functions are used by both client and server code generation activities.

Servers

The server.c file is concerned with the generation of code for wrapper functions for each of the operations associated with an interface, this code obtaining values from messages and invoking the actual operation implementations.

Most of the functions provided are concerned with the generation of variable declarations and the initialisation of such variables from messages, the formulation of invocation statements, and the population of messages from operation results. Various common functions are used by client and server code generation activities.

Dispatchers and Handlers

The generation of code that interprets message details and dispatches to the wrapper functions written by server.c can be found in the dispatch.c file.

Parameters and Members

The declaration.c file provides general support for generating declarations for client and server code, handling artefacts such as interfaces, function signatures and parameters.

Message Structures and Access

The message.c file provides support for the generation of statements related to the access of message contents. The functions provided are used by the client.c and server.c files in the generation of wrapper functions.

The structure.c file concerns itself with the generation of operation code (opcode) enumerations, these giving each operation a distinct identifier, along with the generation of structures used to interpret and populate messages for each operation.
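To show how opcode enumerations and message structures might work together with a dispatcher, the following is a much simplified sketch for a hypothetical Counter interface. The names, the struct message layout and the handle_Counter function are assumptions; real generated code would involve an actual messaging mechanism and separate wrapper functions.

```c
/* Opcode enumeration giving each operation a distinct identifier. */
enum opcode_Counter { opcode_Counter_increment, opcode_Counter_reset };

/* Structures used to interpret incoming and populate outgoing messages. */
struct in_Counter_increment  { int amount; };
struct out_Counter_increment { int result; };

/* A simplified message: an opcode followed by an operation payload. */
struct message {
    int opcode;
    union {
        struct in_Counter_increment in_increment;
        struct out_Counter_increment out_increment;
    } body;
};

/* State of the component implementing the interface. */
static int counter = 0;

/* Interpret the message details and dispatch on the opcode.
   Returns 0 on success, or -1 for an unrecognised opcode. */
int handle_Counter(struct message *msg)
{
    switch (msg->opcode) {
    case opcode_Counter_increment: {
        /* Obtain values from the message before overwriting the payload. */
        int amount = msg->body.in_increment.amount;

        counter += amount;
        msg->body.out_increment.result = counter;
        return 0;
    }
    case opcode_Counter_reset:
        counter = 0;
        return 0;
    default:
        return -1;
    }
}
```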

Summaries

The summary.c file provides functions to display the structure of parsed interface files. Each function corresponds to a data structure defined in types.h, with references traversed to other structures. The show_interface function is the entry point from which other structures created during parsing are reached.