The Grunt Preprocessor

Grunt is a preprocessor designed for generating repetitive blocks of code with subtle variations that cannot be captured by more traditional means (functions, C preprocessor macros, C++ templates). Grunt is named for the fact that it takes the "gruntwork" out of coding.

Grunt reads from the files specified on its command line and generates output to STDOUT by default. If no files are specified on the command line, grunt reads from STDIN.

Input files can consist of raw text and preprocessor directives which are specially interpreted by grunt.

Statements and Expressions

Grunt interprets two kinds of directives: statements and expressions. Expressions are initiated with the expression character ('$' by default) and statements are initiated with the statement character ('@' by default).

Expressions are interpreted and replaced with their value in the output stream. The simplest form of expression is a variable expansion. The file:

   hello $user

will generate "hello Mike" in the case where user = Mike. Variable names in grunt are case sensitive and are of the same format as C variable names:

   VariableName := LetterOrUnderscore VariableCharacter*
   VariableCharacter := '0'-'9' | LetterOrUnderscore
   LetterOrUnderscore := 'a'-'z' | 'A'-'Z' | '_'

In the case where a variable name is followed by a non-variable-name character, no delimiters are required. In cases where a variable name is followed by a variable name character (and in the case of expressions in general), it is necessary to enclose the expression in parentheses:

   hello $(firstName)_$(lastName)_$middleInitial

The example above will output "hello Mike_Muller_A" where firstName = Mike, lastName = Muller, and middleInitial = A. Note that it was not necessary to enclose middleInitial in parentheses, as the character following it is not a character which can be used in a variable name.
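
The delimiter rule can be illustrated with a short Python sketch. This is not grunt's implementation: the function name expand and the dictionary of variables are assumptions made for illustration, and only plain variable names (not full expressions) are handled inside the parentheses.

```python
import re

# Matches $(name) or $name, where name follows the C-style grammar above.
# Hypothetical helper for illustration; grunt's real scanner is not shown here.
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
_VARIABLE = re.compile(r"\$(?:\((" + _NAME + r")\)|(" + _NAME + r"))")


def expand(text, variables):
    """Replace each $name or $(name) in text with its value from variables."""
    def substitute(match):
        name = match.group(1) or match.group(2)
        # Assumption: an undefined variable raises KeyError in this sketch.
        return variables[name]
    return _VARIABLE.sub(substitute, text)


values = {"firstName": "Mike", "lastName": "Muller", "middleInitial": "A"}
print(expand("hello $(firstName)_$(lastName)_$middleInitial", values))
```

Without the parentheses, $firstName_ would be read as a reference to a variable named firstName_, because the underscore is a legal variable name character.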

In addition to variables, expressions can also contain binary operators. At present, grunt supports the following binary operators:

a = b
returns the string "true" if the value of a is identical to the value of b and the empty string if not.
a / b
returns the string "true" if the value of b is a substring of the value of a and the empty string if not.
a . b
concatenates the value of b to the value of a.
a & b
returns the string "true" if both a and b are not empty strings. Otherwise returns false.
a | b
returns the string "true" if either a or b is not an empty string. Otherwise returns false.

The '=' and '/' operators are not particularly valuable within stand-alone expressions. Their primary use is within statements.

Binary operators are always evaluated from left to right, regardless of what the operator is. Parentheses may be used to group sub-expressions. String constants may be used as atoms, enclosed in single or double quotes.

In addition to binary operators, the logical "not" operator (the exclamation point ['!']) is used to negate the expression which it immediately precedes.
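
Because there is no operator precedence, the order in which operands are written changes the result. The following Python sketch models the operator descriptions above; it is not grunt's evaluator. The names apply_operator, evaluate, and negate are hypothetical, and the sketch assumes that "false" for '&' and '|' means the empty string and that '!' yields "true" for an empty operand.

```python
TRUE = "true"
FALSE = ""  # assumption: "false" is represented by the empty string


def apply_operator(op, a, b):
    """Apply one binary operator to two string values."""
    if op == "=":
        return TRUE if a == b else FALSE
    if op == "/":
        return TRUE if b in a else FALSE
    if op == ".":
        return a + b
    if op == "&":
        return TRUE if a != "" and b != "" else FALSE
    if op == "|":
        return TRUE if a != "" or b != "" else FALSE
    raise ValueError("unknown operator: " + op)


def evaluate(first, rest):
    """Fold (operator, operand) pairs onto first, strictly left to right."""
    result = first
    for op, operand in rest:
        result = apply_operator(op, result, operand)
    return result


def negate(a):
    # Assumption: "!" turns an empty string into "true" and anything else
    # into the empty string.
    return TRUE if a == "" else FALSE


# 'a' . 'b' = 'ab' is grouped as ('a' . 'b') = 'ab'
print(repr(evaluate("a", [(".", "b"), ("=", "ab")])))
# 'ab' = 'a' . 'b' is grouped as ('ab' = 'a') . 'b'
print(repr(evaluate("ab", [("=", "a"), (".", "b")])))
```

In the second case the comparison happens first and yields the empty string, so the concatenation produces just "b". Parentheses around the concatenation would be needed to compare 'ab' with the joined string.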

Statements are used to define variables, to define tables, and to implement control structures. The grunt preprocessor begins interpreting a statement when it encounters the statement character ('@' by default) and stops when it encounters one of the characters used to indicate the end of a statement (';', '}', and '{'). Between these characters, C and C++ style comments may be used and will not be output.

Grunt recognizes the following kinds of statements:

def
Defines a variable.
table
Defines a table.
all
Iterates through the rows of a table.
if
Conditionally generates a block of text.
include
"Includes" a file: reads in a file as though it were in this file's input stream.
out
Sets the current output stream. Closes the previous output stream.

Grunt also defines several "special" statements. The closing curly bracket ('}') is used to indicate the end of a block. The pound sign ('#') is used to indicate a grunt comment which is terminated at the end of the line. A semicolon (';') is used to indicate the empty statement which may be used as a placeholder for C or C++ style comments. Examples:

   @# this is a comment

   @if (x = 'test') {
   some text
   
   @# the following ends the block initiated by the "if" statement
   @  }
   
   @
   // here we take advantage of the fact that grunt parses everything
   // up till the end of the statement to insert a C++ style comment
   // which will not be visible in the output stream.  The following
   // semicolon is the empty statement
   ;

The def statement

def is used to define variables. Its format is:

   @def variable = expression;

For example:

   @# simple definition
   @def x = "this is a test string";
   
   @# use a more complicated expression
   @def insult = "You know, " . name . ", you smell really bad.";

The table statement

Application programmers are constantly faced with the problem of having to implement code structures which vary only in a limited number of parameters. The table statement attempts to solve this problem by allowing a tabular structure to be defined inline. The table can then be enumerated using the all statement.

table statements are of the form:

   @table variable {
      headers;
      data
   @  }

Headers are column names: within the body of an all statement, these will be defined as variables whose values are those of the corresponding column of the current row of the table. The headers line defines the number of columns in the table as well as their names. The remainder of the table is the actual table data. The number of elements of the data area should be evenly divisible by the number of columns. Within the data area, identifiers are interpreted as strings. Examples:

   @# Table of days of the week.  Note the semicolon at the end of the
   @# header definitions.  The cute little dashes under the column names
   @# are unnecessary; they're just part of a comment included for clarity.
   
   @table Weekdays {
      ordinal  name        abbrev;
   // -------  ----        ------
      '0'      Sunday      Sun
      '1'      Monday      Mon
      '2'      Tuesday     Tue
      '3'      Wednesday   Wed
      '4'      Thursday    Thu
      '5'      Friday      Fri
      '6'      Saturday    Sat
   @  }
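
The relationship between the header line and the data area can be sketched in Python. This is only a model of the description above, not grunt's parser: build_rows is a hypothetical name, and raising an error for a ragged data area is an assumption (the text only says the element count "should be" evenly divisible by the number of columns).

```python
def build_rows(headers, cells):
    """Group a flat list of cells into one dict per row, keyed by header."""
    width = len(headers)
    if len(cells) % width != 0:
        # Assumption: this sketch rejects ragged data outright.
        raise ValueError("cell count is not a multiple of the column count")
    return [
        dict(zip(headers, cells[start:start + width]))
        for start in range(0, len(cells), width)
    ]


headers = ["ordinal", "name", "abbrev"]
cells = ["0", "Sunday", "Sun", "1", "Monday", "Mon", "2", "Tuesday", "Tue"]
rows = build_rows(headers, cells)
print(rows[1])
```

Each row maps column names to values, which is the shape the all statement needs when it defines the column names as variables.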

The all statement

The all statement is used to iterate through the rows of a table. In the body of the all statement, each of the column names of the table is defined as the value of that column for the current row.

all statements are of the form:

   @all tablename {
      text
   @  }

The text portion of the all statement is run through the preprocessor once for each row of the table. Example:

   @# here's our weekday table again...
   @table Weekdays {
      ordinal  name        abbrev;
   // -------  ----        ------
      '0'      Sunday      Sun
      '1'      Monday      Mon
      '2'      Tuesday     Tue
      '3'      Wednesday   Wed
      '4'      Thursday    Thu
      '5'      Friday      Fri
      '6'      Saturday    Sat
   @  }

   @# now we're going to use it to generate a C style array definition
   @all Weekdays {
      day[$ordinal].name = "$name";
      day[$ordinal].abbrev = "$abbrev";
   @  }

The above example would result in the following output:


     day[0].name = "Sunday";
     day[0].abbrev = "Sun";

     day[1].name = "Monday";
     day[1].abbrev = "Mon";

     day[2].name = "Tuesday";
     day[2].abbrev = "Tue";

     day[3].name = "Wednesday";
     day[3].abbrev = "Wed";

     day[4].name = "Thursday";
     day[4].abbrev = "Thu";

     day[5].name = "Friday";
     day[5].abbrev = "Fri";

     day[6].name = "Saturday";
     day[6].abbrev = "Sat";

Note that the content of the body is not "part" of the statement: it is really a nested component of the overall input stream. It can include nested statements and expressions as in the example.
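
The per-row expansion can be imitated in Python. The sketch below uses Python's string.Template as a stand-in, since it happens to share the $name syntax; it does not support $(...) expressions or nested statements, and expand_all is a hypothetical name. Whitespace in real grunt output may differ.

```python
from string import Template


def expand_all(rows, body):
    """Expand body once per row, with column names bound to row values."""
    template = Template(body)
    return "".join(template.substitute(row) for row in rows)


rows = [
    {"ordinal": "0", "name": "Sunday", "abbrev": "Sun"},
    {"ordinal": "1", "name": "Monday", "abbrev": "Mon"},
]
body = 'day[$ordinal].name = "$name";\nday[$ordinal].abbrev = "$abbrev";\n\n'
print(expand_all(rows, body), end="")
```

The body is treated as a template that is re-evaluated with a fresh set of variable bindings for every row, which mirrors the way the column names are redefined on each pass.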

The if statement

As its name implies, the if statement is used to conditionally process a block of text.

if statements are of the form:

   @if (expression) {
      text
   @  }

As might be expected, the text block is only evaluated if the expression evaluates to anything other than the empty string.

Examples:

   @# an "if" used the way it would be by a traditional preprocessor
   @def realOS = 'true';
   @if (OS = 'Windows95') {
   @  def realOS = '';
   @  }
   
   @#--------------------------------------------------------------------
   @# this demonstrates how an if statement can be used to conditionally
   @# generate code from a table
   @#--------------------------------------------------------------------
   
   @table Fields {
      fieldName   isSettable;
   // ---------   ----------
      name        true
      rank        true
      serialNum   ''
   @  }
   
   class Person {
   
   @all Fields {
   
      private:
         String $fieldName;
         
      public:
         String get$fieldName() { return $fieldName; }
         
   @  if (isSettable) {
         void set$fieldName(String val) { $fieldName = val; }
   @     }
         
   @  }
   
   };

The last example demonstrates how the kind of code generated can be varied according to the needs of a particular case.

The include statement

The include statement is used to include another source file in the grunt input stream.

The format of the include statement is as follows:

   @include file-name-expression;

The file-name-expression can be any expression which evaluates to a file name.

By default, grunt searches for the file only in the current directory. It is possible to override this behavior by setting the "INCLUDE" variable, which defines an include path. This variable can also be set on the grunt command line using the "-I" option. If you override the INCLUDE path through either method, it is necessary to explicitly identify the current directory (either with a "." or an empty string) if you want to be able to continue to include from there.

Example, with grunt invoked as follows:

   $ grunt -I. -I /usr/local/include

and given the following input:

   @include "LocalFile.pre";
   @include "globalheader.h";
   
   @# reset the include path (to the same value that we have used on the
   @# command line)
   @def INCLUDE = ".:/usr/local/include";

Note that the path separator (":" in the above example) varies depending on the platform. On platforms that use ";" as a path separator, it will also be used as the grunt path separator.
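
An include path lookup of this kind can be sketched in Python, where os.pathsep plays the role of the platform-dependent separator. This is an illustration rather than grunt's code: find_include is a hypothetical name, the file name grunt_sketch_header.h is made up for the demonstration, and returning the first match in path order is an assumption, since the search order is not stated above.

```python
import os
import tempfile


def find_include(file_name, include_path):
    """Return the first existing file_name along include_path, or None."""
    for directory in include_path.split(os.pathsep):
        # "." and the empty string both name the current directory.
        candidate = os.path.join(directory or ".", file_name)
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as shared:
    header = os.path.join(shared, "grunt_sketch_header.h")
    with open(header, "w") as handle:
        handle.write("/* shared definitions */\n")
    # Search the current directory first, then the shared directory.
    search_path = os.pathsep.join([".", shared])
    found = find_include("grunt_sketch_header.h", search_path)

print(found == header)
```

If the current directory were left out of search_path, files there would no longer be found, which is the situation the paragraph on overriding INCLUDE warns about.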

The out statement

The out statement is used to define a new output stream for the preprocessor. It is used to allow a single preprocessor source to generate multiple output files.

The format of the out statement is as follows:

   @out file-name-expression;

As with the include statement, the fact that the file name is an expression permits all expression operations to be performed on it.

The out statement can also be used as a block command, to temporarily send output to another destination:

   @out file-name-expression {
   
   this text is sent to the file
   
   @  }
   
   this text is sent to the output stream that was in use before the
   out statement
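
The save-and-restore behaviour of the block form resembles a scoped redirection, which can be modelled in Python with a context manager. The Emitter class and its redirected method are hypothetical names for this sketch; it uses in-memory streams instead of files and does not model the closing of the previous stream that the non-block form performs.

```python
import contextlib
import io


class Emitter:
    """Tracks the current output stream, loosely like the out statement."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, text):
        self.stream.write(text)

    @contextlib.contextmanager
    def redirected(self, stream):
        # Remember the stream in use, switch to the new one for the
        # duration of the block, then restore the original.
        previous = self.stream
        self.stream = stream
        try:
            yield
        finally:
            self.stream = previous


main, side = io.StringIO(), io.StringIO()
emitter = Emitter(main)
emitter.write("before\n")
with emitter.redirected(side):
    emitter.write("inside\n")
emitter.write("after\n")
```

After this runs, main holds the "before" and "after" lines and side holds only the "inside" line.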

Functions

Grunt does not yet allow user-defined, parameterizable macros like the C preprocessor or m4. This feature will be implemented in future releases. It does, however, provide two primitive functions which can be invoked in expressions. These are upper() and lower(), and as their names suggest, they are used to convert text strings to uppercase and lowercase, respectively. Example:

   $(upper('test string'))

generates the following output:

   TEST STRING

The grunt preprocessor is Copyright (C) 1996, 1997, 1998, 1999 Michael Alan Muller. It may be used and redistributed under the terms of the GNU General Public License.