Classes for template generation of XML, HTML, etc.

These classes are in the main jar file. Suppose that you want to generate an HTML page containing data produced by your program, or write out an object in serialised form using XML. One way to do this is with successive print statements, but this tends to mix up the dynamic information you are really interested in with a fairly static skeleton. This hurts you both when you are concentrating on the code that writes out the dynamic information, and when you need to change the static skeleton, whether to make the HTML page look nicer or to move to a different XML DTD or Schema.

Template generation puts the static framework into a template. This consists of chunks of static text interspersed with escape sequences which cause the substitution of dynamic information from outside the template, or some form of control flow. Heavy-duty versions of this include Java Server Pages, Active Server Pages, and PHP. These are fully-fledged programming languages, which become the main entry point of the program. What is presented here is different: I describe classes that can be used from within a program to generate chunks of text, without demanding that they become the main framework of the program.

Using the template classes

Typically, you do the following:

  1. Create a Template object from a java.io.Reader, typically produced by wrapping the java.io.InputStream returned by Class.getResourceAsStream() in a java.io.InputStreamReader, so as to retrieve text stored with your program's class files.
  2. Build a ConsumableMap from a java.util.Map of java.util.ListIterators that return StringChunk objects. This associates a number of java.lang.String keys with a sequence of StringChunk objects. Attribute-value pairs are ubiquitous in computing. Templates can easily handle a slightly more general object: maps from attributes to lists of values, and that is what these are. One way to produce StringChunk objects is to construct them from a String.
  3. Call Template.tryGenerate(), passing it the ConsumableMap. This returns a StringChunk object.
  4. Write out the generated text by calling StringChunk.printAll(), or simply turn it into a string by calling StringChunk.toString().

A number of examples of this are in the test code for these classes, TemplateTest.java, which is also in the main jar file. Since it uses templates loaded via Class.getResourceAsStream(), example templates are also here, as TemplateTest.in1 to TemplateTest.in4.

StringChunk

You could do template generation without introducing a StringChunk object, using java.lang.String or java.lang.StringBuffer instead. The problem with this is that you end up doing a lot of string copying, especially if you end up using strings generated from one Template as input to another, which would otherwise be a very good way to generate highly variable nested XML or HTML. The worst-case cost, even with StringBuffer, can be quadratic in the number of characters finally generated. Consider the following:

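  StringBuffer old = new StringBuffer();
  for (int i = 0; i < n; i++)
  {
    // Copies every character accumulated so far into a fresh buffer.
    StringBuffer sb = new StringBuffer("x");
    sb.append(old.toString());
    old = sb;
  }

Pass i of this loop copies about i characters, so n passes copy roughly n*n/2 characters in total. Obviously, you would never write this, but something like this could happen in the worst case during template generation.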

StringChunk objects can be created from Strings, e.g. as StringChunk sc = new StringChunk.StringCarrier("String text"). They can also be appended as e.g. appendResult = stringChunk.append(stringChunk2). Appending StringChunks takes constant time, no matter the length of the StringChunks involved. It does not modify the StringChunks involved, but creates a new StringChunk to form the result, so there should be no problems with concurrency, or with appending one StringChunk to many others.

Once you (or the template generation code) have finished your sequence of appends and are happy with the result, you can call StringChunk.toString() or StringChunk.printAll(java.io.Writer) to produce a String, or write out a result. This takes time proportional to the length of the string produced or result written plus the number of chunks appended together.
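
To make the cost claims above concrete, here is a minimal sketch of one way such a structure could work. It is not the library's StringChunk: the class ChunkSketch and its nested Leaf and Concat classes are hypothetical names invented for this illustration. An append allocates a single node that points at its two operands, and the characters are only copied once, when toString() walks the tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for the idea behind StringChunk; not the library's class. */
abstract class ChunkSketch {
    /** Leaf node wrapping an ordinary String. */
    static final class Leaf extends ChunkSketch {
        final String text;
        Leaf(String text) { this.text = text; }
    }

    /** Interior node recording that right follows left; no characters are copied. */
    static final class Concat extends ChunkSketch {
        final ChunkSketch left, right;
        Concat(ChunkSketch left, ChunkSketch right) { this.left = left; this.right = right; }
    }

    static ChunkSketch of(String text) { return new Leaf(text); }

    /** Constant time: allocates one node and leaves both operands untouched. */
    ChunkSketch append(ChunkSketch other) { return new Concat(this, other); }

    /** Linear in characters plus nodes; an explicit stack avoids deep recursion. */
    @Override
    public String toString() {
        StringBuilder out = new StringBuilder();
        Deque<ChunkSketch> pending = new ArrayDeque<>();
        pending.push(this);
        while (!pending.isEmpty()) {
            ChunkSketch c = pending.pop();
            if (c instanceof Leaf) {
                out.append(((Leaf) c).text);
            } else {
                Concat cat = (Concat) c;
                pending.push(cat.right); // pushed first so that left is popped first
                pending.push(cat.left);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ChunkSketch row = of("<td>").append(of("42")).append(of("</td>"));
        ChunkSketch twice = row.append(row); // sharing is safe because nodes are immutable
        System.out.println(twice); // prints <td>42</td><td>42</td>
    }
}
```

Because no node is ever modified after construction, the same chunk can appear in many results at once, which is the property the text above relies on for concurrency and for appending one chunk to many others.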

ConsumableMap

The limited template generation provided here is based on a very simple data structure, which maps from a key to a sequence of values. As templates are expanded, they use the values in each sequence in order, until the sequence is exhausted. This makes it possible to write templates that produce lists of the values associated with a particular key. A ConsumableMap uses its ListIterators to move along the sequence of values associated with a particular key.

To support more complex control structures in the templates, it is also possible to undo the consumption of values in a sequence. You can call mark() before attempting to extract the values associated with a number of keys. After extracting these values you can either call accept() to accept the consumption of these values, or reject() to restore the state of the map at the time mark() was called. Sequences of mark() and accept() or reject() nest. mark() and accept() take constant time. reject() takes time proportional to the number of values extracted since its nested mark (it is implemented by calling previous() to restore the state of each ListIterator from which a value was extracted using next()).
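
The mark(), accept() and reject() behaviour can be sketched with an undo log, as below. This is only an illustration under stated assumptions: MarkableMapSketch, its next() method, and the convention of returning null when a key is exhausted are all hypothetical, and the real ConsumableMap may be organised quite differently. The sketch does show how the stated costs can be met: a mark is just the current length of the log, and reject() calls previous() once for each value taken since that mark.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;

/** Hypothetical sketch of mark/accept/reject; not the library's ConsumableMap. */
class MarkableMapSketch {
    private final Map<String, ListIterator<String>> sequences;
    /** Undo log: one entry per value taken while at least one mark is open. */
    private List<ListIterator<String>> log = new ArrayList<>();
    /** Open marks, each stored as the log size at the time of the mark. */
    private final Deque<Integer> marks = new ArrayDeque<>();

    MarkableMapSketch(Map<String, ListIterator<String>> sequences) {
        this.sequences = sequences;
    }

    /** Returns the next value for key, or null if none remain (an assumed convention). */
    String next(String key) {
        ListIterator<String> it = sequences.get(key);
        if (it == null || !it.hasNext()) return null;
        if (!marks.isEmpty()) log.add(it); // only needed while something could be undone
        return it.next();
    }

    /** Constant time: remembers how long the log is right now. */
    void mark() { marks.push(log.size()); }

    /** Constant time: keeps the consumption; entries stay logged for any enclosing mark. */
    void accept() {
        marks.pop();
        if (marks.isEmpty()) log = new ArrayList<>(); // nothing left that could be undone
    }

    /** Steps each iterator logged since the matching mark back once, newest first. */
    void reject() {
        int start = marks.pop();
        while (log.size() > start) {
            log.remove(log.size() - 1).previous();
        }
    }

    public static void main(String[] args) {
        MarkableMapSketch m = new MarkableMapSketch(Map.of(
                "name", List.of("a", "b").listIterator()));
        m.mark();
        System.out.println(m.next("name")); // prints a
        m.reject();
        System.out.println(m.next("name")); // prints a again, since the first take was undone
    }
}
```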

Template Escape Sequences

The escape character for these templates is '%'. It can be used as follows:

%white space%
% immediately followed by white space causes the % escape and all succeeding white space to be ignored. This allows you to lay out templates in a reasonably readable fashion without placing arbitrary chunks of white space in the generated template output. For XML or HTML extra white space doesn't matter much, but it is a bit of a waste, and you might not always be generating XML or HTML.
%<...%>
%< and %> brackets mark the start and end of template comments. These brackets, and all the material within them, are ignored. They do not nest.
%xdigits%
%x should be followed by a sequence of hex digits terminated by a % character. The hex number represented by these digits is cast to a character and becomes part of the text. So %x66% is another way of expressing the single character 'f'.
%%
%% expands to a single % character.
%$name%
Here name should be a sequence of characters valid as the end part of a Java identifier (so letters, digits, and _ are all OK, as well as others). This is a request to substitute in the value associated with the key <name> in the ConsumableMap when the template is expanded (or fail, if no values for that key remain).
%:control(...%)
Here control is one of a small number of words that produce some template control behaviour, such as repeating the bracketed text until a template expansion fails. The text within the brackets can contain all of the defined escape sequences, including other control sequences, and these brackets do nest. Each such expansion either succeeds or fails. This failure may or may not be propagated beyond the brackets, depending on the control word, since controlling such failure is one of the main functions of a control word. In any case, a section of bracketed text that fails will never return any generated text, and never consume any values from the map, even if the failure occurs at the very end of its section of text.
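
As a rough illustration of the simpler escapes, the following sketch expands %%, %xdigits%, %<...%>, % followed by white space, and %$name%. It is not the library's parser. EscapeSketch and expand() are hypothetical names, values come from a plain java.util.Map with one value per key rather than from a ConsumableMap, control sequences are rejected, and treating a lone % at the very end of the input as ignorable is an assumption of this sketch.

```java
import java.util.Map;

/** Hypothetical sketch of the simple escapes; control sequences are not handled. */
class EscapeSketch {
    static String expand(String t, Map<String, String> values) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < t.length()) {
            char c = t.charAt(i);
            if (c != '%') { out.append(c); i++; continue; }
            if (i + 1 >= t.length()) break;          // lone % at end of input (assumed harmless)
            char k = t.charAt(i + 1);
            if (Character.isWhitespace(k)) {         // % + white space: drop the % and the run
                i++;
                while (i < t.length() && Character.isWhitespace(t.charAt(i))) i++;
            } else if (k == '%') {                   // %% stands for a single %
                out.append('%');
                i += 2;
            } else if (k == '<') {                   // %< ... %> comment, not nested
                int end = t.indexOf("%>", i + 2);
                if (end < 0) throw new IllegalArgumentException("unterminated comment");
                i = end + 2;
            } else if (k == 'x') {                   // %xHEX% becomes the character with that code
                int end = t.indexOf('%', i + 2);
                if (end < 0) throw new IllegalArgumentException("unterminated %x escape");
                out.append((char) Integer.parseInt(t.substring(i + 2, end), 16));
                i = end + 1;
            } else if (k == '$') {                   // %$name% substitutes the value for name
                int end = t.indexOf('%', i + 2);
                if (end < 0) throw new IllegalArgumentException("unterminated %$ escape");
                String v = values.get(t.substring(i + 2, end));
                if (v == null) throw new IllegalStateException("no value for key");
                out.append(v);
                i = end + 1;
            } else {
                throw new IllegalArgumentException("unsupported escape at index " + i);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "%< note %>%\n  <b>%$who%</b> is 100%% %x66%ine";
        System.out.println(expand(template, Map.of("who", "World")));
        // prints <b>World</b> is 100% fine
    }
}
```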

Control Sequences

The following control sequences are defined:

fail
A fail section never generates any text or consumes any values. It succeeds if, and only if, the expansion of the text within its brackets fails. One use of this is to detect keys whose values have not all been used up.
ignore
The text within the brackets is expanded as usual but thrown away, so the expansion is not visible. The effect on the ConsumableMap is as if the brackets were not there: values are consumed, and the expansion succeeds or fails, as usual.
release
After the section of text within release brackets has been expanded, whether or not the expansion is successful, the state of the ConsumableMap is restored to its state at the start of the expansion. So expanding a release section never permanently consumes any values (though within that section, values may be consumed as usual). If the bracketed section fails, the release section fails too.
repeat
A section within repeat brackets is repeatedly expanded until it fails. The states of both the ConsumableMap and the generated text are as they were at the end of the last successful generation: the final failing generation consumes no values and generates no text. A repeat section does not propagate its final failure, and so always succeeds.
try
A section within try brackets is generated once. If the generation is successful, the text is kept, and the values from the ConsumableMap used to produce it are consumed. If the generation is not successful, no text at all is generated, and the ConsumableMap is restored as it was at the start of the try section. The failure is not propagated, so a try section always succeeds.
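
The success and failure rules of the five control words can be modelled in a few lines. The sketch below is only a model of the semantics described above, not the library's code: ControlSketch, State, Section and the use of null to signal failure are hypothetical, and copying a map of cursors stands in for the cheaper mark() and reject() mechanism. Note that a repeat whose body succeeds without consuming anything would never terminate in this model.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the five control words; not the library's implementation. */
class ControlSketch {
    /** Per-key cursors over value lists; snapshot/restore stands in for mark/reject. */
    static final class State {
        private final Map<String, List<String>> values;
        private Map<String, Integer> pos = new HashMap<>();
        State(Map<String, List<String>> values) { this.values = values; }
        /** Returns the next value for key, or null when none remain. */
        String next(String key) {
            List<String> vs = values.getOrDefault(key, List.of());
            int p = pos.getOrDefault(key, 0);
            if (p >= vs.size()) return null;
            pos.put(key, p + 1);
            return vs.get(p);
        }
        Map<String, Integer> snapshot() { return new HashMap<>(pos); }
        void restore(Map<String, Integer> snap) { pos = snap; }
    }

    /** A section yields its text, or null when its expansion fails. */
    interface Section { String expand(State s); }

    /** Runs body once; a failing body leaves no trace in the state. */
    static String attempt(Section body, State s) {
        Map<String, Integer> snap = s.snapshot();
        String out = body.expand(s);
        if (out == null) s.restore(snap);
        return out;
    }

    /** try: keeps the text on success, yields nothing on failure, always succeeds. */
    static Section tryOnce(Section body) {
        return s -> { String out = attempt(body, s); return out == null ? "" : out; };
    }

    /** repeat: expands until the body fails; the final failure is not propagated. */
    static Section repeat(Section body) {
        return s -> {
            StringBuilder sb = new StringBuilder();
            for (String out = attempt(body, s); out != null; out = attempt(body, s)) sb.append(out);
            return sb.toString();
        };
    }

    /** release: same text and outcome as the body, but the state is always restored. */
    static Section release(Section body) {
        return s -> {
            Map<String, Integer> snap = s.snapshot();
            String out = body.expand(s);
            s.restore(snap);
            return out;
        };
    }

    /** fail: succeeds with no text exactly when the body fails; consumes nothing. */
    static Section fail(Section body) {
        return s -> release(body).expand(s) == null ? "" : null;
    }

    /** ignore: consumes values and propagates failure, but discards the text. */
    static Section ignore(Section body) {
        return s -> attempt(body, s) == null ? null : "";
    }

    public static void main(String[] args) {
        State st = new State(Map.of("item", List.of("a", "b")));
        Section item = s -> {
            String v = s.next("item");
            return v == null ? null : "<li>" + v + "</li>";
        };
        System.out.println(repeat(item).expand(st)); // prints <li>a</li><li>b</li>
    }
}
```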

Defining New Control Sequences

You can define your own control sequences by inserting them as values in a Map from Strings to objects implementing the ControlTemplate interface, and passing that Map as the second parameter to the two-parameter Template constructor. When your ControlTemplate object is referenced in the process of template generation, the single method defined in the interface, tryGenerate(), is called. It is passed the ConsumableMap, a TemplateRecord object constructed from the template text inside your object's control brackets, and a StringChunk object holding the entire text generated so far. If the keys you insert in the map collide with the existing control template keys described above (such as "fail" and "ignore"), an IllegalArgumentException will be thrown during the construction of the Template.

Example Template

Here is an example template used to create a fragment of HTML. It starts with a comment, and a demonstration of the escape sequence used to cause a chunk of white space to be ignored (which makes producing readable templates a lot easier). The main body of the template does little more than substitute in a few values. %:try(...%) is used in places where those values are optional and so may not be present in the map. %:release(...%) is used for a value that is substituted in twice, so that the first substitution doesn't permanently remove the value we need to use again.

%< Produce input element with default value where possible
inside a table line: note loss of white space after final 
% in this line%>%


<tr><td><label for="%:release(%$name%%)">%$label%</label></td>
<td>%:try(%$mandatoryMarker%%)</td>
<td><input type="%$type%" name="%$name%" %:try(value="%$value%"%)
%:try(checked="%$checked%"
%)/></td></tr>%