Text Generation

Monday, October 10, 2005

Backgrounders

See here and here.

Ingredients

The text generation mechanism is provided by the CTextGenerator class in Gregor.Core. Starting from an arbitrary root object, an object graph is created by CObjectGraph.CreateWithPathInfo. The object graph is then processed further, creating text files from literals, code expressions, and simple textual templates (which in turn support code expressions). All of these are defined within the path information XML document.

Templates support expression replacement as well. Graph creation, code expression evaluation, and template processing all require an object implementing ICodeInterpreter2 (typically the .NET Console).

A Most Basic Example

Here's the XML document with (not so much) path information (PathInfo.xml):

<Root outfilename="Result.txt"
      literal="Hello, Literal!"
      expression="System.DateTime.Now"
      template="Template.htm" />

Here's the template file (Template.htm):

<p>
Hello, Template!
</p>

Let's run it in WebEdit.NET's Console window, using the Code Interpreter:

code:
info = new CTextGenerationInfo(new object())
info.Interpreter = new Gregor.NetConsole.Engine.CCurlyInterpreter()
info.PathInfoDocument = Dev.ReadXmlFile(Vars.DocPath) // assuming PathInfo.xml is active
info.InputDirectory = Vars.DocFolderPath              // assuming any doc is active
info.OutputDirectory = Vars.DocFolderPath             // assuming any doc is active
generator = new CTextGenerator()
generator.Generate(info)

A few notes:

Using Conditions

In the path info document, which is used by the graphing API, conditions can be expressed with filter, condition, and match attributes. See here.

Templates, as used by the base CTextGenerator class, support conditions only indirectly, by calling appropriate routines. Such routines may be tailored to one specific purpose; a more general approach is to pass in boolean flags, as with CGraphContext.EchoIf(bool, object). The graph context is accessible via the Context variable. For example:

$Context.EchoIf(Context.CurrentNode.IsFirst, "<table>")%
<tr><td>Customer.Name</td><td>Customer.Phone</td></tr>
$Context.EchoIf(Context.CurrentNode.IsLast, "</table>")%
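
To make the first/last pattern concrete outside of the template engine, here is a minimal Python sketch. It does not use Gregor.Core: echo_if and render_rows are hypothetical stand-ins for CGraphContext.EchoIf and for the per-node template expansion, and the customer data is invented for illustration.

```python
def echo_if(flag, value):
    """Return str(value) when flag is true, otherwise an empty string."""
    return str(value) if flag else ""


def render_rows(customers):
    """Expand the same row template once per item, wrapping the run in one table."""
    parts = []
    last = len(customers) - 1
    for i, (name, phone) in enumerate(customers):
        parts.append(echo_if(i == 0, "<table>\n"))       # only before the first row
        parts.append(f"<tr><td>{name}</td><td>{phone}</td></tr>\n")
        parts.append(echo_if(i == last, "</table>\n"))   # only after the last row
    return "".join(parts)


# Illustrative data, not taken from the original post.
html = render_rows([("Ann", "555-0100"), ("Bob", "555-0101")])
```

The point of the design is that the template stays a single flat row template; the opening and closing tags are emitted by the flags rather than by a separate wrapper template.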

Grouping Collections

I can think of four possible ways of grouping:

The first two are condition-based: it's simply conditional output that doesn't change the data structure, which remains a flat list. The other two are hierarchical: groups are based on the tree structure.
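
The difference between the two families can be sketched in Python. This is not Gregor.Core code; the data and the function names condition_based and hierarchical are invented for illustration. The condition-based variant walks a flat, sorted list and emits a heading whenever the key changes; the hierarchical variant first builds a tree of groups and then walks it.

```python
from itertools import groupby

# Flat list of (city, customer) pairs, already sorted by city (illustrative data).
rows = [("Berlin", "Ann"), ("Berlin", "Bob"), ("Vienna", "Cid")]


def condition_based(rows):
    """Flat traversal: emit a group heading only when the key changes."""
    out, previous = [], None
    for city, name in rows:
        if city != previous:          # the condition that starts a new group
            out.append(city)
            previous = city
        out.append("  " + name)
    return out


def hierarchical(rows):
    """Build a two-level structure first, then walk parents and children."""
    tree = [(city, [name for _, name in grp])
            for city, grp in groupby(rows, key=lambda r: r[0])]
    out = []
    for city, names in tree:
        out.append(city)
        out.extend("  " + n for n in names)
    return out
```

Both functions produce the same lines for this input; what differs is whether the grouping lives in the output logic or in the data structure.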

Managing Output Files

Let's change the path info file. The idea is to generate a couple of output files, relating to four path nodes each.

<Root>

    <Node select="new object()" outfilename="Result.txt">
        <ChildNode select="new object()" template="Template.txt" />
        <ChildNode select="new object()" template="Template.txt" />
    </Node>

    <Node select="new object()" outfilename="Result.txt">
        <ChildNode select="new object()" template="Template.txt" />
        <ChildNode select="new object()" template="Template.txt" />
    </Node>

    <Node select="new object()" outfilenameexpression="&#x22;DynResult.txt&#x22;">
        <ChildNode select="new object()" template="Template.txt" />
        <ChildNode select="new object()" template="Template.txt" />
    </Node>

    <Node select="new object()" outfilenameexpression="&#x22;DynResult.txt&#x22;">
        <ChildNode select="new object()" template="Template.txt" />
        <ChildNode select="new object()" template="Template.txt" />
    </Node>

</Root>

You can set the output file name with any combination of the outfilenameprefix, outfilename, outfilenameexpression, and outfilenamesuffix attributes. Their values are simply concatenated in the order listed here.
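
The concatenation rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CTextGenerator implementation: build_out_file_name is a hypothetical helper, and the evaluate callable stands in for whatever the code interpreter does with the expression.

```python
def build_out_file_name(attrs, evaluate):
    """Concatenate the four optional parts in the documented order.

    attrs    -- dict of attribute values found on a path node
    evaluate -- callable that evaluates an expression string (hypothetical
                stand-in for the code interpreter)
    """
    parts = [
        attrs.get("outfilenameprefix", ""),
        attrs.get("outfilename", ""),
        evaluate(attrs["outfilenameexpression"]) if "outfilenameexpression" in attrs else "",
        attrs.get("outfilenamesuffix", ""),
    ]
    return "".join(str(p) for p in parts)


# Toy evaluator: treats the expression as a quoted string literal and strips the quotes.
name = build_out_file_name(
    {"outfilenameprefix": "Gen_", "outfilenameexpression": '"Customer"', "outfilenamesuffix": ".txt"},
    evaluate=lambda expr: expr.strip('"'),
)
```

With the toy evaluator, name ends up as Gen_Customer.txt; missing attributes simply contribute an empty string.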

The CTextGenerator class has smart file stream management built in, as can be studied in this example. If a file name is dynamically built, that is, if an outfilenameexpression attribute is present, the stream is closed when that node (not its children!) is done.

TextGen Command-Line Tool

The TextGen utility is part of the FileTools collection (available from the Downloads page). Given an XML path info file, it generates files based on any input.

Tip: within select expressions, you can load additional libraries that provide the object creation or data retrieval services you need:

<Root>
  <Library select="Gregor.Core.Reflect.LoadAssembly(&#x22;System.Data.dll&#x22;)" />
  <Table select="new System.Data.DataTable()" >
    <!-- ... -->
  </Table>
</Root>

See below for notes on the order of evaluation.

Another tip: for reusing code, you can also define user functions in select expressions:

<Root>
  <Foo select="foo(s){Gregor.Core.Parse.Tighten(s, ' ');}" />
  <Bar select="&#x22;   abc   xyz   &#x22;">
    <Bazz select="foo(Bar)">
       <!-- ... -->
    </Bazz>
  </Bar>
</Root>

WebEdit.NET

In WebEdit.NET, choose File/Template/Open Complex Template ..., then select a path information document (currently preset to a template stored on my web site, which creates a new class along with a type-safe collection) and an output directory.

Have a look at the Resources page for interesting path information documents.

The simplest path info doc looks like this:

<Root literal="Hello, Complex Template!" outfilename="Hello.txt" />

Final Notes

While CTextGenerator processes the path information file in depth-first order, with pre-order traversal, the order in which the underlying object graph is built is undefined (currently: breadth-first, see CGraphNode). Right now, when a node's select attribute is evaluated, the select attributes of its parent and of its preceding siblings have already been evaluated, but not those of the child nodes of sibling nodes, even of preceding siblings.
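
The two orders are easy to compare on a small tree. The following Python sketch is not Gregor.Core code; the tree and its node names are invented, and the breadth-first function only mirrors the current behaviour described above, which may change.

```python
from collections import deque

# Tiny path tree: name -> list of child names (illustrative only).
tree = {"Root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"],
        "A1": [], "A2": [], "B1": []}


def pre_order(node):
    """Depth-first, pre-order: a node first, then each child's subtree in turn."""
    order = [node]
    for child in tree[node]:
        order.extend(pre_order(child))
    return order


def breadth_first(root):
    """Level by level: all siblings before any of their children."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order
```

In the breadth-first order, B is reached before A1 and A2, even though A precedes B. That is why a select expression on B cannot rely on anything defined by the children of its preceding sibling A.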