Structured Program Generation Techniques – Yannis Smaragdakis

Yannis takes a quite broad definition of program generators, including program transformation like refactorings and program synthesis, apart of course from more traditional interpretation like MDE and DSLs.

This is a fascinating topic, says Yannis, what can be more interesting then computing about computation?! 🙂

Program generation is structured when the generator guarantees well-formedness of the generated programs, like the program will parse, or, a bit harder, that it will type-check. Why would we want such a thing? We are going to compile the generated programs anyway, so we’ll figure out if it doesn’t type check soon enough. Well, this is not soon enough for the writer of the generator. So the well-typedness of the generator programs are like test cases for the generator, they help debug the generator.

Program generation basics

There are many different ways to generate a program:

Capture

 

These already have different guarantees: the text-based approach basically promises nothing, while the AST strategy at least guarantees syntactic correctness.

Capture

 

A very popular way is quoting, where the parse tree of an expression is embedded into a statement. Other ways are macros, templates, or special language constructs (like aspects)

As explained above, different strategies have different issues in terms of how structured they are, for exmapl C++ templates can have errors in them that will only be detection at instantiation time, and aspects can result in errors on applications.

Meta-type safe program generation

An example: it is possible to generate a statement with undeclared variables, as such:

Capture

So we need to have a lot of information about the control flow of the generator in order to be able to say anything about the generator programs. Other example of issues that could occur are non-unique or illegal variable names.

Multi-stage programming

A simple way to ensure ensure meta type-safety is to map fragments of the generator one-to-one to the generated program. This sounds very complicated, so let’s have a look at an example, a simple exponential function:

exp(n,a) =
if (a == 0) 1
else n * exp(n, a-1)

We can turn this into a program generator!

exp(n,a) =
if (a == 0) `[1] 
//if a (known at compile time) = 0, then generate the program "1"
else `[#[n]* #[exp(n, a-1)]] 
//else, generate the program "n" times whatever we get for n-1

So:

expr (3,4) generates: 3*3*3*1.

This program will be much faster at runtime than the exponent program. And, it can also be used with a variable:

expr(‘[i]*4) = i*i*i*i*1

This is called ‘staging’: you take an existing program and annotate it to turn it into a program generator and is used in for example MetaML, you decide/guess what you will know at generation time and for those parts, you can generate a statement immediately.

Staging cannot express all generators, for example, it cannot do class declarations or complex control flow, and it is most useful for specializing programs in such a way that as much as possible is pre-evaluated. LMS is a modern incarnation of staging in Scala, that allows for very concise and powerful program generation.

By just changing a type into rep[type] we have made a generator with LMS, example:

def powerS(n: Rep[Int], x: Int): Rep[Int] = {
   if (n == 0) 1
   else if (n % 2 == 0) {
      val result = powerS(n/2, x)
      result * result
   }
   else x * powerS(n-1, x)
}

Now we can call PowerS and it will generate a new function for us:

def pow5(n: Rep[Int]): Rep[Int] = powerS(5,x)

This has what Yannis calls the ‘erasure’ property, if we delete the “Rep”s it will be a regular function.

Morphing

Another way to constrain what a generator can generate is to use ‘morphing’: iterate over known code to produce new code. This is more powerful than staging as it can produce arbitrary types of code. Example:

class MethodLogger<class X> extends X {
<Y*>[meth] for(public int meth (Y) : X.methods)
   int meth (Y a) {
      int i = super.meth(a);
      System.out.println("Returned: " + i);
      return i;
   }
}

The idea of morphing is to write code once and apply everywhere. It is a bit like aspects but more generic. Isn’t this very abstract? Well, Yannis says the type system can do it, which is a lot dumber than a programmer, so we should be able to do it too 🙂

Within MorphJ, the framework Yannis made for morphing Java, the definition of meta type-safety is:

The generic class is verified on its own (not when type-instantiated) and the generator is defined to have a type error if any input can cause an error.

A cliffhanger for the next lesson:

Capture