Genetic programming on a DSL [Desiderius #16]

Hi desi lovers!

Service notice: Sorry I missed a post last weekend, I was prepping my F# Exchange talk. If you have not read any of my Desi posts, you can save some time and watch my talk that more or less covers posts 1/5, 8, 15 and 16 (yes, that is this one) If you want to just see this post in video form, watch the final 10 minutes of the video.

A second benefit of a DSL

Writing a DSL is nice as my code is now a lot more concise, instead if writing something like this:







I can now write this:





But another benefit is that it is a lot easier to use genetic programming. Last week two weeks ago, I wrote about mutating a program until it returned the right string, but I cheated slightly there by only generating compiling programs, like this:

return "\nfunction helloFSharpLovers() { \n \t return \"" + Outputstring + randChar + "\";\n}";

Obviously generating totally random strings is as effective as letting your cat run over your keyboard until all your unit tests pass. It might happen, but… So more commonly we mutate the AST directly instead of a textual representation. And this is why a DSL is handy, because the syntax is smaller, we can generate programs that make sense more likely.


As an example, let’s consider the Peano axioms. This is the idea of representing of natural numbers by starting with Zero and an increment, typically called Suc. in F#:

type Term = 
       | Suc of Term 
       | Zero

We can calculate with Peano terms too, we can for example add and multiply them:

       | Plus of Term * Term
       | Times of Term * Term

In this world, there are some equalities, like 0 + something = 0. These are useful if you want to simplify terms, for example, before you are evaluating them.

We could write them, but that sounds like a lot of work. Why not generate lots of random rules and then just retain the ones that are correct?? A rule is just a tuple of Terms:

We define an eval function, like this:

   let rec Eval t = 
       match t with 
       //eval to int
       | Zero -> 0
       | Suc x -> 1 + Eval x
       | Plus (a,b) -> Eval a + Eval b
       | Times (a,b) -> Eval a * Eval b

Now, we know what rules are correct, because we can eval both sides of the tuple, and if they eval to the same, this is a good rewriting “tuple”, i.e. it would be fair to replace the left side with the right.

So, we can just generate lots and lots of totally random terms, like this:

   let rec randomTerm() = 
      let randomNumber = rnd.Next(0, 8)  
      match randomNumber with
      | 0 -> Zero
      | 1 -> Suc (randomTerm())
      | 2 -> Plus (randomTerm(), randomTerm())
      | 3 -> Times (randomTerm(), randomTerm())
      | 4 -> VarA
      | 5 -> VarB
      | _ -> Zero

This randomized between 0 and 8 by the way instead of 0 and 6 because otherwise the terms get waaaay too long. Okay, so this makes a random term, we can just generate lots of pairs of them and then filter out the ones that are wrong:

      let mutateN x: RulesList = 
         List.ofSeq (seq {for i in 1 .. x -> randomRule()}) |>
         List.filter RuleUseful |> 
         List.filter RuleCorrect

Just add a nice pretty printer, stir a little, and we get this!!


We can even use this rewriting system on terms:

   let rec applyoneRule (lhs : Term, rhs : Term)(input:Term) = 
      match input with 
      | a when a = lhs -> rhs
      | Plus (a,b) -> Plus (applyoneRule (lhs,rhs) a , applyoneRule (lhs,rhs) b)
      | Times (a,b) -> Times (applyoneRule (lhs,rhs) a , applyoneRule (lhs,rhs) b)
      | Suc a -> Suc (applyoneRule (lhs,rhs) a)
      | Zero -> Zero
      | VarA -> VarA
      | VarB -> VarB 

   let rec rewrite (rules : RulesList) (t:Term) = 
      match rules with
      | [] -> t
      | h :: tail -> rewrite tail (applyoneRule h t)

So, whatever we synthesize, can be applied. No thinking needed!! See you next week (I promise I will get my act together)

PS. The code is on GitHub.