Evolutionary Algorithms – Directing the undirected

This is a follow-up to my previous post on the same topic: http://rogeralsing.com/2010/07/29/evolutionary-algorithms-problems-with-directed-evolution/

I started thinking about possible solutions after I published my last post, and I think I might have something that could work.

In order to harness the full power of evolution, we need to be able to simulate undirected evolution.
Undirected evolution requires more than one environment, so that organisms can evolve and niche into specific environments instead of just competing for the same environment.

In our case, the “environment” is the “problem” that we want to solve.
The more fit an organism is in our environment, the better it is at solving our given problem.

So far, I think everyone has been applying evolutionary/genetic algorithms to individual problems, evolving algorithms/solutions for a single purpose, and thus experiencing the problems of irreducible complexity.

But what if we were to introduce an entire “world” of problems?

If we have a shared “world” where we can introduce our new problems, the problem would be like an island in this world, and this island would be a new environment where existing organisms can niche into.
This way, we could see organisms re-use solutions from other problems, and with crossover we could see combinations of multiple solutions for other problems.

The solutions would of course have to be generic enough to handle pretty much every kind of algorithm, so I guess the DNA of the organisms needs to be very close to a real general-purpose language.
Possibly something like serialized Lisp/Clojure, running in a sandboxed environment…
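As a toy illustration of what such a Lisp-like genome might look like, here is a sketch in Python, with nested lists standing in for serialized s-expressions. The operator set, the mutation rule, and the tiny evaluator are all my own assumptions, not a worked-out design:

```python
import random

# Toy sketch: a genome as a nested s-expression (Python lists standing in
# for serialized Lisp), with point mutation swapping operators.
# The operator set is an illustrative assumption.

OPS = ["+", "-", "*"]

def mutate(expr, rate=0.2):
    """Return a copy of `expr` with operators randomly swapped."""
    if isinstance(expr, list):
        op, *args = expr
        new_op = random.choice(OPS) if random.random() < rate else op
        return [new_op] + [mutate(a, rate) for a in args]
    return expr  # leaves (variables/constants) are left untouched here

def evaluate(expr, env):
    """Interpret the s-expression in a tiny sandboxed evaluator."""
    if isinstance(expr, list):
        op, *args = expr
        vals = [evaluate(a, env) for a in args]
        if op == "+": return vals[0] + vals[1]
        if op == "-": return vals[0] - vals[1]
        if op == "*": return vals[0] * vals[1]
    return env.get(expr, expr)  # variable lookup, or a literal as-is

genome = ["*", "x", ["+", "x", "y"]]       # x * (x + y)
mutated = mutate(genome)                   # a slightly changed offspring
print(evaluate(genome, {"x": 2, "y": 3}))  # → 10
```

The point of the list representation is that crossover and mutation become simple tree operations, and the evaluator itself acts as the sandbox.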

The more problems we add to this “world”, the better it would become at solving harder problems, since it can reuse existing solutions.

The structure of it all would be something like:

The “World” is the container for “Problems”.
“Problems” contain input/output sampling and populations of organisms; thus, each problem has its own ecosystem.
“Organisms” evolve by mutation and genetic crossover; they can also migrate to other “problems” from time to time.
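The structure above can be sketched as a classic island model. This is only a minimal sketch in Python; the class names, the fitness interface, and the migration rate are my own assumptions:

```python
import random

# Hypothetical sketch of the World/Problem/Organism structure described
# above. The selection scheme and migration rate are illustrative
# assumptions, not a worked-out design.

class Problem:
    """An 'island': one fitness environment with its own population."""
    def __init__(self, name, fitness):
        self.name = name
        self.fitness = fitness  # function: organism -> score
        self.population = []

class World:
    """Container for problems; occasionally migrates organisms between them."""
    def __init__(self, problems, migration_rate=0.01):
        self.problems = problems
        self.migration_rate = migration_rate

    def step(self):
        for problem in self.problems:
            # Local, directed selection on each island: keep the fitter half...
            problem.population.sort(key=problem.fitness, reverse=True)
            problem.population = problem.population[: max(1, len(problem.population) // 2)]
            # ...with rare, undirected migration between islands.
            if random.random() < self.migration_rate and problem.population:
                target = random.choice(self.problems)
                target.population.append(random.choice(problem.population))

# Toy usage: organisms are plain numbers, fitness is the number itself.
island = Problem("toy-island", fitness=lambda org: org)
island.population = [3, 1, 4, 1, 5]
world = World([island], migration_rate=0.0)
world.step()
print(island.population)  # → [5, 4]
```

The key property is exactly the one described below: selection is directed per island, while migration between islands is undirected.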

This way, an organism from the “SSH encryption island” may migrate over to the “bank authentication login code generator island” and possibly be used as a module in one of the branches of one of the organisms there, thus removing “irreducible complexity” from the equation.
Evolution would be locally directed and globally undirected…

I think this could work, at least to some extent. Or what do you think?


Evolutionary Algorithms – Problems with Directed Evolution

Creationists often use “irreducible complexity” as an argument against evolution.
i.e. you need all parts in place and functioning before the “whole” can do its work, and thus reproduce and spread its features.

The bacterial flagellum is one such feature that has been attributed with “irreducible complexity”.
You need the “tail” and some “engine” parts in place before it can be used as a propeller and drive the bacterium forward.

Evolutionists have however proven this wrong, showing that each of these parts had other purposes before they were re-used/combined for propulsion, so each part was already present.

The key here is that evolution in reality is not directed toward a “final goal”; it simply makes organisms adapt to their current environment.
For example, an organism might evolve a single pair of legs, and later generations might get more of those legs if that is beneficial.
The front pair of legs might even later evolve into a pair of arms that allow the organism to grab food while it eats, and so on.

In short, existing features can be reused and refined in order to reach a higher fitness level in the current environment.

As far as I know, we still have not managed to accomplish this sort of “undirected evolution” in computer programs in the same sense.
If we make a program that is supposed to come up with a solution for a given problem, we would use “directed evolution” and try to breed solutions that are better and better at solving that problem.
So if our program was supposed to come up with a propulsion system for a body, it would fail at evolving the bacterial flagellum, since we experience the effects of irreducible complexity: our program is unable to evolve all the parts for any reason other than moving the body forward.
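To make “directed evolution” concrete, here is a minimal sketch in Python. Every selection decision is measured against a single fixed goal; the target string, alphabet, and hill-climbing scheme are illustrative assumptions, not anyone's real system:

```python
import random

# Minimal sketch of *directed* evolution: every accepted mutation is
# judged against one fixed goal (matching a target string). Nothing can
# survive for any other reason, which is exactly the limitation discussed
# in the text.

TARGET = "forward motion"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(candidate):
    """Count positions that already match the one and only goal."""
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate):
    """Replace one random character."""
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]

random.seed(1)
best = "".join(random.choice(ALPHABET) for _ in TARGET)
while fitness(best) < len(TARGET):
    child = mutate(best)
    if fitness(child) >= fitness(best):  # keep only goal-directed changes
        best = child
print(best)  # → "forward motion"
```

A feature that does not improve this one score is immediately discarded, which is why such a setup cannot keep “spare parts” around for later reuse.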

In order to harness the full power of evolution in computer programs, we need to be able to simulate “undirected evolution” so that we can evolve all these parts that later can be re-used for other purposes.

Is there any research going on in this topic at all?

I know that the old “Tierra” simulation was sort of undirected: the only goal was to consume as much CPU as possible, but it sure could use undirected evolution to get to that goal.

But other than that, anything?

Genetic Programming: Evolving Domain Logic à la CQRS

This was supposed to be my contribution to the first of April (sorry guys),
but apparently people are already doing things similar to this, for real, which makes this post quite a bit less fun.

So maybe the April fool is on me if someone implements this :-)

Most samples of genetic/evolutionary programming evolve formulas, e.g. a single mathematical formula that evolves to fit a given set of input and output parameters.
(see http://rogeralsing.com/2010/02/14/genetic-programming-code-smarter-than-you/)
While that is interesting, it would be much more interesting and profitable if one could evolve an entire application, or at least some parts of it.

If you are into domain-driven design, you have probably seen that the latest buzz is CQRS, Command Query Responsibility Segregation.
In short, you let your domain model raise events that can be consumed by a query/reporting store.
Thus, allowing the query and command sides to be separated.

So what does this have to do with evolving code?
Well, if you want to evolve code, you need some way to verify its fitness: how well the generated code solves your problem.
This is hard, if not impossible, to do for a normal application; there is no natural integration point where you can see if the application does what you expect.

However, with domain logic à la CQRS, you have domain events, and you can easily set up a set of expected event sequences for a given use case.


Given that:

    CustomerId = 123
    NewName = "Foo"

Given that:

    CustomerId = 345
    NewName = "Bar"

Or preferably a more complex scenario:

Given that:

    OrderId = 1
    CustomerId = 2
        ProductId = 50050
        Quantity = 10
        ProductId = 1024
        Quantity = 5
        ProductId = 50040
        Quantity = 2

This way, you can compare the expected output with the output of the generated code.
You simply score the events based on whether they match the expected types and whether they contain the correct data.
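Such scoring could look something like the following sketch in Python. The event representation (a type name plus a data dictionary), the event name, and the point weights are my own assumptions, not the actual POC implementation:

```python
# Hypothetical sketch: score a generated event stream against the expected
# one. The event names and scoring weights are illustrative assumptions.

def event_fitness(expected, actual):
    """Score how well `actual` events match `expected` ones.

    Each event is a (type_name, data_dict) pair. A matching type scores 1
    point, plus 1 point per field carrying the correct value.
    """
    score = 0
    for (exp_type, exp_data), (act_type, act_data) in zip(expected, actual):
        if exp_type == act_type:
            score += 1
            score += sum(1 for k, v in exp_data.items() if act_data.get(k) == v)
    return score

expected = [("CustomerRenamed", {"CustomerId": 123, "NewName": "Foo"})]
perfect  = [("CustomerRenamed", {"CustomerId": 123, "NewName": "Foo"})]
wrong    = [("CustomerRenamed", {"CustomerId": 123, "NewName": "Bar"})]

print(event_fitness(expected, perfect))  # → 3 (type + both fields)
print(event_fitness(expected, wrong))    # → 2 (type + one field)
```

Partial credit matters here: a candidate that raises the right event type with one wrong field is closer to the goal than one raising the wrong event entirely, and the evolver needs that gradient.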

You will still have to create skeleton code for your entities and event types and describe the expected output, but the actual implementation can be generated by evolution.
That is, it will evolve the command methods that perform domain logic and change state in your model.

In my experiment it takes on average 7-9 minutes (about 43,000 generations) for my POC generator to come up with C# code that solves the given problems, per command method.

IMO this is groundbreaking, since the code doesn't have to be tested (you already know it fulfills your expectations), and even in complex cases the generator is usually faster than most junior developers.

A very nice benefit of evolution is that if you later get some changed requirements, or some special cases, the evolver can simply continue from the current code and incrementally add the code for the new requirements.

Since the fitness factor currently doesn’t include how complex the generated code is, the output can be somewhat verbose.
Actual sample:

public void Rename(string newName)
{
    this.Name = (newName + "a¤").Substring(0, newName.Length - 9 + 5 + 4);
}

This can be solved by giving short code higher fitness than equal but more verbose code.
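This kind of parsimony pressure can be sketched in a few lines of Python; the penalty weight and the candidate strings below are purely illustrative:

```python
# Illustrative sketch of parsimony pressure: among candidates with equal
# behavioral fitness, prefer shorter code. The penalty weight is an
# assumption; it must stay small enough that correctness always dominates.

def scored(candidates, behavior_fitness, length_penalty=0.001):
    """Return candidates sorted best-first by fitness minus a size penalty."""
    return sorted(
        candidates,
        key=lambda code: behavior_fitness(code) - length_penalty * len(code),
        reverse=True,
    )

# Two hypothetical programs with identical behavior:
verbose = 'this.Name = (newName + "a").Substring(0, newName.Length - 9 + 5 + 4);'
concise = 'this.Name = newName;'

best = scored([verbose, concise], behavior_fitness=lambda code: 1.0)
print(best[0])  # the concise version wins on equal behavior
```

With a tiny penalty like this, a shorter program only outranks a longer one when both pass the same event scenarios, so correctness is never traded away for brevity.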

That's all for now.

Genetic Programming: Code smarter than you!

Here in Sweden there is currently an email circulating with a challenge to solve a math puzzle.
It's nothing special really, but the mail goes something like this (translated):
(Don't blame me for the claims in the quote; they're not my words)

It’s said that only people with an IQ over 120 can solve the following problem:

If we assume that:
2 + 3 = 10
7 + 2 = 63
6 + 5 = 66
8 + 4 = 96

How much is?
9 + 7 = ????

The mail contains a password-protected Excel file that can only be opened if you know the answer to the above.

Just for the hell of it I entered the problem into my old genetic expression evolver:

The application is based on genetic programming and uses genetic crossover and a real population (unlike my EvoLisa sample).

The problem was described like this:
problem.Cases.Add(new Case(2, 3, 10));
problem.Cases.Add(new Case(7, 2, 63));
problem.Cases.Add(new Case(6, 5, 66));
problem.Cases.Add(new Case(8, 4, 96));

The first and second arguments are variable values and the last argument is the expected output.

And here is the output of the application:

As you can see in the screenshot, the application has solved the equation in 250 generations (a few milliseconds).
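For reference, one formula that fits all four cases (the evolver may of course express it differently) is a * (a + b), which a quick sketch in Python confirms:

```python
# Quick check (not the evolver itself): a * (a + b) fits all four cases,
# e.g. 2 * (2 + 3) = 10 and 8 * (8 + 4) = 96.

def f(a, b):
    return a * (a + b)

cases = [(2, 3, 10), (7, 2, 63), (6, 5, 66), (8, 4, 96)]
assert all(f(a, b) == expected for a, b, expected in cases)
print(f(9, 7))  # → 144
```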

That’s probably faster than you solved it ;-)

If it makes you feel better, I didn't solve it at all; I curl up in the fetal position on the floor when I see math problems ;-)


Evolutionary Compression

When I first posted the “Evolution of Mona Lisa” post, there were a few questions raised regarding how well an evolutionary approach to image compression would work out.
A few days later, some guys at the NVidia forums started to try out the concept using hardware-accelerated graphics.
That attempt was fairly successful: better than normal JPEG, but worse than JPEG 2000.
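At the core of all these attempts is the same fitness measure: render the candidate genome and compare it pixel by pixel against the target image. Here is a minimal sketch in Python; the rendering step is stubbed out (real implementations rasterize polygons), and the toy images are my own:

```python
# Minimal sketch of the fitness function behind evolutionary image
# compression: score a rendered candidate by summed squared pixel error
# against the target image. Lower is better.

def pixel_fitness(target, rendered):
    """Sum of squared per-channel differences between two RGB images."""
    return sum(
        (t - r) ** 2
        for row_t, row_r in zip(target, rendered)
        for px_t, px_r in zip(row_t, row_r)
        for t, r in zip(px_t, px_r)
    )

# Toy 1x2 "images" of RGB pixels:
target       = [[(255, 0, 0), (0, 0, 0)]]
perfect      = [[(255, 0, 0), (0, 0, 0)]]
off_by_a_bit = [[(250, 0, 0), (0, 0, 10)]]

print(pixel_fitness(target, perfect))       # → 0
print(pixel_fitness(target, off_by_a_bit))  # → 125 (5² + 10²)
```

The compression then comes from storing the evolved genome (a short list of shapes) instead of the pixels themselves.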

Now there is another attempt made by Neil Graham; you can read his article about it here: http://www.screamingduck.com/Article.php?ArticleID=46&Show=ABCE
This attempt seems to deal much better with fine details than the CUDA version did.

I think that the final results shown in that article are awesome, although I would like to have seen some reference images using other compression algorithms with the same data size.