## Jupyter at Bryn Mawr College

Public notebooks: /services/public/dblank / CS371 Cognitive Science / 2016-Fall

# Lab 06: Evolving Brains

## CS371: Cognitive Science Bryn Mawr College, Fall 2016 Prof. Blank

This is a team-based project.

Please put your "# title" and "## by-line" in the next two cells (respectively), and then delete this cell

This lab explores using evolution to find a good Ladybug DNA. First, we have all of our imports and setup that we will need:

In [1]:
from calysto.simulation import DiscreteView, get_robot, DNARobot
from calysto.display import display, clear_output
import random
import string
import copy

%matplotlib inline


For these experiments, we will be using the DNA representation that is:

• exactly 93 characters long
• made up of the characters "0123456789"

Remember that exactly what these 93 bits of DNA are is unknown to us, as is the meaning of each character. We don't know how these DNA encode a Ladybug brain, but we know that somehow they do.

Our mission is to search through the space of possible Ladybug brains and find a good one, if not the best.

# Genetic Algorithm¶

Evolutionary algorithms are techniques for searching through a solution landscape in a generally effective manner.

All evolutionary strategies have a core set of properties:

• A "population" of solutions
• Limited resources (not all solutions will survive)
• A measure of performance, called ''fitness''
• Preferential treatment toward the higher measured solutions (the most fit ones)
• Reliance on randomness

A typical evolutionary system follows this basic algorithm:

1. Create a random population of solutions
2. Compute a fitness measure for each
3. Create new members by mutating and/or combining old ones (often selecting the better ones based on fitness)
4. Select the most fit for the next generation
5. Go to Step #3

## Randomized DNA Function¶

First, we make a simple function to make a random DNA string:

In [2]:
alphabet = "0123456789"

def make_dna(length):
return "".join([random.choice(alphabet) for i in range(length)])

In [3]:
dna = make_dna(93)
dna

Out[3]:
'667725087231604835114936588685531632573346010361182621852439651692351185085263856981249486522'

## 1.2 Fitness Function¶

Let's let the ladybug move about for a few simulated seconds, or until it dies (whichever happens first). We'll take the fitness to be the average amount of energy that it has over that time.

To make this a bit easier on our ladybug, we'll add an extra cluster of food around it when we test the ladybug.

First, we make a simulation object and get the simulated ladybug:

In [4]:
sim = DiscreteView("Ladybug1")
robot = get_robot()


Nothing is visible, but it exists. This is to make the simulation run fast. Whenever we want, we can see where the ladybug is:

In [5]:
sim.render()

Out[5]:

Now, we can define a fitness function. This is a little complicated, because we are using some of the internals of the simulation.

compute_fitness() does the following:

• takes a DNA string
• creates a DNARobot (ie, the ladybug) which needs the original robot, and 93 character DNA
• resets the simulation
• let's it run for some seconds, or until it dies
• computes the fitness avg energy over the run
• returns fitness
In [6]:
def compute_fitness(dna, seconds, show=False):
new_robot = DNARobot(robot, dna)
sim.setRobot(0, new_robot)
sim.reset()
while new_robot.energy > 0 and sim.clock < seconds:
sim.step()
if show:
clear_output(wait=True)
display(sim.render())
fitness = sum(new_robot.history)/len(new_robot.history)
return fitness


Now we try it:

In [7]:
%%time
dna = make_dna(93)
print(dna)
print(compute_fitness(dna, 3, True))

92.5
CPU times: user 2.85 s, sys: 14 ms, total: 2.87 s
Wall time: 3.11 s

In [8]:
%%time
print(dna)
print(compute_fitness(dna, 3, False))

198996236972211213555226682775354052437519977701384932237059263031245677270529281530796359230
92.5
CPU times: user 552 ms, sys: 5 ms, total: 557 ms
Wall time: 799 ms


With the display on, it will take about the number of seconds run. Without the display, the actual time taken is less that seconds given.

## 1.3 A Mutate Function¶

For all of the search methods (besides Greedy search) we need a method of slightly changing the DNA.

For some search algorithms, we will need a variable amount of mutation. So, we will make our mutate function take two arguments: the DNA to mutate, and an amount. The amount will be the number of mutations to make.

In [9]:
def mutate(dna, amount):
dna_list = list(dna)
### BEGIN SOLUTION
for i in range(amount):
point = int(random.random() * len(dna))
dna_list[point] = random.choice(alphabet)
### END SOLUTION
return "".join(dna_list)

In [10]:
mutate("XXX", 1)

Out[10]:
'1XX'
In [11]:
mutate(dna, 10)

Out[11]:
'198206236932211213555226682775354052437519977771384932237052263031245677270529885530686359230'

## 1.4 Putting it altogether¶

We define a make_pop() function that will generate a number of DNA. We return them as a list of pairs:

[[0, DNA1], [0, DNA2], ...]


The zero is a place keeper where we will put the fitness in a moment.

In [12]:
def make_pop(size, dna=None):
pop = []
for i in range(size):
if dna is None:
pop.append([0, make_dna(93)])
else:
pop.append([0, mutate(dna, 10)])
return pop

In [13]:
pop = make_pop(3)
pop

Out[13]:
[[0,
'288011181125947895543144113760071612630986198389816490561083032835531858849961664286602626144'],
[0,
'535509792262951458600411106626311989962369722112135552266827753540524375199777013849322370592'],
[0,
'630312456772705292815307963592305019810338376558525912677696262144070735157945904944963440121']]

## How diverse is the population per position?¶

We define a simple function to show the diversity of the population. Along the top will be each position of the DNA. Along the left side is the possible values for a location (a gene). The matrix is filled with the counts of those genes in that location.

In [14]:
def show_diversity(pop):
counts = [{} for i in range(len(pop[0][1]))]
for i in range(len(pop)):
for j in range(len(pop[i][1])):
c = pop[i][1][j]
counts[j][c] = counts[j].get(c, 0) + 1
print("   012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012")
print("   ---------------------------------------------------------------------------------------------")
for number in range(9, -1, -1):
print("%s: " % number, end="")
for position in range(len(counts)):
count = int(counts[position].get(str(number), 0)/len(pop) * 10)
if count == 10:
count = "X"
elif count == 0:
count = "."
print(count, end="")
print()

In [15]:
show_diversity(pop)

   012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
---------------------------------------------------------------------------------------------
9: .....3.3....6...6.......3...3....3.63..3.3.3...3...33.....3..............366..3..3.33......3.
8: .33....3.......3.33...............3.3...33..3.333........33....3.....3.33........33..........
7: ......3..33.3.3........3...3...3..........33...........33..33......3.33...3333..........3....
6: 3.......3.3.......3......33333...3..33..33..3.....3...3633.3.3..............3.33...333.3.3...
5: 3.33...3...3.33.333.3......3....3............33.3.633.3......3.3.36...36.3...3............3..
4: ......3......3.3...3.333...........................3............63..3....3..3...6.63...33..33
3: .6.3................33....6...6......3.63.3..3...3.........3.33.3..3.33.........3...3.33.....
2: 3....3..3336...3.3..........33.....3..3....33..3.3...63...3.3.6....3.............3...36.3..33
1: ....633.33....3....3.33363.....36.6..3....3..33.33..3...3......3....3...6....3.3..........6.3
0: ..333........3.....33.3..3...333.3....6..............3...3..3....33.3.........33.....3...6...

In [16]:
for i in range(len(pop)):
pop[i][0] = compute_fitness(pop[i][1], 3)

In [17]:
pop.sort(reverse=True)

In [18]:
pop

Out[18]:
[[92.5,
'630312456772705292815307963592305019810338376558525912677696262144070735157945904944963440121'],
[92.5,
'535509792262951458600411106626311989962369722112135552266827753540524375199777013849322370592'],
[92.5,
'288011181125947895543144113760071612630986198389816490561083032835531858849961664286602626144']]

## Crossover Function¶

Crossover exchanges pieces of two genomes. Crossover can be a perfect shuffle, uniform, or point-based. Shuffle simply alternates between each parent. Uniform tosses an even coin to see if the gene comes from the mother or the father. Point-based mutation will only crossover at particular points. Shown here is a single point crossover:

To use crossover, we need a method of selecting good parents:

In [19]:
def select(pop):
"""
Given a population of the form [[FITNESS, DNA], ...]
pick one (with better fitness, more likily to be picked),
and return a copy of the DNA.
"""
partsum = 0.0
sumFitness = sum([item[0] for item in pop])
if sumFitness == 0:
raise Exception("Population has a total of zero fitness")
spin = random.random() * sumFitness
index = 0
while index < len(pop) - 1:
fitness = pop[index][0]
if fitness < 0:
raise Exception("Negative fitness in select: " + str(fitness))
partsum += fitness
if partsum >= spin:
break
index += 1
#print("index:", index)
return copy.copy(pop[index][1])

In [20]:
select(pop)

Out[20]:
'630312456772705292815307963592305019810338376558525912677696262144070735157945904944963440121'

Our crossover function will replace a segment of the population, given by percentage:

In [21]:
def crossover(pop, percentage):
"""
Given a population and a percentage, replace
that percentage of the population with new
individuals created from higher-fitness individual.
New individuals replace the worst-performing individuals.
"""
old_pop = copy.copy(pop)
for j in range(int(len(pop) * percentage)):
p1 = select(old_pop)
p2 = select(old_pop)
# Pick a crossover point:
point = random.randint(0, len(pop[j][1]) - 1)
child = []
for i in range(len(pop[j][1])):
if i < point:
child.append(p1[i])
else:
child.append(p2[i])
child = "".join(child)
#print("p1:", p1)
#print("p2:", p2)
#print("c :", child)
#print("replacing", len(pop) - j - 1)
pop[len(pop) - j - 1][1] = child


In the following cell, provide settings for all of the parameters, such as:

generations = 2
simulated_seconds = 3
pop_size = 10
mutations_per_individual = 10
crossover_percentage = .33

In [22]:
### BEGIN SOLUTION
generations = 10
simulated_seconds = 5
pop_size = 10
mutations_per_individual = 10
crossover_percentage = .33
### END SOLUTION


And then run the experiment!

In [23]:
best_fitness = []
pop = make_pop(pop_size)
for generation in range(generations):
print("Computing fitness...")
for i in range(len(pop)):
pop[i][0] = compute_fitness(pop[i][1], simulated_seconds)
print("Sorting...")
pop.sort(reverse=True)
print("Generation:", generation + 1, "Best:", pop[0][0])
print("  Best DNA:", pop[0][1])
print("Mutating...")
best_fitness.append(pop[0][0])
for i in range(2, len(pop)):
pop[i][1] = mutate(pop[i][1], mutations_per_individual)
# Crossover:
crossover(pop, crossover_percentage)

Computing fitness...
Sorting...
Generation: 1 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 2 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 3 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 4 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 5 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 6 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 7 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 8 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 9 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...
Computing fitness...
Sorting...
Generation: 10 Best: 115.75
Best DNA: 199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735
Mutating...


## 1.5 Analysis¶

Now, you can examine different aspects of this evolutionary run.

In [24]:
pop

Out[24]:
[[115.75,
'199777013849322370592630312456772705292815307963592305019810338376558525912677696262144070735'],
[106.44444444444444,
'879961668136608656828582931173558626726394621678393989229862522413204340489912115921688388838'],
[106.44444444444444,
'179201668136608640828501600281522048866996064979382667781462401430204807460557865983683896836'],
[106.44444444444444,
'149201668136608640828512600281522048866996064979382667781462401414204427560857815270686702838'],
[87.5,
'879961668636608640828501600281522048866996064968382667781464401430204800460507869983183896836'],
[87.5,
'879961668636608640828501600281522048866996064968382667781464401430204800460507869983183896836'],
[87.5,
'879961668636608640828501600281522048866996064968382669781464401414204420560857815270196702836'],
[87.5,
'879961668636608640828501600281522048866996064968382669781464401414204420560857815270196702836'],
[87.5,
'879961668636608640828501600281522048866996064968382667781464401414204420560857815270196702836'],
[87.5,
'149201668136608640828501600281522048866996064968382669781464401414204420560857815270196702836']]

What does the diversity of the population's DNA look like?

In [25]:
show_diversity(pop)

   012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
---------------------------------------------------------------------------------------------
9: .1X6.......1.......1....1............1.89....9.2.2.1.4..2...............1.11...124...4..3....
8: 6.......91....9...9.9.1.....8...1..88..1.......7.8..1..8.2....1.....13...1.5..8...3..513119.2
7: .7.111..........1...........1.11.1..1.......1.3......58.........1......2....19....5....51.1..
6: ....6.99.5.99..9.1...1..8....1...1.1.89..818.17....88.....9......1.......8.1..131.1.3.5..3..7
5: ................1.1..9......1.91...1.....1......1....1......1.....11.1.15..36...7...........1
4: .2........1.....8..........1......8......1..8............8.68..9.5..951.4............11......
3: ........1.9.1..1......1.11...1.........1..1....19.11........11.131...1.............3..31...X.
2: ...3.........11....91..2..18...89.1.111....1......9...11...3.11...9...6...1..1...611.....5...
1: 4....9.1.4............17.111.8..........1...1..........18.1...8.6........1..1.16...17........
0: ....3.1......9...9....71.88......81.......81........1.1....1.8...3.9..37..8.2......5...151...


In the following cell, discuss the diversity of the population. Is this what you expected? Did it "converge"? Or is the population fairly diverse?

In the following cell, create a matplotlib graph of the best fitnesses over time.

In [26]:
best_fitness

Out[26]:
[115.75,
115.75,
115.75,
115.75,
115.75,
115.75,
115.75,
115.75,
115.75,
115.75]
In [40]:
import matplotlib.pyplot as plt
### BEGIN SOLUTION
plt.plot(best_fitness)
plt.xlabel("Generation")
plt.ylabel("Fitness")
plt.ylim(50,200)
plt.show()
### END SOLUTION


As part of any experiment, you'll want to examine different solutions. This section explores how you can watch a DNA control the ladybug, and make a movie for showing.

First, we define two functions:

In [31]:
def watch_one(dna, seconds=10):
new_robot = DNARobot(robot, dna)
sim.setRobot(0, new_robot)
sim.reset()
while new_robot.energy > 0 and sim.clock < seconds:
sim.step()
clear_output(wait=True)
display(sim.render())
print("fitness:", sum(new_robot.history)/len(new_robot.history))

def make_movie(dna, name, seconds=10):
new_robot = DNARobot(robot, dna)
sim.setRobot(0, new_robot)
sim.reset()
frame = 0
while new_robot.energy > 0 and sim.clock < seconds:
sim.step()
canvas = sim.render()
canvas.save("%s-%04d.gif" % (name, frame))
frame += 1


watch_one() takes a DNA string, and simply renders it on each step:

In [32]:
dna = '900499520468472606914876192567498206996983559871961745555167134603951243523914525534679075336'

In [33]:
watch_one(dna)

fitness: 74.5

In [34]:
watch_one(pop[0][1])

fitness: 108.88888888888889


make_movie() takes a DNA string and a filename, and makes a bunch of gif files:

In [35]:
make_movie(pop[0][1], "ladybug-0")


You can turn the ladybug-0-*.gif files into a movie named moview-0.gif with this shell command:

In [36]:
! convert -delay 20 -loop 0 ladybug-0-*.gif movie-0.gif


This cell is a markdown cell that shows how to display an animated gif file in your notebook using HTML:

Finally, you may want to delete the extra gif files with this shell command:

In [37]:
! rm ladybug-*.gif