This is a team-based project.
In the cell below, add your by-line, and delete this cell:
In this lab we will attempt to replicate the results from "Finding Structure in Time":
https://crl.ucsd.edu/~elman/Papers/fsit.pdf
First, we need some text. For this demo, I'll make up a short text. For your assignment, you should generate sentences as Elman did in his paper: write a program that generates random sentences from the appropriate grammar (see the sketch after the demo text below).
text = ("me like you you like me me like apples me like bananas "
"you like bananas you like apples you hate berries me "
"like berries me need berries you need apples you need me").strip()
Next, we write some encoding and decoding functions:
text_words = text.split(" ")
words = list(set(text_words))
def encode(word):
    # One-hot vector: all zeros except a 1 at the word's index.
    index = words.index(word)
    binary = [0] * len(words)
    binary[index] = 1
    return binary

def decode(pattern):
    # The winning unit is the one with the highest activation.
    winner = max(pattern)
    index = list(pattern).index(winner)
    return label(index)

def label(index):
    # Look up which word has a 1 at this position of its pattern.
    for word, pattern in patterns.items():
        if pattern[index] == 1:
            return word
    return None
# Each pattern has one unit per word in the vocabulary:
pattern_size = len(encode(words[0]))
patterns = {word: encode(word) for word in words}
text_words is the text corpus, as a list of words. Yours will be too long to display here.
text_words
words
patterns.keys()
Testing our encoding and decoding functions:
encode("need")
decode(encode("need"))
decode([0, 0.6, 0.5, 0, 0.1, 0, 0, 0])
label(1)
And now, we explore reading through a text, predicting what word comes next.
from conx import SRN

class Predict(SRN):
    def initialize_inputs(self):
        pass

    def inputs_size(self):
        # Return the number of input patterns (one per word in the corpus):
        return len(text_words)

    def get_inputs(self, i):
        # Input is the current word; target is the next word (wrapping at the end):
        current_word = text_words[i]
        next_word = text_words[(i + 1) % len(text_words)]
        return (patterns[current_word], patterns[next_word])
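Before training, it can help to verify what get_inputs will feed the network. This quick sanity check (my own, not part of conx) mirrors the method's logic and prints the first few (current word, next word) pairs:
for i in range(3):
    # The target is the word that follows the input word in the corpus:
    print(text_words[i], "->", text_words[(i + 1) % len(text_words)])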
net = Predict(len(encode("need")), 5, len(encode("need")))
net.train(max_training_epochs=2000,
          report_rate=100,
          tolerance=0.3,
          epsilon=0.1)
Now we test the trained network. You may have to train for an amount comparable to what Elman did.
net.test()
That is hard to read. conx comes with a way to override the display of the test input:
net.display_test_input = lambda inputs: print("Input: " + decode(inputs))
net.test()
That is better. But we can also do the same for displaying the outputs:
def display_outputs(outputs, result="Outputs", label=None):
    print(result + ": " + decode(outputs))
net.display_test_output = display_outputs
net.test()
Better! Why is it that sometimes the "output" may be the same as "correct" but still marked as "Incorrect"?
Elman produced "dendrograms" (tree plots) to show the similarity of the hidden activations associated with each word. We can do the same.
%matplotlib inline
import io
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster import hierarchy
from scipy.spatial import distance
We will plot how close each hidden-layer activation pattern is to the others. This is a way of seeing the clustering among numeric representations with many dimensions.
To do this, we need to get the "hidden layer activations" for each word, in a proper order.
net.layer[0].propagate(encode("need"))
Next, we will go through the words in the text, and get the hidden layer activations.
Note that each time, we are overwriting the previous activations. A better method would be to somehow average each set of hidden layer activations (see the sketch after the loop below).
hiddens_dict = {}
for word in text_words:
    hiddens_dict[word] = net.layer[0].propagate(patterns[word])
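If you want to try the averaging idea mentioned above, one possible sketch (my addition, reusing the numpy import from earlier) collects every hidden activation observed for each word and takes their elementwise mean:
from collections import defaultdict

collected = defaultdict(list)
for word in text_words:
    collected[word].append(net.layer[0].propagate(patterns[word]))
# Elementwise mean of all activations collected for each word:
averaged_hiddens = {word: np.mean(acts, axis=0) for word, acts in collected.items()}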
Next, we get those hidden layer activations in the order that matches the "words" list:
hiddens = []
for word in words:
    hiddens.append(hiddens_dict[word])
Now, we are ready to process the hidden layer activations to display as a dendrogram.
http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html
linkage = hierarchy.linkage(hiddens)
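By default, scipy's linkage uses single linkage on Euclidean distances. The distance module imported above lets you make those choices explicit and try alternatives; for example (one variant I picked for illustration, not necessarily what Elman used):
# Condensed pairwise distance matrix, then average-linkage clustering:
condensed = distance.pdist(hiddens, metric="euclidean")
linkage_avg = hierarchy.linkage(condensed, method="average")
You can pass linkage_avg to the dendrogram call below in place of linkage to compare the two trees.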
Let's make the output big enough to see easily (figsize units are inches; the rendered size depends on DPI):
plt.rcParams["figure.figsize"] = (13, 5)
threshold = 0.3
clusters = hierarchy.fcluster(linkage, threshold, criterion="distance")
hierarchy.dendrogram(linkage, color_threshold=threshold, leaf_label_func=label, leaf_rotation=90)
plt.xlabel("Words")
plt.ylabel("Distance")
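The fcluster call above assigns each word to a flat cluster at the chosen threshold, but we never display that assignment. A short loop (my addition) groups the words by cluster id so you can compare against the dendrogram:
from collections import defaultdict

groups = defaultdict(list)
for word, cluster_id in zip(words, clusters):
    groups[cluster_id].append(word)
for cluster_id, members in sorted(groups.items()):
    print(cluster_id, members)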
You may want to explore the options for the dendrogram here:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html
Write about what you have found. You can write at length here: explain the experiment, the results you found, and where they agree or disagree with Elman's results.
As usual, please reflect deeply on this week's lab. What was challenging, easy, or surprising? Connect the topics to what you already know.