Today I was issued a challenge, of sorts. Do you know those word games
where you find a chain of words, for example from
each word in the chain differs from the previous word by just one
letter. For example the chain from cat to dog would look like this:
cat, cot, cog, dog.
The original problem was to find if we can get from
Thinking about this program, we can represent the words in a dictionary as a graph. Each word is a node in the graph and if there is a one-letter difference between two words, we create an edge between those two words. Our problem now reduces to finding the shortest path through the graph.
The first thing I did was to find a graph library in Python. This was not difficult, and I chose NetworkX from http://networkx.lanl.gov/. There is a lovely little example of how it is used on the front page:
>>> import networkx as nx >>> G=nx.Graph() >>> G.add_node("spam") >>> G.add_edge(1,2) >>> print(G.nodes()) [1, 2, 'spam'] >>> print(G.edges()) [(1, 2)]
and it turns out that the example contains about 90% of the functionality that we need for our solution.
Here is the solution that I came up with:
import sys import networkx as nx start_word = sys.argv end_word = sys.argv if len(start_word) != len(end_word): print 'words are not the same length.' sys.exit(1) l = len(start_word) d = set() # Dictionary of words we've added to our graph g = nx.Graph() def distance(w1, w2): d = 0 for i in range(len(w1)): if w1[i] != w2[i]: d += 1 return d for word in open('/usr/share/dict/words'): word = word.strip() if len(word) == l: d.add(word) g.add_node(word) for w in d: if distance(word, w) == 1: g.add_edge(word, w) try: print nx.shortest_path(g, start_word, end_word) except: print 'Could not find a chain from ', start_word, 'to', end_word
Here is a short description of how the program works:
- We create an empty graph g.
- We go through each word in the dictionary and add it to our graph.
- For convenience, we also add it to a set of words that we’ve seen.
- We compare each new word with all the words that we’ve seen up to this point. We use a function called distance that returns the number of characters that must be changed to transform the first word to the second. If the distance is 1, we add an edge between the two words.
- Once we’ve built the graph, we use the shortest path algorithm to find the shortest path in the graph between the start word and the end word.
It turns out that we can’t get from weather to climate (there are no words in the dictionary that can be made by changing a single letter of climate, so it was a non-starter (or I suppose a non-ender).
However, we can get from
white. I will leave that as an
exercise to the reader.