The Longest Word Ladder Puzzle Ever—Wolfram Blog (original) (raw)
UPDATE: The solution to the puzzle and more comments from Jon have been added at the bottom of the post.
On the long flight to the recent Wolfram Technology Conference, I ended up on the puzzle page of a newspaper. My attention was drawn to a word ladder puzzle, where you must fill in a sequence of words from clues, but each word differs from the previous by only a single letter. Here, for example, is a simple puzzle already solved:
| best | from a position of superiority or authority |
|---|---|
| bast | strong woody fibers obtained especially from the phloem of from various plants |
| bash | a vigorous blow |
| bath | a vessel containing liquid in which something is immersed (as to process it or to maintain it at a constant temperature or to lubricate it) |
| math | a science (or group of related sciences) dealing with the logic of quantity and shape and arrangement |
I wasn’t going to do a blog entry on this, as it is a very similar task to my “Exploring Synonym Chains” post that I wrote some time ago, but that changed with a chance conversation at the (excellent) Technology Conference. Proving that one never stops learning, Charles Pooh, one of our graph theory developers, pointed out to me that my synonyms item could have been done much better. I had broken one of the very rules that I wrote about in my “10 Tips for Fast Mathematica Code” entry—”Use built-in functions.” I had effectively re-implemented the built-in Mathematica commands GraphPeriphery and GraphDiameter.
So, armed with these two new functions, let’s find the longest word ladder puzzle that can be made using _Mathematica_‘s English dictionary.
Some word ladder puzzles allow you to add or remove letters, but I am going to look only at the version where all words are the same length. Knowing that there can be no connections between different lengths means that we can be more efficient by considering each word length separately. So I start by generating dictionaries of words of specific lengths. Because I intend to use the words’ definitions as my clues, I will only look at words where I know at least one definition for the word, and I will also exclude words that contain non-letters or capitals (e.g. hyphenated words and proper names). The function caches its result, as we only want to do this once:
![nWords[n_] := nWords[n] = DeleteCases[Cases[WordData[], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]] && Length[WordData[word, "Definitions"]] > 0)], "-Redacted word-"];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In1.png "nWords[n_] := nWords[n] = DeleteCases[Cases[WordData[], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]] && Length[WordData[word, "Definitions"]] > 0)], "-Redacted word-"];")
Next we construct the graph of words that have an edit distance of one, that is, only a single letter difference. The conceptually simple way is…
![wordLadderGraph[n_] := wordLadderGraph[n] = AdjacencyGraph[nWords[n], Outer[EditDistance, nWords[n], nWords[n]] /. p_Integer /; p =!= 1 → 0]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In2.png "wordLadderGraph[n_] := wordLadderGraph[n] = AdjacencyGraph[nWords[n], Outer[EditDistance, nWords[n], nWords[n]] /. p_Integer /; p =!= 1 → 0]")
…but we are unnecessarily checking the edit distance in both directions, when we know it is symmetric. And we are also including “aloof” words (completely unconnected—so named by Donald Knuth because “aloof” is one of them). So this version, while more complicated, is faster and yields simpler graphs:
![wordLadderGraph[n_] := wordLadderGraph[n] = Graph@Flatten@ Last@Reap@ Do[If[EditDistance[nWords[n][[a]], nWords[n][[b]]] == 1, Sow[nWords[n][[a]] ↔ nWords[n][[b]]]], {a, 2, Length[nWords[n]]}, {b, 1, a - 1}];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In3.png "wordLadderGraph[n_] := wordLadderGraph[n] = Graph@Flatten@ Last@Reap@ Do[If[EditDistance[nWords[n][[a]], nWords[n][[b]]] == 1, Sow[nWords[n][[a]] ↔ nWords[n][[b]]]], {a, 2, Length[nWords[n]]}, {b, 1, a - 1}];")
Knuth studied word ladders with five letters and observed that most five-letter words can be connected. We can easily replicate this observation:
![wordLadderGraph[5]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In4.png)

We can count the five-letter aloof words:
![Length[Complement[nWords[5], VertexList[wordLadderGraph[5]]]]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In5.png "Length[Complement[nWords[5], VertexList[wordLadderGraph[5]]]]")

Apparently, Knuth couldn’t tackle the six-letter words at the time, as it was too difficult. But we will analyze all the way up to 23-letter words. It turns out that the disconnectedness increases with word length. Here, for example, is the graph for six letters:
![wordLadderGraph[6]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In6.png)

The number of aloof words also goes up to 2,756 in the graph for six letters.
You can make incredibly long ladders, but ones that behave like bat → cat → fat → hat are rather annoying, because you can get directly from “bat” to “hat” without going through the other steps. So I am going to assert that the only good word ladders follow the shortest path of valid words. This view is supported, I think, by Charles Dodgson (Lewis Carroll), who claimed to have invented word ladders. He did work on the shortest paths in word ladders, including the result that the shortest evolution from “ape” to “man” was six steps. Either because I have a more modern set of words, or because Dodgson needed Mathematica, I make it five steps:
![FindShortestPath[wordLadderGraph[3], ape, man]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In7.png "FindShortestPath[wordLadderGraph[3], ape, man]")

Using the looser definition of “no two consecutive steps may change the same letter,” Games Magazine held a competition in 1993 to find the longest word ladders, and its readers managed 26 steps. But we can do better than that.
We need to find which connected subgraph of which word length has the longest shortest distance between any two of its words. That’s a tough sentence to parse, but the concept is summed up in one of the two commands that Charles Pooh told me about: GraphDiameter.
![diameters = ParallelTable[GraphDiameter[Subgraph[wordLadderGraph[i], #]] & /@ ConnectedComponents[wordLadderGraph[i]], {i, 23}];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In8.png "diameters = ParallelTable[GraphDiameter[Subgraph[wordLadderGraph[i], #]] & /@ ConnectedComponents[wordLadderGraph[i]], {i, 23}];")
There is quite a lot of searching to do, but parallelizing it makes it take under 15 minutes on my laptop. We discover that the longest shortest word ladder is 49 words long:
![Max[diameters]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In9.png)

Because I kept the structure of the data as I generated it, the position of that value will tell me the word length and subgraph number that contains our prize.
![Position[diameters, 49]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In10.png)

This turns out to be the main cluster of connected six-letter words.
![bestGraph = Subgraph[wordLadderGraph[6], ConnectedComponents[wordLadderGraph[6]][[23]]]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In11.png "bestGraph = Subgraph[wordLadderGraph[6], ConnectedComponents[wordLadderGraph[6]][[23]]]")

Now we use the other command that Charles introduced me to. GraphPeriphery finds those vertices that are at the maximal distance from each other.
![GraphPeriphery[bestGraph]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In12.png)

Luckily, we get two results. This means that they must be at either end of the word ladder. If we had more, we would have to figure out which element paired with which to produce the maximal path lengths. The next step is to generate the path:
![puzzle = FindShortestPath[bestGraph, "charge", "comedo"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In13.png "puzzle = FindShortestPath[bestGraph, "charge", "comedo"]")
I am not going to show you the result as that would ruin the puzzle, but here it is highlighted on the graph:
![HighlightGraph[bestGraph, PathGraph[puzzle], GraphHighlightStyle → "Thick"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In14.png "HighlightGraph[bestGraph, PathGraph[puzzle], GraphHighlightStyle → "Thick"]")

We now have to make the clues. I will use the built-in WordData definitions, picking one of them at random. Since lots of words have multiple meanings, you get a slightly different puzzle each time.
Before the result, an admission of cheating and a warning: unfortunately, the solution that this code finds turns out to contain a rather adult-themed word and clue. In an effort to keep the Wolfram Blog family-friendly, I manually removed the word from the dictionary (the redacted word in the first line of code). Run the code for yourself if you want the adult version. Of course, the whole result is dependent on the dictionary you use anyway, though larger dictionaries do not necessarily lead to longer word ladders, since they also increase connectedness.
Here is the family-friendly version; have fun:
![Style[Grid[Transpose[{Join[{"charge"}, Table["", {48}], {"comedo"}], Map[RandomChoice[Last /@ WordData[#, "Definitions"]] &, puzzle]}], Frame → All, Alignment → Left, ItemSize → {{6, 35}, Automatic}, FrameStyle → Gray], FontFamily → "Georgia"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In15.png "Style[Grid[Transpose[{Join[{"charge"}, Table["", {48}], {"comedo"}], Map[RandomChoice[Last /@ WordData[#, "Definitions"]] &, puzzle]}], Frame → All, Alignment → Left, ItemSize → {{6, 35}, Automatic}, FrameStyle → Gray], FontFamily → "Georgia"]")
| charge | a quantity of explosive to be set off at one time |
|---|---|
| exchange or replace with another, usually of the same kind or category | |
| occurring or appearing or singled out by chance | |
| of uncertain outcome; especially fraught with risk | |
| a rhythmical work song originally sung by sailors | |
| small crude shelter used as a dwelling | |
| European scaleless blenny | |
| a simple version of hockey played by children on the streets (or on ice or on a field) using a ball or can as the puck | |
| the characteristic sounds made by a horse | |
| habitually complaining | |
| a person given to excessive complaints and crying and whining | |
| any of numerous small silvery North American cyprinid fishes especially of the genus Notropis | |
| shake, as from cold | |
| a razor powered by an electric motor | |
| someone who has or gives or receives a part or a share | |
| an effigy in the shape of a man to frighten birds away from seeds | |
| an electronic pulse counter used to count pulses that occur too rapidly to be recorded individually | |
| an official who affixes a seal to a document | |
| a person skilled in a particular type of therapy | |
| a machine that cuts the heads off grain and moves them into a wagon | |
| someone who reads proof in order to find errors and mark corrections | |
| give an interpretation or rendition of | |
| an owner of property who receives payment for its use by another person | |
| someone who rants and raves; speaks in a violent or loud manner | |
| an enlisted soldier who serves in the ranks of the armed forces | |
| desire strongly or persistently | |
| a programmer for whom computing is its own reward; may enjoy the challenge of breaking into other computers but does no harm | |
| small striped semiterrestrial eastern American squirrel with cheek pouches | |
| long slender feather on the necks of e.g. turkeys and pheasants | |
| challenge aggressively | |
| (paper making) a frame used to form paper pulp into sheets | |
| (statistics) any of nine points that divided a distribution of ranked scores into equal intervals where each interval contains one-tenth of the scores | |
| make dirty or spotty, as by exposure to air; also used metaphorically | |
| give a definition for the meaning of a word | |
| make more complex, intricate, or richer | |
| express discontent | |
| the act of despoiling a country in warfare | |
| a deep narrow steep-sided valley (especially one formed by running water) | |
| in a raving manner | |
| migratory | |
| capturing cattle or horses with a lasso | |
| brick that is laid sideways at the top of a wall | |
| of the relatively near future | |
| orienting or directing homeward or to a destination | |
| hulled corn with the bran and germ removed | |
| a sermon on a moral or religious topic | |
| having a feeling of home; cozy and comfortable | |
| according with custom or propriety | |
| light and humorous drama with a happy ending | |
| comedo | a black-tipped plug clogging a pore of the skin |
If you want printable versions, here are CDF and PDF versions of the table.
I will post the solution to the puzzle in the comments section of the Wolfram Blog in a couple of weeks.
Download this post as a Computable Document Format (CDF) file.
Update:
The solution to the generated puzzle is:
{“charge”, “change”, “chance”, “chancy”, “chanty”, “shanty”, “shanny”, “shinny”, “whinny”, “whiney”, “whiner”, “shiner”, “shiver”, “shaver”, “sharer”, “scarer”, “scaler”, “sealer”, “healer”, “header”, “reader”, “render”, “renter”, “ranter”, “ranker”, “hanker”, “hacker”, “hackee”, “hackle”, “heckle”, “deckle”, “decile”, “defile”, “define”, “refine”, “repine”, “rapine”, “ravine”, “raving”, “roving”, “roping”, “coping”, “coming”, “homing”, “hominy”, “homily”, “homely”, “comely”, “comedy”, “comedo”}
However, as some of the comments pointed out, the dictionary used for this lacked some obvious modified words such as plurals and verb conjugations. Dropping the requirement that we know definitions for the generated puzzle, and using a much larger dictionary, I have a revised result.
![nWords[n_] := nWords[n] = Cases[Union[WordData[], DictionaryLookup[]], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]])]](https://content.wolfram.com/sites/39/2012/01/WordLadder-Edit-In1.png "nWords[n_] := nWords[n] = Cases[Union[WordData[], DictionaryLookup[]], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]])]")
The larger dictionary provides greater connectivity, so the largest minimal word ladder is a little shorter at 46 words, and interestingly occurs in the seven-letter words.
{“gimlets”, “giblets”, “gibbets”, “gibbers”, “libbers”, “limbers”, “lumbers”, “cumbers”, “cambers”, “campers”, “carpers”, “carters”, “barters”, “batters”, “butters”, “putters”, “puttees”, “putties”, “patties”, “parties”, “parries”, “carries”, “carrier”, “currier”, “curlier”, “burlier”, “bullier”, “bullies”, “bellies”, “jellies”, “jollies”, “collies”, “collins”, “colling”, “coaling”, “coaming”, “foaming”, “flaming”, “flaking”, “fluking”, “fluxing”, “flexing”, “fleeing”, “freeing”, “treeing”, “theeing”}