Word Solver

Status: Implemented | Updated | January 2025

Overview

The word solver uses the bettersolve algorithm designed by Sébastien Migniot. It’s a depth-first search through the grid, designed to fail fast and quickly determine if a given branch either is valid or unsolvable. To do that, the algorithm will choose the best available slot to fill next of available slots. It then gets a list of the best words sorted to fill that slot with, and iterates through them trying.

NOTE: for the purposes of this document we use the term slot to refer to a given clue in either the across or down direction. This is to align with the upstream bettersolve documentation. In all other docs, the term clue is used instead.

The algorithm works as follows:

  1. First, pick the best slot to fill in the grid. “Best” is defined by the heuristic defined below, but effectively is the slot that is the most constrained. This will either find a word that works or fail.

  2. Next, generate a list of words that fits the slot, and sort it by a score. This score is also defined below, but is meant to optimize the chances that the word can be continued against.

  3. Go through each word from the list serially placing it in in the slot and iterating, picking the next best slot to recurse to until all words are exhausted or a solution is found.

One side effect of this algorithm is that the act of placing a word can and does affect the next best slot to choose when iterating, so the algorithm may jump around to different places of the grid when solving.

We don’t cycle through the slots at a given level. If an empty slot can’t be solved than the grid is unsolvable at that point. There’s no point trying the same grid from a different angle.

The solver makes heavy use of the Word List to fill slots. That object is highly-optimized to give a list of possible words for a given grid space.

NOTE: Solving an arbitrary grid is NP-hard. There are grids that this algorithm can’t fill. However

Best Slot Heuristic

We use the following scoring technique to pick the best slot to choose for a given iteration.

  1. The more known letters the better, i.e. EX.UIS.TE seems easier than T....D

  2. The fewer unknown letters the better, i.e. EX.T seems easier than EXT.....

  3. The fewer candidates the better, i.e. ...ZZ s easier than E...T

  4. The more crossings the better, i.e. choose to fail fast

We compare these heuristics serially, so a slot with more known letters than any other will always be picked first no matter what other characteristics it has.

Note that heuristic three requires using the word-list to calculate the number of candidates. That can be a (relatively) expensive operation, so we’re careful to check the first two heuristics first and only calculate that on the slots that have identical results.

Word Fit Heuristics

Words are sorted by a score based on how much they expand the grid. We are calling this word fit. The word-score is a little complex: For each crossing slot, we look at the letter in the word and see what percentage of words have that latter at the crossing index. A word’s score is the sum of all these percentages.

This is confusing so consider this grid:

##?#
CA??
##??
##??
##??

In the example above, the the main slot being solved would be the word CA??, and the crossing slots would have the filters F???K and ???S. The index of the first crossing slot would be 2, and the crossing index would be 1. The index of the second crossing slot would be 3, and the crossing index would be 0.

To generate the list of candidates, first we get the list of all words with the filter CA??. If we wanted to score the word CARS from this list, we would see what the frequency of the letter ‘R’ is the second letter of 5 letter words, and add it to the frequency of ‘S’ in the first letter of 4 letter words.

Expansion

The Slot Heuristics seem to be working well, but the Word Heuristics needs tweaking. We are able to fill grids fast — maybe too fast. We are currently generating cryptic grids where every crossing letter is either a vowel or extremely common letters (all “S” and “T”s). The next steps are to try and generate a good grid, not just find the first possible match. To do that, we intend to compose the word ‘fit’ score with a score on how interesting it is.

See Word Scores for more information.