the sudoku affair
In 2006, Ron Jeffries wrote a series of posts describing his attempts to build a Sudoku solver. He began by wrapping a class around a simple datatype for the board — essentially a List[Option[Int]] — and after that, there isn't much to tell. As Peter Seibel puts it:
[H]e basically wandered around for the rest of his five blog postings fiddling with the representation, making it more “object oriented” and then fixing up the tests to work with the new representation and so on until eventually, it seems, he just got bored and gave up, having made only one minor stab at the problem of actually solving puzzles.1
This story has, in some circles, become notorious. There are two reasons for this. The first is that Ron Jeffries is a leading proponent of a post-design approach to software development. Good design, he asserts, is simply the result of keeping your code "properly-factored:"
[Kent] Beck has those rules for properly-factored code: 1) runs all the tests, 2) contains no duplication, 3) expresses every idea you want to express, 4) minimal number of classes and methods. When you work with these rules, you pay attention only to micro-design matters.
When I used to watch Beck do this, I was sure he was really doing macro design "in his head" and just not talking about it, because you can see the design taking shape, but he never seems to be doing anything directed to the design. So I started trying it. What I experience is that I am never doing anything directed to macro design or architecture: just making small changes, removing duplication, improving the expressiveness of little patches of code. Yet the overall design of the system improves. I swear I'm not doing it.2
The second reason is that, in the same week that Jeffries was wandering around, Peter Norvig released a complete Sudoku solver. And in his first fifteen lines of code, he got further than Jeffries:
def cross(A, B):
    "Cross product of elements in A and elements in B."
    return [a+b for a in A for b in B]

digits = '123456789'
rows = 'ABCDEFGHI'
cols = digits
squares = cross(rows, cols)
unitlist = ([cross(rows, c) for c in cols] +
            [cross(r, cols) for r in rows] +
            [cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')])
units = dict((s, [u for u in unitlist if s in u])
             for s in squares)
peers = dict((s, set(sum(units[s],[]))-set([s]))
             for s in squares)
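To make the structure concrete, here is what those maps hold for a single square. (The example is my own, but the values follow directly from the definitions above.)

# The square 'C2' belongs to three units: its column, its row, and the top-left box.
assert units['C2'] == [cross(rows, '2'), cross('C', cols), cross('ABC', '123')]
# Its peers are the 20 distinct squares that share at least one of those units with it.
assert len(peers['C2']) == 20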
If you compare both implementations, Norvig's is notable for its clarity and directness. Even more notable is his chosen datatype. When Jeffries chose List[Option[Int]], he was clearly mimicking how a Sudoku board is rendered on a page: there are 81 cells, and some of them have numbers. In Norvig's code, however, the Sudoku board is represented as a collection of possible moves: Map[Coord, Set[Int]].
An empty board is a collection of full sets; each cell can be assigned any integer in {1, ..., 9}. When a cell is assigned a number, its set is reduced to a singleton, and that number is removed from the sets of all its peers. If, as a result, a peer set is reduced to a singleton, this process repeats.
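A minimal sketch of that propagation step, written against the peers map built above. The names assign and eliminate are mine, and this omits Norvig's second inference rule (a digit with only one remaining place in a unit), so treat it as an outline rather than his implementation:

# An empty board: every square can still take any digit.
values = {s: set(digits) for s in squares}

def assign(values, square, digit):
    """Fix `digit` in `square` by eliminating every other candidate.
    Returns the updated board, or None on contradiction."""
    for other in values[square] - {digit}:
        if eliminate(values, square, other) is None:
            return None
    return values

def eliminate(values, square, digit):
    """Remove `digit` from `square`'s candidates; if a single candidate
    remains, remove it from every peer in turn."""
    if digit not in values[square]:
        return values                      # already gone
    values[square] = values[square] - {digit}
    if not values[square]:
        return None                        # no candidates left: contradiction
    if len(values[square]) == 1:
        last = next(iter(values[square]))
        for peer in peers[square]:
            if eliminate(values, peer, last) is None:
                return None
    return values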
After that, the only thing left was a simple recursive search. At each step, a move was randomly selected from the available options. If that move emptied one of the sets, that line of search was abandoned. And if all the sets were reduced to singletons, then our search was over.
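Here is a sketch of that search, using the assign and eliminate helpers above. (Norvig's actual solver picks the cell with the fewest remaining candidates rather than a random one; the random choice here just mirrors the description in the previous paragraph.)

import random
from copy import deepcopy

def search(values):
    """Depth-first search over the remaining candidates."""
    if values is None:
        return None                                   # contradiction upstream
    if all(len(v) == 1 for v in values.values()):
        return values                                 # every set is a singleton: solved
    square = random.choice([s for s in squares if len(values[s]) > 1])
    for digit in values[square]:
        result = search(assign(deepcopy(values), square, digit))
        if result is not None:
            return result
    return None                                       # every guess failed; backtrack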
While Norvig didn't employ any heuristics in his search, they'd be easy to add. A common strategy in Sudoku is looking for naked pairs: if two cells in a unit have the same two candidate values, those two values can be eliminated from every other cell in that unit. To employ this, all you'd need to do is add a pass that scans the units for matching 2-sets before each guess.
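A sketch of that heuristic over Norvig's representation, using the unitlist defined above (my code, not his): scan each unit for a matching pair of 2-sets, then strip those two digits from the rest of the unit.

def apply_naked_pairs(values):
    """For each unit, if two cells share the same two candidates, eliminate
    those candidates from every other cell in that unit."""
    for unit in unitlist:
        pairs = [s for s in unit if len(values[s]) == 2]
        for i, a in enumerate(pairs):
            for b in pairs[i + 1:]:
                if values[a] == values[b]:
                    for other in unit:
                        if other not in (a, b):
                            # A fuller version would route this through
                            # eliminate() so the removal propagates.
                            values[other] -= values[a]
    return values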
And so Norvig's implementation was more than a toy solution. It was a minimal, but extensible, example of constraint propagation. It provided a foundation for exploring both the problem and the solution.
what it all means
Norvig, for his part, didn't assign any of this much importance. When interviewed by Seibel, he said the key difference was that Jeffries "didn't know how to solve the problem."
I actually knew — from AI — that, well, there's this field of constraint propagation — I know how that works. There's this field of recursive search — I know how that works. And I could see, right from the start, you put these two together, and you could solve this Sudoku thing. He didn't know that so he was sort of blundering in the dark even though all his code "worked" because he had all these test cases.3
I agree, but would take it a little further. Both Norvig and Jeffries are genre programmers; they have spent most of their careers solving a specific kind of problem. And that problem, inevitably, lends itself to a particular kind of solution.
Peter Norvig's genre is search. He literally wrote the book on good old-fashioned AI, where every problem is reduced — for better or worse — to a search problem.
Ron Jeffries' genre is, as best I can tell, the database application. Like the rest of the Agile Manifesto co-authors, he came up in an era where every business was seeking to "computerize" its processes. This led to a decade's worth of applications consisting of a database, a thin layer of business logic, and an even thinner frontend.
There is, in these applications, a close relationship between the database schema and user interface. Consider the scaffolding provided by the Rails framework: you describe an entity, and it generates the code necessary to view and change those entities. This is why Jeffries chose the List[Option[Int]] representation; it mimicked how a Sudoku board is presented to its user.
This choice is not remarked upon. It is, to Jeffries, simply the obvious place to start. And I'd imagine that in his professional career, this intuition served him well. But his intuition was developed, and applied, within his chosen genre. Here, he was doing something new; a mystery novelist trying his hand at fantasy. But in the end, it was just a bunch of elves and dwarves in a stately manor, waiting for the wizard to tell them whodunnit.
Jeffries, it should be noted, also assigns little importance to this episode. After weathering fifteen years of online discourse, he wrote this:
Did incremental design and development fail? I don’t think so. Certainly I was using an incremental approach, and certainly no product came out. Did the approach fail?
I don’t think so. I think I wasn’t having fun and just stopped working on the project.4
This is belied, however, by the fact that he returned to Sudoku two years later, writing forty-five new posts on the topic.
By the end of the fifth post, Jeffries had a working Sudoku solver. But then he kept going. He continued to tinker with the solver for another two months and forty posts. And in this prolonged epilogue, we can see the limits of his incremental approach to software design.
back in the saddle point
Jeffries' second attempt at a Sudoku solver begins, predictably, with a List[Option[Int]] representation. From there, he writes a function that calculates the possible values for an empty cell. He writes a simple recursive search function, which always selects the first possible value for the first possible cell. And with that, his solver is complete.
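A minimal sketch of a solver with that shape, assuming the flat-list representation (this is my reconstruction, not Jeffries' code): the board is a list of 81 entries, with None for an empty cell.

def possible_values(board, position):
    """Digits not already used in this cell's row, column, or box."""
    row, col = divmod(position, 9)
    used = set()
    for i, value in enumerate(board):
        if value is None:
            continue
        r, c = divmod(i, 9)
        if r == row or c == col or (r // 3, c // 3) == (row // 3, col // 3):
            used.add(value)
    return [v for v in range(1, 10) if v not in used]

def solve(board):
    """Fill the first empty cell with its first possible value, recursively."""
    if None not in board:
        return board                                       # solved
    position = board.index(None)                           # first empty cell
    for value in possible_values(board, position):
        result = solve(board[:position] + [value] + board[position + 1:])
        if result is not None:
            return result
    return None                                            # dead end; backtrack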
His solution is, from a design perspective, serviceable. It's fewer than a hundred lines, but because of his chosen representation, much of that is spent on integer arithmetic:
def used_numbers_in_sub_grid(self, position):
    first_index_in_row = position // self.line_size * self.line_size
    offset_in_row = position % 9 // 3 * 3
    first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row
    ...
    for row in range(first_index_in_sub_grid, first_index_in_sub_grid+27, 9):
        for col in range(0, 3):
            ...
We can contrast this with the (roughly) equivalent code in Norvig's implementation:
[cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')]
The difference lies in how each cell is represented. In Jeffries' code, each cell is an integer, representing its index in the List. Norvig's code, on the other hand, represents each cell as a string describing its row and column. As a result, he doesn't need to fuss with modulo operators; he can simply use string concatenation.
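One concrete cell in each representation makes the difference plain (a worked example of my own, with line_size == 9):

# Norvig: the cell is the string 'E5' (row 'E', column '5'), and its box is
# just another cross product.
box = cross('DEF', '456')

# Jeffries: the same cell is the 0-based index 40 (row 4, column 4), and
# finding the start of its box takes the arithmetic quoted above.
position = 40
first_index_in_row = position // 9 * 9                                   # 36
offset_in_row = position % 9 // 3 * 3                                    # 3
first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row  # 30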
This, by itself, is not an indictment of Jeffries' approach. His software design is incremental; what matters is the destination. As he explains in his twenty-third post:
Naturally, I try to make good design decisions, not bad ones, although often enough the decisions I make do not pan out. (Yes, I hear you saying that makes your case for coming up with a solid design early, but no, it doesn’t. It makes mine: I don’t know enough to make a better decision.) So I try to make small decisions, simple decisions, decisions that will be easy to change when, not if, a better idea comes along.5
There is, unfortunately, little evidence of this in the preceding posts. After demonstrating a working solver, Jeffries spends the next six posts trying to "simplify" it. He tinkers a bit with the integer arithmetic, and then hides it away in a separate class.
Then, in the twelfth post, things get interesting. While debugging his implementation, he generates an analogue of Norvig's representation: rather than an Option[Int], each cell is a Set[Int] of possible values. And in the thirteenth post, he arrives at the notion of constraint propagation:
Suppose the puzzle contains, not just the current solution state, but also, for each of the 81 cells, the available values for that cell. So when we do our new_puzzle_trying method, we'd:

Create a new puzzle that copied all those values … then
Assign our guess … and then
Recompute the available values for the position's components, which
Can be done either by removing the guess or recalculating, whichever is better.6
This is a pivotal point in Jeffries' design process. He has solved the problem, understood the limitations of that solution, and imagined something entirely different. It is, in other words, a chance to start over.
If Jeffries started with a different core representation, then it's likely his subsequent design decisions would also change. The bookkeeping for constraint propagation might push him towards Norvig's relational approach to the rules of Sudoku; rather than continually recomputing the rows, columns, and boxes, he could simply have a map of each cell onto its peers. He could distill every lesson of the previous posts, creating something simpler and faster.
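A sketch of what that could look like on Jeffries' flat-list representation (hypothetical code, not anything from his posts): compute each index's peers once, up front, and never touch the modular arithmetic again.

def peers_of(index):
    """The 20 other indices that share a row, column, or box with `index`."""
    row, col = divmod(index, 9)
    box_row, box_col = row // 3 * 3, col // 3 * 3
    same_row = {row * 9 + c for c in range(9)}
    same_col = {r * 9 + col for r in range(9)}
    same_box = {(box_row + r) * 9 + (box_col + c)
                for r in range(3) for c in range(3)}
    return (same_row | same_col | same_box) - {index}

index_peers = {i: peers_of(i) for i in range(81)}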
But Jeffries isn't in the business of starting over. He not only believes in incremental design, but in using the smallest possible increments. In his posts, he regularly returns to GeePaw Hill's maxim of "many more much smaller steps." He is only interested in designs that are reachable through a series of small, discrete steps:
In most every article of the 5,000 articles here, at least the programming ones, I have intentionally done minimal design up front. My intention was to demonstrate what happens when I do that, expecting that small steps, tests, and refactoring would enable me to improve the design as needed.... In almost every situation, that has turned out to be the case. Incremental design, at least at the level I do it here, works well. And the techniques thereof allow us to improve any code that needs it.7
This, again, is contradicted by the preceding posts. His attempts to implement constraint propagation had unambiguously failed.
He began in his seventeenth post, adding a List[Set[Int]] as a sidecar to his core representation. But since the incremental updates were, as yet, unimplemented, it had to be recomputed after each move. As a result, his solver became two orders of magnitude slower; his test puzzle, which once took 200ms to solve, now took a full 20 seconds.
Jeffries seems largely unbothered by this. He disables the slow test, and begins to look at search heuristics, under the theory that "moving towards more human approaches"8 might offset the performance loss. He only returns to it ten days and eleven posts later, at which point he's clearly lost the thread. Rather than try to implement the incremental updates, he simply makes it so that the List[Set[Int]] is lazily generated. And since nothing is actually using this new data structure, this "fixes" the issue.
In the twenty-odd posts that follow, Jeffries never returns to constraint propagation. Instead, he putters around with search heuristics like naked pairs; something that, again, Norvig's approach makes fairly trivial. And after the forty-fifth post, Jeffries seems to lose interest. He has, apparently, reached his destination.
The resulting code is, again, serviceable. It solves Sudoku puzzles, and has support for various "human approaches" to solving puzzles. But the implementation9 has a diffuse, muddled quality. In his Puzzle class, some methods refer to the Set[Int] associated with a cell as possible_answers, and others as candidates. Likewise, most method names distinguish between position (index) and cell (value), but Puzzle.unknown_cells returns a list of indices. And while Puzzle began as an immutable representation, somewhere along the way it grew a mutable remove_naked_pair method.
In a larger codebase, these sorts of inconsistencies are inevitable. Despite our best efforts, entropy creeps in. But Jeffries' solver is only a few hundred lines of code, and was refined for months on end. We must treat every line as intentional.
When we say software is simple, we mean it's easy to explain. Well-designed software often has a narrative structure; there is a natural order to its components, and each helps to explain the next. We can see this in Norvig's implementation: it codifies the rules as a set of relationships, and then uses those relationships to solve the problem.
This doesn't happen by accident. The developer needs to hold the entire structure in their head, and find a simple path that connects its constituent parts. And if they can’t find that path, they need to find a better structure.
Jeffries, however, does not believe in bigger pictures; his approach to software design is proudly myopic. He prevents himself from seeing the forest by pressing his face against the trees. And sometimes, as he moves from one tree to the next, he takes a moment to celebrate:
As I refine and refine and refine, the design moves toward smaller objects, with single responsibilities and simple code. The design improves, bit by bit, almost by itself.10
But it doesn't. Software design is a deliberate process, and requires deliberate effort. Anything less is just a shrug in the face of entropy.
Marick 2008, p. 185
Seibel 2009b, p. 200
Unfortunately, Jeffries doesn't provide a complete listing of the final version of his solver. To piece it together, you'll need to read all four of his final review posts.
This post is an excerpt from my (incomplete) book on software design. For more about the book, see the overview.