TL;DR Show me the code : Gist
Tell me how fast it runs : Follows the same trend reported in the original post. For instance, the difficult problem that was solved in 72.27 seconds by the Python code took 45.34 seconds.
EDIT :
After being chastised by many insightful comments on Reddit, I changed the code to use {cpp}auto const&{/cpp} in places that matter, compiled with -O3, and re-ran both Python and C++ versions, and changed the times above to reflect those experiments. I had reported the time from Norvig’s webpage for the Python version which is not at all fair as that was probably on years old machine.
This post chronicles my effort to translate this Sudoku solver written in Python to C++11. (Yes, I know, I should call it C++, since the standard has been ratified, but C++11 feels like a completely new language, so I mentally separate C++03 and C++11). The first part sets up some lists to hold a grid configuration. Please refer to the original post to find out what each data structure means. I will show the Python code followed by the C++ with some explanation.
[python]
def cross(A, B):
“Cross product of elements in A and elements in B.”
return [a+b for a in A for b in B]
[/python]
The above code is doing a cartesian product of two lists. I am going to use {cpp}std::vector<>{/cpp} for representing lists. So here is the corresponding C++11 code for doing the cartesian product.
[cpp]
template
auto cross(const vector
vector
vector
for (auto _a : a) {
for (auto _b : b) {
output.push_back(_a+_b);
}
}
return output;
}
[/cpp]
Note that the Python code is generic with respect to the type of the output list. The type of the output list is determined by the type of the expression {python}a+b{/python}. We can emulate the same in C++ using {cpp}decltype{/cpp} and late-specified return type as shown.
[python]
digits = ’123456789′
rows = ‘ABCDEFGHI’
cols = digits
squares = cross(rows, cols)
[/python]
These are pretty straightforward :
[cpp]
vector
vector
vector
vector
[/cpp]
[python]
unitlist = ([cross(rows, c) for c in cols] +
[cross(r, cols) for r in rows] +
[cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')])
[/python]
{python}unitlist{/python} is a list of lists of strings in Python. I chose {cpp}vector
[cpp]
vector
[/cpp]
The code populates {python}unitlist{/python} with 3 lists. Let’s look at the last list first.
[python]
[cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')]
[/python]
It’s hard to get this concise in C++, but we can try. Python uniformly treats a string as a list of characters. However C++ doesn’t. So the first order of business is to write something that can convert a string literal like {cpp}”123″{/cpp} to a {cpp}vector
[cpp]
vector
vector
for_each(str, str+N, [=, &output] (char c) {
output.push_back(string(1, c));
});
return output;
}
[/cpp]
Now, here is the translation for the last of the three lists that make up {python}unitlist{/python} :
[cpp]
for (vector
for (vector
unitlist.push_back(cross(rs, cs));
}
}
[/cpp]
Almost as concise, eh? The translations for the other two lists are easier.
[cpp]
for (auto _c : cols) {
vector
unitlist.push_back(cross(rows, c));
}
for (auto _r : rows) {
vector
unitlist.push_back(cross(r, cols));
}
[/cpp]
Let’s look at the next two data structures :
[python]
units = dict((s, [u for u in unitlist if s in u])
for s in squares)
peers = dict((s, set(sum(units[s],[]))-set([s]))
for s in squares)
[/python]
A {cpp}map{/cpp} is the straightforward translation for a Python dictionary.
[cpp]
map
map
[/cpp]
{python}units{/python} is easy to translate:
[cpp]
for (string s : squares) {
for (vector
if (find(u.begin(), u.end(), s) != u.end())
units[s].push_back(u);
}
}
[/cpp]
However, the way {python}peers{/python} is populated is a bit tricky. Python {python}sum{/python} function converts a list of lists into a list by simply appending the lists together. The equivalent can be achieved with {cpp}std::accumulate{/cpp}. That is not the most efficient way to achieve the flattening of the nested list in C++, but I am just going to use this as it is not in any inner loop. Once I get the flattened vector, I am going to conditionally copy it to the output set, instead of doing a set difference.
[cpp]
for (auto s : squares) {
vector
accumulate(units[s].begin(), units[s].end(), vector
[=](const vector
vector
for (auto e : a1) ret.push_back(e);
for (auto e : a2) ret.push_back(e);
return ret;
});
set
copy_if(temp.begin(), temp.end(), inserter(output, output.end()),
[=] (string t) { return !(t == s); });
}
[/cpp]
Next, let’s tackle the {python}assign, eliminate, parse_grid, and grid_values{/python} functions. All of either return the grid representation, which is a dictionary mapping strings to strings ({cpp}map
[cpp]
template
struct maybe {
bool valid;
static unsigned created, destroyed;
union {
shared_ptr
};
maybe() : valid(false) {}
maybe(T *_v) : valid(true), value(_v) {}
maybe(const shared_ptr
++created;
}
maybe(const maybe& other) : valid(other.valid),
value(other.value) { ++created; }
operator T& () {
if (valid) return *value;
assert(false);
}
~maybe() {
if (valid) {
value.reset();
++destroyed;
}
}
bool is_valid() {
return valid;
}
};
typedef map
[/cpp]
The {cpp}maybe
assign :
[python]
def assign(values, s, d):
“”"Eliminate all the other values (except d) from values[s] and propagate.
Return values, except return False if a contradiction is detected.”"”
other_values = values[s].replace(d, ”)
if all(eliminate(values, s, d2) for d2 in other_values):
return values
else:
return False
[/python]
[cpp]
maybe
value_t& values = _values;
size_t pos = values[s].find(d);
assert(pos != string::npos);
string temp = values[s];
temp.replace(pos, 1, “”);
string other_values = temp;
for (auto d2 : other_values) {
if (eliminate(_values, s, string(1, d2)).is_valid() == false)
return maybe
}
return _values;
}
[/cpp]
eliminate :
[python]
def eliminate(values, s, d):
“”"Eliminate d from values[s]; propagate when values or places <= 2.
Return values, except return False if a contradiction is detected."""
if d not in values[s]:
return values ## Already eliminated
values[s] = values[s].replace(d,'')
## (1) If a square s is reduced to one value d2, then eliminate d2 from the peers.
if len(values[s]) == 0:
return False ## Contradiction: removed last value
elif len(values[s]) == 1:
d2 = values[s]
if not all(eliminate(values, s2, d2) for s2 in peers[s]):
return False
## (2) If a unit u is reduced to only one place for a value d, then put it there.
for u in units[s]:
dplaces = [s for s in u if d in values[s]]
if len(dplaces) == 0:
return False ## Contradiction: no place for this value
elif len(dplaces) == 1:
# d can only be in one place in unit; assign it there
if not assign(values, dplaces[0], d):
return False
return values
[/python]
[cpp]
maybe
value_t& values = _values;
size_t pos = values[s].find(d);
if (pos == string::npos)
return _values;
values[s].replace(pos, 1, “”);
if (values[s].length() == 0)
return maybe
else if (values[s].length() == 1) {
string d2 = values[s];
for (auto s2 : peers[s]) {
if (eliminate(_values, s2, d2).is_valid() == false)
return maybe
}
}
for (auto u : units[s]) {
vector
for (auto _s : u) {
if (values[_s].find(d) != string::npos)
dplaces.push_back(_s);
}
if (dplaces.size() == 0)
return maybe
if (dplaces.size() == 1)
if (assign(_values, dplaces[0], d).is_valid() == false)
return maybe
}
return _values;
}
[/cpp]
The only notable thing is returning {cpp}maybe
[python]
def search(values):
“Using depth-first search and propagation, try all possible values.”
if values is False:
return False ## Failed earlier
if all(len(values[s]) == 1 for s in squares):
return values ## Solved!
## Chose the unfilled square s with the fewest possibilities
n,s = min((len(values[s]), s) for s in squares if len(values[s]) > 1)
return some(search(assign(values.copy(), s, d))
for d in values[s])
[/python]
The C++ code is here :
[cpp]
maybe
if (_values.is_valid() == false)
return maybe
value_t& values = _values;
value_t temp;
copy_if(values.begin(), values.end(),
inserter(temp, temp.begin()),
[=, &values] (value_t::value_type v) -> bool {
return v.second.length() > 1;
});
if (temp.empty())
return _values;
auto min_s = min_element(temp.begin(), temp.end(),
[=, &values] (value_t::value_type v1,
value_t::value_type v2) -> bool {
return v1.second.length() < v2.second.length();
});
string s = min_s->first;
for (auto d : values[s]) {
maybe
maybe
if (retval.is_valid()) return retval;
}
return maybe
}
[/cpp]
The code slightly deviates from the Python implementation, but remains similar in spirit. First, I copy the elements of the grid that have more than one valid entry in them to a temporary vector. If the temporary vector is empty, then we are done. If not, I pick the entry with smallest number of valid elements. That is what the {cpp}min_element{/cpp} is doing. Then I recurse by setting that grid cell to each possible valid entry. The gist has a complete runnable program that takes the input grid as a command line argument. An example run :
[cpp]
./a.out 003020600900305001001806400008102900700000008006708200002609500800203009005010300
13666 ms
4 8 3 | 9 2 1 | 6 5 7
9 6 7 | 3 4 5 | 8 2 1
2 5 1 | 8 7 6 | 4 9 3
———+———+———
5 4 8 | 1 3 2 | 9 7 6
7 2 9 | 5 6 4 | 1 3 8
1 3 6 | 7 9 8 | 2 4 5
———+———+———
3 7 2 | 6 8 9 | 5 1 4
8 1 4 | 2 5 3 | 7 6 9
6 9 5 | 4 1 7 | 3 8 2
[/cpp]
Time wise, I found that the C++ implementation followed the same trend as the Python implementation. That is, most problems were solved in less than few milliseconds. The hard one that took 72.27 seconds by Python was solved in 45.34 seconds by the C++ code.