Main Page - Back
 
From SudokuWiki.org, the puzzle solver's site
breakline

Sudoku String Definitions 2.0

I'm excited to introduce a new scheme to capture and transport the board and progress in minimal fashion for Sudoku and many other puzzles including Killers, KenKen and Str8ts. I've been collaborating with Quintin May a.k.a. Sasha behind the scenes to come up with a better approach. This page and the follow-up page on Killer/KenKen definitions is a very compressed overview of the approach - but enough to explain how. Quintin has created a superb and comprehensive documentation library - and if you intend to code - I recommend you browse the following: There are also some Test Strings built into links to each solver.

The problems with the old approach

The chief deficits of version 1 were:
  • A single candidate note was being interpreted as a solved cell
  • Unable to expand the scheme for puzzles with jigsaw boxes
  • Unable to expand the scheme for operators in clues like KenKen and KenDoku
  • No version number in case of changes to the definition
The solvers will continue to support the old packed definitions for the time being since there will be many hard coded links to the solvers out in the wild or stored in databases.

The unpacked 81 character sudoku definition remains the same as before in all places and uses. You can continue to import a Sudoku with 81 numbers (and other symbols like '.' or '_' for blanks). I will be building a more comprehensive importer in the near future to cope with line breaks and other more 'noisy' puzzle definitions.

Version 'B'

So lets dive into the new version for Sudoku including Jigsaw Sudoku, Sudoku X, Windoku and Coloured Sudoku. The next page will cater for Killer, KenKen and KenDoku. Lets consider the sorts of information we want to transport:
  • All puzzles need the ability to store progress - which consists of solved cells and candidates.
  • We should also retain which large numbers are clues.
  • It will be very helpful to include a version header to give vital context for interpreting the data
That is all for Sudoku and variants with extra constraints like Sudoku X. However other puzzles contains more information we will need to encode:
  • Jigsaw Sudoku has irregular boxes. Our new approach moves away from a shape number as this is peculiar to SudokuWiki.org and not universal.
  • For Killers and variants we have the concept of cages and clue operands (+ for Killer, -, x and / for others)
  • For Str8ts we need to encode the black cells
The following table lays it all out. We can also predict the minimum number of bytes per cell the new packed strings will achieve. (Str8ts is still under development).

Leave Out
Encode

Prefix

Clues

Solved

Notes

Boxes

Cages
Clue
Operands
Black
Cells
Bytes
/ Cell
Sudoku S9B Yes Yes Yes Regular - - - 2
Sudoku X X9B Yes Yes Yes Regular - - - 2
Windoku W9B Yes Yes Yes Regular - - - 2
Colour Sudoku C9B Yes Yes Yes Regular - - - 2
Jigsaw Sudoku J9B Yes Yes Yes Irregular - - - 3
Killer Sudoku L9B No Yes Yes Regular Yes - - 4
Killer Jigsaw M9B No Yes Yes Irregular Yes - - 5
KenKen K6C No Yes Yes - Yes 4 (+-x/) - 4
KenDoku D6C No Yes Yes Regular Yes 4 (+-x/) - 4
Str8ts T9B Yes Yes Yes - - - Yes ?
Str8ts X U9B Yes Yes Yes - - - Yes ?
Str8ts B B9B Yes Yes Yes Regular - - Yes ?
Str8ts BX V9X Yes Yes Yes Regular - - Yes ?
OFF SET SHIFTED SHIFTED SHIFTED SHIFTED

The Header

Each packed string definition will be prefixed by three characters.
  • A puzzle type letter
  • + board size (exclusively 6 or 9 at the moment)
  • + version letter. Since this is version 2 we're going with 'B'.
Sudoku and any variants with a fixed pattern or extra constraint will have the same packed definition. We can tell what variant by the first letter of the packed string (yellow column).

The Body

As in the previous packed version we store candidates as bits. (1=1, 2=2, 3=4, 4=8, 5=16, 6=32, 7=64, 8=128 and 9=256). The key difference between this packing method and the previous is that we're going to create smaller numbers for the clues/solved/candidates. In fact they are going to be no larger than 511+18. Previously we shifted the bits to make room for the clue flag meaning the largest number was 1022. However, the new method recognises that a clue and a solved cell and candidates are mutually exclusive.
  • We can store a clue number (1 to 9) as 1 to 9
  • We can store a solved call (1 to 9) as 10 to 18 - offset.
  • And candidates can be any number between 1 and 511 + 18
It so happens that the largest number 529 can be stored in 10 bits. All this fits in two bytes. For Jigsaws we need three bits to store the box number. So we shift the box number 10 bits and add it to the clues/candidates number. Finally we convert to base 36 and pad with zeros. Jigsaws fit produce cell strings 3 bytes long.

All the data for each cell will fit into a single integer. But we're not using all 64 bits. Only the last 16 (two bytes)
unused
|  |box |cell no |
000 010 1010010001
|byte 1 | byte 2 |

A bit of javascript code shows how this is done

for (y = 0; y < 9; y++)
for (x = 0; x < 9; x++) {
n = get the cell value or set of candidates as bits
if( candidates )
n += 18; // offset by 18
else if ( !clue )
n += 9; // offset by 9
if( jigsaw )
n += (box_num << 10); // shift 10 bits and add
h = n.toString(36); // convert to base 36
if (h.length < 2) h = '0' + h; // pad the number if not 2 digits
if (jigsaw && h.length < 3) h = '0' + h; // pad if not 3 digits
s += h; // append to string being made
}

To unpack the string (after checking the version is S9B, X9B, W9B, C9B or J9B) we do the following
// This splits a string into an array of elements each 2 characters long 
var narr = theString.match(/.{1,2}/g);   // should be {1,3} for Jigsaws
for (y = 0; y < 9; y++)
	for (x = 0; x < 9; x++) {
		p = parseInt(narr[y * 9 + x], 36); // convert base 36 to decimal
		n = p & 1023; // first 10 bits
		clue(y, x) = (n < 10) ? 1 : 0; // extract the clue flag
		if (n <= 18) {
			if (n > 9 ) n -= 9; { // removed 'solved cell' offset
			// save the number as a clue or solution
		} else {
			// save the number as a set of candidates
			n -= 18;
		}
		p >>= 10; // remove first 10 bits
		 // p is the box number for this cell
		 // Will be 0 for vanilla Sudoku
	}

Click on the link to read about how the scheme extends to Killer, KenKen and KenDoku



Article created on 18-January-2023. Views: 11647
This page was last modified on 20-November-2024.
All text is copyright and for personal use only but may be reproduced with the permission of the author.
Copyright Andrew Stuart @ Syndicated Puzzles, Privacy, 2007-2025