\datethis
\ifpdftex \input supp-pdf \else \input epsf \fi
\def\caret/{\smash{\lower4pt\hbox to0pt{\hss$\scriptscriptstyle\land$\hss}}}
\def\qcaret/{`\thinspace\caret/\thinspace'} % quoted caret
@s delete unknown
@*Introduction. The purpose of this program is to enumerate fixed polyominoes
of up to 55 cells (although I won't really be able to get that far until
Moore's law carries on for a few more years). The method is essentially
that of Iwan Jensen [{\tt arXiv:cond-mat/0007239}, to appear in {\sl Journal of
@^Jensen, Iwan@>
Statistical Physics}, spring 2001], who discovered that the important
techniques of Andrew Conway [{\sl Journal of Physics\/ \bf A28} (1995),
@^Conway, Andrew Richard@>
335--349] can be vastly improved in the special case of polyomino
enumeration.
The basic idea is quite simple: We will count the number of fixed polyominoes
that span a rectangle that is $h$ cells high and $w$ cells wide, where
$h$ and $w$ are as small as possible; then we will add the totals for
all relevant $h$ and $w$. We can assume that $h\ge w$. For each $h$ and $w$
that we need, we enumerate the spanning polyominoes by considering
one cell at a time of an $h\times w$ array, working from left to right
and top to bottom, deciding whether or not that cell is occupied, and
combining results for all boundary configurations that are equivalent
as far as future decisions are concerned. For example, we might have a
polyomino that starts out like this:
$$\ifpdftex\convertMPtoPDF{polyomino.2}{1}{1}\else\epsfbox{polyomino.2}\fi$$
(This partial polyomino obviously has more than 54 cells already, but large
examples will help clarify the concepts that are needed in the program below.)
Most of the details of the upper part of this pattern have no effect on
whether the yet-undetermined cells will form a suitable polyomino or not.
All we really need to know in order to answer that question is
whether the bottom cells of each column-so-far are occupied or
unoccupied, and which of the occupied ones are already connected to
each other. We also need to know whether the left column and right
column are still blank.
In this case the 26 columns have occupancy pattern
\.{01001001001010110011010110} at the bottom,
and the occupied cells belong to six connected components, namely
$$\vbox{\halign{\.{#}\cr
01000000000000000000010000\cr
00001000001000110000000000\cr
00000001000000000000000000\cr
00000000000010000000000000\cr
00000000000000000011000000\cr
00000000000000000000000110\cr}}$$
Fortunately the fact that polyominoes lie in a plane forces the components
to be nested within each other; we can't have ``crossings'' like
\.{1000100} with \.{0010001}. Therefore we can encode the component and
occupancy information conveniently as
$$\.{0(00(00100-010-)00()0)0()0}$$
using a five-character alphabet:
$$\vbox{\halign{\.#\enspace&#\hfil\cr
0&means the cell is unoccupied;\cr
1&means the cell is a single-cell component;\cr
(&means the cell is the leftmost of a multi-cell component;\cr
)&means the cell is the rightmost of a multi-cell component;\cr
-&means the cell is in the midst of a multi-cell component.\cr}}$$
Furthermore we can treat the cases where the entire leftmost column is
nonblank by considering that the left edge of the array belongs to the leftmost
cell component; and the rightmost column can be treated similarly.
With these conventions, we encode the boundary condition at the
lower fringe of the partially filled array above by the 26-character string
$$\.{0(00(00100-010-\caret/)00()0)0(-0}\,,\eqno(*)$$
using \qcaret/ to show where the last partial row ends.
\vskip1pt
A string like $(*)$ represents a so-called {\it configuration}. If no
\qcaret/ appears, the partial row is assumed to be a
complete row. The number of rows above a given occupancy pattern is implicitly
part of the configuration but not explicitly shown in the notation,
because this program never has to deal simultaneously with configurations
that have different numbers of rows above the current partial row.
@ A bit of theory may help the reader internalize these rules: It turns out
that the set of all connectivity/occupancy configuration codes at the end of a
row has the interesting unambiguous context-free grammar
$$\eqalign{
S&\to L\,\.0J_0R \mid Z\.-I\.-Z\mid Z\.-Z\cr
Z&\to\epsilon\mid\.0Z\cr
L&\to\epsilon\mid Z\,\.)\mid Z\.-I\.)\cr
R&\to\epsilon\mid \.(Z\mid \.(I\.-Z\cr
I_0&\to\epsilon\mid\.0J_0\cr
J_0&\to I_0\mid A\,\.0J_0\cr
I&\to\epsilon\mid\.-I\mid\.0J\cr
J&\to I\mid A\,\.0J\cr
A&\to 1\mid \.(I\.)\cr
}$$
[Translation: $I$ is any sequence of \.0's and \.-'s and $A$'s with each
$A$ preceded and followed by~\.0; $I_0$ is similar, but with no \.- at
top level.] The number $s_n$ of strings of length $n$
in $S$ has the generating function
$$\sum s_nz^n={1-4z^2-4z^3+z^4-(1+z)(1-z^2)\sqrt{\mkern1mu1+z}
\sqrt{\mkern1mu1-3z}\over2z^3(1-z)}=2z+6z^2+16z^3+\cdots{};$$
hence $s_n$ is asymptotically proportional to $3^n\!/n^{3/2}$.
@ Any polyomino with the configuration $(*)$ in the midst of row 10 must have
occupied at least $28+18+1+1+2+4=54$ cells so far, and 16 more will be
needed to connect it up and make it touch the left edge of the array.
Moreover, the array has only 10 rows so far, but it has 26 columns, and we
will only be interested in arrays for which $h\ge w$. Therefore at least 15
additional cells must be occupied if we are going to complete a polyomino in
an $h\times26$ array.
In other words, we would need to consider the boundary configuration
$(*)$ only if we were enumerating polyominoes of
at least $54+16+15=85$ cells. Such considerations greatly limit the number of
configurations that can arise; Jensen's experiments show that the total number
of cases we must deal with grows approximately as $c^n$ where $c$ is
slightly larger than~1.4. This is exponentially large, but it is considerably
smaller than the $3^{n/2}$ configurations that arise in Conway's original
method; and it's vastly smaller than the total number of polyominoes,
which turns out to be approximately $4.06^n$ times $0.3/n$.
@ This program doesn't actually do the whole job of enumeration;
it only outputs a set of instructions that tell how to do it.
Another program reads those instructions and completes the computation.
@c
#include
@@;
@@;
@@;
main(int argc, char *argv[])
{
@;
@;
@;
@