\datethis @*Intro. This program is the fourteenth in a series of exploratory studies by which I'm attempting to gain first-hand experience with OBDD structures, as I prepare Section 7.1.4 of {\sl The Art of Computer Programming}. It's basically the same as {\mc BDD11}, but extended to include some rudimentary methods for changing and sifting'' the order of variables. In this program I try to implement simplified versions of the basic routines that are needed in a large'' BDD package. \def\<#1>{\hbox{$\langle\,$#1$\,\rangle$}} \chardef\ttv='174 % vertical line \chardef\tta='046 % ampersand \chardef\tth='136 % hat \chardef\ttt='176 % tilde The computation is governed by primitive commands in a language called BDDL; these commands can either be read from a file or typed online (or both). @^BDDL, a primitive language for BDD calculations@> BDDL commands have the following simple syntax, where \ denotes a nonnegative decimal integer: \eqalign{ &\\gets\.{c0}\mid\.{c1}\cr &\\gets\.x\\cr &\\gets\.f\\cr &\\gets\\mid\\mid\\cr &\\gets\\\mid \\\\mid \\.{[y]}\mid\cr &\hskip15em \\\\\\cr &\\gets\\mid\\.=\\mid\\.{=.}\mid \.y\\.=\\mid\.y\\.{=.}\cr} The special commands \, @^Special commands@> @^Commands@> the unary operators \, the binary operators \, and the ternary operators \ are explained below. One short example will give the general flavor: After the commands \halign{\qquad\tt#\hfil\cr f1=x1{\tth}x2\cr f2=x3{\ttv}x4\cr f1=f1{\tta}f2\cr f2={\ttt}f1\cr} the function $f_1$ will be $(x_1\oplus x_2)\land(x_3\lor x_4)$, and $f_2$ will be $\lnot f_1$. Then \.{f1=.}' will undefine~$f_1$. If the command line specifies an input file, all commands are taken from that file and standard input is ignored. Otherwise the user is prompted for commands. For simplicity, I do my own memory allocation in a big array called |mem|. The bottom part of that array is devoted to BDD nodes, which each occupy two octabytes. 
The upper part is divided into dynamically allocated pages of a fixed size (usually 4096 bytes). The cache of computed results, and the hash tables for each variable, are kept in arrays whose elements appear in the upper pages. These elements need not be consecutive, because the $k$th byte of each dynamic array is kept in location |mem[b[k>>12]+(k&0xfff)]|, for some array~|b| of base addresses. Each node of the BDD base is responsible for roughly 28 bytes in |mem|, assuming 16 bytes for the node itself, plus about 8 for its entry in a hash table, plus about 4 for its entry in a cache. (I could reduce the storage cost from 28 to 21 by choosing algorithms that run slower; but I decided to give up some space in the interests of time. For example, I'm devoting four bytes to each reference count, so that there's no need to consider saturation. And this program uses linear probing for its hash tables, at the expense of about 3 bytes per node, because I like the sequential memory accesses of linear probing.) Many compile-time parameters affect the sizes of various tables and the heuristic strategies of various methods adopted here. To browse through them all, see the entry ``Tweakable parameters'' in the index at the end. 
@^Tweakable parameters@> @ Here's the overall program structure: @c #include #include #include #include "gb_flip.h" /* random number generator */ #define verbose Verbose /* because |verbose|' is |long| in libgb */ @@; @@; @@; @@; @# main (int argc, char *argv[]) { @;@# @; @; while (1) @; alldone: @; exit(0); /* normal termination */ } @ @d file_given (argc==2) @= if (argc>2 || (file_given && !(infile=fopen(argv[1],"r")))) { fprintf(stderr,"Usage: %s [commandfile]\n",argv[0]); exit(-1); } @ @= FILE *infile; /* input file containing commands */ int verbose=-1; /* master control for debugging output; $-1$ gives all */ @ @= gb_init_rand(0); /* initialize the random number generator */ @ One of the main things I hope to learn with this program is the total number of |mems| that the computation needs, namely the total number of memory references to octabytes. I'm not sure how many mems to charge for recursion overhead. A machine like \.{MMIX} needs to use memory only when the depth gets sufficiently deep that 256 registers aren't enough; then it needs two mems for each saved item (one to push it and another to pop it). Most of \.{MMIX}'s recursive activity takes place in the deepest levels, whose parameters never need to descend to memory. So I'm making a separate count of |rmems|, the number of entries to recursive subroutines. Some of the mems are classified as |zmems|, because they arise only when zeroing out pages of memory during initializations. 
@d o mems++ /* a convenient macro for instrumenting a memory access */ @d oo mems+=2 @d ooo mems+=3 @d oooo mems+=4 @= printf("Job stats:\n"); printf(" %llu mems plus %llu rmems plus %llu zmems\n",mems,rmems,zmems); @; @ @= void show_stats(void) { printf("stats: %d/%d nodes, %d dead, %d pages,", totalnodes,nodeptr-botsink,deadnodes,topofmem-pageptr); printf(" %llu mems, %llu rmems, %llu zmems\n",mems,rmems,zmems); } @ This program uses `|long long|' to refer to 64-bit integers, because a single `|long|' isn't treated consistently by the \CEE/~compilers available to me. (When I first learned~\CEE/, `|int|' was traditionally `|short|', so I was obliged to say `|long|' when I wanted 32-bit integers. Consequently the programs of the Stanford GraphBase, written in the 90s, now get 64-bit integers---contrary to my original intent. C'est tragique; c'est la vie.) @= unsigned long long mems, rmems, zmems; /* mem counters */ @ @+= if (sizeof(long long)!=8) { fprintf(stderr,"Sorry, I assume that sizeof(long long) is 8!\n"); exit(-2); } @ Speaking of compilers, the one I use at present insists that pointers occupy 64 bits. As a result, I need to pack and unpack pointer data, in all the key data structures of this program; otherwise I would basically be giving up half of my memory and half of the hardware cache. I could solve this problem by using arrays with integer subscripts. Indeed, that approach would be simple and clean. But I anticipate doing some fairly long calculations, and speed is also important to me. So I've chosen a slightly more complex (and slightly dirtier) approach, equivalent to using short pointers; I wrap such pointers up with syntax that doesn't offend my compiler. The use of this scheme allows me to use the convenient syntax of~\CEE/ for fields within structures. Namely, data is stored here with a type called |addr|, which is simply an unsigned 32-bit integer. 
An |addr| contains all the information of a pointer, since I'm not planning to use this program with more than $2^{32}$ bytes of memory. It has a special name only to indicate its pointerly nature. With this approach the program goes fast, as with usual pointers, because it doesn't have to shift left by 4~bits and add the base address of~|mem| whenever addressing the memory. But I do limit myself to BDD bases of at most about 30 million nodes. (At the cost of shift-left-four each time, I could extend this scheme to handling a 35-bit address space, if I ever get a computer with 32 gigabytes of RAM. I~still would want to keep 32-bit pointers in memory, in order to double the effective cache size.) The |addr_| macro converts an arbitrary pointer to an |addr|. @d addr_(p) ((addr)(size_t)(p)) @= typedef unsigned int addr; @*Dynamic arrays. Before I get into the BDD stuff, I might as well give myself some infrastructure to work with. The giant |mem| array mentioned earlier has nodes at the bottom, in locations |mem| through |nodeptr-1|. It has pages at the top, in locations |pageptr| through |mem+memsize-1|. We must therefore keep |nodeptr<=pageptr|. A node has four fields, called |lo|, |hi|, |xref|, and |index|. I shall explain their significance eventually, when I {\it do\/} get into the BDD stuff.'' A page is basically unstructured, although we will eventually fill it either with hash-table data or cache memos. The |node_| and |page_| macros are provided to make pointers from stored items of type |addr|. @s node int @s page int @^Tweakable parameters@> @d logpagesize 12 /* must be at least 4 */ @d memsize (1<<29) /* bytes in |mem|, must be a multiple of |pagesize| */ @# @d pagesize (1<= typedef struct node_struct { addr lo,hi; int xref; /* reference count minus one */ unsigned int index; /* variable ID followed by random bits */ } node; typedef struct page_struct { addr dat[pageints]; } page; @ Here's how we launch the dynamic memory setup. 
Incidentally, I tried to initialize |mem| by declaring it to be a variable of type |void*|, then saying |mem=malloc(memsize)|'. But that failed spectacularly, because the geniuses who developed the standard library for my 64-bit version of Linux decided in their great wisdom to make |malloc| return a huge pointer like |0x2adaf3739010|, even when the program could fit comfortably in a 30-bit address space. D'oh. @d topofmem ((page*)&mem[memsize]) @+= botsink=(node*)mem; /* this is the sink node for the all-zero function */ topsink=botsink+1; /* this is the sink node for the all-one function */ o,botsink->lo=botsink->hi=addr_(botsink); o,topsink->lo=topsink->hi=addr_(topsink); oo,botsink->xref=topsink->xref=0; oooo,botsink->index=gb_next_rand(); oooo,topsink->index=gb_next_rand(); totalnodes=2; nodeptr=topsink+1; pageptr=topofmem; @ @= char mem[memsize]; /* where we store most of the stuff */ node *nodeptr; /* the smallest unused node in |mem| */ page *pageptr; /* the smallest used page in |mem| */ node *nodeavail; /* stack of nodes available for reuse */ page *pageavail; /* stack of pages available for reuse */ node *botsink, *topsink; /* the sink nodes, which never go away */ int totalnodes; /* this many nodes are currently in use */ int deadnodes; /* and this many of them currently have |xref<0| */ int leasesonlife=10; @ Here's how we get a fresh (but uninitialized) node. The |nodeavail| stack is linked by its |xref| fields. If memory is completely full, |NULL| is returned. In such cases we need not abandon all hope; a garbage collection may be able to reclaim enough memory to continue. (I've tried to write this entire program in such a way that such temporary failures are harmless.) 
@= node* reserve_node(void) { register node *r=nodeavail; if (r) o,nodeavail=node_(nodeavail->xref); else { r=nodeptr; if (r<(node*)pageptr) nodeptr++; else { leasesonlife--; fprintf(stderr,"NULL node forced (%d pages, %d nodes, %d dead)\n", topofmem-pageptr,nodeptr-botsink,deadnodes); fprintf(stderr,"(I will try %d more times)\n",leasesonlife); if (leasesonlife==0) { show_stats();@+exit(-98); /* sigh */ } return NULL; } } totalnodes++; return r; } @ Conversely, nodes can always be recycled. In such cases, there had better not be any other nodes pointing to them. @= void free_node(register node *p) { o,p->xref=addr_(nodeavail); nodeavail=p; totalnodes--; } @ Occupation and liberation of pages is similar, but it takes place at the top of |mem|. @= page* reserve_page(void) { register page *r=pageavail; if (r) o,pageavail=page_(pageavail->dat[0]); else { r=pageptr-1; if ((node*)r>=nodeptr) pageptr=r; else { leasesonlife--; fprintf(stderr,"NULL page forced (%d pages, %d nodes, %d dead)\n", topofmem-pageptr,nodeptr-botsink,deadnodes); fprintf(stderr,"(I will try %d more times)\n",leasesonlife); if (leasesonlife==0) { show_stats();@+exit(-97); /* sigh */ } return NULL; } } return r; } @# void free_page(register page *p) { o,p->dat[0]=addr_(pageavail); pageavail=p; } @ @= j=(node*)(pageptr-3)-nodeptr; if (j>=0) { for (p=nodeavail;p&&j<3;o,p=(node_(p->xref))) j++; if (j>=3) break; } @ @= j=nodeptr-(node*)mem; k=topofmem-pageptr; printf(" %llu bytes of memory (%d nodes, %d pages)\n", ((long long)j)*sizeof(node)+((long long)k)*sizeof(page),j,k); @ @= register int j,k; @*Variables and hash tables. Our BDD base represents functions on the variables $x_v$ for $0\le v<|varsize|$, where |varsize| is a power of~2. When $x_v$ is first mentioned, we create a |var| record for it, from which it is possible to find all the nodes that branch on this variable. 
The list of all such nodes is implicitly present in a hash table, which contains a pointer to node $(v,l,h)$ near the hash address of the pair $(l,h)$. This hash table is called the {\it unique table\/} for~$v$, because of the BDD property that no two nodes have the same triple of values $(v,l,h)$. When there are $n$ nodes that branch on $x_v$, the unique table for~$v$ has size $m$, where $m$ is a power of~2 such that $n$ lies between $m/8$ and $3m/4$, inclusive. Thus at least one of every eight table slots is occupied, and at least one of every four is unoccupied, on the average. If $n=25$, for example, we might have $m=64$ or $m=128$; but $m=256$ would make the table too sparse. Each unique table has a maximum size, which must be small enough that we don't need too many base addresses for its pages, yet large enough that we can accommodate big BDDs. If, for example, |logmaxhashsize=19| and |logpagesize=12|, a unique table might contain as many as $2^{19}$ |addr|s, filling $2^9$ pages. Then we must make room for 512 base addresses in each |var| record, and we can handle up to $2^{19}-2^{17}=393216$ nodes that branch on any particular variable. 
@^Tweakable parameters@> @d logmaxhashsize 21 @d slotsperpage (pagesize/sizeof(addr)) @d maxhashpages (((1<= typedef struct var_struct { addr proj; /* address of the projection function $x_v$ */ addr repl; /* address of the replacement function $y_v$ */ int free; /* the number of unused slots in the unique table for $v$ */ int mask; /* the number of slots in that unique table, times 4, minus 1 */ addr base[maxhashpages]; /* base addresses for its pages */ int name; /* the user's name (subscript) for this variable */ unsigned int timestamp; /* time stamp for composition */ int aux; /* flag used by |math_print| or the sifting algorithm */ struct var_struct *up,*down; /* the neighboring active variables */ } var; @ Every node |p| that branches on $x_v$ in the BDD has a field |p->index|, whose leftmost |logvarsize| bits contain the index~$v$. The rightmost |32-logvarsize| bits of |p->index| are chosen randomly, in order to provide convenient hash coding. The SGB random-number generator used here makes four memory references per number generated. N.B.: The hashing scheme will fail dramatically unless |logvarsize+logmaxhashsize<=32|. @^Tweakable parameters@> @d logvarsize 10 @d varsize (1<>(32-logvarsize)) @d initnewnode(p,v,l,h) oo,p->lo=addr_(l),p->hi=addr_(h),p->xref=0,@| oooo,p->index=((v)<<(32-logvarsize))+(gb_next_rand()>>(logvarsize-1)) @ Variable $x_v$ in this documentation means the variable whose information record is |varhead[v]|. But the user's variable \.{x5}' might not be represented by |varhead[5]|, because the ordering of variables can change as a program runs. If \.{x5} is really the variable in |varhead[13]|, say, we will have |varmap[5]=13| and |varhead[13].name=5|. 
@d topofvars &varhead[varsize] @= var varhead[varsize]; /* basic info about each variable */ var *tvar=topofvars; /* threshold for verbose printouts */ int varmap[varsize]; /* the variable that has a given name */ @ @= for (k=0;k= node* projection(int v) { register node *p; register var *hv=&varhead[v]; o,p=node_(hv->proj); if (p) return p; /* the projection function has already been created */ o,hv->base[0]=addr_(reserve_page()); /* it won't be |NULL| */ @; p=unique_find(&varhead[v],botsink,topsink); /* see below */ oooo,botsink->xref++,topsink->xref++; o,hv->proj=addr_(p); /* |p| won't be |NULL| either */ if (verbose&2) printf(" %x=x%d\n",id(p),v); o,hv->name=v; return p; } @ I sometimes like to use a subroutine before I'm in the mood to write its innards. In such cases, a pre-specification like the one given here allows me to procrastinate. @= node* unique_find(var *v, node *l, node *h); @ Now, however, I'm ready to tackle the subroutine just named, |unique_find|, which is one of the most crucial in the entire program. Given a variable~|v|, together with node pointers |l| and~|h|, we often want to see if the BDD base contains a node $(v,l,h)$---namely, a branch on~$x_v$ with {\mc LO} pointer~|l| and {\mc HI} pointer~|h|. If no such node exists, we want to create it. The subroutine should return a pointer to that (unique) node. Furthermore, the reference counts of |l| and |h| should be decreased afterwards. To do this task, we look for $(l,h)$ in the unique table for $v$, using the hash code $$\hbox{|(l->index<<3)^(h->index<<2)|}.$$ (This hash code is a multiple of~4, the size of each entry in the unique table.) Several technicalities should be noted. First, no branch is needed when $l=h$. Second, we consider that a new reference is being made to the node returned, as well as to nodes |l| and~|h| if a new node is created; the |xref| fields (reference counts) must be adjusted accordingly. 
Third, we might discover that the node exists, but it is dead; in other words, all prior links to it might have gone away, but we haven't discarded it yet. In such a case we should bring it back to life. Fourth, |l| and |h| will not become dead when their reference counts decrease, because the calling routine knows them. And finally, in the worst case we won't have room for a new node, so we'll have to return |NULL|. The calling routine must be prepared to cope with such failures (which we hope are only temporary). The following inscrutable macros try to make my homegrown dynamic array addressing palatable. I have to admit that I didn't get them right the first time. Or even the second time. Or even \dots~. @d hashcode(l,h) ((addr*)(size_t)(oo,((l)->index<<3)^((h)->index<<2))) @d hashedcode(p) hashcode(node_(p->lo),node_(p->hi)) @d addr__(x) (*((addr*)(size_t)(x))) @d fetchnode(v,k) node_(addr__(v->base[(k)>>logpagesize]+((k)&pagemask))) @d storenode(v,k,p) o,addr__(v->base[(k)>>logpagesize]+((k)&pagemask))=addr_(p) @= node* unique_find(var *v, node *l, node *h) { register int j,k,mask,free; register addr *hash; register node *p,*r; if (l==h) { /* easy case */ return oo,l->xref--,l; /* |l->xref| will still be $\ge0$ */ } restart: o,mask=v->mask,free=v->free; for (hash=hashcode(l,h);;hash++) { /* ye olde linear probing */ k=addr_(hash)&mask; oo,p=fetchnode(v,k); if (!p) goto newnode; if (node_(p->lo)==l && node_(p->hi)==h) break; } if (o,p->xref<0) { deadnodes--,o,p->xref=0; /* a lucky hit; its children are alive */ return p; } oooo,l->xref--,h->xref--; return o,p->xref++,p; newnode: @; @; } @ @= void recursively_revive(node *p); /* recursive resuscitation */ void recursively_kill(node *p); /* recursive euthanization */ void collect_garbage(int level); /* invocation of the recycler */ @ Before we can call |unique_find|, we need a hash table to work with. We start small. 
@d storenulls(k) *(long long*)(size_t)(k)=0LL; @= o,hv->free=2,hv->mask=7; storenulls(hv->base[0]); /* both slots start out |NULL| */ zmems++; @ A little timer starts ticking at the beginning of this program, and it advances whenever we reach the present point. Whenever the timer reaches a multiple of |timerinterval|, we pause to examine the memory situation, in an attempt to keep node growth under control. Memory can be conserved in two ways. First, we can recycle all the dead nodes. That's a somewhat expensive proposition; but it's worthwhile if the number of such nodes is more than, say, 1/8 of the total number of nodes allocated. Second, we can try to change the ordering of the variables. The present program includes Rudell's @^Rudell, Richard Lyle@> ``sifting algorithm'' for dynamically improving the variable order; but it invokes that algorithm only under user control. Perhaps I will have time someday to make reordering more automatic. @^Tweakable parameters@> @d timerinterval 1024 @d deadfraction 8 @= if ((++timer%timerinterval)==0) { if (deadnodes>totalnodes/deadfraction) { collect_garbage(0); goto restart; /* the hash table might now be different */ } } @ @= unsigned long long timer; @ Brand-new nodes enter the fray here. @= p=reserve_node(); if (!p) goto cramped; /* sorry, there ain't no more room */ if (--free<=mask>>4) { free_node(p); @; } storenode(v,k,p);@+o,v->free=free; initnewnode(p,v-varhead,l,h); return p; cramped: /* after failure, we need to keep the xrefs tidy */ deref(l); /* decrease |l->xref|, and recurse if it becomes dead */ deref(h); /* ditto for |h| */ return NULL; @ We get to this part of the code when the table has become too dense. The density will now decrease from 3/4 to 3/8. 
@= { register int newmask=mask+mask+1,kk=newmask>>logpagesize; if (verbose&256) printf("doubling the hash table for level %d(x%d) (%d slots)\n", v-varhead,v->name,(newmask+1)/sizeof(addr)); if (kk) @@; else { for (k=v->base[0]+mask+1;kbase[0]+newmask;k+=sizeof(long long)) storenulls(k); zmems+=(newmask-mask)/sizeof(long long); } @; v->mask=newmask; /* mems are counted after restarting */ v->free=free+1+(newmask-mask)/sizeof(addr); goto restart; } @ @d maxmask ((1<= { if (newmask>maxmask) { /* too big: can't go there */ if (verbose&(2+256+512)) printf("profile limit reached for level %d(x%d)\n",v-varhead,v->name); goto cramped; } for (k=(mask>>logpagesize)+1;k<=kk;k++) { o,v->base[k]=addr_(reserve_page()); if (!v->base[k]) { /* oops, we're out of space */ for (k--;k>mask>>logpagesize;k--) { o,free_page(page_(v->base[k])); } goto cramped; } for (j=v->base[k];jbase[k]+pagesize;j+=sizeof(long long)) storenulls(j); zmems+=pagesize/sizeof(long long); } } @ Some subtle cases can arise at this point. For example, consider the hash table {\let\\=\Lambda $(a,\\,\\,b)$, with hash$(a)=3$ and hash$(b)=7$; when doubling the size, we need to rehash $a$ twice, going from the doubled-up table $(a,\\,\\,b,\\,\\,\\,\\)$ to $(\\,\\,\\,b,a,\\,\\,\\)$ to $(\\,\\,\\,\\,a,\\,\\,b)$ to $(\\,\\,\\,a,\\,\\,\\,b)$.} I learned this interesting algorithm from Rick Rudell. @^Rudell, Richard Lyle@> @= for (k=0;kmask) break; /* see the example above */ } @ While I've got linear probing firmly in mind, I might as well write a subroutine that will be needed later for garbage collection. The |table_purge| routine deletes all dead nodes that branch on a given variable~$x_v$. 
@= void table_purge(var *v) { register int free,i,j,jj,k,kk,mask,newmask,oldtotal; register node *p, *r; register addr *hash; o,mask=v->mask,free=v->free; if (o,v->proj) { /* |v->proj!=0| if and only if $x_v$ exists */ oldtotal=totalnodes; for (k=0;kxref<0) { free_node(p); @; } } deadnodes-=oldtotal-totalnodes, free+=oldtotal-totalnodes; @; o,v->free=free; } } @ Deletion from a linearly probed hash table is tricky, as noted in Algorithm 6.4R of TAOCP. Here I can speed that algorithm up slightly, because there's no need to move dead entries that will be deleted later. Furthermore, if I do meet a dead entry, I can take a slightly tricky shortcut and continue the removals. @= do { for (kk=k,j=k+sizeof(addr),k=0;;j+=sizeof(addr)) { jj=j&mask; oo,p=fetchnode(v,jj); if (!p) break; if (p->xref>=0) { o,i=addr_(hashedcode(p))&mask; if ((i<=kk)+(jj1) storenode(v,kk,p),kk=jj; }@+else if (!k) k=j,free_node(p); /* shortcut */ } storenode(v,kk,NULL); } while (k); k=j; /* the last run through that loop saw no dead nodes */ @ At least one node, |v->proj|, branches on $x_v$ at this point. @= k=(mask>>2)+1-free; /* this many nodes still branch on $x_v$ */ for (newmask=mask;(newmask>>5)>=k;newmask>>=1); if (newmask!=mask) { if (verbose&256) printf("downsizing the hash table for level %d(x%d) (%d slots)\n", v-varhead,v->name,(newmask+1)/sizeof(addr)); free-=(mask-newmask)>>2; @; for (k=mask>>logpagesize;k>newmask>>logpagesize;k--) o,free_page(page_(v->base[k])); v->mask=newmask; } @ Finally, another algorithm learned from Rudell. To prove its correctness, one can verify the following fact: Any entries that wrapped around from the upper half to the bottom in the original table will still wrap around in the smaller table. @^Rudell, Richard Lyle@> @= for (k=newmask+1;k \smallskip\textindent{$\bullet$} Otherwise $(f,g,h)$ encodes a ternary operation on the three BDD nodes $f$, $g$, |h&-16|. 
The four least-significant bits of~$h$ are used to identify the ternary operation involved: if-then-else~(0), median~(1), and-and~(2), and-exist~(3), not-yet-implemented~(4--15). @^Ternary operations@> @s memo int @d memo_(a) ((memo*)(size_t)(a)) @= typedef struct memo_struct { addr f; /* first operand */ addr g; /* second operand, or time stamp */ addr h; /* third operand and/or operation code */ addr r; /* result */ } memo; @ The cache always occupies $2^e$ pages of the dynamic memory, for some integer $e\ge0$. If we have leisure to choose this size, we pick the smallest $e\ge0$ such that the cache has at least $\max(4m,n/4)$ slots, where $m$ is the number of nonempty items in the cache and $n$ is the number of live nodes in the BDD. Furthermore, the cache size will double whenever the number of cache insertions reaches a given threshold. @^Tweakable parameters@> @d logmaxcachepages 15 /* shouldn't be large if |logvarsize| is large */ @d maxcachepages (1<= addr cachepage[maxcachepages]; /* base addresses for the cache */ int cachepages; /* the current number of pages in the cache */ int cacheinserts; /* the number of times we've inserted a memo */ int threshold; /* the number of inserts that trigger cache doubling */ int cachemask; /* index of the first slot following the cache, minus 1 */ @ The following subroutines, useful for debugging, print out the cache contents in symbolic form. If |p| points to a node, |id(p)| is |p-botsink|. @d id(a) (((size_t)(a)-(size_t)mem)/sizeof(node)) /* node number in |mem| */ @= void print_memo(memo *m) { printf("%x",id(m->f)); if (m->h==0) printf("[%d]",m->g); /* time stamp */ else if (m->h<=maxbinop) printf("%s%x",binopname[m->h],id(m->g)); else printf("%s%x%s%x", ternopname1[m->h&0xf],id(m->g),ternopname2[m->h&0xf],id(m->h)); printf("=%x\n",id(m->r)); } @# void print_cache(void) { register int k; register memo* m; for (k=0;kr) print_memo(m); } @ Many of the symbolic names here are presently unused. 
I've filled them in just to facilitate extensions to this program. @= char *binopname[]= {"","&",">","!","<","@@","^","|","_","A","N","#","Y","$","D","E"}; char *ternopname1[]= {"?",".","&","&","@@","#","$","%","*","<","-","+","|","/","\\","~"}; char *ternopname2[]= {":",".","&","E","@@","#","$","%","*","<","-","+","|","/","\\","~"}; @ The threshold is set to half the total number of cache slots, because this many random insertions will keep about $e^{-1/2}\approx 61$\% of the cache slots unclobbered. (If $p$ denotes this probability, a random large binary tree will need about $E$ steps to recalculate a lost result, where $E=p\cdot1+(1-p)\cdot(1+2E)$; hence we want $p>1/2$ to avoid blowup, and $E=1/(2p-1)$.) @= int choose_cache_size(int items) { register int k,slots; k=1,slots=cacheslotsperpage; while (4*slotsr=0; zmems+=cacheslotsperpage; } cachemask=(cachepages<= cache_init(); @ Here's how we look for a memo in the cache. Memos might point to dead nodes, as long as those nodes still exist. A simple hash function is adequate for caching, because no clustering can occur. No mems are charged for computing |cachehash|, because we assume that the calling routine has taken responsibility for accessing |f->index| and |g->index|. @d cachehash(f,g,h) ((f)->index<<4)^(((h)?(g)->index:addr_(g))<<5)^(addr_(h)<<6) @d thememo(s) memo_(cachepage[((s)&cachemask)>>logpagesize]+((s)&pagemask)) @= node* cache_lookup(node *f,node *g,node *h) { register node *r; register memo *m; register addr slot=cachehash(f,g,h); o,m=thememo(slot); o,r=node_(m->r); if (!r) return NULL; if (o,node_(m->f)==f && node_(m->g)==g && node_(m->h)==h) { if (verbose&8) { printf("hit %x: ",(slot&cachemask)/sizeof(memo)); print_memo(m); } if (o,r->xref<0) { recursively_revive(r); return r; } return o,r->xref++,r; } return NULL; } @ Insertion into the cache is even easier, except that we might want to double the cache size while we're at it. 
@= void cache_insert(node *f,node *g,node *h,node *r) { register memo *m,*mm; register int k; register int slot=cachehash(f,g,h); if (h) oo;@+else o; /* mems for computing |cachehash| */ if (++cacheinserts>=threshold) @; o,m=thememo(slot); if ((verbose&16) && m->r) { printf("lose %x: ",(slot&cachemask)/sizeof(memo)); print_memo(m); } oo,m->f=addr_(f),m->g=addr_(g),m->h=addr_(h),m->r=addr_(r); if (verbose&32) { printf("set %x: ",(slot&cachemask)/sizeof(memo)); print_memo(m); } } @ @= if (cachepages=cachepages;k--) o,free_page(page_(cachepage[k])); goto done; } for (m=memo_(cachepage[k]);mr=0; zmems+=cacheslotsperpage; } cachepages<<=1; cachemask+=cachemask+1; threshold=1+(cachepages*cacheslotsperpage)/2; @; } done:@; @ @= for (k=cachepages>>1;kr) { if (m->h) oo;@+else o; /* mems for computing |cachehash| */ oo,mm=thememo(cachehash(node_(m->f),node_(m->g),node_(m->h))); if (m!=mm) { oo,*mm=*m; o,m->r=0; } } } @ Before we purge elements from the unique tables, we need to purge all references to dead nodes from the cache. At the same time, we might as well purge items whose time stamp has expired. @= void cache_purge(void) { register int k,items,newcachepages; register memo *m,*mm; for (k=items=0;kr) { if ((o,node_(m->r)->xref<0) || (oo,node_(m->f)->xref<0)) goto purge; if (m->h==0) { if (m->g!=thevar(node_(m->f))->timestamp) goto purge; }@+else { if (o,node_(m->g)->xref<0) goto purge; if (m->h>maxbinop && (o,node_(m->h&-0x10)->xref<0)) goto purge; } items++;@+continue; purge: o,m->r=0; } } if (verbose&(8+16+32+512)) printf("purging the cache (%d items left)\n",items); @; cacheinserts=items; } @ @= newcachepages=choose_cache_size(items); if (newcachepagesr) { if (m->h) oo;@+else o; /* mems for computing |cachehash| */ oo,mm=thememo(cachehash(node_(m->f),node_(m->g),node_(m->h))); if (m!=mm) { oo,*mm=*m; } } free_page(page_(cachepage[k])); } cachepages=newcachepages; threshold=1+(cachepages*cacheslotsperpage)/2; } @*BDD structure. 
The reader of this program ought to be familiar with the basics of BDDs, namely the facts that a BDD base consists of two sink nodes together with an unlimited number of branch nodes, where each branch node $(v,l,h)$ names a variable $x_v$ and points to other nodes $l\ne h$ that correspond to the cases where $x_v=0$ and $x_v=1$. The variables on every path have increasing rank~$v$, and no two nodes have the same $(v,l,h)$. Besides the nodes of the BDD, this program deals with external pointers $f_j$ for $0\le j<|extsize|$. Each $f_j$ is either |NULL| or points to a BDD node. @^Tweakable parameters@> @d extsize 1000 @= node *f[extsize]; /* external pointers to functions in the BDD base */ @ Sometimes we want to mark the nodes of a subfunction temporarily. The following routine sets the leading bit of the |xref| field in all nodes reachable from~|p|. @= void mark(node *p) { rmems++; /* track recursion overhead */ restart:@+if (o,p->xref>=0) { o,p->xref^=0x80000000; ooo,mark(node_(p->lo)); /* two extra mems to save and restore |p| */ o,p=node_(p->hi); goto restart; /* tail recursion */ } } @ We need to remove those marks soon after |mark| has been called, because the |xref| field is really supposed to count references. @= void unmark(node *p) { rmems++; /* track recursion overhead */ restart:@+if (o,p->xref<0) { o,p->xref^=0x80000000; ooo,unmark(node_(p->lo)); /* two extra mems to save and restore |p| */ o,p=node_(p->hi); goto restart; /* tail recursion */ } } @ Here's a simple routine that prints out the current BDDs, in order of the variables in branch nodes. If the |marked| parameter is nonzero, the output is restricted to branch nodes whose |xref| field is marked. Otherwise all nodes are shown, with nonzero |xref|s in parentheses. 
@d thevar(p) (&varhead[varpart((p)->index)]) @d print_node(p) printf("%x: (~%d?%x:%x)",id(p),thevar(p)->name,id((p)->lo),id((p)->hi)) @d print_node_unmapped(p) printf("%x: (~%d?%x:%x)",id(p),thevar(p)-varhead,id((p)->lo),id((p)->hi)) @= void print_base(int marked,int unmapped) { register int j,k; register node *p; register var *v; for (v=varhead;vproj) { for (k=0;kmask;k+=sizeof(addr)) { p=fetchnode(v,k); if (p && (!marked || (p->xref+1)<0)) { if (unmapped) print_node_unmapped(p);@+else print_node(p); if (marked || p->xref==0) printf("\n"); else printf(" (%d)\n",p->xref); } } if (!marked && v->repl) printf("y%d=%x\n",v->name,id(v->repl)); } if (!marked) { /* we also print the external functions */ for (j=0;j= void print_function(node *p,int unmapped) { unsigned long long savemems=mems,savermems=rmems; /* mems aren't counted while printing */ if (p==botsink || p==topsink) printf("%d\n",p-botsink); else if (p) { mark(p); print_base(1,unmapped); unmark(p); } mems=savemems, rmems=savermems; } @ @= void print_profile(node *p) { unsigned long long savemems=mems,savermems=rmems; register int j,k,tot; register var *v; if (!p) printf(" 0\n"); /* vacuous */ else if (p<=topsink) printf(" 1\n"); /* constant */ else { tot=0; mark(p); for (v=varhead;vproj) { @; } unmark(p); printf(" 2 (total %d)\n",tot+2); /* the sinks */ } mems=savemems, rmems=savermems; } @ @= for (j=k=0;kmask;k+=sizeof(addr)) { register node *q=fetchnode(v,k); if (q && (q->xref+1)<0) j++; } printf(" %d",j); tot+=j; @ In order to deal efficiently with large BDDs, we've introduced highly redundant data structures, including things like hash tables and the cache. Furthermore, we assume that every BDD node~|p| has a redundant field |p->xref|, which counts the total number of branch nodes, external nodes, replacement functions, and projection functions that point to~|p|, minus~1. Bugs in this program might easily corrupt the data structure by putting it into an inconsistent state. 
Yet the inconsistency might not show up at the time of the error; the computer might go on to execute millions of instructions before the anomalies lead to disaster. Therefore I've written a |sanity_check| routine, which laboriously checks the integrity of all the data structures. This routine should help me to pinpoint problems readily whenever I make mistakes. And besides, the |sanity_check| calculations document the structures in a way that should be especially helpful when I reread this program a year from now. Even today, I think that the very experience of writing |sanity_check| has made me much more familiar with the structures themselves. This reinforced knowledge will surely be valuable as I write the rest of the code. @d includesanity 1 @= #if includesanity unsigned int sanitycount; /* how many sanity checks have been started? */ void sanity_check(void) { register node *p,*q; register int j,k,count,extra; register var *v; unsigned long long savemems=mems; sanitycount++; @; @; @; @; @; mems=savemems; } #endif @ Sanity checking is done with a shadow memory'' |smem|, which is just as big as~|mem|. If |p| points to a node in |mem|, there's a corresponding ghost'' in |smem|, pointed to by |q=ghost(p)|. The ghost nodes have four fields |lo|, |hi|, |xref|, and |index|, just like ordinary nodes do; but the meanings of those fields are quite different: |q->xref| is$-1$if node~|p| is in the free list, otherwise |q->xref| is a backpointer to a field that points to~|p|. If |p->lo| points to~|r|, then |q->lo| will be a backpointer that continues the list of pointers to~|r| that began with the |xref| field in |r|'s ghost; and there's a similar relationship between |p->hi| and |q->hi|. (Thus we can find all nodes that point to~|p|.) Finally, |q->index| counts the number of references to~|p| from external pointers, projection functions, and replacement functions. 
@d ghost(p) node_((size_t)(p)-(size_t)mem+(size_t)smem) @= #if includesanity char smem[memsize]; /* the shadow memory */ #endif @ @d complain(complaint) {@+printf("! %s in node ",complaint); @+print_node(p);@+printf("\n");@+} @d legit(p) (((size_t)(p)&(sizeof(node)-1))==0 && (p)=botsink && ghost(p)->xref!=-1) @d superlegit(p) (((size_t)(p)&(sizeof(node)-1))==0 && (p)topsink && ghost(p)->xref!=-1) @= for (p=botsink;pxref=0,ghost(p)->index=-1; @; @; for (count=2,p=topsink+1;pxref!=-1) { count++; if (!legit(node_(p->lo)) || !legit(node_(p->hi))) complain("bad pointer")@; else if (node_(thevar(p)->proj)==NULL) complain("bad var")@; else if (p->lo==p->hi) complain("lo=hi")@; else { @; if (node_(p->lo)>topsink && thevar(p)>=thevar(node_(p->lo))) complain("bad lo rank"); if (node_(p->hi)>topsink && thevar(p)>=thevar(node_(p->hi))) complain("bad hi rank"); if (p->xref>=0) { /* dead nodes don't point */ q=ghost(p); q->lo=ghost(p->lo)->xref, ghost(p->lo)->xref=addr_(&(p->lo)); q->hi=ghost(p->hi)->xref, ghost(p->hi)->xref=addr_(&(p->hi)); } } } if (count!=totalnodes) printf("! totalnodes should be %d, not %d\n",count,totalnodes); if (extra!=totalnodes) printf("! %d nodes have leaked\n",extra-totalnodes); @ The macros above and the |who_points_to| routine below rely on the fact that |sizeof(node)=16|. @ @+= if (sizeof(node)!=16) { fprintf(stderr,"Sorry, I assume that sizeof(node) is 16!\n"); exit(-3); } @ @= { register addr *hash; register var *v=thevar(p); j=v->mask; for (hash=hashcode(node_(p->lo),node_(p->hi));;hash++) { k=addr_(hash)&j; q=fetchnode(v,k); if (!q) break; if (q->lo==p->lo && q->hi==p->hi) break; } if (q!=p) complain("unfindable (lo,hi)"); addr__((size_t)(v->base[k>>logpagesize]+(k&pagemask)) -(size_t)mem+(size_t)smem)=sanitycount; } @ @= extra=nodeptr-botsink; for (p=nodeavail;p;p=node_(p->xref)) { if (!superlegit(p)) printf("! 
illegal node %x in the list of free nodes\n",id(p)); else extra--,ghost(p)->xref=-1; } @ @= ghost(botsink)->index=ghost(topsink)->index=0; for (v=varhead;vproj) { if (!superlegit(node_(v->proj))) printf("! illegal projection function for level %d\n",v-varhead); else ghost(v->proj)->index++; if (v->repl) { if (!legit(node_(v->repl))) printf("! illegal replacement function for level %d\n",v-varhead); else ghost(v->repl)->index++; } } for (j=0;jtopsink && !superlegit(f[j])) printf("! illegal external pointer f%d\n",j); else ghost(f[j])->index++; } @ @= for (p=botsink,count=0;pxref==-1) continue; /* |p| is free */ for (k=q->index,q=node_(q->xref);q;q=node_(addr__(ghost(q)))) k++; if (p->xref!=k) printf("! %x->xref should be %d, not %d\n",id(p),k,p->xref); if (k<0) count++; /* |p| is dead */ } if (count!=deadnodes) printf("! deadnodes should be %d, not %d\n",count,deadnodes); @ If a reference count turns out to be wrong, I'll probably want to know why. The following subroutine provides additional clues. @= #if includesanity void who_points_to(node *p) { register addr q; /* the address of a |lo| or |hi| field in a node */ for (q=addr_(ghost(p)->xref);q;q=addr__(ghost(q))) { print_node(node_(q&-sizeof(node))); printf("\n"); } } #endif @ We've seen that every superlegitimate node is findable in the proper unique table. Conversely, we want to check that everything in those tables is superlegitimate, and found. @d badpage(p) ((p)=topofmem) @= extra=topofmem-pageptr; /* this many pages allocated */ for (v=varhead;vproj) { for (k=0;k<=v->mask>>logpagesize;k++) if (badpage(page_(v->base[k]))) printf("! bad page base %x in unique table for level %d\n", id(v->base[k]),v-varhead); extra-=1+(v->mask>>logpagesize); for (k=count=0;kmask;k+=sizeof(addr)) { p=fetchnode(v,k); if (!p) count++; else { if (addr__((size_t)(v->base[k>>logpagesize]+(k&pagemask)) -(size_t)mem+(size_t)smem)!=sanitycount) printf("!
extra node %x in unique table for level %d\n",id(p),v-varhead); if (!superlegit(p)) printf("! illegal node %x in unique table for level %d\n",id(p),v-varhead); else if (varpart(p->index)!=v-varhead) complain("wrong var"); } } if (count!=v->free) printf("! unique table %d has %d free slots, not %d\n", v-varhead,count,v->free); } @ The fields in cache memos that refer to nodes should refer to legitimate nodes. @= { register memo *m; extra-=1+(cachemask>>logpagesize); for (k=0;kr) { if (!legit(node_(m->r))) goto nogood; if (!legit(node_(m->f))) goto nogood; if (m->h>0) { if (!legit(node_(m->g))) goto nogood; if (m->h>maxbinop && !legit(node_(m->h&-0x10))) goto nogood; } } continue; nogood: printf("! bad node in cache entry ");@+print_memo(m); } } @ Finally, |sanity_check| ensures that we haven't forgotten to free unused pages, nor have we freed a page that was already free. @= { register page *p=pageavail; while (p && extra>0) { if (badpage(p)) printf("! bad free page %x\n",id(p)); p=page_(p->dat[0]),extra--; } if (extra>0) printf("! %d pages have leaked\n",extra); else if (p) printf("! the free pages form a loop\n"); } @ The following routine brings a dead node back to life. It also increases the reference counts of the node's children, and resuscitates them if they were dead. @= void recursively_revive(node *p) { register node *q; rmems++; /* track recursion overhead */ restart:@+if (verbose&4) printf("reviving %x\n",id(p)); o,p->xref=0; deadnodes--; q=node_(p->lo); if (o,q->xref<0) oooo,recursively_revive(q); else o,q->xref++; p=node_(p->hi); if (o,p->xref<0) goto restart; /* tail recursion */ else o,p->xref++; } @ Conversely, we sometimes must go the other way, with as much dignity as we can muster. 
@d deref(p) if (o,(p)->xref==0) recursively_kill(p);@+else o,(p)->xref-- @= void recursively_kill(node *p) { register node *q; rmems++; /* track recursion overhead */ restart:@+if (verbose&4) printf("burying %x\n",id(p)); o,p->xref=-1; deadnodes++; q=node_(p->lo); if (o,q->xref==0) oooo,recursively_kill(q); else o,q->xref--; p=node_(p->hi); if (o,p->xref==0) goto restart; /* tail recursion */ else o,p->xref--; } @*Binary operations. OK, now we've got a bunch of powerful routines for making and maintaining BDDs, and it's time to have fun. Let's start with a typical synthesis routine, which constructs the BDD for$f\land g$from the BDDs for$f$and~$g$. The general pattern is to have a top-level subroutine and a recursive subroutine. The top-level one updates overall status variables and invokes the recursive one; and it keeps trying, if temporary setbacks arise. The recursive routine exits quickly if given a simple case. Otherwise it checks the cache, and calls itself if necessary. I write the recursive routine first, since it embodies the guts of the computation. % and since C wants me to The top-level routines are rather boring, so I'll defer them till later. Incidentally, I learned the \CEE/ language long ago, and didn't know until recently that it's now legal to modify the formal parameters to a function. (Wow!) @= node* and_rec(node*f, node*g) { var *v,*vf,*vg; node *r,*r0,*r1; if (f==g) return oo,f->xref++,f; /*$f\land f=f$*/ if (f>g) r=f, f=g, g=r; /* wow */ if (f<=topsink) { if (f==topsink) return oo,g->xref++,g; /*$1\land g=g$*/ return oo,botsink->xref++,botsink; /*$0\land g=0$*/ } oo,r=cache_lookup(f,g,node_(1)); /* two mems for |f->index|, |g->index| */ if (r) return r; @; } @ I assume that |f->lo| and |f->hi| belong to the same octabyte. The |rmems| counter is incremented only after we've checked for special terminal cases. When none of the simplifications apply, we must prepare to plunge in to deeper waters. 
@= rmems++; /* track recursion overhead */ vf=thevar(f); /* |f->index| has already been fetched */ vg=thevar(g); /* |g->index| has already been fetched */ if (vflo): f),(vg==v? o,node_(g->lo): g)); if (!r0) return NULL; /* oops, trouble */ r1=and_rec((vf==v? node_(f->hi): f),(vg==v? node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v= node* or_rec(node*f, node*g) { var *v,*vf,*vg; node *r,*r0,*r1; if (f==g) return oo,f->xref++,f; /*$f\lor f=f$*/ if (f>g) r=f, f=g, g=r; /* wow */ if (f<=topsink) { if (f==topsink) return oo,topsink->xref++,topsink; /*$1\lor g=1$*/ return oo,g->xref++,g; /*$0\lor g=g$*/ } oo,r=cache_lookup(f,g,node_(7)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ vf=thevar(f); vg=thevar(g); if (vflo): f),(vg==v? o,node_(g->lo): g)); if (!r0) return NULL; /* oops, trouble */ r1=or_rec((vf==v? node_(f->hi): f),(vg==v? node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v= node* xor_rec(node*f, node*g) { var *v,*vf,*vg; node *r,*r0,*r1; if (f==g) return oo,botsink->xref++,botsink; /*$f\oplus f=0$*/ if (f>g) r=f, f=g, g=r; /* wow */ if (f==botsink) return oo,g->xref++,g; /*$0\oplus g=g$*/ oo,r=cache_lookup(f,g,node_(6)); if (r) return r; @; } @ After discovering that$f\oplus g=r$, we could also put$f\oplus r=g$and$g\oplus r=f$into the cache. I tried that, in the first draft of this code. Unfortunately it cost more than it saved. @= rmems++; /* track recursion overhead */ if (f==topsink) vf=0,v=vg=thevar(g); else { vf=thevar(f),vg=thevar(g); if (vflo): f),(vg==v? o,node_(g->lo): g)); if (!r0) return NULL; /* oops, trouble */ r1=xor_rec((vf==v? node_(f->hi): f),(vg==v? 
node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v @^Madre, Jean Christophe@> The function$f\downarrow g$returned by this routine depends on the current variable ordering. In brief,$f\downarrow g=0$if$g$is identically zero; otherwise$(f\downarrow g)(x)=f(y)$when$y$is the first element of the sequence$x$,$x\oplus1$,$x\oplus2$, \dots\ such that$g(y)=1$. (This definition treats$x=(x_0\ldots x_n)_2$and$y=(y_0\ldots y_n)_2$as binary numbers.) @= node* constrain_rec(node*f, node*g) { var *v,*vf,*vg; node *r,*r0,*r1; if (g==botsink) return oo,botsink->xref++,botsink; /*$f\downarrow 0=0$*/ if (g==topsink || f<=topsink) return oo,f->xref++,f; /*$f\downarrow1=f$,$0\downarrow g=0$; also$1\downarrow g=1$if$g\ne0$*/ if (f==g) return oo,topsink->xref++,topsink; /*$f\downarrow f=1$*/ oo,r=cache_lookup(f,g,node_(8)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ vf=thevar(f); vg=thevar(g); if (vflo)==botsink) { oo,r=constrain_rec((vf==v? o,node_(f->hi): f),node_(g->hi)); goto shortcut; } if (o,node_(g->hi)==botsink) { oo,r=constrain_rec((vf==v? o,node_(f->lo): f),node_(g->lo)); goto shortcut; } } r0=constrain_rec((vf==v? o,node_(f->lo): f),(vg==v? o,node_(g->lo): g)); if (!r0) return NULL; /* oops, trouble */ r1=constrain_rec((vf==v? node_(f->hi): f),(vg==v? node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); shortcut:@+if (r) { if ((verbose&128)&&(v The program doesn't actually bother to check that |g| has this form. But if |g| is a general function, the meaning is dependent on the ordering of variables; consider, for instance, the case$g=\bar x_1\lor x_2$. The user who tries general functions had better be aware (or beware) of this fact. 
If$g$is a conjunction of$k$literals, and if they all have the highest possible rank (so that they occur at the bottom of the BDDs), quantification takes linear time in the size of the BDD for~|f|. But if the literals have low rank and occur near the top, the running time can be as bad as the$2^k$th power of that BDD size(!). @ The first case, existential quantification ($f\(E)g$), is typical. Notice that we don't use the cache unless there are multiple references to~|f|. The number of references to~|g| is immaterial here, since we're treating |g| as a one-dimensional list of literals. @= node* exist_rec(node*f, node*g) { var *v,*vg; node *r,*r0,*r1; restart:@+if (g<=topsink) return oo,f->xref++,f; /*$f\(E)1=f$*/ if (f<=topsink) return oo,f->xref++,f; /*$0\(E)g=0$,$1\(E)g=1$*/ v=thevar(f); vg=thevar(g); if (v>vg) { o,g=node_(g->hi);@+goto restart; /* |f| doesn't depend on |vg| */ } oo,r=cache_lookup(f,g,node_(15)); if (r) return r; @; } @ When the top variable of |g| is the same as the top variable of~|f|, we always descend to the |hi| branch below node~|g|, since |g| is supposed to be a conjunction. @= rmems++; /* track recursion overhead */ o,r0=exist_rec(node_(f->lo),(vg==v? o,node_(g->hi): g)); if (!r0) return NULL; /* oops, trouble */ if (r0==topsink && vg==v) { r=r0; goto gotr; } r1=exist_rec(node_(f->hi),(vg==v? 
node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (vg>v) r=unique_find(v,r0,r1); else { r=or_rec(r0,r1); /* existential quantification happens here */ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ } gotr:@+if (r) { if ((verbose&128)&&(v= node* all_rec(node*f, node*g) { var *v,*vg; node *r,*r0,*r1; restart:@+if (g<=topsink) return oo,f->xref++,f; /*$f\(A)1=f$*/ if (f<=topsink) return oo,f->xref++,f; /*$0\(A)g=0$,$1\(A)g=1$*/ v=thevar(f); vg=thevar(g); if (v>vg) { o,g=node_(g->hi);@+goto restart; /* |f| doesn't depend on |vg| */ } oo,r=cache_lookup(f,g,node_(9)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ o,r0=all_rec(node_(f->lo),(vg==v? o,node_(g->hi): g)); if (!r0) return NULL; /* oops, trouble */ if (r0==botsink && vg==v) { r=r0; goto gotr; } r1=all_rec(node_(f->hi),(vg==v? node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (vg>v) r=unique_find(v,r0,r1); else { r=and_rec(r0,r1); /* universal quantification happens here */ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ } gotr:@+if (r) { if ((verbose&128)&&(v= node* diff_rec(node*f, node*g) { var *v,*vg; node *r,*r0,*r1; if (g<=topsink) return oo,f->xref++,f; /*$f\(D)1=f$*/ if (f<=topsink) return oo,botsink->xref++,botsink; /*$0\(D)g=1\(D)g=0$when |g| isn't constant */ v=thevar(f); vg=thevar(g); if (v>vg) return oo,botsink->xref++,botsink; /* |f| doesn't depend on |vg| */ oo,r=cache_lookup(f,g,node_(14)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ o,r0=diff_rec(node_(f->lo),(vg==v? o,node_(g->hi): g)); if (!r0) return NULL; /* oops, trouble */ r1=diff_rec(node_(f->hi),(vg==v? 
node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (vg>v) r=unique_find(v,r0,r1); else { r=xor_rec(r0,r1); /* differencing happens here */ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ } if (r) { if ((verbose&128)&&(v= node* yes_no_rec(int curop,node*f, node*g) { var *v,*vg; node *r,*r0,*r1; if (g<=topsink) return oo,f->xref++,f; /*$f\(Y)1=f\(N)1=f$*/ if (f<=topsink) return oo,botsink->xref++,botsink; /*$0\(Y)g=1\(Y)g=0\(N)g=1\(N)g=0$when |g| isn't constant */ v=thevar(f); vg=thevar(g); if (v>vg) return oo,botsink->xref++,botsink; /* |f| doesn't depend on |vg| */ oo,r=cache_lookup(f,g,node_(curop)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ o,r0=yes_no_rec(curop,node_(f->lo),(vg==v? o,node_(g->hi): g)); if (!r0) return NULL; /* oops, trouble */ if (r0<=topsink && vg==v) { if ((r0==topsink)==(curop==12)) { r=botsink;@+deref(r0);@+botsink->xref++; goto gotr; } } r1=yes_no_rec(curop,node_(f->hi),(vg==v? node_(g->hi): g)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (vg>v) r=unique_find(v,r0,r1); else { if (curop==12) r=mux_rec(r0,botsink,r1); /*$f\(Y)g=\bar r_0\land r_1$*/ else r=mux_rec(r1,botsink,r0); /*$f\(N)g=r_0\land\bar r_1$*/ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ } gotr:@+if (r) { if ((verbose&128)&&(v= node *mux_rec(node *f, node *g, node *h); @*Ternary operations. All operations can be reduced to binary operations, but it should be interesting to see if we get a speedup by staying ternary. I like to call the first one mux,'' although many other authors have favored ite'' (meaning if-then-else). The latter doesn't seem right to me when I try to pronounce it. So I'm sticking with the well-worn, traditional name for this function. 
Two special cases are worthy of note:$h=1$gives $f$implies$g$'';$g=0$gives not$f$but$g$.'' I could have implemented those cases as binary operators, but I chose to let them take the slightly slower ternary route. (I'm usually a fan of speed, but this program is already long enough.) @= node *mux_rec(node *f, node *g, node *h) { var *v,*vf,*vg,*vh; node *r,*r0,*r1; if (f<=topsink) { if (f==topsink) return oo,g->xref++,g; /*$(1{?}\ g{:}\ h)=g$*/ return oo,h->xref++,h; /*$(0{?}\ g{:}\ h)=h$*/ } if (g==f || g==topsink) return or_rec(f,h); /*$(f{?}\ f{:}\ h) = (f{?}\ 1{:}\ h) = f\lor h$*/ if (h==f || h==botsink) return and_rec(f,g); /*$(f{?}\ g{:}\ f) = (f{?}\ g{:}\ 0) = f\land g$*/ if (g==h) return oo,g->xref++,g; /*$(f{?}\ g{:}\ g)=g$*/ if (g==botsink && h==topsink) return xor_rec(topsink,f); /*$(f{?}\ 0{:}\ 1) = 1\oplus f$*/ oo,r=cache_lookup(f,g,h); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ v=vf=thevar(f); if (g==botsink) vg=topofvars;@+else { vg=thevar(g);@+if (v>vg) v=vg; } if (h==topsink) vh=topofvars;@+else { o,vh=thevar(h);@+if (v>vh) v=vh; } r0=mux_rec((vf==v? o,node_(f->lo): f),(vg==v? o,node_(g->lo): g), (vh==v? o,node_(h->lo): h)); if (!r0) return NULL; /* oops, trouble */ r1=mux_rec((vf==v? o,node_(f->hi): f),(vg==v? o,node_(g->hi): g), (vh==v? 
o,node_(h->hi): h)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v= node *med_rec(node *f, node *g, node *h) { var *v,*vf,*vg,*vh; node *r,*r0,*r1; if (f>g) { if (g>h) r=f,f=h,h=r; else if (f>h) r=f,f=g,g=h,h=r; else r=f,f=g,g=r; }@+else if (g>h) { if (f>h) r=f,f=h,h=g,g=r; else r=g,g=h,h=r; } /* now$f\le g\le h$*/ if (f<=topsink) { if (f==topsink) return or_rec(g,h); /*$\langle 1gh\rangle=g\lor h$*/ return and_rec(g,h); /*$\langle 0gh\rangle=g\land h$*/ } if (f==g) return oo,f->xref++,f; /*$\langle ffh\rangle=f$*/ if (g==h) return oo,g->xref++,g; /*$\langle fgg\rangle=g$*/ oo,r=cache_lookup(f,g,node_(addr_(h)+1)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ vf=thevar(f); vg=thevar(g); o,vh=thevar(h); if (vfvh) v=vh; /* choose the top variable, |v| */ r0=med_rec((vf==v? o,node_(f->lo): f),(vg==v? o,node_(g->lo): g), (vh==v? o,node_(h->lo): h)); if (!r0) return NULL; /* oops, trouble */ r1=med_rec((vf==v? o,node_(f->hi): f),(vg==v? o,node_(g->hi): g), (vh==v? 
o,node_(h->hi): h)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v= node *and_and_rec(node *f, node *g, node *h) { var *v,*vf,*vg,*vh; node *r,*r0,*r1; if (f>g) { if (g>h) r=f,f=h,h=r; else if (f>h) r=f,f=g,g=h,h=r; else r=f,f=g,g=r; }@+else if (g>h) { if (f>h) r=f,f=h,h=g,g=r; else r=g,g=h,h=r; } /* now$f\le g\le h$*/ if (f<=topsink) { if (f==topsink) return and_rec(g,h); /*$1\land g\land h=g\land h$*/ return oo,botsink->xref++,botsink; /*$0\land g\land h=0$*/ } if (f==g) return and_rec(g,h); /*$f\land f\land h=f\land h$*/ if (g==h) return and_rec(f,g); /*$f\land g\land g=f\land g$*/ oo,r=cache_lookup(f,g,node_(addr_(h)+2)); if (r) return r; @; } @ @= rmems++; /* track recursion overhead */ vf=thevar(f); vg=thevar(g); o,vh=thevar(h); if (vfvh) v=vh; /* choose the top variable, |v| */ r0=and_and_rec((vf==v? o,node_(f->lo): f),(vg==v? o,node_(g->lo): g), (vh==v? o,node_(h->lo): h)); if (!r0) return NULL; /* oops, trouble */ r1=and_and_rec((vf==v? o,node_(f->hi): f),(vg==v? o,node_(g->hi): g), (vh==v? o,node_(h->hi): h)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } r=unique_find(v,r0,r1); if (r) { if ((verbose&128)&&(v @= node* and_exist_rec(node*f, node*g, node*h) { var *v,*vf,*vg,*vh; node *r,*r0,*r1; restart:@+if (h<=topsink) return and_rec(f,g); /*$(f\land g)\(E)1=f\land g$*/ if (f==g) return exist_rec(f,h); /*$(f\land f)\(E)h=f\(E)h$*/ if (f>g) r=f, f=g, g=r; /* wow */ if (f<=topsink) { if (f==topsink) return exist_rec(g,h); /*$(1\land g)\(E)h=g\(E)h$*/ return oo,botsink->xref++,botsink; /*$(0\land d)\(E)h=0$*/ } oo,r=cache_lookup(f,g,node_(addr_(h)+3)); if (r) return r; vf=thevar(f); vg=thevar(g); o,vh=thevar(h); if (vfvh) { o,h=node_(h->hi);@+goto restart; /* |f| and |g| don't depend on |vh| */ } @; } @ @= rmems++; /* track recursion overhead */ r0=and_exist_rec((vf==v? o,node_(f->lo):f),(vg==v? o,node_(g->lo): g), (vh==v? 
o,node_(h->hi): h)); if (!r0) return NULL; /* oops, trouble */ if (r0==topsink && vg==v) { r=r0; goto gotr; } r1=and_exist_rec((vf==v? node_(f->hi):f),(vg==v? node_(g->hi): g), (vh==v? node_(h->hi): h)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (vh>v) r=unique_find(v,r0,r1); else { r=or_rec(r0,r1); /* existential quantification happens here */ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ } gotr:@+if (r) { if ((verbose&128)&&(vk$ remain usable. Therefore we maintain a time stamp for each variable $x_k$. The time stamps for $x_0$ through $x_k$ are increased whenever $y_k$ is updated, so that cache entries for the composition of such functions won't match when |cache_lookup| is called. The time stamp for $x_v$ is zero if and only if all of $y_v$, $y_{v+1}$, \dots\ are currently undefined. Two global variables control this process: The variable |timestamp_changed| is either 0 or~1; the difference |timestamp-timestamp_changed| is the number of times we've performed a composition operation. @ @= int timestamp; /* the number of distinct compositions done or prepared */ int timestamp_changed; /* has it changed since the last composition was done? */ @ Here's how the individual time stamps are updated when the user gives a new specification for $y_k$: @repl|@>= v->repl=addr_(q); if (q==NULL) { /* cancel a previous replacement */ if (v+1timestamp!=0)) goto newstamps; while (v>=varhead && (o,v->repl==0)) o,v->timestamp=0,v--; /* cancel the previous timestamp */ if (v>=varhead) goto newstamps; timestamp-=timestamp_changed, timestamp_changed=0; /* all clear */ }@+else { oo,q->xref++; newstamps:@+if (!timestamp_changed) timestamp_changed=1,timestamp++; /* N.B.: |timestamp| won't become zero */ while (v>=varhead && (o,v->timestamp!=timestamp)) o,v->timestamp=timestamp,v--; /* this is done even when |v->proj=0| */ } @ A special adjustment to the time stamps is needed when we're wiping out the whole cache. 
@= if (varhead[0].timestamp==0) o,timestamp=timestamp_changed=0; else { timestamp=timestamp_changed=1; for (v=varhead;vtimestamp);v++) o,v->timestamp=1; } @ Okay, we're ready for the master recursion. @= node *compose_rec(node*f) { var *vf; node *r,*r0,*r1; if (f<=topsink) return oo,f->xref++,f; /* $f$ is constant */ o,vf=thevar(f); if (o,vf->timestamp==0) return oo,f->xref++,f; /* $f$ doesn't depend on $y$'s */ o,r=cache_lookup(f,node_(vf->timestamp),node_(0)); if (r) return r; @; } @ The computations here look basically the same as those we've been seeing in previous recursions. But in fact there is a huge difference: The functions |r0| and~|r1| now can involve {\it all\/} variables, not just the variables near the bottom of the BDDs. @= rmems++; /* track recursion overhead */ o,r0=compose_rec(node_(f->lo)); if (!r0) return NULL; /* oops, trouble */ r1=compose_rec(node_(f->hi)); if (!r1) { deref(r0); /* too bad, but we have to abort in midstream */ return NULL; } if (o,vf->repl) r=node_(vf->repl);@+else r=node_(vf->proj); r=mux_rec(r,r1,r0); /* replacement happens here */ deref(r0);@+deref(r1); /* we're done with |r0| and |r1| */ if (r) { if ((verbose&128)&&(vftimestamp,vf-varhead); cache_insert(f,node_(vf->timestamp),node_(0),r); } return r; @*Top-level calls. As mentioned above, there's a top-level wrapper'' around each of the recursive synthesis routines, so that we can launch them properly. Here's the top-level routine for binary operators. 
@= node* binary_top(int curop, node*f, node*g) { node *r; unsigned long long oldmems=mems, oldrmems=rmems, oldzmems=zmems; if (verbose&2) printf("beginning to compute %x %s %x:\n", id(f),binopname[curop],id(g)); cacheinserts=0; while (1) { switch (curop) { case 1: r=and_rec(f,g);@+break; /* $f\land g$ */ case 2: r=mux_rec(g,botsink,f);@+break; /* $f\land\bar g=(g{?}\ 0{:}\ f)$ */ case 4: r=mux_rec(f,botsink,g);@+break; /* $\bar f\land g=(f{?}\ 0{:}\ g)$ */ case 6: r=xor_rec(f,g);@+break; /* $f\oplus g$ */ case 7: r=or_rec(f,g);@+break; /* $f\lor g$ */ case 8: r=constrain_rec(f,g);@+break; /* $f\downarrow g$ */ case 9: r=all_rec(f,g);@+break; /* $f\(A) g$ */ case 10: case 12: r=yes_no_rec(curop,f,g);@+break; /* $f\(N)g$ or $f\(Y)g$ */ case 14: r=diff_rec(f,g);@+break; /* $f\(D) g$ */ case 15: r=exist_rec(f,g);@+break; default: fprintf(stderr,"This can't happen!\n");@+exit(-69); } if (r) break; attempt_repairs(); /* try to carry on */ } if (verbose&(1+2)) printf(" %x=%x%s%x (%llu mems, %llu rmems, %llu zmems)\n",@| id(r),id(f),binopname[curop],id(g), mems-oldmems,rmems-oldrmems,zmems-oldzmems); return r; } @ @= void attempt_repairs(void); /* collect garbage or something if there's hope */ @ Have you any wool? Yes sir, yes sir. 
@= node* ternary_top(int curop, node*f, node*g, node*h) { node *r; unsigned long long oldmems=mems, oldrmems=rmems, oldzmems=zmems; if (verbose&2) printf("beginning to compute %x %s %x %s %x:\n", id(f),ternopname1[curop-16],id(g),ternopname2[curop-16],id(h)); cacheinserts=0; while (1) { switch (curop) { case 16: r=mux_rec(f,g,h);@+break; /* $f{?}\ g{:}\ h$ */ case 17: r=med_rec(f,g,h);@+break; /* $\langle fgh\rangle$ */ case 18: r=and_and_rec(f,g,h);@+break; /* $f\land g\land h$ */ case 19: r=and_exist_rec(f,g,h);@+break; /* $(f\land g)\(E)h$ */ default: fprintf(stderr,"This can't happen!\n");@+exit(-69); } if (r) break; attempt_repairs(); /* try to carry on */ } if (verbose&(1+2)) printf(" %x=%x%s%x%s%x (%llu mems, %llu rmems, %llu zmems)\n",@| id(r),id(f),ternopname1[curop-16],id(g),ternopname2[curop-16], id(h),mems-oldmems,rmems-oldrmems,zmems-oldzmems); return r; } @ @= node* compose_top(node*f) { node *r; unsigned long long oldmems=mems, oldrmems=rmems, oldzmems=zmems; if (f<=topsink) return f; if (verbose&2) printf("beginning to compute %x[%u]:\n", id(f),thevar(f)->timestamp); cacheinserts=0; timestamp_changed=0; while (1) { r=compose_rec(f); if (r) break; attempt_repairs(); /* try to carry on */ } if (verbose&(1+2)) printf(" %x=%x[%u] (%llu mems, %llu rmems, %llu zmems)\n",@| id(r),id(f),thevar(f)->timestamp, mems-oldmems,rmems-oldrmems,zmems-oldzmems); return r; } @*Parsing the commands. We're almost done, but we need to control the overall process by obeying the user's instructions. The syntax for elementary user commands appeared at the beginning of this program; now we want to flesh it out and implement it. @= { @; @; @; } @ Before we do any commands, it's helpful to ensure that no embarrassing anomalies will arise. For example, a command like \.{f0=x1{\tta}x2{\tta}x3}' might necessitate making space for up to three new variables; it would be a nuisance if those attempts failed. (See the |projection| routine.) 
We also want to check that |timestamp| hasn't reached its maximum possible value. @d debugging 1 @= #if debugging&includesanity if (verbose&8192) sanity_check(); #endif if (totalnodes>=toobig) @; if (verbose&1024) show_stats(); while (1) { if (timestamp!=-1) { @; } attempt_repairs(); } @ @^Tweakable parameters@> @d bufsize 100 /* all commands are very short, but comments might be long */ @= char buf[bufsize]; /* our master's voice */ @ @= if (infile) { if (!fgets(buf,bufsize,infile)) { /* assume end of file */ if (file_given) goto alldone; /* quit the program if the file was |argv[1]| */ fclose(infile); infile=NULL; continue; } if (verbose&64) printf("> %s",buf); }@+else@+while (1) { printf("> ");@+fflush(stdout); /* prompt the user */ if (fgets(buf,bufsize,stdin)) break; freopen("/dev/tty","r",stdin); /* end of command-line |stdin| */ } @ The first nonblank character of each line identifies the type of command. All-blank lines are ignored; so are lines that begin with \.\#'. I haven't attempted to make this interface the slightest bit fancy. Nor have I had time to write a detailed explanation of how to use this program---sorry. Hopefully someone like David Pogue will be motivated to write the missing manual. @^Special commands@> @^Pogue, David Welch@> @^Commands@> @^BDDL, a primitive language for BDD calculations@> @d getk for(k=0;isdigit(*c);c++) k=10*k+*c-'0' /* scan a number */ @d reporterror {@+printf("Sorry; %c' confuses me %s%s",@| *(c-1),infile? 
"in this command: ":"in that command.",infile?buf:"\n"); goto nextcommand;@+} @= rescan:@+for (c=buf;*c==' ';c++); /* pass over initial blanks */ switch (*c++) { case '\n':@+if (!infile) printf("(Type quit' to exit the program.)\n"); case '#': continue; case '!': printf(buf+1);@+continue; /* echo the input line on |stdout| */ case 'b': @;@+continue; case 'C': print_cache();@+continue; case 'f': @;@+continue; case 'i': @;@+continue; case 'l': getk;@+leasesonlife=k;@+continue; case 'm': @;@+continue; case 'o': @;@+continue; case 'O': @;@+continue; case 'p': @;@+continue; case 'P': print_base(0,0);@+continue; /* \.P means print all'' */ case 'q': goto alldone; /* this will exit the program */ case 'r': @;@+continue; case 's': @;@+continue; case 'S': if (isdigit(*c)) @@; else siftall();@+continue; case 't': @;@+continue; case 'v': getk;@+verbose=k;@+continue; case 'V': verbose=-1;@+continue; case 'y': @;@+continue; case '$': show_stats();@+continue; default: reporterror; } nextcommand: continue; @ @= char *c,*cc; /* characters being scanned */ node *p,*q,*r; /* operands */ var *v; /* a variable */ int lhs; /* index on left side of equation */ int curop; /* current operator */ @ The \ command \.{include} \ starts up a new infile. (Instead of \.{include}, you could also say \.{input} or \.i, or even \.{ignore}.) @d passblanks for (;*c==' ';c++) @= if (infile) printf("Sorry --- you can't include one file inside of another.\n"); else { for (;isgraph(*c);c++); /* pass nonblanks */ passblanks; for (cc=c;isgraph(*c);c++); /* pass nonblanks */ *c='\0'; if (!(infile=fopen(cc,"r"))) printf("Sorry --- I couldn't open file %s'!\n",cc); } @ The command \.{p3}' prints out the BDD for$f_3$; the command \.{pp3}' prints just the profile. 
@d getkf getk;@+if (k>=extsize) {@+printf("f%d is out of range.\n",k);@+continue;@+}
@d getkv getk;@+if (k>=varsize) {@+printf("x%d is out of range.\n",k);@+continue;@+}

@=
if (*c=='p') { /* \.{pp} means ``print a profile'' */
  c++;@+getkf;
  printf("p%d:",k);
  print_profile(f[k]);
}@+else {
  getkf;
  printf("f%d=",k);
  print_function(f[k],0);
}

@ The command `\.{o3}' outputs the BDD for $f_3$; the command `\.{ou3}'
outputs it with variable names ``unmapped'', so that all branches go to
variables with higher numbers (in spite of any reordering that has been
done). Unmapped BDD output is important for programs such as
{\mc BDDREAD-COUNT}, which count the number of solutions, because those
programs need to know how many levels are being crossed at every
downward branch.

@=
{
  int unmapped=0;
  if (*c=='u') c++,unmapped=1; /* \.{ou} means ``output unmapped'' */
  getkf;
  sprintf(buf,"/tmp/f%d.bdd",k);
  freopen(buf,"w",stdout); /* redirect |stdout| to a file */
  print_function(f[k],unmapped);
  freopen("/dev/tty","w",stdout); /* restore normal |stdout| */
}

@ @=
for (v=varhead;v<varhead+varsize;v++) if (v->proj) printf(" x%d",v->name);
printf("\n");

@ My little finite-state automaton.

@=
getkf;@+lhs=k;
passblanks;
if (*c++!='=') reporterror;
@;
@;
second:@;
third:@;
fourth:@;
@;

@ @d checknull(p) if (!p) {@+printf("f%d is null!\n",k);@+continue;@+}
@^Unary operations@>

@=
passblanks;
switch (*c++) {
case 'x': getkv;@+p=projection(varmap[k]);@+break;
case 'f': getkf;@+p=f[k];@+checknull(p);@+break;
case 'c': k=*c++-'0';@+if((k&-2)==0) p=botsink+k;@+else reporterror;@+break;
case '~': p=topsink;@+curop=6;@+goto second; /* reduce $\lnot f$ to $1\oplus f$ */
case '.': @;@+continue;
default: reporterror;
}

@ Quantification uses the following conventions:

\smallskip\noindent{$\bullet$} A command like `\.{f1=f2 A x2}' sets
$f_1\gets\forall x_2\,f_2$.

\smallskip\noindent{$\bullet$} A command like `\.{f1=f2 E f3}' where
$f_3=x_4\land x_5$ sets $f_1\gets\exists x_4\,\exists x_5\,f_2$.
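The |getk| macro's effect is easy to isolate. The following stand-alone sketch does the same digit scan as a function; |scan_num| is an invented name, not part of {\mc BDD14}.

```c
#include <ctype.h>

/* Stand-alone sketch of what the getk macro does: accumulate a
   nonnegative decimal number at *cp, advancing the scan pointer
   past the digits.  (scan_num is an illustrative name.) */
static int scan_num(const char **cp)
{
  int k = 0;
  for (; isdigit((unsigned char)**cp); (*cp)++)
    k = 10*k + **cp - '0';
  return k;
}
```

Because the pointer is left on the first nondigit, the caller can immediately check range (as |getkf| and |getkv| do) and then continue scanning the rest of the command.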
\smallskip\noindent{$\bullet$} A command like `\.{f1=f2\tta f4 E f3}'
with that $f_3$ sets $f_1\gets\exists x_4\,\exists x_5\,(f_2\land f_4)$.
@^Quantification, syntax for@>
@^Binary operations@>
@^Unary operations@>

@=
passblanks;
switch (*c++) {
case '&': curop=1;@+break; /* and */
case '>': curop=2;@+break; /* butnot */
case '<': curop=4;@+break; /* notbut */
case '^': curop=6;@+break; /* xor */
case '|': curop=7;@+break; /* or */
case '_': curop=8;@+break; /* constrain */
case 'A': curop=9;@+break; /* forall */
case 'N': curop=10;@+break; /* no */
case 'Y': curop=12;@+break; /* yes */
case 'D': curop=14;@+break; /* diff */
case 'E': curop=15;@+break; /* exists */
case '?': curop=16;@+break; /* if-then-else */
case '.': curop=17;@+break; /* median */
case '[': curop=0; /* functional composition */
  if (*c++!='y') reporterror;
  if (*c++!=']') reporterror;
  goto fourth;
case '\n': curop=7,q=p,c--;@+goto fourth; /* change unary |p| to $p\lor p$ */
default: reporterror;
}

@ @=
passblanks;
switch (*c++) {
case 'x': getkv;@+q=projection(varmap[k]);@+break;
case 'f': getkf;@+q=f[k];@+checknull(q);@+break;
case 'c': k=*c++-'0';@+if((k&-2)==0) q=botsink+k;@+else reporterror;@+break;
default: reporterror;
}

@ @=
@^Ternary operations@>
passblanks;
if (curop==1 && *c=='&') curop=18; /* and-and */
if (curop==1 && *c=='E') curop=19; /* and-exists */
if (curop<=maxbinop) r=NULL;
else {
  if (*c++!=ternopname2[curop-16][0]) reporterror;
  passblanks;
  switch (*c++) {
  case 'x': getkv;@+r=projection(varmap[k]);@+break;
  case 'f': getkf;@+r=f[k];@+checknull(r);@+break;
  case 'c': k=*c++-'0';@+if((k&-2)==0) r=botsink+k;@+else reporterror;@+break;
  default: reporterror;
  }
}

@ We have made sure that all the necessary operands are non-|NULL|.
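The binary operator codes above are consistent with a 4-bit truth-table encoding: writing the table as $\rm op(0,0)\,op(0,1)\,op(1,0)\,op(1,1)$, ``and'' is $0001=1$, ``butnot'' is $0010=2$, ``notbut'' is $0100=4$, ``xor'' is $0110=6$, and ``or'' is $0111=7$. (That is my reading of the codes, not code from {\mc BDD14}; the quantifier codes 8 and above are not plain truth tables.) A tiny sketch that evaluates such a code on Boolean constants:

```c
/* Sketch: read a binary opcode 0<=op<=15 as a 4-bit truth table,
   with bit (3-2f-g) giving op(f,g) for constants f,g in {0,1}.
   This matches and=1, butnot=2, notbut=4, xor=6, or=7 above.
   (apply is an illustrative name, not a BDD14 routine.) */
static int apply(int op, int f, int g)
{
  return (op >> (3 - 2*f - g)) & 1;
}
```

With this encoding the terminal case of a binary BDD recursion needs no case analysis at all: when both operands are sinks, a single shift-and-mask yields the resulting sink.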
@=
passblanks;
if (*c!='\n' && *c!='#') { /* comments may follow `\.\#' */
reportjunk: c++;
  reporterror;
}
if (curop==0) r=compose_top(p);
else if (curop<=maxbinop) r=binary_top(curop,p,q);
else r=ternary_top(curop,p,q,r);

@ The |sanity_check| routine tells me that I don't need to increase
|r->xref| here (although I'm not sure that I totally understand why).

@=
if (o,f[lhs]) deref(f[lhs]);
o,f[lhs]=r;

@ @=
if (o,f[lhs]) {
  deref(f[lhs]);
  o,f[lhs]=NULL;
}

@ @=
getkv;@+v=&varhead[varmap[k]];
if (o,!v->proj) projection(k); /* ensure that $x_k$ exists */
passblanks;
if (*c++!='=') reporterror;
passblanks;
if (*c=='.') c++,q=NULL;
else {
  @;
  if (o,q==node_(v->proj)) q=NULL;
}
passblanks;
if (*c!='\n' && *c!='#') goto reportjunk;
if (o,v->repl!=addr_(q)) {
  p=node_(v->repl);
  if (p) deref(p);
  @;
}

@ In a long calculation, it's nice to get progress reports by setting
bit 128 of the |verbose| switch. But we want to see such reports only
near the top of the BDDs. (Note that |varmap| is not relevant here.)

@=
getkv;
tvar=&varhead[k+1];

@*Reordering. Now comes the new stuff, where {\mc BDD14} enters the
territory into which {\mc BDD11} was afraid to tread. Everything is
based on a primitive swap-in-place operation, which is made available
to the user as an `\.s' command for online experimentation.

The swap-in-place algorithm interchanges $x_u\leftrightarrow x_v$ in
the ordering, where $x_u$ immediately precedes~$x_v$. No new dead nodes
are introduced during this process, although some nodes will disappear
and others will be created. Furthermore, no pointers will change except
within nodes that branch on $x_u$ or~$x_v$; every node on level $u$ or
level~$v$ that is accessible either externally or from above will
therefore continue to represent the same subfunction, but in a
different way.
@=
getkv;@+v=&varhead[varmap[k]];
if (o,!v->proj) projection(k); /* ensure that $x_k$ exists */
reorder_init(); /* prepare for reordering */
if (v->up) swap(v->up,v);
reorder_fin(); /* go back to normal processing */

@ Before we diddle with such a sensitive thing as the order of
branching, we must clear the cache. We also remove all dead nodes,
which otherwise get in the way. Furthermore, we set the |up| and |down|
links inside |var| nodes.

By setting |leasesonlife=1| here, I'm taking a rather cowardly approach
to the problem of memory overflow: This program will simply give up,
when it runs out of elbow room. No doubt there are much better ways to
flail about and possibly recover, when memory gets tight, but I don't
have the time or motivation to think about them today.

@=
void reorder_init(void) {
  var *v,*vup;
  collect_garbage(1);
  totalvars=0;
  for (v=varhead,vup=NULL;v<varhead+varsize;v++) if (v->proj) {
    v->aux=++totalvars;
    v->up=vup;
    if (vup) vup->down=v;@+else firstvar=v;
    vup=v;
  }
  if (vup) vup->down=NULL;@+else firstvar=NULL;
  oldleases=leasesonlife;
  leasesonlife=1; /* disallow reservations that fail */
}
@#
void reorder_fin(void) {
  cache_init();
  leasesonlife=oldleases;
}

@ @=
int totalvars; /* this many |var| records are in use */
var *firstvar; /* and this one is the smallest in use */
int oldleases; /* this many ``leases on life'' have been held over */

@ We classify the nodes on levels $u$ and $v$ into four categories:
Level-$u$ nodes that branch to at least one level-$v$ node are called
``tangled''; the others are ``solitary.'' Level-$v$ nodes that are
reachable from levels above~$u$ or from external pointers ($f_j$ or
$x_j$ or $y_j$) are called ``remote''; the others, which are reachable
only from level~$u$, are ``hidden.''

After the swap, the tangled nodes will remain on level~$u$; but they
will now branch on the former~$x_v$, and their |lo| and |hi| pointers
will probably change. The solitary nodes will move to level~$v$, where
they will become remote; they'll still branch on the former~$x_u$ as
before.
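The |up|/|down| threading done by |reorder_init| can be studied on its own: given an array of records, link just the active ones into a doubly linked list. Here is a sketch with invented types and names (|rec|, |link_active|), standing in for {\mc BDD14}'s actual |var| records.

```c
#include <stddef.h>

/* Sketch of reorder_init's linking pass on a toy record type:
   thread up/down pointers through only the "active" entries.
   (rec and link_active are illustrative, not BDD14 names.) */
typedef struct rec {
  int active;              /* stands in for the v->proj test */
  struct rec *up, *down;
} rec;

/* links the active records of a[0..n-1]; returns the topmost one */
static rec *link_active(rec *a, int n)
{
  rec *v, *vup = NULL, *first = NULL;
  for (v = a; v < a + n; v++)
    if (v->active) {
      v->up = vup;
      if (vup) vup->down = v; else first = v;
      vup = v;
    }
  if (vup) vup->down = NULL; else first = NULL;
  return first;
}
```

One pass suffices because each active record needs only its nearest active predecessor, which the running pointer |vup| always holds.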
The remote nodes will move to level~$u$, where they will become
solitary---still branching as before on the former~$x_v$. The hidden
nodes will disappear and be recycled. In their place we might create
``newbies,'' which are new nodes on level~$v$ that branch on the
old~$x_u$. The newbies are accessible only from tangled nodes that have
been transmogrified; hence they will be the hidden nodes, if we decide
to swap the levels back again immediately.

Notice that if there are $m$ tangled nodes, there are at most $2m$
hidden nodes, and at most $2m$ newbies. The swap is beneficial if and
only if the hidden nodes outnumber the newbies.

Notice also that the projection function $x_u$ is always solitary; the
projection function $x_v$ is always remote.

But the present implementation is based on the assumptions that almost
all nodes on level~$u$ are tangled and almost all nodes on level~$v$
are hidden. Therefore, instead of retaining solitary and remote nodes
in their unique tables, deleting the other nodes, swapping unique
tables, and then inserting tangled/newbies, we use a different strategy
by which both unique tables are essentially trashed and rebuilt from
scratch. (In other words, we assume that the deletion of tangled nodes
and hidden nodes will cost more than the insertion of solitary nodes
and remote nodes.)

We need some way to form temporary lists of all the solitary, tangled,
and remote nodes. No link fields are readily available in the nodes
themselves, unless we resort to the shadow memory. The present
implementation solves the problem by reconfiguring the unique table for
level~$u$ before destroying it: We move all solitary nodes to the
beginning of that table, and all tangled nodes to the end. This
approach is consistent with our preference for cache-friendly methods
like linear probing.
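The first step of the classification is easy to state in isolation: a level-$u$ node is tangled exactly when at least one of its branches lands on level~$v$. A toy sketch (with invented names |tnode| and |count_tangled|; {\mc BDD14} itself does this while walking the unique tables):

```c
/* Toy classification sketch: count the "tangled" nodes of level u,
   i.e., those whose lo or hi branch lands on level v.
   (tnode and count_tangled are illustrative names only.) */
typedef struct { int lo_level, hi_level; } tnode;

static int count_tangled(const tnode *t, int m, int v)
{
  int i, tangled = 0;
  for (i = 0; i < m; i++)
    if (t[i].lo_level == v || t[i].hi_level == v) tangled++;
  return tangled;
}
```

The $2m$ bounds in the text follow directly from this definition: each tangled node contributes at most two pointers into level~$v$, so at most two hidden nodes can be reachable through it before the swap, and at most two newbies can be created for it afterward.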
@=
void swap(var *u, var *v) {
  register int j,k,solptr,tangptr,umask,vmask,del;
  register int hcount=0,rcount=0,scount=0,tcount=0,icount=totalnodes;
  register node *f,*g,*h,*gg,*hh,*p,*pl,*ph,*q,*ql,*qh,
     *firsthidden,*lasthidden;
  register var *vg,*vh;
  unsigned long long omems=mems, ozmems=zmems;
  oo,umask=u->mask,vmask=v->mask;
  del=((u-varhead)^(v-varhead))<<(32-logvarsize);
  @;
  @;
  if (verbose&2048) printf(
"swapping %d(x%d)<->%d(x%d): solitary %d, tangled %d, remote %d, hidden %d\n",
@|   u-varhead,u->name,v-varhead,v->name,scount,tcount,rcount,hcount);
  @;
  @;
}

@