After parsing the grammar into the table, there is a "start" non-terminal from which the process begins. The program looks up "start" in the hashtable to find its definition. This gives the list of possible productions for that non-terminal. Each production is itself a list of tokens. To expand, the program chooses one of its production at random, and then recursively expands the tokens within the production to generate the random output. When a terminal is encountered, that serves as the base case to halt the recursion.
Over the years, the students have supplied us with a good collection of humorous grammar files (plots for Star Trek episodes, "Dear John" letters, extension requests, haikus, and so on) that make the assignment particularly entertaining to run.
Here try it for yourself in Java:
The bulk of the work is done in the first phase of parsing the grammar file since you encourage the students to store the grammar in a data structure that allows for easy expansion later. The grammar files are designed as white-space delimited to minimize tokenizing complication. Giving students access to (or having them write) some simple but robust file-reading routines/ADT/object is helpful here.
As it is read in, the grammar is stored into a layered data structure involving a hashtable of definitions and with nested vectors for the list of productions which are each a list of tokens. The needed collection abstractions can either be pre-provided things (such as the java.util or C++ STL objects) or ones that the students construct themselves. When teaching in C, I do the latter, but have the students build the ADTs in a earlier piece and then assign the RSG as a follow-on to help them separate the roles of implementor and client.
Expanding the grammar is quite simple, given use of the appropriate data structure to store it. The recursion here is not particularly tricky or challenging, but it does help reinforce what the students have earlier learned about recursive problem-solving.
I have quite successfully used the RSG in a programming paradigm class as the "reference program", one that the students develop or see solutions for in a variety of languages (C, Ada, C++, Lisp, Java, etc.) to allow them to compare and contrast the language facilities for solving the same problem. It is pretty good choice for such a task since it stresses a language's facilities for string manipulation, file processing, and managing arbitrary collections, which are pretty useful features in any language.
A particularly notable feature of the RSG is that it doesn't need any platform-specific features (i.e. no graphics, sounds, events, threads, etc.) so it has few portability obstacles. Many of our assignments use flashy graphics, sounds, mouse input, etc. to entice and excite the students but those features tend to tie us to a particular platform. The RSG is one of the most entertaining text-only programs we have assigned.
Julie Zelenski
Department of Computer Science
Stanford University
mailto: zelenski@cs.stanford.edu
This page in support Nick Parlante's ACM SIGCSE Symposia 1999 Panel on Nifty Assignments. Other nifty assignments are available!
Last updated: March 19, 1999.