This is part of a now lengthy series of posts on the making of Crash Bandicoot. Click here for the PREVIOUS or for the FIRST POST. I also have a newer post on LISP here.
I’m always being asked for more information on the LISP-based languages I designed for the Crash and Jak games. To that end, I’m posting here a journal article I wrote on the subject in 1996. This is about GOOL, the LISP language used in Crash 1, Crash 2, and Crash 3. GOOL was my second custom language; Way of the Warrior had a much simpler version of this. Jak 1, 2, 3 and Jak X used GOAL, which was a totally new, vastly superior (and vastly more work to create) implementation that included a full compiler. GOOL (the subject of this article) was mostly interpreted, although by Crash 2 basic expressions were compiled into machine code. But I’ll save details on GOAL for another time.
[ Also I want to thank my reader “Art” for helping clean up an ancient copy of this article, which was stuck in some mid-90s Word format that can no longer be read. ]
Making the Solution Fit the Problem:
AI and Character Control in Crash Bandicoot
Andrew S. Gavin
Copyright (c) 1996 Andrew Gavin and Naughty Dog, Inc.
All rights reserved.
Abstract
Object control code, which the gaming world euphemistically calls AI, typically runs only a couple of times per frame. For this kind of code, speed of implementation, flexibility, and ease of later modification are the most important requirements. This is because games are all about gameplay, and good gameplay only comes from constant experimentation with and extensive reworking of the code that controls the game’s objects. The constructs and abstractions of standard programming languages are not well suited to object authoring, particularly when it comes to flow of control and state. GOOL (Game Oriented Object LISP) is a compiled language designed specifically for object control code that addresses these limitations.
Video games are the type of program which most consistently pushes the machine and programmer to the limit. The code is required to run at blinding speeds, fit in tiny memory footprints, have no serious bugs, and be completed under short schedules. For the ultra high performance 5% of functions which run most frequently there is no substitute for careful hand coding in assembly. However, the rest of the program requires much more rapid implementation. It is this kind of code which is the subject of this article.
Object control code, which is euphemistically called AI in games, typically runs only a couple of times per frame. With this kind of code, speed of implementation, flexibility, and ease of later modification are often more important than maximizing execution speed. This is because games are all about gameplay, and achieving good gameplay is about writing and rewriting object code time and time again.
Programming languages are not immutable truths handed down from on high, but tools created by people to solve particular tasks. Like any tool, a programming language must be right for the job. One would not attempt to turn a hexagonal nut with a pentagonal wrench, and neither is it easy to write a program in a language not well suited to the problem. Sadly, most programmers have only been exposed to a small set of similar and inflexible languages. They have therefore only learned a small number of ways to customize these languages to the task at hand. Let us stop for a second and take a look at the abstractions given to us by each of the common choices, and at what price. But first a word about assemblers and compilers in general.
Assemblers and compilers are programs designed to transform one type of data (the source language) into another (the target language). There is nothing particularly mysterious about them, as these transforms are usually just a bunch of tabled relationships. We call one of these programs a compiler when it performs some kind of automatic allocation of CPU resources. Since most commonly found languages are fairly static, the transform rules are usually built into the compiler and can not be changed. However, most compilers offer some kind of macro facility to allow customizations. In its most general form a macro is merely a named function, which has a rule for when it is activated (i.e. the name of the macro). When it is used, this function is given the old piece of program, and can do whatever normal programming functions it wishes to calculate a new expression, which it returns to be substituted for the old. Unfortunately, most languages do not use this kind of macro, but instead create a new tiny language which defines all the functions which are allowed to run during macro expansion (typically template matching of some sort). With general purpose macros, any transform is possible, and they can even be used to write an entire compiler.
Almost all programmers have had some exposure to assembly language. An assembler basically serves the purpose of converting symbolic names into binary values. For example, “add” becomes 20. In addition, most allow simple renaming of registers, assignment of constants to symbols, evaluation of constant expressions, and some kind of macro language. Usually these macro languages consist of template substitutions, and the ability to run simple loops at expansion time. Assembly directly describes the instructions interpreted by the processor, and as such is the closest to the chip which a software engineer can get. This makes it very tedious and difficult to port to a different machine. Additionally, since it consists primarily of moving binary words between registers and memory, and performing simple operations on them, most people find it tedious and difficult to use for large programs. In assembly, all details must be tracked by hand. Since knowledgeable humans are smarter than compilers, albeit much slower, they are capable of doing a substantially better job of creating smaller, more efficient code. This is true despite the claims of the modern OS community: compilers are still only about half as good as a talented assembly programmer. They just save a lot of time.
Many programmers learned to program with Basic. This language has an incredibly simple syntax, and typically comes with a friendly interactive environment. These features make it easy to learn and use. It however has no support for abstractions of any sort, possessing only global variables, and no macro system. Globals are great for beginners because the whole abstract arena of scope becomes a non-issue. However, the absence of lexical scoping makes the isolation of code (and its associated bugs) nearly impossible. There is however an important lesson in Basic which has been lost on the programming community: interactive is good. Basic typically has an interpreted listener, and this enables one to experiment very quickly with expressions to see how they work, and to debug functions before they are put into production.
The standard programming language of the last few years is C. First and foremost C provides a series of convenient macros for flow control, arithmetic operations, memory reference, function calling, and structure access. The compiler writer makes expansions for these that work in the target machine language. C also provides expression parsing and simple register allocation for assembler grade data objects (i.e. words). C code is reasonably portable between machines of a similar generation (i.e. word size). As an afterthought a preprocessor provides rudimentary textual macro expansion and conditional compilation. The choice not to include any of the hallmarks of higher level languages, like memory management (aka garbage collection), run time typing, run time linking, and support for more complex data types (lists, true arrays, trees, hash tables etc.) is a reasonable one for many tasks where efficiency is crucial. However, C is crippled by an inconsistent syntax, a weak text based macro system, and an insurmountable barrier between run time and compile time name spaces. C can only be customized via the #define operator and by functions. Unfortunately, this makes it impossible to do many interesting and easy things: many of C’s fundamental areas (structures, setting, getting, expressions, flow of control, and scope) are completely off limits for customization. Since functions always have a new scope, they are not useful for creating flow of control constructs, and #define is so weak that it can’t even handle the vagaries of the structure syntax. For those who know C very well it is often a convenient language, since it is good at expressions and basic flow of control. However, whenever complicated data structures are involved the effort needed is obscene, and C is unable to transfer this effort from one data type to another similar one.
Modern operating systems and fancy programs are filled with scripting languages. MS DOS batch language, the various Unix shell languages, perl, tcl etc. are all very common. These are toy languages. They often have inconsistent and extremely annoying syntaxes, no scoping, and no macros. They were invented basically as macro languages for operating system shells, and as such make it fairly easy to concatenate together new shell commands (a task that is very tedious in assembly or C). However, their ambiguous and inconsistent syntaxes, their slow interpreted execution speeds, and the proliferation of too many alternatives have made them annoying to invest time in learning.
Recently a new abomination has become quite popular, and its name is C++. This monstrosity of a language attempts to extend C in a direction it was never intended, by making structures able to contain functions. The problem is that the structure syntax is not very flexible, so the language is only customizable in this one direction. Hence one is forced to attempt to build all abstractions around the idea of the structure as class. This leads to odd classes which do not represent data structures, but instead represent abstract ways of doing. One of the nice things about C is that the difference between pointer and object is fairly clear, but in C++ this has become incomprehensibly vague, with all sorts of implicit ways to pass by reference. C++ programs also tend to be many times larger and slower than their C counterparts, compile much slower, and because C++ compilers are written in C, which can not handle flexible data structures well, the slightest change to the source code results in full compiles of the entire source tree. I am convinced that this last problem alone makes the language a severe productivity minus. But I forgot, since C++ must determine nearly everything at compile time you still have to write all the same code over and over again for each new data type.
The advent of the new class metaphor has brought to the fore C and C++’s weakness at memory management. Programmers are forced to create and destroy these new objects in a variety of bizarre fashions. The heap is managed by the wretched malloc model, which uses wasteful memory cookies, creates mysterious crashes on overwrites, and endless fragmentation.
None of these problems are present in Lisp, which is hands down the most flexible language in common use. Lisp is an old language (having its origins in the 50s) and has grown up over the last 30 years with the evolution of programming. Today’s modern Common Lisp is a far cry from the tiny mainframe Lisp of 30 years ago. Aided by a consistent syntax which is trivial to parse, and the only full power macro system in a commonly used language, Lisp is extremely easy to update, customize, and expand, all without fighting the basic structures of the language. Over the years, as lexical scoping, optimized compilation, and object oriented programming each came into vogue, Lisp was able to gracefully adopt them without losing its unique character. In Lisp, programs are built out of one of the language’s built in data structures, the list. The basic Lisp expression is the form, which is either an atom (symbol or number) or a list of other forms. Surrounded by parentheses, a Lisp list always has its function at the head; for example the C expression 2+2 is written as (+ 2 2). This may seem backwards at first, but with this simple rule much of the ambiguity of the syntax is removed from the language. Since computers have a very hard time with ambiguity, programs that write programs are much easier in Lisp.
Let me illustrate beginning with a simple macro.
(defmacro (1+ value)
  "Simple macro to expand (1+ value) into (+ 1 value). Note that backquote
  is used. Backquote is a syntax sugar which says to return the 'quoted'
  list, all forms following a comma however are evaluated before being
  placed in the list. This allows the insertion of fields into a template.
  1+ is the operator which adds 1 to its operand (C uses ++ for this)."
  `(+ 1 ,value))
The above form defines a function which takes as its argument the expression beginning with 1+, and returns a new expanded expression (i.e. (1+ 2) -> (+ 1 2)). This is a very simple macro because it merely fills in a template. However, if our compiler did not perform constant reduction we could add it to this macro like this:
(defmacro (1+ value)
  "Smarter macro to expand 1+. If value is a number, then increment on the
  spot and return the new number as the expansion."
  (if (numberp value)
      (+ 1 value)
      `(+ 1 ,value)))
The form numberp tests if something is a number. If value is, we do the add in the context of the expansion, returning the new number as the result of the macro phase. If value is not a number (i.e. it is a variable or expression), we return the expanded expression to be incremented at run time.
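The same expansion-time decision can be sketched outside Lisp. The following is a minimal Python illustration, not part of GOOL: it assumes S-expressions are modeled as nested Python lists with symbols as strings, and the name `expand_one_plus` is hypothetical.

```python
# S-expressions are modeled as nested Python lists; symbols are plain strings.
# expand_one_plus is a hypothetical stand-in for the 1+ macro above.
def expand_one_plus(form):
    """Expand ["1+", value]: fold constants now, otherwise emit an add form."""
    _, value = form
    if isinstance(value, (int, float)):
        return value + 1        # computed during expansion, nothing left for run time
    return ["+", 1, value]      # variable or sub-expression: add happens at run time


print(expand_one_plus(["1+", 2]))                  # constant operand
print(expand_one_plus(["1+", "x"]))                # variable operand
print(expand_one_plus(["1+", ["*", "a", 2]]))      # expression operand
```

The point of the sketch is only that the macro is an ordinary function from old program text to new program text, so it can inspect its argument before deciding what to return.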
These full power macros allow the programmer to seamlessly expand the language in new ways. For example, the Lisp form cond can be implemented from if’s with a macro. Cond is a special form which is like a C “switch” statement except that each case has an arbitrary expression. For example:
(cond
  ((= temp 2) (print 'two))
  ((numberp temp) (print 'other-number))
  (t (print 'other-type)))
Will print “two” if temp is 2, “other number” if it is a number (other than 2), and “other type” otherwise. A simple implementation of cond would be as follows:
(defmacro cond (&rest clauses)
  "Implement the standard cond macro out of nested 'if's and 'when's. t must
  be used to specify the default case, and it must be used last. This macro
  uses backquote's ,@ syntax which splices a list into the list below it.
  Note also the use of progn. progn is a form which groups multiple forms
  and has as its value, the value of the last form. cond clauses contain
  what is called an implicit progn, they are grouped together and the value
  of the last one is returned as the value of the cond."
  (if (eq (length clauses) 1)
      (if (eq (caar clauses) t)
          `(progn ,@(cdar clauses))
          `(when ,(caar clauses) ,@(cdar clauses)))
      `(if ,(caar clauses)
           (progn ,@(cdar clauses))
           (cond ,@(cdr clauses)))))
This expands the above cond into:
(if (= temp 2)
    (progn (print 'two))
    (cond ((numberp temp) (print 'other-number))
          (t (print 'other-type))))
This is the result after a single pass of macro expansion. The macro will peel the head off of the cond one clause at a time, converting it into nested ifs. There is no way to use C’s #define to create a new flow of control construct like this, yet in a compiled language these compile time transforms are invaluable for bridging the gap between efficient and readable code.
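To see where the repeated passes end up, here is a small Python sketch of the same peeling process. It is an illustration under assumptions: forms are nested Python lists, symbols are strings, the function name `expand_cond` is hypothetical, and unlike the Lisp macro it recurses to the fully expanded form in one call rather than one clause per pass. It also assumes at least one clause.

```python
# Forms are nested Python lists; symbols are strings. expand_cond is a
# hypothetical helper that mirrors the cond macro above, but expands fully.
def expand_cond(clauses):
    """Rewrite [[test, body...], ...] into nested if / when / progn forms."""
    (test, *body), *rest = clauses
    if not rest:                               # last clause
        if test == "t":
            return ["progn", *body]            # default case: just run the body
        return ["when", test, *body]
    # peel the first clause off and expand whatever remains
    return ["if", test, ["progn", *body], expand_cond(rest)]


clauses = [
    [["=", "temp", 2], ["print", "'two"]],
    [["numberp", "temp"], ["print", "'other-number"]],
    ["t", ["print", "'other-type"]],
]
print(expand_cond(clauses))
```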
GOOL (Game Oriented Object LISP) is my answer to the difficulties of using C and assembly for object programming. It is a compiled Lisp dialect designed specifically for the programming of interactive game objects. As a language it has the following features: consistent syntax, full power macros, symbolic names, orthogonal setting/getting, layered computation, multiple ultra light threads, grouping of computations into states, externally introduced flow of control changes (events), small execution size, retargetable backend, and dynamic linking. The GOOL compiler is embedded in Allegro Common Lisp (an ANSI Common Lisp from Franz Inc. which I run on a Silicon Graphics workstation running IRIX). Common Lisp provides an ideal environment for writing compilers because you start off with parsing, garbage collection, lists, trees, hash tables, and macros from the get go. As a language GOOL borrows its syntax and basic forms from Common Lisp. It has all of Lisp’s basic expression, arithmetic, bookkeeping, and flow of control operators. These vary in many small ways for reasons of speed or simplicity, but GOOL code can basically be read by the few of us lucky enough to have been exposed to Lisp. GOOL is also equipped with 56 primitives and 420 macros which support its advanced flow of control and game object specific operations. Additional ones can be trivially defined globally or locally within objects, and are indistinguishable from more primitive operations.
The GOOL compiler is a modern optimizing compiler with all sorts of smarts built into various macros and primitives. It is a fully forward referenced single pass compiler. Unlike some other programming languages with single letter names, GOOL does not require you to define something textually before you use it, and you never need tertiary declarations (like prototypes). Computers are good at remembering things, and a compiler is certainly able to remember that you called a function so that it can check the arguments when it gets to the declaration of that function. GOOL is fully relocatable and dynamically linked. So it is not necessary to include code for objects which are not nearby in memory. C is so static, and overlays so difficult and incompatible, that almost no effort is made to do dynamic binding of code, resulting in much wasted memory.
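The idea of remembering a call until its declaration shows up can be sketched in a few lines. This is not the GOOL compiler's actual mechanism, which the article does not describe; `ArityChecker` and its methods are hypothetical names, and only argument counts are checked.

```python
class ArityChecker:
    """Single pass checker that tolerates calls appearing before definitions."""

    def __init__(self):
        self.arity = {}      # name -> declared parameter count
        self.pending = {}    # name -> argument counts of calls seen too early
        self.errors = []

    def call(self, name, argc):
        if name in self.arity:
            self._check(name, argc)
        else:
            # Not declared yet: remember the call and check it later.
            self.pending.setdefault(name, []).append(argc)

    def define(self, name, paramc):
        self.arity[name] = paramc
        for argc in self.pending.pop(name, []):
            self._check(name, argc)

    def _check(self, name, argc):
        if argc != self.arity[name]:
            self.errors.append(f"{name}: expected {self.arity[name]} args, got {argc}")

    def finish(self):
        # Anything still pending was called but never defined.
        return self.errors + [f"{name}: undefined" for name in self.pending]


checker = ArityChecker()
checker.call("spawn", 2)      # used before it is defined
checker.define("spawn", 3)    # the earlier call is checked here
print(checker.finish())
```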
The programming tasks involved in creating game object behaviors are very inconvenient under the standard functional flow of control implied by most programming languages. In the programming of game objects it is necessary for each object to have a local state. This state consists of all sorts of information: type, position, appearance, state, current execution state (program counter), and all types of other state specific to the type of object. From the point of view of a particular object’s code all this can be viewed as an object specific global space and a small stack. This state must be unique to a specific object because it is often necessary to execute the same code on many different copies of the state. In either C or assembly it is typical to make some kind of structure to hold the state, and then write various routines or code fragments that operate on the structure. This can be accomplished either with function syntax macros or structure references. GOOL on the other hand allows this state to be automatically and conveniently bound to variable names for the most straightforward syntax. For example the C:
object->transx = object->transx + immediate_meters(4);
becomes in GOOL the similar expression:
(setf transx (+ transx (meters 4)))
However, if in C one wished to add some new named state to each instance of a particular object, one would have to create new structure records, accessors, initializers, memory management etc. GOOL on the other hand is able to easily allocate these on the object’s local stack with just one line of code, preserving the data there from frame to frame as well.
A standard programming language like C only has one thread of control. While this is appropriate for the general case, it is inappropriate for objects, which are actually better expressed as state machines. In addition, it is extremely useful to be able to layer ultra light weight threads of execution, and to offer externally introduced transfers of control (events). While threads typically complicate most applications programs with few benefits, they are essential to the convenient programming of game objects, which often have to do several things at once. For example, an object might want to rotate 180 degrees, scale up toward 150%, and move toward the player character all at once. These actions do not necessarily take the same amount of time, and it is often useful to dynamically exchange and control them. In traditional code this is very awkward.
The basic unit of code in GOOL is a code block (or thread). These often do simple things as above. An arbitrary number of these may be combined into a state, they may be borrowed from other states, and activated and deactivated on the fly. For example:
(defgstate turn-scale-and-move-toward
  :trans (defgcode (:label turn-180)
           ; set the y rotation 10 degrees closer to 180 degrees
           (setf roty (degseek roty (deg 180) (deg 10))))
  :trans (defgcode (:label scale-to-150-percent)
           ; set the x,y, and z scales 10% closer to 150% scale
           (with-vec scale
             (seekf scale (scale 1.5) (scale .1))))
  :trans (defgcode (:label move-toward-target)
           ; set the x,y, and z position closer to the target's
           ; (another object) position at a rate of 5 meters per second
           (with-vec trans
             (seekf trans (target trans) (velocity (meters-per-sec 5)))))
  :code (defgcode (:label play-animation)
          ; play the animation until this object is colliding with
          ; another, then change states
          (until (status colliding)
            (play-frame-group animation))
          (goto collided)))
A :trans block is one which runs continuously (once per frame), and a :code block is one which has a normal program counter, running until suspended by a special primitive (frame), as in “frame is over.” These code blocks can be run as threads (as above), called as procedures, converted to lambdas and passed to something (function pointers), and assigned to be run under special conditions (events or state exit). This example also illustrates the kind of simple symbolic notation used in GOOL to make object programming easier. Vectors like rotation, translation, and scale are bound to simple symbolic names (e.g. roty is the y component of the rotation vector). Many new arithmetic operations have been defined for common operations, for example, seek, which moves a number toward another number by some increment, and seekf, its destructive counterpart.
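The split between blocks that run every frame and a block with its own program counter can be approximated with Python generators. This is only a sketch under assumptions: `seek` is written from the one sentence description above (GOOL's degseek presumably also handles angle wraparound, which is omitted here), and `GameObject`, `trans_blocks`, `code_block`, and `run_frames` are hypothetical names. A `yield` stands in for the frame primitive.

```python
def seek(current, target, step):
    """Move current toward target by at most step, never overshooting."""
    if current < target:
        return min(current + step, target)
    return max(current - step, target)


class GameObject:
    def __init__(self):
        self.roty = 0.0            # y rotation in degrees
        self.scale = 1.0           # uniform scale factor
        self.frames_played = 0     # stand-in for animation playback
        self.colliding = False
        self.state = "turn-scale-and-move-toward"

    def trans_blocks(self):
        """Like :trans blocks: run from top to bottom once every frame."""
        self.roty = seek(self.roty, 180.0, 10.0)
        self.scale = seek(self.scale, 1.5, 0.1)

    def code_block(self):
        """Like a :code block: keeps its place between frames via yield."""
        while not self.colliding:
            self.frames_played += 1
            yield                   # "frame is over", resume here next frame
        self.state = "collided"


def run_frames(obj, frames, collide_at):
    thread = obj.code_block()
    for frame in range(frames):
        obj.colliding = frame >= collide_at
        obj.trans_blocks()
        next(thread, None)          # resume the code block until its next yield
    return obj


obj = run_frames(GameObject(), frames=30, collide_at=20)
print(obj.roty, obj.scale, obj.state, obj.frames_played)
```

The rotation and the scale finish on different frames, and the animation loop ends on yet another, which is the kind of independent timing the layered blocks are meant to make easy.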
GOOL also has a sophisticated event system. It is possible to send an event (with parameters) to another object or objects. The object may then choose to do what it wishes with that event, run some code, change state, ignore it, etc., and report something back to the caller. These event handlers can be bound and unbound dynamically, allowing the object to change its behavior to basic events very flexibly. For example:
:event (defgcode (:params (event params))
         (reject-event-and-return
           ((and (event-is hit-on-the-head)
                 (< (interrupter transy) transy)))))
Says to ignore the hit on the head event when the interrupter (sender) is below the receiver in the y dimension.
Another feature illustrated here is the indirect addressing mode, (interrupter transy), in which a variable of another object (whose pointer is in the variable interrupter) is accessed. Operations can locate and return object pointers, which can be used as parameters. For example:
(send-event hit-on-the-head (find-the-nearest-object))
which sends hit on the head to the nearest object or:
(let ((target (find-the-nearest-object)))
  (when (and target (type target turtle))
    (send-event hit-on-the-head target)))
which sends hit on the head to the nearest object only if it is a turtle.
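The handler and the two sends above can be mimicked in Python to show the flow of control. Everything here is an assumption made for illustration: `Obj`, `handle_event`, `send_event`, and `find_nearest` are hypothetical names, distance is measured along y only, and a handler simply returns True or False to report back to the sender.

```python
class Obj:
    def __init__(self, kind, y):
        self.kind = kind
        self.y = y
        self.hits = 0

    def handle_event(self, event, sender):
        """Return True if the event was accepted, False if rejected."""
        if event != "hit-on-the-head":
            return False                 # no handler bound for other events
        if sender.y < self.y:
            return False                 # sender is below the receiver: reject
        self.hits += 1
        return True


def send_event(event, sender, target):
    """Deliver an event; a missing target counts as a rejection."""
    return target is not None and target.handle_event(event, sender)


def find_nearest(sender, objects):
    others = [o for o in objects if o is not sender]
    return min(others, key=lambda o: abs(o.y - sender.y), default=None)


player = Obj("player", y=5)
turtle = Obj("turtle", y=2)
crab = Obj("crab", y=20)
world = [player, turtle, crab]

target = find_nearest(player, world)
if target is not None and target.kind == "turtle":
    print(send_event("hit-on-the-head", player, target))
```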
It is the GOOL compiler’s responsibility to turn this state into code that executes the abstraction (the above state becomes about 25 words of R3000 assembly code). GOOL code is typically much smaller than traditional code for similar tasks because the compiler does all the bookkeeping for this interleaving, and it is all implicit in the runtime product. In addition it has a degree of code reuse which is practically unachievable in a normal language without extremely illegible source.
GOOL has full power macros which allow the language to be brought up to the level of the problem. For example, many game programming tasks are more descriptive than a language like C is designed for. The following code causes a paragraph of text to appear on the bottom of the screen and scroll off the top.
(credit-list (1 14)
  ("AFTER THE")
  ("DISAPPEARANCE")
  ("OF HIS MENTOR,")
  ("DR. NITRUS BRIO")
  ("REDISCOVERED HIS")
  ("FIRST LOVE:")
  (blank 1)
  ("TENDING")
  ("BAR"))
It does this by expanding into a program which creates a bunch of scrolling text objects as follows:
(defgopm credit-list (params &rest body)
  "This macro iterates through the clauses in its body and transforms them
  into spawn-credit-line statements which create new credit line objects.
  It keeps track of the y position, moving it downward by height each time."
  (let ((list)
        (y 0)
        (font (first params))
        (height (second params)))
    (dolist (i body)
      (cond
        ((listp i)
         (case (car i)
           (blank
            (incf y (second i)))
           (t
            ; the clause i supplies the line argument of spawn-credit-line
            (push (append '(spawn-credit-line) i
                          `(:y ,y :font ,font :h ,height))
                  list)
            (incf y 1))))))
    `(progn ,@(reverse list))))

(defgopm spawn-credit-line (line &key (y 0) (font 0) (h 18))
  "This macro is purely syntactic sugar, making the above macro somewhat
  easier."
  `(spawn 1 credit-line (frame-num ,line) (unit ,(* y h)) ,font))
The following state is the code for the actual credit line. When one of these credit line objects is spawned it creates a line of text. It then proceeds to use its trans to crawl upward from the starting y position until it is off the screen, in which case it kills itself.
(defgstate credit-line (stance)
  :handles (spawn-credit-line)
  :trans (defgcode ()
           (unless first-frame
             (setf transy (- (parent transy) transvy))
             (when (> transy (unit 140))
               (goto die-fast))))
  :code (defgcode (:params (text-frame y font))
          (stomp-action-screen-relative)
          (set-frame-group text)
          (setf transvy y)
          (setf transy (- (parent transy) transvy))
          (sleep text-frame)))
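The bookkeeping this macro performs is easy to trace by hand. The following Python sketch mirrors only the y accounting and is not GOOL: `expand_credit_list` and the `BLANK` sentinel are hypothetical, each spawn is represented as a plain dictionary, and the y offset is computed as line index times height, as in the `(* y h)` of the second macro.

```python
BLANK = object()   # hypothetical marker standing in for the blank symbol


def expand_credit_list(params, body):
    """Turn a descriptive list of clauses into one spawn record per text line."""
    font, height = params
    y = 0
    spawns = []
    for clause in body:
        if clause[0] is BLANK:
            y += clause[1]              # skip the requested number of lines
        else:
            spawns.append({"text": clause[0], "y": y * height, "font": font})
            y += 1                      # next line goes one row further down
    return spawns


credits = expand_credit_list(
    (1, 14),
    [
        ("AFTER THE",), ("DISAPPEARANCE",), ("OF HIS MENTOR,",),
        ("DR. NITRUS BRIO",), ("REDISCOVERED HIS",), ("FIRST LOVE:",),
        (BLANK, 1),
        ("TENDING",), ("BAR",),
    ],
)
for spawn in credits:
    print(spawn)
```

In GOOL this transformation happens at compile time, so the description costs nothing at run time; the Python version does the same work when it is called.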
As a conglomerate the above code manages to create a scrolling paragraph of arbitrary length from a descriptive block of code. It does this by using the macro to transform the description into a program to create a cluster of new line objects. These line objects take their simple behavior and amplify it into a more substantial effect when they are created in concert. In a conventional language it would be typical to create some kind of data structure to describe different actions, and then interpret that. C in particular is a very poor language for description. Because C’s only complex data type, the structure, can not even be declared in line (e.g. “struct foo bar={1,0}” is not legal except as a global) it is extremely awkward to describe complex things. It must be done with code, and the poor textual macro expander is not up to this. Witness the wretchedness of descriptive APIs like that of X windows. The contortions necessary to describe widget creation are unbelievable. Is it any wonder that people prefer to do interface work with resource files or Tcl/Tk, which are both more descriptive in nature?
Overall, having a custom language whose primitives and constructs both lend themselves to the general task (object programming), and are customizable to the specific task (a particular object) makes it much easier to write clean descriptive code very quickly. GOOL makes it possible to prototype a new creature or object in as little as 10 minutes. New things can be tried and quickly elaborated or discarded. If the object doesn’t work out it can be pulled from the game in seconds without leaving any hard to find and wasteful traces behind in the source. In addition, since GOOL is a compiled language produced by an advanced register coloring compiler with reductions, flow analysis, and simple continuations, it is at least as efficient as C, more so in many cases because of its more specific knowledge of the task at hand. The use of a custom compiler allows one to escape many of the classic problems of C.