To fork, or not to

So the phases of the moon and stars have swung around once more to that point where my interest in dabbling with one of my old coding projects has been rekindled.

I guess in some senses it’s like the REM state of coding, clearing up various concepts and notions that I’ve used and wanting to try them out in a different context and see how they play.

For this particular project, however, I actually have a chance to play with some design considerations that I’ve never really bothered to pursue before…

This time around it is, naturally enough, the old MUD language. Long after I left the Amiga, I started the process of converting/rewriting the system for Linux. My original design was cheesy, and was based on efficiency rather than more pragmatic concerns.

The overall process is hampered most of all by the fact that the original code is more or less unreadable. Developed on an Amiga with three floppy drives, I was tight for disk space, forcing me to take the Unix approach of small-is-beautiful.

What makes it harder to read is that the code started out reasonably written. I’d use #defines and enums for most things. But out of the need to save space, I slowly started hardcoding…

        err=0; verbs=0; nextc(1);
        fopenw(lang1fn); close_ofps(); fopena(lang1fn); ofp1=afp; afp=NULL;
        fopenw(lang2fn); fopenw(lang3fn); fopenw(lang4fn);

        blkget(&vbmem,(char **)&vbtab,64*(sizeof(verb))); vbptr=vbtab+64;
        s1=(char *)vbptr; vbptr=vbtab;
        of2p=ftell(ofp2); of3p=ftell(ofp3); FPos=ftell(ofp4);

        do
        {
                if(err>30)
                {
                        printf("\x07** Maximum number of errors exceeded!\n");
                        quit();
                }
                verbs++; p=block;
loop:           do { s1=sgetl((s2=s1),block); *(s1-1)=0; } while(com(block)==-1 && *s1!=0);
                if(*s1==0) { verbs--; break; }
                tidy(block); if(block[0]==0) goto loop;
                p=skiplead("verb=",block); p=getword(p);
                if(Word[0]==0)
                {
                        printf("!! \x07 verb= line without a verb!\n"); goto loop;
                }
                if(strlen(Word)>IDL)
                {
                        printf("\x07 Invalid verb ID: \"%s\"",Word); err++;
                        do { s1=sgetl((s2=s1),block); *(s1-1)=0; } while(*s1!=0 && block[0]!=0);
                        if(*s1==0) break;
                        goto loop;
                }

                strcpy(verb.id,Word);

The original AMUL consisted of three applications:

  1. AMULcom – A p-code compiler for the language to produce game data
  2. AMan – A p-code loader and database manager
  3. AMUL – A client

With the world loaded into AMan, the game was running. The clients ran on the same machine, connected to AMan, got pointers to all of the data in memory, and used Amiga IPC (GetMessage, PostMessage) to implement synchronous activities and locking against other clients and the manager itself.

But essentially, the game used multiple processes abusing the Amiga’s flat address space model. It’s not dissimilar to shared memory, really.

Every struct and class was unique, and so there was no sharing of properties. If I wanted a function to move an object between rooms, I wrote a function for that, and if I wanted a function to move a player between rooms, I wrote a function for that. And sadly this was reflected in the language too.

About 9-10 years ago I started a rewrite under Linux using C++. A shortcoming of the Amiga design was that the manager couldn’t do everything AMUL could, so in the late stages of the game’s evolution it would actually always run a background copy of AMUL to do all the work that wasn’t attached to a player but that needed the full capabilities of a client – e.g. running the mobiles etc.

With the Linux version I contemplated going to a single binary, but couldn’t quite bring myself to detach that far from the original code. Instead I decided on a two-binary approach: a compiler, and a manager/client in one.

The manager loaded in all the p-code, organized everything into shared memory, and then opened up a listen socket. When a new connection came in, it forked, and created a fresh, blank, self-contained instance which then attached to the shared memory and had itself a copy of the database.
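
To illustrate the shape of it, here’s a minimal sketch of that model, assuming POSIX anonymous shared mappings and a plain TCP listener; GameDatabase and run_client are invented names, not anything out of the SMUGL source:

        // Sketch only: pre-sized shared database, fork-per-connection clients.
        #include <sys/mman.h>
        #include <sys/socket.h>
        #include <sys/wait.h>
        #include <netinet/in.h>
        #include <unistd.h>

        struct GameDatabase {                  // hypothetical: rooms, objects, verbs, p-code
            char payload[16 * 1024 * 1024];
        };

        int main() {
            // One big block, sized up front; children inherit the mapping across fork().
            void *mem = mmap(nullptr, sizeof(GameDatabase), PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
            GameDatabase *db = static_cast<GameDatabase *>(mem);

            // ... compiler/manager output gets loaded into *db here ...

            int listener = socket(AF_INET, SOCK_STREAM, 0);
            sockaddr_in addr{};
            addr.sin_family = AF_INET;
            addr.sin_port = htons(2222);
            bind(listener, reinterpret_cast<sockaddr *>(&addr), sizeof(addr));
            listen(listener, 8);

            for (;;) {
                int client = accept(listener, nullptr, nullptr);
                if (fork() == 0) {
                    // Child: a self-contained game instance sharing the database.
                    close(listener);
                    // run_client(client, db);          // hypothetical per-player loop
                    _exit(0);
                }
                close(client);
                while (waitpid(-1, nullptr, WNOHANG) > 0) {}   // reap finished players
            }
        }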

Space was reserved in the system for a fixed number of players (32 or 64), but otherwise the game’s database is non-expandable at runtime. You can’t add objects, rooms, verbs, etc. I don’t care about that too much, per se.

However, I really dislike the fact that I adhered to the massive glut of global context that I had wound up using on the Amiga. There are global variables like:

extern class Player *me;
extern class Player *you; // Last person who interacted with me

Sadly, SMUGL predates my familiarity with the STL. I started developing it under either NetBSD or FreeBSD, which at the time shipped an NSTL (non-standard template library) that had issues. So I have my own self-tuning hash system that, frankly, is poop, because I got annoyed that I was spending so much time reinventing a wheel when I wanted to work on the body.

Mostly I wanted to develop inheritance and experiment with various networking topics (e.g. interacting properly with telnet).

And I never actually finished the port; very few of the features are functional, because I started running into quirks. The compiler works fully, but the game engine is little more than a multi-roomed talker with objects scattered around.

Over the weekend I dabbled with the source a great deal, cleaned up some ancient cruft, and realized as I was doing it that I was rendering this instance of SMUGL completely broken.

A large part of the current codebase is taken up with the passing of data from the compiler to the game engine. I converted all of the pairs of save/load functions to single “serialize” functions, and realized that if I bit the bullet and went with STL containers instead of my own list/array/etc systems, I could reduce it even further.
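
The shape of that change looks roughly like this; the Writer/Reader pair and the Room fields are stand-ins for illustration, not the actual SMUGL types:

        // Sketch: one serialize() per class, driven by the archive's direction.
        #include <cstdio>
        #include <string>

        struct Writer {
            FILE *fp;
            void io(int &v)         { fwrite(&v, sizeof v, 1, fp); }
            void io(std::string &s) { int n = (int)s.size(); io(n); fwrite(s.data(), 1, n, fp); }
        };

        struct Reader {
            FILE *fp;
            void io(int &v)         { fread(&v, sizeof v, 1, fp); }
            void io(std::string &s) { int n = 0; io(n); s.resize(n); fread(&s[0], 1, n, fp); }
        };

        struct Room {                          // hypothetical entity
            std::string shortDesc;
            int flags = 0;

            // One function replaces the old save()/load() pair.
            template <typename Archive>
            void serialize(Archive &ar) {
                ar.io(shortDesc);
                ar.io(flags);
            }
        };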

But then I realized that the little bit of STL I was already starting to use was creating a major problem: shared memory.

It is possible to make STL work with shared memory, and part of my reason for sticking with a separately compiled engine previously was that the shared memory model I was working with was much simpler if you could predict in advance just how big it needed to be, allocate that much, and load up all your data into it in one fell swoop.
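
For reference, the usual trick for coercing the STL into shared memory is a custom allocator over a pre-sized arena. A bare-bones sketch (ShmArena and ShmAllocator are invented names, and real cross-process use wants offset pointers, which is what boost::interprocess provides):

        // Sketch: bump allocator over a shared mapping, usable by STL containers.
        // Raw T* pointers only stay valid because fork()ed children inherit the
        // mapping at the same address.
        #include <sys/mman.h>
        #include <cstddef>
        #include <new>
        #include <vector>

        struct ShmArena {
            char *base;
            std::size_t size, used = 0;
            explicit ShmArena(std::size_t n)
                : base(static_cast<char *>(mmap(nullptr, n, PROT_READ | PROT_WRITE,
                                                MAP_SHARED | MAP_ANONYMOUS, -1, 0))),
                  size(n) {}
            void *alloc(std::size_t n) {
                if (used + n > size) throw std::bad_alloc();
                void *p = base + used;
                used += n;
                return p;
            }
        };

        template <typename T>
        struct ShmAllocator {
            using value_type = T;
            ShmArena *arena;
            explicit ShmAllocator(ShmArena *a) : arena(a) {}
            template <typename U> ShmAllocator(const ShmAllocator<U> &o) : arena(o.arena) {}
            T *allocate(std::size_t n) { return static_cast<T *>(arena->alloc(n * sizeof(T))); }
            void deallocate(T *, std::size_t) {}     // bump allocator: never frees
        };
        template <typename A, typename B>
        bool operator==(const ShmAllocator<A> &x, const ShmAllocator<B> &y) { return x.arena == y.arena; }
        template <typename A, typename B>
        bool operator!=(const ShmAllocator<A> &x, const ShmAllocator<B> &y) { return !(x == y); }

        // Usage: a vector whose storage lives in the shared arena.
        //   ShmArena arena(16 << 20);
        //   std::vector<int, ShmAllocator<int>> v{ShmAllocator<int>(&arena)};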

Of course, computers have become somewhat more powerful since I started implementing it. It took the Amiga up to 10 minutes to compile my small, sample MUD off disk; 4 minutes in memory. The first Linux port of the compiler (not nearly as difficult as the rest of the code) could compile it in under a minute. The current implementation, built in debug mode without optimization, can compile the largest game I have in less than 210ms on a ~2yr old P4 2.8GHz, and under 90ms with optimization.

But now I think it’s time to look at switching to the STL and merging the compiler and the game engine.

The big question for me now is how I am going to divide up tasks.

Parsing a player’s input is something I consider, today, to be acceptably atomic. Executing the series of steps defined to respond to that input might also be atomic if it weren’t for the pause command, but frankly I could get rid of that, except that it allowed an easy way to do the occasional delayed response without all the hassle of starting and handling a background event.

The three main options I see are:

1. The advantage of the fork() model is the sloppiness I can have in terms of globals and context. The disadvantage is that I would have to coerce STL into using shared memory, and I would probably, once again, be stuck with using a sealed-at-runtime game world.

2. I’m tempted to play with a threaded model, although I’m finding it hard to justify any particular division of labor. If I ditched the “pause” command and made the game-author use background events for delayed responses etc., then essentially everything would be as near to atomic as matters, and it would be really hard to justify a threaded model.

3. Building a round-robin based system means writing the whole “monolithic server” select loop, messing about with managing my own event queues, etc. Probably a lot less work, overall, than trying to thread nicely. (There’s a rough skeleton of that loop just after this list.)
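
To make option 3 concrete, here’s roughly what the skeleton of that loop would look like; parse_input, the port number and the one-second tick are placeholders rather than design decisions:

        // Sketch: single-process round-robin server: one select() loop plus a
        // simple timed-event queue for delayed responses, mobiles, etc.
        #include <sys/select.h>
        #include <sys/socket.h>
        #include <sys/time.h>
        #include <netinet/in.h>
        #include <unistd.h>
        #include <ctime>
        #include <functional>
        #include <map>
        #include <queue>
        #include <string>
        #include <vector>

        struct Event { time_t due; std::function<void()> run; };
        struct Later { bool operator()(const Event &a, const Event &b) const { return a.due > b.due; } };

        int main() {
            int listener = socket(AF_INET, SOCK_STREAM, 0);
            sockaddr_in addr{};
            addr.sin_family = AF_INET;
            addr.sin_port = htons(2222);
            bind(listener, reinterpret_cast<sockaddr *>(&addr), sizeof(addr));
            listen(listener, 8);

            std::map<int, std::string> players;      // fd -> pending input
            std::priority_queue<Event, std::vector<Event>, Later> events;

            for (;;) {
                fd_set rd;
                FD_ZERO(&rd);
                FD_SET(listener, &rd);
                int maxfd = listener;
                for (auto &p : players) { FD_SET(p.first, &rd); if (p.first > maxfd) maxfd = p.first; }

                timeval tick{1, 0};                  // wake at least once a second for events
                select(maxfd + 1, &rd, nullptr, nullptr, &tick);

                if (FD_ISSET(listener, &rd)) {
                    int fd = accept(listener, nullptr, nullptr);
                    if (fd >= 0) players[fd];        // new player with an empty input buffer
                }

                for (auto it = players.begin(); it != players.end(); ) {
                    char buf[512];
                    if (FD_ISSET(it->first, &rd)) {
                        ssize_t n = read(it->first, buf, sizeof buf);
                        if (n <= 0) { close(it->first); it = players.erase(it); continue; }
                        it->second.append(buf, n);
                        // parse_input(it->first, it->second);  // hypothetical: one atomic command
                    }
                    ++it;
                }

                while (!events.empty() && events.top().due <= time(nullptr)) {
                    events.top().run();              // fire due background events
                    events.pop();
                }
            }
        }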

Both options 2 and 3 mean that I have to do something I know I ought to, which is to create a nice, clean “context” package. I developed one of these in AMUL at some point as a way of creating ‘scope’ that allowed one verb to call another, as well as allowing me to schedule background events. And I should probably do it again ;)
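
What I mean by a context package is roughly this; the names are invented for illustration, echoing the globals above:

        // Sketch: bundle the per-command state that currently lives in globals.
        #include <string>
        #include <vector>

        class Player;                           // as in the existing extern declarations
        class Room;
        class Item;                             // hypothetical object class

        struct Context {
            Player *me   = nullptr;             // the player whose input we're running
            Player *you  = nullptr;             // last person who interacted with me
            Room   *here = nullptr;             // room the command executes in
            std::vector<std::string> words;     // tokenized input for this command
            std::vector<int> verbStack;         // verb call chain, so one verb can call another
        };

        // Condition/action functions then take the context instead of touching globals:
        //   bool carrying(Context &ctx, Item *noun);
        //   void destroy(Context &ctx, Item *noun);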

I never got very far with SMUGL after I got it to the point where you could more-or-less walk around game worlds again, perhaps because I discovered STL shortly afterwards and thought I’d wait a year or so until it stabilized and then reconsider my design.

SMUGL’s room compiler used to be a single-pass affair that basically read the data off disk into small buffers and then built up a chunk of memory with a single write block. It then read the travel file separately and wrote it out to four different files.

On Sunday I replaced it with a multi-pass compiler that reads the whole rooms file into memory, marks off all the rooms and registers their names, then creates room entities in an STL vector, and finally goes through and parses the actual data in each marked-off block to finish populating the rooms.
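
Roughly, the new pass structure looks like this; the room= marker and field names are invented stand-ins, not the real rooms-file syntax:

        // Sketch: multi-pass room compiler over an in-memory copy of the file.
        #include <cstddef>
        #include <fstream>
        #include <map>
        #include <sstream>
        #include <string>
        #include <vector>

        struct Room {
            std::string name;
            std::string body;                    // the unparsed block, filled in pass 1
        };

        int main() {
            // Pass 0: slurp the whole rooms file.
            std::ifstream in("rooms.txt");
            std::stringstream ss; ss << in.rdbuf();
            std::string text = ss.str();

            // Pass 1: mark off each room block and register its name.
            std::vector<Room> rooms;
            std::map<std::string, std::size_t> roomIndex;
            std::istringstream lines(text);
            std::string line;
            long cur = -1;
            while (std::getline(lines, line)) {
                if (line.rfind("room=", 0) == 0) {               // assumed block marker
                    rooms.push_back({line.substr(5), ""});
                    roomIndex[rooms.back().name] = rooms.size() - 1;
                    cur = (long)rooms.size() - 1;
                } else if (cur >= 0) {
                    rooms[cur].body += line + "\n";
                }
            }

            // Pass 2: every room name is now known, so cross-references (exits,
            // contents, etc.) inside each block can be resolved as it is parsed.
            for (Room &r : rooms) {
                // parse_room_block(r, roomIndex);               // hypothetical
            }
        }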

It’s far less efficient. On a 386 it would probably add another 3-5 seconds to the loading time on a game with 600 rooms. But the code is far more readable and it’s far more maintainable.

I also finally merged the pcode formats for the travel and language data. The way you describe exits in all of my languages has always been a sort of location-specific verb definition. You can just say that north takes you to room 53, or you can use the full power of the language to do complex stuff (set off a timer to close the trapdoor behind you in 5 seconds, light fireworks, spawn the dragon, etc, etc).

It had proven too expensive to do this properly in the past, but I basically just gave each room an STL map of <verb,p-code> and voila. The conversion took me about a half hour, and another 2-3 hours to convert and reduce the game-engine code for dealing with the two different cases.
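
In code terms it amounts to little more than this; VerbId and PCode are placeholders for whatever the compiler actually emits:

        // Sketch: per-room exits become location-specific verb definitions.
        #include <cstdint>
        #include <map>
        #include <string>
        #include <vector>

        using VerbId = std::uint16_t;                // index into the global verb table
        using PCode  = std::vector<std::uint8_t>;    // compiled instruction stream

        struct Room {
            std::string name;
            std::map<VerbId, PCode> verbs;           // "north" -> go to room 53, or full p-code
        };

        // The engine no longer needs a separate travel-table case: it looks the verb
        // up in room.verbs first and falls back to the global language data if absent.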

Whichever model I go with, I need to decide how I am going to express context, rather than my current use of lots of globals. And then I’ve got to retrofit all of my functions with the necessary data.

Of course, if I go with the monolithic process, I could always just have a global context variable which is set before I start crunching anything ;)

Frankly, I guess I’ve been holding out hope that I might be able to salvage lots of the original AMUL code (which I still have) so that I don’t have to do things like rewrite all of the condition and action functions.

On the other hand, I’m also realizing that’s never really been an option. All entities in a SMUGL game are derived from a base class called BasicObj (piss poor choice of name, but hey, I was new to C++), and so things are already expressed in a significantly different way. My changes to the travel/language code wiped out the last big chunk of compatibility, which was the way in which AMUL knows what the current expression context is.

Imagine a player is standing in a greenhouse. There is a plant, a honey pot and a plant pot. When a player types “plant plant in plant pot”, the game tokenizes the phrase, e.g. <word#37> <word#37> <word#37> <word#23> (because ‘in’ is a whitespace word it can ignore). However, here we have a conundrum. Plant is a verb, an adjective and a noun. Did you type adjective noun verb noun? Or did you type verb verb verb noun? And “pot” could refer to either the honey pot or the plant pot – until we decide that the word before it is in fact an adjective belonging to it.

The game builds an abstract template of candidate interpretations, and then turns to the game language data (i.e. your code). It prioritizes the candidate options as it goes. For instance, while parsing “plant” it might prefer to look for “pot”s that are in your inventory before choosing one lying nearby. But if you have indicated that “plant” can be an alias for “drop”, and it finds no suitable verb=plant instructions, it may go on to parse it as “drop”, with a preference for pots that you aren’t carrying, or even excluding ones you are.
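
As a toy illustration of the kind of structure involved (not the actual SMUGL parser; the Role sets and word numbers are invented):

        // Sketch: each word keeps the set of roles it could still play; the parser
        // enumerates whole-sentence shapes and strikes them out during evaluation.
        #include <cstdint>
        #include <vector>

        enum class Role { Verb, Adjective, Noun };

        struct Token {
            std::uint16_t wordId;               // e.g. "plant" -> 37, "pot" -> 23
            std::vector<Role> couldBe;          // every role this word is registered as
        };

        struct Candidate {
            std::vector<Role> shape;            // e.g. {Verb, Noun, Adjective, Noun}
            // object bindings stay open until p-code evaluation eliminates them
        };

        // Every whole-sentence shape consistent with the per-word role sets.
        std::vector<Candidate> enumerate(const std::vector<Token> &tokens) {
            std::vector<Candidate> out(1);
            for (const Token &t : tokens) {
                std::vector<Candidate> next;
                for (const Candidate &c : out)
                    for (Role r : t.couldBe) {
                        Candidate n = c;
                        n.shape.push_back(r);
                        next.push_back(n);
                    }
                out = std::move(next);
            }
            return out;    // checknear/carrying tests later strike these out one by one
        }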

In fact, the parser tries to avoid committing to an exact evaluation of what you typed until the last possible moment. As it starts to evaluate p-code for a candidate clause, it eliminates candidates. Let’s say you write:

plant noun noun=pot
  checknear noun1, checknear noun2
  if not carrying noun1 then fail “You aren’t holding the @n1.”
  destroy noun1
  print “It dies instantly, apparently killing is what you do best.”

The parser searches through the various expression clauses, and happens upon this one. In this context it can eliminate a lot of possibilities. It knows that the first word is being treated as, say, verb#15 (“plant” aka word#37). But the remaining words could still be interpreted as <adj><noun><noun> or <noun><adj><noun>.

The first “checknear”, however, eliminates <adj><noun><noun> because there is no <plant><plant>. So it knows that it is now looking at <verb> <noun> <adj><noun>.

There may be multiple other plants and pots in the game, but now it can also start to narrow down to instances of objects with those labels. Indeed, if there is only one plant and one plant-pot in both the room and your inventory, it knows exactly which objects you’re referring to.

For argument’s sake, let’s say there is a plant in your inventory and a plant on the floor. Having reached the third line (if … fail…), the parser still has two options for noun1 (“plant”). However, if it chooses the one on the floor, the if clause will “fail” – a variation of “print” that ends the parse attempt and allows the parser to try any remaining candidate clauses.

And most of this is transparent to the author. Unfortunately, you have to be slightly aware of it or else you might force the parser’s hand. For instance, if you write

print “Something @n1”
if not carrying noun1 fail “Try something else”

In order to express ‘@n1’ a decision had to be made and a candidate selected.

One of my firm goals for AMUL/SMUGL was always to create a human-friendly language and to keep the programming credentials needed to use it low (it’s a language largely free of symbolic constructs, although I do use some symbols to let you abstract certain things; e.g. there is a symbol for ‘the weight of’ something if you need to pass a numeric argument; I chose a rare few symbols like that rather than a complex parenthesized, bracketed and braced language).

At one point I gave up because the vagaries of running background ‘daemons’ (tasks/events) and massaging the parser in complex cases seemed like more advanced programming. But I’ve since come to realize that it’s possibly quite a useful educational tool. If nothing else, it’s an environment in which you can actually experiment with the old coders’ “write the instructions to make a cup of coffee” assignment, with scope for doing more advanced stuff.

5 Comments

Dude, I understand the constructs but I got no idea what the hell it is doing!!!

I’d go for 2 or 3. Threading can be a pita, so I would prefer 3.

Another thing that comes to my mind: Could the parser logic (plant plant in plant pot) be used for an extended damage model in a game that is not yet able to sort out damages caused by secondary effects (player shot bullet in tank, tank explode, set pilot on fire)?

Erh, no. Parser logic is just about doing algebra with words. Doing it efficiently is about dividing your labour between solving the unknowns and working out the constraints on the unknowns. That way you can frequently avoid having to resolve all of the atoms.

You’re suggesting using that same inference engine to make complex leaps in logic – cause and effect. It would use a phenomenal amount of CPU with every pass of the physics engine to search for inferences.

The only parsing that ever needs doing on that sentence you just did. The reason the game *doesn’t* do it is fairly complex, and you’d need someone who understands the client, like RickB, to explain it decently.

*runtime error* Unable to parse first sentence of the last paragraph.

Just a clarification of my previous posting.

I’ve seen pictures/writings of how the damage model is supposed to work (IIRC it was an HE hit on an M10). Assuming it was a correct description, then tracking all these tiny particles over several stages (impact, armor penetration, HE explosion inside the tank) is a CPU hog in the first place.

The trajectories of the particles are already known, so it could be possible to determine what gets hit on the way through the vehicle.

Looking for inferences would be necessary only if something inside the vehicle gets hit. This could trigger a chain of events which I tried to express with parser logic: put bullet in (fuel) tank triggers set tank on fire|go right through tank triggers (hurt pilot and/or explode)|do_nothing or damage control wire.

Could be totally off the mark, just thinking loud.

I’ve seen pictures/writings of how the damage model is supposed to work (IIRC it was an HE hit on an M10). Assuming it was a correct description, then tracking all these tiny particles over several stages (impact, armor penetration, HE explosion inside the tank) is a CPU hog in the first place.

The trajectories of the particles are already known, so it could be possible to determine what gets hit on the way through the vehicle.

That’s what damage modelling is. That’s what we do. If you don’t do it, you can’t do an accurate ballistics model, because each collision will change the trajectory and energy of the item travelling. That’s why it’s so hard to kill the machine gunner in panzers frontally: as well as the mantle armor, there are all the complex components of the gun mount. That’s why he usually takes shrapnel damage but doesn’t get killed, unless the penetration and angle of the round miss the gun and its housing entirely.

And when a round penetrates, we model spalling.

First sentence of the last paragraph ought to read: “The only parsing that ever needs doing on that sentence, you just did”. Writing an inference engine that can work out “engine burns, burns fire, fire kills” might be useful if you want to develop a system that can work out the fundamentals of physics.

But you’ve missed something fundamental. Fire kills. So the system obviously already knows about the connection – it’s programmatic, it’s pre-parsed, it doesn’t need to work it out. What doesn’t happen is a kill credit, and the reason for that would need a client coder to explain.
