Putting out for a living…

Programmers often revel in the abstract nature of what they do: the ethereal, almost magical qualities of written words reaching that critical mass where they come to life and become a functioning apparatus… Gandalfs and Merlins, masters of secret magics: We like building black boxes. It’s often a matter of pride: FooBar does exactly what it says on the box, it works and it just works.

It is often taken as an insult to our artistry and our qualifications that you think you might be able to divine when it is nearly done. Managers! Go draw some Gant charts.

And testing begins when the application is complete, right? I mean, all FooBar::Add(FooBar a, int b) does is add a and b. What are you implying by suggesting that may need testing?

Trouble is, as I implied in my confession of sin, we programmers don’t tend to use our code, we maybe just test it a bit (but I won’t beat that horse right now). So we miss out on a chance for a double whammy: earning money for writing code, and earning money for what we write.

Huh?

Well, each of us gets paid for writing code – we clock in, write code, clock out, rinse and repeat until the deadline, hand over the application or API, get someone to sign off on it, go to the next product scheduling meeting, get our next assignment, ad nauseam.

But not so many of us try to get paid for what we write. Have a good long think about that. I’m not talking about bonuses, I’m talking about using the code we are writing to make the product and company healthier and more profitable, to keep our managers off our backs and the suits smiling.

It certainly never occurred to me when I reached the level of manager at Demon Internet; all I could see was the need to write the next bit of code that would allow my staff to process more orders in less time. Only when I started asking myself why Cliff promoted someone else to “help” me did I realize that I’d left him completely in the dark as to how much we were costing, how much we were improving and how much we were earning, how effective my code was and – most importantly – how cost effective I was. I just figured he’d check the corporate bank balance and see It Was Good.

But you don’t even need to become that good a capitalist. The payoff occurs even earlier. That monkey riding your back, the one not good enough to understand the magic you lay down? He doesn’t actually want to ride your back. Did you know that?

Other than looking over your shoulder and humping your leg, he’s not good enough to understand your magic.

So feed him.

You really hate when you’ve handed something off, and they come back and say “When Fred tried to amalgamate the income returns on a Friday with compound derivatives and a customer with seven accounts the system would only let Kirsty input anything that was blue”. You know that absolutely none of that is relevant to what went wrong, but you’re going to have to try and recreate the exact scenario because these idiots couldn’t debug their way out of an open doorway, and it’s going to take days. Besides, testing isn’t even what you’re supposed to do. You get paid to write code, not to go back and iron out stupid wrinkles.

You just aren’t being lazy enough. You need to learn to instrument your code.

Instrumentation isn’t hard, and if you learn the habit, it actually becomes easier than writing code without instrumentation. How often do you wind up retrofitting code with printfs or log entries or dummy variables to let you inspect state?

Often enough that if you picked up the instrumentation habit, you’d have a good solid system that was there all the time to let you flick a switch and do it orders of magnitude more easily. Heck, you could instrument Fred and Kirsty’s builds so that they could repeat their ludicrous menage-a-deux and unknowingly diagnose it for you. Sweet, no?

For you C/C++ types, I’m not even talking about the obscene, all-singing, all-dancing, reflection based systems that Java and C# developers drool over and overburden their applications (and budgets) with. That is a whole (and admittedly lucrative) venture of its own.

What I’m talking about is building your code in such a fashion that you could easily run your Add(…) function through a quick, offline, out-of-band test case that proves it’s not the problem and lets you move on. Oh, god, are you whingeing about testing already?

It takes a little getting used to, but once you start developing the habit, you’ll start writing your functions and classes with just that little twist of subtlety that makes you master of the magic rather than slave to it. And yes, you will wind up testing your code sometimes, but if you learn the habit it won’t really be like testing, it will be more in the spirit of artistry, simple validation.
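A minimal sketch of what such an out-of-band check might look like, assuming a FooBar that simply wraps an int (the real class isn’t shown in this post). The name checkAdd and the test cases are made up for illustration:

```cpp
#include <climits>
#include <cstdio>

// Hypothetical stand-in for the FooBar mentioned earlier in the post.
struct FooBar {
    int value;
    static FooBar Add(FooBar a, int b) { return FooBar{a.value + b}; }
};

// One offline check: no host, no client, no database required.
// Returns the number of failed cases so a harness can report it.
int checkAdd() {
    struct Case { int a, b, expected; };
    const Case cases[] = {
        {0, 0, 0}, {2, 3, 5}, {-4, 4, 0}, {INT_MAX - 1, 1, INT_MAX},
    };
    int failures = 0;
    for (const Case& c : cases) {
        const int got = FooBar::Add(FooBar{c.a}, c.b).value;
        if (got != c.expected) {
            std::printf("Add(%d, %d) = %d, expected %d\n",
                        c.a, c.b, got, c.expected);
            ++failures;
        }
    }
    return failures;
}
```

Call checkAdd() from a tiny stand-alone main or a debug command; a return of zero lets you cross Add off the suspect list and move on.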

WWII Online/Battleground Europe’s host systems have a very nice feature that provides for a pretty good instrumentation system. Unfortunately only one of the developers really availed themselves of it, and in the long term even he got lazy about it.

Instrumentation begins with constants. Diagnosing issues – such as users (or other-coders-using-your-code) abusing it outside of its scope or intent – becomes much easier, when you have to resort to the debugger, if your constants have identities. i=5 isn’t going to help you nearly as much as i=(COUNTRY)ITALY when you have to come back to line 3201 in a year.
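As a rough sketch of the idea, assuming a made-up country list (only ITALY = 5 comes from the example above; every other value and name here is hypothetical):

```cpp
#include <string>

// Hypothetical country list; the game's real values are not shown in this post.
enum class Country : int { None = 0, UK = 1, France = 3, Germany = 4, Italy = 5 };

// Give each constant an identity that survives into logs and debuggers.
const char* countryName(Country c) {
    switch (c) {
        case Country::None:    return "NONE";
        case Country::UK:      return "UK";
        case Country::France:  return "FRANCE";
        case Country::Germany: return "GERMANY";
        case Country::Italy:   return "ITALY";
    }
    return "UNKNOWN";  // a value outside the enum is itself a useful diagnostic
}

// Format a raw integer the way you would want to read it in a log line.
std::string describeCountry(int raw) {
    return std::to_string(raw) + " (" +
           countryName(static_cast<Country>(raw)) + ")";
}
```

With this, describeCountry(5) reads as 5 (ITALY) in a log, and a stray 42 shows up as 42 (UNKNOWN) rather than silently passing as a plain integer.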

But this isn’t about making it easier for you to inspect your own work. We’re talking about feeding the monkey.

Learning the instrumentation habit – encapsulating functions around results and breaking your magic up into steps, applying a pinch of sequential programming to your object-oriented development and design – provides something magical in itself: quantification.

During my first two implementations of TOEs, I made a very serious error in judgement. I was writing a complete system from scratch with only minimal need to tie into the existing system. In fact, its creation would obliterate much of the existing system. So I developed it in terms of itself, with no frame of reference. With heavy pressure to deliver and – let’s be honest – a little arrogance, I decided to forego some of the more “frippery” elements of my normal instrumentation.

It was like modelling with clay in a dark, wet room. There was still instrumentation in there, but I didn’t have any real sense of how far along I was, or how meaningful my current codebase was to what I was finally going to have to develop. There was certainly no means for the monkey to tell how I was doing. The monkey saw status updates from me, describing what code I had written or tested or worked on. Since I get paid for writing code, I was doing my job; the monkey remained ignorant that something was causing me to write the same code. After all, he wouldn’t understand the magic anyway.

Now, the producers probably couldn’t have understood the problem I was encountering, but give the guys some credit. What they are good at is working around problems like this on a meta-scale. Their zen is in dealing with large-scale abstract issues, especially those relating to timeframes, and working out alternatives like tasking you with something else briefly or scheduling a server upgrade.

In the end, it was the monkey that realized I had come to a halt, and it was the monkey that devised a solution. If, instead of having to ride my back for “is it done yet”, I had put him in a position to say “this seems to have stalled”, TOE development would have happened sooner.

I was so confident in TOEs being deliverable as a whole unit that I scrimped on some of my normal development practices; uh, the ones I’ve picked up against just such a situation. Sure I was testing this and that, but I didn’t put out — I didn’t see the need to expose any of what I was doing until I judged it was ready.

Reality is that very few monkeys survive in their jobs if they do try to judge code; what they have to be good at is judging productivity, which is something we coders generally just don’t understand. It’s too tangible for us. What the monkeys need is a solid way to see progress towards completion. It allows us to work our magic while they have their own magical ability to sense trouble brewing. All we have to do is feed them.

In the case of TOEs, I ran into insurmountable coding issues – brick walls: a compiler bug, a compiler installation issue and OS version issues. Being a coder, I did what I get paid to do – I tried to code my way out of the hole. The monkey carried on humping my leg, and I carried on telling the monkey I was writing code.

Here is the contradiction. We allow “the monkey” to hand us specs and designs, but then we defy them to understand what we do with them. In fact, if my monkeys had been able to track where I was at, they would have adjusted those designs. It’s freaky, but the monkey is actually capable of making decisions based on data. Did I just shock you?

I’d fallen into the common mindset amongst many developers of dismissing the monkey’s ability to participate in my work by focusing on his inability to write code and thus judge mine. That didn’t matter, the monkey sees in terms of designs and specifications, and while he may act as though that initial document is the law, he’s usually capable of accepting that coders can’t walk through walls.

I came to my senses with the third round of TOEs, perhaps not in a fashion that I can use as an example for this post due to time constraints, but I did, at least, apply the lessons I’d learned prior to CRS and incorporate instrumentation into my concepts and implementation.

One example would be the “dumpresupply” command, which went in right near the start. The implementation of the supply queues is heavily weighted in favor of such a command – something virtually impossible with the game’s earlier supply systems.

A trivial command, yet it made a universe of difference to development, because I was able to hand over unfinished code for testing and get useful feedback. Not “something is wrong with supply” but “steps a + b + c do not produce the expected result of d. Output after A, output after B, output after C, ERROR”.
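To give a flavour of it, here is a heavily simplified sketch of a queue built with a dump in mind. This is not the real dumpresupply or the game’s supply code; ResupplyTicket, SupplyQueue and every field name are assumptions for illustration:

```cpp
#include <deque>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical resupply ticket; field names are illustrative only.
struct ResupplyTicket {
    std::string vehicle;
    int count;
    int minutesLeft;
};

class SupplyQueue {
public:
    void enqueue(const std::string& vehicle, int count, int minutesLeft) {
        tickets_.push_back({vehicle, count, minutesLeft});
    }

    // Advance time; expired tickets at the front of the queue are delivered.
    // Returns how many vehicles were delivered by this step.
    int tick(int minutes) {
        int delivered = 0;
        for (auto& t : tickets_) t.minutesLeft -= minutes;
        while (!tickets_.empty() && tickets_.front().minutesLeft <= 0) {
            delivered += tickets_.front().count;
            tickets_.pop_front();
        }
        return delivered;
    }

    // The instrumentation: one line per ticket, readable by a non-coder.
    void dump(std::ostream& out) const {
        out << "queue size=" << tickets_.size() << '\n';
        for (const auto& t : tickets_)
            out << "  " << t.vehicle << " x" << t.count
                << " due in " << t.minutesLeft << "m\n";
    }

private:
    std::deque<ResupplyTicket> tickets_;
};

// Convenience for commands and tests that want the dump as text.
std::string dumpToString(const SupplyQueue& q) {
    std::ostringstream out;
    q.dump(out);
    return out.str();
}
```

A tester can dump after each step and paste the output into the bug report, which is exactly the “output after A, output after B” shape of feedback described above.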

Version 1.27 of Battleground Europe has been greeted with an excellent reception in all but two areas. First was post-release stability of the hosts. Yep. I ran into issues. I knew I would because of attempts #1 and #2, and we failed to get the server upgrades performed in time for release. I’ve had undue praise from players for my dedication to getting things fixed, but it wasn’t nearly as difficult as it might have seemed. The host changes were suitably instrumented so that finding the cause of these otherwise ghostly problems was almost painless.

The third time the hosts crashed after 1.27, I was hit by a wave of panic; there was nothing in the logs to lead to a cause and no consistency between the 3 crashes. I decided against banging my head on the desk and crying, and went and played with the cat for 10 minutes to muster the strength even to face the ominous prospect in front of me. When I came back to my machine, I remembered that my code was instrumented. 90 seconds in a database client and I had the cause pinpointed.

The second failure-to-please in 1.27 is performance. The client has some specific instrumentation, with varying levels of precision, most of which has been added to try and track specific problems. The client is complex, and it runs billions of operations a second on pretty much a single task. When something goes awry, the old “attach it to a debugger” notion is ludicrous. Invariably Martini has to take a rather sledgehammer approach to finding a performance bottleneck.

We’ve looked at various performance/defect tracking solutions, but the cost of integrating them to an existing project of our client’s scope is just intimidating and impractical. And that’s just to integrate it – after that you still have to apply the tools to finding, solving and redesigning solutions.

Our client is developed for optimum performance. The sort of frippery that might let us diagnose issues in it just doesn’t exist.

In the early days a fear gripped CRS that if we built diagnostic systems into the engine, they would leak into the hands of people who might realize a way to exploit the client through it and kill our project stoney-cold dead: When you wanted to test something in WWII Online version 1.1, there were no shortcuts.

If you wanted Ciney turned French to start a test, you spawned in French at Anhee, ran over there on foot or fired up a second client and spawned a truck to transport yourself, and then you captured each facility manually one at a time. And you hoped that nobody spawned in and shot you or recaptured something.

I shudder to think of the amount of hours programmers spent on carrying out this kind of test precisely because they were averse to incorporating testing and instrumentation into their code.

The point here is that client functions are generally very all-inclusive, which means you can’t easily isolate them and work on them independently. Many of the new systems that we have developed – and I say we here because Martini, Rickb and Ramp have all been tending to do this automatically – are developed in an isolatable fashion that can be compiled into a trivial client. On the host I regularly pull out modules and compile them into a simple testing harness without the overhead of an entire host.

But in both systems it’s not always possible – once you start to touch on older systems you find yourself cornered into requiring the entire application. I doubt Martini could build a “headless” instance of the effects system for stand-alone benchmarks and it would probably take me some effort to build a stand-alone instance of the TOE supply system because of its ties to the strat system.
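One common way to keep a module isolatable, sketched here under assumptions and not necessarily how our host does it, is to have it talk to its neighbours through a narrow interface that a harness can fake. StratView, canResupply and FakeStrat are all hypothetical names:

```cpp
#include <map>
#include <string>

// Hypothetical narrow interface: the only thing this bit of supply logic
// needs to ask of the strat system.
struct StratView {
    virtual ~StratView() = default;
    virtual int ownerOf(const std::string& town) const = 0;
};

// The logic under test depends on the interface, not on a whole host.
bool canResupply(const StratView& strat, const std::string& town, int side) {
    return strat.ownerOf(town) == side;
}

// Harness-only stand-in: a lookup table instead of a running strat system.
class FakeStrat : public StratView {
public:
    void setOwner(const std::string& town, int side) { owners_[town] = side; }

    int ownerOf(const std::string& town) const override {
        auto it = owners_.find(town);
        return it == owners_.end() ? 0 : it->second;  // 0 = unowned (assumed)
    }

private:
    std::map<std::string, int> owners_;
};
```

In a harness, “Ciney turned French” becomes a single setOwner call instead of an evening of capturing facilities by hand. The catch is the one described above: older code that reaches straight into the whole application offers no such seam until someone cuts one.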

Without instrumentation, we have no means by which to track the conditions under which the client performs poorly, which makes finding this loss of performance a scary and nebulous task. It’s not as simple as, say, pointing at STOs and saying “they cost FPS” – they probably do, but are they what’s causing the problem? Removing them might fix the issue here and now, but they might not be the cause and somewhere else might be some simple defect, bug or oversight or even some piece of unoptimized code that has become overused … that is hogging CPU cycles and pushing STOs out of the bed, so to speak.
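The cheapest kind of instrumentation for this sort of question is a scoped timer that accumulates per-section totals. The sketch below is an illustration only: Profiler and ScopedTimer are hypothetical names, nothing like this is claimed to exist in our client, and it is deliberately single-threaded:

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

struct SectionStats {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// Accumulates call counts and elapsed time per named section.
// Not thread-safe; a real client would need per-thread storage or locking.
class Profiler {
public:
    void record(const std::string& label, std::chrono::nanoseconds elapsed) {
        SectionStats& s = stats_[label];
        ++s.calls;
        s.total += elapsed;
    }

    // Returns zeroed stats for a section that was never recorded.
    SectionStats get(const std::string& label) const {
        auto it = stats_.find(label);
        return it == stats_.end() ? SectionStats{} : it->second;
    }

private:
    std::map<std::string, SectionStats> stats_;
};

// RAII timer: measures from construction to destruction of the object.
class ScopedTimer {
public:
    ScopedTimer(Profiler& profiler, std::string label)
        : profiler_(profiler), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        profiler_.record(
            label_,
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    Profiler& profiler_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Drop a ScopedTimer at the top of the STO update and the effects update, and “they probably cost FPS” turns into two numbers you can compare. The map lookup and clock reads are not free, so in a performance-critical client this would want to sit behind a switch.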

Programmers are often shy of developing instrumentation because they picture themselves being reduced to mere operators, running code again and again and crunching the resulting numbers.

It’s strange that it’s usually the monkey that has the inspired notion of writing some kind of tool or utility that does that work instead and allows the monkey to do that task. As though the monkey would rather not have to ride on the programmers’ backs.

At the end of the day, when code doesn’t work, someone has to figure out 1. where and 2. why, someone has to figure out the 3. cause and figure out 4. a solution. Programmers profess that 3 and 4 are their domain and speciality, and yet they perpetually leave themselves open to 1 and 2 by failing to offload that burden to the code and the system as often as possible.

Teach yourself to instrument, train yourself to automate: delegate the task of defect tracking to the code, because the monkey isn’t a programmer so bug finding is always going to wind up on your plate sooner or later.

13 Comments

Dunno if you remember…but my recollection is that one night, back before there were GMs, when the only things keeping an eye on the servers late at night was the unreliable beeper and the 2nd computer sitting on my other desk logged in with the early GM tools…resupply was broken at a large number of CPs in a new and progressive way and I’d contacted you for help. After a bit of back and forth, you left for about 5 minutes.

You came back and said “ok, here’s this set of commands”. /dumpresupply and such. I fiddled with them a bit that night and they actually helped understand what was going on.. They were also, more importantly, something that allowed me to look at the problem without having to go to each mission list in the game and watch the numbers over time myself to actually confirm that there was a problem at all and not just my imagination stoked by players complaining. That I’d bothered you for an actual reason and not simply been pushed into wasting some coder’s time because a couple players were upset at not getting their nice tanks.

Now, you were relatively new at that point, but I do remember being impressed that you were able to dive unprepared into the system and come up with that nice usable set of commands in such a short time. It meant you had enough understanding to see the code whole enough to pull usable independent chunks out quickly.

That night I figured there was a useful coder about. Called John the next day about it :)

It may be that you’d already incorporated those commands at that point, and all you did was give me access to them. Either way, it was a good thing.

…@/

The old dumpresupply was scary and dangerous – it took roughly 900ms of CPU time to execute on the new fancy boxes, burning CPU across four of the cores…

So I had hidden it away for “real emergencies only” :)

to paraphrase a quote that Jeff Quill was told many years ago,

“If anybody ever tells you anything about an XXXXXX which is so bloody complicated you can’t understand it, take it from me: it’s all balls”

Maybe I cheat because i’ve listened to you too much in the past Oli, or maybe it’s because my schooling was all in applied mathematics (and i just can’t think in the abstract) so i have to have lots of helper tools/functions to dump state/etc – then again that might also relate to my choice of career too :D

If you can’t monitor it, how can you know if it’s working correctly? The number of use cases of: “well it seems slower now than earlier” – rather than “it took 2.5 mins now and it used to take 1.3”

mike wrote:
then again that might also relate to my choice of career too

I think you have it there – if you think back to Demon, our workloads kinda orbited a vaguely similar center of gravity, along with Ronald Khoo, James Grinter and Dave Williams. Much of what we were working on was inherently manageability and operations, and I think that is where the awareness of the complexities of tooling, instrumenting and reporting comes from (which is really what this post is about, but I didn’t want to use the word until the programmers had gotten as far thru it as they could choke).

I’d hazard a guess that Krenn’s outlook on programming is similar for the same reason.

For a “programmer”, though, the code is its own instrumentation. “Look at the code”.

It all comes back around to programmers not using their code – and while they thrill to seeing it come to life, they don’t stick around for when it needs its diapers changing, so they are oblivious to how complex it becomes as part of a functioning system.

It’s not a problem unique to programmers. Anyone intimately familiar with the minutiae of their product’s construction can easily become blinkered by their knowledge of its capabilities.

for (int i = 0; i < 10; ++i) {
    otherModule.doSomething(i);
}

This code was working perfectly when you wrote it, but you probably won’t mentally tag the fact that your code has a dependency that otherModule.doSomething better handle all values larger than 8 or your code will appear to break.

For all you know, otherModule.doSomething only handled those values by accident, and the fact that it starts to break when you pass it the value 9 could be the result of a bug fix in otherModule.

It happens in every line of business and science: Nobody anticipated someone would cancel an order before it had left the desk of the sales guy who took the order; cars that predate the idea of unleaded gas don’t carry any kind of warning about filling them up with it; in fact lots of manufacturers didn’t bother warning about diesel because there were only the two kinds so it wasn’t rocket science to know what to put in the tank. Oppenheimer, anyone?
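A hedged sketch of writing that hidden dependency down, assuming otherModule publishes the range it promises to handle. OtherModule, kMaxInput and runLoop are hypothetical, and the doubling is just filler behaviour:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for otherModule; the accepted range is an assumption.
struct OtherModule {
    static constexpr int kMaxInput = 9;

    int doSomething(int i) const {
        if (i < 0 || i > kMaxInput)
            throw std::out_of_range("doSomething: input " + std::to_string(i));
        return i * 2;  // placeholder work
    }
};

constexpr int kLoopCount = 10;

// The dependency is now explicit: if OtherModule narrows its range,
// this fails at compile time instead of in the field.
static_assert(kLoopCount - 1 <= OtherModule::kMaxInput,
              "loop passes values doSomething does not promise to handle");

int runLoop(const OtherModule& otherModule) {
    int sum = 0;
    for (int i = 0; i < kLoopCount; ++i) sum += otherModule.doSomething(i);
    return sum;
}
```

Neither the range check nor the static_assert fixes anything by itself; they just make sure the break gets reported as “value 9 rejected by doSomething” rather than as something about Fred, Kirsty and the colour blue.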

Anyone else wondering who’s going to show up at Oli’s door tomorrow going “uh, did you just call me a monkey?”

:D

I also wouldn’t dare to compare the hackery that I do to your actual coding; although I think you’re probably right in that we’ve got the same general bent towards what I cringe at calling the bigger picture. Especially since I’ve spent so much time in operations and only did programming of scripts and utilities to support that end goal.

Why? To them Oli is just a ‘coder monkey’ – maybe king coder monkey or supreme coder monkey or most-necessary-and-useful coder monkey but you get the idea :-)

Oli can you give a quick example on the instrumentation you add? I think I have been doing something similar for years but without a name or specific technique. Maybe a reference to another source?

Krenn,
It’s too far to drive just to beat the snot out of him for calling names. Really. If/when I go down there to visit everyone else…Ollie’s already earned himself a few rough and tumble moments with “The Silverback Rat”. I’m sure it’ll be sufficient.

Fridge,
No. To us Ollie is an “idiot savant” or a “pet coder”. Didn’t you read my posts?

:P

By the way…”king coder monkey” is not a term anyone uses. We call that guy “the alpha geek”.

…@/

Ollie,
Yeah. I remember the “don’t use it too much”.

Actually, I don’t think I ever used it at all after that night. Someone responsible was aware of the problem. No more need for me to verify/diagnose/look for it after that.

The next day you said you were working on it. It got fixed shortly thereafter.

A coder actually fixing problems promptly after saying they would was also a nice thing. Think Mo got a call too at that point :)

…@/

Ollie just called Gophur a Monkey! I’m telling!

Doc will be jealous!

A VERY interesting read and insight.

And from a player’s point of view I appreciate what all the Rats (past & present) have done to make the game what it is.

Hey Ol, it’s “Gantt”.

Oh fuck. :-(

fridge wrote:
To them Oli is just a ‘coder monkey’

There’s the winner. At the end of the day, I really hope that when I wrap this little series up it will be the programmers I’m preaching to.

I already mentioned – in comments – that coders are by no means unique, but in each instance of speciality-interaction, whether it is physicists and engineers, sales and manufacturing, artist and mechanic or coder and executive, the same sorts of issues occur; it’s just the combination of variables that is different.

There’s always something unique about the way the individual on each side of the table is skilled at approaching a series of problems which underpins their entire approach and their base assumptions.

Yeah but only code monkeys can use the term. Otherwise it’s a slur.

So Mr. Smith, our host, is the WW2OL coder that got the ticket to the premiere of ET and sold it for cash at the box office?

Tools, Instrumentation, noted.

So you record an overall time and if it pushes a threshold you prompt to go into diagnostic mode? Or just do it automatically? Client Is Slow Some Times On Some Machines seems a tough bug.
