Pretty graphs

While I’ve been trying to dig the TOE systems out of a Mariana Trench of functionality, I still have other host responsibilities to tend to. In part, I’m still overcoming the lack of tools, and the fear of them, that existed here when I started.

In the last few months, I’ve been replacing the inter-host network with Netcode2. If a light bulb just went on for you (“oh, that’s what the problem was”): no, the problem wasn’t Netcode2. The solution has been Netcode2; the problem was that we didn’t have the resources to maintain Netcode1.

Netcode1 is written in C, based on TCP, and in general tries to be skeletal. Over my years here, I’ve built bits of framework to make using Netcode1 easier: classes that abstract away many of the otherwise tedious processes involved in, for example, registering an RPC call and the functions that handle its sending, receipt and data marshalling.

As I’ve been doing this, I’ve had an opportunity to begin reinstrumenting some of the systems, dealing with issues where the old logging system would kill a server process by repeatedly logging an error hundreds, thousands or even tens of thousands of times a second.
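To make the logging problem concrete, here is a minimal sketch of one common guard against it: emit a given error at most once per interval and count the repeats in between. It is written in Python purely for illustration and is not the host code; `ThrottledLogger`, `sink`, `interval` and `clock` are all hypothetical names.

```python
import time


class ThrottledLogger:
    """Emit each distinct message at most once per interval, counting repeats."""

    def __init__(self, sink, interval=1.0, clock=time.monotonic):
        self.sink = sink          # callable that actually writes a line (assumed)
        self.interval = interval  # minimum seconds between repeats of a message
        self.clock = clock        # injectable so the behaviour can be tested
        self._state = {}          # message -> [last_emit_time, suppressed_count]

    def error(self, message):
        now = self.clock()
        state = self._state.get(message)
        if state is None:
            # First time we see this message: always write it.
            self._state[message] = [now, 0]
            self.sink(message)
            return
        last_emit, suppressed = state
        if now - last_emit < self.interval:
            # Too soon: remember that it happened, but write nothing.
            state[1] = suppressed + 1
            return
        # Interval elapsed: write once, summarising what was swallowed.
        if suppressed:
            self.sink(f"{message} (repeated {suppressed} more times)")
        else:
            self.sink(message)
        state[0] = now
        state[1] = 0
```

With something like this in front of the log file, an error firing ten thousand times a second costs one dictionary lookup per occurrence and one written line per interval, rather than ten thousand disk writes.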

I’ve also been introducing a system of performance/event counters. We used to have custom, specific tables for this kind of thing, or we had to trawl the logs. But a lot of the information going into the log files was redundant: helpful in debugging someone’s connection or a very specific issue, but no use for judging the overall health, performance and status of the servers.
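As an illustration of the general idea (not the actual host implementation), a counter system can be as small as a named tally that is bumped wherever something interesting happens and drained on a timer into timestamped samples, which are exactly what a graphing tool wants. This sketch is in Python, and `EventCounters`, `incr`, `snapshot` and the counter names are assumptions.

```python
import time
from collections import Counter


class EventCounters:
    """Named counters that are drained periodically into timestamped samples."""

    def __init__(self, clock=time.time):
        self._counts = Counter()
        self._clock = clock  # injectable so the behaviour can be tested

    def incr(self, name, amount=1):
        # Cheap enough to call on every event, unlike writing a log line.
        self._counts[name] += amount

    def snapshot(self):
        """Return (timestamp, counts) and reset for the next interval."""
        counts = dict(self._counts)
        self._counts.clear()
        return self._clock(), counts


counters = EventCounters()
counters.incr("rpc.sent")
counters.incr("rpc.bytes", 512)
timestamp, sample = counters.snapshot()  # one row for a stats table or graph
```

Each snapshot is one row of "what happened in this interval", so trends come from comparing rows rather than from grepping log files.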

At the same time, I stumbled across JPGraph. I’ve already begun severing my ties with Roxen and its diagram tag, and we have an Apache server internally for our wiki and trac. Trying to coerce your data into something <diagram> will use… is just a pain. Doing it in PHP with JPGraph is a walk in the park, though – allowing me to go from zero to graphage in no time. In particular, JPGraph understands timestamps.

[Images: untitled.png, charts.png]

In the past, we have found or detected problems through monitoring, which we have quite a lot of. However, the monitoring is largely passive: until we have actually seen something break, it often goes unmonitored.

This is already earning its keep: finally, I can observe bad trends forming ahead of time. It has already highlighted a leak we otherwise wouldn’t have detected (we don’t monitor the necessary resource, doh). This leak is going to mean scheduling server downtime, but we have data we can use to project how long before it becomes a problem – so we can schedule downtime to restart the servers well in advance.
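The projection itself can be very simple. As a rough sketch, assuming the leak grows more or less linearly, a least-squares line through the samples gives an estimate of when the resource hits its ceiling. The function name, the sample format and the numbers below are made up for illustration, and Python stands in for whatever actually produces the graphs.

```python
def hours_until_limit(samples, limit):
    """Estimate hours from the last sample until usage reaches `limit`.

    samples: list of (hours_elapsed, usage) pairs, oldest first.
    Returns None if usage is flat or shrinking (no leak trend to project).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    # Ordinary least-squares slope and intercept of usage against time.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    t_limit = (limit - intercept) / slope
    return max(0.0, t_limit - samples[-1][0])


# Hypothetical readings: usage climbing by 10 units an hour towards a cap of 200.
readings = [(0, 100), (1, 110), (2, 120), (3, 130)]
remaining = hours_until_limit(readings, limit=200)
```

A straight line is the crudest possible model, so the answer is best treated as "restart well before this", not as a deadline.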

Of course, arguably we’re looking at an incomplete picture with the dataset we have, and it’ll take some time to develop meaningful sets of data and trends to work with. At some point in the future we need to sit down and predict some of those trends based on how we think things should work, and see if we see anything approximately like that.

It has shed some light on parts of the system that were pretty obscured, too. I may have found why the firebases never seem to want to save properly between host restarts – my suspicion is that they only save state after the server starts, so when we next restart the server, it reloads that state instead of the state it was in when the servers came down.

Lastly, it’s been pretty useful in helping me verify the conversion of our inter-host backbone to Netcode2 – which is looking pretty healthy in the beta cluster.

20 Comments

I thought for a second this would be a post about TOEs

Oli, I could kiss you. JPGraph is EXACTLY what I’ve been needing to get off my ass and look for, and here you give it to me on a silver platter. :)

It was about TOEs, if you pay attention.

I watched this company take 6 months to gradually introduce attrition equipment piece by equipment piece when it is quite obvious they could have flipped a switch and done it all at once. This was a political delay rather than a technical delay.

Please convince me that the last year and 2 months since brigade spawning has been a technical delay.

What do you smoke? Crack cocaine or just really strong weed?

I can show you the hairs KFS1 has pulled out of his head for the last 13 months I’ve been here, Rdmenace. Will that do?

“Political delay”? Where do you guys come up with these theories? Oh, wait. You might have been listening to Doc. ;)

Be more worried about the “game design delays” that come after the TOE code systems are finished and we discover in playtesting that it destroys the game on first implementation. (Not that I’m expecting that, but ‘no plan survives contact with the eneCTRL^H CTRL^H CTRL^H’ … let’s just say ‘no plan survives first contact’, period.)

The post by bloo above is what I am referring to. Thanks for the evidence, because sometimes I wonder if I am smoking something.

When you guys make posts to the effect that you are worried that TOE’s will destroy the game, it makes me seriously wonder if all the delays are technical.

Or, are you afraid that the kiddies are going to quit when they can’t get their toys, like you were with the attrition implementation.

Brigade spawning without TOE’s is the broken implementation. The game is hurt more for every week there is brigade spawning with No TOE. Yes there may be a slight plunge in players immediately after TOE, but the game will be better for it and you will get more players in the long run when all players find themselves deeply attached to the strategic considerations and overall health of the brigade they are in.

Brigades need to stop being viewed as spawning windows, and instead be viewed as living, breathing units.

Rd; you’re confusing multiple sources. This blog isn’t an official source, and if you go away thinking that the reason CRS hasn’t done things is because of what I write here, you’re making a grave error. I talk about technical stuff here, because I’m a technical guy. Sometimes decisions are made rooted on technical stuff. Yes, there have been non-technical, managerial reasons why we haven’t thrown in the towel and rolled out “something” the last 18 months.

We are, and have been, extremely concerned on a non-technical level about TOEs and the need to get them right – that means technically as well as gameplay wise.

The game has changed since TOEs were first designed, and it has continued to change since each redesign of TOEs. When you’ve had to redesign something this much and this often to keep up with the game, it starts to ring an alarm bell.

If you want to call it “political” when we’ve identified that the very next thing we planned to do in our development queue made our existing TOE design unworkable, then yes, some of our prioritization has been “political”.

But the last 3 months have been technical.

“I watched this company take 6 months to gradually introduce attrition equipment piece by equipment piece when it is quite obvious they could have flipped a switch and done it all at once”

I really don’t understand this sentence, either because I don’t understand what you’re referring to or because you’re referring to something I don’t recognize as what you’re calling it.

First of all thanks for taking the time to respond. Especially since I’m not being the most diplomatic.

Second, the attrition bit I’m referring to may have happened before you were in the company. I’m sure you know about it. Before, there were no spawn limits on anything. The concept of putting limits on how much equipment could be spawned in a given AB is what KILLER called Attrition. In a way, it was the TOE of that era. It was promised for a long time, but was delivered slowly.

When they developed the technology to do it they decided to introduce it in stages. First, they limited tanks. Then, they limited AT guns, SMG’s and sappers, and finally, they limited Rifles. It took a total of 6 months to do, but from the start it was self evident that if they wanted to do it all at once they could have.

Now, TOE’s are obviously more complicated. But what is particularly frustrating about them is that they are an integral and missing part of brigade spawning. The time between Brigade spawning and TOE should have been one 3 month patch cycle at most. Instead, they were announced for a patch last summer, but the work wasn’t even started on them until a month after that patch was finished.

It’s to the point now that the player base relationship with brigade spawning has become ossified. Some of them do not realize that brigade spawning was designed as a transitional stage of TOE. The more time passes, the more negative reaction there will be to TOE limiting aspects.

I don’t know what you have planned next that TOE will break, but it can’t be nearly as bad as when you introduced brigade loyalty and then broke it with brigade spawning. If you had introduced brigade spawning and TOE’s at the same time, brigade loyalty would not be nearly as broken. We might at least still have division loyalty, for instance.

It really seems like you don’t understand that this is a released product, and you apply the same expectations and standards to the development process that apply to a pre-release product.

Simple fact: If CRS hadn’t kept up a steady stream of tangible updates throughout WWII Online’s history, we wouldn’t be having this discussion. The game would have gone under years ago.

Overarching changes, like attrition or TOEs, can’t simply be dropped in. There is no practical way to proof-test these kind of design concepts other than to introduce them in babysteps.

But even then, if you understand that this is maintenance on a running engine, you would understand that the decisions aren’t “political”.

Attrition was slow and drawn out first for simple lack of instrumentation: without the ability to predict how gameplay would change, there was no way to really tell how the resupply systems were going to scale or to design resupply systems to scale around usage patterns. At least: not while meeting live player expectations of delivery while the same staff continued the simple day-to-day process of generally keeping the game running. Work that takes a month on a pre-release game can take up to 6 months on a released game in its first year after launch.

So attrition was introduced as safely and as rapidly as it could be. Meanwhile, the attrition design had a lot of other features and aspects that were being coded in too and reviewed in the limited off-live capacity CRS had. Eventually that stuff was dropped and attrition got finished.

What you, bizarrely, see as politicking was actually one of the things that regained CRS some of my respect, because they pulled off one heck of a job of introducing not just a switch-flippable feature but a rather complex re-engineering of critical systems *while* continuing the process of stabilizing and refining the rest of the game at large, and without the ability to hire on extra people to take on some of the “live support” responsibilities.

You’re really quite convincing, and I would feel like I was on the defensive if I hadn’t just waited 3 months, then 6 months, then a year, and then 3 months more (the last time you told me personally that TOE beta was coming out a week after 1.25 release was 3 months ago today) without seeing ANYTHING.

If you have to release this one in baby steps, where are the steps?

Do you mean introducing brigade loyalty for about a year, and then breaking it with brigade spawning, only to wait another year? Those aren’t baby steps there. Those are really large missteps.

I mean at some level I have to believe you or I wouldn’t be subbed. But seriously, every time I decide to believe you… I seriously feel like Charlie Brown trying to kick a football here.

Rd, are you proposing that the Rats shouldn’t follow any development course for which they don’t have the explicit permission of *all* their customers?

If not, why are you assuming that they have not already considered the economic value of any customers that they may lose (possibly including you) by developing carefully and thoroughly, and concluded that it is less than the gain to the game from the advantages of that slower development?

And you wonder why we hate to answer “when?” questions.

Where are the baby steps? How about the brigade spawning system? How about the OIC system?

Clearly you disagree, but we’ve been doing the development in the order we’ve felt made most sense, technically and from a gameplay perspective – building the parts of the whole which most others depend on, or which we had greatest confidence in the design for.

It’s not political, it’s simply building the walls before you build the roof.

I’m amazed that ToEs have taken this long to implement. It is essentially a list of rules by which several spawn list types can be used at, and moved to, different locations, right? Everyone knows this will have a major effect on gameplay, so it’s reasonable, given CRS’ history of slow development for “risky” features, that people assume the delay is not related to coding challenges, but rather the process for speccing out ToEs.

A second possibility is that it has not really had a high priority. And a third would be that the underlying code is such a freakin’ mess that you could probably have done ToEs easier if you had started without ANY spawn system whatsoever.

It all makes me doubtful that we will ever see some really cool features like a totally mobile spawn system or a more interesting and sophisticated capture system.

Trout

On the larger scale, it’s taken a long time because the code has been specifically designed not to do it. On the “political” scale, there was a lot of other work that the company felt was as necessary as TOEs, and that often made sense from a technical perspective too.

That very same slow introduction of attrition gave CRS some hard-to-get insight into the likely effects of TOEs, so you bet that if there are 3 steps to introduce TOEs, they’re going to choose the one that ups and culls the player’s ability to spawn last.

This is very scary for us. All it takes is one dukeacem and suddenly a whole side can’t spawn.

When we introduced brigade spawning and AOs, we expected a certain amount of backlash to it which is the backlash we got. Brigade spawning is a great thing to impose on the other guy so that he can’t run away from you. But finding yourself in the same cage is a different matter.

Brigade spawning and AOs have both united and divided the players with/against the HCs. Putting your other ‘nad in their hands isn’t going to make you love more, is it?

This isn’t why TOEs aren’t out yet. It’s not black and white. All of this runs right alongside with technical issues, resource issues, business issues, etc.

A year ago, I was ready to just up and implement TOEs, and I’m really glad that CRS took the more cautious approach. I firmly believe that the TOE design we had 12-18 months ago was flawed badly enough, gameplay wise, to have killed us within 6 weeks of its being released – not enough time for us to get out the first “now let’s make TOEs work” patch. At the very least, we needed a minimum OIC system before it.

TOEs could probably have been a solid 6 month development cycle to implement the overall suite of changes and updates needed to make it go in and bed down the way we want. But that’s 6, uninterrupted, focused months of my development time, plus 2, maybe 3, months of UI and accompanying client work.

We haven’t, and don’t, have the resources to do that; we have a live, living product to maintain and nurture. There are other tasks and projects that other team members have that require host work, the hosts themselves have needed work and maintenance. Folks bitch about the lack of progress on the infantry “predictor” [sic], guess where that needs to come from? Oh yeah, the host. That’s neglect. See how that works? It’s almost like if 90% of my time is taken up with TOE related projects over 18 months, only 10% of my time is available for other stuff.

Good lord. I do not read this blog religiously to have it turn into The Barracks.

Please stop pestering KFS with game related specifics about how YOU feel the game should be implemented, because then we get DOC in here to say: “Where’s *your* MMPOLG?”

I’ve been to 2 different Conventions and am going to the third. You can talk to ALL the Rats at length at that appropriate place.

IMHO – this blog is to share what it is like *generally* to code and work on a MMPOLG that *happens to be* WWIIOL in this case. Bringing your TOE issues and spawn issues here in specifics makes me wince.

We’ll lose KFSONE to postings like this due to frenzied attacks like you specialize in RDMENACE. SO do a game with TOE’s on DAY 1 and come back and post here about how you did it, coded it, and out-performed the ‘political’ decision based Rats, fer chrissakes.

Be positive, or constructively negative, and try to stay away from allegations that KFSONE personally promised you something. If you read all the blog, you realize the economic and personal toll the Rats have undertaken just so you can play the game. Enjoy the learning process – *please*.

This guy has almost become Kfsone’s personal “blair”.

/shudder

Seriously rdmenace, do you drink heavily? If you do, you need to stop. If you don’t, maybe you need to start. Whatever the case, you need to lighten up. This isn’t CRS Central. This is Kfsone Central. Some of us enjoy reading and taking in some of the behind the scene host stuff. We don’t like to come here and read some rant about CRS decisions somehow being politically motivated. It’s absurd. Comments like yours are the quickest way to make these blogs go away, i.e rickb.com.

I’m not trying to beat rd off or anything here, I think it’s productive to “discussion” to have someone who likes the coffee au lait. However, if you’re going to try and read what I say the way you would Doc or any PR type person, you’re aiming for a world of hurt. I don’t speak PR, I barely speak human never mind mind-manipulation. I’m not trying to give you the whole picture here, it doesn’t even interest me. I like my little slice of the pie, my little specialized cave of coding.

Like I say – I would happily have coded up TOEs a year+ ago and worried about making it work once it was out there – which is what rdm is suggesting. Until I got faced with a couple of sharp-minded designers/producers and we went thru it with a tooth comb. And the simple fact was that the foundation wasn’t ready for TOEs.

We had another, long, meeting today about what the hold up is, where this project is stuck. It’s not currently stuck in gameplay mire, it’s stuck in technical issues.

So today we went over the TOE spec and looked at the obstacle, what parts of the design were dependent on it, what parts necessitated it, and we found a better path.

I’ve got two days to produce a coding design for the modified version, and a couple of weeks to get an implementation into beta. We solved a lot of problems that were previously still open ended with “we’ll have to deal with that” type solutions.

The unfortunate fact is that the average consumer can’t or won’t see past his or her own nose with regard to what they “want”. Fortunately (or not, as the case may be), CRS is in a position to have to carefully consider ANY changes to the live game. It’s an unfortunate fact that DOC is forced to repeat himself on the Planet forums, as there will always be those who think they know more about a situation than the developers themselves do.

I’d have loved to see TOEs in 3 years ago, but I’m smart enough to know I don’t know best how to implement them and that CRS is in a MUCH better position to make that decision. Like it or lump it.
