Small achievements can feel so good

Over the weekend I built a specially optimized version of the client (with SSE2 instructions enabled, in case you're curious). It's good for version 1.23.4 only. Offline around Antwerp in a Daimler it bumps me from 70fps to 130-140fps. Online it's not quite so dramatic but still noticeable. We're contemplating a dual build so that players with SSE2-capable machines can take advantage of these extra instruction sets.

Why not just plain old SSE to hit more CPUs? Well, it would certainly give some performance boost, but SSE2 adds double-precision and integer operations on the 128-bit XMM registers, which every 64-bit CPU has, so the improvement really justifies the overhead. I'm not sure plain SSE does without some coding specifically for it.

I got a bunch of tasks done today that were supposed to have been done last week but had been held up by CTHL work. I'm not going to detail any of them because they'd probably make anyone stuck with CTHLs cry – but there wasn't anything CTHL-related I could do today (other than continuing to try to get more data, which naturally I did). We're still working on those outstanding CTHLs but it's a matter of trying to find commonalities. Nothing obvious is causing them. And nothing I could work on solidly today.

15 Comments

Providing SSE2 for those processors that can use it would be a huge boost. People running the game on old AMD Thunderbird or Pentium III processors already play at the low settings and realize that in order to make ALL of their games better they’ll require a new system. Can they last much longer even with specialized code? SSE2 is a much better choice than just adding SSE instructions. Since you’ll be doing it to optimize performance rather than give yourself overhead to add a bunch of new CPU intensive stuff that will eliminate the low-end systems, I think it’s a great idea.

Will it be something simple like multiple ww2.exe files and a Settings option, or will you be creating a single EXE that is bloated by multiple sets of code? From a QA aspect, I’d suggest having the multiple executables. Easier to test and easier to release specific fixes. I have an old Thunderbird 1.3GHz processor system that I can use for testing.

Hmmm, my use of “overhead” isn’t right, but it certainly sounds right. I guess “extra ceiling” would be correct. Stupid multi-dimensional, one-dimensional metaphors.

:-) both my PCs have 64-bit AMD chips in them. Can’t wait for this!

I’ll take a copy of that. :)

I haven’t had a CTHL since you fixed the last patch. Personally, I blame the programs they’re running. If I *load* mercora IMradio player, even if I close it afterwards, I get CTHLs in both WW2OL *and* Eve Online. But it’s the only way I’ve found to find some hard-to-find music.

Anyways, do you know how *many* people SSE(2) would help? Maybe you could get a marketing bonus with Intel(?) or something…

Interesting thought: detect SSE2, then if it’s there, run that code path.

http://www.intel.com/cd/ids/developer/asmo-na/eng/20298.htm?page=3
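The detect-then-dispatch idea from this comment can be sketched roughly as below. This is an illustration only, not the game's code: `scale_generic`, `scale_sse2`, `cpu_has_sse2` and `pick_scale` are hypothetical names, and the runtime check assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available. On any other compiler or CPU the sketch simply falls back to the plain C path.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical routine: multiply every element of v by k. */
typedef void (*scale_fn)(double *v, size_t n, double k);

/* Baseline path: plain C, runs on any CPU. */
static void scale_generic(double *v, size_t n, double k)
{
    for (size_t i = 0; i < n; ++i)
        v[i] *= k;
}

#if defined(__SSE2__)
/* SSE2 path: two doubles per multiply, scalar loop for any leftover element. */
static void scale_sse2(double *v, size_t n, double k)
{
    const __m128d kk = _mm_set1_pd(k);
    size_t i = 0;
    for (; i + 2 <= n; i += 2)
        _mm_storeu_pd(v + i, _mm_mul_pd(_mm_loadu_pd(v + i), kk));
    for (; i < n; ++i)
        v[i] *= k;
}
#endif

/* Runtime check (assumption: GCC/Clang builtin on x86; otherwise report "no"). */
static int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;
#endif
}

/* Choose the code path once, e.g. at startup, and call through the pointer. */
static scale_fn pick_scale(void)
{
#if defined(__SSE2__)
    if (cpu_has_sse2())
        return scale_sse2;
#endif
    return scale_generic;
}
```

A caller would do `scale_fn scale = pick_scale();` once and then `scale(values, count, 2.0);` wherever the work happens. In a real single-EXE build, the SSE2 routine would normally live in its own source file compiled with SSE2 enabled, so that the rest of the program stays safe to run on older CPUs. The multiple-executable approach suggested earlier in the comments avoids that split entirely.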

I’m no programmer but what about an SSE3 client? Newer Athlon64s and all X2s can use it in addition to many of the Intel products.

SSE2 would get every Intel Pentium 4 user and every AMD Opteron and Athlon 64 user.

SSE3 would limit the performance increase to users with Intel Pentium 4 (Prescott and higher), Pentium D’s, and AMD Athlon 64 (since Venice Stepping E3 and San Diego Stepping E4), Athlon 64 X2, Athlon 64 FX (since San Diego Stepping E4), Opteron (since Stepping E4), Sempron (since Palermo Stepping E3), and Turion 64.

Basically, more customers would benefit from SSE2 optimizations. Those people with SSE3 support are still able to take advantage of SSE2 instructions, which I suspect, by Oliver’s enthusiasm, are quite significant and worth the programmer time.

People running the game on old AMD Thunderbird or Pentium III processors already play at the low settings and realize that in order to make ALL of their games better they’ll require a new system.

Er… no, I don’t. I run middle of the road settings on a Thunderbird XP 2500, with an ATI 9800 Pro.

Thunderbirds are pretty old… ;)

We’d also have to upgrade compilers significantly, and I’m not quite ready to make that leap of faith. Version 3s of hardware and software have been notoriously bad over the last 20 years, so I’d rather go with SSE2, which is a known quantity.

There’s no programmer time involved in the SSE2 optimizations – yet. It’s just a compiler option that says “turn on these things”. It’s not all-out SSE2 optimization, but it does mean that things like

i = (int)j;   /* i is an int, j is a double */

become a single, pipelineable CPU instruction instead of the 11-18 CPU instructions it takes without SSE.

KFS how does that single instruction end up happening?
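A rough sketch of how that single instruction can come about, under some assumptions the post does not confirm. C requires a double-to-int cast to truncate toward zero, but the old x87 floating-point unit rounds to nearest by default, so compilers targeting x87 typically save the control word, switch the rounding mode, store the integer, and switch back. SSE2 has a truncating convert instruction (`CVTTSD2SI`) that does the job directly. The function names below are hypothetical. The flags mentioned in the comments (`-msse2 -mfpmath=sse` for GCC, `/arch:SSE2` for MSVC) are examples only, since the post does not say which compiler is used.

```c
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* What the source code says. With SSE2 code generation enabled
 * (e.g. GCC -msse2 -mfpmath=sse, or MSVC /arch:SSE2), a compiler can
 * turn this cast into one truncating convert instruction. */
int to_int_cast(double j)
{
    return (int)j;
}

/* The same conversion spelled out by hand with the SSE2 intrinsic that
 * maps to CVTTSD2SI. Falls back to the plain cast when the file is not
 * being compiled for SSE2. */
int to_int_sse2(double j)
{
#if defined(__SSE2__)
    return _mm_cvttsd_si32(_mm_set_sd(j));
#else
    return (int)j;
#endif
}
```

Both functions truncate toward zero, so `to_int_cast(3.7)` and `to_int_sse2(3.7)` each give 3, and -3.7 gives -3. The point of the compiler switch is that nobody has to write the second form by hand: the existing casts get the cheaper instruction for free.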

Krenn: Er… no, I don’t. I run middle of the road settings on a Thunderbird XP 2500, with an ATI 9800 Pro.

I’m sorry. I was under the impression that AMD added SSE2 with the XP processors (although to clarify, the Athlon XP is not a “Thunderbird” processor, it’s a “Thoroughbred”).

Oh, my bad, I thought the XP was a Thunderbird. Sorry about the confusion there. Yeah, if I was running on my old 1GHz Athlon, it would be low settings across the board.

Thunderbirds were the generation prior to the Thoroughbreds. 1.4GHz was the top of the line Thunderbird. Still have one lying around.
