We upgraded a bunch of stuff today; we’ve been running around like headless chickens since Friday, when some rookie came along and tried hack 101 on our auth server and … succeeded. Entering a username of 1' or '1'='1 got them past the auth server and into the game; it also periodically took out the auth server. Course, it also logged a bunch of stuff.
Shame, shame, shame on the coders responsible for passing data from a client straight thru to an SQL query. Shame on the playgate author for passing that kind of crap to the server in the first place! I would say shame on me for not spotting it sooner, but with no apparent pressing reason to do so, we hadn’t upgraded the auth box in so long that we could no longer build executables for it.
We are incredibly lucky to have gone so far with nobody trying that!
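For anyone wondering why that username works, here is a minimal sketch of the same class of bug and the usual fix. It is not our auth code: it uses Python's built-in sqlite3 instead of MySQL, and the `players` table and both lookup functions are made up for illustration. The broken version pastes client input into the SQL text; the fixed version passes it as a bound parameter, so the quotes in the username are just data.

```python
import sqlite3


def make_db():
    # Hypothetical stand-in for the auth database: one table, one account.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (username TEXT)")
    conn.execute("INSERT INTO players VALUES ('alice')")
    return conn


def account_exists_unsafe(conn, username):
    # Vulnerable: the client-supplied string becomes part of the SQL itself.
    sql = "SELECT COUNT(*) FROM players WHERE username = '%s'" % username
    return conn.execute(sql).fetchone()[0] > 0


def account_exists_safe(conn, username):
    # Parameterized: the driver binds the value, it is never parsed as SQL.
    sql = "SELECT COUNT(*) FROM players WHERE username = ?"
    return conn.execute(sql, (username,)).fetchone()[0] > 0


if __name__ == "__main__":
    db = make_db()
    attack = "1' or '1'='1"
    print(account_exists_unsafe(db, attack))  # the injected OR clause matches every row
    print(account_exists_safe(db, attack))    # no account has that literal name
```

With the unsafe version the WHERE clause ends up as `username = '1' or '1'='1'`, which is true for every row. With the bound parameter the whole string is compared as a literal username and matches nothing.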
When it became clear that we weren’t going to be able to roll out a fix and that little miss smacktard was going to continue trying to get in, it was pretty late on Saturday morning, so I created a little script called “fuckthebastard.sh” which monitored and repaired the situation, while outputting logs in the format his ISP asked for.
Not ideal tho, because this would cause a restart of the authentication token sequence. That’s something I fixed long ago, but I couldn’t build a binary with the fix that the auth host would run.
Another trivial issue with auth was that the old executable *had* to be run in a debugger or it died horribly. This is because MySQL provides a feature to “ping” the database and make sure the connection you’re about to make a request on is still alive. If it’s timed out/gone away, mysql_ping will wake it up and reconnect. Schweet!
Only some smart mysql genius also made it generate an operating system signal (SIGPIPE) that you have to write specific code for, or your application aborts.
It’s widely documented – in Google. Several links point to MySQL’s official forums/bugtracker, where they say they are aware of it and will fix it in the next patch (in 1999). It’s still current in 4.11.
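The “specific code” in question is usually just telling the OS to ignore SIGPIPE, so that a write to a dead connection fails with an error you can handle instead of killing the process; in C that is `signal(SIGPIPE, SIG_IGN)`. Here is a rough POSIX-only sketch of the effect in Python, using a plain pipe with its read end closed as a stand-in for a MySQL connection that has gone away (the function name is made up). Python already ignores SIGPIPE at startup; the explicit call is there to mirror what a C program would have to do.

```python
import os
import signal


def write_to_dead_peer():
    # Ignore SIGPIPE so a write to a closed peer returns an error
    # instead of terminating the process.
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)

    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the other end going away

    try:
        os.write(write_fd, b"ping")
        return "written"
    except BrokenPipeError:
        # With SIGPIPE ignored, the failure shows up as EPIPE here,
        # which is the point where real code would reconnect.
        return "broken pipe"
    finally:
        os.close(write_fd)


if __name__ == "__main__":
    print(write_to_dead_peer())
```

With the default SIGPIPE disposition the same write would simply end the process, which matches the “died horribly” behaviour outside the debugger.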
We also took the opportunity to bring one of the last Dell 2×3.0GHz Xeon systems into the live cluster and move the primary game database, the database proxy app and the strat host onto that machine.
Previously strat was using 0.01% of the CPU of a 2×PIII 800MHz system, and the dbd+database were using 45-57% of the CPU on another 2×PIII 800MHz system.
Combined, they are using less than 0.25% of the CPU of this new box :) It may seem a waste but they are all three transaction servers that need to have the best possible return time, so running them like this is healthy. Could be a help in reducing some of those no-AAR issues.
After this merge, all of the live cluster is now on dual xeons, which means I don’t have to faff around building a mixture of p3 and p4 versions of the live cluster libraries/binaries, and I can turn on the full -march=pentium4 -mmmx -msse -msse2 -mfpmath=sse -O4 optimization for building game hosts. CPU usage is down about 5% on each box as a result.