Angry Players Make Sunday More Interesting

Youtopia has been growing quickly over the last couple of weeks. It’s fun to watch and the team is really excited about it. Of course, with the growth comes a lot of performance tuning of our code. Today we hit an issue I wasn’t expecting at all...

We’ve been running Windows 2008, IIS7, and ASP.NET 3.5 in production for a while now, but haven’t had to do much of any performance tuning. It just works, and is fast. Which is awesome!

But today, Youtopia was running slowly and requests were hanging so I investigated. The databases were performing normally and not having any locking issues. The network looked good. The memcached cluster was healthy. The queueing service looked great. The ASP.NET performance counters even looked good at first glance.

None of the diagnostic performance counters I’d used in the past (such as Requests in Application Queue) showed the issue, but requests were absolutely being queued, or otherwise not processed immediately. There were also plenty of free worker and IOCP threads. The only thing that clued me in was that the Pipeline Instance Count and Requests Executing counters were exactly the same (96) on all the servers. So I started investigating from there.

It turns out that the threading model of ASP.NET in IIS7 integrated mode imposes a (configurable) limit of 12 concurrently executing requests per CPU. We hit this limit in Youtopia today because we hold requests open for asynchronous Comet-like communication and there were over 288 people online simultaneously. Each of our three eight-core web servers had 96 (8*12) people connected to it and wasn’t really serving any other requests. We weren’t running into any thread configuration limits, because the long-running requests are asynchronous and don’t use ASP.NET worker threads.
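The arithmetic behind those numbers is simple enough to sketch. The snippet below is only an illustration in Python; the function names are made up for this example, and the figures are the ones quoted above.

```python
import math


def max_concurrent_requests(servers: int, cores_per_server: int, per_cpu_limit: int) -> int:
    """Requests the whole cluster can execute at once before further ones wait."""
    return servers * cores_per_server * per_cpu_limit


def required_per_cpu_limit(connections: int, servers: int, cores_per_server: int) -> int:
    """Smallest per-CPU limit that lets every held-open connection execute."""
    return math.ceil(connections / (servers * cores_per_server))


per_server = max_concurrent_requests(1, 8, 12)
cluster = max_concurrent_requests(3, 8, 12)
print(per_server)  # 96
print(cluster)     # 288

# Hypothetical target: 1000 simultaneous long-lived connections on the same hardware.
print(required_per_cpu_limit(1000, 3, 8))  # 42
```

Any limit chosen this way covers only the held-open connections, so leave headroom for ordinary page requests on top of it.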

Here are a few great links that came out of my research.

With ASP.NET 3.5 SP1 it boils down to a simple configuration file change. Use something like this in the aspnet.config file (on x64 it’s at C:\Windows\Microsoft.NET\Framework64\v2.0.50727\aspnet.config). The values shown are the defaults; adjust maxConcurrentRequestsPerCPU to suit your needs.


    <system.web>
        <applicationPool maxConcurrentRequestsPerCPU="12" maxConcurrentThreadsPerCPU="0" requestQueueLimit="5000"/>
    </system.web>

In addition, the application pool needs to be configured to allow more requests. By default its queue holds only 1000 requests. This is set under Advanced Settings for the application pool in IIS 7 Manager. Set Queue Length to 5000 to match the requestQueueLimit in the system-level configuration.

Ditch Your Events (Part 1)

About four months ago Max, Hive7’s Lawful Evil CEO, decided we needed to take our
games to the next level and build something fun and accessible that everyone who
plays “those farming games” would want to play. We all brainstormed, pitched our
ideas to the company, and everyone voted by comparing every idea against every
other. I wish we had a digital photo of the giant matrix on the whiteboard. There
were a bunch of great ideas, but in the end… I won! Youtopia was born.

Youtopia was released to the public about three months from its inception. Hats
off to the dev and art team for pulling this one together. A new technology for
the developers and fully animated objects for the art team led to much blood, sweat,
and tears, but we got ‘er done! Of course, we’re still actively developing Youtopia,
and there are lots of great things planned for the future! But, back to my tech
article…

It’s been a long time since I’ve stepped out of my comfort zone and learned a new
(to me) technology. Don’t get me wrong, I’m always experimenting with the latest
.NET based thingie-ma-bobbers out there, but I haven’t used a completely foreign
development environment since C#/.NET came out over eight years ago. But for this
project I needed to learn Flash/AS3, and it needed to be done yesterday. Luckily
for me nobody else on our dev team knew Flash so I could still pretend like I knew
what I was talking about and make lots of (un)educated architectural decisions without
anyone being the wiser!

One such recent decision was to use an event-driven property binding system. Youtopia’s
engine is based on a great open source game engine, brought to you by some of
the Dynamix/GarageGames people, called the PushButton Engine (or PBE). In PBE there
is a class called PropertyReference. This class facilitates a late-bound approach
for one component to read the value of a property (member variable or getter/setter)
on another component. It’s a pretty cool pattern, but it requires you to poll the
target component whenever you want to know if the property changed. This works fine
when you’re talking about tens or hundreds of components. But in Youtopia we have
thousands of entities in the scene at once. We needed this binding to be event-driven.

Of course, with my .NET background I immediately reached for the INotifyPropertyChanged
pattern used in .NET’s data binding infrastructure.
With INotifyPropertyChanged it is the responsibility of the object owning the property
to raise an event whenever a property value changes. Any listeners will then immediately
know they need to poll for the new value if they want it.

This works great in .NET and is very performant. But in Flash, events are a whole
other story. They are an extremely feature-rich subsystem that I don’t really want
to get into. In the end, all the features and memory allocations when you raise
an event lead to poorer performance than we needed for Youtopia. We need every bit
of CPU power on that single Flash thread and really shouldn’t be wasting it raising
events.

So, I shamelessly copied the .NET patterns and brought them over to AS3. Let’s start
at the core. In order for things to perform their best, I couldn’t use the built-in
Events. Though Troy did the benchmarking legwork, he didn’t provide an implementation
we could use to register callbacks and call multiple functions. So, I wrote a
MulticastFunction that behaves a whole lot like the MulticastDelegate in .NET. Usage
is really straightforward.


  var func:MulticastFunction = new MulticastFunction();

  //register my listener callback
  func.add(
    function():void
    {
      //this callback does amazingly cool stuff
      trace("hello from the callback");
    });

  //calls all the callbacks that have been added, in the order they were added
  func.apply();

As you can see, dealing with the MulticastFunction is a lot like the EventDispatcher,
but each MulticastFunction is only designed to be used for a single event. So, to
use it for events, create a public getter on your class named something reasonable
and add your callbacks to it. Done!
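To make the "one callback list per event, exposed through a getter" idea concrete, here is a minimal sketch in Python rather than AS3, chosen only so it runs anywhere. The `Entity` class and its `x_changed` getter are hypothetical names, and this `MulticastFunction` is a simplified stand-in: it iterates over a snapshot, so unlike the AS3 version it does not pick up callbacks added during an apply.

```python
from typing import Callable, List


class MulticastFunction:
    """Simplified stand-in: an ordered list of unique callbacks."""

    def __init__(self) -> None:
        self._functions: List[Callable[..., None]] = []

    def add(self, func: Callable[..., None]) -> bool:
        """Registers func; returns False if it was already registered."""
        if func in self._functions:
            return False
        self._functions.append(func)
        return True

    def remove(self, func: Callable[..., None]) -> bool:
        """Unregisters func; returns False if it was not registered."""
        if func not in self._functions:
            return False
        self._functions.remove(func)
        return True

    def apply(self, *args) -> None:
        # Iterate over a snapshot so callbacks can add or remove safely.
        for func in list(self._functions):
            func(*args)


class Entity:
    """Hypothetical component with one property and one 'event' for it."""

    def __init__(self) -> None:
        self._x = 0
        self._x_changed = MulticastFunction()

    @property
    def x_changed(self) -> MulticastFunction:
        # The public getter listeners attach their callbacks to.
        return self._x_changed

    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int) -> None:
        # Notify listeners only when the value actually changes.
        if value != self._x:
            self._x = value
            self._x_changed.apply(self)


seen = []
entity = Entity()
entity.x_changed.add(lambda e: seen.append(e.x))
entity.x = 5
entity.x = 5  # unchanged, so no callback fires
entity.x = 7
print(seen)  # [5, 7]
```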

Ok, I realize I keep talking about event dispatching speed, but haven’t put my money
where my mouth is. I wrote some benchmarks of my own and here is the output with
a release build, in the latest standalone Flash 10 player. It does five test runs.
Download the Source


running tests...
Event dispatching took 848ms
MulticastFunction took 355ms

running tests...
Event dispatching took 846ms
MulticastFunction took 351ms

running tests...
Event dispatching took 834ms
MulticastFunction took 352ms

running tests...
Event dispatching took 836ms
MulticastFunction took 351ms

running tests...
Event dispatching took 823ms
MulticastFunction took 343ms

Yup, that’s right. MulticastFunction is roughly 2.4x faster, and I haven’t spent
much time tuning it. For example, it’s using an Array under the hood and doing more
work than it needs to during the apply call. Events will also become less performant
over time as you have to create (and potentially clone) Event objects for every
dispatch, causing a lot of garbage collection pressure. Here’s the MulticastFunction,
with lots of comments, or you can download the source.


package com.jdconley
{
    /**
     * A wrapper that mimics the synchronous behavior of the MulticastDelegate used in .NET for events.
     * This doesn't support any of the async methods, as we don't have free threading here.
     * It also doesn't support return values.
     * See: http://msdn.microsoft.com/en-us/library/system.multicastdelegate.aspx
     */
    public class MulticastFunction
    {
        private var _functions:Array = [];
        private var _iterators:int = 0;

        /**
         * Adds a function to be called when apply is called.
         * If the function is already in the list it won't be added twice.
         * Returns true if the function was added.
         **/
        public function add(func:Function):Boolean
        {
            var i:int = _functions.indexOf(func);
            if (i > -1)
                return false;

            //add new functions to the end so they are picked up live during an apply
            _functions.push(func);
            return true;
        }

        /**
         * Removes a function to be called when apply is called.
         * Returns true if the function was removed.
         **/
        public function remove(func:Function):Boolean
        {
            var i:int = _functions.indexOf(func);
            if (i < 0)
                return false;

            if (_iterators == 0)
                _functions.splice(i, 1);
            else
                _functions[i] = null;

            return true;
        }

        /**
         * Synchronously applies all functions that have been added.
         * Functions can be safely added or removed during an apply and changes will take effect immediately.
         * Added functions will be called, and removed functions will not.
         **/
        public function apply(thisArg:*=null, argArray:*=null):void
        {
            _iterators++;
            var holes:Boolean = false;

            for (var i:int = 0; i < _functions.length; i++)
            {
                var f:Function = _functions[i];
                if (f == null)
                    holes = true;
                else
                    f.apply(thisArg, argArray);
            }

            //cleanup holes left by removing functions during this apply call.
            //if any of the function apply's throw an error the state of _iterators will be off.
            //but, we'll only leak array slot memory if functions are removed.
            //putting a try/finally or try/catch block here significantly decreases performance.
            if (--_iterators == 0 && holes)
            {
                for (i = _functions.length - 1; i >= 0; i--)
                {
                    if (_functions[i] == null)
                        _functions.splice(i, 1);
                }
            }
        }

        /**
         * Removes all functions from the list. Stops the current apply call, if there is one.
         **/
        public function clear():void
        {
            _functions = [];
        }
    }
}

Although capture, bubble, weak references, and priority are handy features of the
Flash eventing system, they’re not always necessary, and they will hurt your performance
when you might have thousands of events firing per frame.

In Part 2 we’ll put this MulticastFunction to use in a more meaningful way with
the INotifyPropertyChanged implementation.

Functional Optimistic Concurrency in Knighthood

A few months ago Phil Haack wrote about how C# 3.0 is a gateway drug to functional programming. I couldn’t agree more. I find myself solving problems using functional rather than imperative programming quite often nowadays. It’s much more elegant for many problem spaces.

Before we go any further, here’s the sample app used for this article. Even if you don’t like my writing, you should play with it. Yeah, you! optimistic-concurrency.zip

One problem space that fits very well with functional patterns is in developing apps that have to use optimistic concurrency to maintain data consistency at scale. Here at Hive7 we build PvP games. In such games, multiple people and background processes are often affecting the same entity at the same time. We can’t use coarse grained locks or high isolation levels in MS-SQL, or the whole game would come to a halt. Here’s a common scenario in a game like Knighthood:

Multiple rival lords are attacking my Kingdom at once trying to steal my most prized vassal, my wife! My wall is staffed with a heavy defense, and my hospital has a strong set of medics healing my kingdom over time. But to keep a handle on the attack I also have to continuously spend gold to heal my defensive army.

In this common use case there are a number of subtleties. First, multiple people are attacking me at once. That means they’re doing damage to my defenses in real time, and at the same time. My hospital is healing my vassals over time. This occurs in a background process once every few minutes. And I’m triggering an instant heal to my defensive vassals using my gold supply. My Marketplace is also generating gold for me over time in another background process. To top it all off, this is happening across a cluster of application servers that are certain to be processing multiple requests simultaneously. Phew!

So what does all that mean? Well, basically, there are a lot of possibilities for change conflicts. And we have to deal with those conflicts to both keep a consistent data model and perform well.

There are a number of potential strategies for managing these change conflicts in the persistent store – a few beefy Microsoft SQL Server databases in our case. We chose to go with optimistic concurrency and an abort-on-conflict transaction strategy. That basically means when we write data to the database we make sure we are always updating the most recent version of a row. If an application attempts to write changes based on an old version of the row, the data access layer throws an exception and aborts the transaction. Knighthood uses NHibernate, so the validation is done for us automatically using a simple version number on the row. The basic algorithm is:

  1. Read data and deserialize it into objects (done by NHibernate)
  2. Modify objects in code
  3. Tell NHibernate to persist the changes, which does the following
    1. Increments the version number
    2. Finds all the changes and batches up insert/update calls
    3. Uses the version number that was read in the WHERE clause of updates like: "UPDATE Table SET Col1='blah' WHERE Version=36"
    4. Checks the number of rows modified as reported by SQL Server and throws an exception if it’s an unexpected number
As you can imagine, this fails regularly in a high concurrency scenario, but it succeeds orders of magnitude more often than not. It’s also pretty standard for any web app nowadays.

The only problem is, to preserve consistency, an exception is thrown and the transaction is aborted when change conflicts occur. That means whatever request the application or user issued fails. We could show the user a friendly error message, but that would be a frustrating experience. Nobody likes seeing errors for non-obvious reasons. And in the case of headless software running in the background the error would just be in a log somewhere. If it’s something important that needs to happen, then we have to make sure it gets done! So we imperative programmers devise a retry scheme and write a loop with an exception trap around our code. Maybe you get clever and create a class that does this and raises an event any time you need to execute your retry-able code. But this gets pretty cumbersome. Enter functional programming!

We have a little class named DataActions that is used to simplify and consolidate this retry process and make it painless to use. I’m going to use LINQ to SQL as the example here. Here’s some usage code:

DataActions.ExecuteOptimisticSubmitChanges<GameDataContext>(
    dc =>
    {
        var playerToMod = dc.Players.Where(p => p.ID == playerId).Single();
        SetRandomGold(playerToMod);
    });

As you can see, it’s really straightforward. Notice all the goodness going on there. We don’t have to instantiate our own DataContext, manually submit the changes, or worry at all about transactions. It’s all handled by the wrapper. You just have to provide some code to execute once the DataContext has been instantiated.

The ExecuteOptimisticSubmitChanges helper method itself is pretty simple as well:

public static void
ExecuteOptimisticSubmitChanges<TDataContext>(Action<TDataContext> action)
    where TDataContext : DataContext, new()
{
    Retry(() =>
        {
            using (var ts = new TransactionScope())
            {
                using (var dc = new TDataContext())
                {
                    action(dc);
                    dc.SubmitChanges();
                    ts.Complete();
                }
            }
        });
}

And, finally, we have the Retry method:

public static void Retry(Action a)
{
    const int retries = 5;
    for (int i = 0; i < retries; i++)
    {
        try
        {
            a();
            break;
        }
        catch
        {
            if (i == retries - 1) throw;

            //randomized retry back-off; the window grows quadratically: [i^2, (i+1)^2] ms.
            var rand = new Random(Guid.NewGuid().GetHashCode());
            int nextTry = rand.Next(
              (int)Math.Pow(i, 2), (int)Math.Pow(i + 1, 2) + 1);

            Thread.Sleep(nextTry);
        }
    }
}
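To see what the back-off actually does, here is a rough Python transcription of the same loop. It is a sketch, not the sample app’s code: `backoff_window` and the injectable `sleep` parameter are additions made for illustration, and `random.randint` (inclusive on both ends) stands in for `rand.Next(min, max)` with its exclusive upper bound.

```python
import random
import time


def backoff_window(attempt):
    """Inclusive millisecond bounds equivalent to rand.Next(i^2, (i+1)^2 + 1)."""
    return attempt ** 2, (attempt + 1) ** 2


def retry(action, retries=5, sleep=time.sleep):
    """Runs action, retrying on any exception; rethrows after the last attempt."""
    for attempt in range(retries):
        try:
            return action()
        except Exception:
            if attempt == retries - 1:
                raise
            low, high = backoff_window(attempt)
            sleep(random.randint(low, high) / 1000.0)


# Only attempts 0 through 3 are ever followed by a sleep when retries is 5.
windows = [backoff_window(i) for i in range(4)]
print(windows)  # [(0, 1), (1, 4), (4, 9), (9, 16)]

calls = []


def flaky():
    # Fails twice with a simulated conflict, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("simulated conflict")
    return "ok"


result = retry(flaky, sleep=lambda seconds: None)  # skip real sleeping in the demo
print(result, len(calls))  # ok 3
```

The windows are tiny, so even the worst case adds only a few tens of milliseconds of delay before the final attempt.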

When you string all this together you get pseudo-stacks that look like:

MyCode
  ExecuteOptimisticSubmitChanges
    Retry
      ExecuteOptimisticSubmitChanges
        MyCode

So, why should you care? The calling code is really easy to read, and you get a number of other benefits with this code. In addition to handling exceptions caused by concurrency errors, you also get retries on deadlocks and on the more common SQL connection errors.

I put together a little sample application you can play with. It uses these helpers and comes with a SQL database. The sample simulates really high concurrency and you can watch it deal gracefully with deadlocks. Then you can change line 29 of Program.cs and execute the same concurrent code without retries enabled. It outputs the number of failed transactions and a bunch of other interesting stuff to the console. Here’s some example output:

...

Retrying after iteration 0 in 1ms
Retrying after iteration 0 in 0ms
Thread finished with 0 failures. Concurrency at 3
Retrying after iteration 1 in 3ms
Retrying after iteration 1 in 4ms
Thread finished with 0 failures. Concurrency at 2
Retrying after iteration 2 in 5ms
Thread finished with 0 failures. Concurrency at 1
Retrying after iteration 3 in 15ms
Thread finished with 0 failures. Concurrency at 0

0 total failures and 7 total retries.
All done. Hit enter to exit.

And the same test run with retries disabled:

...

Starting worker. Concurrency at 8
Thread finished with 0 failures. Concurrency at 7
Thread finished with 0 failures. Concurrency at 6
Thread finished with 1 failures. Concurrency at 5
Thread finished with 1 failures. Concurrency at 4
Thread finished with 1 failures. Concurrency at 2
Thread finished with 2 failures. Concurrency at 3
Thread finished with 0 failures. Concurrency at 1
Thread finished with 2 failures. Concurrency at 0

7 total failures and 0 total retries.
All done. Hit enter to exit.

Here’s the download link again: optimistic-concurrency.zip

Let me know if you have any questions.

ioDrive, Changing the Way You Code

Our brilliant chief architect JD has been playing recently with a cool, pricey toy from Fusion-io. After running the ioDrive through our demanding MMO database usage patterns, the results are amazing!


link: pretty loaded

Pretty Loaded is a nice collection of beautiful loading animations. It’s interesting to see how far we’ve come from the progress bar.

Pretty Loaded

Be inspired: Onesize’s new motion reel

I like watching reels well enough but don’t normally post them. This one is particularly good, though, for its soundtrack and effects as well as its editing.

Onesize Reel 2008

And just to keep posting more about stuff that isn’t web game related, check out this incredibly talented group in the UK that has done consistently amazing work in motion graphics: Mainframe

Seriously though, it’s always good to look outside your domain for inspiration and ideation. The most impactful innovation happens when translating ideas from one domain to another.

Users reminisce about Knighthood over the past year…

The Good Old Days Post

When a game can create moments powerful enough for people to feel nostalgia, I get a warm tingly feeling that we’ve done something good. Gaming can bring people together online and be more than just killing boredom.

Put down the abstract factory and get something done

This may sound shortsighted, and it is. In fact, that’s the point. Projects change. You have to adapt. You will never know how your code will be used 5 years from now. Stop thinking about it.


Big brands do mmos

Big brands with strong entertainment content are finding that the best way to go online and build a successful community on the web is to create a social-network-powered MMO game around their content.

Disney’s Pixar will be releasing a Cars-flavored MMO next year.

Cartoon Network is releasing FusionFall this month.

Fusion Fall

Abbey’s Magic Academy…

Thought I’d share some more art from our archives. This one is a game based around a magic academy similar to Harry Potter and Co. It didn’t really develop into anything more than just a concept.

I did get to come up with a couple of title character designs that I still enjoy looking back on. There were only a couple of days to concept the two characters, the result being wholesome Abbey and her mischievous little brother. Oh, and a couple of cute and cuddly sidekicks of course. I’m such a sucker for those clean vector lines à la Illustrator.

Abbey's Magic Academy

Abbey