Warhammer: Mark of Chaos Performance Fix


Introduction

Warhammer: Mark of Chaos is a real time strategy game released in 2006.

I did not play this game until a few weeks ago, when a friend brought it to my attention because he couldn't get it working in Wine. I gave it a try and noticed that the game did in fact work, but it was running extremely slowly: it took a good 10 minutes to get from the desktop to the game, and even then performance was terrible, with lag spikes lasting several seconds for no apparent reason.

A quick look online also revealed some people complaining about slow loading times even on Windows, so I decided to investigate and maybe try to come up with a solution.

Installation

If you're just here for the fix, here's what you have to do:

  • Download the patch archive
  • Open the game's folder (right click the icon -> open file location)
  • Find a file called binkw32.dll and rename it to binkw23.dll
  • Extract binkw32.dll from the downloaded archive to the game's folder
  • Run the game

Note: this fix is compatible only with the latest version of the game (2.14), which you can get from GOG.

Investigating the problem

My first thought was that it could be some kind of synchronization issue, but changing fsync and esync settings in Wine had no effect, nor did trying different versions of Wine, DXVK and WineD3D, so it was something more low level.

Wine has a very useful feature that can be activated with the WINEDEBUG=+relay environment variable: it logs every single call that the application makes to external functions so they can be analyzed manually, similarly to strace. I launched the game with this enabled and logged the calls to a file. After less than 20 seconds the log was already 700 megabytes. I opened it and noticed that, after an initialization phase, the game was spamming MILLIONS of calls to two Windows functions:

  • GetThreadTimes: can be used to obtain information about a thread: creation timestamp, destruction timestamp, how much time it spent in kernel mode, and how much time it spent in user mode
  • QueryPerformanceCounter: gives a high resolution (<1us) timestamp

Millions of system calls

Look at all these calls!

I can see why a game could need a high resolution timestamp, so QueryPerformanceCounter is not my main concern, but GetThreadTimes? Why would a game need to know how much time it spent in kernel mode? To me, this screams "leftover debug code": I think there is some code that the developers used to profile the game and identify bottlenecks, and they forgot to disable it in the final build, making the CPU bottleneck worse.

Proof of concept

Before diving deeper into the investigation, I decided to make a "proof of concept" program to see if one of these two system calls was the cause of the slowdown. The code below does 100000 calls to GetThreadTimes and measures how long it took using QueryPerformanceCounter:

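#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

int main(void){
    LARGE_INTEGER t1,t2;
    HANDLE hThread=GetCurrentThread();
    FILETIME lpCreationTime, lpExitTime, lpKernelTime, lpUserTime;
    BOOL bResult;
    printf("Running...\n");
    fflush(stdout);
    // Timestamp before the loop
    QueryPerformanceCounter(&t1);
    for(int i=0;i<100000;i++){
        // The returned times are ignored, only the cost of the call matters
        bResult=GetThreadTimes(hThread,&lpCreationTime,&lpExitTime,&lpKernelTime,&lpUserTime);
        if(!bResult){
            printf("GetThreadTimes failed\n");
            return 1;
        }
    }
    // Timestamp after the loop
    QueryPerformanceCounter(&t2);
    // QuadPart is a 64-bit integer, so it must be printed with %lld, not %d
    printf("Ticks: %lld\n",(long long)(t2.QuadPart-t1.QuadPart));
    fflush(stdout);
    system("pause");
    return 0;
}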

You can compile this code on Linux using MinGW with this command:

i686-w64-mingw32-gcc -static -o test.exe test.c

This will generate a test.exe file that you can use to compare Windows to Wine. The number of Ticks represents the time it took to execute the 100000 calls.

Wine took 8,326,350 ticks while Windows 10 took only 139,938 ticks, meaning that this system call is approximately 60 times slower on Wine!

I'm not sure what makes GetThreadTimes so slow in Wine, but looking at the source code I'm willing to bet that the problem is here: GetThreadTimes calls NtQueryInformationThread and it looks like this function needs to communicate with the Wine server process to fetch the information... through a socket!

Now that we know what's slowing down the game, let's see if we can fix it!

Fixing the problem

To see what the game does with this GetThreadTimes function, I opened the game's exe in Ghidra, a disassembler similar to IDA Pro, except it's open source and multiplatform and oh my god did I just install software made by the fucking NSA? Oh well, it's too late now.

The game seems to be calling GetThreadTimes in 3 locations:

Import of GetThreadTimes

Let's see them one by one.

First appearance

The first call is in a function starting at 0x402cd0 and all it seems to do is call GetThreadTimes and QueryPerformanceCounter and update some variables. The function takes no inputs, it has no outputs, and calls to this function are scattered in dozens of locations throughout the game code.

Disassembly of the first function

To me, this looks like something used for profiling the execution times of some parts of the game code, so I decided to simply replace the whole function with a return and hope for the best.

New version of the first function

Second appearance

The second call is in a function starting at 0x402d40 and is very similar to the previous one. Calls to this function are also scattered throughout the game code so it probably has the same purpose.

Disassembly of the second function

I once again decided to replace the whole function with a simple return and hope for the best.

New version of the second function

Third appearance

The third function, starting at 0x9b0b80, is more complex and our offending code is right in the middle of it, but it only gets called once at the end of the loading process so it probably initializes something.

Disassembly of the third function

I decided to just jump over it and see if it worked.

New version of the third function

Testing

After these changes, I launched the game expecting an instant crash, but to my surprise it actually worked, and it started very quickly even in Wine, with no crashes, and I was able to get into the game! Looks like my skills are improving :)

I played different skirmish modes, tried some replays, and played through part of the campaign without a single problem, so my assumption was probably right: this was leftover debug code, not essential for the game to work. I will update the article when I can confirm that the entire game can be beaten without issues.

Update: I played through the entire game, tried some skirmishes, and played replays, all without a single issue, so I can confirm that this fix is stable.

I did not test the multiplayer since it uses the now defunct GameSpy, so I have no idea if that would work with an emulator. If you try it, please let me know.

Performance on a modern PC: AMD Ryzen 7 5800x

This game is severely CPU bottlenecked: it only uses one core and it requires high clocks and high IPC. Even though this modern CPU is way faster than anything that was available in 2006, the game can still bring it to its knees in some scenarios.

Hardware configuration:

  • AMD Ryzen 7 5800x
  • MSI B550 Tomahawk
  • 64GB G.Skill Ripjaws V DDR4 3600 C16 RAM
  • AMD Radeon RX 6900XT
  • Samsung 980 Pro 1TB SSD
  • Windows 10 LTSC 2021
  • Manjaro Linux 21.3.3 (Kernel 5.18) + Wine-GE-Proton7-20 + DXVK 1.10.1-async

The first test I did was a comparison of the loading times between the patched version and the unpatched version and between Windows and Wine. In this test I measured the time it took to load the first tutorial level averaged over 5 runs.

Loading times chart

On Windows, loading times were reduced by about 36%, going from an average of 9.9 seconds down to 6.3. On Wine, the situation improved dramatically, going from an insane 177.5 seconds (almost 3 minutes) down to only 5.3 seconds, which is more than 33 times faster. We can also see that with the patch, Wine was faster than Windows, although at these speeds it hardly matters.

For the next test, I played a quick skirmish (~5 minutes) with a large number of rats to stress the CPU bottleneck, then played the replay on both Windows and Wine, with and without the patch, and compared the results. As a reminder, a frame time chart shows the time between frames being presented to the display (so it's the inverse of framerate). Lower frame times are better, but consistency matters most: variations in frame times are perceived by the user as stuttering or lag spikes.

Frame times chart on Windows

On Windows, the in-game performance difference on this CPU is minimal, but a reduction in stuttering is visible in the frame times chart.

Frame times chart on Wine

On Wine, the situation is completely different: without the patch, the game is pretty much unplayable, with severe stuttering, lag spikes, and a very unstable framerate. The situation is dramatically improved with the patch, where the game is basically perfect except for one small spike.

Frame times chart on Wine and Windows

In this last chart you can see that the game now actually runs significantly better in Wine compared to Windows, with fewer and smaller spikes, and a generally more stable framerate. Well done, Linux!

Let's intensify the CPU bottleneck in the next tests.

Performance with a severe CPU bottleneck: Intel Core 2 Duo E8600

Hardware configuration:

  • Intel Core 2 Duo E8600
  • ASUS P5N-E SLI
  • 4GB DDR2 800MHz RAM
  • nVidia GTX 1050 2GB (to make sure we don't have a GPU bottleneck)
  • Samsung 850 Evo 250GB SSD
  • Windows 7 Enterprise x64
  • Windows 10 LTSC 2021

Loading times on this system were not as bad as I expected, but Windows 7 was (unsurprisingly) slightly faster.

Loading times chart

Moving on to frame times, the situation is much more interesting.

Frame times chart on the E8600

Frame times on the E8600 were pretty bad due to the CPU bottleneck, but with the patch we see an improvement of 10-15% when the game is CPU-bound (the middle part of the chart). Stuttering was also greatly reduced, with just a few spikes over 100ms, compared to the unpatched version that had spikes over 300ms.

I did not test Wine on this system since it wouldn't make much sense.

Distributing the fix

Distributing my fix turned out to be slightly more complicated than I anticipated. Obviously I can't just upload the modified exe because it's copyrighted, so at first I decided to go for the ASI mod approach like I did with my Mass Effect 2 fix a couple years ago... except that the modding scene for this game is beyond dead and there isn't even an ASI loader for it!

I chose to go for the fake Bink DLL approach instead. I used WarrantyVoider's Proxy DLL Maker to create a proxy DLL for binkw32.dll, the library that the game uses to play videos and that is loaded with the executable. The proxy DLL loads the real Bink DLL (which has been renamed to binkw23.dll) and passes all function calls to it, and while it does this, my code can run in its own thread and patch the game in memory.

Patching the game is simply a matter of finding the right patterns in the code segment in memory and overwriting them with the new code. It's nothing complicated, and you can simply look at the code if you want to see how I did it.

Bonus features

As an added bonus, I added a couple of features to the patch:

  • ASI loader: this can open the way to mods by making it possible to easily load custom code into the game
  • Update disabler: just in case the domain gets purchased by someone that decides to distribute malware through it

Compatibility

Version      Source   Compatibility
2.14         GOG      Yes
2.14 (EU)    Retail   Yes [1]
2.14 (US)    Retail   Probably [1]
2.14 (RU)    Retail   Probably [1]
<= 1.74      Retail   No

[1] The game's SafeDisc DRM is incompatible with modern systems, so a crack is required.

Source code

You can get the source code from here or from the GitHub repo. You'll need Visual Studio 2019 to build it.

The fix is distributed under the GNU GPL v3 license.

Needless to say, I am not affiliated with Deep Silver, Bandai Namco, Black Hole Entertainment or Games Workshop, and this is not an official fix.

Conclusion

This patch made the game playable for Linux users and improved the experience for Windows users as well. This is a big win for me because I feel like I suck at reverse engineering and the fact that I was able to investigate and fix this problem all by myself in a couple of days when nobody had succeeded before really encouraged me to continue exploring this path.

It's worth noting that even though the CPU bottleneck has been reduced with this patch, the game is still heavily bottlenecked by the rendering code: it does a ton of individual draw calls, everything happens on a single thread, and there's nothing I can do about it.

Mod ideas

Want to try to make an ASI mod for this game? Here's a list of ideas that I came up with while testing the game:

  • Increasing the amount of gold that you start with in skirmish mode
  • Removing the default population limit of 600
  • When all objectives are achieved, the game goes to the next level after a few seconds. It would be nice to have some time to heal the troops and pick up the items scattered around the level
  • When an army only has a hero left, the game starts a 3 minute countdown: if you don't win in time, the opponent wins automatically. This is unfair and makes hero versus hero battles impossible
  • Make the UI scale on high resolution displays
