Warhammer: Mark of Chaos is a real-time strategy game released in 2006.
I did not play this game until a few weeks ago, when a friend brought it to my attention because he couldn't get it working in Wine. I gave it a try and noticed that the game did in fact work, but it ran extremely slowly: it took a good 10 minutes to get from the desktop into the game, and even then performance was terrible, with lag spikes lasting several seconds for no apparent reason.
A quick look online also revealed some people complaining about slow loading times even on Windows, so I decided to investigate and maybe try to come up with a solution.
If you're just here for the fix, here's what you have to do:
Note: this fix is compatible only with the latest version of the game (2.14), which you can get from GOG.
My first thought was that it could be some kind of synchronization issue, but changing fsync and esync settings in Wine had no effect, nor did trying different versions of Wine, DXVK and WineD3D, so the cause had to be something lower level.
Wine has a very useful feature that can be activated with the WINEDEBUG=+relay environment variable: it logs every single call that the application makes to external functions so they can be analyzed manually, similar to strace. I launched the game with this enabled and logged the calls to a file. After less than 20 seconds the file was already 700 megabytes. I opened it and noticed that, after an initialization phase, the game was spamming MILLIONS of calls to two Windows functions:
Look at all these calls!
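If you'd rather count the calls than scroll through them, a small helper can do it. This is only a sketch: it assumes that each call appears in the relay log as a line of the form `<thread id>:Call <module>.<function>(<args>) ret=<address>`, and the function name `count_relay_calls` is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Counts how many times `func` is called in a chunk of relay log text.
 * Assumed line format: "<tid>:Call <module>.<function>(<args>) ret=<addr>".
 * Lines that do not contain ":Call " (for example return lines) are ignored. */
size_t count_relay_calls(const char *log, const char *func)
{
    static const char marker[] = ":Call ";
    size_t func_len = strlen(func);
    size_t count = 0;
    const char *p = log;

    while ((p = strstr(p, marker)) != NULL) {
        p += sizeof marker - 1;             /* now at "<module>.<function>(" */
        size_t sym_len = strcspn(p, "(\n"); /* symbol ends at '(' or end of line */
        if (p[sym_len] == '(' && sym_len > func_len &&
            p[sym_len - func_len - 1] == '.' &&
            strncmp(p + sym_len - func_len, func, func_len) == 0)
            count++;
        p += sym_len;
    }
    return count;
}
```

For a log of several hundred megabytes you would not load the whole file into memory; calling the function on one line at a time (for example from an `fgets` loop) and summing the results works the same way.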
I can see why a game could need a high resolution timestamp, so QueryPerformanceCounter is not my main concern, but GetThreadTimes? Why would a game need to know how much time it spent in kernel mode? To me, this screams "leftover debug code": I think there is some code that the developers used to profile the game and identify bottlenecks, and they forgot to disable it in the final build, making the CPU bottleneck worse.
Before diving deeper into the investigation, I decided to make a "proof of concept" program to see if one of these two calls was the cause of the slowdown. The code below makes 100,000 calls to GetThreadTimes and measures how long they take using QueryPerformanceCounter:
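#include <stdio.h>
#include <stdlib.h>   /* system() */
#include <windows.h>  /* GetThreadTimes, QueryPerformanceCounter */

int main(void) {
    LARGE_INTEGER t1, t2;
    HANDLE hThread = GetCurrentThread();
    FILETIME lpCreationTime, lpExitTime, lpKernelTime, lpUserTime;
    BOOL bResult;

    printf("Running...\n");
    fflush(stdout);

    /* Time 100000 back-to-back calls to GetThreadTimes */
    QueryPerformanceCounter(&t1);
    for (int i = 0; i < 100000; i++) {
        bResult = GetThreadTimes(hThread, &lpCreationTime, &lpExitTime, &lpKernelTime, &lpUserTime);
    }
    QueryPerformanceCounter(&t2);
    (void)bResult; /* the result is not needed, only the time spent */

    /* QuadPart is a 64-bit integer, so %d is wrong here; %I64d is the Microsoft format for it */
    printf("Ticks: %I64d\n", t2.QuadPart - t1.QuadPart);
    fflush(stdout);
    system("pause");
    return 0;
}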
You can compile this code on Linux using MinGW with this command:
i686-w64-mingw32-gcc -static -o test.exe test.c
This will generate a test.exe file that you can use to compare Windows to Wine. The number of ticks represents the time it took to execute the 100,000 calls.
Wine took 8,326,350 ticks while Windows 10 took only 139,938 ticks, meaning that this system call is approximately 60 times slower on Wine!
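The arithmetic behind that comparison can be written down as a short sketch. The function names are made up for this example. Turning ticks into time needs the timer frequency (what QueryPerformanceFrequency reports), which the test program does not print, so the frequency is left as a parameter here. Comparing raw ticks between two systems also assumes that both report the same frequency.

```c
/* How many times slower the "slow" measurement is than the "fast" one. */
double slowdown_factor(long long slow_ticks, long long fast_ticks)
{
    return (double)slow_ticks / (double)fast_ticks;
}

/* Average cost of a single call in microseconds.
 * ticks_per_second is the timer frequency in Hz; it is an input here
 * because the measurements above do not include it. */
double microseconds_per_call(long long ticks, long long calls,
                             long long ticks_per_second)
{
    return (double)ticks * 1e6 / ((double)calls * (double)ticks_per_second);
}
```

With the numbers above, `slowdown_factor(8326350, 139938)` comes out at about 59.5, which is where the "approximately 60 times" figure comes from.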
I'm not sure what makes GetThreadTimes so slow in Wine, but looking at the source code, I'm willing to bet that the problem is here: GetThreadTimes calls NtQueryInformationThread, and it looks like this function needs to communicate with the Wine server process to fetch the information... through a socket!
Now that we know what's slowing down the game, let's see if we can fix it!
To see what the game does with this GetThreadTimes function, I opened the game's exe in Ghidra, a disassembler similar to IDA Pro, except it's open source and multiplatform and oh my god did I just install software made by the fucking NSA? Oh well, it's too late now.
The game seems to be calling GetThreadTimes in 3 locations:
Let's see them one by one.
The first call is in a function starting at 0x402cd0 and all it seems to do is call GetThreadTimes and QueryPerformanceCounter and update some variables. The function takes no inputs, it has no outputs, and calls to this function are scattered in dozens of locations throughout the game code.
To me, this looks like something used for profiling the execution times of some parts of the game code, so I decided to simply replace the whole function with a return and hope for the best.
The second call is in a function starting at 0x402d40 and is very similar to the previous one. Calls to this function are also scattered throughout the game code so it probably has the same purpose.
I once again decided to replace the whole function with a simple return and hope for the best.
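Replacing a function with a return is a one-byte change. The sketch below assumes 32-bit x86 code and a function that takes no stack arguments (as is the case for the two functions above), so a plain `ret` (opcode 0xC3) as the first instruction is enough. It works on a plain byte buffer, and `stub_out_function` is a name made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define X86_RET 0xC3 /* near return, no stack cleanup */

/* Overwrites the first byte of the function located at `offset` inside
 * `code` with a `ret`, so that calling it returns immediately.
 * `code` stands for a copy of the code section and `offset` for the
 * position of the function inside it; both are hypothetical inputs.
 * Returns 0 on success, -1 if the offset is out of range. */
int stub_out_function(uint8_t *code, size_t code_len, size_t offset)
{
    if (code == NULL || offset >= code_len)
        return -1;
    code[offset] = X86_RET;
    return 0;
}
```

The rest of the original function body stays in place but is never reached.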
The third function, starting at 0x9b0b80, is more complex and our offending code is right in the middle of it, but it only gets called once at the end of the loading process so it probably initializes something.
I decided to just jump over it and see if it worked.
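Jumping over a block of code means overwriting its start with a relative jump to the first instruction after it. Here is a minimal sketch of how such a jump can be encoded on 32-bit x86; `encode_jmp` is a made-up name and the addresses passed to it are whatever you picked in the disassembler. The displacement is always relative to the address of the instruction that follows the jump.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes a relative x86 jump located at address `from` that lands on
 * address `to`. Uses the 2-byte short form (EB rel8) when the target is
 * close enough, otherwise the 5-byte near form (E9 rel32, little endian).
 * Returns the number of bytes written to `out`. */
size_t encode_jmp(uint8_t out[5], uint32_t from, uint32_t to)
{
    /* Short jump: displacement counted from the end of the 2-byte instruction */
    int64_t rel8 = (int64_t)to - ((int64_t)from + 2);
    if (rel8 >= -128 && rel8 <= 127) {
        out[0] = 0xEB;
        out[1] = (uint8_t)(int8_t)rel8;
        return 2;
    }

    /* Near jump: displacement counted from the end of the 5-byte instruction;
     * unsigned arithmetic wraps modulo 2^32, like the address space does */
    uint32_t rel32 = to - (from + 5);
    out[0] = 0xE9;
    out[1] = (uint8_t)(rel32 & 0xFF);
    out[2] = (uint8_t)((rel32 >> 8) & 0xFF);
    out[3] = (uint8_t)((rel32 >> 16) & 0xFF);
    out[4] = (uint8_t)((rel32 >> 24) & 0xFF);
    return 5;
}
```

The bytes between the jump and its target can be left as they are, since they are never executed.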
After these changes, I launched the game expecting an instant crash, but to my surprise it actually worked, and it started very quickly even in Wine, with no crashes, and I was able to get into the game! Looks like my skills are improving :)
I played different skirmish modes, tried some replays, and played through part of the campaign without a single problem, so I was probably right in assuming that this was leftover debug code, not essential for the game to work. I will update the article when I can confirm that the entire game can be beaten without issues.
I played through the entire game, tried some skirmishes, and played replays, all without a single issue, so I can confirm that this fix is stable.
I did not test the multiplayer since it uses the now defunct GameSpy, so I have no idea if that would work with an emulator. If you try it, please let me know.
This game is severely CPU bottlenecked: it only uses one core and it needs high clocks and high IPC. Even though this modern CPU is way faster than anything that was available in 2006, the game can still bring it to its knees in some scenarios.
Hardware configuration:
The first test I did was a comparison of the loading times between the patched version and the unpatched version and between Windows and Wine. In this test I measured the time it took to load the first tutorial level averaged over 5 runs.
On Windows, loading times were reduced by about 36%, going from an average of 9.9 seconds down to 6.3. On Wine, the situation improved dramatically, going from an insane 177.5 seconds (almost 3 minutes) down to only 5.3 seconds, which is about 33 times faster. We can also see that with the patch, Wine was faster than Windows, although at these speeds it hardly matters.
For the next test, I played a quick skirmish (~5 minutes) with a large number of rats to stress the CPU bottleneck and played the replay on both Windows and Wine, with and without the patch, and compared the results. As a reminder, a frame time chart shows the time between frames being presented to the display (so it's the inverse of framerate). Lower frame times are better, but consistency matters most: variations in frame times are perceived by the user as stuttering or lag spikes.
On Windows, the in-game performance difference on this CPU is minimal, but a reduction in stuttering is visible in the frame time chart.
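To make the relationship between frame times and framerate concrete, here is a small sketch. The function names are made up for this example, and the frame times are expected in milliseconds. It is not the tool used to produce the charts.

```c
#include <stddef.h>

/* Average framerate over a run: frames rendered divided by seconds elapsed.
 * frame_ms holds the time between consecutive frames in milliseconds. */
double average_fps(const double *frame_ms, size_t n)
{
    double total_ms = 0.0;
    for (size_t i = 0; i < n; i++)
        total_ms += frame_ms[i];
    return total_ms > 0.0 ? 1000.0 * (double)n / total_ms : 0.0;
}

/* Number of frames that took longer than threshold_ms. The threshold is
 * up to whoever reads the chart; it is just a parameter here. */
size_t count_spikes(const double *frame_ms, size_t n, double threshold_ms)
{
    size_t spikes = 0;
    for (size_t i = 0; i < n; i++)
        if (frame_ms[i] > threshold_ms)
            spikes++;
    return spikes;
}
```

This also shows why the average alone is not enough: a run of 10 ms frames with an occasional 300 ms frame can still have a decent average framerate while feeling terrible, which is what the spike count captures.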
On Wine, the situation is completely different: without the patch, the game is pretty much unplayable, with severe stuttering, lag spikes, and a very unstable framerate. The situation is dramatically improved with the patch, where the game is basically perfect except for one small spike.
In this last chart you can see that the game now actually runs significantly better in Wine compared to Windows, with fewer and smaller spikes, and a generally more stable framerate. Well done, Linux!
Let's intensify the CPU bottleneck in the next tests.
Hardware configuration:
Loading times on this system were not as bad as I expected, but Windows 7 was (unsurprisingly) slightly faster.
Moving on to frame times, the situation is much more interesting.
Frame times on the E8600 were pretty bad due to the CPU bottleneck, but with the patch we see an improvement of 10-15% when the game is CPU-bound (the middle part of the chart). Stuttering was also greatly reduced, with just a few spikes over 100ms, compared to the unpatched version that had spikes over 300ms.
I did not test Wine on this system since it wouldn't make much sense.
Distributing my fix turned out to be slightly more complicated than I anticipated. Obviously I can't just upload the modified exe because it's copyrighted, so at first I decided to go for the ASI mod approach like I did with my Mass Effect 2 fix a couple years ago... except that the modding scene for this game is beyond dead and there isn't even an ASI loader for it!
I chose to go for the fake Bink DLL approach. I used WarrantyVoider's Proxy DLL Maker to create a proxy DLL for binkw32.dll, the library that the game uses to play videos and that is loaded with the executable: the proxy DLL loads the real Bink DLL (that's been renamed to binkw23.dll), passes all function calls to the real DLL, and while it does this, my code can run in its own thread and patch the game in memory.
Patching the game is simply a matter of finding the right patterns in the code segment in memory and overwriting them with the new code. It's nothing complicated, and you can simply look at the code if you want to see how I did it.
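If you don't feel like opening the source, the general idea of pattern-based patching looks roughly like this. This is a generic sketch and not the code from the fix: the function names, the `'x'`/`'?'` mask convention for wildcard bytes, and the use of a plain byte buffer instead of the game's memory are all choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset of the first match of `pattern` inside `buf`, or -1.
 * `mask` has one character per pattern byte: 'x' means the byte must match,
 * '?' means any byte is accepted (useful for addresses that may differ). */
ptrdiff_t find_pattern(const uint8_t *buf, size_t buf_len,
                       const uint8_t *pattern, const char *mask)
{
    size_t pat_len = strlen(mask);
    if (pat_len == 0 || pat_len > buf_len)
        return -1;
    for (size_t i = 0; i + pat_len <= buf_len; i++) {
        size_t j = 0;
        while (j < pat_len && (mask[j] == '?' || buf[i + j] == pattern[j]))
            j++;
        if (j == pat_len)
            return (ptrdiff_t)i;
    }
    return -1;
}

/* Finds the pattern and overwrites the start of the match with `patch`.
 * Returns 0 on success, -1 if the pattern is missing or the patch
 * would not fit inside the buffer. */
int patch_pattern(uint8_t *buf, size_t buf_len,
                  const uint8_t *pattern, const char *mask,
                  const uint8_t *patch, size_t patch_len)
{
    ptrdiff_t off = find_pattern(buf, buf_len, pattern, mask);
    if (off < 0 || (size_t)off + patch_len > buf_len)
        return -1;
    memcpy(buf + off, patch, patch_len);
    return 0;
}
```

Searching for a pattern instead of hardcoding an address lets the same patch work on builds where the code ended up at a slightly different location. When patching a running process on Windows, the code pages normally have to be made writable first (for example with VirtualProtect) before the overwrite.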
As an added bonus, I added a couple of features to the patch:
| Version | Source | Compatibility |
|---|---|---|
| 2.14 | GOG | Yes |
| 2.14 (EU) | Retail | Yes [1] |
| 2.14 (US) | Retail | Probably [1] |
| 2.14 (RU) | Retail | Probably [1] |
| <= 1.74 | Retail | No |

[1] The game's SafeDisc DRM is incompatible with modern systems, so a crack is required.
You can get the source code from here or from the GitHub repo. You'll need Visual Studio 2019 to build it.
The fix is distributed under the GNU GPL v3 license.
Needless to say, I am not affiliated with Deep Silver, Bandai Namco, Black Hole Entertainment or Games Workshop, and this is not an official fix.
This patch made the game playable for Linux users and improved the experience for Windows users as well. This is a big win for me because I feel like I suck at reverse engineering and the fact that I was able to investigate and fix this problem all by myself in a couple of days when nobody had succeeded before really encouraged me to continue exploring this path.
It's worth noting that even though the CPU bottleneck has been reduced with this patch, the game is still heavily bottlenecked by its rendering code: it issues a ton of individual draw calls, everything happens on a single thread, and there's nothing I can do about it.
Want to try to make an ASI mod for this game? Here's a list of ideas that I came up with while testing the game: