Thursday, May 19, 2011

Xenoborg: What is it and how does it work?

Sorry I haven't been updating this blog like I should have. I'm under quite a bit of stress atm and there are some things bothering me IRL too. But enough about that, none of you came here to read my boring stories of stress. :-(

Honestly, I don't believe I've fully described what Xenoborg is and what my intentions are for this emulator. If I haven't (or if you don't know), allow me to explain now. Xenoborg is designed to be a Microsoft Xbox1 and Sega Chihiro emulator for Windows. Right now, I'm using a static-rec/direct code execution approach. I was originally going to use the conventional methods of an interpreter and/or dynarec like most other emulators, but honestly, it's just way too time consuming to maintain such a large CPU core with probably 100s and maybe close to 1,000 instructions. Fortunately for us, static-rec is the fastest method of emulation and has the least overhead and the fewest performance penalties (in my opinion), and it seems that it's a really good choice for emulating Xbox.

So, how does Xenoborg work? Well, the concept is simple enough for anyone to understand, but the implementation is far more complex than it sounds. In its current state, what Xenoborg does is take any given Xbox/Sega Chihiro .xbe, load the contents at the .xbe's base address within the PC's 4GB address space (which is 0x10000), and execute it within a special exception handler designed to do a bit of MMU hacking to emulate the hardware. Sounds simple, right? Well, it took a bit of work (lots of trial and error) but I did manage to get it right.
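To make the "load it at its base address" step a bit more concrete, here's a minimal sketch in C of checking an .xbe image before loading it. This is not Xenoborg's actual code. The "XBEH" magic and the 0x104 offset of the base-address field are assumptions taken from commonly circulated descriptions of the .xbe format, not from this post, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XBE_EXPECTED_BASE 0x00010000u  /* base address named in the post */
#define XBE_OFF_BASE_ADDR 0x104u       /* ASSUMED offset of the base-address field */

/* Read a little-endian 32-bit value without relying on host alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the base address if `image` looks like an .xbe,
 * otherwise returns 0 and leaves *base_out untouched. */
int xbe_read_base(const uint8_t *image, size_t size, uint32_t *base_out)
{
    assert(image != NULL && base_out != NULL);
    if (size < XBE_OFF_BASE_ADDR + 4u) return 0;      /* too short to hold the field */
    if (memcmp(image, "XBEH", 4) != 0) return 0;      /* ASSUMED magic value */
    *base_out = read_le32(image + XBE_OFF_BASE_ADDR);
    return 1;
}

/* Direct execution only works if the image wants the base we reserved. */
int xbe_base_is_expected(uint32_t base)
{
    return base == XBE_EXPECTED_BASE;
}
```

A loader along these lines would refuse anything that fails either check, since direct execution depends on the image landing exactly where it expects to be.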

Pulling this off wasn't very easy to do, and I'd be lying if I said that I didn't have any sort of help. Let's take a look at Xeon's execution model, shall we? My loading process was derived from that emulator (Dxbx's was too). _SF_ (author of Xeon) stated that in order to get his .xbe files to run, he had to load the .xbe directly at 0x10000 in the PC's address space. Other methods just didn't work: allocating a 64MB buffer, writing the contents to it and then executing it wasn't going to work, because .xbe files tend to use absolute addresses for jumps/calls and for reading memory addresses within code/data sections. So loading at that base address was a must. The question is, how do we do that? It's a bit tricky, but definitely doable. Before doing so, we have to consider a few things. Xbox needs the first 64MB of address space (which is RAM), so of course, we need to emulate that using the same address range. Although it appears impossible to emulate the full 64MB using my method (which I'll explain in a minute), as long as we can throw that .xbe at the right place in emulated RAM, we're okay. So what we need to do is reserve the first 64MB of address space.
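Here's a small sketch in C of why the "just allocate a buffer" approach breaks. It only models the arithmetic: given where the image was actually loaded, it works out whether an absolute address baked into the code still lands inside the image. The function name and the example addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define XBE_BASE 0x00010000u  /* base address named in the post */

/* Returns the byte offset into the loaded image that `absolute_addr`
 * hits when the image sits at `load_base`, or -1 if the address falls
 * outside the image entirely. */
int64_t absolute_ref_offset(uint32_t load_base, uint32_t image_size,
                            uint32_t absolute_addr)
{
    assert(image_size > 0);
    if (absolute_addr < load_base) return -1;
    uint32_t off = absolute_addr - load_base;
    return off < image_size ? (int64_t)off : -1;
}
```

With the image at XBE_BASE, a hypothetical call to 0x00011234 resolves to offset 0x1234 inside the image. With the same image copied into a buffer at, say, 0x02000000, that call still targets 0x00011234, which is no longer part of the image at all.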

"So, how do we do that?" Alright, this will require a bit more explanation. There's no way we can just build a .exe in Visual Studio, load it up, get a pointer to 0x10000 and start writing stuff there. You just can't do it. What you're going to need is a .exe with a fixed base address, so that Windows will actually allow us to get access to that address, and with a static global array of 64MB (512MB for Chihiro) to reserve the remaining megabytes of address space after that. The .exe itself is nothing but a .xbe "launcher", and all of the emulator code is placed in a .dll file that is loaded higher up in the address space (typically at 0x10xxxxxx) and takes control once the .exe is loaded. Once the .exe loads, it calls a NULL function that never returns, and then the emulator code in the .dll can overwrite the contents of that address. But before doing so, you have to consider a few things. For starters, the .exe header (or at least part of it) needs to stay intact or else the .exe will not work. I'm not sure what Xeon does, but Dxbx keeps most of the .xbe header in emulated RAM and only keeps the required values from the .exe header so it continues to run, swapping the values with the .xbe values as needed. Xenoborg does not modify the .exe header (which is a constant size of 0x1000 for both .exe and .xbe btw), but instead keeps the .xbe header stored behind a special pointer that is read from/written to instead. This makes sure that the emu will continue to function properly. One more detail to keep in mind is that we can't reserve the first 64KB of address space. For this we will also need to create a separate pointer (if necessary) of the same size, emulated in the same manner as the header.
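The "special pointer" idea for the header and the first 64KB can be sketched roughly like this in C. It's only an illustration under assumptions: the buffer names and the routing function are hypothetical, and the real emulator may organize this quite differently. The sizes come from the post (0x1000 header, 64KB unreservable low range, 64MB of Xbox RAM).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_RESERVED_SIZE 0x00010000u            /* first 64KB: can't be reserved */
#define IMAGE_BASE        0x00010000u            /* fixed base shared by .exe and .xbe */
#define HEADER_SIZE       0x00001000u            /* header size given in the post */
#define XBOX_RAM_SIZE     (64u * 1024u * 1024u)  /* 64MB (Chihiro would be larger) */

static uint8_t low_shadow[LOW_RESERVED_SIZE];  /* stands in for 0x0000..0xFFFF */
static uint8_t header_shadow[HEADER_SIZE];     /* holds the .xbe header */

/* Returns a pointer into a shadow buffer when the guest address must be
 * redirected, or NULL when it needs no redirection. *in_ram reports
 * whether the address lies inside emulated RAM at all. */
uint8_t *redirect_guest_address(uint32_t addr, int *in_ram)
{
    assert(in_ram != NULL);
    *in_ram = addr < XBOX_RAM_SIZE;
    if (addr < LOW_RESERVED_SIZE)
        return &low_shadow[addr];
    if (addr - IMAGE_BASE < HEADER_SIZE)          /* addr >= IMAGE_BASE here */
        return &header_shadow[addr - IMAGE_BASE];
    return NULL;                                  /* access in place, or not RAM */
}
```

The point of the design is that the real .exe header at 0x10000 never gets touched: anything the guest reads or writes in that page goes to the shadow copy instead.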

Okay, now that we have the address range reserved, we can just zero out the available address range given the restrictions above, right? Wrong! First we need to set the protection on that range before we do anything to it. Just make a simple call to VirtualProtect(), and you're good! Then we can start writing stuff to that address space and start emulating stuff! Some of this stuff I learned from Patrick's work on Dxbx (he also helped me get this working), so kudos to him and good luck with life outside of the scene! But before we get down to business: I found out that after writing the .xbe contents to the base address, the emulator would freeze upon exiting. This had me scratching my head for a moment, then I figured it was because I needed to restore the .exe contents. So, I backed up the .exe contents beforehand, and then restored them when it was time to exit. Then the problem went away! And BAM, now we've got an emu (almost).
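The unprotect, back up, overwrite, restore sequence can be sketched in C as below. This is an illustration rather than Xenoborg's code: the struct and function names are hypothetical, it runs on any plain buffer, and it ignores the header and first-64KB restrictions described earlier. On Windows it calls VirtualProtect() with PAGE_EXECUTE_READWRITE; elsewhere that step is a no-op so the rest can be tried on an ordinary array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

typedef struct {
    void    *addr;   /* start of the region that was overwritten */
    size_t   size;   /* size of the region in bytes */
    uint8_t *saved;  /* heap copy of the original contents */
} RegionBackup;

/* Make the region readable, writable and executable. */
static int make_region_writable(void *addr, size_t size)
{
#ifdef _WIN32
    DWORD old_protect;
    return VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE, &old_protect) != 0;
#else
    (void)addr;
    (void)size;
    return 1;  /* nothing to do for an ordinary buffer */
#endif
}

/* Save the region, zero it, then copy the .xbe contents over it.
 * Returns 1 on success, 0 on failure (region left untouched). */
int load_over_region(void *region, size_t region_size,
                     const void *xbe, size_t xbe_size, RegionBackup *backup)
{
    assert(region != NULL && xbe != NULL && backup != NULL);
    if (xbe_size > region_size) return 0;
    if (!make_region_writable(region, region_size)) return 0;

    backup->saved = malloc(region_size);
    if (backup->saved == NULL) return 0;
    memcpy(backup->saved, region, region_size);
    backup->addr = region;
    backup->size = region_size;

    memset(region, 0, region_size);
    memcpy(region, xbe, xbe_size);
    return 1;
}

/* Put the original contents back (what the post does before exiting). */
void restore_region(RegionBackup *backup)
{
    assert(backup != NULL && backup->saved != NULL);
    memcpy(backup->addr, backup->saved, backup->size);
    free(backup->saved);
    backup->saved = NULL;
}
```

Calling restore_region() right before shutdown corresponds to the fix for the freeze-on-exit problem: the process gets its original .exe bytes back before Windows tears it down.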

"How does the actual emulation work?" After taking the above precautions and procedures, now we need to start executing the code. That's not hard, really. All I needed to do was point to the address of the entry point, and call it! "Wait, it can't be that easy, can it?" Of course not. We have to do that within an exception handler, of course. When we catch an illegal instruction, we find out exactly what it was trying to read/write to, and then emulate that particular bit of hardware/memory.

So far, this implementation appears to work rather well for me, but I'm quite sure that there may be better ways. If there are, feel free to share.

You might be wondering if this emulator will support 64-bit OSes. I can't guarantee it, but it should. The emulator builds will be 32-bit for now, but I don't know yet if I can create a 64-bit .exe with the same base address (I don't have the resources to try it). Even if it were possible, it would take more work to get the emulator working, since 64-bit mode on x86-64 dropped support for a number of legacy instructions (as far as I can tell MMX itself is still available there, so I'd want to double-check exactly which ones). Those I'll have to emulate as they are executed. Fortunately, Xbox rarely uses any 16-bit operations and I haven't seen much MMX usage over SSE. Well, Xbox does start executing 16-bit code during boot time and then switches to 32-bit protected mode later on (according to drkIIRaziel), but I haven't gotten as far as emulating the BIOS yet. So we'll see what arises from running in 64-bit mode later on. Xeon works just fine in 64-bit mode using similar tactics, so chances are it will be okay.

Next blog update will be about my threading model: why it's necessary, and how I hope to improve it (it's poorly designed atm, but it works). Since I suck at threading, feel free to throw suggestions at me because this emu requires 3 threads just to run.

Thanks for reading!

Shogun.