Monster Hunter 2 Reverse Engineering

Posted by Kresna

2023-05-05

Hello, Kresna here. I’m the hacker and programmer for 「BREAK ARTS」 responsible for all the reverse engineering, tooling, and technical work for the Monster Hunter 2 Dos English patch. If you’re interested in knowing how this project was made, buckle up! You’re on the right page.

The Beginnings of the Project

The Monster Hunter 2 Dos project began when Nara sent me a message while I was at work one day. The text showed an in-game message modified to say “HiKres” with the caption “but what if we simply did a terrible translation for stuff”. We sent messages back and forth until I came home. The moment I got home and sat in my chair, I immediately began reverse engineering the game.

I had known Nara for some time by this point, and I knew she was the kind of girl that fully commits to something once she’s set her mind to it. I was confident we’d see it through to the end. To be honest, I don’t think she realised it was for real at the time we started. (Edit: She told me she knew it was for real from the get go because “being serious about whims is serious business!”)

This spur-of-the-moment event is what led to the project you see today!

The Tools Used

Reverse engineering a game, extracting and reinserting the script, decompressing and compressing assets, assembling and injecting code into the game, all of these things and more require specialised tooling. In some cases, the tools are general and can be applied to most projects, while many others have to be written from scratch. I used the following tools in the creation of this patch:

abcde

abcde is a script extraction/insertion tool that implements all the features of the legendary Atlas and Cartographer tools, while extending their functionality. Written in Perl, it’s easily extensible. I even created a few small additions to the program to help us safely inject scripts within bounds. I have used Atlas and Cartographer in the past, but abcde contains significant improvements that allowed us to make space-saving changes easily to our script.

To make it brief, abcde allows us to dump scripts from the game’s files using pointer tables, or extract raw text if no pointer table exists. These scripts are then presented to us as a text file, which we can edit using our preferred text editor. When inserting a translated script, abcde modifies all the pointers to point to the text’s new location. It’s maybe the most painless way you could work on a project like this.
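To give a feel for what "inserting and updating the pointers" means, here is a minimal Python sketch. It is not abcde's code (abcde is written in Perl and does far more). The function name `insert_strings`, the one-byte `0x00` terminator, and the 32-bit little-endian pointer format are all assumptions made for illustration.

```python
import struct

def insert_strings(strings, text_start, header=0, terminator=b"\x00"):
    """Pack strings back to back and build a matching pointer table.

    strings:    list of already-encoded byte strings (hypothetical input)
    text_start: file offset where the first string will be written
    header:     value subtracted from a file offset to get the stored pointer
    terminator: assumed end-of-string byte; the real value may differ
    Returns (text_blob, pointer_table_bytes).
    """
    blob = bytearray()
    pointers = []
    for s in strings:
        # Each pointer records where its string is about to land.
        pointers.append(text_start + len(blob) - header)
        blob += s + terminator
    table = b"".join(struct.pack("<I", p) for p in pointers)
    return bytes(blob), table

blob, table = insert_strings([b"Yes", b"No"], text_start=0x1000, header=0x0F00)
print(blob)
print([hex(p) for p in struct.unpack("<2I", table)])
```

Because every pointer is recomputed from the running length of the text, a translated string can be longer or shorter than the original without any manual bookkeeping.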

naken_asm

naken_asm is a multi-architecture assembler. I selected this assembler because it natively supports the PlayStation 2’s MIPS R5900 Emotion Engine and uses Intel-style syntax. Any assembler supporting the MIPS R5900 would work just as well.

PCSX2 1.6

PCSX2 is the only usable PS2 emulator, and 1.6 is the version with the most useful debugger. Although PCSX2’s debugging tools are primitive, having a debugger is essential to any reverse engineering project. A lot of the very complex changes were only made possible because of live debugging with PCSX2.


Aside from the above, I also wrote many bespoke tools just for this project. Many (if not all) of these tools won’t be used for anything else. They were written to solve problems unique to this project. I don’t have these tools available for download anywhere (really, the technical implementation of them isn’t pretty, but they worked well enough and got the job done). They are:

Arazlam

Arazlam usage guide

Arazlam was our all-purpose script checking tool. It verified that our scripts terminated correctly with control codes, identified if a text block had a terminating control code mid block, ensured our scripts stayed within text box bounds, attempted to check that control codes for button icons and colors were correctly formatted, ensured that NPC text blocks didn’t extend beyond the text box, among many other things. Arazlam could also spellcheck any script file and it featured a custom Monster Hunter 2 dictionary file used to spellcheck in-game terms.
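As a rough illustration of the kind of checks described above, here is a small Python sketch. It is not Arazlam's source. The function name `check_block`, the `max_chars` limit, the use of a newline as the line break, and the rule that anything in square brackets is a zero-width control code are all assumptions.

```python
import re

END = "[END]"
CONTROL_CODE = re.compile(r"\[[^\]]*\]")  # assumed: any [...] is a control code

def check_block(text, max_chars=24):
    """Return a list of problems found in one text block (empty if clean)."""
    problems = []
    if text.endswith(END):
        body = text[: -len(END)]
    else:
        body = text
        problems.append("missing terminating [END]")
    if END in body:
        problems.append("[END] appears mid block")
    for number, line in enumerate(body.split("\n"), start=1):
        visible = CONTROL_CODE.sub("", line)  # control codes take no width
        if len(visible) > max_chars:
            problems.append(f"line {number}: {len(visible)} chars exceeds {max_chars}")
    return problems

print(check_block("Checking memory card slots.[END]", max_chars=30))
print(check_block("Yes[END]No[END]"))
```

Returning a list of messages rather than stopping at the first problem means a whole script file can be scanned in one pass and every issue reported together.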

Example of Arazlam catching script errors

The patch would not be anywhere near as polished without this tool. If there was something that logic could check, I programmed it into Arazlam. It saved us hundreds of hours of technical bug tracking and manual line inspection. The time saved cannot be overstated. It may have been our most indispensable tool.

Fandango and Czarine

Fandango and Czarine usage guide

Fandango and Czarine were the decompression and compression tools I wrote to extract data from the files of Monster Hunter 2 Dos. I created these tools by reverse engineering the decompression algorithm used in the game (more on this later). Why not wrap the functionality into one tool instead of writing two? Because it’s cooler this way! Since I was the only one using them, I got to name things whatever I wanted, and write the functionality the way I wanted!

The Holy Win

The Holy Win usage guide

The Holy Win was a tool I wrote to split apart decompressed NPC blocks, modify all the pointers, and reassemble the blocks. It also automatically generated abcde script tables and a makefile for every NPC, allowing the entire process of extracting and reinserting the NPC scripts to be fully automated, without human error.

From the ground up, it extracted the text blocks, split apart the NPC blocks, generated automated script extraction makefiles and abcde tables, and was able to reverse the process to inject NPC text back into the game. It also had some primitive binary injection modes which I used now and again.

Icous

Icous usage guide

Icous was a tool used to extract packed game files. Some files, like textures, are packaged together into one large binary, and this tool split those files apart and reassembled them once the textures had been modified.

The VKP

The VKP usage guide

The VKP was a tool used to inject text into quest .mib files. Monster Hunter 2 features hundreds of quests, but thousands of quest files. This tool extended each .mib quest file, scraped our quest scripts for matching translations, and injected the scripts into each quest file, while rewriting all the pointers. It was an essential tool to localise quests in the game.

Reverse Engineering Data Compression

Before tackling a project like this I first check to see what sort of complexity we’re up against. It’s important to get an idea of the problems we’ll encounter. This preliminary research is important so that we don’t waste our time on something beyond our abilities. Really, some groundwork has to be done first to prove a project is possible before announcing it to the public.

The first problem I encountered during my Dos research was one of compression.

In the data files for Monster Hunter 2 Dos, resources like NPC text are compressed. Translating the game would require reverse engineering the compression format and writing tools to decompress and compress the data in the same format, so that’s exactly what I did.

I used memory dumps to search for where data was decompressed. Then, by setting breakpoints on those addresses using PCSX2’s debugger, I was able to find the routine that decompresses game assets into memory. The decompression routine in MIPS R5900 assembly is as follows:

Decompress_00336200:
	dmove	a3,a1
	dmove	t0,zero
	dmove	t1,zero
	nop

pos_00336210:
	bnez	t0,pos_00336228
	nop
	lh	t1,(a0)
	li	t0,0x8000
	addiu	a0,0x2
	nop

pos_00336228:
	and	v0,t1,t0
	beqz	v0,pos_003362C0
	nop
	lhu	v0,(a0)
	srl	a2,v0,0x0B
	beqz	a2,pos_00336250
	addiu	a0,0x2
	b	pos_00336258
	andi	v0,0x07FF
	nop

pos_00336250:
	lhu	a2,(a0)
	addiu	a0,0x2

pos_00336258:
	bnez	v0,pos_00336290
	nop
	beqz	a2,pos_003362D8
	nop

pos_00336268:
	sh	zero,(a1)
	addiu	a2,-0x1
	addiu	a1,0x2
	nop
	nop
	bnez	a2,pos_00336268
	nop
	b	pos_003362D0
	nop
	nop

pos_00336290:
	sll	v0,v0,0x01
	subu	v1,a1,v0

pos_00336298:
	lhu	v0,(v1)
	addiu	a2,-0x1
	sh	v0,(a1)
	addiu	v1,0x2
	addiu	a1,0x2
	bnez	a2,pos_00336298
	nop
	b	pos_003362D0
	nop
	nop

pos_003362C0:
	lhu	v0,(a0)
	sh	v0,(a1)
	addiu	a0,0x2
	addiu	a1,0x2

pos_003362D0:
	b	pos_00336210
	sra	t0,0x01

pos_003362D8:
	jr	ra
	subu	v0,a1,a3

I then used a combination of stepping through the routine with a debugger as it decompressed data, and studying the disassembled routine, to learn how it worked. From there, I wrote Czarine and Fandango to compress and decompress any game data.

The game uses a form of LZSS compression with a 4kb sliding window. Each block of data has a one-word (16-bit) header followed by 16 entries. The header stores sixteen 1-bit flags, read from the most significant bit down, which say whether the next entry is a literal word or a window offset and length. If the length is 31 words or fewer, the left-most 5 bits of the word store the length and the right-most 11 bits store the offset (the routine separates them with srl 0x0B and andi 0x07FF). If the length is longer than 31, those 5 bits are left at zero and the length follows in a second word. Because the routine only deals with 16-bit word-size entries, it is able to slide 4kb of data with only 11 bits of offset. An offset of zero is a special case: it writes a run of zero words, and a zero offset with a zero length ends the stream.
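For readers who would rather follow the routine in a high-level language, here is a Python transcription of the disassembly above. It is my own reading of the listing, not the source of Fandango, and the function name `decompress` is made up. It follows the assembly instruction by instruction: a 16-bit flag word read from the top bit down, a token whose upper 5 bits (`srl 0x0B`) are the length and lower 11 bits (`andi 0x07FF`) are the offset in 16-bit words, an extra length word when those 5 bits are zero, zero-fill when the offset is zero, and end of stream when both are zero. The bounds check is an addition of mine; the original routine has none.

```python
import struct

def decompress(src):
    """Decompress one stream, mirroring the MIPS routine at 0x00336200."""
    out = bytearray()
    pos = 0      # read position in src (a0 in the listing)
    mask = 0     # current flag bit (t0)
    flags = 0    # current flag word (t1)
    while True:
        if mask == 0:
            # Every 16 entries, fetch a new flag word.
            (flags,) = struct.unpack_from("<H", src, pos)
            pos += 2
            mask = 0x8000
        if flags & mask:
            (token,) = struct.unpack_from("<H", src, pos)
            pos += 2
            length = token >> 11        # upper 5 bits
            offset = token & 0x07FF     # lower 11 bits, in 16-bit words
            if length == 0:
                # Long form: the length is stored in the next word.
                (length,) = struct.unpack_from("<H", src, pos)
                pos += 2
            if offset == 0:
                if length == 0:
                    return bytes(out)   # terminator
                out += b"\x00\x00" * length
            else:
                start = len(out) - offset * 2
                if start < 0:
                    raise ValueError("offset points before start of output")
                # Copy one word at a time so overlapping runs repeat correctly.
                for i in range(length):
                    out += out[start + 2 * i : start + 2 * i + 2]
        else:
            out += src[pos : pos + 2]   # literal word
            pos += 2
        mask >>= 1

# Hand-built stream: literal "AB", copy 3 words from 1 word back,
# 2 zero words, then the terminator.
sample = struct.pack("<H", 0x7000) + b"AB" + struct.pack("<4H", 0x1801, 0x1000, 0, 0)
print(decompress(sample))
```

The word-by-word copy matters: when the offset is smaller than the length, the source region overlaps the bytes being written, which is how a single token can repeat a short pattern many times.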

Once the tooling was written to extract the game’s data, I felt pretty confident that we could localise the game. Still, there were many other problems that needed solving, and this was just the first step in a series of many.

Script Extraction

A script extraction and insertion utility like abcde is one of the most important tools when working on a localisation project like this. abcde is able to work on incredibly complex scripts but, thankfully, we only needed to rely on its basic functionality for Monster Hunter 2 Dos.

There were a few ways in which I extracted text from the game. The first was by finding pointer tables that point to strings used in-game. The second was by using a raw extraction method, which edits strings in-place. Finding and editing pointer tables is the best approach, as it allowed us to adjust the pointer, giving us room to write strings of any length (assuming we had the space in the ROM). Sometimes a function had an embedded pointer, and it wasn’t always worth reverse engineering it. At times like that, we did in-place edits using abcde’s raw extraction method.

For 99.9% of the in-game text we used pointer tables. I found these pointer tables by making a series of memory dumps. Searching for the address of the string in the memory dump allowed me to search for a table that had a pointer to it. From there, the base pointer offset was calculated using the ROM offset and the memory offset. Everything except for NPC dialogue was extracted directly from the game’s DATA.BIN file without first unpacking it. The reason is that, though abcde can inject strings from one file to another, it’s much easier to work on one big chunk of data and reallocate space that way. We needed to reallocate a fair bit of data to make some new scripts work. NPC data files were first split from the DATA.BIN file and unpacked using the Holy Win before being edited and injected back into the game.
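The search described above can be sketched in a few lines of Python. This is an illustration only, not the tooling I actually used. The function names, the assumption that pointers are 32-bit little-endian and 4-byte aligned, and every address in the usage lines are made up for the example.

```python
import struct

def find_pointers(dump, dump_base, string_addr):
    """Return memory addresses in the dump that hold a pointer to string_addr.

    dump:        raw bytes of a memory dump
    dump_base:   memory address of the first byte of the dump
    string_addr: memory address where the string was found
    """
    needle = struct.pack("<I", string_addr)
    hits = []
    index = dump.find(needle)
    while index != -1:
        if index % 4 == 0:  # assumed: pointer tables are word-aligned
            hits.append(dump_base + index)
        index = dump.find(needle, index + 1)
    return hits

def base_offset(file_offset, memory_addr):
    """Difference between where data sits in the file and in memory."""
    return file_offset - memory_addr

# Tiny fake dump with one pointer to 0x00100020 stored at offset 8.
dump = bytes(8) + struct.pack("<I", 0x00100020) + bytes(4)
print([hex(a) for a in find_pointers(dump, 0x00100000, 0x00100020)])
print(hex(base_offset(0x00200020, 0x00100020)))
```

Once the base offset is known, adding it to any pointer in the table gives the position of that string in the file, which is the relationship the extraction script needs to be told about.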

By writing a Cartographer file for abcde we were able to extract the game’s scripts in a format that looked like this:

//GAME NAME:		Monster Hunter 2 Dos

// Define required TABLE variables and load the corresponding tables
#VAR(Table_0, TABLE)
#ADDTBL("./tables/sjis_mh2.tbl", Table_0)

//BLOCK #000 NAME:		Text_Block_05_001
#ACTIVETBL(Table_0) // Activate this block's starting TABLE

#JMP($179FEF00, $17EAF132) // Jump to insertion point
#HDR($179FEF00)

//POINTER #0 @ $17E66B78 - STRING #0 @ $17EADA48
#W32($17E66B78)
はい[END]
// current address: $17EADA4D

//POINTER #1 @ $17E66B7C - STRING #1 @ $17EADA50
#W32($17E66B7C)
いいえ[END]
// current address: $17EADA57

//POINTER #2 @ $17E66B80 - STRING #2 @ $17EADA60
#W32($17E66B80)
メモリーカード差込口1・2を[END]
// current address: $17EADA7D

//POINTER #3 @ $17E66B84 - STRING #3 @ $17EADA80
#W32($17E66B84)
チェックしています。[END]
// current address: $17EADA95

//POINTER #4 @ $17E66B88 - STRING #4 @ $17EADA98
#W32($17E66B88)
[END]
// current address: $17EADA99

//POINTER #5 @ $17E66B8C - STRING #5 @ $17EADAA0
#W32($17E66B8C)
電源を切ったり、メモリーカード(PS2)を[END]
// current address: $17EADAC6

//POINTER #6 @ $17E66B90 - STRING #6 @ $17EADAD0
#W32($17E66B90)
抜いたりしないでください。[END]
// current address: $17EADAEB

...And so on, and so forth, for thousands of lines

Now editing the script was as simple as translating the text, commenting out the Japanese string, and running abcde in Atlas mode on a clean DATA.BIN file. abcde handled all pointer adjustments and we could edit the scripts using any text editor. The workflow couldn’t be any easier!

Improving the Game’s Text Parser

One of my favourite things when playing a game is when text on the screen pauses momentarily for punctuation, like ,.!?. Dos will do this for some Japanese punctuation characters, but I wanted to extend the range to support the previous four English punctuation characters. This was one of the original goals of the project that both Nara and I wanted to see in the game.

Our enhanced text parser, pausing for all types of punctuation

To me, this sort of text printing gives characters a realistic speaking cadence even if it’s just text printed character-by-character on a screen. The game already partially did this, but I wanted it to support all the appropriate punctuation. Honestly, I feel it’s something most games should do. If you’re a developer and you’re reading this, consider adding cool text printing features to your game! It adds a lot of character!

Anyway, I was able to trace back to the routine that prints text one character at a time using the PCSX2 debugger. This was found by setting breakpoints for when text was accessed to be drawn to the screen. The relevant lines are as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
pos_003598F0:
	bne	a0,v0,pos_00359940
	li	a1,0x3
	sll	a0,v1,0x08
	li	v0,0x8142
	lbu	v1,(s0)
	addu	v1,a0
	andi	v1,0xFFFF
	beq	v1,v0,pos_00359920
	li	v0,0x8148
	bne	v1,v0,pos_00359928
	nop
	nop

pos_00359920:
	b	pos_00359968
	li	a1,0xF

pos_00359928:
	li	v0,0x8141
	bne	v1,v0,pos_00359968
	nop
	b	pos_00359968
	li	a1,0x8
	nop

pos_00359940:
	bnez	a0,pos_00359968
	li	v0,0x2E
	beq	v1,v0,pos_00359960
	nop
	li	v0,0x3F
	bne	v1,v0,pos_00359968
	nop
	nop

pos_00359960:
	li	a1,0xF
	nop

pos_00359968:
	sltu	at,s2,a1
	beqz	at,pos_003599D0

Lines 3 and 38 control the pause, in frames, between characters being drawn. Line 3 is the default pause when no extra pause is necessary. Line 38 controls the pause when punctuation is encountered.

In the patch, these times were shortened, as drawing the English script took much longer than drawing Kanji. Lines 27 to 35 control the pause for the half-width . and ? characters, so all that remained was improving the text printing speed and adjusting the routine to pause for the remaining English punctuation. I wrote the following to replace the relevant parts of the routine:

.ps2_ee

.include "playstation2/registers_ee.inc"

.entry_point main
.org 0x003598F0
main:
        bne     $a0, $v0, 0x0035993C
        li      $a1, 0x01

.ps2_ee

.include "playstation2/registers_ee.inc"

.entry_point main
.org 0x0035993C
main:
        bnez    $a0, 0x00359968
        li      $v0, 0x2E
        beq     $v1, $v0, load
        li      $v0, 0x2C
        beq     $v1, $v0, load
        li      $v0, 0x21
        beq     $v1, $v0, load
        li      $v0, 0x3F
        bne     $v1, $v0, 0x00359968
        nop
load:
        li      $a1, 0x8

Tada! Now we have fancy text printing! I hope you enjoy this feature of our patch!
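If MIPS branch delay slots make the listings hard to follow, this Python model shows the behaviour of the patched routine as I read it. It is a sketch, not something used in the patch. The function name `pause_frames` and the `double_byte` parameter are assumptions standing in for whatever the `a0` and `v0` registers really hold; the character codes and frame counts are taken from the listings.

```python
FULL_STOP_PAUSE = 0x0F  # frames, `li a1,0xF` in the original listing
SHORT_PAUSE = 0x08      # frames, `li a1,0x8`
DEFAULT_PAUSE = 0x01    # frames, patched down from 0x3

def pause_frames(code, double_byte):
    """Return how many frames to wait after drawing one character."""
    if double_byte:
        # Shift-JIS full-width path, unchanged by the patch.
        if code in (0x8142, 0x8148):      # ideographic full stop, full-width ?
            return FULL_STOP_PAUSE
        if code == 0x8141:                # ideographic comma
            return SHORT_PAUSE
        return DEFAULT_PAUSE
    # Half-width path, extended by the patch to cover . , ! ?
    if code in (0x2E, 0x2C, 0x21, 0x3F):
        return SHORT_PAUSE
    return DEFAULT_PAUSE

print([pause_frames(ord(c), False) for c in "Hi, hunter!"])
```

Every ordinary character now waits a single frame, while the four half-width punctuation marks wait eight, which is what produces the speaking rhythm.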

Modifying the In-game Keyboard

When players play without a keyboard, they’re able to bring up the software keyboard to enter chat messages online and offline. By default, this keyboard opens in Hiragana mode, and requires navigating a menu to select full-width Latin. From there, you need to select the half-width character button to allow typing messages in English. This needs to be done at least once even if you have a USB keyboard connected, otherwise the game will default to the Japanese IME. Fixing this issue was an important goal if we wanted to make the patch feel like a fully-localised experience.

The first step was translating the keyboard texture into English. Some software keyboard buttons, like quick chat and half/full width, are written with Kanji. Translating the software keyboard’s interface allows English speaking users to navigate it, but it still doesn’t feel like a native experience when it defaults to Japanese Hiragana. I then set out to fix the issue and make the keyboard default to Latin half-width every time you open and close it.

Monster Hunter 2 remembers which keyboard you last used when you close it. These values are stored in memory, and when the software keyboard is opened the last accessed keyboard appears. The method I used to find these values was primitive, but the results work. I created multiple memory dumps with different keyboards saved and compared differences in their memory. Doing this helped me narrow down which values save the keyboard state. Then, by setting breakpoints, I found the sections of each keyboard function that wrote to this address. I wasn’t able to test every possible keyboard state to see if it wrote values, so I instead searched a disassembly for instructions which explicitly wrote values to these addresses in memory, and patched them all out. Having now removed the parts of the code that overwrite the keyboard state, I set the values on game boot by writing a small assembly injection.
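The dump comparison can be sketched like this in Python. It is a simplified illustration rather than the exact process I followed: the function name `candidate_offsets`, the state labels, and the idea of taking more than one dump per state to filter out noise are assumptions. A pure Python loop like this is also slow on a full 32 MB dump, so treat it as a description of the idea.

```python
def candidate_offsets(dumps_by_state):
    """Find offsets that look like they store the saved state.

    dumps_by_state maps a state label to a list of dumps (bytes) taken
    while that state was saved. An offset is a candidate when its byte is
    the same in every dump of one state but different between states.
    """
    groups = list(dumps_by_state.values())
    size = min(len(dump) for group in groups for dump in group)
    hits = []
    for offset in range(size):
        values = []
        for group in groups:
            seen = {dump[offset] for dump in group}
            if len(seen) != 1:
                break  # value changes within one state, so it is noise
            values.append(seen.pop())
        else:
            if len(values) > 1 and len(set(values)) == len(values):
                hits.append(offset)
    return hits

dumps = {
    "hiragana": [bytes([0, 7, 0, 1]), bytes([0, 9, 0, 1])],
    "latin": [bytes([0, 3, 2, 1]), bytes([0, 3, 2, 1])],
}
print(candidate_offsets(dumps))
```

In the toy data, offset 1 changes between two dumps of the same state, so it is rejected, while offset 2 is stable within each state and differs between them.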

The newly localised keyboard, defaulting to half-width Latin

Now, no matter if you use the software keyboard or type using a physical keyboard, your input is localised! This still allows players to communicate using the Japanese IME if they so desire (to communicate with Japanese players online, for example). Maintaining cross-language functionality was important as players from all across the world will be able to play together on private servers!

Maintaining Server Compatibility

When we publicly announced the project, we were approached by the Monster Hunter Oldschool server developers. They offered us access to their test server to assist in the localisation of the game’s online components. This helped development significantly. The game’s online component would never be as polished without this kind gesture and testing.

Part way through the project, I woke up to a series of messages describing a bug that prevented the patch from working correctly with the server. The problem was in relation to a series of strings in-game. When a Quest is posted to the Tavern, Elder Hall, or Assembly, the player is met with a recruitment poster that asks how many players they would like to allow to join. This screen presents selectable options displayed as 1人, 2人, 3人, and 4人 (literally “One Person”, “Two People”, etc).

The Quest signup screen, showing the 4人 message when selecting the number of players

Naturally, we omitted the Kanji and simply localised it as 1, 2, 3, and 4 respectively. Unbeknownst to us, the server (both private and official) would send the Kanji string to the client, and the game would do a string comparison against the Japanese string. By translating 1/2/3/4人, the string comparison would fail, and the client would allow 5 players to join a quest, regardless of the game’s setting (5 players being an illegal number of players in any quest, let alone the game not respecting the player’s choice).

Because these strings are displayed to the player as well as checked by the client, leaving them untranslated felt sloppy and lazy, but translating them outright would break online compatibility. Determined to not leave a single line of Japanese untranslated, I set out to fix the problem while maintaining compatibility with the original Japanese game.

My solution to this problem was to create a duplicate pointer table for the strings, then rewrite the functions that print the text in assembly to point to the new pointer table while leaving the comparison functions untouched. The old strings could remain untranslated and the relevant checking routines would check against the original Kanji, while the new strings are used for printing. To the player, they just experience a smooth and complete localisation, without having to worry about what’s going on under the hood. Here’s an example of one of the pointers being rewritten:

.ps2_ee

.include "playstation2/registers_ee.inc"

.entry_point main
.org 0x71E214
main:
       lui      $v0, 0x0049
       lb       $v1, 0x2B50($at)
       lui      $a0, 0x0084
       addiu    $v0, $v0, -0x3100

This allows the game to be fully localised, while also not requiring a special server for only English speaking players. Players can still play with those using an original Japanese copy of the game, while the patch respects the way the server interacts with the client.
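The idea of splitting one table into two is easier to see outside of assembly. The following Python toy model is purely illustrative: the table and function names are made up, and the real game works with pointers to Shift-JIS strings rather than Python lists.

```python
# Left untouched so comparisons against the server's strings still match.
ORIGINAL_TABLE = ["1人", "2人", "3人", "4人"]
# New table, referenced only by the routines that print text.
DISPLAY_TABLE = ["1", "2", "3", "4"]

def match_player_count(server_string):
    """Return the 1-based option matching the server's string, or None."""
    for index, original in enumerate(ORIGINAL_TABLE):
        if server_string == original:
            return index + 1
    return None

def option_label(option):
    """Return the text shown to the player for a 1-based option."""
    return DISPLAY_TABLE[option - 1]

choice = match_player_count("3人")
print(choice, option_label(choice))
```

Had the original table itself been translated, `match_player_count("3人")` would find nothing, which mirrors the failed comparison that caused the bug.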

Fixing Forced Full-width Printing

On some screens, Monster Hunter 2 would convert all half-width characters to full-width using a special function. Usually, this occurs on screens that have tables, such as the rewards screen, but it’s also used in seemingly random places like the screen you’re presented with when capturing a monster to use in the Grand Tournament. This not only looked bad but also prevented us from showing the information we needed to on some screens, as text would overlap other text or overrun the screen boundaries.

Before and after the text printing patch was applied

Solving this issue was, thankfully, trivial. After spending some time in the debugger chasing down the relevant function, I erased the section that checks if the character is half-width, forcing the game to skip converting it to a full-width character entirely.
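To show what a forced full-width conversion does to English text, here is a small Python sketch. It uses Unicode code points as a stand-in, whereas the game works in Shift-JIS, so this is only a model of the effect and not the game's function. The names `to_fullwidth` and `render`, and the `force_fullwidth` switch, are assumptions.

```python
def to_fullwidth(text):
    """Convert half-width ASCII characters to their full-width forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")            # ideographic space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))  # full-width forms block
        else:
            out.append(ch)
    return "".join(out)

def render(text, force_fullwidth):
    """Model of a screen's text path, before and after the patch."""
    return to_fullwidth(text) if force_fullwidth else text

print(render("Rewards x3", force_fullwidth=True))
print(render("Rewards x3", force_fullwidth=False))
```

Each full-width character takes roughly twice the horizontal space of its half-width form, which is why the converted English text overlapped or ran off the screen.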

Miscellaneous Fixes and Extras

There were many other small things that needed reverse engineering, and bespoke code written. Many of them are things so small that you’d never know the hours of work I put into them, but each one added a layer of polish that made me feel glad to have taken on the challenge. One prompt would present you with the word “Yes” instead of “OK” when presenting a message. That required over three hours of work to correct, because of limitations in the game’s code. Some areas of the script had to be carefully worked around, like the Poogie’s names, so as to not corrupt references used elsewhere in the game.

Then came taking on the task of reverse engineering the PSP unlockables. Originally it was out of the scope of the project, but I had some spare time, and a tester had asked if I would look into it. They were also able to provide me with two memory cards they had made on original hardware, one before and the other after having made a PSP Connection.

I had only planned on figuring out a small set of unlockables, but as I went along I decided that, if I’m going to do the work, I may as well go all the way. I don’t know how much time I spent on that extracurricular work, but it was a pretty significant chunk. Still, I enjoyed doing it, and I’m glad there’s now a way for everyone to access the bonus features.

Draw the Rest of the Owl

Now all that remained was to translate the game!

While I worked on the technical side of the patch (developing tools, extracting scripts, fixing bugs, writing enhancements), Nara had already begun translating the scripts I had dumped. When I had no technical work left to do, I joined in the translation effort. Together, we translated the game, being the editor for each other’s work, and cross-checked each other’s localisations for consistency as we progressed. Tracking our hours worked on it made us not want to lag behind the other. Looking back at it, we translated about half of the game each.

Translation work was the majority of this project. All of the above technical work, reverse engineering, and programming only made up 10% of our total work hours. 90% of the work was translating text. Still, the work I had done allowed us to stand at the starting line and hit the ground running. The entire build process was automated, error checking was built-in, editing text was as simple as editing a text file, and we made fresh patches every Friday to test changes.

Being the majority of the work, discussing the choices we made when localising, and the process of translating a game of this size, would expand this post well beyond its scope, so perhaps that’s something best left for another day.

Closing Words

Working on this project was a very fun experience. Solving the technical problems was exciting, and each translated line felt like a huge victory. Every time we built a patch we were excited all over again seeing our work in the game.

When we started the project we didn’t know if anyone would really care or look forward to it. In fact, one might imagine that it was born from a desire to contribute something to the Monster Hunter community, but the truth is we just wanted to play Dos in English.

However, we were pleasantly surprised by the reception it received during development.

But, truthfully, the other reason I worked on this was because of Nara. If you like this patch, and you like the work I put into it to make it possible, you should thank her. If she had brought up any other game, I would have reverse engineered and translated that one. I worked on this project because she wanted to. Honestly, I’m a pretty lucky guy for it!

Despite the long hours and hard work, this project was a joy from beginning to end. All of the hours I put into it were validated by Nara’s incredible work ethic and high quality localisations, but the thing that made it most enjoyable was working with Nara, herself. Working with her was the most pleasant experience I’ve ever had creating anything with another person. Without her sudden desire to start this project, without her friendship, without her endless kindness and support, this project would never have been started, let alone finished. To her I owe the opportunity of working on something great. My real life and job are mundane and unfulfilling, but working on this project brought life into every single day (or maybe the life came from something else).

If you made it this far, that’s all I’ve got to say! I hope you all enjoy the patch and look forward to what we create in the future! Now get out there and hunt some monsters!

🍓