Most games are programmed in C/C++, which are relatively low-level programming languages since they are compiled directly to native machine code for the target architecture (such as x86, PowerPC or ARM). In order to hack games, we can therefore disassemble the game's executable, or decompile it back to C-like pseudocode, using e.g. IDA Pro or Ghidra.
The main() function is the entry point of the game's executable. This is where execution of the program's code starts.
To modify the game's code, we should therefore learn the respective assembly dialect and/or C/C++ programming. This helps us understand how the code might have been written and how compilers translate C/C++ code to assembly, e.g. via calling conventions.
In this thread we will discuss the basics of assembly programming and keep it as simple as possible to follow. Also, we'll mainly use x86 and PowerPC assembly as examples since they're the most common architectures to work with.
- Components of a computer device
Every computer or gaming console is built around a processor architecture with its own assembly language. They all share the concepts of RAM (= random access memory) and registers. Assembly instructions are procedural machine instructions which tell the processor what to do (e.g. move some data from the RAM into a register or vice versa).
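To make the idea of registers and RAM a bit more tangible, here is a small toy model in C. It is only a sketch for building intuition: the names ToyMachine, toy_load and toy_store are made up for this post, and a real processor has far more state than this.

```c
#include <stdint.h>
#include <string.h>

/* Toy model only: sizes and names are made up for illustration. */
#define RAM_SIZE 64
#define REG_COUNT 4

typedef struct {
    uint32_t reg[REG_COUNT]; /* a handful of 32-bit registers */
    uint8_t  ram[RAM_SIZE];  /* byte-addressable memory */
} ToyMachine;

/* "Load": copy 4 bytes from RAM at addr into register r.
   Returns 0 on success, -1 if the register or address is invalid. */
int toy_load(ToyMachine *m, unsigned r, uint32_t addr) {
    if (r >= REG_COUNT || addr > RAM_SIZE - 4) return -1;
    memcpy(&m->reg[r], &m->ram[addr], 4);
    return 0;
}

/* "Store": copy register r into 4 bytes of RAM at addr.
   Returns 0 on success, -1 if the register or address is invalid. */
int toy_store(ToyMachine *m, unsigned r, uint32_t addr) {
    if (r >= REG_COUNT || addr > RAM_SIZE - 4) return -1;
    memcpy(&m->ram[addr], &m->reg[r], 4);
    return 0;
}
```

Storing register 0 to address 8 and loading address 8 into register 1 leaves both registers with the same value. An out-of-range address is rejected here with -1, whereas on real hardware it would typically crash the program.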
In PowerPC assembly we have 32 general purpose registers (GPRs). They go from r0 to r31. These are used for almost all types of (numeric) data which isn't floating point. For floating point data, the floating point registers f0 to f31 are used.
In x86 assembly the distinction is similar, but 32-bit x86 has only eight general purpose registers and they are named less straightforwardly, e.g. eax, ebx and ecx. Floating point data also lives in separate registers there (the x87 register stack st0 to st7 or the SSE registers xmm0 and up).
- Basic processor operations/assembly instructions
The most common operations are math operations such as adding, subtracting or multiplying, moving data between registers and the RAM, and jump instructions which skip over instructions or continue execution somewhere else. Calling and returning from functions are special cases of jumps.
- Tools for testing and playing with assembly
x86 assembly (Jasmin):
x86 assembly (Cheat Engine):
PowerPC assembly (Assembler Utility):
- Code examples
In x86 assembly we can use the MOV instruction to move data from one place to another. For example, MOV EAX, 1234 will write the value 1234 into the register EAX. We can load data from a memory address into a register by using e.g. MOV EAX, [EAX], which reads the value stored at the address held in EAX. Note that the address inside the brackets must be valid (mapped and readable), otherwise the access will cause a crash. We can add two registers by using e.g. ADD EAX, EBX. The result will be stored in the first operand, here EAX. Also, register names are case insensitive. We can spell them lowercase or uppercase without any problems.
If we want to perform a conditional jump, we can do so via the compare instruction CMP followed by a JE instruction (= jump if equal):
- Code:
MOV EAX, 1234
CMP EAX, 1234
JE _SKIP
ADD EAX, 2
_SKIP:
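For comparison, here is roughly what the snippet above looks like in C. This is just a sketch: the function names compare_and_skip and load_from are made up, and an actual compiler may emit different instructions for the same source.

```c
#include <stdint.h>

/* Rough C counterpart of the CMP/JE snippet; the name is hypothetical. */
uint32_t compare_and_skip(uint32_t value) {
    uint32_t eax = value;   /* MOV EAX, value (1234 in the snippet)   */
    if (eax != 1234) {      /* CMP EAX, 1234 followed by JE _SKIP     */
        eax += 2;           /* ADD EAX, 2 runs only if not equal      */
    }
    return eax;             /* _SKIP: execution continues from here   */
}

/* Rough C counterpart of MOV EAX, [EAX]: read the value at an address.
   Passing an invalid pointer is undefined behavior, typically a crash. */
uint32_t load_from(const uint32_t *address) {
    return *address;
}
```

Note that the C condition is the inverse of the jump: JE skips the ADD when the values are equal, so the body of the if runs only when they are not equal. With 1234 as input the ADD is skipped and the value stays 1234.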
Another important concept is function calling. For this we need the CALL instruction. Before we execute the call instruction, we also need to pass the function's parameters according to the calling convention. x86 has several calling conventions: with cdecl and stdcall the arguments are pushed onto the stack, while Microsoft's fastcall passes the first two integer arguments in ECX and EDX. The example below uses the register-based style. Functions are useful to structure the code better and to re-use code. Writing bigger programs or code in general will become way too messy without functions. Assembly is no exception.
The following example shows how calling a function works:
- Code:
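MOV ECX, 2          ; first parameter
MOV EDX, 10         ; second parameter
CALL _my_function   ; push the return address, then jump to the function
MOV [EBX], ECX      ; store the result (EBX must hold a valid address)
JMP _END            ; do not fall through into the function body
_my_function:
MOV EAX, ECX        ; EAX = first parameter
MUL EDX             ; EDX:EAX = EAX * EDX
MOV ECX, EAX        ; copy the low 32 bits of the product into ECX
RET                 ; pop the return address and jump back to the caller
_END: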
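In C, the function call above could be written roughly as follows. Again this is only a sketch under assumptions: my_function mirrors the label from the assembly, mul_wide is a made-up helper that shows the full 64-bit result of MUL, and real compilers pick registers according to the calling convention in use.

```c
#include <stdint.h>

/* Hypothetical C counterpart of _my_function: multiply two 32-bit values.
   Only the low 32 bits of the product are returned (what MUL leaves in EAX). */
uint32_t my_function(uint32_t a, uint32_t b) {
    return a * b;
}

/* MUL EDX produces a 64-bit product split across two registers.
   This made-up helper shows which half ends up where. */
void mul_wide(uint32_t eax, uint32_t edx, uint32_t *high, uint32_t *low) {
    uint64_t product = (uint64_t)eax * edx;
    *high = (uint32_t)(product >> 32);  /* upper half, ends up in EDX */
    *low  = (uint32_t)product;          /* lower half, ends up in EAX */
}
```

Calling my_function(2, 10) gives 20, which matches the value the assembly ends up storing. The mul_wide helper matters for large inputs: if the product does not fit into 32 bits, the upper half lands in EDX and is lost if we only look at EAX.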
The assembly code first writes the value 2 into ECX as the first function parameter and the value 10 into EDX as the second, then calls the function _my_function, which is defined by the label of the same name. The difference between CALL and JMP is that CALL pushes the return address (the address of the instruction following the CALL) onto the stack while JMP does not. In order to properly return from a function, we need to use the RET instruction, which pops that return address off the stack and jumps to it. Inside the function, we copy the first parameter into EAX and then execute MUL EDX, which multiplies EAX by EDX and stores the 64-bit result in EDX:EAX (low 32 bits in EAX, high 32 bits in EDX). In case this sounds confusing, make sure to read the documentation for each instruction. Jasmin conveniently provides one, so make use of this great resource.
Finally, we copy EAX into ECX. Note that most x86 calling conventions return integer values in EAX, so this copy only exists because the caller in this example reads the result from ECX. After the function returns, MOV [EBX], ECX is executed, which writes the value in ECX to the memory address held in EBX.
- Conclusion
And with this the assembly crash course is concluded (for now). Feel free to ask any questions or let me know what I missed and should add to this post. I can recommend learning with Jasmin since it's a great simulator where a mistake won't crash anything. However, it may not be as sophisticated as a real processor.
Serious hacks or mod menus should be written in C or, better, in C++. Only small hacks may be written in assembly. Code written in C++ can also be injected into other processes and attached to their functions via hooking. This makes writing assembly by hand largely unnecessary for bigger projects.
Thanks for reading.