2022-10-11 : As I start my blog-C, let's publish that questionable text file.

---

███████████████████████████████████████████ ⋏ ████████████████████████████████████████████
█████████████████████████████████████≺ 00-intro.txt ≻█████████████████████████████████████
███████████████████████████████████████████ ⋎ ████████████████████████████████████████████

The finest hacking discipline has always been binary exploitation, mostly because
computers used to execute programs instead of rendering web pages.

For a long time, there used to be no binary exploitation guide, since programming and
debugging segfaulting programs was the norm. Then, fine people started documenting nifty
tricks related to software flaws of their times. « Hacking: the art of exploitation » by
Jon Erickson is recommended reading, even though it stops being relevant in the early
2000s, when binary runtimes and compilers started to implement a large number of security
mechanisms.

Gather around, and rejoice with me, for there is now a complete binary exploitation
guide, and you are reading it. We will go back to the roots of binary execution and reach
the latest binary exploitation techniques.

⚉ https://beej.us/guide/bggdb/
⚉ http://www.unknownroad.com/rtfm/gdbtut/
⚉ https://www.eecs.umich.edu/courses/eecs373/readings/Debugger.pdf
⚉ https://nostarch.com/hacking2.htm

██████████████████████████████████████████████████████████████████████████████████████████

███████████████████████████████████████████ ⋏ ████████████████████████████████████████████
█████████████████████████████████████≺ 01-roots.txt ≻█████████████████████████████████████
███████████████████████████████████████████ ⋎ ████████████████████████████████████████████

Going back to the roots of binary execution.

Let's pretend for the moment that there are only 16-bit real-mode x86 CPUs, and that
you're using MS-DOS (which only supports real mode).
Going even further back to the roots, let's pretend you're not even using an operating
system (or executive): what are you left with ? Answer: an IBM PC, with only a BIOS (or
UEFI nowadays). (Plus some secret code running in the System Management Mode (SMM), but
you're not supposed to know about this, right ?) The Basic Input Output System is what
IBM kept as a layer of proprietary bloatware when they allowed Microsoft to write an
operating system (or executive) for their personal computers, when the USSR was still
united.

Ignoring SMM, what the motherboard's hardcoded chips are supposed to do is to write the
BIOS code at an unspecified address in RAM (or at no address at all), and the BIOS code
will take care of initializing the peripherals, by enumerating PCI devices, probing
ports, and surely other undocumented activities we don't need to dive into.

Once the peripherals such as disk drives, RAM and NICs have been initialized, the first
stage of the boot loader is, well, loaded in RAM, at the offset 0x7c00. This address
might as well be written in the usual absolute form, 0x7c00, or in the 16-bit
segment:offset form, 07C0:0000, which can be abbreviated to 07C0:0. This form exists
because a single 16-bit register can only address 64 KiB, so real mode builds every
address from two 16-bit values: a segment register A (CS, DS, ES or SS) and an offset B
(IP, SP, BX, SI, DI...). The actual physical address is obtained by shifting the segment
left by one nibble (4 bits) and adding the offset :

┌────────────────────────┐
│physical = (A << 4) + B │
│                        │
│0x7c00 = (07C0 << 4) + 0│
└────────────────────────┘

(On a 386 or later, 32-bit registers such as EAX can still be used in real mode through
operand-size prefixes, but offsets normally stay limited to 64 KiB.)

So, let's boot, shall we ? The first step is to install some hypervisor; qemu provides
the best interface out there. Let's install it.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌$ apt-get install qemu-system-x86▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

Then, we'll need an operating system.
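
A quick aside before building it: the segment:offset arithmetic from the box above is
easy to double-check. Here is a minimal Python sketch of the formula; the helper name
`physical` is made up for this example and is not part of any tool used in this chapter.

```python
# Sketch of real-mode address translation: physical = (segment << 4) + offset.
# `physical` is a hypothetical helper name, chosen for this example only.

def physical(segment: int, offset: int) -> int:
    """Return the physical address named by a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    return (segment << 4) + offset


# Both notations from the text name the same byte.
print(hex(physical(0x07C0, 0x0000)))  # 0x7c00
print(hex(physical(0x0000, 0x7C00)))  # 0x7c00

# The last two bytes of a 512-byte boot sector sit at offset 0x1FE.
print(hex(physical(0x07C0, 0x01FE)))  # 0x7dfe
```

Many different segment:offset pairs map to the same physical address, which is why
0000:7C00 and 07C0:0000 are interchangeable here.
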
Let's use the simplest operating system out there, made only of null bytes and the two
magic bytes for boot sectors, 0x55 0xAA.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌$ python3 -c 'import sys;sys.stdout.buffer.write(bytes(0x200-2)+bytes([0x55,0xAA]))' > u▐
▌$ hd u                                                                                  ▐
▌00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|          ▐
▌*                                                                                       ▐
▌000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|          ▐
▌00000200                                                                                ▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

Now that our operating system is ready, let's start a virtual machine, but stop it on the
very first opcode executed (-S), boot using the floppy interface (-fda), force the boot
on the floppy (-boot a), and bind the local TCP port 1234 with a listening
gdbserver-compliant program (-s).

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌$ qemu-system-i386 -fda u -boot a -nographic -S -s▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

Let's connect to our debugging server with GDB, and see what happens.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌$ gdb -q -ex 'target remote 127.1:1234'                              ▐
▌Remote debugging using 127.1:1234                                    ▐
▌warning: No executable has been specified and target does not support▐
▌determining executable automatically.  Try using the "file" command. ▐
▌0x0000fff0 in ?? ()                                                  ▐
▌(gdb)                                                                ▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

So, we've stopped, but at 0xfff0, which isn't 0x7c00. This is because we're still in the
BIOS part of the qemu-provided system. Let's put a breakpoint at 0x7c00 and continue
execution. Don't even try to look at the code by running « x/8x $eip », or you'll see
things you're not supposed to see; it would be illegal.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌(gdb) break *0x7c00              ▐
▌Breakpoint 1 at 0x7c00           ▐
▌(gdb) c                          ▐
▌Continuing.                      ▐
▌                                 ▐
▌Breakpoint 1, 0x00007c00 in ?? ()▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

Ok, we are now at 0x7c00, because our instruction pointer EIP is pointing to 0x7c00. The
CS register (Code Segment) is set to 0000, which means we're actually fetching
instructions from the physical memory address 0x7c00. By looking at the register state,
we can also learn what the pre-bootloader left behind (probably the virtual qemu BIOS,
but this is out of our scope):

 - A stack pointer of 0x6f04, as a remnant decay from the pre-boot execution age
 - An EAX register (the accumulator) set to 0xaa55, which is the magic word at
   0x7c00 + 512 - 2 (the bytes 55 AA read as one little-endian 16-bit word). The
   convention is for the BIOS to check that those two bytes are present before jumping
   to the boot sector.
 - Some flags, pretty ephemeral and meaningless

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌(gdb) x/8x $eip                                                       ▐
▌0x7c00: 0x00000000      0x00000000      0x00000000      0x00000000    ▐
▌0x7c10: 0x00000000      0x00000000      0x00000000      0x00000000    ▐
▌(gdb) info registers                                                  ▐
▌eax            0xaa55              43605                              ▐
▌ecx            0x0                 0                                  ▐
▌edx            0x80                128                                ▐
▌ebx            0x0                 0                                  ▐
▌esp            0x6f04              0x6f04                             ▐
▌ebp            0x0                 0x0                                ▐
▌esi            0x0                 0                                  ▐
▌edi            0x0                 0                                  ▐
▌eip            0x7c00              0x7c00                             ▐
▌eflags         0x202               [ IF ]                             ▐
▌cs             0x0                 0                                  ▐
▌ss             0x0                 0                                  ▐
▌ds             0x0                 0                                  ▐
▌es             0x0                 0                                  ▐
▌fs             0x0                 0                                  ▐
▌gs             0x0                 0                                  ▐
▌(gdb) x/16h 0x7de0                                                    ▐
▌0x7de0: 0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000▐
▌0x7df0: 0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0x0000  0xaa55▐
▌(gdb) x/h (0x7c00 + 512 - 2)                                          ▐
▌0x7dfe: 0xaa55                                                        ▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

We won't go further running custom operating systems (or executives), but feel free to
consult the excellent OSdev website for more relevant information. You can also boot your
own hard drive by providing -hda /dev/sda, which would be cocky, since you would then
have two twin operating systems using the same hard drive.
A recommended approach is to install GRUB on a file, or to run the following command to
extract the first sectors of your disk :

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌dd if=/dev/sda bs=512 count=10 of=bootsector.img▐
▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟

In 16-bit mode, there's not much hacking to do, except maybe some hardware enumeration
and probing the framebuffer. (Did you ever configure a computer without a graphics
server, running only vbetool and other libcaca-inspired programs to use the screen ? You
should !)

⚉ https://software.intel.com/en-us/articles/intel-sdm
⚉ https://wiki.osdev.org/Memory_Map_(x86)
⚉ https://wiki.osdev.org/
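
One last sketch for this chapter, for readers wondering why the two bytes written as
0x55 0xAA show up as 0xaa55 in GDB: x86 is little-endian, so the first byte is the low
half of the 16-bit word. The Python below rebuilds the same 512-byte image as the python3
one-liner and reads the signature back. `make_boot_sector` and `has_boot_signature` are
illustrative names invented here, not an existing API.

```python
import struct

SECTOR_SIZE = 0x200              # 512 bytes, same size as the image named `u`
SIGNATURE = bytes([0x55, 0xAA])  # boot signature, in on-disk byte order


def make_boot_sector() -> bytes:
    """Build a sector of null bytes ending with the two-byte boot signature."""
    return bytes(SECTOR_SIZE - len(SIGNATURE)) + SIGNATURE


def has_boot_signature(sector: bytes) -> bool:
    """Check that bytes 510 and 511 of the sector hold 0x55 0xAA."""
    if len(sector) < SECTOR_SIZE:
        return False
    return sector[SECTOR_SIZE - 2:SECTOR_SIZE] == SIGNATURE


sector = make_boot_sector()

# Read the last two bytes as one little-endian unsigned 16-bit word ("<H").
(word,) = struct.unpack_from("<H", sector, SECTOR_SIZE - 2)

print(len(sector), has_boot_signature(sector), hex(word))  # 512 True 0xaa55
```

The same check should work on the first 512 bytes of the bootsector.img extracted with
dd, assuming that disk really carries a classic boot sector.
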