Show HN: ELF Injector
The ELF Injector allows you to "inject" arbitrary-sized relocatable code chunks into ELF executables. The code chunks will run before the original entry point of the executable runs.
Included in the project are sample chunks as well as a step-by-step tutorial on how it works.
It's a mix of C and assembly and currently runs on 32-bit ARM though it's easy to port to other architectures.
Interesting, I also had to solve this problem about a decade ago when writing a binary packer for Android .so libraries. I ended up moving the first few program headers around to make space for a new entry in the Elf32_Phdr table, and then used that to inject a new code segment at the end of the file. Due to alignment constraints this could sometimes add a significant number of null bytes to the binary just for padding.
I also made my code execute before the entry point by specifying it as DT_INIT in the dynamic section. This way you don't have to modify the entry point pointer or call it after your unpacking stub is done decompressing the binary in memory.
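For readers who want to picture the DT_INIT approach, here is a minimal sketch, assuming the .dynamic table has already been read into memory as an array of Elf32_Dyn terminated by DT_NULL. The function name hook_dt_init and its parameters are hypothetical, not taken from the packer described above. It saves the old initializer address so the stub can call it when it is done.

```c
#include <elf.h>
#include <stdbool.h>

/* Point DT_INIT at stub_addr and hand back the previous initializer.
 * Returns false if the table has no DT_INIT entry (adding a new entry
 * is out of scope for this sketch). */
bool hook_dt_init(Elf32_Dyn *dyn, Elf32_Addr stub_addr, Elf32_Addr *old_init)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_INIT) {
            *old_init = dyn->d_un.d_ptr;  /* the stub should call this later */
            dyn->d_un.d_ptr = stub_addr;  /* loader now runs the stub first */
            return true;
        }
    }
    return false;
}
```

Because the dynamic loader runs DT_INIT before control reaches the entry point, e_entry itself stays untouched.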
Your solution with the thunk is much better and probably avoids a lot of the complexity I encountered in moving segment headers around! ELF is a tight format unlike PE. Not a single byte goes to waste.
Thanks for sharing your project, I learned something today!
PS: one interesting piece of trivia I found was that you could strip the section header table entirely from an ELF file and the OS would still load and execute it. All it needs is the segment headers. It looks like the section headers are just there as a courtesy to help tools like strip and objcopy.
You're right, you definitely can strip away the section headers. Notice that there's always a dynamic segment? If there wasn't, stripping away the section headers would make it very difficult to locate this very important piece of an ELF binary.
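As a rough illustration of why the dynamic segment stays reachable without section headers, the sketch below finds PT_DYNAMIC using only the ELF header and the program header table. It assumes a 32-bit ELF image already loaded into a buffer with the same byte order as the host; find_dynamic_segment is a hypothetical helper name.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Locate the PT_DYNAMIC program header without touching e_shoff/e_shnum.
 * Copies the header into *out and returns true when found. */
bool find_dynamic_segment(const unsigned char *image, size_t size,
                          Elf32_Phdr *out)
{
    Elf32_Ehdr eh;

    if (size < sizeof eh)
        return false;
    memcpy(&eh, image, sizeof eh);  /* memcpy avoids alignment problems */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32 ||
        eh.e_phentsize != sizeof(Elf32_Phdr))
        return false;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + i * sizeof(Elf32_Phdr);

        if (off > size || size - off < sizeof(Elf32_Phdr))
            return false;           /* table runs past the end of the file */
        memcpy(out, image + off, sizeof *out);
        if (out->p_type == PT_DYNAMIC)
            return true;
    }
    return false;
}
```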
I had to make sure that I always inserted a page size multiple of bytes into the executable which can add up to a page of unused padding in addition to the thunk and chunk.
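The page-multiple constraint comes down to a round-up. A minimal sketch, assuming a power-of-two page size; the helper names round_up_to_page and padding_for are made up for illustration and are not from the project:

```c
#include <assert.h>
#include <stddef.h>

/* Round nbytes up to the next multiple of page_size (a power of two). */
size_t round_up_to_page(size_t nbytes, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (nbytes + page_size - 1) & ~(page_size - 1);
}

/* Unused bytes added when nbytes of payload is padded to a page multiple. */
size_t padding_for(size_t nbytes, size_t page_size)
{
    return round_up_to_page(nbytes, page_size) - nbytes;
}
```

With 4096-byte pages, a 4097-byte payload occupies 8192 bytes, so 4095 of them are padding, which is the "up to a page" worst case.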
One of the challenges of ELF injection/infection is that you might break assumptions made by the original ELF regarding its layout, particularly for dynamic ELFs (which often parse .dynamic at runtime, etc.)
How many different target ELFs have you tried it with, and are there any that don't work?
I wrote the project entirely for fun, as a learning experience and because of that I have not extensively tested it. At the moment only ELF files of type ET_EXEC can be injected.
I was careful to only inject the thunk (the code that loads the actual relocatable code chunk that the user injects) into the available padding at the end of the text segment. Injecting anything larger runs the risk of "sliding" the next segment (usually the data segment) over, thereby breaking references to static data from code.
ATOM could inject code before or after any basic block in a program: https://dl.acm.org/doi/abs/10.1145/178243.178260. The general technique, IIRC, was to replace the first instruction of the basic block with a jump to code that contained your new code and then the overwritten instructions, and then jump back into the original code.
Yes, I've implemented this technique before at my job. Relocating assembly instructions, especially those that contain branching logic, can be tricky, as the offsets have to be recomputed or a new instruction needs to be used instead. More often than not, you may not have enough space for that new instruction.
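To make the offset recomputation concrete for 32-bit ARM, here is a sketch that re-encodes an A32 B/BL instruction after it has been moved. It assumes ARM (not Thumb) state and word-aligned addresses; relocate_arm_branch is a hypothetical name. When the new offset does not fit in the 24-bit field it returns false, which is the case where a different, longer instruction sequence would be needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Re-encode an A32 B/BL that moves from old_addr to new_addr so that it
 * still reaches the same target. Returns false if insn is not B/BL or the
 * new offset is out of range. */
bool relocate_arm_branch(uint32_t insn, uint32_t old_addr, uint32_t new_addr,
                         uint32_t *out)
{
    /* Bits 27..25 == 0b101 mark B/BL; condition 0b1111 is BLX (immediate),
     * which this sketch does not handle. */
    if ((insn & 0x0E000000u) != 0x0A000000u || (insn >> 28) == 0xFu)
        return false;

    /* Sign-extend imm24; the branch target is PC + 8 + imm24 * 4. */
    int32_t imm24 = (int32_t)(insn & 0x00FFFFFFu);
    if (imm24 & 0x00800000)
        imm24 -= 0x01000000;
    int64_t target = (int64_t)old_addr + 8 + (int64_t)imm24 * 4;

    /* Byte offset as seen from the new location. */
    int64_t new_off = target - ((int64_t)new_addr + 8);
    if ((new_off & 3) != 0 ||
        new_off < -(INT64_C(1) << 25) || new_off > (INT64_C(1) << 25) - 4)
        return false;   /* outside the roughly +/-32 MiB reach of B/BL */

    /* Keep condition and opcode bits, replace only imm24. */
    *out = (insn & 0xFF000000u) | ((uint32_t)(new_off / 4) & 0x00FFFFFFu);
    return true;
}
```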
This is also similar to what is done with video game modification (think Gecko, Game Shark, Game Genie).
How does it fare on Virustotal?
Most of these projects (often intended as packers or similar obfuscation techniques for malware) that I've seen are cool but get flagged aggressively.
I wouldn't be surprised if this gets flagged but I haven't tried it.
Always cool to see people hacking ELF! I see you're using argv[0] to find the executable file. This is fragile because argument vector contents are arbitrary and controlled entirely by the parent process. In other words, argv[0] could contain anything, even an empty string. If you're targeting Linux specifically, there's a better way: /proc/self/exe.
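A minimal, Linux-only sketch of the /proc/self/exe suggestion, assuming /proc is mounted. The helper name self_exe_path is made up here. Note that readlink(2) does not NUL-terminate its result, so the code does that itself and treats a completely filled buffer as a possible truncation.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Store the path of the running executable in buf as a NUL-terminated
 * string. Returns 0 on success, -1 on error or possible truncation. */
int self_exe_path(char *buf, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0 || (size_t)n >= size - 1)
        return -1;      /* readlink failed, or the buffer was too small */

    buf[n] = '\0';
    return 0;
}
```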
I created something similar to your tool: an ELF embedder. It's for arbitrary data rather than native code injection.
I leveraged ELF segments to get the kernel to mmap the data into memory on my behalf, no file I/O needed. The auxvec allows the program to reach its own program header table. From there it's just a matter of finding the right segment.
https://www.matheusmoreira.com/articles/self-contained-lone-...
My programming language's interpreter introspects into its own ELF header, finds the embedded segments and uses them to load data and code from inside itself.
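For anyone curious what reaching the program header table through the auxiliary vector can look like, here is a small sketch for Linux with glibc. It is not the code from the linked article; find_own_segment is a hypothetical name, and it assumes each table entry has the size of the native ElfW(Phdr) rather than checking AT_PHENT.

```c
#include <link.h>       /* ElfW(), pulls in <elf.h> */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval, AT_PHDR, AT_PHNUM */

/* Return the first program header of the given type in the running
 * process's own table, or NULL if there is none. */
const ElfW(Phdr) *find_own_segment(ElfW(Word) type)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    unsigned long count = getauxval(AT_PHNUM);

    if (phdr == NULL)
        return NULL;

    for (unsigned long i = 0; i < count; i++) {
        if (phdr[i].p_type == type)
            return &phdr[i];
    }
    return NULL;
}
```

Once the matching header is found, its p_vaddr (plus the load bias for position-independent executables) says where the kernel already mapped the segment, so no file I/O is involved.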
I'm not entirely sure whether this approach is applicable to your case but it might be worth a try. My embedding tool could just as easily map in executable segments.
IME it's far more likely for a program to run in an environment lacking /proc (e.g. some containers) than for argv[0] to be NULL, empty, or lying. Even /proc/self/exe can be used in an exploit chain (https://lwn.net/Articles/920384/). And there have been many other /proc-related exploits, which is why it's often a good idea to not even mount /proc at all rather than just mask /proc subtrees.
But even if /proc/self/exe is visible, it might not be possible to open it. For example, you can execute a binary residing in anonymous memory (memfd) using fexecve(2), or the executable file could have been deleted immediately after execution. AFAIU, /proc/self/exe is generated as a symlink rather than exposing the file object directly (cf. BSD /dev/fd), so opening it can fail.
The easy fix is to simply fail if argv[0] is NULL or empty, but if one is a glutton for punishment and wants to be "robust", /proc/self/exe support could be added as a (preferred) alternative rather than a substitute mechanism.
EDIT: I was wrong, /proc/self/exe is directly openable, even if the path returned by readlink doesn't exist.
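A sketch of the "preferred alternative with fallback" idea, assuming Linux and that the caller passes its own argv[0]. The function name open_self is hypothetical. It opens /proc/self/exe directly rather than the readlink result, and falls back to argv[0] only when that fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Open the running executable for reading. Prefers /proc/self/exe and
 * falls back to argv0 when /proc is unavailable. Returns a file
 * descriptor, or -1 with errno set. */
int open_self(const char *argv0)
{
    int fd = open("/proc/self/exe", O_RDONLY);
    if (fd >= 0)
        return fd;

    /* Fail cleanly rather than guess when argv[0] is NULL or empty. */
    if (argv0 == NULL || argv0[0] == '\0') {
        errno = ENOENT;
        return -1;
    }

    /* Caveat: if argv0 has no slash it was probably found via PATH, so
     * opening it relative to the current directory may fail. */
    return open(argv0, O_RDONLY);
}
```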
Good point! I was aware that there are certain circumstances in which argv[0] might not contain the path to the executable but I didn't bother to look for an alternative. When I get a chance, I'll incorporate your suggestion. Thanks.
Thanks for sending your link, I'll take a look.