macOS reflective code loading analysis
Background
Reflective code loading is an interesting attack technique, useful when there’s a desire to conceal or protect the code that’s being executed on a system. This is primarily accomplished by avoiding the creation of any files on disk (e.g., downloading the binary that’s then going to run) or other execution artifacts that would become indicators of behavior. Instead, code can be downloaded and executed directly in the memory of an otherwise benign process. While not completely interchangeable, sometimes we see this technique also referred to as “fileless”, “artifact-free”, and/or “in-memory (only)” as those are desirable characteristics.
Probably the best known implementation of this technique on macOS relies on functionality exposed by dyld. It was first documented in “The Mac Hacker’s Handbook” by Dino Dai Zovi and Charlie Miller in 2009, and has since been discussed and popularized by Patrick Wardle at BlackHat 2015 and by Stephanie Archibald at INFILTRATE ’17. The high-level approach is:
- receive code over a socket
- if it’s a binary, change a field in the Mach-O header from MH_EXECUTE to MH_BUNDLE; this is needed for the subsequent steps, as the necessary APIs expect a bundle [1]
- use NSCreateObjectFileImageFromMemory to, well, create an object file image from the memory region that contains the binary
- use NSLinkModule to link in the necessary shared libraries
- call functions or pass execution to the loaded code (after figuring out the entry point, in the case it was actually a binary)
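The header change in the second step is small enough to sketch. The following Python is an illustration only, not code from any of the PoCs: it assumes a thin, little-endian 64-bit Mach-O image (no fat/universal wrapper) and uses the constant values from <mach-o/loader.h>, where the executable file type is spelled MH_EXECUTE. The function name is made up for this example.

```python
import struct

# Values as defined in <mach-o/loader.h>.
MH_MAGIC_64 = 0xFEEDFACF
MH_EXECUTE = 0x2
MH_BUNDLE = 0x8

# mach_header_64 starts with magic, cputype, cpusubtype (4 bytes each),
# so the filetype field sits at byte offset 12.
FILETYPE_OFFSET = 12


def executable_to_bundle(image):
    """Return a copy of `image` with filetype MH_EXECUTE rewritten to MH_BUNDLE.

    Images with any other filetype are returned unchanged.
    """
    if len(image) < FILETYPE_OFFSET + 4:
        raise ValueError("image too small to hold a Mach-O header")
    (magic,) = struct.unpack_from("<I", image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")

    patched = bytearray(image)
    (filetype,) = struct.unpack_from("<I", patched, FILETYPE_OFFSET)
    if filetype == MH_EXECUTE:
        struct.pack_into("<I", patched, FILETYPE_OFFSET, MH_BUNDLE)
    return bytes(patched)
```

The patched bytes are what would then be handed to NSCreateObjectFileImageFromMemory; nothing else in the image is touched.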
This technique has been used by malware authors fairly recently, notably by the Lazarus Group in the 2019 version of AppleJeus.
There’s just one thing: this technique hasn’t been truly fileless for… some time.
How it (really) works today
@bluec0re initially got me looking into this in depth, after noticing during a debug run that artifacts are, in fact, created when these specific functions are called. Given the public perception around this in-memory loading technique, this was a bit of a surprise, so I spent some time doing a bit more research and putting it together for this analysis.
I used @its_a_feature’s simple macos_execute_from_memory PoC for testing (there’s also Archibald’s PoC, for historical reference). For convenience/ease of iteration, these PoCs do actually load a target binary from disk, rather than over the network, but the execution is meant to be in-memory only.
To figure out what’s going on, a good starting place is to run the code while printing anything that has to do with dyld, given that’s where the two functions are. There are some helpful environment variables to set before running the code:
clang -g -o main main.c
DYLD_PRINT_APIS=1 DYLD_PRINT_LIBRARIES=1 ./main
Looks like NSLinkModule uses dlopen to load the NSCreateObjectFileImageFromMemory-VTd6S38q file from a temporary location. To confirm and get a bit more detail, I monitored the execution of ./main using Patrick Wardle’s File Monitor to see what files are being accessed. To narrow the output, I’m piping it through jq to filter for events related to the PoC, extracting only the type and the file destination for each event:
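As a rough stand-in for that kind of filter, here is a sketch in Python rather than jq. It assumes File Monitor emits one JSON object per line with an event key and a file object holding destination and process.name; those field names are assumptions and should be checked against real output before relying on them. The function name and the default process name are also just for illustration.

```python
import json


def poc_file_events(lines, process_name="main"):
    """Yield (event type, destination path) for events caused by `process_name`.

    `lines` is any iterable of text lines, e.g. sys.stdin. Lines that are
    blank, not valid JSON, or not JSON objects are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        # Assumed layout: {"event": ..., "file": {"destination": ..., "process": {"name": ...}}}
        file_info = record.get("file") or {}
        process = file_info.get("process") or {}
        if process.get("name") != process_name:
            continue
        yield record.get("event"), file_info.get("destination")
```

Feeding it the monitor’s output line by line leaves only the event type and the destination path for the PoC process, which is all that’s needed to spot the temporary file.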
During the execution of ./main a file is created, accessed a few times, then deleted. The name changes slightly with every run (the 7zEgh32K above, compared to VTd6S38q previously) and, based on the path, it’s fairly obvious this is a temporary file. I wanted to understand more about when it’s created, and whether its presence means this particular technique is no longer as interesting as we once thought: if it’s not artifact-free either, it can be replaced with a much less brittle combo of dlopen and dlsym.
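For comparison, the dlopen and dlsym route looks roughly like this. The sketch uses Python’s ctypes, whose CDLL constructor wraps dlopen and whose attribute lookup wraps dlsym on POSIX systems. The helper name and its parameters are made up for illustration, and the library has to exist on disk, which is exactly the artifact in question.

```python
import ctypes


def call_symbol(library_path, symbol, restype, argtypes, *args):
    """Load a shared library by path, resolve one symbol, and call it."""
    lib = ctypes.CDLL(library_path)   # dlopen(library_path, ...)
    func = getattr(lib, symbol)       # dlsym(handle, symbol)
    # Declare the C signature so ctypes converts arguments and result correctly.
    func.restype = restype
    func.argtypes = argtypes
    return func(*args)
```

Passing None as the path asks dlopen for a handle to the main program, which is a convenient way to try the helper without a library of your own, e.g. call_symbol(None, "strlen", ctypes.c_size_t, [ctypes.c_char_p], b"dyld").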
So next I ran ./main in lldb with a breakpoint on NSCreateObjectFileImageFromMemory, stepping through instructions to figure out what’s going on:
After stepping for a while we eventually get to dyld4::APIs::NSCreateObjectFileImageFromMemory(void const*, unsigned long, __NSObjectFileImage**), which superficially looks like the function called in main.c but, as we’ll see a bit later on, is not.
Continuing on, we first reach libdyld.dylib`NSCreateObjectFileImageFromMemory and then dyld`dyld4::APIs::NSLinkModule(__NSObjectFileImage*, char const*, unsigned int) in a similar fashion. Here we’ll see a call to mkstemp, a giveaway that we’re likely creating a temporary file:
Right after the return from mkstemp, I checked the temporary file location and found a file created there:
The SHA-256 checksum of this temp file matched the SHA-256 checksum of the file that was loaded in-memory by the PoC, so it’s pretty clear at this point that NSLinkModule writes the file out, then uses dlopen to load it back to do the rest. dlopen is part of a stable API, unlike the long-deprecated NSCreateObjectFileImageFromMemory and NSLinkModule.
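The observed sequence (write the in-memory image to a temporary file, load it by path, delete it) can be mimicked in a few lines. This is not dyld’s code, just a Python sketch of the same shape: the temp directory and suffix are whatever Python’s tempfile picks, the loader defaults to ctypes.CDLL (a dlopen wrapper), and deleting the file right after loading is a simplification, since exactly when dyld removes it isn’t established here. All names are illustrative.

```python
import ctypes
import hashlib
import os
import tempfile

# File name prefix seen in the traces; everything after it is random.
PREFIX = "NSCreateObjectFileImageFromMemory-"


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def link_via_temp_file(image, loader=ctypes.CDLL, tmp_dir=None):
    """Write `image` to a temp file, load it by path, then delete the file.

    Returns (loader result, temp path that was used, SHA-256 of the on-disk copy).
    """
    fd, path = tempfile.mkstemp(prefix=PREFIX, dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(image)
        # Hash what actually landed on disk, as in the manual check above.
        with open(path, "rb") as fh:
            on_disk_digest = sha256_hex(fh.read())
        handle = loader(path)  # with the default loader this ends up in dlopen(path)
    finally:
        os.unlink(path)
    return handle, path, on_disk_digest
```

The returned digest equals sha256_hex(image) by construction, which is the property that gave the temporary file away during debugging: anything watching the directory at the right moment sees a byte-for-byte copy of the "in-memory" payload.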
Tracing through the source code
Apple publishes the source code to dyld, though with some delay after the corresponding OS release that introduces a new version. For example, the latest available source is 852, which corresponds to macOS 11, but the current macOS 12.2 version is dyld-941.5:
strings /usr/lib/dyld | rg "dyld-\b\d\d\d\b" # also /System/Library/dyld/dyld_shared_cache_arm64e
@(#)PROGRAM:dyld PROJECT:dyld-941.5
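The same check can be scripted when comparing many copies of dyld. A minimal sketch, assuming the version marker always has the PROJECT:dyld-<digits> form seen in the output above; the function names are illustrative.

```python
import re

# Matches e.g. b"PROJECT:dyld-941.5" and captures "941.5".
VERSION_RE = re.compile(rb"PROJECT:dyld-(\d+(?:\.\d+)*)")


def dyld_versions(blob):
    """Return the distinct dyld project versions embedded in `blob`, sorted as strings."""
    return sorted({m.group(1).decode("ascii") for m in VERSION_RE.finditer(blob)})


def dyld_versions_in_file(path):
    """Read a binary such as /usr/lib/dyld and report the versions it embeds."""
    with open(path, "rb") as fh:
        return dyld_versions(fh.read())
```

Running dyld_versions_in_file over a folder of collected binaries gives a quick version inventory before opening any of them in a disassembler.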
An interesting fact is that the function we saw earlier as dyld4::APIs::NSCreateObjectFileImageFromMemory is from the dyld4 namespace, which is not present in any of the source code releases (there are only dyld and dyld3). During debugging I saw a few functions called that are part of the dyld3 namespace, so my guess is that all of these will coexist, at least for a while. For the purposes of this analysis, I’ll focus on the 852 version of dyld, but I’ll keep an eye out for whenever 941 drops, to check out dyld4 there.
There are a few implementations of NSCreateObjectFileImageFromMemory in the source for 852, in:
- src/dyldAPIs.cpp, added with the dyld-43 release, about 17 years ago
- src/dyldAPIsInLibSystem.cpp, added by dyld-519.2.1 in 2017
- dyld3/APIs_macOS.cpp, where the relevant portion was also added with dyld-519.2.1 in 2017
The same is true for NSLinkModule, just a few lines below in those same files.
It looks like gUseDyld3 controls whether the version that’s being used is the one in the dyld3 namespace, versus the “original” from the dyld-43 release. There are a few places where this is assigned throughout the code base, though only one spot where it’s set to true, in libdyldEntryVector.cpp.
Following the path further through the build config files in Xcode and the source itself, I believe the entry vector is a dependency for building libdyld.dylib. So that’s how we get to the code path that uses the new (dyld3) implementation of NSCreateObjectFileImageFromMemory and NSLinkModule, the latter of which creates a temp file and loads it via dlopen.
How long has this been the case?
I admit that, while researching, I went through a few different theories as to when this behavior changed. Given I hadn’t seen anyone discuss it, it seemed like a recent change. But when I began tracing through the dyld source, I figured maybe this changed around 2017 when the initial dyld3 code was added, and somehow nobody noticed. I even went to search VirusTotal for files named according to this pattern, and found quite a few, dating back to 2018.
Somehow though, nobody noticing just… didn’t seem right?
Reverse engineering dyld for better “carbon” dating
While the source for version 941.5 isn’t out, I was curious to check out how execution of this code path looks now, so I opened my local copy of dyld in Ghidra. Searching for NSLinkModule returned a single location, in the dyld4 namespace as we expected. The decompiled [2] code does bear resemblance to the source in version 852:
Line 32 confirms the template for the temporary name is the same as what we saw during execution.
Line 51 would be around where we expect a dlopen call to happen, although in this case there’s an indirection to resolve an address for a function from a jump table. Unfortunately Ghidra isn’t able to resolve that table, so we’re left speculating whether that’ll end up being dlopen or not. But I’m pretty sure that’s where we end up eventually, given what we’ve observed during live debugging.
I wanted to compare this version with some of the older ones but, lacking any non-Monterey installs at the moment, I had to resort to grabbing copies of /usr/lib/dyld from VMs and, eventually, downloading them from VirusTotal. I was quite surprised to see that every one of them, as far back as 551 and including 852, does in fact use a non-dlopen version of the code, basically the true fileless implementation. I can’t rule out that maybe all of these versions are off VMs, where maybe something is different with dyld… I think it’s rather curious that I didn’t find any that use the dyld3 implementation.
The thing that throws me off the most, if this behavior really was only enabled in Monterey, is that so many files matching the mkstemp pattern of /private/var/tmp/NSCreateObjectFileImageFromMemory-XXXXXX show up in VT since 2018. I’m not convinced either way just yet.
Future work
The predictable file name and location mean XDR can easily grab these files whenever they’re created, and what gets written out is exactly the code that’s going to run (so any obfuscation would happen before, most likely). There are certainly some benign uses of these APIs, but there are bound to be some cool findings too. I haven’t spent any time looking through such files on VT to see if anything good is there, but it’s certainly tempting.
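To make the detection idea concrete, here is a minimal one-shot scan in Python. It is only a sketch: the default directory is the one from the VT pattern mentioned earlier, the suffix is assumed to be a run of alphanumeric characters (as in the names observed during the runs), and a real sensor would react to file-creation events rather than poll, since the file is short-lived. The function name is made up.

```python
import hashlib
import os
import re

# Prefix from the observed temp files; the alphanumeric suffix is an assumption.
NAME_RE = re.compile(r"^NSCreateObjectFileImageFromMemory-[A-Za-z0-9]+$")


def find_spilled_images(directory="/private/var/tmp"):
    """Return sorted (path, SHA-256 hex digest) pairs for matching regular files."""
    hits = []
    try:
        entries = list(os.scandir(directory))
    except OSError:
        return hits  # directory missing or unreadable
    for entry in entries:
        if not NAME_RE.match(entry.name):
            continue
        try:
            if not entry.is_file(follow_symlinks=False):
                continue
            with open(entry.path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
        except OSError:
            continue  # the file vanished or could not be read
        hits.append((entry.path, digest))
    return sorted(hits)
```

The digests are convenient for looking the payloads up afterwards; copying the file somewhere safe before it is deleted would be the obvious next step.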
Once the code for the latest version of dyld is out, I hope to better understand the specifics of how the dyld4 namespace is enabled for use. Of course, I also want to see what direction Apple is taking for these APIs, given the new namespace.
On the offensive tooling side, I’m curious if it’s possible to develop a pure in-memory (true fileless) way to execute code on macOS. Mostly that seems to require writing a Mach-O loader, and I bet significant parts of the logic can be borrowed from both the deprecated code and the dlopen implementation.
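Purely as a starting point, and not something taken from this analysis, the first thing any such loader has to do is walk the load commands of the image. The sketch below does only that, in Python, for a thin little-endian 64-bit Mach-O: it lists the load commands and pulls the entry offset out of LC_MAIN. The constants and struct layouts are the ones from <mach-o/loader.h>; the function names are made up, and mapping segments, binding symbols and everything else a real loader needs is left out.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_MAIN = 0x80000028  # 0x28 | LC_REQ_DYLD

# mach_header_64: magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
HEADER_64 = struct.Struct("<IiiIIIII")
# Every load command starts with: cmd, cmdsize
LOAD_CMD = struct.Struct("<II")
# entry_point_command payload after cmd/cmdsize: entryoff, stacksize
ENTRY_PAYLOAD = struct.Struct("<QQ")


def walk_load_commands(image):
    """Return a list of (cmd, offset, cmdsize) for each load command in `image`."""
    if len(image) < HEADER_64.size:
        raise ValueError("image too small for a mach_header_64")
    magic, _, _, _, ncmds, sizeofcmds, _, _ = HEADER_64.unpack_from(image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")
    offset = HEADER_64.size
    end = offset + sizeofcmds
    if end > len(image):
        raise ValueError("load commands run past the end of the image")
    commands = []
    for _ in range(ncmds):
        if offset + LOAD_CMD.size > end:
            raise ValueError("load command table is truncated")
        cmd, cmdsize = LOAD_CMD.unpack_from(image, offset)
        if cmdsize < LOAD_CMD.size or offset + cmdsize > end:
            raise ValueError("load command has an invalid size")
        commands.append((cmd, offset, cmdsize))
        offset += cmdsize
    return commands


def entry_offset(image):
    """Return LC_MAIN's entryoff (file offset of the entry point), or None if absent."""
    for cmd, offset, cmdsize in walk_load_commands(image):
        if cmd == LC_MAIN:
            if cmdsize < LOAD_CMD.size + ENTRY_PAYLOAD.size:
                raise ValueError("LC_MAIN command is too short")
            entryoff, _stacksize = ENTRY_PAYLOAD.unpack_from(image, offset + LOAD_CMD.size)
            return entryoff
    return None
```

This is also the "figuring out the entry point" step from the original technique: for an image that started life as an executable, LC_MAIN is where the entry offset is recorded.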
If you happen to come across any interesting implementations of true in-memory loaders for macOS, have figured out some of the missing parts I didn’t get to in this analysis, or found any issues/mistakes etc. please let me know via Twitter.
Thanks for reading!
[1] If we’re to go by Apple’s old documentation, the motivation for this API is to allow loading of plug-ins in applications that might want them. Bundles (in the Mach-O sense) fit this purpose rather well, though it’s trivial to turn a binary into a bundle, as shown.
[2] The GNU demangler that’s part of Ghidra threw a bunch of errors and did not demangle any names, so I ran the decompiled code through c++filt manually and adjusted as necessary. Still, I may have made some errors re: parameters, which was fine for my purposes but may not be for yours.