
macOS reflective code loading analysis


Background

Reflective code loading is an interesting attack technique, useful when there’s a desire to conceal or protect the code that’s being executed on a system. This is primarily accomplished by avoiding the creation of any files on disk (e.g., downloading the binary that’s then going to run) or other execution artifacts that would become indicators of behavior. Instead, code can be downloaded and executed directly in the memory of an otherwise benign process. While not completely interchangeable, sometimes we see this technique also referred to as “fileless”, “artifact-free”, and/or “in-memory (only)” as those are desirable characteristics.

Probably the best-known implementation of this technique on macOS relies on functionality exposed by dyld. It was first documented in “The Mac Hacker’s Handbook” by Dino Dai Zovi and Charlie Miller in 2009, and has since been discussed and popularized by Patrick Wardle at BlackHat 2015 and by Stephanie Archibald at INFILTRATE ’17. The high-level approach is:

  1. receive code over a socket
  2. if it’s a binary, change the filetype field in the Mach-O header from MH_EXECUTE to MH_BUNDLE; this is needed for the subsequent steps, as the necessary APIs expect a bundle (see footnote 1)
  3. use NSCreateObjectFileImageFromMemory to, well, create an object file from the memory region that contains the binary
  4. use NSLinkModule to link in the necessary shared libraries
  5. call functions or pass execution to the loaded code (after figuring out the entry point, in the case it was actually a binary)
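
Step 2 is the only part of this list that touches the file format directly, so here is a minimal sketch of it in C. It assumes a thin (non-fat) 64-bit Mach-O in host byte order that already sits in a writable buffer. The struct and constants mirror <mach-o/loader.h> but are redefined under hypothetical sketch_/SKETCH_ names so the snippet builds on any platform, and patch_exec_to_bundle is a name made up for illustration, not something taken from either PoC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values match <mach-o/loader.h>; redefined so the sketch builds anywhere. */
#define SKETCH_MH_MAGIC_64 0xfeedfacfu
#define SKETCH_MH_EXECUTE  0x2u
#define SKETCH_MH_BUNDLE   0x8u

/* Same field layout as struct mach_header_64. */
struct sketch_mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

/* Flips filetype from MH_EXECUTE to MH_BUNDLE inside the buffer.
 * Returns 0 if the image is a bundle afterwards, -1 if the buffer is too
 * small, is not a 64-bit Mach-O, or has some other file type. */
int patch_exec_to_bundle(uint8_t *image, size_t len)
{
    struct sketch_mach_header_64 hdr;

    if (image == NULL || len < sizeof hdr)
        return -1;

    /* Copy out instead of casting, so alignment of the buffer is irrelevant. */
    memcpy(&hdr, image, sizeof hdr);
    if (hdr.magic != SKETCH_MH_MAGIC_64)
        return -1;
    if (hdr.filetype == SKETCH_MH_BUNDLE)
        return 0;                       /* nothing to do */
    if (hdr.filetype != SKETCH_MH_EXECUTE)
        return -1;

    hdr.filetype = SKETCH_MH_BUNDLE;
    memcpy(image, &hdr, sizeof hdr);
    return 0;
}
```

The patched buffer is what would then be handed to NSCreateObjectFileImageFromMemory in step 3. Fat binaries and byte-swapped images are deliberately not handled here.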

This technique has been used by malware authors rather recently, notably the Lazarus Group in the 2019 version of AppleJeus.

There’s just one thing: this technique hasn’t been truly fileless for… some time.

How it (really) works today

@bluec0re initially got me looking into this in depth, after noticing during a debug run that artifacts are, in fact, created when these specific functions are called. Given the public perception around this in-memory loading technique, this was a bit of a surprise, so I spent some time doing a bit more research and putting it together for this analysis.

I used @its_a_feature’s simple macos_execute_from_memory PoC for testing (there’s also Archibald’s PoC, for historical reference). For convenience/ease of iteration, these PoCs do actually load a target binary from disk, rather than over the network, but the execution is meant to be in-memory only.

To figure out what’s going on, a good starting place is to run the code while printing anything that has to do with dyld, given that’s where the two functions are. There are some helpful environment variables to set before running the code:

clang -g -o main main.c
DYLD_PRINT_APIS=1 DYLD_PRINT_LIBRARIES=1 ./main
// some output elided
dyld[80184]: NSCreateObjectFileImageFromMemory(0x105010000, 0x00008258)
dyld[80184]: NSCreateObjectFileImageFromMemory() copy 0x105010000 to 0x10501c000
dyld[80184]: NSLinkModule(0x6000029d4270, module)
dyld[80184]: dlopen("/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-VTd6S38q", 0x80000080)
dyld[80184]: <DDBBB7CE-78F7-3E78-AD09-2C79ED030E2A> /private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-VTd6S38q
dyld[80184]:       dlopen(NSCreateObjectFileImageFromMemory-VTd6S38q) => 0x20a24d4a0
dyld[80184]: NSLinkModule(0x6000029d4270, module) => 0x20a24d4a0
dyld[80184]: NSLookupSymbolInModule(0x20a24d4a0, _execute)
dyld[80184]: NSLookupSymbolInModule(0x20a24d4a0, _execute) => 0x105037f84
old timey mode:	Executed!
dyld[80184]: NSUnLinkModule(0x20a24d4a0)
dyld[80184]: dlclose(0x20a24d4a0)
dyld[80184]: NSDestroyObjectFileImage(0x6000029d4270)

Looks like NSLinkModule uses dlopen to load the NSCreateObjectFileImageFromMemory-VTd6S38q file from a temporary location. To confirm and get a bit more detail, I monitored the execution of ./main using Patrick Wardle’s File Monitor to see what files are being accessed. To narrow the output, I’m piping it through jq to keep only the events related to the PoC, extracting just the type and the file destination for each event:

sudo /Applications/FileMonitor.app/Contents/MacOS/FileMonitor | \
jq  '.
   | select(.file.process.path | contains("macos_execute_from_memory/main"))
   | {event: .event, dest: .file.destination}' | \
tee filemon-filtered.json # also saving for further analysis
# start ./main in another terminal, Ctrl-C the FileMonitor process when done
{
  "event": "ES_EVENT_TYPE_NOTIFY_OPEN",
  "dest": "…/macos_execute_from_memory/test.bundle"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_CLOSE",
  "dest": "…/macos_execute_from_memory/test.bundle"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_CREATE",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_WRITE",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_CLOSE",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_OPEN",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}
{
  "event": "ES_EVENT_TYPE_NOTIFY_CLOSE",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}
// open-close twice more
{
  "event": "ES_EVENT_TYPE_NOTIFY_UNLINK",
  "dest": "/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-7zEgh32K"
}

During the execution of ./main a file is created, accessed a few times, then deleted. The name changes slightly with every run (the 7zEgh32K above, compared to VTd6S38q previously) and, based on the path, it’s fairly obvious this is a temporary file. I wanted to understand more about when it’s created, and whether its presence means this particular technique is no longer as interesting as we once thought: if a file hits the disk anyway, the technique is not artifact-free, and it could be replaced with a much less brittle combo of dlopen and dlsym.

So next I ran ./main in lldb with a breakpoint on NSCreateObjectFileImageFromMemory, stepping through instructions to figure out what’s going on:

(lldb) b NSCreateObjectFileImageFromMemory
Breakpoint 1: where = libdyld.dylib`NSCreateObjectFileImageFromMemory, address = 0x0000000180314fa0
(lldb) r
Process 94604 launched: 'macos_execute_from_memory/main' (arm64)
Process 94604 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x000000018e3b8fa0 libdyld.dylib`NSCreateObjectFileImageFromMemory
libdyld.dylib`NSCreateObjectFileImageFromMemory:
->  0x18e3b8fa0 <+0>:  mov    x3, x2
    0x18e3b8fa4 <+4>:  mov    x2, x1
    0x18e3b8fa8 <+8>:  mov    x1, x0
    0x18e3b8fac <+12>: adrp   x8, 366208
Target 0: (main) stopped.

After stepping for a while we eventually get to dyld4::APIs::NSCreateObjectFileImageFromMemory(void const*, unsigned long, __NSObjectFileImage**), which superficially looks like the function called in main.c but, as we’ll see a bit later on, is not.

Continuing on, we first reach libdyld.dylib`NSLinkModule and then dyld`dyld4::APIs::NSLinkModule(__NSObjectFileImage*, char const*, unsigned int) in a similar fashion. Here we see a call to mkstemp, a giveaway that we’re likely creating a temporary file:

(lldb)
Process 97220 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x000000018e3bdea8 libdyld.dylib`dyld4::LibSystemHelpers::getenv(char const*) const
libdyld.dylib`dyld4::LibSystemHelpers::getenv:
->  0x18e3bdea8 <+0>: mov    x0, x1
    0x18e3bdeac <+4>: b      0x18e3c329c               ; symbol stub for: getenv

libdyld.dylib`dyld4::LibSystemHelpers::mkstemp:
    0x18e3bdeb0 <+0>: mov    x0, x1
    0x18e3bdeb4 <+4>: b      0x18e3c336c               ; symbol stub for: mkstemp
Target 0: (main) stopped.
(lldb)
Process 97220 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
    frame #0: 0x000000018e3bdeac libdyld.dylib`dyld4::LibSystemHelpers::getenv(char const*) const + 4
libdyld.dylib`dyld4::LibSystemHelpers::getenv:
->  0x18e3bdeac <+4>: b      0x18e3c329c               ; symbol stub for: getenv

libdyld.dylib`dyld4::LibSystemHelpers::mkstemp:
    0x18e3bdeb0 <+0>: mov    x0, x1
    0x18e3bdeb4 <+4>: b      0x18e3c336c               ; symbol stub for: mkstemp

libdyld.dylib`dyld4::LibSystemHelpers::getTLVGetAddrFunc:
    0x18e3bdeb8 <+0>: adrp   x16, -5
Target 0: (main) stopped.

Right after the return from mkstemp, I checked the temporary file location to see a file created there:

fd NSCreateObjectFileImageFromMemory /private/var/folders/
/private/var/folders/q8/28dylf2973q_22bqzy2_lplr0000gn/T/NSCreateObjectFileImageFromMemory-wySJJ7WJ

The SHA-256 checksum of this temp file matched the SHA-256 checksum of the file that was loaded in-memory by the PoC, so it’s pretty clear at this point that NSLinkModule writes the file out, then uses dlopen to load it back to do the rest. dlopen is part of a stable API, unlike the long-deprecated NSCreateObjectFileImageFromMemory and NSLinkModule.
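
To make the “writes the file out” half of that sequence concrete, here’s a rough sketch using plain POSIX calls. It’s a paraphrase of the observed behavior, not dyld’s code: spill_image_to_temp is a hypothetical name, and the caller supplies the mkstemp template. In the real flow the path would next be handed to dlopen and then unlinked, which the sketch leaves to the caller.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose mkstemp/pwrite under strict -std=c11 */
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes len bytes from buf into a fresh file created from tmpl, a writable
 * mkstemp template that is rewritten in place with the real path.
 * Returns 0 on success. On failure returns -1 and leaves no file behind. */
int spill_image_to_temp(char *tmpl, const void *buf, size_t len)
{
    ssize_t written;
    int fd = mkstemp(tmpl);

    if (fd == -1)
        return -1;

    /* Single positional write at offset 0, like the pwrite seen in dyld. */
    written = pwrite(fd, buf, len, 0);
    close(fd);

    if (written < 0 || (size_t)written != len) {
        unlink(tmpl);               /* don't leave a partial image around */
        return -1;
    }
    return 0;
}
```

A caller would pass something like a char array initialized to "/tmp/NSCreateObjectFileImageFromMemory-XXXXXXXX" and, once done with the file, unlink the resulting path itself. That create, write, close, unlink order is the same one File Monitor reported above.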

Tracing through the source code

Apple publishes the source code to dyld, though with some delay after the corresponding OS release that introduces a new version. For example, the latest available source is 852, which corresponds to macOS 11, but the current macOS 12.2 version is dyld-941.5.

strings /usr/lib/dyld | rg "dyld-\b\d\d\d\b" # also /System/Library/dyld/dyld_shared_cache_arm64e
@(#)PROGRAM:dyld  PROJECT:dyld-941.5
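
If you’d rather do this check from code than with strings and rg, the same ident string is easy to pick apart. The sketch below is only an illustration: parse_dyld_version is a hypothetical helper, and it assumes the PROJECT:dyld-<major>[.<minor>] shape shown in the output above.

```c
#include <stdio.h>
#include <string.h>

/* Pulls the version out of an ident string such as
 * "@(#)PROGRAM:dyld  PROJECT:dyld-941.5".
 * On success returns 0 and fills *major (941) and *minor (5, or 0 when the
 * string carries no minor component). Returns -1 if no version is found. */
int parse_dyld_version(const char *ident, int *major, int *minor)
{
    static const char marker[] = "PROJECT:dyld-";
    const char *p;
    int fields;

    if (ident == NULL || major == NULL || minor == NULL)
        return -1;

    p = strstr(ident, marker);
    if (p == NULL)
        return -1;
    p += sizeof marker - 1;         /* skip past the marker itself */

    *minor = 0;                     /* default when only a major is present */
    fields = sscanf(p, "%d.%d", major, minor);
    return fields >= 1 ? 0 : -1;
}
```

Feeding it each line that strings prints for /usr/lib/dyld and stopping at the first success would give the same answer as the one-liner.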

An interesting fact is that the function we saw earlier as dyld4::APIs::NSCreateObjectFileImageFromMemory is from the dyld4 namespace, which is not present in any of the source code releases (there’s only dyld and dyld3 respectively.) During debugging I saw a few functions called that are part of the dyld3 namespace, so my guess is that all of these will coexist, at least for a while. For the purposes of this analysis, I’ll focus on the 852 version of dyld, but I’ll keep an eye out for whenever 941 drops, to check out dyld4 there.

There are a few implementations of NSCreateObjectFileImageFromMemory in the source for 852, in:

The same is true for NSLinkModule, just a few lines below in those same files.

It looks like gUseDyld3 controls whether the version being used is the one in the dyld3 namespace or the “original” from the dyld-43 release. There are a few places where it is assigned throughout the code base, though only one spot where it’s set to true, in libdyldEntryVector.cpp. Following the path further through the build config files in Xcode and the source itself, I believe the entry vector is a dependency for building libdyld.dylib. So that’s how we get to the code path that uses the new (dyld3) implementation of NSCreateObjectFileImageFromMemory and NSLinkModule, the latter of which creates a temp file and loads it via dlopen.

How long has this been the case?

I admit that, while researching, I went through a few different theories as to when this behavior changed. Given I hadn’t seen anyone discuss it, it seemed like a recent change. But when I began tracing through the dyld source, I figured maybe this changed around 2017 when the initial dyld3 code was added, and somehow nobody noticed. I even went to search VirusTotal for files named according to this pattern, and found quite a few, dating back to 2018.

[Figure: VirusTotal search results for name:NSCreateObjectFileImageFromMemory*]

Somehow though, nobody noticing just… didn’t seem right?

Reverse engineering dyld for better “carbon” dating

While the source for version 941.5 isn’t out, I was curious to see how execution of this code path looks now, so I opened my local copy of dyld in Ghidra. Searching for NSLinkModule returned a single location, in the dyld4 namespace as we expected. The decompiled code (see footnote 2) does bear resemblance to the source in version 852:

void dyld4::APIs::NSLinkModule(__NSObjectFileImage*, char const*, unsigned int)

{
  int iVar1;
  char *pcVar2;
  size_t sVar3;
  undefined8 uVar4;
  long *plVar5;
  char acStack1096 [1024];
  long local_48;

  local_48 = ___stack_chk_guard;
  if (*(char *)(param_1[1] + 0x90) != '\0') {
    dyld4::RuntimeState::log(param_1,"NSLinkModule(%p, %s)\n");
  }
  if (param_2[1] == (char *)0x0) {
    uVar4 = 0;
  }
  else {
    *param_2 = (char *)0x0;
    pcVar2 = (char *)(**(code **)(*(long *)param_1[0xd] + 0x80))((long *)param_1[0xd],"TMPDIR");
    if ((pcVar2 == (char *)0x0) || (sVar3 = _strlen(pcVar2), sVar3 < 3)) {
      _strlcpy(acStack1096,"/tmp/",0x400);
    }
    else {
      _strlcpy(acStack1096,pcVar2,0x400);
      sVar3 = _strlen(pcVar2);
      if (pcVar2[sVar3 - 1] != '/') {
        _strlcat(acStack1096,"/",0x400);
      }
    }
    _strlcat(acStack1096,"NSCreateObjectFileImageFromMemory-XXXXXXXX",0x400);
    iVar1 = (**(code **)(*(long *)param_1[0xd] + 0x88))((long *)param_1[0xd],acStack1096);
    if (iVar1 != -1) {
      pcVar2 = (char *)_pwrite(iVar1,param_2[1],(size_t)param_2[2],0);
      if (pcVar2 == param_2[2]) {
        plVar5 = (long *)param_1[0xd];
        sVar3 = _strlen(acStack1096);
        pcVar2 = (char *)(**(code **)(*plVar5 + 8))(plVar5,sVar3 + 1);
        *param_2 = pcVar2;
        _strcpy(pcVar2,acStack1096);
      }
      _close(iVar1);
    }
    uVar4 = 0x80000080;
  }
  if (*param_2 != (char *)0x0) {
    pcVar2 = (char *)(**(code **)(*param_1 + 0x70))(param_1,*param_2,uVar4);
    param_2[4] = pcVar2;
    if (pcVar2 != (char *)0x0) {
      pcVar2 = (char *)dyld4::Loader::loadAddress(dyld4::RuntimeState& pcVar2 >> 1,param_1);
      param_2[3] = pcVar2;
      if (param_2[1] != (char *)0x0) {
        _unlink(*param_2);
      }
      if (*(char *)(param_1[1] + 0x90) != '\0') {
        dyld4::RuntimeState::log(param_1,"NSLinkModule(%p, %s) => %p\n");
      }
      pcVar2 = param_2[4];
      goto LAB_00029c40;
    }
    if (*(char *)(param_1[1] + 0x90) != '\0') {
      (**(code **)(*param_1 + 0x80))(param_1);
      dyld4::RuntimeState::log(param_1,"NSLinkModule(%p, %s) => NULL (%s)\n");
    }
  }
  pcVar2 = (char *)0x0;
LAB_00029c40:
  if (___stack_chk_guard != local_48) {
                    /* WARNING: Subroutine does not return */
    ___stack_chk_fail(pcVar2);
  }
  return;
}

Line 32 (the _strlcat call that appends NSCreateObjectFileImageFromMemory-XXXXXXXX) confirms the template for the temporary name is the same as what we saw during execution.

Line 51 would be around where we expect a dlopen call to happen, although in this case there’s an indirection to resolve an address for a function from a jump table. Unfortunately Ghidra isn’t able to resolve that table, so we’re left speculating whether that ends up being dlopen or not. The indirect call a few lines earlier does receive the temp file path together with the same 0x80000080 flags that showed up in the dlopen trace, though, so I’m pretty sure that’s where we end up eventually, given what we’ve observed during live debugging.
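
Since Ghidra’s output is a bit dense, here’s how I read the path-building part of that function, rewritten as a small standalone sketch. It’s my paraphrase of the decompiled logic, not Apple’s source: build_temp_template is a hypothetical name, the TMPDIR value is passed in by the caller, and snprintf stands in for the strlcpy/strlcat calls (which aren’t available everywhere). Unlike the original, which truncates silently, it reports an undersized buffer.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_TEMPLATE_NAME "NSCreateObjectFileImageFromMemory-XXXXXXXX"

/* Builds the mkstemp template the way the decompiled code appears to:
 * use tmpdir (what getenv("TMPDIR") returned) unless it is NULL or shorter
 * than 3 characters, in which case fall back to "/tmp/"; make sure there is
 * a trailing slash; then append the fixed template name.
 * Returns 0 on success, -1 if out is too small to hold the result. */
int build_temp_template(const char *tmpdir, char *out, size_t out_len)
{
    const char *sep;
    int n;

    if (out == NULL || out_len == 0)
        return -1;
    if (tmpdir == NULL || strlen(tmpdir) < 3)
        tmpdir = "/tmp/";

    /* Only add a separator when the directory does not already end in one. */
    sep = (tmpdir[strlen(tmpdir) - 1] == '/') ? "" : "/";

    n = snprintf(out, out_len, "%s%s%s", tmpdir, sep, SKETCH_TEMPLATE_NAME);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Calling it with getenv("TMPDIR") and a 1024-byte buffer mirrors the 0x400 stack buffer in the listing. With TMPDIR pointing under /var/folders, the result has the same shape as the paths in the dyld and File Monitor output earlier.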

I wanted to compare this version with some of the older ones but, lacking any non-Monterey installs at the moment, I had to resort to grabbing copies of /usr/lib/dyld from VMs and, eventually, downloading them from VirusTotal. I was quite surprised to see that every one of them, as far back as 551 and including 852, does in fact use a non-dlopen version of the code, basically the true fileless implementation. I can’t rule out that all of these versions came off VMs, where something may be different with dyld… Still, I think it’s rather curious that I didn’t find any that use the dyld3 implementation.

The thing that throws me off the most—if this behavior really was only enabled in Monterey—is that so many files matching the mkstemp pattern of /private/var/tmp/NSCreateObjectFileImageFromMemory-XXXXXX show up in VT since 2018. I’m not convinced either way just yet.

Future work

The predictable file name and location means XDR can easily grab these files whenever they’re created, as it would be exactly the code to run that’s written out (so any obfuscation would happen before, most likely). There are some benign uses of these APIs, certainly, but there are bound to be some cool findings too. I haven’t spent any time looking through such files on VT to see if anything good is there, but it’s certainly tempting.
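
As a sketch of how simple such a match could be, the helper below checks whether a path’s last component follows the naming pattern seen in this analysis. Everything about it is an assumption for illustration: looks_like_nslinkmodule_temp is a made-up name, and because both 8-character suffixes (in my runs) and a 6-X template (in the VT results) show up, it accepts any non-empty alphanumeric suffix rather than a fixed length.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if the final path component is
 * "NSCreateObjectFileImageFromMemory-" followed by one or more alphanumeric
 * characters, 0 otherwise. The directory part is ignored on purpose, since
 * TMPDIR differs per user and per system. */
int looks_like_nslinkmodule_temp(const char *path)
{
    static const char prefix[] = "NSCreateObjectFileImageFromMemory-";
    const size_t prefix_len = sizeof prefix - 1;
    const char *base;
    const char *suffix;
    size_t i, n;

    if (path == NULL)
        return 0;

    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    if (strncmp(base, prefix, prefix_len) != 0)
        return 0;

    suffix = base + prefix_len;
    n = strlen(suffix);
    if (n == 0)
        return 0;
    for (i = 0; i < n; i++) {
        if (!isalnum((unsigned char)suffix[i]))
            return 0;
    }
    return 1;
}
```

Hooked up to the destination path of a file-creation event, this would flag the CREATE seen in the File Monitor output; what to do with the file afterwards (copy it, hash it) is a separate decision, and the window before the unlink is short.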

Once the code for the latest version of dyld is out, I hope to better understand the specifics of how the dyld4 namespace is enabled for use. Of course, I also want to see what direction Apple is taking for these APIs, given the new namespace.

On the offensive tooling side, I’m curious if it’s possible to develop a pure in-memory (true fileless) way to execute code on macOS. Mostly that seems to require writing a Mach-O loader, and I bet significant parts of the logic can be borrowed from both the deprecated code, and the dlopen implementation.
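
For a sense of what the first step of such a loader involves, here’s a small sketch that walks the load commands of a thin 64-bit Mach-O held in memory and pulls the entry point offset out of LC_MAIN. This is only the parsing part; segment mapping, binding, and everything else a loader needs is not attempted. The sketch_/SKETCH_ definitions mirror <mach-o/loader.h> so the snippet builds anywhere, and find_entry_offset is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values match <mach-o/loader.h>; redefined so the sketch builds anywhere. */
#define SKETCH_MH_MAGIC_64 0xfeedfacfu
#define SKETCH_LC_MAIN     0x80000028u   /* 0x28 | LC_REQ_DYLD */

struct sketch_mach_header_64 {
    uint32_t magic;
    int32_t  cputype, cpusubtype;
    uint32_t filetype, ncmds, sizeofcmds, flags, reserved;
};
struct sketch_load_command { uint32_t cmd, cmdsize; };
struct sketch_entry_point_command {
    uint32_t cmd, cmdsize;
    uint64_t entryoff;    /* file offset of main() */
    uint64_t stacksize;
};

/* Scans the load commands for LC_MAIN and stores its entryoff.
 * Returns 0 on success, -1 if the image is malformed, truncated,
 * not a 64-bit Mach-O, or has no LC_MAIN. */
int find_entry_offset(const uint8_t *image, size_t len, uint64_t *entryoff)
{
    struct sketch_mach_header_64 hdr;
    size_t off;
    uint32_t i;

    if (image == NULL || entryoff == NULL || len < sizeof hdr)
        return -1;
    memcpy(&hdr, image, sizeof hdr);
    if (hdr.magic != SKETCH_MH_MAGIC_64)
        return -1;

    off = sizeof hdr;                       /* commands follow the header */
    for (i = 0; i < hdr.ncmds; i++) {
        struct sketch_load_command lc;

        if (len - off < sizeof lc)          /* off <= len holds throughout */
            return -1;
        memcpy(&lc, image + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - off)
            return -1;

        if (lc.cmd == SKETCH_LC_MAIN) {
            struct sketch_entry_point_command ep;

            if (lc.cmdsize < sizeof ep)
                return -1;
            memcpy(&ep, image + off, sizeof ep);
            *entryoff = ep.entryoff;
            return 0;
        }
        off += lc.cmdsize;
    }
    return -1;
}
```

The returned offset is relative to the start of the file; turning it into an address to jump to depends on where the image ends up mapped, which is exactly the part a real loader has to get right.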

If you happen to come across any interesting implementations of true in-memory loaders for macOS, have figured out some of the missing parts I didn’t get to in this analysis, or found any issues/mistakes etc. please let me know via Twitter.

Thanks for reading!


  1. If we’re to go by Apple’s old documentation, the motivation for this API is to allow loading of plug-ins in applications that might want them. Bundles (in the Mach-O sense) fit this purpose rather well, though, as shown, it’s trivial to turn a binary into a bundle. ↩︎

  2. The GNU demangler that’s part of Ghidra threw a bunch of errors and did not demangle any names, so I ran the decompiled code through c++filt manually and adjusted as necessary. Still, I may have made some errors re: parameters, which was fine for my purposes but may not be for yours. ↩︎