Mach-O file format - Value of `fileoff` field in LC_SEGMENT_64 load command

I compiled a simple program such as

int main()
{
    return 0; 
}

using Clang into an executable and asked otool to report the load commands generated by the compiler. The one I'm interested in is LC_SEGMENT_64, in particular the one that describes the __TEXT segment within the file. The description I get is this:

$ otool -lV foo
foo:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
  fileoff 0
 filesize 0
  maxprot ---
 initprot ---
   nsects 0
    flags (none)
Load command 1
      cmd LC_SEGMENT_64
  cmdsize 312
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000001000
  fileoff 0
 filesize 4096
  maxprot rwx
 initprot r-x
   nsects 3
    flags (none)
Section
  sectname __text
   segname __TEXT
      addr 0x0000000100000f90
      size 0x000000000000000f
    offset 3984
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
 reserved1 0
 reserved2 0

My question is: why is the fileoff field in the second load command set to zero?

Apple's documentation for this field states that

The file is mapped starting at fileoff to the beginning of the segment in memory, vmaddr.

This initially led me to believe that this field, together with filesize, told the loader something like this: "Take the contents of the file from fileoff to fileoff + filesize; this is the sequence of instructions you're going to ask the processor to run". But that assumption can't hold if the value is zero, because the file begins with the Mach-O header rather than with code.

I then thought that, since the segment has at least one section, the loader would use the offset in each section's description to locate the code to run, so the segment's fileoff isn't really needed. Indeed, the first section in this segment has a value in its offset field (3984 in this case, which I validated with otool -s __TEXT __text -j foo; it is the offset at which this section is located within the file).

But, if I do the same thing to the object file generated from the same source file (i.e. a file with type MH_OBJECT instead of MH_EXECUTE), the result I get is this:

$ otool -lV foo.o
foo.o:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 312
  segname
   vmaddr 0x0000000000000000
   vmsize 0x0000000000000070
  fileoff 464
 filesize 112
  maxprot rwx
 initprot rwx
   nsects 3
    flags (none)
Section
  sectname __text
   segname __TEXT
      addr 0x0000000000000000
      size 0x000000000000000f
    offset 464
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
 reserved1 0
 reserved2 0

In this case, the load command does have a value for its fileoff field, which is the same as the one for its first section, __text.
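
Both dumps can be checked against one relation: a section's file offset equals the segment's fileoff plus the section's distance from the segment's vmaddr. The following is a minimal C sketch of that arithmetic using the numbers printed by otool above. The struct and function names are hypothetical, trimmed-down stand-ins (not the definitions from <mach-o/loader.h>), and this is not meant to describe how the loader is actually implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced views of a segment and a section load command;
   only the fields needed for the arithmetic are kept. */
struct seg_info  { uint64_t vmaddr, vmsize, fileoff, filesize; };
struct sect_info { uint64_t addr, size; uint32_t offset; };

/* File offset that backs a virtual address inside the segment, assuming the
   segment maps file bytes [fileoff, fileoff + filesize) starting at vmaddr. */
uint64_t vmaddr_to_fileoff(const struct seg_info *seg, uint64_t addr)
{
    /* The address must fall inside the file-backed part of the segment. */
    assert(addr >= seg->vmaddr && addr - seg->vmaddr < seg->filesize);
    return seg->fileoff + (addr - seg->vmaddr);
}

/* Values copied from the two otool dumps above. */
const struct seg_info  exe_text_seg  = { 0x100000000ULL, 0x1000, 0, 4096 };
const struct sect_info exe_text_sect = { 0x100000f90ULL, 0xf, 3984 };
const struct seg_info  obj_seg       = { 0x0, 0x70, 464, 112 };
const struct sect_info obj_text_sect = { 0x0, 0xf, 464 };
```

Calling vmaddr_to_fileoff(&exe_text_seg, exe_text_sect.addr) gives 0 + 0xf90 = 3984, and vmaddr_to_fileoff(&obj_seg, obj_text_sect.addr) gives 464 + 0 = 464, so both files satisfy the same relation even though their fileoff values differ.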

Answers


otool makes it hard to see, but the answer is simple. Observe:

$ jtool -v -l /tmp/a | grep SEG
LC 00: LC_SEGMENT_64          Mem: 0x000000000-0x100000000  File: Not Mapped    ---/--- __PAGEZERO
LC 01: LC_SEGMENT_64          Mem: 0x100000000-0x100001000  File: 0x0-0x1000    r-x/rwx __TEXT
LC 02: LC_SEGMENT_64          Mem: 0x100001000-0x100002000  File: 0x1000-0x1098 r--/rwx __LINKEDIT

The __TEXT segment is mapped from the beginning of the file (or of the slice, if the file is fat, i.e. "universal"). That is, the mapping starts with the Mach-O header and load commands, which is why fileoff is 0. This is actually a feature, because dyld (your friendly loader) then parses the mapped Mach-O for the other load commands (notably the ones naming libraries). The other point is that __TEXT.__text is often in the very same page as the header, so you'd have to map the whole page anyway.
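
To illustrate that the start of the mapped __TEXT segment really is the Mach-O header, here is a minimal C sketch that inspects the first bytes of the segment (file offset 0) and reports where the load commands end. The struct and function names are hypothetical; the struct only mirrors the eight 32-bit fields of mach_header_64, and the sketch assumes a native-endian 64-bit, non-fat file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MACHO_MAGIC_64 0xfeedfacfu   /* value of MH_MAGIC_64 */

/* Hypothetical mirror of the mach_header_64 layout: eight 32-bit fields. */
struct header64 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags, reserved;
};

/* Given the first bytes of the mapped __TEXT segment, return the offset just
   past the header and load commands, or 0 if the buffer does not begin with
   a native-endian 64-bit Mach-O header. */
uint64_t end_of_load_commands(const unsigned char *text_start, size_t len)
{
    struct header64 h;

    assert(text_start != NULL);
    if (len < sizeof h)
        return 0;                      /* too short to hold a header */
    memcpy(&h, text_start, sizeof h);  /* avoid unaligned access */
    if (h.magic != MACHO_MAGIC_64)
        return 0;                      /* not a 64-bit Mach-O header */
    return sizeof h + (uint64_t)h.sizeofcmds;
}
```

In an executable like the one in the question, the bytes between the returned offset and the __text section (offset 3984) are typically just padding, so the header, the load commands, and the code all share the single 4096-byte page that __TEXT maps.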

