Matt Galloway

My home on the 'net.

A look inside blocks: Episode 1

Today I have been taking a look at the internals of how blocks work from a compiler perspective. By blocks, I mean the closure that Apple added to the C language and is now well and truly established as part of the language from a clang/LLVM perspective. I had been wondering just what a “block” was and how it magically seems to appear as an Objective-C object (you can copy, retain, release them for instance). This blog post delves into blocks a little.

The basics

This is a block:

1
2
3
void(^block)(void) = ^{
    NSLog(@"I'm a block!");
};

This creates a variable called block which has a simple block assigned to it. That’s easy. Done right? No. I wanted to understand what exactly the compiler does with that bit of code.

Further more, you can pass variables to block:

1
2
3
void(^block)(int a) = ^{
    NSLog(@"I'm a block! a = %i", a);
};

Or even return values from them:

1
2
3
4
int(^block)(void) = ^{
    NSLog(@"I'm a block!");
    return 1;
};

And being a closure, they wrap up the context they are in:

1
2
3
4
int a = 1;
void(^block)(void) = ^{
    NSLog(@"I'm a block! a = %i", a);
};

So just how does the compiler sort all of these bits out then? That is what I was interested in.

Diving into a simple example

My first idea was to look at how the compiler compiles a very simple block. Consider the following code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#import <dispatch/dispatch.h>

typedef void(^BlockA)(void);

__attribute__((noinline))
void runBlockA(BlockA block) {
    block();
}

void doBlockA() {
    BlockA block = ^{
        // Empty block
    };
    runBlockA(block);
}

The reason for the two functions is that I wanted to see both how a block is “called” and how a block is set up. If both of these were in one function then the optimiser might be too clever and we wouldn’t see anything interesting. I had to make the runBlockA function noinline so that the optimiser didn’t just inline that function in doBlockA reducing it to the same problem.

The relevant bits of that code compiles down to this (armv7, O3):

1
2
3
4
5
6
7
8
    .globl  _runBlockA
    .align  2
    .code   16                      @ @runBlockA
    .thumb_func     _runBlockA
_runBlockA:
@ BB#0:
    ldr     r1, [r0, #12]
    bx      r1

This is the runBlockA function. So, that’s fairly simple then. Taking a look back up to the source for this, the function is just calling the block. r0 (register 0) is set to the first argument of the function in the ARM EABI. The first instruction therefore means that r1 is loaded from the value held in the adress stored in r0 + 12. Think of this as a dereference of a pointer, reading 12 bytes into it. Then we branch to that address. Notice that r1 is used, which means that r0 is still the block itself. So it’s likely that the function this is calling takes the block as its first parameter.

From this I can ascertain that the block is likely some sort of structure where the function the block should execute is stored 12 bytes into said structure. And when a block is passed around, a pointer to one of these structures is passed.

Now onto the doBlockA method:

1
2
3
4
5
6
7
8
9
10
    .globl  _doBlockA
    .align  2
    .code   16                      @ @doBlockA
    .thumb_func     _doBlockA
_doBlockA:
    movw    r0, :lower16:(___block_literal_global-(LPC1_0+4))
    movt    r0, :upper16:(___block_literal_global-(LPC1_0+4))
LPC1_0:
    add     r0, pc
    b.w     _runBlockA

Well, that’s pretty simple also. This is a program counter relative load. You can just think of this as loading the address of the variable called ___block_literal_global into r0. Then the runBlockA function is called. So given we know that the block object is being passed to runBlockA, this ___block_literal_global must be that block object.

Now we’re getting somewhere! But what exactly is ___block_literal_global? Well, looking through the assembly we find this:

1
2
3
4
5
6
7
    .align  2                       @ @__block_literal_global
___block_literal_global:
    .long   __NSConcreteGlobalBlock
    .long   1342177280              @ 0x50000000
    .long   0                       @ 0x0
    .long   ___doBlockA_block_invoke_0
    .long   ___block_descriptor_tmp

Ah ha! That looks very much like a struct to me. There’s 5 values in the struct, each of which are 4-bytes (long). This must be the block object that runBlockA was acting upon. And look, 12 bytes into the struct is what looks suspiciously like a function pointer as it’s called ___doBlockA_block_invoke_0. Remember that was what the runBlockA function was jumping to.

But what is __NSConcreteGlobalBlock? Well, we’ll come back to that. It’s ___doBlockA_block_invoke_0 and ___block_descriptor_tmp that are of interest since these also appear in the assembly:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
    .align  2
    .code   16                      @ @__doBlockA_block_invoke_0
    .thumb_func     ___doBlockA_block_invoke_0
___doBlockA_block_invoke_0:
    bx      lr

    .section        __DATA,__const
    .align  2                       @ @__block_descriptor_tmp
___block_descriptor_tmp:
    .long   0                       @ 0x0
    .long   20                      @ 0x14
    .long   L_.str
    .long   L_OBJC_CLASS_NAME_

    .section        __TEXT,__cstring,cstring_literals
L_.str:                                 @ @.str
    .asciz   "v4@?0"

    .section        __TEXT,__objc_classname,cstring_literals
L_OBJC_CLASS_NAME_:                     @ @"\01L_OBJC_CLASS_NAME_"
    .asciz   "\001"

That ___doBlockA_block_invoke_0 looks suspiciously like the actual block implementation itself, since the block we used was an empty block. This function just returns straight away, exactly how we’d expect an empty function to be compiled.

Then comes ___block_descriptor_tmp. This appears to be another struct, this time with 4 values in it. The second one is 20 which is how big the ___block_literal_global is. Maybe that is a size value then? There’s also a C-string called .str which has a value v4@?0. This looks like some form of encoding of a type. That might be an encoding of the block type (i.e. it returns void and takes no parameters). The other values I have no idea about.

But the source is out there, isn’t it?

Yes, the source is out there! It’s part of the compiler-rt project within LLVM. Trawling through the code I found the following definitions within Block_private.h:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
struct Block_descriptor {
    unsigned long int reserved;
    unsigned long int size;
    void (*copy)(void *dst, void *src);
    void (*dispose)(void *);
};

struct Block_layout {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
    /* Imported variables. */
};

Those look awfully familiar! The Block_layout struct is what our ___block_literal_global is and the Block_descriptor struct is what our ___block_descriptor_tmp is. And look, I was right about the size being the 2nd value of the descriptor. The bit that’s slightly strange is the 3rd and 4th values of the Block_descriptor. These look like they should be function pointers but in our compiled case they seemed to be 2 strings. I’ll ignore that little point for now.

The isa of Block_layout is interesting as that must be what _NSConcreteGlobalBlock is and also must be how a block can emulate being an Objective-C object. If _NSConcreteGlobalBlock is a Class then the Objective-C message dispatch system will happily treat a block object as a normal object. This is similar to how toll-free bridging works. For more information on that side of things, have a read of Mike Ash’s excellent blog post about it.

Having pieced all that together, the compiler looks like it’s treating the code as something like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#import <dispatch/dispatch.h>

__attribute__((noinline))
void runBlockA(struct Block_layout *block) {
    block->invoke();
}

void block_invoke(struct Block_layout *block) {
    // Empty block function
}

void doBlockA() {
    struct Block_descriptor descriptor;
    descriptor->reserved = 0;
    descriptor->size = 20;
    descriptor->copy = NULL;
    descriptor->dispose = NULL;

    struct Block_layout block;
    block->isa = _NSConcreteGlobalBlock;
    block->flags = 1342177280;
    block->reserved = 0;
    block->invoke = block_invoke;
    block->descriptor = descriptor;

    runBlockA(&block);
}

That’s good to know. It makes a lot more sense now what’s going on under the hood of blocks.

What’s next?

Next up I will take a look at a block that takes a parameter and a block that captures variables from the enclosing scope. These will surely make things a bit different! So, watch this space for more.

Comments