Programming with C Blocks

  On Apple Devices
  by Joachim Bengtsson
  1. What are Blocks?
  2. What are Blocks Good For?
  3. Getting Started
    1. On and for Mac OS X 10.6 Snow Leopard, or for iOS 4
    2. On Mac OS X 10.5 Leopard or for iPhone
  4. Blocks in C
    1. Syntax and Usage
    2. Memory Management
  5. Blocks in Objective-C
  6. Blocks and ARC (Automatic Reference Counting)
  7. Blocks in C++
  8. Block Goodies
  9. References and Additional Sources
  10. Version History and Downloads

0. Introduction

In Mac OS X 10.6, Apple introduced a syntax and runtime for using blocks (more commonly known as closures) in C and Objective-C. These were both later back-ported to Mac OS X 10.5 and the iPhone by Plausible Labs. However, there is very little documentation on blocks; even Apple's documentation that comes with 10.6 is very vague on what memory management rules apply. I hope that, with the help of this guide, you can avoid all the trial and error that I went through.

This is sort of a bottom-up tutorial, going from the absolute basics in C, and moving towards higher abstractions in Objective-C. I doubt that reading just the Objective-C part will do you much good; at least, read the chapter on memory management in C first.

Have fun!
  Joachim Bengtsson

1. What are Blocks?

Blocks are like functions, but written inline with the rest of your code, inside other functions. They are also called closures, because they close around variables in your scope. (They can also be called lambdas.) Let me demonstrate what this means:

counter.zip
#include <stdio.h>
#include <Block.h>
typedef int (^IntBlock)();

IntBlock counter(int start, int increment) {
	__block int i = start;
	
	return Block_copy( ^ {
		int ret = i;
		i += increment;
		return ret;
	});
	
}

int main() {
	IntBlock mycounter = counter(5, 2);
	printf("First call: %d\n", mycounter());
	printf("Second call: %d\n", mycounter());
	printf("Third call: %d\n", mycounter());
	
	Block_release(mycounter);
	
	return 0;
}
/* Output:
	First call: 5
	Second call: 7
	Third call: 9
*/
			

counter is an ordinary C function that returns one of these fabled blocks. When you have a reference to one, you can call it, just as if it was a function pointer. The difference is that the block can use the variables in counter even after the call to that function has returned! The variables i and increment have become a part of the state of the block mycounter. I will go into more detail of how each part of this example works in chapter 4.

2. What are Blocks Good For?

In Ruby, they are used in place of many control structures. The standard for loop is replaced by an ordinary function taking a block:


[1, 2, 3, 4].each do |i|
	puts i
end
# outputs the numbers 1, 2, 3, 4 on a line each
	

each is an ordinary function that takes a block: the block is the code between the 'do' and the 'end'. This is much more powerful than one might think, because with the ability to write your own control structures, you no longer need to depend on the evolution of the language to make your code terse and readable. Functional programming has many such useful control structures, such as the map and reduce functions; the first maps each value in a list to another value, while the second reduces a list of values to a single value (e.g. the sum of the integers in the list). This is an entire topic in itself, and I encourage you to learn more about functional programming on Wikipedia, as I won't be covering it any further here.
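
To make map and reduce a little more concrete, here is a minimal sketch in plain C. It uses ordinary function pointers rather than blocks so that it builds with any C compiler; the names map_ints, reduce_ints, square and add are made up for this illustration and are not part of any library. With blocks, the fn parameters would be block pointers and the callbacks could be written inline at the call site.

```c
#include <stddef.h>

/* Hypothetical helpers for this sketch; not part of any library. */

/* map: apply fn to every element of in and store the results in out. */
void map_ints(const int *in, int *out, size_t count, int (*fn)(int))
{
	for (size_t i = 0; i < count; i++)
		out[i] = fn(in[i]);
}

/* reduce: fold the array into a single value, starting from initial. */
int reduce_ints(const int *in, size_t count, int initial,
                int (*fn)(int accumulated, int value))
{
	int result = initial;
	for (size_t i = 0; i < count; i++)
		result = fn(result, in[i]);
	return result;
}

/* Two small callbacks to pass to the helpers above. */
int square(int x) { return x * x; }
int add(int a, int b) { return a + b; }
```

Mapping the array {1, 2, 3, 4} through square and then reducing the result with add, starting from 0, gives the sum of the squares, 30.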

In Erlang, they are used as a concurrency primitive together with the 'light-weight process', instead of the thread. This simple example looks through an array for a match, but each element is tested in its own separate process, so all elements are tested at the same time.

some_function() ->
  lists:foreach(fun(Element) -> % (1)
    spawn(fun() -> % (2)
      if
        Element > 1 andalso Element < 4 -> % (3)
          io:format("Found a match: ~p!~n", [Element]); % (4)
        true -> true
      end
    end)
  end, [1, 2, 3, 4]),
  io:format("This line appears after the above statement, "
            "but still executes before the code in it.~n"). % (5)

% Outputs:
% This line appears after the above statement, but still executes before the code in it.
% Found a match: 3!
% Found a match: 2!
% OR with the lines swapped in any order, depending on scheduling

Erlang looks weird to the uninitiated, so I'll step through it for you. On the line numbered (1), we call the function lists:foreach with a block taking one argument as the first argument and, at the end of the call, a list of four numbers as the second (just as the function Enumerable#each takes a block argument in the Ruby example above). Variables in Erlang start with a capital letter, hence Element. The block begins at the -> and goes on until the last end. All that first block does is spawn a new Erlang process (line (2)), again taking a block as an argument to do the actual test, but now THIS block (line (2) still) is executing concurrently, and thus the test on line (3) is done concurrently for all elements in the array.

In Cocoa, they can be used in both these ways, and more; for example, they are good for callbacks and delayed execution. Examples of these uses can be found at mikeash.com [4]. Apple's main reason for introducing blocks in C is most likely because they are perfect building blocks for concurrency. You can read more about Apple's Grand Central Dispatch in [5].

3. Getting Started

On and for Mac OS X 10.6 Snow Leopard, or for iOS 4

  1. You're done! Block support is an integral part of Snow Leopard and iOS 4 and there's nothing you need to do to use them. Even the standard library for blocks is included in libSystem, so you always automatically link with it.

On Mac OS X 10.5 Leopard or for iPhone

If you want to use blocks in applications targeting Mac OS X 10.5 or iPhone OS 2 or 3, you need to use the third-party GCC fork Plausible Blocks. From their site:

Plausible Blocks (PLBlocks) provides a drop-in runtime and toolchain for using blocks in iPhone 2.2+ and Mac OS X 10.5 applications. Both the runtime and compiler patches are direct backports from Apple's Snow Leopard source releases.

PLBlocks is provided by Plausible Labs.

  1. Download the latest plblocks disk image for your OS from the Google Code page.
  2. Mount the disk image and run the installer. When done, keep the disk image mounted — you'll need it later.
  3. Restart Xcode if it was running, and open the project in which you want to use blocks.
  4. Open either the project-wide settings or the target settings (cmd-opt-E) for the specific target you want to use blocks in.
  5. Search for "Compiler".
  6. From the drop-down for the setting "C/C++ Compiler Version", choose "GCC 4.2 (Plausible Blocks)".
  7. On the disk image with the plblocks installer, you'll find a folder called "PLBlocks Runtime". Save this to somewhere on your disk.
  8. Now, for the target that will use blocks, you will need to link to the blocks runtime.
    • For an iPhone app, just drag the iPhone framework from the "Runtime" folder into your project and choose to link with your target.
    • For a Mac OS X 10.5 app, drag the Mac framework into your project and link with your target, and add an embed framework build phase. If you are unfamiliar with how to do this, read my guide on framework embedding (Note: You might only need to follow steps 1 through 3).

4. Blocks in C

Syntax and Usage

Variables pointing to blocks take on the exact same syntax as variables pointing to functions, except that ^ takes the place of *. For example, this is a function pointer to a function taking an int and returning a float:

float (*myfuncptr)(int);

and this is a block pointer to a block taking an int and returning a float:

float (^myblockptr)(int);

As with function pointers, you'll likely want to typedef those types, as it can get relatively hairy otherwise. For example, a pointer to a block returning a block taking a block would be something like void (^(^myblockptr)(void (^)()))();, which is nigh impossible to read. A simple typedef later, and it's much simpler:

typedef void (^Block)();
Block (^myblockptr)(Block);

Declaring blocks themselves is where we get into the unknown, as it doesn't really look like C, although they resemble function declarations. Let's start with the basics:

myvar1 = ^ returntype (type arg1, type arg2, and so on) {
	block contents;
	like in a function;
	return returnvalue;
};

This defines a block literal (from after = to and including }), explicitly mentions its return type, an argument list, the block body, a return statement, and assigns this literal to the variable myvar1.

A literal is a value that can be built at compile-time. An integer literal (The 3 in int a = 3;) and a string literal (The "foobar" in const char *b = "foobar";) are other examples of literals. The fact that a block declaration is a literal is important later when we get into memory management.

Finding a return statement in a block like this is vexing to some. Does it return from the enclosing function, you may ask? No, it returns a value that can be used by the caller of the block. See 'Calling blocks'. Note: If the block has multiple return statements, they must return the same type.

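Finally, some parts of a block declaration are optional. These are:

    • The return type. If you leave it out, the compiler infers it from the return statements in the body, or uses void if there are none.
    • The argument list. If the block takes no arguments, you can leave out the parentheses altogether, as in ^ { return 3; }.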

Calling blocks and returning values is as easy as with a function or a function pointer. If you take the value from calling the block, you will get the value returned with return. Example:

typedef int(^IntBlock)();
IntBlock threeBlock = ^ { 
	return 3;
};
int three = threeBlock();

// Return values work just as in C functions. return needs to be explicit! This is not Ruby.
IntBlock fourBlock = ^ {
	4;
};
// Yields on compile:
//  error: incompatible block pointer types initializing 'void (^)(void)', expected 'IntBlock'
// This is because we neither specified the return type, 
// nor provided a return statement, thus implying void return.

		

Using variables in the closure scope is very straightforward if you're just reading them. Just use them, and they will be magically managed by your block. However, if you want to be able to modify a variable, it needs to be declared with the __block storage qualifier. Let's make the counter from the first example go forwards AND backwards:

counter2.zip
#include <stdio.h>
#include <Block.h>

typedef int (^IntBlock)();
typedef struct {
	IntBlock forward;
	IntBlock backward;
} Counter;

Counter MakeCounter(int start, int increment) {
	Counter counter;

	__block int i = start;

	counter.forward = Block_copy( ^ {
		i += increment;
		return i;
	});
	counter.backward = Block_copy( ^ {
		i -= increment;
		return i;
	});

	return counter;

}

int main() {
	Counter counter = MakeCounter(5, 2);
	printf("Forward one: %d\n", counter.forward());
	printf("Forward one more: %d\n", counter.forward());
	printf("Backward one: %d\n", counter.backward());

	Block_release(counter.forward);
	Block_release(counter.backward);

	return 0;
}
/* Outputs:
Forward one: 7
Forward one more: 9
Backward one: 7
*/
		

Note how the blocks use increment without any extra work on our part, even though they run after MakeCounter has returned. That works because we only read from it. We also use i, but we modify it from inside the blocks; thus, that variable needs the __block keyword.
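
The difference between the two kinds of capture can be imitated in plain C. This is only an analogy under stated assumptions, not what the compiler really generates: the struct captures and its field names are hypothetical. A plain capture behaves like a const copy taken when the block is created, while a __block variable behaves like storage that the block and the enclosing scope share.

```c
/* Emulated capture record; the names are made up for this sketch. */
struct captures {
	const int increment_copy; /* plain capture: a snapshot of the value */
	int *shared_i;            /* __block-style capture: shared storage */
};

/* Plays the role of the forward block: modifies the shared variable
   using the snapshot of increment. */
int step(struct captures *c)
{
	*c->shared_i += c->increment_copy;
	return *c->shared_i;
}
```

Changing increment after the record has been set up does not affect step(), because the record holds its own copy; changing i does, because the record only points at it.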

Sending and taking blocks as arguments is again like doing so with function pointers. The difference is that you can define your block inline, in the call.

blockarguments.zip
#include <stdio.h>
#include <Block.h>

void intforeach(int *array, unsigned count, void(^callback)(int))
{
	for(unsigned i = 0; i < count; i++)
		callback(array[i]);
}

int main (int argc, const char * argv[]) {
	int numbers[] = {72, 101, 108, 108, 111, 33};
	
	intforeach(numbers, 6, ^ (int number) {
		printf("%c", number);
	});
	printf("\n");
	
	return 0;
}
/* Outputs:
Hello!
*/

Notice how we call intforeach with an inline block, and how intforeach could have been an ordinary C function taking a function pointer and still be written the exact same way, except for switching the ^ for a *. (If you're wondering about the numbers: they are the ASCII codes for the characters in the string "Hello!".)
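
For comparison, here is a sketch of that function-pointer variant. The names intforeach_fp, collect_char, collected and collected_len are hypothetical and exist only for this illustration. Since a plain function pointer carries no captured state, the callback has to keep its result in a file-scope buffer, which is exactly the kind of bookkeeping a block saves you.

```c
#include <stddef.h>

/* Same shape as intforeach, with the ^ swapped for a *. */
void intforeach_fp(const int *array, unsigned count, void (*callback)(int))
{
	for (unsigned i = 0; i < count; i++)
		callback(array[i]);
}

/* A function pointer cannot capture anything, so this sketch collects
   the characters in a file-scope buffer instead. */
char collected[16];
size_t collected_len = 0;

/* Append one character to the buffer, keeping it NUL-terminated and
   silently dropping characters that would not fit. */
void collect_char(int number)
{
	if (collected_len + 1 < sizeof collected) {
		collected[collected_len++] = (char)number;
		collected[collected_len] = '\0';
	}
}
```

Calling intforeach_fp(numbers, 6, collect_char) with the numbers array from the example above leaves the string "Hello!" in collected.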

Memory Management

What does 'memory management' in the context of blocks mean? It doesn't mean storage for the actual code. The code of the block is compiled and loaded into the binary like any other function. The memory a block requires is that of the variables it has closed around; that is, any variables the block references need to be copied into the block's private memory.
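
One way to picture this is to write the counter from chapter 1 by hand in plain C, with the captured variables stored next to a pointer to the code. This is a rough sketch under stated assumptions: struct counter_closure, counter_invoke and make_counter are hypothetical names, and the real layout is decided by the compiler and the blocks runtime.

```c
/* Hand-written stand-in for a block; not the real runtime layout. */
struct counter_closure {
	int (*invoke)(struct counter_closure *self); /* the compiled code */
	int i;         /* captured and modified, like a __block variable */
	int increment; /* captured by value and only read */
};

/* The body of the block: it reaches its variables through self. */
int counter_invoke(struct counter_closure *self)
{
	int ret = self->i;
	self->i += self->increment;
	return ret;
}

/* Builds the closure; returning it by value copies the state out. */
struct counter_closure make_counter(int start, int increment)
{
	struct counter_closure c = { counter_invoke, start, increment };
	return c;
}
```

The code itself is a single ordinary function; only the small struct of captured values has to be stored somewhere, and that storage is what the rest of this chapter is about.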

So far, we have just assumed that the memory has somehow magically become part of the block, and will magically disappear. Unfortunately Apple haven't added a garbage collector to C, so that's not quite the case. However, to understand the following, you must have a basic understanding of stack and heap memory, and how they differ.

When you define a block literal, you create the storage for this literal on the stack. The variable pointing to this literal can still be considered a pointer, however. This means that the following code compiles, but does not work:

typedef int(^Block)(void);

Block blockMaker() {
	int a = 3; // (1)
	Block block = ^ { // (2)
		return a;
	};
	return block; // (3)
}
int main() {
	Block block2 = blockMaker(); // (4)
	int b = block2(); // (5)
	return 0;
}

This is basically what the code above does:

Block blockMaker() {
	int a = 3; // (1)
	struct Block_literal_1 *block;
	struct Block_literal_1 blockStorage = ...; // (2)
	block = &blockStorage; // (2b)
	return block; // (3)
}

At (1), the value 3 is allocated on the stack and named a, as expected. At (2) and the following block, the value of a is copied into the block literal. Then, also at (2) (2b in the second example), a pointer to the literal is in turn assigned to the variable block. If you think of Block as just a pointer type, and the assignment in (2) as taking the address of a literal, you might see why (3) is invalid, but we'll get to that.

In (4), the variable block2 gets a reference to the literal in (2). However, both the variables a and block (together with the block's copy of the value in a) have now fallen off the stack, as blockMaker has returned. When we call block2 in (5), we might segfault, get a corrupted value, or whatever; the behavior is undefined.

The same effect can be demonstrated without involving blocks:

int * intMaker() {
	int a = 3; // (1)
	return &a; // (2)
}
int main() {
	int *b = intMaker(); // (3)
	return 0;
}

intMaker returns a pointer to an object on the stack (1), which will disappear together with the rest of the state of the function call when intMaker returns (2).

How do we work around this problem? Simple: we move the block to the heap. The function Block_copy() takes a block pointer, and if it's a stack block, copies it to the heap, or if it's already a heap block, increases its retain count (like an immutable object in Cocoa would do). Exactly what happens is an implementation detail, but the thing to take away is that you should never return a block literal from a function, but rather a copy of it.

When we're done with the block, just release it with Block_release(). Thus, the correct way to implement the blockMaker example is like so:

typedef int(^Block)(void);

Block blockMaker() {
	int a = 3;
	Block block = ^ {
		return a;
	};
	return Block_copy(block); // (1)
}
int main() {
	Block block2 = blockMaker();
	int b = block2();
	Block_release(block2); // (2)
	
	return 0;
}

Notice how we move the block to the heap in (1), and discard the block when we're done with it in (2). For anyone who has done Cocoa or CoreFoundation programming, this pattern should be familiar.
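
The copy-or-retain behavior described above can be imitated in plain C. The following is only a sketch under stated assumptions: struct closure, closure_copy and closure_release are hypothetical stand-ins, and the real Block_copy() and Block_release() live in the blocks runtime and work on a different layout.

```c
#include <stdlib.h>

/* Hypothetical closure record; not the real block layout. */
struct closure {
	int on_heap;      /* 0 for a stack record, 1 for a heap copy */
	int retain_count; /* only meaningful for heap copies */
	int captured_a;   /* the captured value */
};

/* Stack record: copy it to the heap. Heap record: bump its count. */
struct closure *closure_copy(struct closure *c)
{
	if (c->on_heap) {
		c->retain_count++;
		return c;
	}
	struct closure *heap = malloc(sizeof *heap);
	if (heap == NULL)
		return NULL;
	*heap = *c;
	heap->on_heap = 1;
	heap->retain_count = 1;
	return heap;
}

/* Drop one reference and free the heap copy when the last one goes.
   Returns 1 if the storage was freed, 0 otherwise. */
int closure_release(struct closure *c)
{
	if (!c->on_heap)
		return 0;
	if (--c->retain_count > 0)
		return 0;
	free(c);
	return 1;
}
```

Copying a stack record yields a new heap record with a count of one; copying that heap record again returns the same pointer with a count of two, and it takes two releases before the memory is freed.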

(An aside: If you try to return a literal like so: Block foo() { int i = 3; return ^ { printf("Fail %d", i); }; }, the compiler will complain that you are trying to return a stack literal, which isn't possible.)

Be careful! You don't have to return from a function for something to fall off the stack. The following example is equally invalid ([2] and [4]):

typedef void(^BasicBlock)(void);
void someFunction() {
	BasicBlock block;
	if(condition) {
		block = ^ { ... };
	} else {
		block = ^ { ... };
	}
	...
}

// Basically equivalent of:
void someFunction() {
	BasicBlock block;
	if(condition) {
		struct Block_literal_1 blockStorage = ...;
		block = &blockStorage;
	} // blockStorage falls off the stack here
	else 
	{
		struct Block_literal_1 blockStorage = ...;
		block = &blockStorage;
	} // blockStorage falls off the stack here
	// and block thus points to non-existing/invalid memory
	...
}

// Correct:
void someFunction() {
	BasicBlock block;
	if(condition) {
		block = Block_copy(^ { ... });
	} else {
		block = Block_copy(^ { ... });
	}
	...
}

5. Blocks in Objective-C

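The really weird and wonderful thing about blocks is that blocks are actually Objective-C objects. Even if you create them from a C++ library, the specification says that every block must have the memory layout of an Objective-C object. The runtime then adds Objective-C methods to them, which allow them to be stored in collections, used with properties, and just generally work wherever you'd expect. These methods are:

    • -[Block copy]: the same as Block_copy().
    • -[Block retain]: retains a heap block, but does nothing on a stack block.
    • -[Block release]: the same as Block_release().
    • -[Block autorelease]: a release that is deferred to the current autorelease pool.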

Blocks in Objective-C have one more very important difference from blocks in C in handling variables that reference objects. All local objects are automatically retained as they are referenced! If you reference an instance variable from a block declared in a method, this retains self, as you're implicitly doing self->theIvar. An example is in order:

logmessage.zip
typedef void(^BasicBlock)(void);
@interface LogMessage : NSObject {
	NSString *logLevel;
}
@end
@implementation LogMessage
-(BasicBlock)printLater:(NSString*)someObject;
{
	return [[^ {
		NSLog(@"%@: %@", 
			logLevel,  // (1)
			someObject // (2)
		);
	} copy] autorelease]; // (3)
}
@end

Here's a method that simply returns a block that lets you print the given string, prefixed by the object's log level. In (3), the block is copied, because as you remember, you can't return a block literal since it's on the stack. Still, we want to follow common Cocoa patterns and never return an owning reference from a method called neither copy, retain nor alloc. This gives us the idiom [[^{} copy] autorelease], which is what you should always use when returning blocks.

Now, for the auto-retaining magic, notice how we reference the argument object someObject in (2), and implicitly self in (1) (which really says self->logLevel). When the block is copied in (3), Block_copy notices that self and someObject are objects, and retains them. When the block is released, it will free all its captured variables, and if they are objects, release them.
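
The retain-on-copy and release-on-destroy pairing can be imitated in plain C. Again, this is a sketch under stated assumptions and not the runtime's real code: struct object stands in for an Objective-C object, and log_closure_copy and log_closure_dispose are hypothetical names for what happens when a block holding one object pointer is copied and later destroyed.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for an NSObject.
   A real release would also deallocate the object at zero. */
struct object {
	int retain_count;
};

void object_retain(struct object *o)  { o->retain_count++; }
void object_release(struct object *o) { o->retain_count--; }

/* Emulated block that has captured one object pointer. */
struct log_closure {
	struct object *captured;
};

/* Copying the block to the heap retains the captured object ... */
struct log_closure *log_closure_copy(const struct log_closure *stack_block)
{
	struct log_closure *heap = malloc(sizeof *heap);
	if (heap == NULL)
		return NULL;
	*heap = *stack_block;
	object_retain(heap->captured);
	return heap;
}

/* ... and destroying the copy releases it again. */
void log_closure_dispose(struct log_closure *heap)
{
	object_release(heap->captured);
	free(heap);
}
```

As long as the heap copy is alive, the captured object's count stays one higher than before, which is what keeps the object valid for the block and also what makes reference cycles possible.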

With even self being retained, this is a very easy way to accidentally create reference cycles and thus memory leaks. What if you want to avoid this behavior? Just give the variable __block storage (Thanks to mikeash[4] for pointing this out). Example:

__blockself.zip
-(void)someMethod;
{
	__block TypeOfSelf *blockSelf = self;
	^ {
		// Because blockSelf is __block, the following reference
		// won't retain self:
		blockSelf->myIvar += 3;
	}
	...
}

Why does this work? Well, if the variable is __block, it can be changed from within a block. If it's an object pointer, this means changing the object pointer itself, not the object. If the block autoretains object pointers, what should happen if the pointer is changed, pointing to another object? The concept is so hairy that they chose the simplest solution: __block storage objects simply aren't autoretained.

Finally, one more memory management gotcha before we move on to syntax. Remember that blocks are objects? And objects are automatically retained? Yes, that's right, blocks also automatically retain blocks they refer to! (My research.) This means that the following code will work fine:

typedef void(^BasicBlock)(void);

// Returns a block that aborts the process
-(BasicBlock)doSomethingAsynchronous;
{
	BasicBlock cleanup = [[^{
		// Do some common cleanup needed in all cases
	} copy] autorelease];
	
	__block AsyncDownloader *downloader = [AsyncDownloader fetch:@"http://domain/some.file" options:$dict(
		@"success", [[^ {
			[downloader.dataValue writeToFile:@"some/path" atomically:NO];
			DisplayAlert(@"Download complete");
			cleanup();
		} copy] autorelease],
		@"failure", [[^ {
			DisplayAlert(@"Error: %@", downloader.error);
			cleanup();
		} copy] autorelease]
	)];
	
	return [[^ {
		[downloader abort];
		cleanup();
	} copy] autorelease];
}

Notice how downloader is declared __block: otherwise my referencing it in the callback blocks would retain it, and since the downloader retains the blocks, that would create a cycle. $dict is a very handy macro which creates a dictionary from key-value pairs (the NSDictionary constructor is too verbose for my taste).

Notice how all three blocks reference the cleanup block: it is thus retained and properly memory managed until the referring blocks disappear.

This example also highlights a problem with stack block literals: collections, such as NSDictionary above, retain their values, but -[Block retain] doesn't do anything on stack blocks! Thus, we must move the blocks to the heap before we can insert them into the dictionary. For example, to insert a few delayed actions into an array, you'd do it like this:

arrayofblocks.zip
NSArray *someActions = [NSArray arrayWithObjects:
	[[^ { NSLog(@"Hello");   } copy] autorelease],
	[[^ { NSLog(@"World!");  } copy] autorelease],
	[[^ { NSLog(@"Awesome.");} copy] autorelease],
	nil
];

for (void(^block)() in someActions) {
	block();
}
/* Outputs:
2009-08-23 13:06:06.514 arrayofblocks[32449:a0f] Hello
2009-08-23 13:06:06.517 arrayofblocks[32449:a0f] World!
2009-08-23 13:06:06.517 arrayofblocks[32449:a0f] Awesome.
*/
	

In the syntax department, you might have two questions. How do I use non-typedef'd blocks as method arguments, and can I use them as properties? This example should make that clear:

propertyblocks.zip
#import <Foundation/Foundation.h>
#import <stdlib.h>

@interface PredicateRunner : NSObject
{
	BOOL (^predicate)();
}
@property (copy) BOOL (^predicate)();
-(void)callTrueBlock:(void(^)())true_ falseBlock:(void(^)())false_;
@end

@implementation PredicateRunner
@synthesize predicate;
-(void)callTrueBlock:(void(^)())true_ falseBlock:(void(^)())false_;
{
	if(predicate())
		true_();
	else 
		false_();

}
-(void)dealloc;
{
	self.predicate = nil;
	[super dealloc];
}
@end



int main (int argc, const char * argv[]) {
	NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
	srandom(time(NULL));

	PredicateRunner *pr = [[PredicateRunner new] autorelease];
	pr.predicate = ^ BOOL {
		return random()%2;
	};
	[pr callTrueBlock:^ {NSLog(@"Yeah");} falseBlock:^ {NSLog(@"Nope");} ];

	[pool drain];
	return 0;
}

Typedefs will make that much easier to read, though.

6. Blocks and ARC (Automatic Reference Counting)

Mike Ash has an excellent run-down of block usage with ARC[0]. In summary:

7. Blocks in C++

See [1] and [2] for details on how blocks work with C++ variables. In essence: Copy constructors and destructors are being called as expected. Read-only variables are const-copied into the block.

8. Block Goodies

Code snippets, libraries and frameworks using blocks to simplify working with Objective-C are popping up here and there as Mac and iPhone developers are figuring Blocks out and applying them to their common tasks. I'll add them here as I find them. Post in the comments if there's something I'm missing!

9. References and Additional Sources

If you want to read more about blocks, the following links are great places to keep reading:

10. Version History

This site is versioned, so to see exactly what has changed, or to download the entire site with samples and all, check out the repository.
