Core Foundation: Basics - WWDC 2000

Mac OS • 33:00

Core Foundation provides a rich set of C APIs for strings, collections, user preferences, property lists, plug-ins, XML handling, and more. These APIs are designed with performance, consistency, and portability in mind, and are available in the Carbon API set. This session covers Core Foundation fundamentals, conventions, and paradigms, as well as the basic set of services and APIs.

Speaker: Becky Willrich

Unlisted on Apple Developer site

Check out Bezel, our iPhone mirroring app →

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

We're going to start by talking about CF's role in Mac OS X as a whole. Then we're going to look at some of the concepts and paradigms there. And finally, in this talk, we'll be introducing the basic CF types. So, when I talk about Core Foundation, this is the big take-home point. Core Foundation is the lingua franca in Mac OS X. Well, what does that mean? A lingua franca is a medium of communication between people of different languages. So what are the different languages on Mac OS X?

Carbon and Cocoa Core Foundation is a substrate layer that lives beneath both Carbon and Cocoa, so it provides the common language by which those two environments speak to one another. As such, it bridges the basic types between both stacks, and provides those common services that both stacks need to share, things like the pasteboard, the event model. So, just as an example here, you see the Carbon types on the right and the Cocoa types on the left. Core Foundation acts to convert to and from the types on either side.

Hopefully you've seen this diagram by now. This is the basic architecture on Mac OS X. I want to point out the core services layer that lives below the application services, the graphical services like Quartz and OpenGL and QuickTime, but above Darwin, the operating system. That layer is where we live. We provide the common services needed by everything above us that does not depend on graphics.

So if you look inside the core services layer, you'll see a number of the non-graphical managers you're familiar with. Things like the file manager, memory manager, resource manager. And in with all of those is Core Foundation. In fact, Core Foundation, were you to look inside the core services code, lies below almost all of those.

So, what is CF? It's a non-graphical substrate library available to both Carbon and Cocoa. In fact, it's even below Classic. It is written in pure C because that is the common language beneath all of those layers, and it's designed around some basic object-oriented paradigms. So, CF provides two basic things to the system. The first one is some basic types by which all of the different substrates can speak.

So that's things like strings, arrays, and dictionaries, and then we combine them together in an abstraction that we call property lists. The other thing it provides is the non-graphical services that every application needs. So, that's localization support, user preferences, as I said, also the pasteboard and the run loop.

In this talk, we're going to focus on the basic data types. There's a follow-on to this talk immediately after lunch in this same room, where we'll be talking about the second half of that stack, the app services. So where is CF available? On Mac OS X, Core Foundation is available as its own library. On 8 and 9, it ships as part of CarbonLib.

Now there are some pieces of Core Foundation that are only available on 10. Those center around the mock-specific pieces, which obviously do not apply on 8 and 9. Finally, a subset of Core Foundation is available inside of Darwin. So if you're interested, you can check out the Darwin source code and see what we're doing.

So I know that as a developer, when I go and approach a stack, one of the first things I don't want to hear is, by the way, there's this other large API I have to learn. So if you don't want to learn CF, relax. You don't really have to.

You're missing out, but you don't have to. Carbon and Cocoa are both complete APIs unto themselves. You can write entirely to those APIs if you wish. And Core Foundation will silently help out underneath the stacks, providing smooth communication. However, this is what you're going to miss. If you yourself need to write code that bridges between Carbon and Cocoa applications-- for instance, if you're writing a low-level library that several different applications are going to use-- you should be using CF to convert the types to and from.

Internationalization support is wired into Core Foundation. It's one of the easiest places to get it from. There's a rich, simple set of data types around the internationalized strings inside of Core Foundation. That's what we're going to focus on here today. And there is some new functionality which you can only get by talking to the Core Foundation stack.

So what is that new functionality? Property lists is a big one, and that's the one we'll talk about today. There's also an XML parser inside of Core Foundation. It's not the only XML parser on the system, but it's the only one in C, and it's nice and low level for your use.

User preferences: There's a unified user preferences model inside of Core Foundation. It will save you from having to invent your own preferences file format, and doing your own work of reading and writing to file. Finally, and perhaps most importantly, The bundle and plug-in model that surrounds the entire app packaging scheme on Mac OS X is centered in Core Foundation. So if you want to pull apart the pieces of that app packaging format, you should be talking to Core Foundation.

Okay, so now I'm going to move on to talk about the basic concepts and paradigms inside of Core Foundation. I'm going to spend a moment on the general philosophy of the library. Then I'm going to talk about how we manage to get some object orientation inside of what is a purely C library. And finally, I'm going to look at the memory management scheme that we use.

The general philosophy of core foundation is to be lean and mean. We are just barely above the operating system layer. Performance and efficiency is perhaps our primary goal. For that reason, we have exported a minimal API and tried to make it as powerful and extensible as possible. So you will not find a lot of convenience functions to do simple operations.

You will find some complex, powerful, functional API where you may have to do some configuration of the arguments going in. Finally, there is no safety net in Core Foundation. We follow the same philosophy as the C language: if you tell us to do something, we're going to do it, no questions asked. So if you hand us a bad pointer, we're going to try and access it.

To try and ease the road a little bit, we do provide a debug library, and that debug library is available so that as you code, you can run against it, and it will catch a number of common errors. And that way you can start to debug your program.

So, object orientation inside of Core Foundation focuses on CF types. And we try to make this as available as possible in the naming inside of Core Foundation. So each type is opaque and represents a pseudo class, and each such type is going to be named CF, and then the class name Ref.

The related functions, which are acting as methods on the class, will always be prefixed with the name of the class, cf, class name, and then an action describing what the function does. Related constants also are prefixed with the class name, k, cf, class, and then a descriptor. So for example, if we look at CFStrings, the type is CFStringRef. An example of a function would be CFString append, and an example of the constant would be kcf string encoding ascii.

Furthermore, inside the functions, the first argument is always the type instance, what you would consider the receiver if you were writing in a true object-oriented language. Now there's one important exception to that, and that's creators, and we'll get to those in a minute. We use common verbs to have common meanings throughout the API, so that CFArray add is going to work the same way as CFDictionary add. Finally, when we are passing information back to you, we will always put those out parameters, the ones that are going to be passed by reference, at the very end.

There are a small number of polymorphic functions inside of CF. This is the entire list. These functions can be used with any CF type, regardless of class. So, equality and hashing, CFEqual and CFHash, that's what allows us to add objects into collections and retrieve them out of it.

There's a little bit of introspection. CFGetTypeId will return the type ID of the class. CF copy description will provide a string description of the object. This is for debugging purposes only. These are not intended to show to the user. In fact, they will frequently have the hex address of the object embedded in them.

And finally, memory management. We use a reference counting memory management scheme, so CF retain gains a reference, CF release releases it. CF get retain count is intended for debugging purposes only, so you can track how the reference counting goes through an object's lifetime. And then CF get allocator allows you to discover how the object was allocated.

So, as I just said, CF types are reference counted, retained to take a reference, released to release it. Anytime you have a reference counting scheme, you have this common problem of who has the reference when an object is returned by a function. Here are the rules. Very, very simple.

If the function you called includes the word "get," you did not get a reference. The object you called retains the reference. So if you intend to keep that object, you'd better retain it yourself. Functions that include the words "copy" or "create," on the other hand, give you a reference.

And when you are done with that object, you must release it, or else you will have a leak. There is something important to keep in mind here. We are simply using copy and create as a way of telling you, the developer, whether or not you've got a reference.

There may be some optimizations going on in the back, particularly in the case of immutable objects. So even though you've received a reference, copy may not actually do any memory copying, and likewise, create may not do any allocation. CF is simply being more efficient and it shouldn't matter to you.

Allocators. An allocator, there is a type for it, CFAllocatorRef. An allocator encapsulates an allocation scheme. You call an allocator to allocate memory, to release it, to reallocate it, and so on. And every create function, so every function that could create a CF object, takes an allocator as its first argument and uses that allocator to create the memory necessary. Normally, you're going to want to pass kcfallocator default, which asks CF to use whatever the current default allocator is. There's one, we have one that we install by default. If you want to, you can install your own to do custom memory management.

That's all I'm going to say on the basic structure of CF, or the basic concepts. And now I'm going to move on to talk about some of the basic types inside of Core Foundation. We're going to start with CFString, because that's probably the most basic of the types, the one that you're most likely to be using.

Then we're going to talk briefly about CFArray and CFDictionary, which are two of our collection types. Once we're done with that, we'll look at how collections and mutability work together inside this library. And finally, we're going to see how all of that comes together to form property lists.

Conceptually, a CFString is simply an array of Unicode characters. The goal in creating a CF type, whereas I show here, we're trying to get strings to a new level of abstraction. We're trying to solve the 8-bit ASCII problem. If we can do that, then internationalization becomes easy, because the basic type throughout the entire system will support Unicode characters.

Because we're at such a low level, we've worked hard to assure performance. So even though conceptually you're looking at an array of Unicode characters, it may not internally be stored that way. In particular, you're not always going to be storing two bytes per character. And as we move forward, CFStrings are becoming the way to communicate strings in all of the public APIs.

CFString maintains a very rich functionality set. There are a number of different creation functions depending on where you may have gotten your string data from, whether it's coming from a Carbon stack or Cocoa, or you just read it out of a file. There's a lot of support for encoding conversions from CFStrings to whatever encoding you need to use. We will do comparisons, finds, searches. There are some more complex operations like exploding a string into its component pieces or combining the component pieces back again. And there's even some printf style formatting and parsing available.

So, let's look at some of the API. For creation and access, I give an example of a couple of the basic creation methods here. CF Create With Characters allows you to take an array of Unicode characters and create a string out of it. Pretty straightforward. If, on the other hand, you're coming from a Pascal string, for instance, you would use the second function: CFString Create With Pascal String. There are several other similar kinds of functions for the other types: CFString Create With File System Representation, CFString Create From Data, where you provide an encoding, and so on.

For basic access, you'll use the next two functions. CFString getLength returns the current length of the string. CFString getCharacterAtIndex returns the character at the given index. If you want more efficient processing, you can get a range of characters at once using CFString GetCharacters. And if you have a CFString and find that you need a Pascal string or some other type to communicate with another API, you can get it with CFString GetPascalString. Finally, the last line there shows you how you create a constant CFString. CFSTR, all caps, and then the string within quotes will produce a constant CFString for your use.

Mutating, some CFStrings are mutable. We'll talk more about that in a bit. If you have immutable strings, here are the basic functions you'll use. CFString append appends another string onto the base string. CFString replace replaces a range of characters inside the current string with another string. CFString insert performs a straight up insertion, and CFString trim is added here as an example of more complex functionality available. CFString trim will take a trim string and compare it to the beginning and the end of the given string and strip it off however many times it may appear.

Now, I said that CFStrings are becoming the way by which strings are communicated throughout the APIs. So here's an example of this. If you look inside the Carbon toolbox, there's some old API you may be familiar with: setWtitle and getWtitle, which set and get the title of a window. After the introduction of Core Foundation, the API below it has been introduced: Set Window Title with CFString, Copy Window Title as CFString, which allows you to set the window title in a richer, Unicode-ready manner.

Okay, another basic type inside of CF is CFData. CFData is simply a way to get an object representation around an array of bytes. It's more versatile than using a simple void star or a struct pointer because it's opaque and because it encapsulates the notion of length. And there are a couple very basic accessors: cfdata-get-length and cfdata-get-byte-pointer.

I'm going to move on now to talk about collections. There are a number of different collections inside of Core Foundation, and by a collection I simply mean an object which can hold other pointers, and most of the time you're going to be holding other objects. CFArray and dictionary are by far the most common of the collections you would be using.

An array, as you would expect, holds an ordered list of values, and a dictionary holds values as key-value pairs. You give me the key, I'll give you the value. There are a number of less common collections also on the system: Set, Bag, Bit Vector, Binary Heap, Tree are just a few of them. There are actually several others as well.

So, let's look at CFArray. A couple creation methods: CFArray create creates an immutable array. You give me an array of values and the number there, and I'll create an array around it. CFArray create mutable creates a mutable array. And you will tell us the capacity and the callbacks.

We'll talk about that in a moment. Once you've got an array, the basic API is CFArray get count to find out how many objects are in the array, and then CFArray get value at index, which will give you the value at the given index. CFArray insert value at index is the most common mechanism.

CFDictionary. You'll notice the naming between CFDictionary and CFArray is very consistent. CFDictionary create creates an immutable dictionary. CFDictionary create mutable creates a mutable one. CFDictionary get count gives you the number of key value pairs currently stored in the dictionary. Get value, you provide the key, will give you the value back associated with that key. And set value allows you to associate a value with a given key.

So now that we've looked at those basic APIs, let's talk about collections in more general terms. All of our collections contain pointer-sized values. As I said, most of the time, you're going to be storing other objects, but not always. So how do you configure the behavior differently, depending on what's being stored? Well, that's what the callbacks are all about. You pass the callbacks when the collection is first created. And the callbacks tell the object what to do as objects are being added and removed from the collection.

Collections can also be created either mutable or immutable, and we'll go into both callbacks and mutability in greater detail. Okay. So as I said, callbacks determine what the array or the dictionary is going to do as values are added or removed. 90% of the time, you're going to be storing other CF objects inside of the collection. When that happens, use the default callbacks we provide.

So for example, KCF type array callbacks is the set of callbacks you use for an array that's going to store CF types. As you would expect, as objects are added to an array, this set of callbacks will retain the objects on the way in, release them on the way out.

Of the remaining 10% of the time, 9% of that is going to be, your collections are probably going to hold pointers that you just want to manage yourself. You would rather the collection does not do any management for you. When that's the case, simply pass null. At that point, CFArray, CFDictionary, all the collections will simply take their hands off. You give it a pointer value, it'll hold the pointer value, but it won't do anything with it.

Once in a blue moon, you will be using CFArray or CFDictionary to store a special data structure that has its own memory management scheme. This is the point at which you're going to be writing your own callbacks. You will have to add functions that define how the object is retained on the way into the collection, how it should be released on the way out, and then you will pass those callbacks to CFArray or whatever collection you're using.

Moving on to mutability. There are three kinds of mutability in this system. An immutable object, as you would expect, you can change neither the contents nor the size. Once the object has been created, the data is going to remain constant until the object is destroyed. A fixed-size collection: you can change the contents all you want, as long as you do not exceed the given capacity. And a fully mutable collection: you can change the contents and the size.

Not all of the collections support all of the kinds of mutability. Both array and dictionary do. When you're in doubt, check the documentation. The documentation clearly marks what kinds of mutability are supported by which collections. And also, I wanted to mention here, although CFString and CFData are not themselves collections of other objects, conceptually, a string is a collection of characters, and data is a collection of bytes. And therefore, we use these mutability principles for both CFString and CFData as well. And they also support all three types of mutability.

So the way it works is this: when you call CFString create, CFArray create, whatever, you're going to get back a fully immutable object. When you call CFArrayCreateMutable, you will get back a mutable object. If the specified capacity is greater than zero, that will be taken as the maximum capacity for your collection.

So CFArrayCreateMutable with a capacity of 10 produces an array that can never hold more than 10 objects. On the other hand, if you pass a capacity of zero, we will create a fully mutable object for you, and the array or string or dictionary will automatically grow as new objects are added in.

So, for example, CFArray create is going to produce an immutable array. You just provide the objects that are going in and the number of them, and there you go. CFDictionary create mutable, passing a capacity of 10, creates a mutable dictionary that can never contain more than 10 key value pairs. CFString Create Mutable, passing a capacity of zero, produces a fully mutable string of unlimited length, and as strings and characters are added to this string, it will grow to accommodate them.

So, how does this all come together? Well, given these basic pieces of strings, datas, arrays, and dictionaries, you can actually build up some very complex trees of objects. We also support several other types, but String, Data, Array, and Dictionary form the core of them. There are a couple restrictions. First of all, it has to be a true tree, no loops, so arrays can't include themselves, dictionaries can't include themselves.

Secondly, the keys of the dictionaries must always be strings. And we create an abstraction on top of such a structure and call it a CF property list ref. So why are property lists useful? They're useful because we make them persistent. They have a simple, flattened XML representation, and you can easily convert to and from that flattened XML representation, expressed as a data, and the rich tree structure. Now, we do not handle the file access part of this. As a rule, Core Foundation does not touch the file system. So it's up to you to read and write the data to and from whatever file you're taking it from.

So why does this make property lists useful? It's useful because property lists are very easy to program. They're small. They're very, very flexible. If you need to represent new data, just tack another object onto the end of the array or create a new string. And as I said, they're persistent. The API for persistency is very straightforward.

So, if you ever find yourself needing a small, flexible data format of some kind, try out CF Property Lists. See if it'll manage your needs. The one caveat here is, property lists are not useful as a general database. It does not scale well, and that's because XML is a serialized data representation. So there's no ability to scan into the middle and locate an object out of the middle.

So for example, if you look in Mac OS X where we use property lists ourselves, you'll find them in small configuration files all over the place. We use them to configure applications and provide much of the information that Finder needs, for instance, when viewing an app package. Plug-ins use configuration files to announce their APIs, as well as announcing how they fulfill the API.

There are a number of small system Unix-y kind of parameter configuration files that have been migrated to property lists. We also use them for the user preferences on this system. User preferences is a classic example of where we don't know what information a given application needs to store, but we do know that it's likely to be small, and we do know that our types are likely to accommodate them.

[Transcript missing]

So you need to deliver an application yesterday? You might well prototype using property lists as your data store. That will at least save you the burden of figuring out what your final data format is, and we found it remarkably flexible for doing quick prototyping. In fact, the original HTML rendering engine in the Cocoa framework used property lists entirely for its backing store.

So I mentioned there were some other property list types. These are very simple object wrappers that you would rarely use except that you need to store a particular kind of value in a property list. The three most common are listed here: CFDate, Wraps a date or time, CFNumber, and CFBoolean wrap simple numeric values. And again, we're not suggesting that you use this in place of a raw integer or a raw Boolean. Rather, when you find you need to store such a value in a property list, these wrappers are available to do that. Okay. Debugging and getting help.

I mentioned earlier that there is a debug library available inside of Core Foundation. What the debug library adds is a number of extra assertions, argument checking, that kind of thing. And what happens is as soon as one of those assertions finds an invalid argument, it's going to log an error and abort your program.

That's going to make it much easier to isolate the problem, find the change, get a good backtrace. And we found that many of the most common programming errors we see with Core Foundation can be caught this way. So please run against the debug library as you develop Core Foundation code. I've also put a list here of the most common errors in programming with Core Foundation. Passing null is a CF type. There's a classic one. Null is not a CF type. It is not a valid argument to any of the pseudo-methods inside of Core Foundation.

Another common error is releasing the return value of a get function. You call CFArray getValue. You never got a reference, the function includes the word get, but you release it anyway. Well, sooner or later that array is going to be destroyed, or someone else is going to remove that value from the array. At that point, the array will release the object itself. Boom, memory smasher.

The other half of that equation is failing to retain the value that came back from a get function. So once again, you get a value out of CFArray using CFArrayGetValue. You intend to keep that value past the lifetime of the array. You need to retain it. If you do not, when the array disappears, so does your object.

Okay, so there you are in a debugging session. You've got a CF type, you don't know what it is, and you're trying to figure out what's gone wrong. Use CF get-type-ID to determine the class of the object you've got. You can compare that with CF type name get-type-ID. So CFString get-type-ID returns the type ID for all strings. CFArray get-type-ID returns the type ID for all arrays.

CF Show will print out any CF object to the console, or wherever standard out happens to be. In GDB, you can invoke CF Show directly from within a breakpoint. It's very useful for looking at what exactly has gotten into your string. If you would rather get that information into memory, use CF copy description, and it will give you exactly the same output, but stored in a CFString, so you can, if you want, write it to file, look at it later, whatever.

Tracking memory problems that occur in this system. CFGEP retain count will always return the current reference count of an object. You can use that to make sure that the reference count is incrementing or decrementing as you expect. You can install a custom default allocator and use that to locate leaks and memory corruption.

And finally, if you look on a DP4 system, there's an application in System Developer Applications called ObjectAlloc. You can run your program under ObjectAlloc, and it will show you some of the CF types as they are allocated and destroyed. All of the common ones are there: CFString, Array, Dictionary, all that kind of stuff.

Finally, there's a fair bit of documentation online. You can find it in System Developer Documentation Core Foundation on the DP4 CD. It's also available on the web at developer.apple.com/techpubs/corefound ation. There is some example code inside of System Developer Examples Core Foundation . And the release notes are available in System Developer Documentation release notes: corefoundation.html.