Writing an Input Method Using the Input Method Kit - WWDC 2007

Mac OS X Essentials • 44:39

Learn to quickly and easily support international users with Unicode input methods on Leopard with the Input Method Kit framework. This hands-on session guides you through the complete development of a fully functional input method.

Speakers: John Harvey, Jöns Åkerlund

Unlisted on Apple Developer site

Check out Bezel, our iPhone mirroring app →

Transcript

This transcript has potential transcription errors. We are working on an improved version.

And this is Session 156, Input Method Kit hands on. I'm John Harvey, and Jns kerlund will be taking you through the code that I talk about. So thanks again for coming. So we're going to talk about input methods. First it's the Input Method Kit, which is new to Leopard, and all of Apple's input methods have been ported to use the Input Method Kit.

Before we get started, what we'll do here is show you the Simplified Chinese, which is an interesting example because it uses every bit of the kit. And then we'll also show you our Japanese input method, Kotoeri, which is built with the Input Method Kit, but they opted to use their own candidate window. That illustrates the flexibility you have. We offer candidate window support, but you can do your own.

So if you could switch over to the demo machines please? And so this right here is Simplified Chinese. You see it has annotations, and you can move through the annotation window, and then we can, it has preferences, this is all done by the kit, offered to you by the kit.

And then we have our Japanese input method, Kotoeri, which is probably the most complicated input method that Apple ships. And this candidate window is their own implementation. Slightly different, you see this annotation window's fairly complicated. And so there's an example of, that we've been able to port our older input component based input methods fairly easily and successfully. So back to the slides please.

So developing input methods. First of all, the big change to understand is that input methods are now just Cocoa applications. They're Cocoa applications that link with the input method kit framework, which as I said is new to Leopard. And what the Input Method Kit does, the simplest way to look at it is that it provides the boilerplate code that in the past if you read an input method, all developers doing input methods basically just had to duplicate this code. So as much as possible we've tried to provide methods that will take care of the common things you need to do.

So it's a foundation based API, that means it links below Cocoa, links below HI toolbox. Its goal, primary goal is make it easier to develop input methods to focus on letting you think about text conversion, and not logistics or putting you know, code formation. So it's make it easier to develop import input methods.

We think that's illustrated by the fact that in the examples you have a lot less code. The old sample had twenty eight files, while the new, the simpler new sample has eight. And I said they're applications, but there's one thing to understand, they're primarily background applications. They're background applications with a little bit of user interface, which is reflected in the text input menu, and then also in the candidate window, if you use a candidate window.

But because they're applications, background application, that gives you the benefit of you have your own separate address space, which has a lot of advantages. For one thing it means that you aren't required to build a 64-bit version if you don't want to. If you needed to load in any running process, you would have to have a four way FAT version. And this way you can build your 30 bit. If you need 64, you can use 64. It doesn't matter, you're in your own address space. And it's made some parts of debugging a lot easier to do. And we'll have a slide about that near the end.

So what's the division of labor? The kit provides communication with your client application, it provides, if you want to use it, a basic candidate window support, and it manages your modes for the most part, handling a selection by a user and informing you about a mode change. And then finally it offers semi automatic preference window display. And what do you have to provide? You have to provide input text conversion, obviously, which is done through key bindings, and or event handling.

By that I mean you need to look at the events and do some parsing, figure out how to convert it. And then you optionally can provide a menu with commands, mode specific commands or preferences. And then you have to tell the system about yourself and your info P list.

So now, so this is a hands on session. There's going to be five stages, we're going to start by creating a do nothing input method. That's an input method that simply runs, gets input, and actually logs input to console. It doesn't convert or do anything like that. Then we'll add a simple conversion engine. We won't focus much on that, cause again, that's where you provide the expertise.

And we'll provide mode support, candidate window, and preferences. So, if you could have stage zero of number input open if you're going to follow along. This is, we're going to create a do nothing input method. Before we look at the code, I just want to talk about, if you're creating, you know, this is real standard Xcode stuff right here. If you're going to create an application, you make a new project, and when the assistant window comes up, in this case you want to pick Cocoa application.

You name it, ours are called number input, and then you store it somewhere. The suggestion here is users share, but of course you can put it anywhere you want. And of course you name it whatever you want. But you, naturally if you were starting from absolute zero, that's what you'd need to do. So if we could switch over to the demo machine, we're going to start looking at code now.

First thing you want to do is modify the default main that you get from Xcode. Line fifty one there we're going to import the header for Input Method Kit. It's its own framework, you need to import that one header. Then you're going to need to define a connection name.

It's not required, but we suggest that you use a pseudo DNS string here. It's pseudo because this string is used with NS connection, the foundation class. And those strings aren't allowed to have spaces or periods. So we've substituted underbars for the periods, so ours is called com apple input method number input connection.

That's on line fifty five in the code. And so then down in the main function, or actually right before it, we're going to define a global variable, which is our Input Method Kit server, IMK server there. We always need to allocate one of these in main, and that allocation takes place there on line sixty seven through sixty eight. Notice that the parameters are the bundle identifier, which of course identifies you.

And then the connection name, which is going to be the way clients find your input method and connect with you. Once you've done that, we've loaded the nib explicitly because it's a background application, and then finally we start the run loop. Unless you add candidates, which we will later, this is going to be the last time you ever look at your main file.

Okay, so can we come back to the slides, talk a little bit about the input controller class. So you create a controller class, and that class should inherit from IMK input controller, which is define, is one of the headers in the kit. Whenever an input session starts on the client side, the IM kit will allocate an instance of your controller class, and hook it up with that session. That means you don't have to worry about managing sessions. When you get, a message is sent to your controller, you know it is for a specific session.

But because you have one controller per input session, that means it needs to be fairly lightweight. I mean you won't have millions of them probably, but there'll be a number, so you don't really want to do things like carry along big dictionaries with every controller or anything like that. Then your controller's also responsible for managing buffers. In the example here we have a composition buffer where we put converted text after we convert it, and then we have the original buffer, which holds the text that the user has entered in an input session.

Your controller obviously is responsible for receiving input. We'll talk about that in fair depth. There's a number of options for how you want to do that. And then of course you need to be able to return, compose, converted text to the client. So let's, we'll go to the code now, we're going to look at our controller class, back to the demo machine please.

Okay. So we look at number input controller dot age, it's pretty simple. It just inherits from IMK input controller there, and then doesn't do anything else at this point. And then moving over to the dot M file, we see that we've selected one of three possible input methods, it's called, there on line seventy seven, it's what I like to characterize as the simplest one. It just takes a string, which is the Unicode, the Unicode string that results from typing a given key.

So let's talk a little bit about the options that you have for input. I need to switch back to the slides here. So there's three ways to do this. There's this first way which we just saw, which is the IM kit has taken the event, completely parsed it, pulled the Unicode out, and passes you the Unicode, and as well the object which represents the input session on the client side.

If you want a little bit more information about the actual key down, we have the next one which is input text key modifiers. That includes the string, but it also includes the key that caused that string to be generated, and the state of the modifier keys when you, when the user pressed that key.

And finally, and this is commonly what you would use if you're porting an older input method, cause it more closely matches the way events were handed off to you then, you can just use straight events, do all the parsing yourself, pull the event apart yourself. And that method is called handle event. And as in all these you get a client, which is an ID, and that will, as I said, represent the input session on the client side. In our example we're going to use the first type.

Now one thing about using this simpler one, the input text with just a string, is that we also here, because you don't have access to the key code there, well we do this key binding, which is modeled after the way NS responder works. We support all the methods that are in NS responder, like insert new line, delete backwards.

So if you, and then we also often, excuse me, offer custom key binding, which you provide a dictionary which maps keys to a given action method. And again, this is only, this is, you only need to, you take this approach if you are using the first method for receiving input, which is input text client. In conjunction with that, what will happen is you'll always get a message to did command by selector.

Now if you're familiar with NS responder, you know they have a method called do command by selector. We have did command by selector because as an input method you always need to return whether or not you've eaten the event, or handled the input. You return yes if you had, and no if you haven't. Now you may say to yourself well wait a minute, I implemented an action method for insert new line, and I just handled that. But when you're doing an input method, a lot of times it depends on the state of your input buffer.

So for example, if the user presses return key, did command by selector is called, you want to say well hold on, am I actually doing an input session now. If I am, well I want to do something, I'll go ahead and let my method, my action method insert new line be called. But if I'm not handling an input session, I want to just say no, I didn't handle that, and let the client application have that return key back.

Okay. So moving again to the code. What we're going to look at now is the info P list, the extra information you need to add to describe your input method. Could we have the demo machine please? Okay. So we're looking now at the info P list, it's you know, generic application one. The first thing you need to do is in your bundle identifier there on line twelve, you need to say that you're an input method, so that comes after your company name. And then you name your product.

You need to use this key LS background only that states you're primarily a background application that has a string value of one. You again need to have your public connection name. Now probably you're thinking to yourself why do I have to have it twice? You actually don't. It's certainly, if you can just define it here and then load it in main, again in our example we had a string constant just because it's easier to show. But this is the important place that you have to have it, because this is where the kit will look to find the connection name.

You need to define the name of your input class, that is on line forty seven. So the kit will look there and find the name of the class, and use that to instantiate the class. You need, absolute, you need a icon for display in the preferences panel and the text input menu.

And then finally you need the name of your primary language. This information is there for the text input sources manager, which was talked about in a prior international session. So for the do nothing input method since we don't have modes, that's all we're going to add. And so if we can, oh we're on the code, sorry. So the next thing you want to do obviously is you want to build the input method and install it. So we can go ahead and just build it, that's done very quickly on this machine.

And you have several options where to put it, system input, Apple input methods go in system slash library slash input methods. You can put it in library slash input methods, or tilde library slash input methods. Now you probably note that input methods is a new directory on Leopard. You don't put your input method in components the way you used to, there si now a directory named for you.

If you happen to be running an older input method, you can go to terminal and kill it. You need to kill it twice, because there's some notifications that go around, and it'll get, once it does, the first time it gets restarted. But two kills usually take care of it. And so then we can open up text edit and see what we've got.

First of all on the input menu, you see there it is. You can turn it on, shows up in the text input menu. And this one doesn't do anything, it's a do nothing input menu. We could look at console if that's very available. We'll take a little time and look at it, in applications, and we should see some output there.

All right, there's a G in there. So we log in that output, and so that's the do nothing input method. That was pretty painless, went pretty fast. So next we'll move to stage one, we'll add some real input and a conversion engine. So could I have the slides again?

Okay, so the conversion engine. Again, this is your expertise here, so our sample will be very simple, and we won't get into a lot of detail about that. But basically what it does, it takes input and runs it through the number conversion classes that the system provides. An interesting thing about the example is it demonstrates one approach for managing a global resource. In the example the conversion engine is a global resource. Each controller doesn't carry around its own conversion engine, instead it will get the global conversion. And we manage that by having an application delegate, which has an accessor to get the conversion engine.

For that reason, we add number input application delegate dot H and M, and as I said, they manage our global, or manage shared resources. In this example that will be the conversion menu, conversion engine, and then later the menu. So if we could look at the code, or look at the demo machine real quick please And so we have the conversion engine right here, we, one thing to note, so it's an IB outlet. The reason for that is that we actually, we told the interface builder to instantiate our conversion engine for us, so we don't actually have any code that instantiates it, we just wrote it, told interface builder about that class, and said instantiate this.

So that will be allocated for us when we open, and we have one accessor here. We also have a method to find out the mode. All right. Okay, we have a method to find out the mode, and right now, must be the next one, yeah. Right now the mode is just numbered decimal, we haven't added any other but the simplest mode.

So then we'll go back to the number input controller and we'll take a look at what we've done to support input. In our input method text method there we've added some scanning, we scan the input with NS scanner to see if it's part of a decimal number, and that happens on line eighty six through eighty nine if you're following along.

And if it is part of a decimal number we append it to our original buffer, that happens on line ninety three. If we append it we'll then inform the user that we're running an input session, an active input session, or inform the client by calling one of the communication protocol methods, which we'll get into.

And if it's not part of a decimal number we may possibly have to convert our other input, and then we send that along. And we want to take a look here at the key binding support I'd talked about. On line one seventy we have our did command by selector.

You can see there on one eighty three, I'm looking at the state of the buffer to decide if I really want to handle this. And if I do, I call either one or two of my action methods, which I can find out by looking at the selector. We have insert new line there that will commit the composition, and we have delete backward.

Delete the backward user's range of composed character sequences to figure out backspacing the correct Unicode way. And if we're backspacing, we want to tell the client that the input session is still active, but it's changed. So we use a method there called set mark text. And so we pass that message to the client, the sender.

Okay, we added buffer management as well in this file, that's on line one twenty six. One thirty five, one twenty six is the accessor. It lazily allocates a buffer. One thirty five is where you can change the buffer and accessor and setter. And then for our original text, on line one forty three we have an accessor. We have a setter on one sixty three. But we also have an append on one fifty two. And if you look on one fifty two there you see the set mark text method being used again.

So let's talk a little bit in depth about sending text to a client. You've seen this sender parameter, this gets passed into your input method. And I talked about that represents an input session. And that object will always adhere to a group of protocols that are defined in IMK input session dot age.

Now that header is actually part of HI toolbox. The reason it's there in HI toolbox is because parts of HI toolbox, in particular the text services manager, need to know about those protocols, since they're defining some client objects. And also they don't want to link with the input method kit, because that is at a lower level.

So that, you, that you pull in from the HI toolbox headers. And there's two important methods, there's set mark text selection rage replacement range. You call that to update or begin an active input session. It's obviously, it says set mark so you'll have the classic underlines, gray or black.

And we see it here in the code, it's called when we append new text, on line one fifty nine when we handle a backspace, on two twenty two, and when we hit our special trigger conversion key, which is space in our example, and that's on line two fifty eight.

You see it takes a selection range, which tells how to do selection, and it takes a replacement, and so the selection range is local to the actual input session, so zero is the beginning of the input session. Then it takes replacement range, which is, references the whole document and tells where to put it. If you just want to use the current insertion point, which is what we do here, you make a range of NS not found.

And then when you're ready to complete a session, when you're done converting and you just want to pass that text back to the client and say all right, put this in your document, it's been converted, that, you use a method called insert text, which only takes a replacement range which tells where to place it. And one thing to note is both of these methods can take either an attributed string or a regular string.

And so that's stage one, we added input and conversion. We want to build that now, and install it, kill the old one. These, you'll go through these steps several times. So we put it there until the library input methods, go to terminal maybe. Did you kill the old one?

Yeah. All right. You just kill all number input, kill it twice, try killing it again. I don't think, there we go. So now we have conversion going on, a very simple decimal conversion. It does things like strip zeros, round up, pretty simple thing. But it was pretty quick to write there. Now we move on to stage two where we're going to add modes. So if you're following along, you want to find number input two and get that open in Xcode. And could I have the slides again.

So what are modes? Probably you know, but briefly a mode means that you have one input method that converts text in a number of different ways. A classic example is a Japanese input method, although really all the Chinese do it as well. But in Japanese you have the well known modes Hiragana, Katakana, and Romaji, along with a number of others.

As I said, the kit handles basic mode switching for you. When the mode switches, the kit will grab that mode and inform you of it by passing you a message, which is set value for tag client. And what you really need, the important thing you need to do, which is actually what you've always had to do ever since we've supported modes, is add a component input mode dict, D I C T, to your info P list.

And you do that exactly the same way. There is one new thing which we'll talk about. The details of how to do that are all covered in technical note twenty one twenty eight, which has been out for a while. And if we could switch back to the demo machine, we'll take a look at what we've added to the P list for this one, that's the component input mode dict is on line sixty four through one eighty seven.

And what's it got in it? It's got menu icons, it's got whether or not a mode is on by default, or whether it is visible by default. You can optionally provide a key, a modifier key so the user doesn't have to pull down the menu to select the mode.

And all this information is detailed in the technote. One thing to look at which is new I think is that first key you see there in this example, it's on line seventy four. That's the TIS input source ID. That identifies the mode to the text input sources manager. Now if you don't supply that, text input sources will generate one that it hopes to be unique, but it's really best to just identify your mode right there. Okay, so this information is primarily used to display the modes in the international preferences panel, and the text input menu.

You also need to provide an array of keys in the info P list, that's on line one eighty eight is where it starts. These keys describe the order that the mode should be displayed, and also provide links to localized strings. So it provides the order to show the strings, and then it gives a key to find the correctly localized string. Our examples only have English, and that's done in the info P list strings file, pretty simple there. On line seven you see we have decimal currency, percent, scientific, and spell out.

Okay. Then going back to the code, we add our support for set value for tag. We override this method, and all this does is take the mode string and map it to an NS number, format, or style. And then there at the end it tells the conversion engine what the new mode is.

So that's it for adding modes. We'll build this now. Now the one thing about this is we've installed the input method several times without having to log out or log in. When we install this one with the mode changes, because we're really changing the way it displays in the preferences panel, and the text input method, we're going to need to log out and log in.

There'll probably be a very simple tool available to just do this change to make your debugging life easier. But we don't have it here now. So we'll just quickly log out and log back in, and then we can go to the preferences panel, let's take a look, see what we got, hopefully if we installed it right.

Right. So there's our modes. And you can turn them on and off, and if we go to text edit and look at the pull down, the text input menu, we see modes now, we don't see number nine any more. So now we've added modes, and you can, so you can select spell out there, which is a nice new feature that comes in Leopard. These are, actually I think it comes in ten point four. So we've added modes very quickly.

So stage three now, we're going to add a candidate window, we're going to use the candidate window support that the IM kit provides. So you want to have number input three, or two open for this in Xcode. And could I have the slides back. So there's several things you have to do. You have to allocate an instance of the IMK candidate's class, that's the first thing you need to do. Then you need to write a method, you need to override a method that actually supplies candidates NS attributed string, it returns an array of strings.

Then you need to provide a method to handle candidate selection and a callback sort of method that handles candidate selection and the user browsing. For the user's final choice you'll be sent the candidate selected message, which includes the string they chose. And as the user browses, as they browse along you'll get candidate selected change, which again include the string that they just chose as they browsed up and down or across the candidate window.

And then if you have selectable annotations, you can get, you would need this, to implement this method, which is called annotation selected. That takes the annotation string, and then the candidate string, the candidate string the annotation accompanies. The example here doesn't use that, it only illustrates the first two.

So back to the demo machine. And we'll look at what we do for this. We'll go back to main, despite my promises, and declare a global variable there, IMK candidates. And then inside main we'll allocate that right there. And you'll see the parameters that need to be passed are the server, which you allocated earlier, so you connect the server to a candidates object. And then the type of candidates panel it is. There's three types, there's a vertical scrolling type, a horizontal what we call stepping, because it has a stepper to move up and down from the rows, and then we have grid candidates.

And then finally in our controller class on line three oh six we have our candidates method. You can see it creates na array, then it takes the current original buffer there, which is the text the user entered, runs it through each of the conversions we support, and then stores the result of each conversion in that array, and then returns the array so it can be displayed.

On line three twenty five we have the method for when the user browses, candidate selection changed. You can see it's just a call to the method set mark text to update the active input session. And we update something we use for backspacing. And then candidate selected for when the user makes a final choice on line three thirty six. And in that case we say that's all done, the user's finished, commit this composition.

So that's it for candidate windows. We want to build that now and see what we got from that. Very quick build, move it into tilde library input methods again, kill the old one a couple times till you're sure it's dead. And then back to text edit, pick a mode, do some typing, hit the space twice, and there's the candidate window So that was quick, that was very quick.

( applause )

Thanks. I think in all my years of doing these WWDCs that's the first time I've ever gotten any applause, so that's a major moment.

( laughter )

( applause )

So, almost done now. We're going to have plenty of time for questions it looks like. What we're going to do now is add preferences, a very simple preferences panel.

Where am I? Okay, could I have the slide, or no, let me, we'll stay on the code, I'm sorry. Let's stay on the demo machine. So for this you're going to do most of your work inside interface builder. First thing you want to do is you want to take your main menu dot NIB, the default NIB that you get when you create the project. You want to open that up in interface builder, and you want to add a menu, a command menu. So we can do that here, main menu dot NIB. And you can see in interface builder we've got a little menu there, and it just has one item, preferences.

And then you want to go back to your code. You want to add an action menu, or an action method to that item preferences. The action method that takes care of preferences as actually defined in the default IMK input controller class. So all you need to do is just say, you don't need to implement this, you just need to say that this preferences item, it's actual method is show preferences. And we do it in the code and the awake from NIB method there. And it's just the standard menu item, method set action. And we say the selector is show preferences.

And the other thing you need to do is now that you have a menu, you need to return it to the IM kit, who will then pass it over eventually to the text input menu. And so what we've got here is we have, we've overridden the method in IMK input controller menu.

Remember I said that we're letting the delegate manage our shared resources. We say delegate give me the menu you've got, and that's returned to the input method, or IM Input Method Kit, which will work its way to TSM, and then up into the text input menu. So that's your preferences menu. That was a lot of talking for very little thing to do. The next thing you do now, you make a separate NIB, which has to be called preferences dot NIB You can see it there, lowercase preferences dot NIB.

And that has a window, and it has a controls, whatever kind of controls you want to do, it's all done in interface builder. We have a matrix of radio buttons. And then you use NS binding to bind to an NS user defaults controller, which is that green object there behind it. We just added one of them, and we bind the value of the matrix to values.

Okay. So once you do that, you do all that in interface builder. And NS user defaults controller, what that does is it basically hooks up with NS user defaults, and automatically writes the values to NS user default. So you will need to add code to look at these or defaults, which we do back in our code in the controller class on line two hundred and forty, which, okay, so you can see there on, well it's actually two thirty nine. We call NS user defaults, standard user defaults, and we get an integer for the key vertical candidate. And then if it's on, we use the single column scrolling candidate, which you saw before. And if it's off, we switch over to our step, single row stepping candidate.

Then finally you have to tell us what your default values are for your preferences. You do that in a P list, which again has a required name, preferences dot P list. We define the key vertical candidate, and the integer, the default value, which is vertical candidate is on.

So at that point we're ready to build this and see what we've got. So it's a standard thing, you get pretty used to this as you work. Build, get rid of the old one, put in the new one, kill the old one, and go see if everything's all right.

Turn it on there, go up to there, pick one of those, let's see what we got. So there, all right, so we have a preferences item now in the text input menu. We can select that, change to horizontal if we want, you can leave it open, or close it. It's a floating utility window. And type, see we've changed to a horizontal stepper there, rather than the vertical. We can put that away, and enter that text there. I've typed some new text.

And go back to the vertical. So we added preferences, pretty straight forward. You can see in the Korean input method and the Chinese input methods, they use exactly the same approach, and their preference panels are a good bit more sophisticated. So it's not just limited. So see, they have a tabs and popups, and more buttons and things like that. So it's certainly not limiting. Our, the example is very simple. But they basically did exactly what we just did here.

Okay. So that's, so we've built our example, five steps. First do nothing, then add input, then we added modes, conversion window, and preferences. So we've got a pretty full featured input method as the final result. And that's what the kit offers. So I said I'd talk a little bit about debugging, could I have the slides back.

So debugging I think is a lot easier. It's easier because you're in your own separate process. Cause you're in your own process, it's very easy to just take GDB and attach to your process and debug, just like you would any other application. You can actually do a thing where you can start your application, your input method, without selecting it from the text input menu because it's running, and when you did go and select it, it would be, the running version would be you.

So it's very easy to do things like start it up from terminal to log output and see things like that. And as you've seen throughout this talk, it's easy to kill it, get the old one out, and restart it. In the old days you'd have to put your component in, launch another application, get it loaded inside that application's process, and then start to debug. Now you can just focus on your application, which is the input method.

The bad news about this is that if you're going to be, if you're using GDB, you need to use two machine debugging. And that's because if you're using GDB on a single machine, you're debugging an input method which you'd also be doing input into GDB, and things get a little scary, or don't work if you're doing that kind of thing. So hopefully you're familiar with what I mean by two machine debugging. It's, my own case, if I debug anything, I usually go that way, unless I'm really in a hurry. So, that's the talk.

If there's any questions, please ask them. The other thing about this which is convenient is that immediately after this talk we have an input method and internationalization lab, lab D, and that's at 10:30. And there'll be engineers to answer any questions on the new methods that we talked about earlier in the week. We'll be there to talk about input methods. And then there's the technical note I mentioned, which you'll want to be familiar with if you haven't created component input mode, and the URL is there.