Darwin • 1:02:07
The Common UNIX Printing System (CUPS), the popular UNIX printing solution, is coming to Darwin. This session covers the design, implementation and capabilities of CUPS. Developers will learn how to use CUPS to enhance the printing capabilities of Darwin applications. Presented by CUPS architect Michael R. Sweet.
Speakers: Richard Blanchard, Mike Sweet
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.
Welcome to the Darwin printing session. My name is Richard Blanchard. I am the engineering manager for printing. I have a very brief story for you this afternoon, but it's a really good story and it should warm the hearts of all you optimists. Oh, there's a few of you. The story starts actually with a definition of serendipity that I've always been very fond of. And that is serendipity is looking for a needle in the haystack and finding the farmer's daughter.
So for us, our needle was a printing system that we could open up and use in Darwin, but that was also powerful enough for us to use in OS X. The haystack we were digging through had a lot of different options in it. We could take our OS X printing system, open it up, bring it to Darwin. We could write a new one from scratch, we know how to do that. Or we could take any one of the half dozen or so existing open source printing systems.
Well, we found an answer, and it was a very good answer. We found CUPS, the Common UNIX Printing System, from Easy Software Products. It's available under the GNU General Public License, so we can use it in Darwin. It's incredibly powerful, so it'll let us do everything we wanted to do in OS X.
It's based upon a few simple direct concepts, in particular IPP, the Internet Printing Protocol, and PostScript printer description files. Now our existing OS X printing system is also based upon IPP. It was designed around it, so we're comfortable with that. And we've been using PostScript printer description files on Mac OSes for over a decade. So the two simple direct concepts in CUPS we're very comfortable with.
But there's more. It gets even better. The story just keeps improving. It's well documented. We announced on Monday that we were going to use CUPS in Darwin and in OS X. And on Monday, there was great documentation available. The truth is, it was available long before Monday, and the truth is, we had nothing to do with it. But nonetheless, our printing systems are incredibly well documented.
So please, go online, go to CUPS.org. You'll find a half dozen or so documents describing CUPS from different vantage points. You'll find newsgroups and a mailing list. But what you should really do, if you want to understand the Darwin printing system and you want to understand the OS X printing system: you need this book. This book. The book that I have three copies of. The books that you can buy over in the vendor hall. The books that you absolutely must read if you're going to understand what we're doing.
Point made? So, that's why we're using CUPS. That's how you can find out more about what we're doing with CUPS. That's the really great story I have, which is that we had some needs, and we found an answer that met all our expectations. I want to take a brief moment and talk about layering in CUPS because you can get the CUPS source code now from a couple different places.
The pure place, the place you get pure CUPS is from CUPS.org. If you really want to see what the CUPS source code looks like, go there. That is the clearinghouse. But because we're going to use it in OS X and it's released under the GNU General Public License, we also need to give you the source code that we're using in OS X and we're doing that through Darwin because Darwin's also using CUPS.
So you can go to Darwin as of Monday and download the source code that we're using in Darwin that we're using in Jaguar. It's right there. You can download it and see what the spooling system is that we're using. We still have pieces that are closed in OS X, but the spooling system is open. You can get that source code. That source code that's on Darwin does remove a few pieces from the CUPS distribution you'd find on CUPS.org. That's for some legal reasons. We're trying to work through some of those.
But you can go get those pieces because they're also under GPL and add them back. There's nothing to stop you from that. When you get that piece that we put in Darwin -- again, this is -- is this Darwin's -- this is a CUPS repository that we keep. It's live. So as we work on Jaguar, you can see the changes we're putting in there. When we check something in, you see it immediately. So we take that, we put it in OS X, and we put a couple pieces back.
In particular, we put back some of our Quartz pieces. We put in a piece that can convert from PDF to raster and PDF to PostScript. Those pieces right now are closed. We have plans to make them open. So again, two messages. We're very happy and excited to be using CUPS. It's open. Here's how you can get it. If you want to see pure CUPS, go to CUPS.org. If you want to see what we're doing, go to Darwin.
But CUPS is the most interesting. It's really what we're talking about here today. Michael Sweet is the architect of CUPS. He's also the author of the CUPS book, which is why it's so good. He has graciously flown out here to talk to you today to describe what CUPS is, what you get from CUPS.org and mostly what you get from CUPS in Darwin. So, Michael.
Hi, as I said, my name is Mike Sweet. I'm co-owner of Easy Software Products and I'm the guy responsible for CUPS. What is CUPS? Well, CUPS is a replacement for all the existing UNIX printing systems that you may have seen in the past. One of the problems with the traditional UNIX printing systems is they've been developed for printing text files. LPR and System V printing were developed in the '70s.
And back then, all you really had were text printers. And it did that job really well, but it doesn't really work these days. CUPS goes away from that. Instead of thinking of printing text files or PostScript files, it thinks in terms of printing documents. So what we call it is a complete printing solution.
Now who's using CUPS besides Apple? Well, you'll find that most of the Linux distributions that are out there now include CUPS. Now if you have FreeBSD or OpenBSD or NetBSD, those operating systems include a ports package for CUPS. Most of the major printer manufacturers have CUPS drivers or are using CUPS in their products. Examples here are Canon and Okidata licensed CUPS for use in their printing products. Canon is using it for some internal R&D stuff right now, but Okidata will be shipping it with their products.
Absinth, Genicom and Xerox have funded development or provided us with printers and support for their products, and they're very excited about using CUPS as well. In the free software arena, there's Gimp-Print, which provides some really high quality printer drivers. You can use those directly in Darwin or in Mac OS X. And then the KDE project, if you're using that with Darwin, also includes complete support for CUPS.
Now as I mentioned before, CUPS is designed to print modern document formats, PDF, PostScript, image files like TIFF, basically anything you're likely to run into we can support. LPD was designed just to print text files, so it's really not up to the task. When the user uses CUPS, instead of dealing with, "Okay, I've got a PostScript file, so I know I can print it to this printer, but this other printer doesn't support it, so I can't print that file there, so I have to run some other program to generate the print file," CUPS takes care of that for the user, so the user never has to worry about that.
Instead of using LPD, we chose to use IPP, which was just coming out at the time we were designing CUPS. It's a really good move because IPP really provides the full range of attributes that you need for printing, and it also provides things like authentication and encryption support that was missing from LPD.
When you're using CUPS, you can support PostScript printers just by using the PPD files that are already available for Mac OS or Windows. It doesn't require anything different. The PPD files naturally describe every feature that's available on the printer, at least as far as the manufacturer is exposing. And you'll be able to print duplexed or on special media.
And it'll handle all that transparently to you and the user. If you want to support a non-PostScript printer, then you write a special filter and provide a PPD file that tells CUPS to use that filter when it's printing. So as far as the application and the user are concerned, everything looks like a PostScript printer.
Now the good thing about using PPD files is you can use the same PPD files in every operating system. So we can support Mac OS clients from UNIX servers, we can support Windows clients from UNIX servers, we can support Windows clients from Mac servers. It really doesn't matter. Everybody can use a PPD file and everybody's got a PostScript driver, so that all works. The same filters that you write for one operating system, unless they're using something that's really operating system specific, will work on another operating system just by recompiling it. And that piece, all the device specific, operating system specific stuff is handled by a separate program that CUPS provides.
IPP was designed from the ground up to support things like access control, security, encryption, where LPD was just a little hack that they came up with. As far as I've heard, it was put together over just a few days, as the usual story goes. And it really isn't set up to do anything more complicated than listing a set of hosts that are allowed to print.
Another advantage of IPP is you can pass in any kind of job attributes that you want, so that you can say, "This is a double-sided document," or, "This needs to be printed on photo media," and, "This is the resolution I need to print at." Whereas LPD, there's really no support for that. There's been some vendor extensions for that, but they're not standard, so you can't rely on them. As I mentioned before, there's authentication, access control and encryption, which LPD completely lacks.
Now, in the old days, you had a text file you wanted to print or a man page. You just lpr the file and everything worked pretty well. Same thing happens with CUPS. Commands are exactly the same, plus you have the option of using command line options to control the media and in this case use the pretty print option, which is the equivalent of the old [transcript missing]. However, if you have a PDF document and you want to print double-sided, there's really no way to tell LPD to do that or even to print the PDF file unless you have a special filter added, and then that filter depends on which printer you've got.
With CUPS, all printers support PDF files. All printers support PostScript files. All printers support JPEG files, and they all take the same kinds of options. Now, in this example, we print on legal size media and do two-sided printing on the long edge, which just means you're not flipping a page on each side.
If you had a printer that doesn't support duplex printing, it'll go through and print single-sided and, you know, if you send the IPP request, it'll actually come back and say, "Sorry, this option wasn't supported, but I'm going to print your file anyway."
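As a rough illustration of the kind of command line described here, the sketch below builds an lpr invocation for legal media and long-edge duplex. The queue name `laserjet`, the file name, and the helper `build_lpr_command` are hypothetical; `media=legal` and `sides=two-sided-long-edge` are the usual CUPS option spellings for what is described. The command is only constructed, not run.

```python
from typing import Dict, List, Optional


def build_lpr_command(filename: str,
                      printer: Optional[str] = None,
                      options: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an lpr argument list; each job option becomes one -o name=value."""
    command = ["lpr"]
    if printer is not None:
        command += ["-P", printer]
    for name, value in (options or {}).items():
        command += ["-o", f"{name}={value}"]
    command.append(filename)
    return command


# Legal-size media, two-sided printing flipped on the long edge.
command = build_lpr_command(
    "report.pdf",                 # hypothetical document
    printer="laserjet",           # hypothetical queue name
    options={"media": "legal", "sides": "two-sided-long-edge"},
)
print(" ".join(command))
```

The same options apply whatever the file type is, which is the point being made: the user does not pick a different command for PDF, PostScript or JPEG.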
[Transcript missing]
This is a version that's on the standards track within the IETF, and basically just about everybody supports it. There's a few exceptions. Microsoft only supports IPP 1.0. We are compatible with 1.0 clients, so you can still print, but there's a few things that if you're talking to a Microsoft server, it has to kind of go into back compatibility mode.
CUPS itself has a small central server that handles any client connections, any network communications and actually scheduling and printing the jobs. It's just a big finite state machine and it does what it needs to do and then fires off other processes to do the hard work. Any applications that want to send a print job or get a PPD file and pop up options for the user just have to communicate with the scheduler via HTTP and IPP. IPP is layered over HTTP.
There's also a web browser interface. Now, the nice thing about this interface is that you can have users open up the interface to be able to access the documentation online. You can provide your own content if you have specific rules for how you send a print job and where you go to get it and if you don't pick up your job within five minutes we're going to throw it away, that sort of thing. It also lets you monitor print jobs, printers, classes and for the administrator to do administrative functions.
When a job gets printed, it fires off a set of programs called filters and those are all piped into a back end program which actually communicates with the printer. There's back ends for parallel, serial, USB. There's a new FireWire interface going into the new version of CUPS. Plus a bunch of network protocols: AppSocket for the JetDirect type of thing, LPD for older network interfaces and of course IPP.
All of the printers and jobs and devices in CUPS are managed dynamically by the scheduler. So if you have a new network interface come up, it can recognize that and start using it right away. Similarly, new devices, new printers, new jobs, all that information is dynamic. So you don't have to worry about reconfiguring the scheduler and then, "Oh, I have to restart it every time." This is a simplified block diagram of CUPS.
The commands communicate to the scheduler via IPP. And there's also a series of mini daemons that provide some auxiliary functions. One of them provides LPD client support. There's another that does polling so you can cross the subnet barrier and we'll cover that a little bit later. And then all the filters, printer drivers, backends and so forth are invoked by the scheduler and communicate with the scheduler as things happen.
Now if you send a print job, you run the LPR command, it actually takes the print file, opens up the connection with the scheduler, says, "Okay, is this printer okay for me to print to, number one?" It is. "Alright, here's the options, here's the job file, go ahead and print it." Scheduler will copy that job file into its spool directory and then schedule it for printing.
There are certain options that you can set. You can tell it to hold the job until the operator releases it. You can have it print at night, so if you have a long job, you don't want to tie up the printer. You can do that. It's very flexible.
Once the job does get scheduled to print, it goes through, it determines which filters to run and which printer drivers to run. It fires those off so they can start reading the document file. And then the back end opens up a connection to the printer, whether it's a network printer or a locally connected printer. The filters don't really care.
And they all communicate data back and forth. In the case of supporting LPD clients, we have that mini-daemon called cups-lpd. It handles all the requests, talks to the scheduler, passes on those things. So if the user asks for a job listing, it'll get that information from the scheduler. If the user wants to submit a file, it goes through and gets put into the spool directory. And you can start and stop printers and do all those sorts of things via LPD.
From the user's perspective and from the developer's perspective, CUPS provides a lot of stuff. There's command line interfaces, the web interfaces, we've talked about those a little bit and we'll cover those in depth in a second here. All of the filter and driver interfaces and various other services for applications. Now, your typical Berkeley printing commands, which you're probably most familiar with, your lpc, lpr, lpq and lprm, those are all available and those are all fully functional. They look just like the originals.
We also provide the System V commands. These may be less familiar to you, but these are the ones that are provided on AT&T System V based UNIXes. Accept and reject will allow the printer to accept or reject new print jobs. So if you're having a problem with the printer, you can actually tell the queue to stop accepting new requests until you solve the problem.
Cancel does the same thing as lprm, it cancels a job. Enable and disable start and stop the printer. So if you have to stop the printer to change a toner cartridge or something and you don't want it to try printing any new jobs, you can tell it to do so. lp is the same as lpr. lpadmin is the command you use to do command line printer administration, add new printers, delete printers, set up classes. And lpstat is the equivalent to lpq in the Berkeley world.
Now as far as the CUPS commands go, there's two that you might use as an administrator. The first is cupsaddsmb. This is the program that you use to export your printer drivers from CUPS into Samba, and then the Samba drivers can be downloaded by Windows clients automatically. They see the printer in the Network Neighborhood, they double click on the printer, and it will download the printer driver from Samba and install it on a local Windows client.
So it's very easy to set up printers and share them with Windows clients that way. cupsd is the actual scheduler program. Normally you won't have to run this, but if you're doing things with a scheduler and you want to stop it, that's a process you'd kill, and then you could start it back up. There's also the init scripts as usual.
Some other CUPS commands: there's lpinfo, which allows you to list devices that are available on the system and printer drivers. There's short and long forms. The short forms are easily parsable by programs and scripts. The long forms are nice for the user so they can see all the details about the particular drivers.
lpmove moves jobs from one printer to another and lpoptions sets up default options for the printers. Now, one of the nice things about lpoptions as well is you can set up instances of printers. So if you have a LaserJet and you want the default to print duplex but you want to offer a simplex queue as well, rather than setting up a separate print queue, you just set up a separate instance. And for lpoptions you just specify LaserJet/simplex and that would set up a separate instance of that printer with new options.
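To make the printer/instance naming concrete, here is a small sketch that splits a destination such as LaserJet/simplex into its parts and builds the matching lpoptions command line. The helper names are hypothetical, and `sides=one-sided` is assumed as the option that makes the instance simplex; nothing is executed.

```python
from typing import Dict, List, Optional, Tuple


def split_destination(name: str) -> Tuple[str, Optional[str]]:
    """Split 'printer/instance'; the instance is None for a plain printer name."""
    printer, separator, instance = name.partition("/")
    return printer, (instance if separator else None)


def lpoptions_command(destination: str, options: Dict[str, str]) -> List[str]:
    """Build an lpoptions argument list that saves defaults for a destination."""
    command = ["lpoptions", "-p", destination]
    for name, value in options.items():
        command += ["-o", f"{name}={value}"]
    return command


printer, instance = split_destination("LaserJet/simplex")
simplex = lpoptions_command("LaserJet/simplex", {"sides": "one-sided"})
print(printer, instance, " ".join(simplex))
```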
The web interface is very simple. It's all template driven. The front page there has just listed out what tasks you can do on it. There's links there to download the software, get online help, manage your jobs, manage your printers, all very straightforward. Whenever there's an administrative task, we try to do a wizard type of interface rather than have you type everything in on one form and then expect the user to know what to do. So it's very flexible that way. And you can customize this. So if you want to have your own header on the top or have your own look and feel, you can completely replace the templates that come with CUPS and provide your own local version.
This is a list of the printers that are available on that one system. And as you see, there's various commands that you can access right from each printer to resume, to start the printer, stop the printer, accept jobs, reject jobs, set options for the printer, reconfigure the printer.
So if the IP address of the printer changes and you've hard coded the IP address in there, you can go in and modify the printer without having to do an [transcript missing]. And it also shows you your default printer, so if you need to change that, you can set the default from inside of there.
As I said, you can customize all of the user interface with HTML templates, but you can also localize the interface. So if you have users that are speaking French and users that are speaking English, you can provide a French version of the templates. Just put them in a subdirectory. The web interface will detect what language is preferred by the browser and will use the appropriate templates. So it's very easy to add support for other languages.
The file filter interface is really where CUPS shines. You can put in any kind of file format and as long as you identify this is how I know it's a Foo file and this is how I convert from Foo to PostScript and so forth, then you can support any kind of document you might run into.
The scheduler keeps track of all of the available filters on the system and all of the available file types. And when a print job comes in, it will determine what type of file is being printed and then see what's the best way to get that file printed for a printer. It goes through, finds the least cost solution, you know, whether it's running one program or five programs. It keeps track of the relative cost of each filter and uses the right combination.
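The least-cost search can be pictured as a shortest-path problem: file types are nodes, filters are weighted edges. The sketch below uses Dijkstra's algorithm as a stand-in, since the talk does not say how the scheduler actually performs the search. The filter table, its costs, and the name `cheapest_chain` are illustrative assumptions.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# (source type, destination type, relative cost, program); illustrative values only.
Filter = Tuple[str, str, int, str]

FILTERS: List[Filter] = [
    ("text/plain", "application/postscript", 33, "texttops"),
    ("application/postscript", "application/vnd.cups-raster", 66, "pstoraster"),
    ("application/vnd.cups-raster", "printer/myprinter", 50, "rastertomyprinter"),
    ("text/plain", "printer/myprinter", 50, "texttomyprinter"),
]


def cheapest_chain(filters: List[Filter], source: str,
                   target: str) -> Optional[Tuple[int, List[str]]]:
    """Return (total cost, programs to run in order), or None if no path exists."""
    edges: Dict[str, List[Tuple[str, int, str]]] = {}
    for src, dst, cost, program in filters:
        edges.setdefault(src, []).append((dst, cost, program))

    queue: List[Tuple[int, str, List[str]]] = [(0, source, [])]
    settled: Dict[str, int] = {}
    while queue:
        cost, mime_type, chain = heapq.heappop(queue)
        if mime_type == target:
            return cost, chain
        if mime_type in settled and settled[mime_type] <= cost:
            continue  # already reached this type at least as cheaply
        settled[mime_type] = cost
        for dst, step_cost, program in edges.get(mime_type, []):
            heapq.heappush(queue, (cost + step_cost, dst, chain + [program]))
    return None


text_plan = cheapest_chain(FILTERS, "text/plain", "printer/myprinter")
postscript_plan = cheapest_chain(FILTERS, "application/postscript", "printer/myprinter")
print(text_plan, postscript_plan)
```

With this table, plain text goes straight through one program, while PostScript needs two, which mirrors the "one program or five programs" remark.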
The filters that come with CUPS print international text, and by that I mean all of the ISO and Windows character sets plus UTF-8 for Unicode text. There's a PostScript filter for printing PostScript files and also for converting the PostScript files to a device-dependent format for ripping later. There's a PDF filter for printing PDF files and an HPGL2 filter for printing HPGL2 files. A lot of printers don't support those formats directly, so this makes sure that you can print from a CAD application, for example, that's designed to just produce HPGL2.
There's also image file filters. What we chose to do rather than just converting to PostScript and using the PostScript RIP is we actually have an image file RIP that generates raster images directly for raster printers and a PostScript filter that generates PostScript for PostScript printers. It's a lot more efficient. And then the PostScript RIP handles printing PostScript files directly.
All of the non-PostScript printer drivers use a common raster format. It's basically a page header at the top that says these are the options to use for this page and then the raster data itself. The header also includes the dimensions of the raster data, the color space that's being used, resolution and so forth. There's several sample drivers included with CUPS. They cover most of the Epson and HP printers. There's also Okidata dot matrix support and the Dymo label printer, which is kind of an oddball, but it was easy to support.
Now if you go to write your own file filter, it's a very basic rule. You convert from one format to another. And you tell the scheduler, "These are the formats I support in, and this is the format that comes out." And that's all you gotta do. Some filters are used both as an actual printer driver and some are just used as an inline filter to go to something else.
The PostScript to PostScript filter is an example of one that's used both as a printer driver and as input to the PostScript RIP, because it does all the page sorting and inserting of page commands and that sort of thing, and its output goes either directly to a PostScript printer or into the PostScript RIP. The image to PostScript filter is another example. That's the one that takes an image file and converts it into PostScript for a particular printer.
Now most of the filters you'll see in CUPS either produce PostScript or they produce a raster stream. You may have other programs that go to some other intermediate format. There's a CGM to HPGL filter out there, I think on the bazaar, and that would then convert CGM to HPGL2 and then the HPGL2 would get converted to PostScript and so forth until it got printed. Now filters that produce raster data are called RIPs.
Every filter that you're going to write takes seven standard arguments on the command line. The argument zero is normally your program name. In CUPS we pass it through as the printer name. This is kind of compatible with the System V interface script mechanism. And you'll find that all of the arguments here are actually compatible. So if you have a legacy printer driver for the System V printing system, you can still use it and it just gets treated as a do-all filter within the system.
Each filter gets the current job ID, the printer name, the user name for the person that's printing the job, a title for the job, the number of copies that need to be produced, and then any options that are going through. The first filter gets the file to be printed. That's the file that's in the spool directory. Every other filter after that has to read from the standard input.
So one of the rules when writing a filter for CUPS is that your filter has to be able to read from standard input or from a file. If you can't read from a standard input stream, then you have to copy standard input to a temporary file and then deal with that. There's some filters, like the PDF filter, that have to work that way because they need random access to the file.
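The argument list and the "file or standard input" rule can be sketched as follows. This is a minimal illustration rather than a complete filter: the class and function names are hypothetical, the sample values are made up, and `io.BytesIO` stands in for the real standard input stream.

```python
import io
import shutil
import tempfile
from typing import BinaryIO, List, NamedTuple, Optional


class FilterArgs(NamedTuple):
    printer: str             # argv[0]: the printer name rather than the program name
    job_id: int
    user: str
    title: str
    copies: int
    options: str
    filename: Optional[str]  # present only for the first filter in the chain


def parse_filter_args(argv: List[str]) -> FilterArgs:
    """Parse the standard filter arguments; the file argument is optional."""
    if len(argv) not in (6, 7):
        raise ValueError("expected: printer job-id user title copies options [file]")
    filename = argv[6] if len(argv) == 7 else None
    return FilterArgs(argv[0], int(argv[1]), argv[2], argv[3],
                      int(argv[4]), argv[5], filename)


def open_input(args: FilterArgs, stdin: BinaryIO) -> BinaryIO:
    """Use the spool file when one is named, otherwise read standard input."""
    if args.filename is not None:
        return open(args.filename, "rb")
    return stdin


def spool_to_tempfile(stream: BinaryIO) -> BinaryIO:
    """Copy a non-seekable stream to a temporary file for random access."""
    temp = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, temp)
    temp.seek(0)
    return temp


# A later filter in the chain: no file name, so data arrives on standard input.
args = parse_filter_args(["myprinter", "42", "alice", "Report", "2", "media=legal"])
document = spool_to_tempfile(open_input(args, io.BytesIO(b"%PDF-1.3 example")))
header = document.read(4)
document.close()
```

In a real filter the list passed to `parse_filter_args` would be `sys.argv` and the stream would be `sys.stdin.buffer`; the temporary-file copy is only needed by filters that must seek, as with the PDF case mentioned above.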
There's a lot of environment variables that get passed in. Some of these are kind of like what you get from a CGI application. Some of these are CUPS specific. One to keep in mind is CUPS_DATADIR, which defines where to find the CUPS data files. This allows your filter to not be hard coded for a particular path that CUPS might be installed with. And also CUPS_SERVERROOT, which tells you where to find your configuration files.
DEVICE_URI is the actual device that your driver will eventually talk to. Now, we've kind of loosely modeled all of the printer communications on a final device URI that describes, well, this is an IPP printer, or this is a serial printer, or this is a parallel printer. It's very generalized. It allows us to add new backends in the future without breaking any compatibility, and the applications just don't really care.
The LANG environment variable is set to the language of the print job. So if you have a French user that submits a job, that LANG variable will be fr, but an English user will get en passed in. So you can localize any messages that you produce for the user based on that language.
The PATH is set to /usr/bin and /bin, I think. And it just restricts what normally gets exported. Normally you'd get some really long path that's inherited from the environment and we really didn't want to do that for security reasons. The PPD environment variable tells you where to find the PPD file for the printer.
Usually it will be under CUPS_SERVERROOT, but it could be located elsewhere. Just use that and you can get the PPD file right away. And then the PRINTER environment variable is the same name that you get in argv[0], but if you're writing a script-based filter, you can't really get argv[0] without having the wrong name in there.
For the RIP filters, there's an environment variable that's passed in for them called RIP Cache. It determines the maximum amount of memory that will be used for that individual filter. So, for example, if you have a server that's got five raster printers on it and you've only got 512 megs of memory, for example, and these are large format printers, you can say, "Well, I only want to use 32 megs per printer maximum so that if they're all going at the same time, I'm not going to run out of memory just because I'm printing." Some of the others, SOFTWARE, TZ for the time zone and USER, are available. They're not generally used because they're available in other places and, like the SOFTWARE environment variable, they're not all that interesting.
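A filter might collect the variables discussed here in one place, along these lines. The variable names CUPS_DATADIR, CUPS_SERVERROOT, DEVICE_URI, LANG, PPD and PRINTER are the ones CUPS exports; the helper names, the fallback to English, and the sample values (including the PPD path) are assumptions made for the sketch.

```python
import os
from typing import Dict, Mapping, Optional

FILTER_VARIABLES = ("CUPS_DATADIR", "CUPS_SERVERROOT", "DEVICE_URI",
                    "LANG", "PPD", "PRINTER")


def filter_environment(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Collect the filter-related variables; missing ones come back as None."""
    source = os.environ if env is None else env
    return {name: source.get(name) for name in FILTER_VARIABLES}


def message_language(values: Mapping[str, Optional[str]]) -> str:
    """Pick the language code for user messages, defaulting to English."""
    lang = values.get("LANG") or "en"
    return lang.split(".")[0].split("_")[0]


# Sample values as the scheduler might export them for a French user's job.
sample = filter_environment({
    "LANG": "fr",
    "PPD": "/etc/cups/ppd/myprinter.ppd",   # hypothetical location
    "PRINTER": "myprinter",
})
language = message_language(sample)
print(language, sample["PPD"])
```

Taking the mapping as a parameter keeps the helper testable outside the scheduler; called with no argument it reads the real process environment.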
Now, file filters read from standard input or from the file name that comes in on the command line, but every bit of output that they send goes out on standard output. This lets you pipe into the next filter and the next filter and then finally into a back-end process which talks to the printer. Any status messages or error messages that you want to pass back to the scheduler or the user get passed on standard error.
So naturally, in order to identify them you need to have a little bit of a prefix string. Just do your fprintf to stderr and put in "ERROR:" to say it's an error or "INFO:" to say it's an informational message. There's also things for page accounting and for printer status which will be coming in CUPS 1.2 and we'll talk about that a little later.
Anything except for the debug messages gets passed to the user. So if you have an error message, that gets logged in the error log and it also gets put in the printer state message so that when the user checks the printer status, they'll see, oh, the printer's out of paper.
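The prefix convention is easy to show in a few lines. The sketch writes prefixed messages to a stream and splits them apart again, roughly as a scheduler-side reader might; only the ERROR, INFO and DEBUG prefixes mentioned in the talk are covered, and returning `None` for an unprefixed line is a choice made for this sketch, not documented CUPS behavior. `io.StringIO` stands in for standard error.

```python
import io
import sys
from typing import Optional, TextIO, Tuple

LEVELS = ("ERROR", "INFO", "DEBUG")


def report(level: str, message: str, stream: TextIO = sys.stderr) -> None:
    """Write one status line in the 'LEVEL: text' form."""
    stream.write(f"{level}: {message}\n")
    stream.flush()


def classify(line: str) -> Tuple[Optional[str], str]:
    """Split a status line into (level, text); level is None if unprefixed."""
    prefix, separator, text = line.partition(":")
    if separator and prefix in LEVELS:
        return prefix, text.strip()
    return None, line


def shown_to_user(level: Optional[str]) -> bool:
    """Everything except debug messages is passed on to the user."""
    return level in ("ERROR", "INFO")


log = io.StringIO()
report("ERROR", "The printer is out of paper", log)
report("DEBUG", "sent 4096 bytes", log)
parsed = [classify(line) for line in log.getvalue().splitlines()]
print(parsed)
```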
Once you've written the file filter, you have to register it with CUPS. There's two files you create. One is a types file and one's a convs file. The types file lists a bunch of MIME types and magic rules to identify the file type. For the purposes of printing, the only rules that are of any use are the ones that actually look at the contents of the file.
There's also extension rules that you can use. Those are only used when you're using the web interface. So for printing, you don't really use the extensions at all. In this example, we're registering the application/pdf MIME type and we say any file that starts with %PDF is a PDF file.
The convs file maps the actual filter. It says this is the file format that I take in and this is what I produce coming out and the relative cost of running that filter and then the filter itself. Now if you don't specify the full path to the filter, it'll assume it's in the CUPS server bin directory. If you do specify the full path, it'll run it from wherever you told it to run it from.
And as I mentioned before, CUPS automatically uses the lowest cost solution. So if you have a new PostScript RIP that you install that's faster than Ghostscript, you can give it a lower cost and it will use that one in preference to Ghostscript because it's a better solution, a faster solution.
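The two registration files (in CUPS these carry the .types and .convs extensions) are plain text. The sketch below parses one conversion line of the form "source destination cost program" and applies the "starts with %PDF" content rule. The sample line, the filter directory and the helper names are illustrative assumptions, and only this one magic rule is modeled.

```python
import posixpath
from typing import NamedTuple, Optional


class Conversion(NamedTuple):
    source: str
    destination: str
    cost: int
    program: str


def parse_convs_line(line: str) -> Optional[Conversion]:
    """Parse 'source/type destination/type cost program'; skip blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    source, destination, cost, program = line.split(None, 3)
    return Conversion(source, destination, int(cost), program)


def resolve_program(program: str, filter_dir: str) -> str:
    """A bare name is looked up in the filter directory; a full path is used as given."""
    if posixpath.isabs(program):
        return program
    return posixpath.join(filter_dir, program)


def looks_like_pdf(data: bytes) -> bool:
    """Content rule from the talk: a file that starts with %PDF is a PDF file."""
    return data.startswith(b"%PDF")


conversion = parse_convs_line("application/pdf application/postscript 33 pdftops")
path = resolve_program(conversion.program, "/usr/lib/cups/filter")  # hypothetical directory
print(conversion, path, looks_like_pdf(b"%PDF-1.3\n"))
```

Lowering the cost number on a line is all it takes to make one converter preferred over another for the same pair of types.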
When you write a printer driver, things work just a little bit differently. Instead of registering MIME types and conversion filters in a special file in /etc/cups, you actually put the information in the PPD file for the printer driver. The cupsFilter attribute specifies what filters to run for specific types of files.
In this example, we're saying that the CUPS raster format can be converted to the printer's format using rastertomyprinter. It has a relative cost of 50. If you're printing a plain text file, you can use texttomyprinter and it only has a cost of 50. So when it goes to print a text file, it'll choose that over going from text to PostScript, PostScript to raster, raster to my printer.
Now, you notice in the cupsFilter line, it only shows one MIME type. This is because the printer MIME type, which will be printer/printername, gets inserted there automatically. It's assumed that any filters that you list in the PPD file are going from one format to the printer's native format.
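For illustration, the PPD attribute described above (spelled cupsFilter in PPD files) can be read like this. The PPD fragment is a made-up example using the costs quoted in the talk, the program names are placeholders, and the regular expression handles only the simple one-line form shown.

```python
import re
from typing import List, NamedTuple


class PrinterFilter(NamedTuple):
    source: str
    destination: str   # filled in automatically as printer/<name>
    cost: int
    program: str


CUPS_FILTER = re.compile(r'^\*cupsFilter:\s*"(\S+)\s+(\d+)\s+(.+?)"\s*$', re.MULTILINE)

# Hypothetical PPD fragment for a raster driver that also accepts plain text.
PPD_TEXT = """\
*cupsFilter: "application/vnd.cups-raster 50 rastertomyprinter"
*cupsFilter: "text/plain 50 texttomyprinter"
"""


def printer_filters(ppd_text: str, printer_name: str) -> List[PrinterFilter]:
    """List the driver filters declared in a PPD, adding the implied destination type."""
    destination = f"printer/{printer_name}"
    return [PrinterFilter(source, destination, int(cost), program)
            for source, cost, program in CUPS_FILTER.findall(ppd_text)]


filters = printer_filters(PPD_TEXT, "myprinter")
for entry in filters:
    print(entry)
```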
Once you get out of all the filters, it goes into a back-end process. The back-end process actually communicates with the printer. So when you're printing, and you're printing to a USB port, it goes through the USB back-end and all the print data comes from one set of filters.
If you reconnect that printer to a network port, then it's running one of the network back-ends, but the print filters stay the same. So all that communication is common and generalized so that you don't have to worry about, "Am I able to talk from this driver to this particular back-end?" They all use the same interface.
One of the other things that backends do is they provide the list of available devices. So, for example, on the network backend, you could actually list the available network printers that are out there. So when you're first setting up your printers on your network, it'll see the printers on the network, list them out there, and you'll see I've got a LaserJet 4000 here and I've got a Xerox N40 over there, and just pick them from a list when you're adding a printer rather than trying to remember what the IP address is and what protocol they use and that sort of thing. It can identify that stuff automatically. Similarly, for USB printers and for parallel printers and for serial printers, it'll list out what's connected to the port.
Now another interesting thing you can do is you can define a back end that's kind of a pseudo back end. It acts as an agent for another back end. An example of this might be an SLP back end that could go out there and see what SLP aware printers are out there.
And then once it sees what printers are out there, it can query to find out what printing services are available and then tell CUPS, hey, you've got a printer over here and it supports IPP and LPD and these others. And then when it actually sends a job, instead of going through that SLP back end, it'll use the IPP back end or the LPD back end, depending on how you configure the printer.
Standard CUPS comes with backends for parallel, serial, USB, AppSocket (which is the JetDirect protocol), LPD and IPP connected printers. There's a new FireWire interface that's going in probably in CUPS 1.1.16 and also in CUPS 1.2. If you get Samba, you download it, you configure it with CUPS support, you'll get a program called smbspool. That is the SMB backend for CUPS.
Now, a lot of people ask about what can I do about job accounting when I've got a thousand users and they're all wasting the paper in my printer and what do I do to keep track of that stuff? Well, each job object in the server, and it keeps track of things kind of in an object-oriented fashion, though it's not C++ or anything like that.
Each job object has a bunch of state information and as long as that job object is being tracked by CUPS, you'll be able to query that information and see how many pages have been printed, what the state of the job is, if it's being held or if it's completed or if it aborted because the printer said, "Sorry, I can't print Chinese text on the printer." You know, whatever condition happens to occur for the job when it's being printed, that's reflected in that job object.
Now, when you're printing to a non-PostScript printer, the accounting information is 100% accurate. We know this because when we print a file, we're preparing each raster page or each text page and then sending it off to the printer. So we know exactly how many pages we're sending out and so we can keep track of the information that way. When you go to a PostScript printer, things are a little bit different right now. It only uses the page comments in the PostScript file. Normally, if you're printing from a Mac client or a Windows client, this isn't a big issue because those systems do produce DSC compliant files.
However, there are some applications and a lot on UNIX that tend not to be 100% compliant. We do the best we can, but we can't guarantee right now that postscript printers will have 100% accurate accounting. That is coming in CUPS 1.2 when we're actually adding a back channel support so that we can query the printer, find out from the printer how many pages did you print, and then put that information into the database.
Now, as the file is being printed, there is a page log file that actually lists out every file that's printed, every page that's printed on each printer, what user printed it, and when it was printed. So you can actually track down, if you see that there was a failure on the printer, you can see, oh, well that was on page four in the document.
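To make the page log idea concrete, here is a small sketch that totals printed pages per user. The line layout (printer, user, job ID, bracketed timestamp, page number, copies) is an assumption for illustration and should be checked against the CUPS documentation for your version; the sample lines are invented.

```python
import re
from collections import Counter

# Assumed layout: printer user job-id [timestamp] page-number copies
LINE_RE = re.compile(
    r"^(?P<printer>\S+) (?P<user>\S+) (?P<job>\d+) "
    r"\[(?P<when>[^\]]+)\] (?P<page>\d+) (?P<copies>\d+)"
)


def pages_per_user(lines):
    """Sum printed pages per user; each log line is one page times its copies."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        totals[match.group("user")] += int(match.group("copies"))
    return totals


# Invented sample data, not real log output.
SAMPLE = [
    "laserjet alice 12 [04/May/2002:10:15:00 -0700] 1 1",
    "laserjet alice 12 [04/May/2002:10:15:02 -0700] 2 1",
    "deskjet bob 13 [04/May/2002:10:20:41 -0700] 1 2",
    "garbled line",
]

print(dict(pages_per_user(SAMPLE)))
```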
[Transcript missing]
Now, quotas is another big thing. How do you do quotas with CUPS? Well, you can set quotas on any printer or any class, which is a collection of printers. The quotas currently are based on the number of kilobytes or the number of pages, and it can be over a specific period of time, or it can be just forever. The current setup you can say, I want to allow the users to print 100 pages per week and if they go over that limit their jobs get cut off. You can set one global limit on each queue for all users.
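The "100 pages per week" rule can be sketched as a check over a user's recent print history. This is an illustration of the idea, not how the scheduler implements it; the function name, the history format, and the convention that a period of 0 means "forever" are all assumptions.

```python
WEEK = 7 * 24 * 3600  # seconds


def within_quota(history, now, period, page_limit, new_pages):
    """Check whether printing new_pages keeps a user within the page limit.

    history: list of (timestamp_seconds, pages) for one user on one queue.
    period:  length of the quota window in seconds; 0 means no window (forever).
    """
    if period > 0:
        used = sum(pages for stamp, pages in history if now - stamp < period)
    else:
        used = sum(pages for _, pages in history)
    return used + new_pages <= page_limit


history = [(0, 60), (500_000, 30)]
print(within_quota(history, 600_000, WEEK, 100, 10))  # both jobs still count
print(within_quota(history, 600_000, WEEK, 100, 11))  # one page too many
print(within_quota(history, 700_000, WEEK, 100, 70))  # first job has aged out
```

A kilobyte-based quota would follow the same shape with job sizes in place of page counts.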
That's something also that is changing in 1.2. There's actually an accounting API that we'll be providing so that you can plug in whatever accounting system you want if it's based off of a database or some other specialized functionality. When we went out and asked people what they wanted, we got a thousand different answers. So we figured that was the best solution.
Now, another big advantage of CUPS is how it supports network printing. Traditionally, you have a server and then on each client you'll configure it, say, okay, I want to provide this printer and this printer and this printer. Then you go to the next client and you do the same thing. And if the server changes or the queue's changed, you have to go to every client and change them or you have to come up with some solution to push the configuration files out to the clients.
Not very convenient. Certainly for the user it can be frustrating because if the printing situation changes suddenly, it won't be reflected on their systems. So what we do with CUPS is we use directory services. We have two protocols we support right now. In CUPS 1.2 we're probably going to be adding three more. One of them will be Zeroconf. There's also LDAP and so forth. We'll cover that later.
The first native protocol is called CUPS browsing protocol. It's broadcast based, so it doesn't go past the subnet. It's very simple. It just sends out, "I've got a printer here," and then the clients receive that and say, "Okay," and they log it locally so that the user will see the printer.
When the clients are assembling their list of printers, they can create implicit classes. It basically means you see two LaserJet 4000s out there and say, "Okay, well, we'll tell the user there's one LaserJet 4000 and they print to it and it'll go to the first available one." So it's very easy to set up not only redundant printing, but load balance printing. And of course if you have two labs and they happen to be on the same subnet but you don't want to show the printers from one lab in one area and the other lab in the other area, you can filter out those printers very easily.
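A rough sketch of how a client might build implicit classes from what it hears on the network. Grouping by identical printer name and picking the first server that is not busy are assumptions made for illustration; the server and printer names are hypothetical.

```python
from collections import defaultdict


def implicit_classes(announcements):
    """Group announced printers by name; names seen on 2+ servers form a class.

    announcements: iterable of (server, printer_name) pairs.
    """
    by_name = defaultdict(list)
    for server, name in announcements:
        by_name[name].append(server)
    return {
        name: sorted(servers)
        for name, servers in by_name.items()
        if len(servers) > 1
    }


def pick_server(servers, busy):
    """Return the first member of the class that is not busy, or None."""
    for server in servers:
        if server not in busy:
            return server
    return None


seen = [("srv1", "laserjet4000"), ("srv2", "laserjet4000"), ("srv1", "deskjet")]
classes = implicit_classes(seen)
print(classes)
print(pick_server(classes["laserjet4000"], busy={"srv1"}))
```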
Now, the CUPS browsing protocol, like I said, is broadcast based. It just regularly broadcasts printer information: that the printer is available and its current status. When the status changes, it sends out a new broadcast. Each broadcast is about 80 bytes per printer. It varies with the size of the server name and the printer name. And the default configuration sends it out every 30 seconds.
When the clients receive the packet, they will add the printer if it's not already there. And if it is there, it will say, "Okay, I saw this printer, so I'm not going to try to remove it in the timeout interval." The default timeout interval is like 300 seconds.
So if it doesn't receive an update after 10 tries, then it says, "The server must be down or the printer's gone away." And it will remove it from the list of available printers. So if you were at the networking session before, the laptop that got connected to the printer, it made the printer available on the network and the other computer saw it immediately and added that to the user interface. That's that functionality here.
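The add, refresh and time-out behavior can be sketched with a small cache keyed by printer. The class and method names are hypothetical; only the 30 second interval, the 300 second timeout and the roughly 80 bytes per printer come from the talk.

```python
class BrowseCache:
    """Client-side list of printers learned from browse broadcasts."""

    def __init__(self, timeout=300):
        self.timeout = timeout   # seconds without an update before removal
        self.last_seen = {}      # printer -> time of the latest broadcast

    def on_broadcast(self, printer, now):
        # Adds the printer if it is new, otherwise just refreshes its timer.
        self.last_seen[printer] = now

    def expire(self, now):
        """Remove and return printers not heard from within the timeout."""
        gone = [p for p, seen in self.last_seen.items() if now - seen > self.timeout]
        for printer in gone:
            del self.last_seen[printer]
        return gone

    def printers(self):
        return sorted(self.last_seen)


def browse_bytes_per_second(printer_count, packet_bytes=80, interval=30):
    """Approximate steady-state broadcast traffic from one server."""
    return printer_count * packet_bytes / interval


cache = BrowseCache()
cache.on_broadcast("laserjet4000@srv1", now=0)
print(cache.expire(now=300), cache.printers())  # still within the timeout
print(cache.expire(now=301), cache.printers())  # ten missed updates, removed
print(browse_bytes_per_second(30))
```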
Now as far as the client support, CUPS provides native IPP and LPD support. The IPP support is right in the scheduler and the LPD support is through the cups-lpd mini-daemon. The mini-daemon approach was chosen because, number one, LPD is not the friendliest of protocols to deal with, and number two, it was a lot easier to do it that way and ensure that the scheduler itself would not get caught up in some state that it wouldn't be able to recover from.
IPP supports encryption. We use the OpenSSL library to provide 128 bit SSL and TLS encryption. We support both persistent encryption, so when you connect it's always using encryption, and also HTTP upgrade, so that you can upgrade from an unencrypted connection to an encrypted connection before you actually send the print job. You notice if you do a man on lp or lpr, they support an option, -E, that will encrypt your job.
So if you do lpr -E mycontract.pdf, it'll send that job to the CUPS server encrypted, so that nobody else will be able to see it. You know, as much protection as you can get from that. You can enable the encryption on all requests or you can just make it optional. The encryption can be required by the client or by the server or by both. It's a very flexible scenario.
Now as I mentioned before, printer drivers use PPD files. We use the standard PPD files that you get for any PostScript printer for Windows or Mac OS. And if you have a non-PostScript printer, you're still using a PPD file and it's got some extra information to tell CUPS that it's a non-PostScript printer file. Besides the cupsFilter attribute, there are also color profile attributes. There's an attribute to specify whether or not the printer can produce its own copies, and other assorted attributes that you can use to tell the scheduler and any filters that you need special behavior from them.
As far as clients are concerned, they get the PPD file for the printer using HTTP. The URL for a printer might be ipp://servername/printer/foo and the PPD file would be servername/printer/foo.ppd. So it's very easy to get the PPD file. The CUPS API provides a complete set of functions for accessing this. So if you wanted to get a PPD file, there's a function called cupsGetPPD and you get a PPD file and you can open it up and do whatever you need with it.
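The mapping from a printer URI to its PPD location can be sketched as a simple URL rewrite. This is an illustration, not part of the CUPS API: the helper name is hypothetical, and it assumes the server answers plain HTTP on port 631, the registered IPP port, when the URI gives no port.

```python
from urllib.parse import urlsplit, urlunsplit

IPP_PORT = 631  # registered port for IPP; assumed to serve HTTP as well


def ppd_url(printer_uri):
    """Derive an HTTP URL for the PPD file from an ipp:// printer URI."""
    parts = urlsplit(printer_uri)
    port = parts.port or IPP_PORT
    netloc = f"{parts.hostname}:{port}"  # IPv6 literals are not handled here
    return urlunsplit(("http", netloc, parts.path + ".ppd", "", ""))


print(ppd_url("ipp://servername/printer/foo"))
```

In a C application the `cupsGetPPD` function mentioned in the talk would normally do this work for you.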
Now that API provides not only those convenience functions for doing the day-to-day stuff your application might need. They also provide the low-level functions for HTTP and IPP. Now, like the Apple solutions, they're all finite state machine based and they keep track of all the stuff for doing SSL and TLS and digest authentication. All that stuff is very straightforward, very fast and very simple. The PPD files are accessed using PPD functions, obviously, and they load in all of the attributes in the PPD file so that you can access them from your program rather than writing your own PPD parser.
The other part of the CUPS libraries that are installed on the system is something called the imaging library. Now, aside from supporting the printer drivers with the raster functions, these are also used by the image filters to load image files, do scaling on the fly, and do color space conversion and color management. This normally you wouldn't use in your applications, but in your filters you might find them useful if you're dealing with raster images. The scaling modes that are supported are nearest neighbor, which is used for draft output, and bilinear. We're also adding bicubic in CUPS 1.2.
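To show the difference between the two scaling modes, here is a plain Python sketch on a tiny grayscale image stored as a list of rows. It is not the imaging library's code, and the function names are hypothetical.

```python
def scale_nearest(img, new_w, new_h):
    """Nearest neighbor: copy the closest source pixel (fast, blocky)."""
    h, w = len(img), len(img[0])
    return [
        [img[y * h // new_h][x * w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def scale_bilinear(img, new_w, new_h):
    """Bilinear: blend the four surrounding source pixels (smoother)."""
    h, w = len(img), len(img[0])
    out = []
    for j in range(new_h):
        # Map the output row to a fractional source row (corners align).
        fy = j * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, h - 1)
        ty = fy - y0
        row = []
        for i in range(new_w):
            fx = i * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, w - 1)
            tx = fx - x0
            top = img[y0][x0] * (1 - tx) + img[y0][x1] * tx
            bottom = img[y1][x0] * (1 - tx) + img[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


img = [[0, 100], [100, 200]]
print(scale_nearest(img, 4, 4))
print(scale_bilinear(img, 3, 3))
```

Nearest neighbor only repeats existing values, which is why it suits draft output, while bilinear produces intermediate values between neighboring pixels.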
Now, besides the book, there's a lot of documentation that comes standard with CUPS for the end user and for the developer. There's a typical administrator's manual that takes you through how to set up CUPS, how to add printers, how to do the typical administrative type of things. There's a programmer's manual that's a reference and a tutorial for the CUPS API.
And there's the user's manual that has all the stuff the user might need to know on setting up printer instances, printing files with special options and so forth. And of course, the ubiquitous man pages where you do a man LPR, man filter, and you get your quick summary.
Now, aside from the normal kind of documentation you get, we have a lot of documentation that we've generated, primarily because we came from the military arena and this kind of documentation is standard there. So there's a configuration management plan. If you do code submissions for CUPS, there are guidelines in there for how the code has to be formatted and documented. It's very straightforward, no stringent requirements, just a basic this is how we do things.
There's a document that describes the CUPS implementation of IPP. If you're doing specific IPP requests, you'll want to look at that document because it does describe every single attribute and every single operation we support and how they're used within CUPS. The interface design document covers all of the interfaces within CUPS, both internal and external. If you're using anything with the raster filters, it's useful there. A lot of the content there refers to publicly available specifications such as IPP and the PPD spec.
The overview document is kind of a short version of this presentation and it covers all of the functionality that's in the printing system. Software design description is your typical high level design document that kind of sort of gives you an idea of what the printing system does and how it does it, but you really need to look at the code because it doesn't go to the detailed level.
Software security report is something we developed in response to Linux vendors specifically that wanted to know what are the risks involved with using CUPS. And you're providing this IPP printing service and doing all these things with a network. How is that going to affect the security on the system and what are the risks? This document describes them.
There's a translation manual. A lot of people have been submitting translations for a very long time. I'm not going to go into all the details of the translation manual.
And finally, there's a software version description document which outlines the changes from the old versions to the new versions. It's pretty boring and if you look at the change log you'll probably get more information.
Okay, what's coming in the new version of CUPS? This is not what you'll have in Jaguar, but you'll probably see this in some future version, I'm sure. IP version 6 support is high on the list. Obviously, a lot of people are looking at this seriously now because we're running out of version 4 addresses.
For the current version of CUPS 1.2, there is a development tree in CVS that you can download. That version is fully IP version 6 functional, except it doesn't support version 6 IP addresses in URLs yet. That is being addressed. It also will have full ICC color profile support. The beginnings of that are in there, but not complete yet.
It should be within the month. There's support for two new operations from the IPP standard. One is called Print-URI, so you can say print slashdot.org and it will go and grab the home page from there and print it out on your printer. And similarly, Send-URI is used when you're sending an individual page in a collection of pages rather than just one.
One of the things that we get often is people want to print a compressed PostScript file. You have a document from the internet that's been gzipped. This will allow you to do that and it tells the scheduler that the document is compressed and actually will send the document to the scheduler even if it's off in some other place in the world. It will keep it compressed until it gets to the server.
Another big thing is IPP notifications. We have a lot of new events that we're adding to the standard IPP notifications. So whenever the printers change, whenever you have a fault on the job, you can get notified of that and take whatever appropriate action. There's three notification methods we'll be supporting: IPP Get, INDP and Mailto. Mailto is probably the one you usually use.
LDAP support, another big one. We may be supporting authentication, certainly the directory services for listing the available printers. Another big addition for the backends is back channel support, so you can actually feed data from the print drivers into the backend and then the printer responds back with some message. That'll make its way back to the printer driver and you can do your chats with the printer to find out if it's got the right ink cartridge in there and so forth.
Mentioned before, Firewire. There are some printers out there that only work good with Firewire. Why that should be, I don't know, but they take a lot of data and they really need the speed that Firewire offers. We're working on Linux right now. We'll be adding support for Darwin. It will be in there and it will probably be the fastest interface we support.
There's also extended PPD options. Right now everything's restricted to a Boolean or pick from a list, or pick many from a list. We're adding support for additional things so you can specify a number and text and different types of attributes. Those are going in there. They have kind of a backwards compatibility mode for standard printer drivers. So they'll see a standard PPD file with some funny attributes that are intermixed in there, but they'll ignore them.
And the CUPS-aware applications will see, oh, this is a special attribute and I can act accordingly. This is mostly in response to the GIMP print project where they want to have a slider for the gamma control for each of the individual color channels and the density and so forth.
Another feature that's been requested a lot is policies for the IPP operations. Right now, you can say, "Alright, I only want people in this group to be able to add printers." But that also applies to starting and stopping printers and accepting and rejecting jobs, basically doing any administrative tasks.
So the new policy mechanism will allow you to say, "Well, users from this group can do printer administration on this printer." Or, "These users are allowed to cancel jobs, but they're not allowed to start and stop the printer." It'll be very flexible. You have total control over individual IPP operations in CUPS.
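The per-operation idea can be sketched as a table from IPP operation name to the groups allowed to perform it. This is not CUPS configuration syntax; the group names are hypothetical, and denying any operation the table does not mention is an assumption of this sketch.

```python
# Operation names are standard IPP operations; the groups are made up.
POLICY = {
    "Cancel-Job": {"operators", "admins"},
    "Pause-Printer": {"admins"},
    "Resume-Printer": {"admins"},
}


def is_allowed(operation, user_groups, policy=POLICY):
    """Return True if any of the user's groups may perform the operation."""
    allowed = policy.get(operation)
    if allowed is None:
        return False  # assumption: unlisted operations are denied
    return bool(allowed & set(user_groups))


print(is_allowed("Cancel-Job", ["operators"]))     # may cancel jobs
print(is_allowed("Pause-Printer", ["operators"]))  # may not stop the printer
```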
The new version also has extended raster support. One of the comments that we've received in the past is, "Well, you don't have enough user-defined options in there for us to pass into the RIPs." Or, "You don't have 16-bit per channel support," or, "You don't have ICC color spaces." Well, that's all coming in the new version and it works very well.
Finally, there's full localization for the command line interfaces. Right now you get English. In the future you'll get whatever language you happen to be talking and the interface also supports transcoding to local character sets. So message catalogs are in UTF-8 and you'll get them in whatever character set you happen to be configured for.
Just what I said. There will be more printer drivers. We're looking to include drivers for Canon and also we'll probably have the full set of Xerox, HP, Okidata, Genicom and I forget. There's another printer manufacturer, but they've all agreed to provide us with PPDs and we can provide them standard with CUPS. So the number of drivers that will be available out of the box will be substantial.
One problem that we ran into with USB and will probably run into with FireWire is you disconnect a device and then you connect a different one and the print system still thinks it has the old one. You know, you disconnect an HP, you connect an Epson, it still thinks it has an HP connected. There will be a new device monitor that will keep track of those changes and automatically update the print queues as needed so that they can still operate.
Well, Richard mentioned the CUPS website. There's also our website for our company. There's the commercial version of CUPS called ESP Print Pro that may or may not make its way into Darwin or into Mac OS, but we provide a lot of printer drivers there, over 3,000 right now. And then there's the GIMP Print Project, which has all of the free drivers for various inkjet printers primarily. There's also OMNI and HPIJS and so on. You go to any of these pages, you'll be able to find links to all of the available open source and commercial drivers for CUPS.
There are a lot of add-ons for CUPS. There's a SourceForge project out there for those add-ons. So basically any kind of add-on you're looking for, a graphics panel or a printer driver, you'll find it there as well. And then, for any information on IPP, go to the Printer Working Group web page. They're the ones that actually run the working group for developing IPP. They work in conjunction with the IETF IPP working group. In fact, most of the members are shared between the two groups.
And that's where you'll find all the IPP specs. And those are the RFCs you want to look at. 2910 and 2911 are the current IPP 1.1 specifications. There are a bunch of new RFCs that hopefully within the next couple months will actually be published. They've been in the queue for a little over a year now.
And they're catching up. It's been a little difficult to get them published. And there's the old LPD specification, RFC 1179. If you have a particular problem with an LPD client, you can use this as a reference and see if there's anything we're doing wrong or anything your client's doing wrong. And with that...
I'm an imaging evangelist.
I'm responsible for most of the graphics tracks here at WWDC. What I wanted to do is sort of bring things together for a moment. What we've done here is tell you about CUPS, the printing architecture that's being put into Darwin. And we've told you about CUPS in terms of what its full functionality is on Darwin and other forms of UNIX and Linux. There's a second part to that story, and that story is how Mac OS X as an operating system is going to leverage CUPS.
And that's actually what's happening tomorrow in Hall 2 at 10:30 in the morning. In the printing and Mac OS X session, we're going to spend a lot of time communicating how we plan to leverage CUPS and expose it in terms of UI and functionality in the operating system. So it's sort of the companion piece. They're basically bookends. So if you like what you heard in this session and you need to find out more about the specifics of what we're going to plan to do with CUPS and Mac OS X, please attend that session. Thank you.
Now, if you need to contact me with any questions relating to graphics and imaging and specifically printing and CUPS, please feel free to send me an email at [email protected]. That's my email address up there. And what I want to do is bring up a Q&A panel so we can field questions from the audience relative to what we've spoken about here today.