Information Technologies • 1:13:22
Apple continues to introduce hardware and software targeted at scientific computing which has spurred increased adoption of Mac OS X in the sciences. This session will review Apple's technological advancements supporting scientific application development, Apple's momentum in the market, and the variety of initiatives underway throughout the company and the community to further science on the Mac.
Speakers: Elizabeth Kerr, Simon Patience, Liang Hoe, Alyssa Goodman, Jay Lyerly, Brian Gupton
Unlisted on Apple Developer site
Transcript
This transcript has potential transcription errors. We are working on an improved version.
Right, so our agenda for this afternoon's session is I'm going to give a brief overview of what we're doing in the scientific computing for Mac OS X market. Simon Patience is going to give us a nice presentation on technology and trends in scientific computing. You'll also see a few really nice demos from our customers and developers, which I think you'll really enjoy. We'll do a quick summary and hopefully we'll have time for Q and A. So please think about some interesting questions for the presenters while we're going through the session and we'll get to that at the end.
So it's been another year since we were here last year talking. And there's been a lot going on in the market. I wanted to first start out with what I think is really good news. And you guys may not see data like this very often, especially at the developer conference.
We poll the market about once a year to see how Mac is doing. And one of the reasons why this is relevant today is because what's essential to our success is what you guys are doing for the platform. And I think you see that what we're doing is really working. If you look at our total market share growth, it's gone from 15 percent to 19 percent.
And that may not seem huge, but it actually is a big deal, because the more we get the more we'll continue to get. It's really particularly good in, in the academic segment and in the government segment, where you can see our growth is fantastic. Industry we have a little bit of work to do still.
But I'll talk about how, in some of our programs, we hope to continue to address that and hopefully make some more inroads there as we continue into the next year. Another piece of data that I'm showing you is just how many people use Mac in their day to day work, in the lab, for scientific research. And it's really about 45 percent of all scientists who use a Mac on a daily basis.
They don't all own a Mac, but they all use one. So we're really touching a lot of people out there. And I really think again that this has so much to do with the fantastic applications that you all are bringing to the platform, especially as we add features to Mac OS X and to the hardware.
So how do we partner with our developers? I wanted to talk a little bit about this and just some of the new things we're doing this year as well as some of the stuff we do every year, but I'll just remind you for those of you that haven't been here before and heard my, my talk.
Just starting with conferences and trade shows, there's the standard national conferences and big shows like Society for Neuroscience, which we'll be doing this fall. But we also do some market specific ones. And again, looking back at the previous slide, how do we address industry? Bringing some of your solutions with our solutions in front of customers at a conference like Drug Discovery and Development, which is really focused on the pharmaceutical and biotech industry. We also participate in some user group meetings.
And these are specifically ones that you all put on. So if you have user group meetings that you think would be valuable for, for Apple to participate in, please let us know and we'd be happy to think about it and look at a way that we could play a role.
And another thing we've done are some institution specific events, again bringing in our partners to talk about their applications. The one I'm showing on the screen is one at the NIH, they have a training session program where they invite industry people to come and participate. So we've done that too, really nice audience there.
Another element of what we've been doing this year is providing more of what we call seminars online. These are available on the Apple dot com website, at seminars dot apple dot com. And we're really focusing on more of a how to element, to bring people in. The first is about how to build a cluster. This is simple in so many ways, but it really is helpful. So it's, you know, basically how you plug everything in, how you rack the servers and so on.
But we also partnered with the BioTeam to talk about their application iNquiry, which runs on an Apple workgroup cluster. And the second one is Getting Started with OsiriX, which for those of you who don't know is a fantastic open source medical visualization application. And this is a radiologist presenting how you download and use this application from the get go on a Mac.
We're really interested in doing more of these, so again, very open to your ideas on what might be useful and ways to really reach out and touch this community. Another pretty interesting initiative we've done is something called the technology immersion program, getting back to this industrial type customer and the pharmaceutical and biotech industry. You know, how do we get them to just take a look at our hardware, take a look at the applications that you guys are making for Mac OS X? Because we know that once they get it in their hands, they like it.
They want it and they're compelled by it and they can see that it really is a viable option for them. And more than a viable option, actually a preferable option. And so our goal here was to put these systems in their hands. Our first area of focus was computational chemistry, because over the past year or so we've gotten some fantastic applications ported over to Mac OS X.
So a very nice suite of applications for making a computational chemistry workstation. So the customer gets the Mac Pro with a 30 inch Cinema Display, a bunch of apps for chemistry, a bunch of apps for productivity. And we leave it to them and their IT group to set it up, get it started and use it for a few months to see how it works. What we get back is feedback.
Is it really all that we think it is? Is there anything else we need to do to make it a viable option? And we get access to their IT groups. So here's a bunch of IT people that may not have considered even touching a Mac, or they're completely Windows or Linux or both centric, but not Mac.
And they engage with the system, they see that it's not impossible, or even that difficult, to get it on the network. And it's a really interesting program that's working quite well for us. This is a very different type of program, switching gears a little bit.
Our Apple distinguished scientist program. This is a brand new program that's really being headed up by our higher ed science group. And what they're looking at is how do we find ambassadors out to the scientific community for Apple that have more a voice to the customer. And these are really the people who are using Apple technology for doing hardcore research themselves.
And so the goal here with this program, these awards, is to find these ambassadors if you will and learn from them, hear back from them as almost an advisory group and also have them talk out to our customers for us about how they use the technology and just represent Apple in a much more industry, scientifically focused way.
I want to talk a little bit about our website. We speak most often through our website, Apple dot com slash science. And I wanted to spend a few minutes today, because we have recently redesigned this, to take you on a quick tour. So I'm going to go over here and we're going to switch over to one of these demo systems magically.
( Laughter )
Now I can't see it, fantastic. Okay, so, I'll be able to do this. So this is the new home page. What we did was we really wanted to make it more interactive. And again, have a better way to highlight what our developers are doing and focus on solutions on the platform, not just the hardware and the software. But what is a scientist going to do with hardware and software? So some new sections, or better sections I should say, because they all were here before pretty much. One is software for science. Let's take a look at that.
And you can see it's much more visually appealing, it's much more engaging and it's easier to find applications that you're looking for. And what I really like in addition is, you know, you can break it down by discipline here, but also if what they're looking for isn't right there, they can just click right there into the Mac Products Guide and do a search.
I'm going to mention the Mac Products Guide at the end as well. But if you haven't updated your application and your description, or don't even have your application in the Mac Products Guide, please put it there. People do use this. And if they don't find it there, they just assume that your application doesn't run on a Mac. So another thing that I think is pretty interesting is the science solutions section.
Excuse me. Here we're taking a look at different disciplines and saying how are people using the hardware again and what should they be, what kind of applications work, what should they buy. Let's just take a quick look at medical imaging. I think this is a fantastic example. What you can see it just goes into a lot more detail and then there's your partners and suggested software applications for that particular area of focus right there.
And then the profiles page, and the profiles are these customer profiles. We often get great ideas from you and we profile our developers, customers, somebody who's using a fantastic application and the Mac to do research, interesting research. We're always open to new ideas. If you've got suggestions, please let us know. This is one of the most popular parts of Apple dot com slash science. It gets more traffic than any other part. People love these profiles. And this is just a much better navigation to search out different topics, and all of these go much deeper.
So we like to keep this up to date and add new profiles as often as possible. So again, please, if you've got some new ideas or customers you'd like to suggest, then let us know. The final thing I'll talk about is inside the image. Now this is pretty different, but it's also been quite popular. It's really our look at how art meets science.
And in this part of the website, these are small columns. They're interview style, where we talk to a scientist that has come up with or used a particular image to convey a concept or an idea with a scientific base, and talk about how they came up with it, how they made the image, what went into it, a little of the, you know, the art side of things.
And again, it's been really compelling and we get a lot of information and discussion about this art meets science idea. So with that, I'm going to go back to my other slides. I just wanted to give you a sense of some of the new elements on the website. And encourage you to check it out if you haven't already done so. Okay, so let's talk about the communities.
Most of you have heard of some of these, but we have three main areas I think are worth talking about a little bit. Mac Research, and I'll talk more about that in the next slide, Mac enterprise dot org, which is more focused on the IT side of things as opposed to the research side of things with Mac research. And the developer connection. These are all areas where you're going to find great community, great information about scientific computing as a developer.
Mac research is really a fantastic success. Now this almost, it's about two years old this month. It started in June of 2005 this was launched. And today they have over 2000 active users. They get 30000 unique visitors a month and over 200000 page views a month. This is a fantastic place for you to be talking about your application.
If you have a customer that can write a review of your application and post it on Mac research, or if you just want to ask the community a question about your application, about functionality, anything like that, there's a bunch of people from Mac research here at the conference, who are going to be at the science connection on and off, and throughout the week you can meet them. But I encourage you, if you haven't checked this out, to please do so.
And I think even nicer is as of I think in a couple of weeks, shortly anyway, they've redesigned the site and it's going to go live with a new look that's much more interactive and it'll be more user friendly. So I think we'll get even more people using Mac research as a resource and really fantastic place to focus the audience. So I encourage you to, to make as much use of this as you can.
Okay, finally as I promised, I'm going to talk about the Mac products guide because we really encourage and want to, I want to end with this, to encourage you to make sure that your application description and the information about your application in the Mac products guide is compelling and up to date.
And if you're not in, all it takes is a quick submission to get yourself listed on the Mac products guide. Our customers do use this resource to find out what applications are available. So if you're not there you're probably missing out. So with that, I'm going to turn the presentation over to Simon Patience. Thanks for your attention. ( Applause ) >> Thank you.
Good afternoon. I'm Simon Patience. I'm the vice president of Core OS, that's the Unix part of Mac OS X. So today I'm going to talk about four areas that are influencing science and are driving the direction of scientific computing. And the idea is to basically show you the influence that Apple is having in these various different areas.
( Period of silence )
So the trend in data management is that fundamentally there's an explosion of data. And it's driving scientific computing in various different ways. A scientist from Johns Hopkins was predicting that there will be more data collected in the next five years than in all of previous recorded human history.
So that's a pretty amazing statistic. Now when you have all this data, you have to manage it, and by manage I don't just mean store it on a disk. You have to find better ways to organize it, to be able to search it, to be able to find the information in there. And that creates a new set of challenges for us and sort of new database infrastructures.
In addition, we're getting data sources from increasingly different and diverse places. Multiple different instruments, but also different geographic areas. And so that creates, creates a new set of problems. And so now it's got to the point where we have so much data that it's more important now to find the information that's hiding in that existing data than it is to get new data.
( Period of silence )
So here's an example of the data explosion. This is GenBank, which is a publicly accessible central database of genetic information. One of the largest of its type in the world. And in the last few years, last five years, we have grown from 17 billion to 75 billion entries in the database, the number of base pairs in the database. This is a growth rate of doubling every 18 months. So we're basically going exponential here in the growth rate.
Another source of large amounts of data is the Large Hadron Collider at CERN, Switzerland. This is a proton proton and heavy ion collider, so it's a real sort of atom smasher. It's 27 kilometers in circumference, one of the largest in the world. 40 million collisions per second. So this machine, this system can produce 10 petabytes of data per year. That's a huge amount of data to be able to analyze. This is scheduled to start in September this year.
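As a rough check on the growth figures quoted above, the short sketch below works out the doubling time implied by going from 17 billion to 75 billion over five years, assuming steady exponential growth. The function names are made up for this illustration, and the input numbers are simply those given in the talk as transcribed. On those numbers the implied doubling time comes out nearer 28 months than 18, so one of the quoted figures may be approximate.

```python
import math


def doubling_time_months(start, end, elapsed_months):
    """Months needed to double, assuming steady exponential growth."""
    doublings = math.log2(end / start)
    return elapsed_months / doublings


def projected(start, elapsed_months, doubling_months):
    """Value after elapsed_months if the quantity doubles every doubling_months."""
    return start * 2 ** (elapsed_months / doubling_months)


# Figures as quoted in the talk (base pairs, five years apart).
start_bp = 17e9
end_bp = 75e9
months = 5 * 12

implied = doubling_time_months(start_bp, end_bp, months)
print(f"Implied doubling time: {implied:.1f} months")

# What a strict 18 month doubling would have produced over the same period.
strict = projected(start_bp, months, 18)
print(f"18 month doubling would give about {strict / 1e9:.0f} billion")
```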
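To put 10 petabytes per year in more familiar terms, the sketch below converts it to an average sustained data rate. It assumes decimal petabytes and, unrealistically, that data is recorded evenly around the clock all year, so the result is only an order of magnitude guide. The names used are invented for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
PETABYTE = 10 ** 15  # decimal petabyte, an assumption for this estimate


def average_rate_bytes_per_second(bytes_per_year):
    """Average rate if the yearly volume were spread evenly over the year."""
    return bytes_per_year / SECONDS_PER_YEAR


yearly_bytes = 10 * PETABYTE          # figure quoted in the talk
collisions_per_second = 40_000_000    # figure quoted in the talk

rate = average_rate_bytes_per_second(yearly_bytes)
print(f"Average rate: {rate / 1e6:.0f} MB/s")

# Averaged over every collision, very few bytes are stored per collision,
# which suggests that only a small fraction of raw collision data is kept.
print(f"Stored bytes per collision: {rate / collisions_per_second:.1f}")
```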
So the next thing is that we've got, you know when we gather all this data, the next problem is, is not only to store this data, but you know how do you process it? And so what we really want is you know your super computer. So what's, you know what sort of computer would you want to process this sort of information from? We're not talking about necessarily where you'd store it. But you know to process this information, you need a decent sized machine. So we look at storage, the memory, processor and obviously it needs to be MP.
So, you know, three terabytes of storage is a reasonable amount to keep your sort of working set of data in, I guess. So 16 gigabytes of memory would be a good amount of memory to be able to process reasonable sets in. Three gigahertz Xeons, obviously, that's a fast processor, and naturally 64-bit. With large amounts of data, 64-bit processors are essential. And MP, of course, you need to be able to crunch these numbers effectively. So let's have eight of them.
All right, what would this super computer look like? It looks like that. Which is pretty amazing, that you can get all that computer power in that box, and you throw in a 30 inch display, or in this case a 5 inch display by the look of things. Then it's an amazing amount of power that Apple can bring to your desk side, basically. To help you process this huge quantity of data.
And to show you what we can do with such a large amount of data, I'd like to invite Marion Ho up, who is a science partnership manager at Apple, to demo a 64-bit OsiriX on a dataset that is pretty challenging.
( Applause )
So one of the big problems in medical imaging, it's actually one of the driving forces, is, so just switch over here.
One of the driving forces in medical imaging is just the amount of data. So as we mentioned, OsiriX is one of the very popular what we call medical DICOM viewers. A medical imaging viewer, if you're looking at data from things like CAT scanners.
And of course CAT scanners, as those of you in the medical field know, can generate a lot of data. Right here we have about 3000 slices, which represents about 4 gigabytes of data. And what we're seeing is OsiriX here in 64-bit on Leopard, running all those three thousand slices in memory in a side by side study.
Why is this impressive, you might say? Well, it's all gray and everything. Well, frankly, if you think about it, before 64-bit, before we could do this on the Mac with OsiriX, you really only had two options. You're dealing with 32 bit, so you had to use more virtual memory or you had to down sample your dataset.
And neither of those choices is obviously preferable. With virtual memory you run into a lot of performance issues, and with down sampling you're starting to possibly lose data. Just to put that in perspective, you're looking at seven millimeter slices here.
You know, you start missing every other slice and you start missing possible nodules or cancer regions that are smaller than that slice you just missed. So it's really important that we can see all this data. And that's really what OsiriX is showing us here with 64-bit.
Now one of the things though is, you know, we have all this data, so there are other things we can do with this data. Once we have all this in, we can actually also look at it in three dimensions.
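The point about down sampling can be illustrated with a toy example. The sketch below models a scan as a list of per-slice flags and shows that keeping every other slice can drop a feature that appears in only one slice. It is a simplification with invented names, not how OsiriX stores or resamples data.

```python
def make_stack(num_slices, lesion_slices):
    """Return one flag per slice: True where a lesion is visible."""
    return [i in lesion_slices for i in range(num_slices)]


def downsample(stack, step):
    """Keep every step-th slice, starting at slice 0."""
    return stack[::step]


def lesion_visible(stack):
    """True if at least one remaining slice shows the lesion."""
    return any(stack)


# A lesion confined to a single odd-numbered slice (hypothetical data).
full = make_stack(3000, {1501})
half = downsample(full, 2)

print(len(full), lesion_visible(full))
print(len(half), lesion_visible(half))

# A lesion spanning two adjacent slices survives the same down sampling.
wider = downsample(make_stack(3000, {1501, 1502}), 2)
print(lesion_visible(wider))
```

With the full stack in memory the single-slice lesion is present, while after dropping every other slice it is gone, which is the loss the speaker describes.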
And that's something that OsiriX does really well. We can take all this, these 1500 slices and, and compress them and put them into a, into a three dimensional view. And, and what I wanted to show you, actually is this kind of data in three dimensions. So let's see here. Let's do some demo.
So, you know, we're in here and we're looking at some data. And so it's taken the slices and we've put it in three dimensions. And now we can see the anatomical models. What we're actually looking at is the bottom half of a person.
And, and so what you're basically seeing, so what this really basically shows us is that in addition to just seeing the stuff in two dimensions, we can also you know we can also visually see this, this kind of data in three dimensions in a, in a way that's more intuitive and natural, not just to doctors but to average users.
And that's really the power of having 64 bit and having this dataset available. And, and fully available to, to the rest of the programs. So kind of with that I just wanted to lead into what Simon is going to talk about next which is, is large scale data visualization. So thank you.
( Applause )
So that's a brief discussion about data management and like Marion said, we've now gotta talk about you know what you do when you've got all this data, how do you look at it and how do you start to see what information is hiding in there.
So for data visualization, this data explosion has basically been requiring us to get more sophisticated visualization techniques and so forth. So the trend has been towards realistic visualization in 3D models. The availability of high performance networking basically means that we're able to move this data around far more effectively. And what that means is that we have a much greater demand for being able to have high performance visualization pretty much anywhere, as you have the ability to move both the data and the visualization around the world.
And one of the answers to this is to use display walls, to be able to use this large scale visualization of complex data to see what is hiding inside the data. Basically looking for the information hiding in there. And the difference is that display walls are high resolution. We have a big screen here, but this is not high resolution.
And the high resolution walls are really up to things like 200 megapixels, and they show the big picture. As opposed to just, you know, the wide data. So you can zoom in on the data that you're actually looking at and find the details that are hidden in that picture.
So the other thing we want to talk about is going towards 3D visualization. So this is a picture of, of 2D data and the interesting thing is that you can see a number of different images of 2D data. And the wall provides some insight into what's going on.
When you see it in 3D, and this is the same data by the way that you saw in those 2D slides, you get a huge improvement in insight into what that data is actually hiding. It's much easier to visualize the problems. And these are using regular sort of Apple technology, so OpenGL and the Mac Pro's multi CPU hardware, 64-bit and so forth. And that allows you to do this kind of processing, to produce these images that give you these insights.
Display walls are becoming increasingly useful and used. This one is actually the HIPerWall at UC Irvine. This is a wall of fifty 30 inch Cinema Displays. So it has a 200 megapixel resolution. There are 25 Macs behind this, Mac Pros, rendering these images. So basically they're using commodity hardware, here Apple displays and Apple computers.
And the interesting thing about these display walls is that it's a centralized resource, and so scientists from various disciplines are sort of waiting to use the wall. And they start to collaborate in a more productive way, sharing ideas from their own disciplines with each other to create new ways of solving their problems and finding information about their own data by using techniques used in other disciplines.
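The 200 megapixel figure can be checked with simple arithmetic. The sketch below assumes each 30 inch Cinema Display runs at its native 2560 by 1600 resolution, which is not stated in the talk, and uses the counts of 50 displays and 25 machines given above.

```python
DISPLAY_WIDTH = 2560   # assumed native width of a 30 inch Cinema Display
DISPLAY_HEIGHT = 1600  # assumed native height


def wall_megapixels(displays, width, height):
    """Total pixel count of the wall, in millions of pixels."""
    return displays * width * height / 1e6


displays = 50   # from the talk
machines = 25   # from the talk

total = wall_megapixels(displays, DISPLAY_WIDTH, DISPLAY_HEIGHT)
print(f"Total resolution: {total:.1f} megapixels")
print(f"Displays driven per machine: {displays // machines}")
```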
The one thing about this though is that you know it's a, while the hardware is relatively simple, you know and commoditized, the software is custom software and it's a relatively complex task to be able to partition your job up and send it out to these render engines which then display all this. However, there is a solution.
So Chromium is an open source software package for running these large display walls, and Tungsten Graphics announced yesterday that it has brought an enhanced Chromium to Mac OS X. So this is a way for you to take your OpenGL programs, and Chromium can run many of these without modification, to put your data up onto a display wall.
We also have Quartz Extreme. This is a new feature in Leopard that allows you to take your data and, this is not a programmable thing, this is Quartz Composer, that allows you to display your data on a display wall. This is not quite as large scale as the Tungsten solution. But it also helps you more simply get your data through simple software means to commodity hardware to high resolution displays, to be able to do the analysis in a much better way.
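The partitioning task mentioned earlier, splitting one large image into per-display regions and assigning them to render machines, can be sketched roughly as follows. This is an illustration only and is not Chromium's interface. The 10 by 5 grid, the 2560 by 1600 panel size, the two displays per machine, and the names Tile and partition_wall are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Tile(NamedTuple):
    node: int     # index of the render machine that draws this tile
    x: int        # left edge of the tile in wall pixel coordinates
    y: int        # top edge of the tile in wall pixel coordinates
    width: int
    height: int


def partition_wall(cols: int, rows: int, tile_w: int, tile_h: int,
                   displays_per_node: int) -> List[Tile]:
    """Split a wall of cols x rows displays into tiles, one per display."""
    tiles = []
    for row in range(rows):
        for col in range(cols):
            index = row * cols + col
            tiles.append(Tile(node=index // displays_per_node,
                              x=col * tile_w, y=row * tile_h,
                              width=tile_w, height=tile_h))
    return tiles


# Hypothetical layout: 10 columns by 5 rows, two displays per machine.
tiles = partition_wall(cols=10, rows=5, tile_w=2560, tile_h=1600,
                       displays_per_node=2)
print(len(tiles), "tiles on", len({t.node for t in tiles}), "machines")
print(tiles[-1])
```

Each machine would then render only the region of the scene covered by its own tiles, which is the kind of bookkeeping a package like Chromium is meant to take off the application's hands.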
So the combination of visualization and the intermixing of science disciplines as they have to use these central resources has produced interesting results and new techniques in, in analyzing data. And so what I would like to do now is to introduce Alyssa Goodman from Harvard to talk about her experiences in new techniques with astronomical medicine.
( Applause )
Thank you.
Well I'd really like to thank Apple for inviting me here. And thank Simon for really giving a perfect introduction to this short presentation. We've started a new organization at Harvard called the Initiative in Innovative Computing, whose goal is to essentially address all of the challenges that Simon is talking about today. So dealing essentially with the explosion of data that really is taking place right now.
And with the need to process the data, store the data, share the data, collaborate on the data, visualize the data, etcetera. And so instead of giving you a whole talk about the mission of the IIC, which is very similar to the mission of this session, I'm going to talk just about one project which is near and dear to my heart, which is called astronomical medicine.
And so let me explain how I got involved in a project like the IIC itself and specifically in astronomical medicine. So I'm an astronomer and in that capacity I study how stars like the sun and stars not like the sun form all over our galaxy and in other galaxies.
And in order to do that, what we wind up doing is taking a large number of images at many different wavelengths that also include spectrally resolved images, so essentially three dimensional images that I'll talk about in a few minutes. But we image these very large regions of mostly our galaxy that form hundreds, thousands or in some cases millions of stars at once.
And what we're trying to do is to pick apart the physical properties of these regions, so things like what's the density, what's the velocity of the material, what's the temperature of the material. And what do physical models of how this kind of material evolves over time tell us about the population of stars that could form from this kind of material.
So what is traditionally done in astronomy, or was traditionally done, is that people specialize in particular wavelength regimes because it is difficult to do the observations. And so what's not done as much as it should be, but is done increasingly more now, is to combine the results from all of these different techniques.
So it's sort of, in the medical world, like combining CAT scans, MRIs, CT scans, PET scans, etcetera, all in the same patient to try to get a fuller view of what's going on with that patient. So this patient here is called the Perseus molecular cloud. It's about 700 and something light years away from us. And it's forming thousands of stars at the same time.
And what you see here is an image where in the background you have a historical optical photo, and the colored stuff that you see here is emission from dust that's taken with the Spitzer Space Telescope. The color contours are what I'm going to talk about more in a couple of minutes, and the very small, those of you sitting in the front can see all these little red and blue markings, those are particularly important so called star forming cores.
The point here however is that I was invited to a meeting a few years ago, this is just to review here what the different kinds of data are. And again, just to emphasize that I'm going to be talking specifically more in the talk about these so called 13CO contours.
But I'll get to that in just a second. And so I was telling this story about the need to integrate this kind of data at a meeting that was sponsored jointly by the National Science Foundation and the National Institutes of Health at the National Institutes of Health campus in Bethesda.
And it was organized by the graphics and visualization community in computer science with the premise that the visualization community in computer science had quote unquote failed the scientific community. And that they had gone off and done things that were interesting to them but that weren't particularly useful for science.
And so at that meeting, if I can just go back here for a minute, I was explaining this kind of data and saying wouldn't it be incredibly nice if there were better ways to browse through all this information, and in particular that showing a contour map of the particular kind of so called molecular spectral line emission that we were interested in was a crime, because in fact it was three dimensional data and what we were showing was essentially a smooshing of all of the data together. It's not unlike looking at that person you were looking at before from the bottom and taking the whole bottom half of their body and kind of adding it up, in that it doesn't mean very much.
But that's what the green contours in this image mean. So I told you that the premise of the meeting was Viz has failed the scientific community. That's my friend Hanspeter Pfister, who's now employed at Harvard at the Initiative in Innovative Computing. He also was from Mitsubishi Electric Research Labs.
And then here in the audience today, I don't know where he is, but is Michael Halle, who's there at the top, and then there's me. And so Mike was in the audience when I was giving a presentation about those challenges. And he came up to me afterwards and he said, oh we can help you. No problem. You know, we do this all the time in medical imaging. And so he and I started a collaboration. And we said, you know, alright, well if this is so easy, let's just go get an undergraduate to do a thesis about this.
( Laughter )
No biggie. Okay? And so there's the, hey, the word unsuspecting was removed from my slide. Okay, that used to say unsuspecting undergraduate. Her name is Michelle Borkin and she now works at the IIC as well. And so we said, okay, you know, there's some format incompatibilities and some problems with coordinates, and well, one's astronomy and one's medicine, but this shouldn't be that hard. And she did it.
Okay? And the story is a little bit more complicated than that, but basically the tools that were available and the software which, which does all run on standard commodity Mac hardware in particular, made this remarkably easy. And so let me just show you some of the results and what it is that we learned. But before I do that, let me just explain here just a little bit more. If you saw the demo before of, of OsiriX, you'll know why this is not working.
( Period of silence )
Okay, guys. ( Period of silence ) Alright, well. We'll show you in just a second. Now you got it. Okay, the trick is to go several slides forward and then backward in the presentation should this happen to you. Okay, so anyways. So in medical imaging you're familiar with the kind of technique where you scan forward and backward through slices of someone's head.
And because you have a context of what someone's head and the inside of their brain looks like, this will work for you. Okay, but if I show you this movie here, you're going to say, what is this? Okay, and what this is is the inside of that star-forming molecular cloud that I showed you, and the different slices that I'm scrolling through correspond to different velocities, which roughly correspond to different distances in three dimensions.
Okay, so those three dimensional, sorry, those green contours I showed you are made up of the sum of all of those bits of emission at different velocities. So if I show you instead this same image that you saw before and then I show you the movie on top, which isn't on top anymore. So never mind, use your imagination. And in these contours, okay, what you would see is the filling of those contours by the sum, okay, of all of the emission at all of these different velocities.
And because I wasn't able to show that to you on the image, you have a better picture of how people usually try to analyze this. Okay? And so what they'll do is they'll look at those kinds of movies and they'll try to reconstruct structures in three dimensions. So what did Mike and Michelle and our friends at the IIC do? Well, using OsiriX and also other software called 3D Slicer, which are both based on ITK and VTK, we were able to take this and turn it into a three dimensional movie.
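As a rough illustration of the "sum of all of the emission at all of these different velocities" idea, here is a minimal Python sketch. The tiny cube, its values, and the function name are made up for illustration; this is not the code used in the project, and real data cubes would normally be read from files and handled with array libraries.

```python
def integrated_map(cube):
    """Sum a position-position-velocity cube along its velocity axis.

    cube[v][y][x] is the emission in velocity channel v at pixel (x, y).
    Returns a 2D map whose value at (x, y) is the total over all channels.
    """
    n_y = len(cube[0])
    n_x = len(cube[0][0])
    total = [[0.0] * n_x for _ in range(n_y)]
    for channel in cube:
        for y in range(n_y):
            for x in range(n_x):
                total[y][x] += channel[y][x]
    return total


# Hypothetical cube: 3 velocity channels, 2x2 pixels, invented numbers.
toy_cube = [
    [[0.0, 1.0], [2.0, 0.0]],
    [[1.0, 1.0], [0.0, 0.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_map = integrated_map(toy_cube)
print(toy_map)
```

Contours drawn on a map like `toy_map` are the 2D summary; the individual channels are what the movie scrolls through.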
( Period of silence )
I saw it work before when we went through the slides. Okay, here you go. So, thank you very much. ( Applause ) Okay, so that is very much the equivalent of what you saw in OsiriX before, of taking two dimensional data and making it three dimensional. And things like being able, by the way, just in Keynote, okay, to take this transparent movie and manipulate it like this and try and figure out what 2D structures map to what 3D structures are incredibly useful to us. And things like this big arch of material that you see here, okay, nobody even knew that that was there. People thought that there were two different clouds at two different distances that weren't connected. And this actually changes our interpretation of how stars form in this region.
So just so that I end on a moderate note rather than making you think that we have this all solved, I just want to let you know that the kind of size of dataset that I'm talking about is roughly what Simon and Liang were referring to as a really large medical image at the moment. So you're talking about something like 10 to the 8th voxels' worth of information.
However, we have other projects that we're working on at the IIC right now where the kinds of solutions that we have philosophically would work, but technically they won't work. So for example we have one project that has to do with combining electron microscope images, slices, and then tiling them and building up a 3D volume of what the brain looks like at 5 nanometer resolution. And if you do that for half a millimeter of brain tissue, a half millimeter cube, you get 10 to the 14th voxels, okay? A million times more than the data cube I just showed you.
And I'm sorry but that doesn't fit even in 16 gigabytes of RAM. So we need to do something more clever, there's a challenge for all of you. And we also need to be very clever about what you need to see. So what Simon was alluding to also is that this data explosion means that you can't look at all your data anymore.
And so you have to be very careful about having tools that let you browse selectively and informed by the knowledge of that particular discipline where you're looking at the data. Just for reference, if you took the half millimeter of brain tissue and tried to do the whole brain instead, you'd get 10 to the 22nd voxels.
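To put those voxel counts next to the 16 gigabytes of RAM mentioned above, here is a small back-of-the-envelope sketch in Python. The one-byte-per-voxel figure is an assumption made only for this example (real datasets may use more bytes per sample), and the dataset labels are informal.

```python
BYTES_PER_VOXEL = 1        # assumption: 8-bit samples; real data may use more
RAM_BYTES = 16 * 2**30     # the 16 gigabytes of RAM mentioned in the talk

# Voxel counts as quoted in the talk.
datasets = {
    "astronomy data cube": 10**8,
    "half-millimeter brain cube": 10**14,
    "whole brain": 10**22,
}


def fits_in_ram(voxels, bytes_per_voxel=BYTES_PER_VOXEL, ram=RAM_BYTES):
    """Return True if the raw volume would fit in memory all at once."""
    return voxels * bytes_per_voxel <= ram


for name, voxels in datasets.items():
    size = voxels * BYTES_PER_VOXEL
    print(f"{name}: {size:.1e} bytes, fits in 16 GB: {fits_in_ram(voxels)}")

# "A million times more" than the astronomy cube.
ratio = datasets["half-millimeter brain cube"] // datasets["astronomy data cube"]
print(ratio)
```

Under that assumption only the astronomy cube fits in memory, which is why the larger volumes call for selective browsing rather than loading everything.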
So the last thing I wanted to show you is that we actually have released the code for the 3D Slicer version of astronomical medicine project and you can go get that online. And then to share the results of our work, we've actually investigated and are now using a technology which I'll show you over here where you can actually take the results and put them in a regular PDF file, okay, so this is just Acrobat Reader open right here.
And then what I'm going to do is click in this figure, which shows you a portion of the Perseus molecular cloud complex that I was showing you before. And then I'm going to be able to do something pretty cool, okay. And so, okay, yeah. Okay, so on my machine that shows it to you in real time and doesn't disappear while it moves.
But in any case, you can manipulate this 3D volume and see it from any direction that you want. And you can also turn off and on particular layers that you do or don't want to see. And so that's inside of a paper. So somebody can go to, say, Nature's website and take the paper version that looks like that and say, oh, I really wish I could see that cube from the side.
And obviously you could do the same thing with a medical image or a molecular structure. So I would just like to conclude and thank the large number of collaborators, only some of whom are shown here. And turn things back over to Simon. Thank you.
( Applause )
Thank you.
( Applause ) We need slides back please. So that again goes to show that even presentations are not an exact science. Sorry. So data visualization. The next thing I'd like to talk about is, is the collaboration and, and networking aspects of science. And as we've just seen, you know collaboration is very important to be able to help solve these more complex problems as we get sort of cross pollination of feeding different ideas into, to how to address a hard problem in one domain which is solved easily in another.
And you know, it just goes to show that no one is an expert in everything, and so we really benefit from this collaboration. And the other interesting thing is that, you know, we have technologies which are now second nature for most people even outside the science realm, in the consumer space. You know, email obviously everybody couldn't live without. iChat is pervasive.
And the advent of high speed networks, you know, in our lives allows real time disposal of much more rich information and also allows more powerful tools. So we have now video iChat and we have white boarding in 3D and stuff like that. So a combination of this, the online central data repositories, you know, these very large databases that are in the world now that are sort of concentrating the data into one place for widespread storage and sharing throughout the disciplines, and things like the virtual observatories and GenBank that we saw before, is creating a whole interesting new environment to collaborate with.
And in fact there is some interesting use of tools. So OsiriX for example is using iChat Theater to share screen images between, between collaborating doctors in real time. Now they've done this by using some sort of in the past anyway, by using some undocumented capabilities in Tiger. Which we've now sort of formalized in Leopard to become iChat Theater.
So we can see sort of the collaboration increasing people's productivity as they can share the data that they have with their colleagues in other places. And this really just emphasizes the importance of being able to do complex technical visualization. So what I'd like to do to demonstrate this and to talk about this fascinating technology is to invite Jay Lyerly up onto stage, he's the Senior Macintosh developer from CEI, to talk about EnSight.
( Applause )
Here you go.
Thanks.
Hi, my name is Jay Lyerly and I'm a software developer for Computational Engineering International. We're a small software firm located outside of Raleigh, North Carolina. We make a package called EnSight. This is EnSight here. It's a package for the visualization and analysis of both scientific and engineering data. Here we have a simple model loaded up, this is airflow over some electronics, a heat flow study. And this is the kind of interface you see when you bring up EnSight on your laptop or your workstation.
EnSight also supports clusters. We do computational clusters where the cluster nodes do number crunching kind of activities. And we do graphics clusters where all the different cluster nodes take part in distributed rendering. Here's a snapshot taken from the Supercomputing conference last fall in Apple's booth.
This is one of our graphics engineers explaining that behind this wall there are a half dozen Mac Pros working together to render this image in parallel. And the resulting image is completely interactive. You can spin this around, manipulate the model, do animations just like you would with EnSight on your desktop.
Now in the past computational clusters and display walls like this were really the domain of our larger customers, the national labs, large industrial customers like that. One thing we're trying to do with the Mac platform is to bring these high end technologies to small groups and even individual users.
To that end we use several Apple technologies on the Mac. First off we have QuickTime components for our custom video format. This allows users to create high quality engineering animations in EnSight and then pull those directly into things like iMovie and Final Cut. We use AppleScript and Bonjour to make launching EnSight in a cluster mode easier. This makes setting up a cluster much simpler and much more dynamic. You can imagine walking into a conference room where there's a display wall and dynamically connecting to that display wall from your MacBook. 64-bit computing is also very important.
In Tiger we ship a 64-bit computational core. In Leopard we're extending that to full 64-bit graphics top to bottom. This is really important for our customers. They are well into this 64-bit memory space. And we're really excited to be able to offer that competitive big memory support on the Mac platform.
Another new feature we're adding in Leopard is iChat Theater. And it's going to go a long way to help increase the usage of EnSight as a collaborative tool. I'm going to spend the next few minutes talking about how we've done that integration and what that means for our users and how that's going to work.
So my background is actually in astrophysics. And as we saw a few minutes ago, it's a very visual field. Also in astrophysics there's a lot of collaboration going on. You can imagine a theoretician developing stellar models who wants to share that information with his observer colleague, who's usually at a different facility.
They really need some software to be able to do that interactively in real time, where they can discuss what's going on and look at the model and see how, you know, orientation or temporal evolution of that model affects what's going to be observed. And so this is of course important in astrophysics, it's important in a lot of other fields, these types of collaborations come up all the time. And we've recognized that for a while in developing EnSight.
And even gone so far as to put a collaboration mode into EnSight. And we've had sort of varying degrees of success with that. So the older EnSight collaboration mode works sort of like this. I bring up my model, I have it loaded into EnSight. I would call my collaborator on the phone, tell them to launch EnSight, give them some information about what machine to connect back to, and they would make a connection.
And if all that goes well, everything works and they can see what I can see, as I you know go ahead and do analysis on this model, I manipulate the geometry, they see that in real time. So, so once this is connected and, and all running it works great. But there are several hurdles that are kinda nontrivial to get to this point.
First off, both I and my colleague have to have EnSight. So that's really going to limit the pool of people that I can collaborate with. Also, usually when you want to collaborate in this fashion, the person's, you know, not at your location. They're somewhere else far away. So they're not on your local network, you need to worry about IP addresses and host names, you need to worry about firewalls and port forwarding to make sure those connections can happen. You might have to get your network admin involved.
And so there's a lot of things to get right to get this to work, so it's really kind of a high barrier to entry. So we were looking for a way to make this easier. And to make this simple enough that our users could do it without really worrying about how it works and use it on almost an everyday basis.
So of course last year at WWDC, iChat Theater was announced and this seemed, you know, ready made for what we needed to do. So we started looking into that. iChat Theater of course is geared toward Cocoa applications, and EnSight is actually an X11 application. So we implemented this in a helper application called EnSight Theater. EnSight Theater in itself is just a little Cocoa app whose job it is to pull our imagery straight out of EnSight and make that available for iChat.
So this is how it works from an end user's perspective. Start in the same place, I've got my model loaded up in EnSight and I want to share that over the Internet. This time I'm going to collaborate with my colleague Amanda, who's somewhere out there on the Internet with her Macintosh running iChat.
So I start up my video conference, there's Amanda, and there I am in the little inset preview window. And now I'm ready to start the EnSight Theater. So I click in EnSight, launch the helper app. It comes up. I hit the start theater button. And now my inset preview has gone away and it's replaced by the content from EnSight.
And from Amanda's point of view, she sees a full iChat AV window with the same thing I'm looking at in EnSight. And of course this is all real time, as I manipulate my model, do analysis, do animation, she's going to see that. And we have the voice chat going on in the background. So this is actually pretty easy.
We just, you know, I brought up my model in EnSight, I've started my iChat AV session and in two clicks I'm sharing the imagery straight out of EnSight over the Internet to my colleague, and all she has to have is a Macintosh. So this addresses quite a few of the hurdles we saw before.
Instead of having to have everybody have EnSight to do this, I just need the other person to have iChat. And so if we think back to the astrophysics example where the theoretician wants to collaborate with the observers, observers typically don't have tools like EnSight. These are real high end computational things. So that was a collaboration that wouldn't happen before. So now with this technology, there's a much greater chance that that could happen.
From a developer standpoint, from wanting to sell software, this is really great because every time one of my customers uses EnSight in this way, they're basically doing a mini commercial for me. They're doing a tour of my software. They're showing their colleague how they're using EnSight in a very domain specific way to do analysis and do useful work. And so that's really a plus from a developer's perspective.
From a practical standpoint it also solves all our network problems. No longer do you have to worry about IP addresses and host names, you just have to find your buddy in the buddy list and click on him. You don't have to worry about setting up firewalls and port forwards just for EnSight. This is all piggybacked off iChat. So if iChat works and you can do video conferences, then EnSight Theater is going to work.
So we're pretty happy with EnSight Theater. It's going to ship in the fall with our release of EnSight for Leopard. That's the full 64-bit EnSight. As far as developing, we've been very happy with how easy it is to interface with iChat Theater. I keep joking that it actually took me longer to make these slides than it did to do the integration.
So I think it's really going to be a big boon for our users. It's going to be something they're going to find really useful and they're going to be able to use in their daily workflow to share their work where they really couldn't do it before. And as a user I'm hoping to see iChat Theater in lots of other Mac applications this fall. So in closing, feel free to check out EnSight and our other applications at www dot ensight dot com. Thanks.
( Applause )
So that's collaboration and networking, being able to use standard sort of consumer based tools and Apple technology to be able to do interesting things with sharing information. So the next thing I'd like to talk about is clusters and grid computing, just in case that, you know, little supercomputer that you bought earlier on in the presentation didn't quite do all the computation that you needed.
So we have these really large datasets. You know, sometimes it's just too much for one machine. And you still need to find the information that's hiding in that data. You need to mine it. We have the high speed networks now that are able to connect machines together and this is really what's driving large grids. There's a couple of examples of different types of grids. You have dedicated grids, you know, a single organization building up racks of machines to be able to provide the grid as a service.
But then we also have a number of people who have noticed that they have machines sitting around and not doing anything. And they can harvest these spare cycles. And some of these are over a pretty wide scale, such as SETI@home. But others, states like Kentucky and Maine, have also noticed that they have a huge pool of machines sitting in their schools doing nothing at night. And so they're sort of getting the spare cycles out of their K12 Macs to be able to reuse that as part of scientific research.
And the interesting thing about those machines that are sitting in the classrooms is that they're now all multicore and 64-bit. So they are surprisingly powerful machines sitting there, and they can harvest these networks. Because almost the entire product line of Mac, all the four quadrants of the MacBook, MacBook Pro, iMac and the Mac Pro, are all 64-bit.
So here are some examples of some dedicated clusters. So we have TeraGrid. This is a 250 teraflop cluster, 30 petabytes of data storage. It's the world's largest distributed infrastructure for open science research. We also have the Open Science Grid. This processes around 15000 jobs a day, 30 simultaneous jobs during peak hours, and has over 50 locations in the global grid.
And finally we have the Mac open grid. The interesting thing about this one is that it's 100 percent Macs. So a 500 node machine, I believe, 100 percent Macs. The last one that I'd like to talk about is the Kentucky Dataseam Initiative. And so I'd like to introduce Brian Gupton, who's the CEO of the Kentucky Dataseam Initiative, to come and talk about how they built their grid.
( Applause )
Thank you.
Thanks.
( Applause )
Don't worry, Apple didn't design this. Good afternoon, I'm glad to be here and talk to you about something that is very dear and special to me, this program. And to talk to you a little bit and share about something that's going on that's really special in the state of Kentucky, that's not only driving research, but it's changing the future of many lives. It's saving some lives. And it's also bringing additional opportunity and changing perceptions and expectation sets for the citizens of Kentucky.
So what is Kentucky Dataseam Initiative? State wide grid computing initiative, it was founded in 2003. We're based in Louisville, Kentucky. We have 47 participating school districts that represent about 250000 of the state's approximately 600000 students. The regions you see colored there, that's very important to the grid because these counties in the state of Kentucky all are coal producing counties that is counties that, that produce coal.
And coal and tobacco are two of our major industries in the state of Kentucky. We take our name from, from that reasoning. We look at seams of data to bring our, our economy into a 21st century environment where coal fueled much of our economy for, for many, many years.
So why do we do this? Well, three things. We want to advance commercializable research. That is research that moves from the theoretical to the practical and has an opportunity to create a product, create a job and to expand our tax base in the state of Kentucky. We use that commercializable research as an opportunity to drive educational advancement within the state.
One of the things that we have a tremendous challenge of in the state of Kentucky is getting our young male and female students both to pursue careers in the sciences. And, and many times in the state of Kentucky we are bringing in students from, from outside the state who utilize Kentucky based scholarships and stipends to do work in our research institutions.
And they often go back home or they go and form companies that compete against Kentucky opportunities. And that's a challenge for us in the state because we are wanting to not only create a critical mass of opportunity but a critical mass of understanding why science is important and how science and education can directly relate to jobs and opportunity. So as I alluded to earlier, we're looking to drive and diversify our economic development in the state. For many years, our tax base completely consisted of some degree of manufacturing but also coal and tobacco.
And so what we had to do in the state was bring in some enterprise that was reflective of the 21st century economy. And as a part of that, the Cabinet for Economic Development created a department of commercialization and innovation to drive knowledge based jobs in the state and create knowledge based opportunities for the sons and daughters of coal miners and farmers.
And as a part of that, they created a bucks for brains program to attract researchers to the state. And it's been very successful. And as a part of a study that we did in 2003, we looked at our researchers as a traditional business. And we had to continue to give those researchers infrastructure to allow their research to grow and continue to be successful or those researchers were going to leave.
They'd go to another institution just like you would any traditional business. And even down to, for you history buffs, in looking at economic development. That's a picture of Lyndon Johnson, he sited his war on poverty right in Appalachia in one of the counties that we're working in, which you saw on the map earlier.
So, if you take anything away about the program, it can be summed up as kids, cancer, computers. We had an existing base of computers in the state of Kentucky and as a part of this program to continue to drive research and opportunity, we were able to receive funding to build out this grid. So not only are we looking at grid computing as, as you all well know grid computing is a tremendous way to get more out of what you already purchased.
But we in Kentucky look at grid computing as a way to anticipate purchase of assets to be able to get that win win. Big difference. So as a part of this grid, the Dataseam, we have 5100 computers. It's the largest 100 percent Apple grid on the planet from what we understand. It's comprised of G4s, G5s, Intel based iMacs and Mac Pros, just sitting in the state's environment.
So what are we doing with it? We're working with the University of Louisville, the James Graham Brown Cancer Center, they're part of the bucks for brains researchers team that we have in the state. They also had a proven track record of being able to create cancer drugs. And the challenge we had when we met with them originally was that they had, they had identified eight different targets. But because of their internal computing capacity, they could only work on two at any given time.
And so we know that that's, that's a problem for their funding, being able to, to generate additional funding dollars, pursue funding and grant opportunities as well as move those drugs and those opportunities onto the next level. So the University of Louisville is working on structure based drug design as you well know, our next generation of cancer therapies are those that are cancer specific.
No toxic side effects. The University of Louisville, as I said earlier, had a proven track record in those and actually brought one through trials that was 67 percent effective in stage three patients with no toxic side effects. So as a part of the infrastructure that we've brought to the program, those eight targets have been able to grow to 20 with a more expansive investigation of each of those targets. And we've been able to increase the libraries that they're investigating those targets against from 300000 to over three million libraries.
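To give a feel for how much more work those figures imply, here is a rough Python sketch. It assumes, purely for illustration, that every target is screened against every library entry and that the work is split into fixed-size units for the grid; the unit size and function names are hypothetical and are not taken from the Dataseam software.

```python
import math


def screening_pairs(targets, library_size):
    """Number of target/library-entry combinations to evaluate."""
    return targets * library_size


def work_units(pairs, pairs_per_unit):
    """Number of grid jobs if each job handles a fixed number of pairs."""
    return math.ceil(pairs / pairs_per_unit)


before = screening_pairs(8, 300_000)       # eight targets, original library
after = screening_pairs(20, 3_000_000)     # 20 targets, "over three million"

PAIRS_PER_UNIT = 1_000  # hypothetical job size, chosen only for illustration

print(before, after, after // before, work_units(after, PAIRS_PER_UNIT))
```

Because the talk says "over three million", the second figure is a lower bound under these assumptions.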
So the technology that's involved. Of course, the Mac OS X environment has been key to it. It's been very easy for our researchers to recompile applications code that they already have on their dedicated assets in that environment to run on our grid. And so there's a tremendous cost savings there because they're not having to go out and assume other types of software.
Xgrid, it's built in and we all know about that. The Intel architecture has been key for us. The move to the Intel chip has allowed the machines that we're putting in the environment to process that research much faster and with better results. And then of course with the Apple Remote Desktop application, we're able to remotely manage that grid because, as you can tell, not only is our grid remote, but some of those environments in Appalachia are pretty remote too. And so that application has been very vital to us to be able to pick up and manage the grid at any given point.
So the future. We've talked a little bit about the research, but it's important to take pause here and let you understand the educational aspects of what we're doing. To let you know the types of machines that we're putting in, it was a sea change. It was a big difference in Appalachia. To let you know, some of the computers that we were replacing were IBM ValuePoints running OS/2.
Yeah. And, and they were using these as educational machines. And we know that that's not reflective of an environment that you're going to send a student upon graduation out into the world, we hope not. And, and so the technology that's coming in is very new and very key to helping us raise that next generation of scientists and researchers.
And we're seeing things happen in these school districts where the teachers, the students, they're getting excited about these opportunities. They understand the importance of their role in doing things to change their communities. They understand that it's going to take them creating those opportunities and understanding and expectation sets, that they may not be coal miners. They may not be farmers.
And science is okay for boys and girls. And trust me that's a challenge. And, and so part of our message is, is trying to address both genders and, and their role in pursuing scientific opportunity. So as we move forward with these computers in the school districts, we've got like I said, 5100 machines out there right now.
We're going to be putting another two to 3000 machines on the grid by August of this year. As a part of the program, we bring scientists and researchers from our university as well as teaching professors from our other partnering institution, Morehead State University, out to the districts to talk to the kids about opportunities in science.
We give the researchers a chance to start developing that talent, that one day can start filling positions in their labs and, and start to grow that critical mass of opportunity and understanding in the state. As a part, as a part of the program, as I said, we've got new technology and it's very different than, than what the teachers are used to. We, we've trained 1400 teachers statewide in the better use of technology in the classroom and we'll train another 1000 teachers by the end of the year.
As a part of the program, we also have scholarships and are working with the University of Louisville. They're understanding that once you get these kids excited for science, you have to give them a path to move on out. And so they've put nearly a half a million dollars together for Dataseam scholars, for students that want to pursue careers in math, science and engineering.
And of course, since this is a little bit different, we're looking at other opportunities to expand the program in other states and countries, because we're not the only people that are having this type of challenge. And we're very excited about the types of things that are going on.
I want to conclude with two things to let you know what's going on. I brought this up here and as I said before, Apple didn't make this. Jonathan Ive had no hand in this bucket. To illustrate what's going on in the state of Kentucky, this is a miner's dinner pail.
And this was high technology for a lot of folks in our state. This was the, this was the tried and true. This is what they carried to work every day. In fact, my father carried this one to work every day. This is what I carry to work every day. It's very exciting. And I want to introduce you, the last thing I want to do is introduce you to some folks that are helping make this happen. We've got some folks here in the front row and I'd like for them to stand up. Second row too.
( Applause )
These folks that we just gave a hand to are representatives from our state's K12 environment. And on top of everything else that they have to do in the course of their day, in their normal job, they're setting this grid up for us, they're keeping it going.
And they're helping us change opportunities and prospects and expectations in their school districts. And they've been tremendous colleagues and depending on the challenge of the day they've been pretty good co conspirators and they've been darn good friends. And, and I wanted to thank them and as, if you get a chance during the week, pull them aside because I'm sure they have more than one story to tell you. So thank you very much.
( Applause )
So that's our coverage of clusters and grid computing, which actually concludes this section of the presentation. So basically what we've talked about is that the data explosion is really driving everything in science. And the neat thing is that Apple is basically addressing a lot of these trends through the data side of things. Being able to both store your data on commodity priced storage, being able to process it simply with things like the Mac Pro, through data visualization with the commodity screens and the render farms.
Through collaboration technology there with iChat Theater. And then in clusters and grid computing with large amounts of actually sort of consumer based products being able to take advantage of iMacs in, in clusters, which is pretty amazing. So at this point, I'd like to turn the, the conversation back to, to Liz who will finish up for us.
( Applause ) Great. So before we go, I just wanted to quickly go over a few things that I wanted to highlight at the developer conference this week. And I think because of the time we're going to be a little short for Q and A, so I want to just let everybody know that the speakers will be up in the science connection on the third floor after the presentation. So if you do have questions for us or want to have a discussion, we probably need to get out of the room as soon as I'm done.
We don't want to cut off your questions, but you'll probably have to ask them in the science connection. So it's on the third floor. So the first thing is the scientific development poster session. How many people here have a poster? Right on. And those of you that don't, why not?
( Laughter )
6:30 tonight on the second floor corridor. Please come, there's some fantastic work that's going to be shown in these posters, they're just beautiful, so please come and support the people who did present posters this year. We also have community discussions right across from the science connection area, listed here. Today we talked about collaborative technologies; we have high performance computing and graphics computing tomorrow.
And some community discussion on Thursday. On Friday there's a very specific... I think we have this at the end, so let me wait. The Apple Design Awards are tonight as well. And in the science connection, if you have an app that you want to demo, we have systems up there that you're welcome to load it onto and demo for anyone that's interested in seeing it.
Here's some sessions, I'm just going to skip this, you guys know what you want to see here hopefully. I want to talk for a minute about the display wall at WWDC. If you haven't seen this, this is in the Graphics and Media Lab. Special thanks to Tungsten Graphics for an incredible job in porting Chromium over to Mac OS X in, I think, record time. It's being shown every morning this week. You can go see it, potentially play with it. In the afternoon we're going to be showing the Quartz Composer elements.
The Quartz Composer application they've built for the show is now available in your Xcode developer tools. So that demo app is now in Xcode. So Quartz Composer is in the afternoon. The Tungsten Graphics version of Chromium is in the morning. On Friday we're also going to have a discussion session with the Chromium, or rather Tungsten Graphics, engineers, also during lunch time. So that'll be our fourth lunch time session on Friday. It's not on the calendar, so if you're interested, make a note of it please.