2025-10-30 • iOS, iPadOS, macOS, tvOS, visionOS, watchOS • 2:07:05
Join us online to learn how to elevate your app experience by maximizing performance and resolving inefficiencies. Whether you’re optimizing an existing app or just starting out, you’ll learn how to improve your app’s responsiveness with SwiftUI, monitor performance when using foundation models, and explore ways to reduce your app’s battery usage. Conducted in English.
Speakers: Natalia Suarez, Henry Mason, Steven Peterson, Alejandro Lucena
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
So my name is Cole and I'm a performance evangelist here at Apple and I'm delighted to welcome you all here to the Apple Developer Center. The Developer Center here today provides a great venue for the event here at Apple Park with different spaces for connecting and collaborating together. And this room is Big Sur. It's a state-of-the-art space that's designed to support a range of activities from in-person presentations, studio recordings, and live broadcasts. There are also labs, briefing rooms, and conference rooms that let us host activities like this and many more.
So for those of you joining us here today in person, you'll get to experience more areas of this developer center later this afternoon. This is one of four developer centers around the world where we host designers, developers for things like sessions, labs, and workshops. And we also have many more people joining us online right now. Hello, world. Thanks for tuning in.
So right now, I just want to talk about a few small things for those of you here in the room. If you haven't figured it out already, it's easy to stay connected throughout the day using the Apple Wi-Fi network. And if you need to recharge at any point, there's power available at every seat. It's just located right in front of the armrests.
And for those of you joining online, you have access to an online Q&A. A team of Apple engineers and experts are in the back answering as many questions as they can throughout today's morning presentations. And if we sadly don't get to your question today, please join us on the Apple Developer Forums after the event to continue the discussion.
And for those tuning in today, these presentations are really just meant to be a special experience just for you, both online and here in person. So we'd ask that you please refrain from recording video or live streaming during today's presentations. You are welcome to take photos, however, throughout the event, and we'll also send you follow up information once the event concludes so you won't miss a thing. And with that small request out of the way, today's event is all about building incredible apps that are tuned for speed and efficiency.
When you think of great performance, you might first think of apps that have heavy requirements. Things like rich 3D graphics, games, advanced machine learning computations, complex data analysis. But performance is a quality of every app. People want every app to launch quickly when they tap on its icon, to scroll and animate smoothly when interacting with it, and to work efficiently with battery and storage.
For example, it's truly a delight to use an app that launches quickly. The app's launch time can be almost imperceptible when tapping its icon, making your content and your experiences immediately ready to enjoy. But slow launch times break this illusion. It can make someone feel like they need to do heavy lifting to use your app, creating a sense of distance from your app's content.
And when using an app, smooth scrolling and smooth animations bring your content to life. But when scrolling starts to hitch, the app can feel broken. It doesn't respond to touch the way someone might expect. And that can lead to a sense of distrust and unreliability in the app.
And when it comes to battery life, apps that are efficient let people enjoy everything your app has to offer without compromise. When apps use too much battery for the value they provide, people might notice this right in battery settings, and they might even reconsider using your app altogether.
Great apps tune their performance across a number of these different characteristics, like how much they hang, how much they write to storage, how often they're unexpectedly terminated, and more. These characteristics describe a sense of quality in your product. And when you build great performance into your app, you're imbuing it with a sense of quality that people feel and remember when using your product. So, today, you'll learn a variety of tools and techniques to bring great performance to your app. Starting with tools you can use during development, like Instruments, to profile for performance and learn where the opportunities are in your app. But there are a number of other tools you can use as well, like Xcode Organizer, which provides insights about the performance of your app on customer devices. The App Store Connect API that lets you capture those insights from Organizer for your own tools and release process. And MetricKit, a framework that provides on-device metrics and reports, so you have the raw data to build your own custom performance dashboards. So here's the plan for today.
This morning, here in Big Sur, there will be four presentations. You'll first learn techniques for optimizing for power with the new design and liquid glass. And then you'll learn how to get fast responses when using foundation models. After a quick break, we'll dive deep into some SwiftUI tools and techniques for performance. And just before lunch, you'll hear from a special guest here with us today from Snapchat about how they approach performance in their team.
At around 12:15 PM Pacific time, our online event will end for the day. But for those of you here in person, there are a few more activities after lunch: performance labs and hands-on profiling. This afternoon in person is designed to help you learn even more about where you can make performance improvements and get your questions answered. There are over 50 Apple engineers who specialize in performance and power joining us here today, and they're really excited to learn how they can help.
There are two activities that you can take advantage of this afternoon. The performance labs and the hands-on profiling space. The labs are a great place if you already have some specific questions about performance or power. And the labs will pair you with an engineer based on your question for a discussion.
If you don't have a specific question, or maybe you just wanna try out some of the tools you hear about today, the hands-on profiling time is a great opportunity. You'll be able to profile your app alongside some Apple engineers and get advice and feedback along the way. I'll talk more about these activities just before lunch. So that's the agenda, and there's a lot here. So let's get started. Please help me welcome the first presenter for the day, Natalia, to share more about power optimization with the new design.
Optimize power with the new design
Thank you, Cole. Hi. My name is Natalia Suarez, and I'm a software engineer here at Apple. I work with different feature teams at Apple to optimize for power. And today, I will share how you can optimize your app's adoption of the new design to minimize impact on users' battery life. In this session, I will discuss key considerations for power when adopting liquid glass in your app, techniques to improve the efficiency of scroll edge effects, and some next steps you can try in your app. I'll start with liquid glass. Liquid glass is a new, dynamic material with the optical properties of glass, which brings a sense of fluidity to your app. It affects how the interface looks, feels, and moves, and adapts in response to the content below the glass layer. Adopting this new material is a great opportunity to evaluate your app's rendering performance and reduce its power impact.
To make liquid glass as efficient as possible in your app, I'll discuss how you can, in many cases, combine separate liquid glass effects into containers that can improve rendering efficiency, and I'll cover how you can save power by avoiding scenarios where this material re-renders when it doesn't need to. I'll start with containers.
Views that use liquid glass can be combined into containers to allow the system to render multiple liquid glass components more efficiently. You can do this by wrapping liquid glass elements that are a similar size and near each other in a glass effect container in SwiftUI. There are two things to keep in mind when using the glass effect container: the size of the objects you are grouping and the relative location of those objects.
One of the most important considerations when using containers is the size of each liquid glass element. When a liquid glass element is small, the system can minimize the number of visual effects it needs to render as they aren't visually apparent. When you group a small element with a larger one, however, you lose out on this optimization because the system treats it as one large element. So it adds additional visual effects that are not visible, increasing rendering power.
For example, the system would not want to group the keyboard with the text field, as the keyboard is a large liquid glass element while the text field is small. If the system were to group them together, it would add additional effects to the text field that are not visually apparent, leading to inefficient rendering of that field.
An example where you would want to group elements, however, is an app like Calendar. Calendar has three liquid glass buttons of a similar size next to each other. The system will therefore group these three buttons together, allowing them to render more efficiently to optimize power. Size, however, is not the only thing you need to consider while using containers. You also need to keep in mind that you should not group all glass that is a similar size together. The location of these elements is also a crucial consideration.
The container calculates liquid glass effects across the entire area between your elements, and any change within that area triggers a re-render of all objects in the container. So, if you have liquid glass elements on opposite corners and something animates in the middle, both distant elements have to re-render unnecessarily, leading to an increase in battery consumption, even though those elements have not changed. An example of this would be in Maps.
If I were to group the weather view in the top left of the screen with the buttons on the bottom right, any time I were to move my location indicator in the middle, it would force a re-rendering of all those liquid glass elements, despite them not changing. So what Maps does to minimize unnecessary rendering is group elements optimally, by only grouping the rightmost buttons together and not including the weather element. This way, the use of containers is maximized while minimizing the surface area that would cause the material to re-render.
Now that I've explained a couple of key considerations while using liquid glass containers, let's dive into a simple code example on how to create them. In this example, a glass effect container is used in the body of a SwiftUI view to combine two buttons. To add the liquid glass effect, the button style modifier specifies glass, which gets applied to all of the buttons in the container.
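In code, that pattern might look something like this minimal sketch; the button labels and icons are illustrative, not from the session:

```swift
import SwiftUI

// A sketch of grouping two nearby, similarly sized liquid glass buttons.
struct ToolbarButtons: View {
    var body: some View {
        GlassEffectContainer {
            HStack {
                Button("Today", systemImage: "calendar") { /* ... */ }
                Button("Inbox", systemImage: "tray") { /* ... */ }
            }
            // The glass button style applies the liquid glass effect to every
            // button in the container, so they can be rendered together.
            .buttonStyle(.glass)
        }
    }
}
```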
This results in the container optimizing rendering performance by combining multiple glass effects, leading to fewer overall rendering passes. So it's a great optimization to reduce the power impact of your app. While using glass effect containers is an important consideration to help maintain great battery life, you also need to consider new types of re-rendering issues.
With liquid glass, sheets, buttons, and other controls are translucent, allowing your content to shine through. But in some cases, these effects are not visually apparent and can be optimized to reduce rendering costs. For example, consider my simple pedometer app. It counts my steps and displays them with this prominent text. It also shows a progress view animation, indicating that my steps are actively being captured. And below that, there's a button that I can tap to view my workout details.
When I tap on the "View Workout Details" button, a sheet appears over this view with more information about my distance, duration, and more. When viewing this sheet, the content below is still visible through the glass effect. For example, the step count is faintly visible and it updates about once per second as I step. I know it's hard to see, but I promise you it's there.
It's great that the content below is coming through the material and is updating with each step just like I'd expect. However, consider the progress view. In this case, the animation of the progress view is not visually apparent at all, and yet it is still animating underneath, causing part of this large glass sheet to render on every frame. The added power cost of this re-rendering is not providing any additional value to someone using this app. This is something I can improve and I'll show you how I would identify this problem and measure its impact.
The first thing I want to do is identify the parts of the screen that are re-rendering consistently where I don't expect it. Xcode provides a great visual tool that lets you observe where rendering is occurring. To do this, I'll go to my Mac, open Xcode with my iPhone wirelessly connected, and I'll run my project. I'll wait a few seconds for the app to build and launch on my iPhone.
Then in Xcode, I'll navigate to Debug, View Debugging, Rendering, and Flash Updated Regions. Now, on my iPhone, every time something on screen re-renders, it will flash with a yellow color. So when the step counter updates, it flashes yellow. And the progress view is continuously flashing yellow, showing that it is constantly rendering to display that animation.
Next, in my app, I'll tap the "View Workout Details" button to bring up the sheet. When the sheet comes up, it has a persistent yellow square over it. That's what I suspected. Even though it's not visually apparent, the yellow color indicates that this part of the screen is constantly updated.
I've gone ahead and added an optimization to my app to test this out. This optimization will hide the spinner when the sheet is presented. Now I'll toggle that on and bring up the sheet. When the progress view is hidden, the yellow rectangle only flashes occasionally, around once per second when the step count increases, which is what I would expect. By hiding the spinner, the glass effect now only renders when the visually apparent content changes.
So I know this optimization changes how frequently this sheet is re-rendered. But how do I know if there's a power savings from this optimization? One of the best ways to measure power in your app is using the Power Profiler available in Instruments. To do this, I'll go back into Xcode, click stop, and go to Product, then Profile. The app will build in release mode and bring up Instruments. I'll choose the Power Profiler and click "Record". Instruments will launch the app, and to profile, I'll tap the "View Workout Details" button. Right now, Instruments is recording power information about my app. I haven't turned on my optimization, so right now, the progress view is animating under the glass effect, causing that constant re-rendering. I'll give it about 10 seconds or so, and then I'll click stop.
Now, I'll do the same thing, but without the progress view animation occurring under the glass effect. So I'll click record again. I'll toggle on my app optimization and bring up the sheet. I'll give this another 10 seconds or so, letting the step counter increment and letting Instruments collect power data from the app during this run.
All right, that should be enough. I'll click stop. Instruments profiled a lot of information about the performance of this app, including the time profiler, network activity, and power. And it presents it in a timeline. I'm going to focus on the Power Profiler instrument for this because I just want to measure the power difference for my optimization. In the sidebar, Instruments captured my two runs: first without my optimization, and then with it. I'll select the first run and then select the last few seconds of the timeline. Without this optimization, the summary pane in Instruments tells me that the device used roughly 9% of its battery per hour. Now, I'll click on the second run and select a similar time region.
During the second run that did have the optimization, the device used roughly 6% of its battery per hour. That's roughly a 3% per hour improvement. That's huge. So this is great. The Flash Updated Regions tool in Xcode let me identify the excess re-rendering, and the Power Profiler let me measure the power impact to verify that my optimization worked. These are great tools to use when optimizing your app with the new design. So now that we have discussed some useful tips on reducing power while adopting liquid glass, let's discuss how to improve the efficiency of the scroll edge effect.
Scroll edge effects sample the content underneath them in order to preserve the legibility of the content above scroll views, like navigation titles. One of the ways you can improve the efficiency of scroll edge effect rendering is by considering the background of your scroll views. Scroll views with opaque, static, and solid backgrounds can explicitly set their background visibility. By doing this, the system can use more efficient ways to render the edge effect. This optimization only applies to the background of your scroll view. If you have dynamic content in your scroll view, but the background is opaque, static, and solid, then you can still take advantage of this optimization. Let's go through some examples.
Here I have the weather app. The weather app uses a dynamic and non-solid background. So the system would not be able to utilize the optimization here. But the system would be able to utilize the optimization in an app like Photos. Despite there being a list of photos with a range of colors, the background is opaque, static, and solid. It's just white. Therefore, the system is able to utilize this optimization to render the scroll edge effect more efficiently. Let's go through some sample code to show you how to enable this optimization.
Here, I have some sample code that creates a SwiftUI view that defines a scroll view in its body. Now, to enable this optimization, all I have to do is apply the scroll content background modifier to the scroll view and set it to visible. This will allow the system to render this effect more efficiently, helping reduce power and improve battery life. You can also do this in UIKit by simply setting a background color on the scroll view.
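Here's a minimal sketch of the SwiftUI version; the list content is illustrative, and the key point is the explicit background visibility on a scroll view whose background is opaque, static, and solid:

```swift
import SwiftUI

// A sketch: dynamic content is fine, as long as the scroll view's background
// is an opaque, static, solid color.
struct StepHistoryView: View {
    let days = ["Monday", "Tuesday", "Wednesday"]   // illustrative data

    var body: some View {
        ScrollView {
            LazyVStack(alignment: .leading) {
                ForEach(days, id: \.self) { day in
                    Text(day)
                }
            }
        }
        // Declaring the background visibility explicitly lets the system use a
        // more efficient rendering path for the scroll edge effect.
        .scrollContentBackground(.visible)
        .background(Color.white)
    }
}
```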
I've shown a few different tips and techniques today. So I'll leave you with a quick checklist of the things you can optimize in your app when adopting the new design. First, consider combining liquid glass elements into a glass effect container if they have a similar size and are near each other. Inspect when views re-render using the Flash Updated Regions tool and identify opportunities to reduce re-rendering when it's not visually apparent. You might be surprised about what you find.
When your scroll views have an opaque, solid, and static background color, set visibility explicitly on the scroll view for more efficient scroll edge effect rendering. And lastly, measure the impact of your optimizations using the Power Profiler. Thank you for your time today, and back to Cole. Thanks, Natalia.
Our next presentation this morning is all about performance with Foundation Models. In iOS 26, the Foundation Models framework lets you bring the power of Apple's large language models to your app. And our next presenter, Henry, has some really great performance tips when you get started using this framework. So please help me welcome Henry.
Generate fast responses with Foundation Models
Hi, I'm Henry Mason. I work on Apple Intelligence. Today, I'm going to talk about how to get great performance with Apple's new Foundation Models framework. It's a remarkable moment for artificial intelligence. AI isn't just about generating text or answering questions. It can amplify creativity and simplify complexity. With Apple's new Foundation Models framework, you can use AI to summarize information, translate ideas, craft natural language interfaces, or generate structured data directly in your apps. You can build tools that understand what users mean, not just what they tap. You can let them describe what they want, then watch your app bring it to life.
To start, I'll talk about the new Apple Foundation Models framework. Then I'll cover a bit about how foundation language models work. This will allow us to have a better understanding of the fundamental performance characteristics of foundation models. Then I'll go over an example app that uses the Foundation Models API to locate some performance problems.
Finally, I'll explore some techniques for addressing some common problems and making your app's Foundation Models performance great. To start off, I'll review what makes up the new Foundation Models framework. The Foundation Models framework is new in iOS, iPadOS, macOS, and visionOS 26. It allows you to use the same on-device generative AI technology that we use in Apple Intelligence right in your applications.
The new Apple Foundation Models framework is deeply integrated into the Apple platform. It is private by design. Every request stays entirely on the user's device. Nothing is sent to a server. Users' data stays theirs. The Foundation Models framework is also available offline. Whether users are on a plane or just in an internet dead spot, it's still right there, ready to help. No internet, no problem.
The Foundation Models framework is free to use for both users and developers. There are no subscriptions, no usage quotas, no token budgets. It's part of the platform, and it's included at no additional cost. And finally, it's built in. People running iOS, iPadOS, macOS, and visionOS already have access to Foundation Models. For developers, this means no new API keys to manage or new SDKs to bundle. For users, there's no extra installation, no setup, no friction. Foundation Models are ready to use as soon as your app is ready to launch.
So the Foundation Models framework is private, always available, free, and built in. It's just a few lines of code. You don't need a cloud account, you don't need to think about tokens. You just import FoundationModels, create a LanguageModelSession, and start generating. I'll go over how easy that is in code.
I start by importing FoundationModels. Then I create a LanguageModelSession using the default configuration. That session represents your usage of the on-device model. Next, I define a simple prompt to tell the session what to do. Calling respond returns the model's answer as a string. The power of a modern language model, right there in your app.
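Based on that description, here's a minimal sketch of the flow; the prompt text and function name are illustrative, not from the session:

```swift
import FoundationModels

// A minimal sketch, assuming the default session configuration.
func suggestCaption() async throws -> String {
    // The session represents your usage of the on-device model.
    let session = LanguageModelSession()

    // A simple prompt telling the session what to do.
    let prompt = "Write a one-sentence caption for a photo of a dog at the beach."

    // respond(to:) returns a response whose content is the model's answer as a string.
    let response = try await session.respond(to: prompt)
    return response.content
}
```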
One of the challenges of LLMs is the unpredictability of their output. When you generate a raw string, the output may be in an unexpected format. If you ask for JSON, there's no guarantee that the keys will be what you expect, or even that the output will be valid JSON at all.
The Foundation Models Generable API addresses this. You write normal Swift code, and the framework takes care of parsing the response for you. In this example, the SearchSuggestions type represents a list of search terms. By annotating the type with Generable, the response from the session is guaranteed to be in a valid format. And it's really exciting to see what developers have been able to do with Foundation Models. Here's Stoic. It gives users hyper-personalized journaling prompts based on recent user entries.
Here's VLLO. It's a video editing app that uses the foundation models framework with the vision framework. It uses them together to autonomously suggest matching music tracks and stickers for different video scenes. And here's Smart Gym. It's a fitness app that uses Apple's Foundation Models framework to let users describe workouts in a natural language, then convert them into structured routines. Its new Smart Trainer feature learns from past workouts and gives personalized recommendations, such as adjusting reps or weights or proposing new routines. It is so exciting to see what else will be possible with the Foundation Models framework. Next, I'll talk about how large language models, or LLMs, actually generate answers, and why the way you use them changes how fast and efficient they feel.
An LLM doesn't store pre-written answers. It generates text one piece at a time. The input text is broken into little fragments called tokens. Each token is typically a word or a piece of a word. The model looks at the whole conversation so far, then predicts the most likely next token, again and again, until the answer is complete. In this example, the question "What are 10 facts about Swift?" is broken into eight tokens.
Under the hood, each token flows through a statistical model that decides what is the most likely next token. It weighs all the past tokens before picking the next one. Every step, the model is scanning back through the entire conversation so far. In this example, the model first reads the entire input, "What are 10 facts about Swift?" to predict the most likely word that follows the question mark token.
The model finds that the word "Here" is the most likely next token. On the second step, the process repeats, but this time the model looks at the entire sequence so far, "What are 10 facts about Swift? Here", and predicts the most likely token after the word "Here". In this example, the prediction is the word "are". This process repeats over and over for each output word. One important thing to note here is that the model can read faster than it can write. It can read thousands of words in big batches, but if you ask it to generate a thousand words, it has to build each word step by step.
So for example, summarizing a long article will be fast because it's mostly reading. But writing an article of the same length will take much longer. The time from the model seeing the input to completing the output is called latency. One other thing to note: different kinds of text will break into tokens differently. A simple English word like banana will just be one token. But some text, like dates or phone numbers, will turn into many tokens. So a string of symbols will take more time to process than plain words of the same length.
Now I'll talk a bit about the context window. The context window is the amount of information that the model can see and work with at one time. It's like the model's short-term memory. The window conceptually includes the input and output of a generation session. In the Foundation Models framework, this includes the user prompt, the tool definitions, the instructions, and any generated content. Anything that increases the size of any of these components will increase latency and risks exhausting the finite context window.
When you give the model a short prompt, it doesn't have much to keep track of. But if you prompt with pages of text, the model will have to reread all that text for each output token. So the longer prompts will mean more work every step of the way, which will lead to higher latency for your users. Every token counts. Engineering the prompt is a balancing act. On the one hand, you need to be precise enough to remove ambiguity because vague instructions will cause low quality results.
On the other hand, too much detail bloats the prompt and crowds out space for the actual task. The art lies in finding the sweet spot, just enough specificity to guide the model reliably, expressed in as few words as possible. One more thing to keep in mind: response length directly affects speed. Every extra output token takes extra time. If you ask for long paragraphs, you'll notice the model will take longer to finish. You want to guide the language model towards short focused outputs so you can get responses faster.
Up until now, I focused on performance in terms of how long it takes for a response to finish entirely. The general aim is to reduce latency, the time the user spends waiting for the system. Latency matters because it's what the users feel. Even a split-second delay affects how interactive an interface feels. That's why it's so important to care about minimizing latency. But it's equally important to consider how responsive the generation operation is to users. For interactive applications, it's not just the latency of finishing the entire operation that's important. It's also important to consider how long until the user can tell that the work has started. The faster the results become visible, the more alive your app will feel.
When the model starts showing results as soon as they're ready, the user feels the answer arriving quickly. Even if the total generation time is the same, perceived latency will drop dramatically. Instead of staring at a blank screen or a spinner, the user sees progress right away, and that creates trust that the model is working for them. This keeps the experience snappy and engaging, even when the output gets longer.
One final trick to keep in mind is reuse. When the start of the conversation transcript or prefix stays the same, the model doesn't have to start from scratch every time. The model can leverage work it's already done instead of recalculating the very beginning. That means less computation, faster responses, and more efficient use of limited context. So in scenarios where you're invoking the model repeatedly with the same instructions, it really pays to make sure the prefix stays consistent. You get lower latency and a smoother experience overall. Let's talk about performance patterns specific to the Foundation Models framework.
The LanguageModelSession class represents the context used to generate output within your application. The session is more than just a reference to the model. It's where the system maintains context, tracks state, and optimizes its work across multiple calls. If you make a new session and throw it away every time, you're forcing the framework to re-initialize, throw away all that accumulated context, and pay the setup cost all over again.
The instructions field given to the LanguageModelSession initializer should be direct and concise. It should only contain the information needed to respond to a single kind of user request. Irrelevant or repeated instructions will take extra processing time, delaying your users from seeing the results they really care about. So it's important to avoid vague or subjective prompts.
When you annotate a type with the Generable macro, the framework automatically generates a schema that defines how to map the generated content to that type. It allows you to work at the level of your app's logic rather than worrying about tokens and text. Generable doesn't merely reduce boilerplate, it literally guarantees the structure of the output will be valid. You can't be sure that the content of the fields will be correct, but there'll be no possibility of the generated type containing illegal strings or invalid numeric values.
The Guide macro works with Generable to provide further guidance to the language model. It lets you communicate both programmatic rules, like regular expressions, and natural language guidance in the description argument. Keep in mind that the description field also contributes to the use of the language model context window. So just like with the instructions field, be concise here. In fact, if the name of the field is self-explanatory, you might be able to skip the use of the Guide macro entirely.
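As a rough sketch of how Generable and Guide fit together, here's a hypothetical quiz type; the field names and guides are illustrative, not the session's code:

```swift
import FoundationModels

// A minimal sketch of a Generable type with a couple of guides.
@Generable
struct QuizQuestion {
    // Natural-language guidance costs context-window tokens, so keep it short.
    @Guide(description: "A single multiple-choice question about the topic")
    var question: String

    // A programmatic rule: ask for exactly four answer choices.
    @Guide(.count(4))
    var choices: [String]

    // A self-explanatory name may not need a guide at all.
    var correctChoiceIndex: Int
}

// Requesting structured output: the framework parses the response into the type.
func generateQuestion(about topic: String) async throws -> QuizQuestion {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Create a quiz question about \(topic).",
        generating: QuizQuestion.self
    )
    return response.content
}
```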
Under the hood, Generable asks the language model to represent the response data using JSON. That's great, because JSON is flexible and maps naturally to most existing Swift structs and enums. It's already widely used by lots of existing Swift code, for example when communicating with web services. One sticking point, though: JSON itself can be pretty verbose.
Every brace, every quote, every comma takes up tokens. And because field names have to be repeated over and over, the cost can add up quickly. If you're asking for a list of objects with many fields each, the field names might outweigh the actual values. This makes the responses slower, and it leaves you with less room in the context window for the parts of the output that your users actually care about.
So, how can you find and diagnose these problems? Here's a sample app that creates a multiple choice quiz. The user types a topic and then clicks "Generate". You'll notice that while the quiz is being generated, all the user sees is a spinner. There's little indication from the user interface about how much work is being done. It also takes a really long time from pressing the "Generate" button to the user seeing anything. 12 seconds is a long time to wait. Let's go over what might be causing that.
Like most performance questions on Apple platforms, the first place you should go for answers is Instruments. Instruments is like an x-ray for your app. It shows you exactly where your code is spending time. Instead of guessing what's slowing things down, Instruments gives clear, real-time visualizations tied directly to the source of what's going on under the hood. And now in Xcode 26, Instruments has special support for Foundation Models requests.
When you're ready to dive deeper into performance, Instruments is just a click away. From Xcode, go up to the Product menu and choose Profile. Your app will launch directly into Instruments. You can also use the keyboard shortcut to jump in. By default, it's Command-I. Xcode will build your target and launch Instruments, configured to profile your application. A great starting point is the CPU Profiler template, which performs fast, low-overhead measurement of CPU time and thermal state.
If you're using other machine learning tools, also consider using the CoreML template, which shows which hardware components of the device are being used. For example, it can break down the impact of the CPU, GPU, and Neural Engine independently. This can be really helpful when you're combining the Foundation Models Framework with other machine learning models.
And of course, be sure to add the Foundation Models instrument to your trace. Click the plus button in the upper left side of the trace template screen, scroll down to Foundation Models, or just search for it by name. The Foundation Models instrument will be added to the timeline.
It can also be helpful to save the set of instruments as a new template. Go to the File menu and click "Save as Template." Give your template some name you'll remember. Here I'll go with "Foundation Models." Then click "Save." Finally, click the red "Record" button in the upper left corner of the window to start recording your trace. You can also go to "Record Trace" under the "File" menu, or use the keyboard shortcut, which defaults to "Command R."
This will launch your application with Instruments recording a trace in the background. Perform your Foundation Models request, then return to Instruments to stop recording the trace. Now I'll take you on a little tour of what's in the Instruments trace. The measurement of each instrument is arranged vertically, grouped by which instrument performed the measurement. In this example, there are five tracks: CPU Profiler, Points of Interest, Thermal State, Hangs, and Foundation Models. And at the bottom, there's the app life cycle. Horizontally, there's a timeline of measured events.
When reviewing performance, it is good to first check that your development device is not under thermal pressure or busy with other work. You'll want to make sure the device is in a nominal thermal state before running any further tests. In this trace, the green thermal state is good news, and the CPU Profiler is not showing significant CPU usage in the application.
Once you're focused specifically on Foundation Models work, it makes sense to dive into what is being measured. Latency is shown by the width of each component on the timeline. The blue response timeline gives a start and end point to each overall request. The yellow asset loading timeline shows the time when the system needed to load model data from storage before the request could be fulfilled. Finally, the blue and turquoise inference timeline shows time spent preparing the generation schema, processing the input prompt, and finally computing the output tokens. Note that in this trace, the yellow loading and turquoise preparing vocabulary sections overlap with the blue response section on the timeline. This indicates that these operations are preventing the results from getting to the user.
Instruments also shows how many input and output tokens were used for this request. In general, more output tokens means more processing time. So, now that I've gone over some common performance issues and how to find them, how can they be solved? Let's dive into some ways to make Foundation Models performance great.
In the Foundation Models framework, you can call LanguageModelSession's respond API to get a full reply once the response is finished generating. You make a request, you get a response object all at once. With LanguageModelSession's streamResponse API, you don't have to wait for the whole answer. The model produces partial responses as soon as they're ready, and your app can display them immediately. This means users see the responses unfold piece by piece almost instantly. It doesn't make the model think faster, but it feels faster. Perceived latency will drop dramatically and your app will feel more alive.
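A minimal sketch of streaming, reusing the hypothetical QuizQuestion type from the earlier example; the exact shape of the partial values depends on the generated type:

```swift
import FoundationModels

// A sketch: display partial output as it arrives instead of waiting for the full reply.
func streamQuestion(about topic: String) async throws {
    let session = LanguageModelSession()

    let stream = session.streamResponse(
        to: "Create a quiz question about \(topic).",
        generating: QuizQuestion.self
    )

    for try await partial in stream {
        // Each element is a partially generated QuizQuestion; some fields may
        // still be missing. Update the UI here as each piece arrives.
        print(partial)
    }
}
```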
It's also important to only use one LanguageModelSession for each set of instructions and retain a reference to it for the duration of your app's user interface. Use concise instructions that are agnostic to each specific request and put only the request-specific information into the prompt field. Another potential opportunity that can speed up the request is to disable includeSchemaInPrompt in your respond call. includeSchemaInPrompt defaults to true, which tells Foundation Models to include information about Generable types before processing the user's request. This improves output quality, but requires the model to consume more input tokens. In this example, the quiz host class can remember that the session has already responded to one prompt. Repeated work can then be avoided in subsequent responses by setting includeSchemaInPrompt to false.
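Here's a rough sketch of that idea, using a hypothetical QuizHost class and the includeSchemaInPrompt parameter described above:

```swift
import FoundationModels

// A sketch: skip re-sending the schema once the session has already seen it.
final class QuizHost {
    private let session = LanguageModelSession()
    private var hasRespondedOnce = false

    func nextQuestion(about topic: String) async throws -> QuizQuestion {
        let response = try await session.respond(
            to: "Create a quiz question about \(topic).",
            generating: QuizQuestion.self,
            // After the first response, the schema is already in the session's
            // transcript, so skipping it saves input tokens on later requests.
            includeSchemaInPrompt: !hasRespondedOnce
        )
        hasRespondedOnce = true
        return response.content
    }
}
```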
Let's go back to the recorded profile in Instruments for a moment. Two major sources of latency here are the asset loading segment in yellow and the prepare vocabulary segment in turquoise. These two operations need to be performed before the session can respond to the user's request. But luckily, these usually can be performed ahead of time, before the user's action kick-starts the output generation. The work can be hidden behind periods of time when the application is waiting for the user. Preparing the language model session for inference is called pre-warming. Pre-warming a language model session is like preheating an oven while preparing ingredients. Pre-warming the session gets the model loaded from storage into memory and prepares the model for output generation.
In the Foundation Models framework, this ahead-of-time preparation is exposed through the prewarm API. Simply add a call to the prewarm API of your language model session to allow the system to get an early start on this work. The SwiftUI task view modifier, which performs work asynchronously as the view appears, may be a good place to do this in your application.
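A minimal sketch of that placement, with an illustrative view name; the model loads before the user ever taps Generate:

```swift
import SwiftUI
import FoundationModels

// A sketch: prewarm the session as the view appears.
struct QuizSetupView: View {
    @State private var session = LanguageModelSession()

    var body: some View {
        ContentUnavailableView("Enter a topic to begin", systemImage: "questionmark.circle")
            .task {
                // Loads model assets and prepares the session ahead of the first request.
                session.prewarm()
            }
    }
}
```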
Here's the effect of adding the pre-warm call to the app's view. When I re-record the trace under the same conditions as before, the yellow asset loading segment no longer overlaps with the response segment at the top. This means that loading work is no longer contributing to user perceived latency, and users are no longer waiting for this work to finish before they can start seeing the response from the language model session.
As I mentioned previously, Generable is very powerful, but you need to be mindful of the amount of output it can cause. Because the generated response objects need to become JSON objects in text form, the names of each property in the Generable types will take up space in the output. If you're generating multiple values of the same type, each field name can end up duplicated many times in the response text. This is exacerbated by JSON's text format requiring several tokens for symbols like quotes and braces.
You can reduce the overhead of the JSON representation in a few ways. Shorter field names will typically result in shorter raw text content. Field names that have different starting tokens will also be easier for the model to decode, because it will spend less time navigating ambiguities. If practical, consider flattening your type hierarchy rather than having several nested types. This reduces the number of output tokens, which is automatically decoded into your Generable type.
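As a hypothetical illustration of the renaming advice, compare two shapes of the same data; the shorter, distinct field names mean fewer JSON tokens in every generated value:

```swift
import FoundationModels

// Verbose field names: each one is repeated in the JSON for every generated question.
@Generable
struct VerboseQuizQuestion {
    var questionText: String
    var possibleAnswerChoices: [String]
    var indexOfCorrectAnswerChoice: Int
}

// Shorter, distinct names carry the same information in fewer output tokens.
@Generable
struct TrimmedQuizQuestion {
    var text: String
    var choices: [String]
    var answer: Int
}
```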
With all these changes, the app is now much faster. Now the user gets the generated questions almost immediately after pressing the generate button, and the whole quiz is ready in less than seven seconds. I've covered a lot here. I've discussed Apple's Foundation Models API and walked through some techniques for finding performance bottlenecks there. Always start by measuring, and Instruments is the right tool for that job. Keep your prompts and guides to the point.
Make sure the Foundation Models framework isn't doing the same preparation work twice, and try to get that work done when it won't block the user. Use the streaming API to keep your UI responsive. And finally, edit your Generable types to make them faster. For more information about the Foundation Models framework, check out these videos from WWDC 2025. I hope this helps you in your journey to deliver great generative AI experiences in your apps. Thank you. Back to Cole.
Thanks, Henry. All right. Before we continue on, I think it's time for a short break. How's that sound? So we're going to pause here for about 15 minutes. You can use the washroom, stretch your legs, get another sip of coffee. We'll be back here shortly. Thanks, and see you soon.
Our program will continue in five minutes. All right. Welcome back, everyone. So today, we have a lot more to cover, so we're just gonna get into it. You may recognize our next presenter from a WWDC session on SwiftUI performance. And he's back here today to go even deeper and share some more insights on how to get the best performance from SwiftUI. So let's get started. Please help me welcome Steven. Thanks, Cole. Hi, my name is Steven, and I'm an engineer on the SwiftUI team.
Dive deep into SwiftUI performance
Today, I'm going to share some techniques that you can use right away to help you optimize your SwiftUI app's performance. This is an advanced talk that builds on another talk from WWDC25, "Optimize SwiftUI performance with Instruments." If you're new to SwiftUI performance optimization, I'd highly recommend starting with that talk first to learn how to get started profiling your apps.
There are also other great WWDC sessions about SwiftUI performance, and I'd recommend checking those out too. At a high level, the best advice I can give you for getting great SwiftUI performance is this. Ensure your views update quickly and only when needed. Excessive or slow view updates can cause your apps to be less responsive due to hitches and hangs. Both of these considerations are important, but today I'm going to take you on a deep dive that focuses on one in particular: the importance of updating your view bodies only when needed.
You'll typically find that individual view body updates in SwiftUI are very fast, but as your app becomes more complex, the cumulative performance cost of your view updates and their downstream costs can become a far bigger concern. That's why it's important to profile your app to assess whether you're updating your views efficiently. And that starts with Instruments, the powerful performance profiling tool included with Xcode. Instruments 26 includes the next-generation SwiftUI instrument, which records important data about the SwiftUI work happening in your apps. Data you can analyze to create targeted fixes for bottlenecks that may be slowing your app down.
The best time to profile your app and validate performance is while you're building it. But let's be realistic. Things don't always work like that. You've got deadlines. You're in a hurry to ship your app. Or maybe you're just working on a project with tons of code, some of which you didn't write, and you're just trying to figure out how to make your app perform better.
Perhaps you've even already tried profiling your app in Instruments. But when you recorded a trace, you ended up with something like this. A timeline with tons of information, and you're not sure where to start. I know firsthand from my own work that this can be pretty overwhelming. So today I'm going to share three high-impact opportunities to focus on that can help you improve your app's performance right away. This advice is based on real-world experience improving SwiftUI app performance both inside and outside of Apple.
First, I'll talk about the role your app's data flow plays and how often your view bodies update and how to optimize it. Then, I'll show you how to identify and eliminate extra updates that can be caused by storing closures on your views. Finally, I'll talk about how to identify hot paths in your code and share some ways to cut down on excessive updates in these performance-sensitive areas.
For this talk, I wanted to come up with something new to show you, an app with some realistic performance issues similar to ones you might encounter in your own apps. But what kind of app? If you're anything like me, you find it easier to work on something you're passionate about. But what am I passionate about that's worthy of an app? Well, a few months ago, I adopted a dog. Her name is Pretzel. Since then, my Photos app has been completely dominated by pictures of her.
And pretty much nothing else. So yeah, Pretzel is definitely a passion of mine. So I got to thinking, how can I show you how to improve app performance and show off some more of these adorable dog photos? With a game, of course. Let me take you on a quick tour. My game is a take on the classic card game, Memory. To play, you tap on a card to turn it over.
Then you tap to flip another card. If they match, you get a point. If you make a guess and the cards don't match, they flip back over and you have to try again. The objective is to try to remember where you saw different cards in order to find all of the matches. And that's the game. Now I'll talk about how the app is structured.
My app's main view is called game view. The game state is stored in a few state variables. The list of cards to display, the pick one and pick two card index variables, which keep track of the selected cards, and another variable called matched cards that contains the set of matches that have been found successfully.
The cards are laid out inside a ForEach, wrapped by a LazyVGrid in a ScrollView, each represented by a CardView. And that's the basic setup. I'm excited to show you what I've built so far and take you along as I profile and optimize my game. So let's check out the code in Xcode.
Here's my main view I mentioned, GameView. And here's my state variable for the full set of cards. There are 16 different photo cards, two of each. The cards are currently in a fixed order to make testing easier, but in the final game, they'll be shuffled. And here are the three state variables I mentioned. The pick one card index, pick two card index, and the set of matched cards.
I've also added this cheat mode toggle that shows the card letters on the cards so I don't have to memorize the order while testing. And there's also a neat parallax effect for the background behind the cards when scrolling. I'll talk more about that later. And here's where I'm initializing the card view. I'm passing it all the properties needed to display the card, as well as bindings to my three game state variables so I can update them when choosing cards.
Lastly, I have view builders for the front and back of the card. The front of the card shows the image displayed when I choose a card. And the back view builder contains a view called CardBackView that shows the pretzel icon on the back of the card, with an overlay showing the card letter when cheat mode is enabled. Now I'll command-click on CardView to show you that view.
The card view is a button, and the button's action is the logic for picking cards. If pick one card index is nil when I tap a card, that means I'm at the start of a turn, and this is the first card I'm picking. So I update the pick one card index to the index of this card.
Otherwise, this is pick number two, and I set the pick two card index binding to this card's index. Then I need to check for a match. If the card for the first pick matches the current card, I add it to the set of matched cards, and then set both picks to nil to get ready for the next turn. If this isn't a match, I'll also set the picks back to nil, but I do that in an animation with a two-second delay. This keeps the cards face up for a moment so I can try to remember them.
The button label is the card itself. I start with the back view builder and add an overlay. The front of the card, the photo, is shown if the card's index is one of the two picks, or if this card has already been found and is in the matched cards set.
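Pieced together from that description, here's a rough sketch of what the card button might look like; the names and placeholder views are guesses, not the session's actual code:

```swift
import SwiftUI

// A reconstruction sketch of the described CardView; match handling is elided.
struct CardView: View {
    let cardID: String                     // identifies the photo on this card
    let index: Int                         // this card's position in the grid
    @Binding var pickOneCardIndex: Int?
    @Binding var pickTwoCardIndex: Int?
    @Binding var matchedCards: Set<String>

    // The front is shown if this card is one of the two picks, or already matched.
    private var showsFront: Bool {
        index == pickOneCardIndex || index == pickTwoCardIndex || matchedCards.contains(cardID)
    }

    var body: some View {
        Button {
            if pickOneCardIndex == nil {
                pickOneCardIndex = index   // first pick of the turn
            } else {
                pickTwoCardIndex = index   // second pick: the real app checks for a match here,
                                           // then clears both picks (after a delay if it's a miss)
            }
        } label: {
            Image(systemName: "pawprint")  // stand-in for the card back
                .overlay {
                    if showsFront {
                        Image(cardID)      // stand-in for the card's photo
                    }
                }
        }
        .disabled(showsFront)
    }
}
```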
The same logic goes for disabling the button. If the card is flipped over, it can't be tapped again. And that's pretty much how the game works. There's more to talk about, but I'm itching to find out how the app is performing. So let's go ahead and profile for the first time. I'll choose Product, then Profile. And this builds the app in release mode and launches Instruments. I'll choose the SwiftUI template. And I'll also maximize the Instruments window.
Today I'll be focusing on the SwiftUI instrument, which shows all the updates performed by SwiftUI during the trace, and the Hitches instrument, which identifies times when the app is unresponsive during animations, like scrolling. Using your app while recording an Instruments trace gathers a lot of data. Before each recording, I like to have a plan. For this first recording, I'll test making a couple of guesses. I'll click "Record," which launches the app.
And I'll start by immediately making a match. I'll tap G and then the other G. And then I'll make an incorrect guess. I'll tap O and then B. And then I'll click to stop recording. Instruments finishes recording, terminates the app, and prepares the profiling data for display. For this trace, I'll be investigating how many updates happen when I tap the cards.
The top-level SwiftUI track has four lanes. Update groups shows a representation in gray of all of the work performed by SwiftUI. The other three lanes show long updates: view bodies, representable views, and other types of updates that take longer than half a millisecond to run. And expanding the track shows all updates, not just the long ones. The detail pane below also shows update counts and the total duration for all of the updates. Now I'll expand the process for my app.
Beneath my process, there's a list of the modules responsible for all of the updates. The first thing I notice is that the majority of the time is spent in updates coming from the SwiftUI module. But the best place to start is with the updates in my own views, because ultimately all the SwiftUI work that appears in the trace is triggered by updates to my app's UI.
Back up in the timeline, the areas of activity indicate the parts of the trace where SwiftUI work was happening. I want to focus on the area where I first began interacting with the app. So I'll highlight the portion where I tapped the first card. The first thing I notice here is that CardView's body updated 14 times for this first tap. I only tapped one card here, but there are 14 updates. So how can I figure out why? If I hover on the view, I can click the arrow, and a context menu comes up, and I'll click Show Cause and Effect Graph.
The cause and effect graph shows how updates flow through your app. The graph contains updates from my app's views, shown in blue, and SwiftUI updates in gray. Leading to the CardView body is a node for the pick one card index state variable. The line between the two nodes shows a count of 14. This tells me that this state variable updating caused 14 CardView body updates. Where does this number 14 come from? Well, when the app initially launches, 14 cards are visible on the screen. Let's go back to the detail view.
And I'll highlight the second tap. 14 updates again. What about the wrong guess I made? I'll highlight the area at the end of the trace, and this time there are 42 CardView updates for my incorrect guess. To figure out why there are so many CardView updates, it's important to understand how data is flowing through the app. This brings me to the first topic I'd like to discuss: how to optimize your data flow to reduce unnecessary updates. Let's begin by revisiting how my game tracks its state. As I mentioned, the game state is tracked by three state variables: one for each picked card, and the set of successful matches. Each card has corresponding bindings for the game state. These bindings serve two purposes. First, writing to the bindings updates the game state when a card is tapped and the button action runs.
Second, the button's label and disabled state are determined by reading the values of the bindings. Now I'll do a little simulation to count the updates that happen when I'm interacting with the game. When I tap the first card, labeled "G", the card updates its binding for pick one card index to zero, which is the index of the card I tapped. This updates the corresponding state variable in the game view to zero as well.
Now remember how each card shows the front or back based on its bindings? Since each card has a binding to pick one card index, each card updates because the pick one card index state variable on GameView changed. And this results in 14 total card updates. Now if I pick a second card that isn't a match, card O for example, the card updates its binding for the pick two card index to five, which is the index of the second card I tapped. This updates the pick two card index state variable in GameView to five as well. Since each card has a binding to pick two card index, each card view updates again.
And that's a lot of updates. But we're not done yet. Since this wasn't a match, after two seconds, the second card I tapped sets its pick one and pick two bindings back to nil, which updates the game view state variables to nil. And all the cards update again.
This brings the grand total to 42 card view updates for one incorrect guess, after only two card taps, even though the majority of the cards never changed. I only expect four updates, one for each tap, and two more for the cards flipping back over. I must be able to do better than 42.
Each card has a binding to each of the three game state variables. So every card has to update any time one of these variables changes. I'd like to structure my app's data flow in a way that allows updates to happen only for card views that changed. A great way to do this is by moving the game state variables into a class that adopts the Observable macro and passing that class to each card. Here's how Observable works.
The Observable macro lets you add update tracking to properties of a class. When you access a property from an Observable class in a view body, SwiftUI tracks that access and will only trigger an update to that view when the property is set. Starting with iOS 26, setting a property with an equatable type to the same value on an Observable class won't trigger an update. On earlier releases, checking equality manually before setting properties is necessary to avoid triggering extra updates.
When you write a property on an Observable class from one of your views, it doesn't trigger an update on that view unless you're also reading from that property. To learn more about the Observable macro, check out the WWDC video, "Discover Observation in SwiftUI." Observable seems like a promising way to improve my app's data flow and bring the CardView update count down to my target of four. Let's go back to Xcode to explore how I can use this approach to improve my game.
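For deployment targets earlier than iOS 26, that manual equality check might look something like this minimal sketch (the Game class and property names are illustrative):

```swift
import Observation

// A sketch of guarding writes so an unchanged value doesn't trigger extra
// view updates on releases before iOS 26.
@Observable
final class Game {
    private(set) var pickOneCardIndex: Int?

    func setPickOneCard(index newIndex: Int?) {
        // Only write when the value actually changes.
        guard newIndex != pickOneCardIndex else { return }
        pickOneCardIndex = newIndex
    }
}
```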
Let me switch back to Xcode. And I've already coded up the Observable changes I described in an updated version of my game view called GameViewObservable. So I'll change my app's main view to that view and then command-click to jump to the new view. Here's my Observable class called Game. I've moved the state variables for the list of cards, the pick-one-card index, the pick-two-card index, and the matched cards into the class. And I'm storing the class in a state variable.
Previously, each card's button action was writing the bindings in order to update the game state variables. Now I'm passing the game object to card view, so the action can update the game object's properties when a card is tapped. Before, each card was also reading all of the game state bindings to decide which side of the card to show. That meant that any time any of the game state variables changed, every card had to update.
The way I fix this is by creating this isFlipped property on the card view. Now the decision about whether the card is flipped is made outside of the card view. This breaks the dependency on all of the state variables, which means only cards that have actually changed will be updated.
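Here's roughly what that looks like, a sketch with approximate names rather than my exact code. The key points are that the state lives on the Observable class and each card is handed a precomputed isFlipped value:

```swift
import SwiftUI
import Observation

@Observable
class Game {
    var cards = ["G", "O", "B", "A", "N", "D", "K"]
    var pickOneCardIndex: Int? = nil
    var pickTwoCardIndex: Int? = nil
    var matchedCardIndices: Set<Int> = []

    func isFlipped(_ index: Int) -> Bool {
        pickOneCardIndex == index || pickTwoCardIndex == index || matchedCardIndices.contains(index)
    }

    func pick(_ index: Int) {
        if pickOneCardIndex == nil { pickOneCardIndex = index } else { pickTwoCardIndex = index }
    }
}

struct GameViewObservable: View {
    @State private var game = Game()

    var body: some View {
        LazyVGrid(columns: [GridItem(.adaptive(minimum: 80))]) {
            ForEach(game.cards.indices, id: \.self) { index in
                // The flip decision is made out here, so CardView no longer
                // depends on all three game state variables.
                CardView(letter: game.cards[index],
                         index: index,
                         isFlipped: game.isFlipped(index),
                         game: game)
            }
        }
    }
}

struct CardView: View {
    let letter: String
    let index: Int
    let isFlipped: Bool
    let game: Game

    var body: some View {
        Button {
            game.pick(index)   // writes game state, but never reads it
        } label: {
            Text(isFlipped ? letter : "?")
        }
        .disabled(isFlipped)
    }
}
```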
This works because SwiftUI runs a view's body if the value of the view struct changes. When the properties on my game class are updated, if isFlipped hasn't changed for a given card, SwiftUI can skip updating it. Now I'll profile again to verify my changes. And this time I'll just make an incorrect guess since that's the one that had 42 updates. I'll tap O and then B. And I'll stop recording.
And I'll highlight the area where I tapped and expand my app's module. And the CardView has four updates, and that's exactly what I was going for. Let's go back to the cause and effect graph to inspect how it changed. And the graph is a lot simpler now. If I click on the arrow leading to the CardView body node, I see that there are four updates. And looking in the inspector on the right, isFlipped changed four times. This basically means that when isFlipped changed, that's what caused the CardView bodies to run: one update for each card I tapped and one for each of the cards flipping back over after my incorrect guess.
This shows how taking the time to understand and optimize your app's data flow can make a significant difference in how many times your views update. To recap: examine your view's updates in a trace. If a view is updating far more frequently than you expect, it's worth investigating. Not all unexpected updates will lead to performance problems, but they frequently do. It's important to structure your views to only depend on the data they need to keep them up to date. If a view updates frequently without actually changing, that's a sign that there's room to improve your data flow. Observable classes are a good way to establish only the dependencies you need. Since writing to an observable property doesn't trigger an update in the view you're writing it from, you can update it from anywhere, knowing that only the views that read the property will update. You can also break unwanted dependencies by moving conditional view logic outside of a view. This allows SwiftUI to skip running the view's body if nothing has changed. This is what I did with the isFlipped property on my CardView.
Before I move on, I need to say one more important thing about this topic. To optimize your app's data flow, you really only need to do one simple thing. Rewrite your entire app. I'm joking. You can apply the techniques I'm sharing today to incrementally improve your app's performance right away. I've just explored how Observable can help eliminate extra updates, but this advice isn't necessarily one-size-fits-all.
Here are some things to consider when choosing between State and Observable. State is great for storing data locally in the view that owns it, when you need to persist that data across view-body updates. State is also useful in cases where you have a subview that effectively needs to borrow its parent's state to read or modify it using a binding. This pattern is used extensively throughout SwiftUI's APIs, like sheet presentation or controls.
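For example, here's a minimal sketch of that borrowing pattern with a sheet; it's an illustration, not code from the game:

```swift
import SwiftUI

struct SettingsHost: View {
    // The parent owns the state...
    @State private var showingSettings = false

    var body: some View {
        Button("Settings") { showingSettings = true }
            // ...and the sheet borrows it through a binding to read and write it.
            .sheet(isPresented: $showingSettings) {
                Text("Settings go here")
            }
    }
}
```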
Observable classes are great for shared state in your app that needs to be read or written by many views across your app. They give you more control over when your views update, based on where you choose to read and write their properties. For instance, now my card views write to the game's state properties on my new Observable class, but they don't read those properties, so my class allows me to avoid extra card updates.
So I'm pretty confident that I have things under control for the updates that happen when tapping card views. But my investigation isn't quite done yet. One area where good or bad app performance is particularly noticeable is when scrolling. Smooth scrolling is key to having your apps feel fast and responsive. Hitches while scrolling, like in the example on the right, are distracting and can make an app feel unresponsive and unpolished. With that in mind, I'd like to get a sense of my app's scroll performance. The best way to start evaluating scroll performance is to try it yourself. So let's go back to the app.
Typically, I'll start by running the app and scrolling to get a sense of the scroll performance. So I'll switch back to Xcode and I'll run the app now. And as I scroll up and down, the pattern in the background moves diagonally at a different speed from the foreground. This is the parallax effect I mentioned earlier. But the scrolling doesn't seem very smooth to me. The next step is to verify that the app is actually hitching by profiling using instruments. So I'll go ahead and stop the app.
And I'll click Product and Profile again. And record. And this time I won't even tap any cards. I'll just scroll to the bottom and stop the trace. I'm particularly interested in the Hitches instrument for this trace. And sometimes when the data takes longer to transfer, that's a sign that there may be a ton of updates and plenty to look at here.
The Hitches instrument is almost solid red during the time I was recording, so I'm not just imagining the bad scroll performance. I'll highlight the section of the trace where I was scrolling. And the Hitches instrument shows that while I was scrolling, the app was hitching for 800 milliseconds.
and I was only scrolling for less than two seconds. So that basically means that the animation was frozen for a large percentage of the time while I scrolled. I should check what kind of work SwiftUI was doing. So I'll click the SwiftUI instrument. And there are a staggering 150,000 updates that happened while I was scrolling.
The majority of the updates are from SwiftUI itself, about 140,000. But like I mentioned before, I'll start by analyzing my own views' updates because they're ultimately responsible for most of the SwiftUI work. For my own views, I had about 6,000 updates. I'll expand my app's module to check which views were updating. The CardView body updated around 3,600 times, the new GameView updated 135 times, and so did BackgroundView.
And strangely enough, CardBackView also updated 2,200 times. It makes sense to me that BackgroundView is updating because of the parallax effect that causes it to move when I scroll. But CardView and CardBackView don't seem like they should update at all, since they're not even changing.
The average duration for each of these updates is very short, just a few microseconds. But it's important to remember that each of these view updates can also trigger more expensive SwiftUI work. If I expand the SwiftUI module, there are a ton of button updates happening while I'm scrolling.
Almost 22,000. These are the buttons in each card. Let's go back to My View Updates to understand what's happening. There are a few different things I want to investigate here, but I'll start with the biggest set of updates for my own views, the card view. I'll go to the cause and effect graph again to try to track down why the cards are being updated.
To the left of the CardView body node, there are a few nodes. The graph shows lots of different updates from SwiftUI, but I like to focus on the nodes with the most updates. This ForEach node is causing a ton of updates to CardView, about 3,400. If I click the line leading from ForEach to my CardView body node, the inspector shows which properties caused the CardView body to update. Here, the changed property that's updating the CardView 3,400 times is the backView property. Let's go back to Xcode to investigate.
backView is the view builder closure I pass to the card to render the back. But the backView property is a view builder closure, and its contents don't change at all while I'm scrolling. So how could this property be changing and causing all of these updates? This brings me to the next topic.
Use escaping closures only when needed. Let's break down what's happening here. I'll start with a simplified version of my game view. I have the state variable for cheat mode, and the CardView in my ForEach. The CardView takes the isFlipped parameter, a view builder closure for the front of the card, and one for the back.
And here's a simplified card view. It has the isFlipped Boolean and the two ViewBuilder properties where I'm storing the closures that produce the front and back of the card, and the button label that shows the front view if isFlipped is true. I know from my trace that GameView is updating quite a bit while I'm scrolling.
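In code, the simplified version looks roughly like this; a sketch with assumed names, not the exact project code:

```swift
import SwiftUI

struct CardView<Front: View, Back: View>: View {
    var isFlipped: Bool
    // The synthesized init stores these escaping closures as properties.
    @ViewBuilder var frontView: () -> Front
    @ViewBuilder var backView: () -> Back

    var body: some View {
        Button {
            // flip handling elided
        } label: {
            if isFlipped {
                frontView()
            } else {
                backView()
            }
        }
    }
}

struct GameView: View {
    @State private var isCheatModeEnabled = false
    private let cards = ["G", "O", "B", "A"]

    var body: some View {
        ForEach(cards, id: \.self) { card in
            CardView(isFlipped: false) {
                Text(card)                          // frontView captures card
            } backView: {
                // Reading the state variable implicitly captures self,
                // the whole GameView value, in this escaping closure.
                Text(isCheatModeEnabled ? card : "?")
            }
        }
    }
}
```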
I'll investigate that later. But my card views are also updating thousands of times while I scroll. The cause and effect graph told me that this back view closure property changing is the cause of the updates. But how? The contents of this view builder are unchanged for the whole time I'm scrolling.
If I examine the view builder closures more closely, each one captures a variable from its enclosing context. frontView captures card, which is local to the ForEach, and backView captures isCheatModeEnabled, the state variable on the view. But the backView closure is also capturing something else that isn't necessarily obvious. It's implicitly capturing self.
And what is self here? It's the GameView struct, which was updated while scrolling. Every time GameView's body runs, the CardView initializer is passed a newly allocated backView closure that captures this changed self value. That explains why CardView is updating: its backView property is continuously changing. That's definitely not what I want. So how can I prevent this? Well, the closure changes, but the view it produces doesn't. Let's go back to Xcode and use this information to come up with a fix.
So my backView closure is implicitly capturing self when I'm accessing the isCheatModeEnabled state variable. And this causes the card views to update continuously because the closure is changing. The best way to fix this is to stop storing the view builder closures on the card view and instead store their results. I'll switch back to CardView. And here are the ViewBuilder closures. The syntax I'm using here causes the synthesized init for this view to store these escaping closures as properties on the view. I can update both of these view builders to no longer be stored closures by simply editing their types.
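Edited, the stored properties become views rather than closures. This sketch follows the change described here, again with assumed names:

```swift
import SwiftUI

struct CardView<Front: View, Back: View>: View {
    var isFlipped: Bool
    // Was: @ViewBuilder var frontView: () -> Front
    // Now the synthesized init runs each view builder closure once and
    // stores the resulting view, so there's no stored closure to compare.
    @ViewBuilder var frontView: Front
    @ViewBuilder var backView: Back

    var body: some View {
        Button {
            // flip handling elided
        } label: {
            if isFlipped {
                frontView    // referenced directly; no longer a function call
            } else {
                backView
            }
        }
    }
}
```

The call sites can stay the same; the same trailing closures now initialize the stored views instead of being stored themselves.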
This causes the synthesized init to execute the closures and store their output in the frontView and backView properties, instead of storing the closures themselves. I'm updating both closures because even though frontView wasn't causing me a problem, in the future it could, so it's better to be consistent. I also need to update my view body to reference the stored views directly since they're no longer functions. Now let's profile again. I'll scroll to the bottom again and stop recording. I'm hoping to see a reduction in CardView updates and hopefully fewer hitches as well.
I'll highlight the area where I was scrolling again. And I'll check for the updates to CardView. In my last trace, there were thousands of CardView updates, and now there are only 18. I can expand CardView to see what kinds of updates these were. All 18 updates were creation events instead of updates. So these are just the new cards being created as I scroll. Let me check the SwiftUI updates too.
The massive number of button updates I had before are gone, which is great news. Let me check the previous trace to get a sense of the overall improvement. So in the previous trace, I had a total of 154,000 updates, taking 1.25 seconds. And in the current trace, I have 122,000, taking about 900 milliseconds. That's a pretty significant improvement, and it explains why I have fewer hitches. But there are still some hitches left, and I'd definitely like to reduce those further.
For now, I've made significant progress in reducing both SwiftUI's updates and my own. And it's great to make incremental progress when profiling an app. Let me go over what I've just shown you. Closures are spooky. They're difficult to compare. Various characteristics of closures, such as their captures, mean that it isn't always possible to determine if a closure has changed between view updates. This means that storing closures on your views can affect SwiftUI's ability to determine what needs to be updated. So you should avoid doing this if you can. Run view builder closures when you initialize your views and store the result instead of the closure. This allows SwiftUI to compare the view produced by the closure instead of the closure itself. You can do this explicitly in your initializers, or update your property types like I did for my frontView and backView closures to have the synthesized initializer correctly handle this for you.
SwiftUI can't always compare closures. This can cause extra view updates. It's best to avoid storing escaping closures on your views if you can. If you absolutely have to store an escaping closure on a view, for something other than a view builder, for example, your view may update every time its parent updates. Minimizing updates to the parent view can reduce the impact of this. You can determine if you're facing this issue by checking the cause and effect graph in Instruments.
I've eliminated a lot of unwanted CardView updates by fixing the issue with escaping ViewBuilder closures. But the trace shows I still have some hitches when scrolling. That brings me to the last thing I'd like to talk about today. The importance of removing expensive work from hot paths in your app's code. When you're reacting to changes in geometry that happen continuously, like scrolling or resizing a window, this is an example of a hot path. A hot path is an area of your code that executes very frequently. These areas are extremely prone to performance problems. The parallax effect in my app is a classic example of one of these paths. Here's how it works.
When the scroll view is scrolled, its geometry is continuously sent to the on-scroll geometry change handler. The handler grabs the content offset Y value and passes it to the action. Then I calculate the parallax offset, which is just the Y value divided by 16, a constant I picked because I thought it looked good. The offset is then saved to this state variable.
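Roughly, the hot path looks like this. The parallaxOffset environment key and the view names are assumptions made for the sketch, not my exact code:

```swift
import SwiftUI

private struct ParallaxOffsetKey: EnvironmentKey {
    static let defaultValue: CGFloat = 0
}

extension EnvironmentValues {
    var parallaxOffset: CGFloat {
        get { self[ParallaxOffsetKey.self] }
        set { self[ParallaxOffsetKey.self] = newValue }
    }
}

struct GameView: View {
    @State private var parallaxOffset: CGFloat = 0

    var body: some View {
        ScrollView {
            // card grid elided
        }
        // Runs continuously while scrolling: a classic hot path.
        .onScrollGeometryChange(for: CGFloat.self) { geometry in
            geometry.contentOffset.y
        } action: { _, offsetY in
            parallaxOffset = offsetY / 16     // state write on every scroll change
        }
        .background(BackgroundView())
        .environment(\.parallaxOffset, parallaxOffset)  // environment write on every body run
    }
}

struct BackgroundView: View {
    @Environment(\.parallaxOffset) private var parallaxOffset

    var body: some View {
        Image(systemName: "circle.grid.3x3.fill")
            .offset(y: parallaxOffset)
    }
}
```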
The state variable update causes the view body to run, which in turn writes the offset into the environment. Then the background view reads the value from the environment and uses it to offset the background image. There are two potentially expensive updates happening in the middle of this hot path. The update to the parallax offset state variable and writing the value to the environment. Let's go back to instruments to find out how these updates are affecting my game.
I'll select the SwiftUI instrument and collapse it. And I'm interested in my own views that update a lot when I'm scrolling. I'll start by investigating CardBackView, which doesn't seem like it should be updating at all, but was updated 2,800 times. Let me show you CardBackView in Xcode.
I'll jump to it by using the shortcut for Quick Open, which is Command-Shift-O, and go to CardBackView. This view only has one property, color scheme, which is read from the environment, and it's definitely not changing while scrolling. So what's causing these updates? Let me investigate in Instruments.
I'll expand CardBackView, and there are more details about the updates. Interestingly, the vast majority of the CardBackView bodies are labeled as "Skipped Update," which means that the view was updated, but the body didn't run. I'll check the cause and effect graph again to find out the cause of these skipped updates. Directly leading to my CardBackView body node, there's a CGFloat environment writer that's updating. And before that node, the GameView body is updating a bunch of times. And leading to the GameView body, there are a bunch of nodes for state updates for the parallax offset.
This means that when I'm scrolling, the parallax offset is being written a bunch of times, which causes the GameView body to update, which writes the parallax offset to the environment, which causes a ton of updates to CardBackView where the view body is ultimately skipped. CardBackView isn't reading the parallax offset CGFloat environment value, yet it's still here in the graph with all these skipped updates. But why? Let me go back to Xcode.
The answer lies in how the environment works. When any value in the environment changes, each view that accesses the environment is notified that its body may need to run. CardBackView is reading the color scheme from the environment. But every time any value is written to the environment, like my parallax offset value in this case, this view has to check to make sure the color scheme hasn't changed. All views that read the environment have to do this comparison. Since the color scheme didn't change here, the body is skipped. Hence the skipped updates in the graph. Let me go back to Instruments and the list of updates. All of these CardBackView skipped updates don't take very long, only around 11 milliseconds. But I bet this isn't the only view with skipped updates. To see the others, I can click on the drop-down and choose Consequences. This view shows all the updates grouped by category.
And I have almost 13,000 skipped updates that are taking around 92 milliseconds. That's a lot more than just the CardBackViews. In fact, these skipped updates make up around 10% of the total time. These might not all be caused by my continuous environment updates, but the majority of them likely are, and it's a significant percentage of the total updates in my trace. But these skipped updates still don't explain the massive number of total SwiftUI updates or the hitches, because skipped updates don't cause any other work. Let me go back to the list of all updates.
I said before that there were two potential issues I suspected might be causing me scroll performance trouble: continuously updating the environment, which I just explored, and updating the parallax offset state variable continuously, which causes the GameView body to run over and over again. Now I'll examine whether those GameView body updates may be causing a lot of expensive downstream SwiftUI work. Let me expand the list of SwiftUI updates again.
And there are actually some things here I recognize. ForEach, which I use to build the list of cards; LazyVGrid layout, which makes sense because everything is wrapped in a LazyVGrid; and Text, which I'm using to overlay the letters when cheat mode is enabled. These updates are taking a lot of time.
And I know that they're used in my GameView body. I could search for each of these in the cause and effect graph, but I know my app's code, and these updates can't be coming from anywhere else. Continuously updating the GameView body while scrolling appears to be causing all this expensive downstream SwiftUI work. So in addition to the environment issue, this is also something I need to fix. I've done quite a bit of exploration of the problem, but how can I make this better?
Ideally, I would just monitor the scroll position and share the parallax offset directly with the background view, and only update that view because that's the one that's changing. The good news is I was able to use some of the techniques I've already discussed to achieve this. Let me switch back to Xcode.
And I've made one last new version of the game view with my fixes, called GameViewParallax. So I'll update the main view and then jump to the new view. First, I've moved the parallax offset state variable to my Observable Game class. Next, I updated my on-scroll geometry change action to write the offset to the new property on my Observable class.
Earlier I told you that views only update based on changes to observable properties when reading their values, not writing them. The GameView doesn't read the parallax offset, it only writes it. So when my handler is reacting to the scroll geometry change, the GameView body doesn't even need to run now, allowing me to avoid all that expensive downstream work.
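Sketched out with the same assumed names as before, the relevant pieces now look roughly like this:

```swift
import SwiftUI
import Observation

@Observable
class Game {
    var parallaxOffset: CGFloat = 0
    // card state elided
}

struct GameViewParallax: View {
    @State private var game = Game()

    var body: some View {
        ScrollView {
            // card grid elided
        }
        .onScrollGeometryChange(for: CGFloat.self) { geometry in
            geometry.contentOffset.y
        } action: { _, offsetY in
            // Write-only access: this body never reads parallaxOffset,
            // so the write doesn't cause GameView's body to run.
            game.parallaxOffset = offsetY / 16
        }
        .background(BackgroundView())
        .environment(game)   // the reference never changes, so no continuous environment writes
    }
}

struct BackgroundView: View {
    @Environment(Game.self) private var game

    var body: some View {
        Image(systemName: "circle.grid.3x3.fill")
            .offset(y: game.parallaxOffset)   // only this view reads the offset
    }
}
```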
And now, instead of setting the offset into the environment, I'm setting the game class into the environment. This is safe because the class is a reference type with a pointer that won't change, so it won't cause any of the continuous environment updates I saw earlier. I also could have just passed the game object directly to the background view, but I wanted to make the point here that putting Observable classes into the environment is a safe thing to do. Let's check out my updates to BackgroundView.
The only differences here are that now I'm getting the game class from the environment and reading the parallax offset from the class. Let's profile one last time to check the results of this change. I'll go ahead and record and scroll and then click stop. I'm hoping for a substantial reduction in updates from this change.
I'll highlight the area where I was scrolling again, and I'll check the list of updates. The total update count now is just over 10,000, taking under 100 milliseconds. Let me check the previous trace. The previous trace had 126,000 updates, taking almost a full second. So that's a massive improvement. Let me go back to the latest trace.
The thing I'm most excited about is that there are no hitches anymore. So I know that my scroll performance is as smooth as it can possibly be. These improvements really demonstrate how much reducing expensive work in an app's hot path can pay off. The downstream work caused by updates in these areas can be massive.
The most important takeaway from this last set of improvements, and perhaps the most important takeaway from the talk, is this. Pay attention to hot paths. They're extremely performance-sensitive. And be extra careful when performing updates in these areas, especially to state and environment variables. When you repeatedly update a state variable in a hot path in a view, that can cause the view body to run continuously. All of the subviews in that view body are reinitialized, which can trigger all kinds of other expensive downstream SwiftUI work if you continuously update state in a complex view. If you absolutely have to update a state variable in a hot path, isolating that state to a smaller view can help reduce the cost by minimizing the cumulative work that has to be done for every update. Writing a value to the environment doesn't only affect the views that read the key where you're writing it. It updates any view that reads from any environment key.
Even though view bodies are skipped for these updates, they aren't free, and the time spent can add up as your app's view count grows. That's why you should avoid writing to the environment frequently, especially in hot paths, such as when storing changes to geometry. Use Observable classes instead. They allow you to write values from anywhere without triggering unwanted updates, and to update only the views that read those values.
I'm happy to say that during our time together today, I think I've made my little pet project a lot better. But more importantly, I hope you're already thinking of ways to use the techniques I've shared to improve your own apps. Let me leave you with some next steps. It's never too early or too late to find ways to improve your app's performance by profiling in Instruments. Your app's data flow plays a huge part in how often your views update. Use Observable classes for parts of your app's state that are shared by many views.
Storing escaping closures on your views can lead to unwanted updates. Store the results of your view builders on views instead of the closures themselves to allow SwiftUI to compare your views more easily. And pay special attention to the work happening in hot paths in your code, and focus on minimizing that work to improve overall performance. And lastly, thanks for taking this journey with Pretzel and me. I can't wait for you to unleash your app's full performance potential. Thank you.
Thanks, Steven. That was great. Give it up for Pretzel. All right. Before we conclude this morning, we have one more presentation today. Alejandro is here from Snapchat to share more about how their team is able to detect and fix performance problems even before they ship. So please help me welcome Alejandro.
Cool. Thanks, Cole. Good morning, everyone. My name is Alejandro Lucena, and I'm a software engineer focusing on performance tools at Snap. I'm very excited to demonstrate some of our methodologies for approaching difficult performance problems and how we utilize Apple technologies to enable our teams to do their best work.
Performance tools at Snap
Today, I'll examine some of the fundamental performance goals at Snap and how those goals guide development. Next, I'll discuss the tools we've built for metric analysis. I'll then explain how to leverage UI tests for performance automation before finally going through an interesting example of everything put together. Let's start with the performance goals first.
Two of our most critical performance metrics are focused around P90 or 90th percentile tail latencies for app startup, or how long it takes to launch the app and become usable, and lens apply delay, or how long it takes to apply a lens effect to your camera capture. We use these metrics as a close proxy for the experience we wish to deliver to our customers. Namely, being able to open Snapchat and capture that perfect moment with your friends and family as fluidly and reliably as possible.
It's so frustrating when you feel the need to force close an app because it's loading too slowly or isn't responding to your gestures. Keeping a close eye on these two metrics keeps us accountable, since they represent a likely worst-case scenario and this helps you establish lower bounds on performance.
Given the vast distribution of different iPhones that our customers use, a holistic approach to app performance becomes necessary. This means focusing on what is going to move the needle for everybody. This shifts our attention to the fundamental aspects of application performance, such as locality, parallelism, best software development practices, and being mindful of I/O patterns. And perhaps most importantly, we need to be able to catch regressions from automation via performance tests and fix them before the weekly release.
Of course, however, no single person is an island. There are hundreds of engineers involved in all aspects of the application, and their participation is absolutely crucial in maintaining the overall health of our app. Each of those teams has relevant domain-specific metrics that they're interested in tracking. It's very important that our feature teams have a straightforward way to onboard their own metrics, expressed as intervals and counters. Analyzing those metrics also needs to be as accessible as possible throughout each stage of the product lifecycle. Concretely, this means having at-desk workflows demonstrating how individual changes affect those metrics, PR comments that trigger performance tests, and dashboards that break down the data across interesting dimensions, among others.
Part of the team's spirit at Snap is to minimize the divergence between our in-house processes for iOS and other platforms as this lets us share existing pipelines and infrastructure. Not only is this cost and time effective, but it also promotes collaboration and knowledge sharing between our orgs. And finally, our system should naturally encourage engineers to use Apple's powerful tooling and discover the wealth of information that those tools expose to corroborate our performance findings. Bridging together these desires may seem daunting, so I'd like to walk through the tools we've built to accomplish our performance goals.
First, we built a tracing library for teams to easily emit their metrics. Feature engineers track their interval metrics with beginTrace and endTrace, while events are tracked with traceCounter. In addition to high-level metric tracking, this also supports our teams' needs for detailed tracking with nested intervals as well. Backed by a lightweight buffer, our tracing library ensures that these operations remain low overhead and suitable for production.
The tracing library persists its data in a protocol buffer, a common format suitable for multiple platforms that we support. Critically, the same data file feeds three major processes at Snap. Local at-desk workflows for quickly iterating on prototypes and code changes, production sampling to keep a pulse on customer-level performance, and metric analysis from performance tests for automated regression checks.
We use an open-source web app to examine the protobufs locally. Each of these intervals is ingested via the tracing library and rendered next to the thread that ran it. The at-desk workflow allows our engineers to collect traces on demand and describe what the application was doing, validate the behavior of nested steps, and observe interactions between threads. We also incorporate usage metrics from rusage and task_info as counters in order to understand our impact on system resources.
This grants us a great amount of detail that we use to annotate the trace. We can now easily answer questions about these intervals, such as, was the thread busy with compute, or was it blocked waiting for some data? Each scenario requires a very different approach. The trace data contains those insights directly, removing the guesswork for our teams.
Our teams also need to be able to root-cause metric regressions with our tools. The best way to do this is with the Instruments profiler. Our tracing API optionally emits an equivalent os_signpost call for intervals and counters, which means they're ingested into Instruments automatically. Now, interval names can be matched to deeper system information, such as thread states. For instance, you can now conclude that the main thread was blocked during this innermost interval because it was waiting on a lock held by another thread.
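As a rough illustration of that bridging, and emphatically not Snap's actual library, here's a minimal hypothetical wrapper where interval and counter calls also emit os_signpost calls that Instruments can ingest:

```swift
import os

// Hypothetical sketch: a tiny tracing facade that mirrors its calls to os_signpost.
struct Tracer {
    private let signposter = OSSignposter(subsystem: "com.example.app", category: "Tracing")

    func beginTrace(_ name: StaticString) -> OSSignpostIntervalState {
        // A real library would also append the interval to its in-memory trace buffer.
        signposter.beginInterval(name)
    }

    func endTrace(_ name: StaticString, _ state: OSSignpostIntervalState) {
        signposter.endInterval(name, state)
    }

    func traceCounter(_ name: StaticString) {
        signposter.emitEvent(name)
    }
}

// Usage sketch: the interval shows up in Instruments alongside thread states.
let tracer = Tracer()
let state = tracer.beginTrace("LensApply")
// ... apply the lens ...
tracer.endTrace("LensApply", state)
```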
Now, I want to show you how to leverage XCUITest for performance automation and regression checks. Our performance test suite, based on XCUITest, simplifies how teams author and express their performance-sensitive application flows. Tests are split into setup and measurement blocks. Setup blocks bring the application just one step away from a desired workflow. For instance, you load the Maps tab as setup for map scrolling tests.
The measure block then executes the performance-sensitive workflow without polluting the metrics with setup steps. Once the measurement block finishes, the test runner collects the trace file and loops back to the setup. Additionally, the test suite optionally captures instruments traces for the same exact measurement interval. This unlocks the power of instruments data to triage and understand specific performance tests in depth.
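Here's a minimal sketch of that setup/measure split using XCTest's built-in measure API; the test, identifiers, and metrics are illustrative assumptions, not Snap's actual suite:

```swift
import XCTest

final class MapScrollingPerformanceTests: XCTestCase {
    func testMapScrollPerformance() {
        let app = XCUIApplication()
        app.launch()

        // Setup: bring the app one step away from the workflow under test.
        app.tabBars.buttons["Map"].tap()

        // Measurement: only the performance-sensitive workflow runs inside measure.
        let options = XCTMeasureOptions()
        options.iterationCount = 5
        measure(metrics: [XCTClockMetric(), XCTCPUMetric(application: app)], options: options) {
            app.swipeUp(velocity: .fast)
        }
    }
}
```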
Our automation is oriented around reducing device variance in testing, so we run performance tests across many devices. Each device produces a number of trace files, which are then sent to an analysis service for further processing. We collect a large number of samples to perform proper statistical analysis. And now, without further ado, I'll show you an example of how all of this has come together to catch and root-cause a tricky regression.
As I mentioned at the beginning of this presentation, lens apply delay is one of our key metrics, and we try to guard it very tightly. So it was quite alarming for us when a build system migration regressed this by nearly 14%, especially when no other features were impacted.
Lens engineers verified that build flags matched between the two, lens call sites were identical, and there weren't any differences in build warnings or diagnostics. So what gives? And what do we do about it? The good news is that we were able to run Instruments over the same performance tests and understand exactly what changed, without guessing. In this sample Instruments trace collected from a performance test, the metric is automatically ingested again, which means you can perform detailed analysis on just this particular region. Since this region is associated with high CPU usage, it's instructive to inspect the on-CPU samples.
Focusing in on the metric's range, Time Profiler attributes 355 milliseconds to this interval. I asked Instruments to display this call tree as inverted, so that I'm able to inspect the function calls bottom-up and get right to the lowest level. So far this seems okay. Nothing immediately jumps out as being problematic. But how does this compare to the original profile? In the original profile, that same interval is reported as 278 milliseconds, which is definitely better than 355, so I might be onto something here. And while this inverted call tree appears roughly the same as the other one, there is a key difference upon closer inspection.
Without getting too much into the weeds, the main difference is that our original build system used SIMD versions of our functions, thus enabling data parallelism, whereas the new version remained scalar. A seemingly small difference like this, being responsible for a big performance penalty, might come as a surprise. But the impact is unmistakable.
And so, despite the initial stress that this regression caused, this is a fantastic display of exactly what our tooling aims to solve. First, our automation correctly and repeatedly flagged the regression in what is otherwise an atypical scenario. Normally, build system migrations don't introduce this sort of one-off niche regression. In fact, this is one of those situations where it's easy to discard the result as noise because no actual code changed.
Secondly, this metric was defined by the Lens team, not a performance or performance tooling team, so we know that our tracing library is readily serving our feature teams well. And finally, I want to emphasize that this type of problem requires a tool such as Instruments to root-cause. Trying to deduce this yourself without profiler data is extremely difficult. The fact that we could rely on Instruments to lead us to the root cause, by simply rerunning the test with the proper flags enabled, is a testament to these tools' success.
I want to close out by saying that understanding app performance requires great tools. Modern systems are incredibly complex and have very different behaviors than one might expect. Investing in the right set of tools eliminates hours of guessing and places you on a data-driven path to finding the root cause. Next, it's important to have observability at each stage of the product life cycle. This not only keeps you accountable for the customer experience in production, but it also maximizes your odds of catching and even fixing regressions before they ship.
Lastly, none of this is possible without your teams. Regularly check in with your feature teams and ask about their experience using the tools. Be mindful of their feedback. Consider where there's room for improvement and help to unblock those teams. And with that, I'll conclude my discussion on performance tooling at Snap. I hope this information was useful, and I want to thank you all very much for your time and attention. Now, back to Cole. Thank you.
Thank you so much, Alejandro. All right. Before we conclude this morning's presentations, I'd like to share a bit more about what's in store for this afternoon for those of you here in the room with us today. There are two activities available this afternoon at the Developer Center, a hands-on profiling space and the Performance Labs. I'll start with the hands-on profiling space.
When you checked in today, you should have received a sticker on the back of your badge with a time, and this is when you're invited to come join us in the hands-on profiling room. The activity starts at that time and runs for 45 minutes. This dedicated area is an opportunity to try out some of the profiling tools we talked about this morning, like instruments with your own app.
And we have a bunch of Apple engineers with different specialties to help you understand the data that you collect and find performance opportunities in your app. At the time that's indicated on your badge, follow the signs to the hands-on area. And don't forget to bring your Mac and a device with you.
Next, there's also the performance labs. The labs are great when you already have some power and performance questions and you want to be paired with an engineer to discuss them. Some of you may have already submitted a request for a lab appointment. If you did, check your email for a confirmation. That will tell you what time to check in for your appointment. Just follow the signs to the lab check-in when your appointment time comes around. If you haven't signed up for a lab in advance, that's okay too. Just follow the signs to the lab concierge after lunch. Let us know what you'd like to chat about, and we'll pair you with an engineer.
I do have one more pro tip before you visit either the labs or the hands-on room, and that's that I highly recommend building your app for profiling in Xcode during lunch. In Xcode, navigate to Product, Build For, Profiling, and let your project build completely. That way, if your project takes a long time to build, you won't be waiting on it during any of the activities.
So that's what we have in store this afternoon. And if you're not in a lab appointment or the hands-on area, please feel free to make use of the beautiful open spaces here at the Apple Developer Center. Some time after lunch, we'll also serve some snacks and refreshments. And for those of you joining online, this concludes our activities for today's event. Thank you everyone for joining us here today. And thanks for building your great apps on Apple platforms. Have a great rest of your day.