Ty Smith will share the key metrics that productivity engineers at Uber capture and use to optimize developer experience. He also shares one of their biggest DPE initiative wins, which you will not want to miss.
Building large mobile apps is hard. Keeping developers productive while working on large mobile apps is harder. Most Android developer tools target smaller apps, so what’s the best way to think about building some of the largest apps in the world? In this talk, you’ll learn how Uber enables hundreds of Android developers to contribute to hundreds of apps in a single monorepo while keeping developers productive and shipping reliable apps quickly.
Ty Smith is a senior staff engineer at Uber, where he leads the Android platform group and chairs Uber's Open-Source Technical Steering Committee. He is passionate about tools, frameworks, and open source that help developers make great mobile apps. Ty is a Google Developer Expert for Android and Kotlin, engaging regularly with the community through conferences, open source, and writing, and as an organizer of conferences and meetups. He is an angel investor, tech advisor, and a member of multiple venture capital advisory syndicates. Ty has been at Uber for six years. Before that, he worked at Twitter on the Fabric developer tools, at Evernote, and at a variety of smaller startups and consulting firms.
Performance Acceleration technologies for Android eliminate the pain of idle wait time and avoidable context switching resulting from long build and test feedback cycle times. Gradle Build Scans for Android give you granular analytic information for every build. Failure Analytics can leverage build data to proactively find unreliable builds and tests and learn how many people and environments are affected by the problem. Finally, Flaky Test Management will help you proactively detect flakiness in your application and tests.
Interested in Android Developer Productivity? Try these next steps:
- Hear four experts weigh in on the best ways to improve the speed of your Android builds.
- Watch Nelson Osacky from the Gradle Enterprise Solutions Engineering team show you how to scale your Android build.
- Sign up for our Gradle Enterprise for Productivity Engineers session to learn more about the tools and technology available to scale your developer teams.
Ty Smith: All right. Let's give Rooz a hand. He's a great hype man, huh?
All right, everyone. I don't think I need to intro myself after that, but I'm
Ty, I work at Uber, I lead our Android platform team. I've been doing Android a
very long time now and working for developers for most of that time at this
point. So today we're going to be talking about mobile developer productivity at
Uber. I've given some similar talks in the past, most of the time they were
focused on kind of the in-app development and the libraries and frameworks.
Today we're going to be talking a bit more around the developer experience, some
of the build tools, IDE, how we measure that sort of thing. So real quick, the
agenda, I'll give a brief overview of kind of the scale we're working with
within Mobile, specifically at Uber. If you attended my colleague Gautam's talk
earlier, he gave an overview of Uber in general; we'll be zooming in
to the mobile space a little bit. Then we'll talk through
the development stack and the workflow, how we measure things, how we make our
builds faster, and what are some of the constraints that we have. And finally,
we'll dig into the IDE and some of our dev tools.
So I've been at Uber about seven years, and I've seen this grow quite a bit. Right now, Uber probably has around 800 mobile engineers. I say probably, because we have a lot of polyglot engineers that contribute across all of the different repos, and so typically we measure this by monthly active contributors to our repos. We have tens of thousands of build modules, tens of millions of lines of code in Android, and then our mobile architecture called RIBs, which is a VIPER-like architecture. All of our apps use this, everything that's built has converged on it, and so we have thousands of those architectural units in the code base as well. And then obviously many, many production apps and hundreds of internal apps as well. So let's talk a little bit about how we structured the teams to support this kind of scale. At Uber, there are typically three types of teams, or maybe two and a half. You have your feature engineers, who are on cross-functional teams and are just working on the end product. They may be working on a specific feature in the Rider app or the Eats app. Then within those orgs, we have some platform teams that typically serve a product platform role. They may own some of the core flows or core features, as well as some of the architectural and library needs for those applications; there's a rider platform team, a driver platform team. And then we have the core infrastructure platform engineers and pure library engineers. This is where I live in the organization, and these are folks whose customer, at the end of the day, is entirely other developers, either internally or, with our open-source projects, externally as well. There we go. So to slice that a little differently, here's a very simplified view of our org chart reporting up to our execs. We have multiple big orgs.
As I said, under platform engineering we have all of infrastructure, you know, compute, storage and everything else, as well as developer platform, which is an org of about 200 people that focuses entirely on building our DevEx, and frameworks, and code review and everything else. Under that, a sub-org called Mobile Platform is what I lead. This is probably about 30 engineers total across Android and iOS, and then we have a number of kind of dotted lines into that. And then the mobility org, you know, has Rider and Driver; they have their platform team, which I described, as well as a ton of feature teams working on those end features. Then there's a delivery org, and a bunch of other one-off orgs, like Safety and Freight, that have similar shapes, but this is just to give you an overview. And then we have a cross-cutting guild for our more senior mobile engineers called Mobile IRC. We use that branding as kind of a cross-functional guild in general, and then there are specific domains like mobile, web, email, that sort of thing. So if we zoom in a little more on my org, which is serving the developers and which we'll be talking about a little more today, mobile platform has a number of sub-teams. We have DevEx teams, we have foundations, which is frameworks teams, we have networking teams, we have observability, data, dev tools, testing, all kinds of stuff that sits under this group.
And our vision is to be the industry leader in how developers build, deploy and manage high-quality software productively and at scale. This is for our entire org, and when we look at the mobile engineers specifically, we really take this to heart as our north star at the company. At the end of the day, this little cartoon illustrates well what we're trying to build. We want other companies to look and say, well, how did Uber do that? We're going to take some inspiration from that.
So let's talk a bit more about the tech stack and the developer workflow that all of the engineers at the company are using to build, the one our org is putting out. We're in an Android monorepo and an iOS monorepo; I'll talk about Android specifically since that's my bread and butter. We have a number of apps that live within that, and they sit on top of a core set of architecture and framework and build tools. Under that we support Kotlin as a programming language. We have many open-source libraries, we have a documentation platform we own, we have a storage library, and UI libraries, and all kinds of stuff like that. And then we have build tools, and testing, and an IDE team, and a device lab, and a bunch of stuff, plus some cross-functional groups that focus on things like observability or networking. And here, if I zoom in a little bit more, here's a big list of things that I'm not going to go through; you can reference it when I post the slides. But this starts to line-item a few of the things in each one of those areas, from our architecture RIBs, to some of our open-source libraries, to codegen for our views like Stylist and Artist, to our remote development environment and our developer analytics stack. At the end, there are resources with a bunch of links where these are open source, so you can dive into more of them. This is mostly to give you the perspective that we do push a standardized environment, so when those feature engineers are working in it, they have a very standard stack that they can use.
So the mobile workflow looks a little bit different if you're a mobile developer at a big company than if you're a backend developer. If you're a backend engineer, you might be able to deploy quickly, and if there's an issue, you can fix it, you can redeploy, you can hotfix. For mobile engineering there are some different constraints, right? You're putting out a binary into the wild. It has a long period of time going through App Store review and verification. You have end customers that may be on versions much older than the current one, and they may not update regularly. And so we have this long cycle of verification: first moving from the developers' inner loop where they're developing, to CI and our submit queue, to our CD pipeline, finally into dogfooding and verification with testing. At any one point, as a change moves further to the right, if there's an issue, that feedback is more expensive, right? So we want to prioritize moving these things left, but this is kind of an overview, and as we talk about measuring our analytics and our DevEx, I'll call back to this a little bit. So let's hop into how we think about measuring this. We're an org serving 400 to 500 Android engineers. We have our org goals, KPIs, everything else. Obviously, we need to get leadership buy-in, we need to have things we can measure. How do we think about that? Well, we have a lot of different things that we measure. You've heard in some of the earlier talks people discussing NPS, or Net Promoter Score; that's definitely a really important one to us. We do regular developer surveys, take a bunch of feedback, and ask the NPS question.
But then we measure overall developer throughput and aggregate build time, and a lot more granular things like failure rate, git performance time, and the overall release funnel time I was showing a minute ago; the time there, the delays there, that's all measured. IDE performance: we have a lot of metrics in the IDE, like indexing and code analysis time, file opening time, and a bunch of stuff like CI/CD uptime. And then the apps themselves: reliability, performance, you know, we measure those.
So NPS, I won't go too deep into this; you've heard a lot about it today, but it asks the question, would you recommend this developer stack to someone else? We've been tracking this for a long time, and we've seen it consistently going up and to the right as we continue to invest and developers get more productive and happier. In that survey, we also ask a bunch of free-form questions, and with those comes a lot of great feedback that we're able to take into our roadmaps and planning. The other big part of our analysis stack, though, is this tool we call LDA, or Local Developer Analytics. This is a daemon that we run on everyone's developer environment, on their local computers, and it has hooks into all the different tools that they're using. It collects this data and forwards it up to a log-collection front end, which pumps it onto Kafka. And we can, you know, set that up for monitoring; we can search it, we can query it, we can use it for debugging. It becomes really powerful for understanding the data, so that we can move from just taking anecdotal customer feedback to quantifiable things that we can measure. So what are some of those local hooks that I mentioned? Well, we hook into git, so we can get git performance time. We hook into Arcanist, which is the CLI for Phabricator, our code review tool, like GitHub. We're moving off of that to GitHub, but it's still around for folks today. We have wrappers for Buck or Bazel or Gradle, depending on what folks are using, so that we can get the build data out and pump that up. And then we have customized IDE plugins that forward all of the IntelliJ or VS Code data up as well. And then the standard custom CLIs throughout the company typically emit logs to LDA as well.
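To make that hook pattern concrete, here is a minimal sketch of a local analytics hook. This is an illustration, not Uber's actual LDA code: the spool file name, JSON shape, and the `LdaHook` class are all hypothetical, standing in for a wrapper that times a tool invocation and appends an event line that a local daemon could tail and forward.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Instant;

// Hypothetical sketch of a local-developer-analytics hook: time a tool
// invocation and append a JSON event per line to a spool file that a
// daemon could tail and forward to a log-collection front end.
public class LdaHook {
    static final Path SPOOL =
        Paths.get(System.getProperty("java.io.tmpdir"), "lda-events.jsonl");

    // Runs the given action, measures wall-clock duration, records an event.
    public static long timed(String tool, Runnable action) throws IOException {
        long start = System.nanoTime();
        action.run();
        long millis = (System.nanoTime() - start) / 1_000_000;
        emit(tool, millis);
        return millis;
    }

    // Appends one JSON object per line (a common "jsonl" spool format).
    static void emit(String tool, long durationMs) throws IOException {
        String event = String.format(
            "{\"tool\":\"%s\",\"duration_ms\":%d,\"ts\":\"%s\"}",
            tool, durationMs, Instant.now());
        Files.write(SPOOL, (event + "\n").getBytes(),
            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // In a real hook this Runnable would shell out to `git status` etc.
        long ms = timed("git-status", () -> { });
        System.out.println("recorded git-status in " + ms + "ms to " + SPOOL);
    }
}
```

The same shape works for any of the hooks mentioned above: a thin wrapper around git, the build tool, or an IDE plugin that emits an event per invocation.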
So what does that look like in practice? Well, here's one of the debugging dashboards we have for the IDE. We're measuring indexing time and code analysis and a bunch of the stuff that you probably heard about if you went to the JetBrains talk earlier about why monorepos and IntelliJ don't really play nicely together. We're working closely with them and trying to improve a lot of this, but it takes measurement and data to get to the point where we understand what those problems actually are. Having dashboards like this, being able to set up alerting and understand what's going on, has been really, really powerful for us. So let's move on to the next category I want to talk about. We've talked about measuring. Great. We know what work we're going to prioritize, we know what we're actually going to fix. Let's address the primary issue that people always bring up in the NPS feedback, which is builds being slow. A brief history of the build systems for mobile at Uber: back in 2016, we moved off of Gradle to Buck. We were using an open-source project that we maintain called OkBuck, which is a Gradle plugin that generates dynamic Buck build files. We were much smaller at the time, but it was still a pretty large app. From 2016 to 2019, we migrated fully off of dynamically generating the Buck files to a pure Buck implementation, where we're really only using Gradle for dependency management at this point, and we're using Buck for a much more parallelized and hermetic build. Last year we started evaluating a move to Bazel to align with the rest of the industry, and we've been deep into that migration ever since. Right now Bazel is running in shadow to Buck, and we're starting to take beta users with a plan to fully migrate over and deprecate Buck next year.
Now, if you're not a mobile developer, you may not understand the pain points that come with build times for mobile folks; they're pretty unique. So here's an interesting analysis I did a little while back that helped educate some of our leadership, who weren't mobile engineers, on this pain point. If you're building a mobile app, you're typically building a giant monolith, right? There are tons and tons of modules, all assembling into one binary. If you're building backend and you're in a typical microservice architecture, you're probably just building a handful of targets when you compile something. So we used LDA to query the data, and we saw that the P75 number of build targets built per invocation when a developer is coding is close to 200 on the Rider app, and it gets up to 100 on Driver and Eats. That means every time they make any change and hit build, it has to recompile that many targets, versus our backend services, where that number was 2. So obviously it's much worse on mobile. But let's view this in a slightly different way. This is the dependency graph visualized for our Rider app, and as you can see, it's very clear what the architecture looks like from this picture, right? Well, maybe let's zoom in a little, make it a little clearer. As you can see, it's a giant spaghetti mess of dependencies. This is one of the problems you get into with really large mobile apps where you're assembling these monoliths: even with a very clearly defined architecture and dependency direction, you end up with hell like this. So let's start to break apart the compilation problem with some basic dependencies. This is a visualization of part of that graph: we have a couple of modules that depend on each other in a transitive chain.
Well, let's talk about how this slows us down and how we can make it faster. In this case, if I made a change to BazModule and I built Foo, it's going to need to rebuild everything up the chain without any optimizations. Now, for 12,000 modules, that means you're going to be rebuilding a lot of stuff. But a tangent for a minute to explain why this is even worse for us: in the Android world, people have moved off of Java to Kotlin, and we did a study a few years ago, released on our blog, that demonstrated Kotlin compilation is two times slower than Java with Error Prone on most workflows, and about four times slower than Java without Error Prone. And it gets worse in some common cases, like if you're doing annotation processing or mixing Java and Kotlin, both of which we do a ton of. So not only are we rebuilding 12,000 modules, but we have a ton of Kotlin code, and Kotlin is much slower than Java, so now our mobile engineers are really suffering. So I'll give a couple of high-level build concepts that we need in order to talk about the optimizations. We'll first cover build inputs and outputs, cache keys, and then ABI jars. In a hermetic build environment, you have a build target and it takes a number of inputs: you know, the sources, the configuration, the dependencies, the toolchain information. The output of that is an artifact: an archive, a JAR, an AAR, an APK, whatever. That entire thing is cached, and out of it you get a hash; that's the cache key. This is how the build system references that specific module, and when you make a change, it can query the build cache and say, hey, has this hash changed? Do I need to rebuild this, or can I use the thing on disk? It's a similar basic concept for Bazel, Buck, Gradle, whatever. So ABI jars are where this gets interesting. If we have foo depending on bar here and we make a change to bar, does foo get rebuilt? Well, it depends.
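Before getting to the ABI answer, here is the cache-key idea from above as a toy sketch. This assumes nothing about Buck or Bazel internals; it just shows the core property that a key is a stable hash over all of a target's inputs, so identical inputs mean a cache hit and any changed input means a rebuild.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch (not Buck/Bazel internals): a cache key is a hash
// over all of a target's inputs: source hashes, dependency keys,
// toolchain, configuration. Same inputs, same key, reusable output.
public class CacheKey {
    public static String compute(Map<String, String> inputs) throws Exception {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        // Sort inputs so the key is independent of map iteration order.
        for (Map.Entry<String, String> e : new TreeMap<>(inputs).entrySet()) {
            sha.update((e.getKey() + "=" + e.getValue() + "\n")
                .getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : sha.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> v1 = Map.of("src/Bar.kt", "hashA", "toolchain", "kotlinc-1.7");
        Map<String, String> v2 = Map.of("src/Bar.kt", "hashB", "toolchain", "kotlinc-1.7");
        System.out.println(compute(v1).equals(compute(v1))); // same inputs: cache hit
        System.out.println(compute(v1).equals(compute(v2))); // changed source: rebuild
    }
}
```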
Without ABI jars, if you had a bar module that just had this basic method bar with a body of foo, when you build that, you get a class file that looks similar. If I change the body of that method from foo to foobar, it's going to rehash the class file, and that's going to be a different hash. So if I make a change to bar, foo is going to say, hey, that's different, I need to rebuild foo as well. ABI jars are how you can address some of this. An ABI jar is a jar of class files with the method bodies stripped out, along with private methods and anything else that's not exposed to consumers. So in the same case, I have the bar method, and the class file that's generated is missing the implementation detail of this public method. When I hash that, it's hashing the stripped contents, and if I change the body from foo to foobar, the class file doesn't change; hashing it again gives the same value as before. This means I can make these changes in the bar module and I don't necessarily have to recompile foo. So this is one of the techniques we can use for optimizing the build. We've talked about ABI jars at a high level; now we'll dig into class ABI jars versus source ABI jars. A class ABI jar is the more naive technique that some build systems use to generate this. You take your source module, you pass it to JavaC or KotlinC, and you get your class files in your jar. You then apply an ASM processor that strips out the insides of those methods and gives you a second artifact, which is the ABI jar. This gives you the advantage we talked about: consumers don't have to rebuild. But obviously it's more expensive, because you're running full JavaC and then still running a second step. This is where source ABIs give you an optimization.
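Here is a toy model of why an ABI jar changes the invalidation behavior. Real ABI jars are produced by stripping method bodies out of class files (e.g. with ASM); this sketch just models a class as a map from method signature to body to show which edits change each hash. The class and method names are illustrative.

```java
import java.util.Map;
import java.util.TreeSet;

// Toy model of ABI-based hashing: the full hash covers signatures AND
// bodies, while the ABI hash covers only the public signatures, so a
// body-only edit leaves the ABI hash (and thus consumers) untouched.
public class AbiFingerprint {
    // Full-jar hash: any edit, including a body change, changes it.
    public static int fullHash(Map<String, String> methodsBySignature) {
        return methodsBySignature.toString().hashCode();
    }

    // ABI-jar hash: derived from the sorted public signatures only.
    public static int abiHash(Map<String, String> methodsBySignature) {
        return new TreeSet<>(methodsBySignature.keySet()).toString().hashCode();
    }

    public static void main(String[] args) {
        Map<String, String> barV1 = Map.of("public String bar()", "return \"foo\";");
        Map<String, String> barV2 = Map.of("public String bar()", "return \"foobar\";");
        System.out.println("full hash changed: " + (fullHash(barV1) != fullHash(barV2)));
        System.out.println("abi hash changed:  " + (abiHash(barV1) != abiHash(barV2)));
    }
}
```

Since foo's compile only consumes bar's ABI hash, the body edit from foo to foobar does not invalidate foo, which is exactly the avoidance described above.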
A source ABI jar is generated at source-compilation time, not as a post-processing step. At a high level, we hand the module to JavaC or KotlinC, and without introducing a second step, it's able to output both the ABI jar and the regular class files. To dive into that a little more: JavaC has four stages in the compiler. There's a parse+enter section that's relatively cheap, where it's parsing the code, and then analysis and generation stages. What Buck does, which is interesting, is output the ABI jar at the parse+enter stage, because there's enough information at that point. That can unlock downstream dependencies, and then the compilation finishes and produces the full jar. So what does that look like?
Well, you use this thing called rule pipelining, which enables build targets with multiple steps. Because of that, you can emit the ABI early, out of parse+enter. That unlocks the foo module: foo can start compiling before bar is done, and so your overall build, assuming you have a huge number of dependencies, is going to be faster. Now, this gets a little more complicated with Kotlin. Kotlin can also produce a source ABI, via a compiler plugin maintained by JetBrains. Unfortunately, by the time KotlinC has enough information to output the ABI jar, the remaining work to output the full jar is negligible, so it outputs both at the same time, and you don't get the same optimization you see in JavaC. And this is where it gets complicated for mixing Java and Kotlin sources. If you're mixing these, you have a pipeline internal to the build system. Typically, first you take your mixed sources and hand the Kotlin files to KotlinC. You also hand the Java files to KotlinC: it uses the Kotlin files for compilation and the Java files to analyze and reference types. Out of that, you get class files just for the Kotlin sources. Then it hands all of those class files and the Java sources to a second step in JavaC, which puts the Kotlin class files on the class path, compiles the Java, and bundles everything up into a jar. The problem is, if you use mixed sources, you can't use source ABIs and rule pipelining by default. So it's more expensive to build the same Java and Kotlin sources as one mixed target than as two dedicated targets, one purely Kotlin and one purely Java, with a dependency between them. And so we thought we could make this faster.
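The rule pipelining described above can be sketched with futures. This is a model of the scheduling behavior, not Buck's code: bar's compile completes an ABI future right after its cheap parse+enter phase, and foo's compile only depends on that future, so it runs before bar's expensive phases finish.

```java
import java.util.concurrent.CompletableFuture;

// Sketch of rule pipelining: bar emits its ABI jar after parse+enter,
// which unblocks foo's compile before bar's full jar exists.
public class RulePipelining {
    public static String build(StringBuilder trace) {
        CompletableFuture<String> barAbi = new CompletableFuture<>();
        // foo's compile is registered up front: it only needs bar's ABI.
        CompletableFuture<String> fooJar = barAbi.thenApply(abi -> {
            trace.append("foo:compile(" + abi + ");");
            return "foo.jar";
        });
        // bar's compile, run single-threaded here so the trace is deterministic.
        trace.append("bar:parse+enter;");
        barAbi.complete("bar-abi.jar");        // ABI emitted early: foo starts now
        trace.append("bar:analyze+generate;"); // bar's expensive phases finish later
        return fooJar.join();
    }

    public static void main(String[] args) {
        StringBuilder trace = new StringBuilder();
        System.out.println(build(trace));
        // Trace shows foo compiling before bar's full jar is done.
        System.out.println(trace);
    }
}
```

With thousands of targets in a deep chain, this overlap between a dependency's generation phase and its consumers' compilation is where the wall-clock win comes from.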
And we introduced a hybrid mode of rule pipelining, which helped a lot here. Kotlin behaves the same: we take the source files and output the class files, but we also output the ABI class files at the same time. Then we hand all of that to JavaC, and JavaC uses the source ABI behavior we talked about: it exits early after parse+enter and outputs the Java ABI class files, we compress all of the ABI class files into one jar, and then JavaC continues with the remainder of the compilation and outputs the full jar. Now, this may not look like it saves you much, but in a large system with tons of dependencies, we've now reached parity between dedicated Java targets and mixed Java and Kotlin targets. With this Chrome-tracing-style graph we have here, you can see that while you still have to do all the KotlinC work, we get the same early exit from JavaC. This is what enabled us to turn on mixed source sets for the whole Uber repo; we had actually been banning them until that point. So we've talked about ABI jars and mixed source sets. Let's talk about the next optimization, which is per-class compiler avoidance, not just ABI-jar compiler avoidance. To give an overview: let's say you have a module with two dependencies, but it's only using references from one of them. If you make a change to the one that's unused, then based on the traditional dependency setup of Buck or Bazel, it's going to require you to rebuild the parent anyway. So Buck introduced this thing called used-classes.json. It uses a compiler plugin in KotlinC and JavaC, with a wrapper on the file manager, so anything that's loaded is tracked. It then outputs an artifact that keeps a list of all the things that were used, and that's applied as a filtering mechanism over the dependencies in the build system.
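The filtering step at the heart of used-classes.json reduces to a small set check. This sketch is my own illustration of that decision logic, not Buck's implementation; the class names are hypothetical.

```java
import java.util.Set;

// Sketch of the used-classes.json idea: the compiler records which classes
// a module actually loaded; at incremental-build time, a change in a
// declared dependency only forces a rebuild if a used class changed.
public class CompilerAvoidance {
    public static boolean needsRebuild(Set<String> usedClasses,
                                       Set<String> changedClasses) {
        for (String c : changedClasses) {
            if (usedClasses.contains(c)) return true; // a real reference changed
        }
        return false; // declared-but-unused dependency change: skip rebuild
    }

    public static void main(String[] args) {
        // foo declares deps on both bar and baz but only references bar.Bar.
        Set<String> used = Set.of("bar.Bar");
        System.out.println(needsRebuild(used, Set.of("baz.Baz"))); // unused dep changed
        System.out.println(needsRebuild(used, Set.of("bar.Bar"))); // used dep changed
    }
}
```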
So if I have the same example where I make a change in something that's unused, it's not going to be in used-classes, and therefore I'm not going to need to rebuild the consumer, even though the build system thinks it's a dependency. And what does that look like in practice?
Well, it's actually pretty trivial. The Java one is in the Buck repo, and the Kotlin one is now on the dev branch of the Buck repo as well. We have a PR out for Bazel too; I can give you the link for that in a minute. The implementation is just a Kotlin compiler plugin that extends the analysis handler extension. It has a call checker and a declaration checker, so it gets visited for every one of those type references, writes them to a map, and at the end of the step writes that all out in the format the build system expects: JDeps for Bazel, which is used in strict-mode warnings and the upcoming compiler-avoidance PR, or used-classes.json for Buck. So we've talked about per-class compiler avoidance. Now let's talk about annotation processing improvements. Sorry, this one should have been before the previous slide. Anyway, this is the result of the compiler avoidance work; this is where that win comes in that we were just talking about. If you look at Bazel out of the box, this is an example from our internal repo on a Kotlin target: it's building a huge number of targets compared to Buck, because Buck is doing used-classes and all the other stuff we talked about, and Bazel doesn't have that built in. So we put this PR up, which adds a per-class compiler avoidance technique to Bazel core. With that, we're able to pretty much get parity between Buck and Bazel for compiler avoidance, which can be significant, because the number of targets we're seeing is about the same as the number of seconds it takes to build them. So lastly, let's talk about annotation processing and the part of the build that's slow there. We make heavy use of annotation processors in our Android monorepo; last I counted, we have 17 different ones internally.
An overview of the space in the JVM world: we have KAPT for Kotlin, which fully supports Java annotation processing but is quite slow, as we showed in our study. We have traditional Java annotation processing, which is much faster but doesn't support Kotlin at all. And then we have the new Kotlin Symbol Processing (KSP), an abstraction over compiler plugins, which is fast as well, but it isn't backwards compatible with annotation processors, and it takes work from processor authors to support it, so there's not great support in the community quite yet. So what is KAPT doing? Why is it slow? Well, you have your module, you have your source file, and it's annotated; you want to do some codegen on that. It's going to run KAPT inside KotlinC, which does three steps: stub generation, annotation processing, and compilation. Now you potentially have multiple invocations of KotlinC. Out of that, you get an implementation file, where you've generated code with KotlinPoet or whatever codegen you want to do. From that, it builds both the class file for your source and the class file for your generated code, and then it zips those up into a jar. But this is about one and a half times as slow as annotation processing in Java, or as Kotlin without annotation processing. So we thought: how could we use Java annotation processing, which is much faster, to solve some of our needs in the Kotlin world? And we were able to figure that out.
There are a couple of open-source implementations of this as well that I can link at the end. What we ended up doing is thinking: hey, if we have the annotated classes already, can we get JavaC to just process those and skip doing annotation processing altogether in Kotlin? And we were able to do that; I'll show you how after I get the animation to play. What we did is build a Java compiler plugin called Kaptish. The open-source one for Gradle is called Napt; I think the one for Bazel has a different name, but all three are open source at this point. What this does is: if you have a class-retained annotation on your source file, it'll be preserved through the initial compilation, and we don't run annotation processing at all in KotlinC. So we get a class file that still retains the annotation. We hand that down to JavaC, and it uses this compiler plugin, which says: I'm going to take the list of those class files on the class path and force JavaC to run annotation processing on them, even though they're not source files. And it's able to do that because JavaC itself actually exposes this on the command line. It's not well known and doesn't get a lot of use in standard toolchains, but you're able to tell JavaC to process class files and run annotation processors on them. So this is a very simple implementation of what this is doing: we have a Java compiler plugin, it takes the list of class files, it passes them in the form JavaC is expecting, and at that point JavaC will run the annotation processors on them. Of course, this has some downsides. If you compile your sources in KotlinC and then hand the class files off to JavaC, you can't directly reference any generated code; it's essentially treated as two different modules. So with many annotation processors, like AutoValue, you have to reference the generated code directly.
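The javac capability this relies on is real and easy to see in isolation: javac accepts class names (not just source files) as operands, and with `-proc:only` it runs an annotation-processing-only pass over them. The sketch below is my own minimal demonstration of that two-step shape, not Kaptish itself: it compiles a tiny annotated class, then feeds the resulting class back to javac by name.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Demonstrates the javac feature Kaptish relies on: a processing-only
// pass (-proc:only) over an already-compiled class, passed by name.
public class ClassFileProcessing {
    public static int[] run() throws Exception {
        Path dir = Files.createTempDirectory("kaptish-demo");
        Path src = dir.resolve("Hello.java");
        Files.write(src, "@Deprecated public class Hello {}".getBytes());
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        // Step 1: normal compilation (stand-in for the KotlinC output,
        // which preserves class-retained annotations in the class file).
        int compile = javac.run(null, null, null,
                "-d", dir.toString(), src.toString());
        // Step 2: annotation-processing-only pass over the class by name.
        int process = javac.run(null, null, null,
                "-proc:only", "-cp", dir.toString(), "Hello");
        return new int[] { compile, process };
    }

    public static void main(String[] args) throws Exception {
        int[] rc = run();
        System.out.println("compile=" + rc[0] + " process=" + rc[1]);
    }
}
```

In a real setup, step 2 would also pass `-processor` with the processors to run; here no processors are registered, so the pass simply succeeds with nothing to do.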
So that's not compatible. We don't use that very much; it's an anti-pattern, so we were able to work around it. Some folks use reflection for that first entry point instead. You must also have class-retained annotations. This isn't a big deal in Kotlin, where annotations are class-retained by default, but in Java they're source-retained by default, and those would get dropped before the class files are output. And lastly, it can't actually generate Kotlin code, right? Because you're doing the work in JavaC, the compilation has to be Java. So even if you're processing Kotlin files, the generated code needs to be Java: JavaPoet as opposed to KotlinPoet. Okay. So we've talked about all the ways we've made builds faster so that it's reasonable to work on a 12,000-module Android application. Going forward, we have some things that we're looking at as well. We're really excited about the K2 compiler; we'll be working closely with that team, probably shadowing it in Q1 and giving feedback. We're also excited about KSP. We've already ported many of our internal annotation processors to KSP and we'll continue doing that; our open-source DI library Motif was just recently moved to support KSP. And lastly, we're going to look at flattening our build graph more than it is right now. One of the big problems with a mobile monolith is the depth of that graph, so we'll be thinking about how to make it wider and get more optimizations out of the build system natively. With that, let's move on to the last section of the talk, which is around IDEs and dev tools.
So first, a quick overview. For folks doing Android development at Uber, basic local development looks something like this. They have a new M1 Mac laptop. They're using IntelliJ 2022. They have a bunch of standard Uber IntelliJ plugins that we provide. We fully provision the IDE settings, like the VM options and code style. We have our analytics tooling running on their machine, and we manage various standard applications on it with Chef. But there are still some downsides to this. They have to set up their own Android SDK. They have to upgrade the SDK. They might do a security update on the Mac that conflicts with some of our toolchain. So there are issues and manual developer overhead that we're not super happy with.

To talk a little more about the local experience in the IDE itself: we have a bunch of plugins. We have the analytics plugin, which I talked about, which hooks into LDA. We also have tooling to scaffold new modules and new RIBs (our architecture component, which supports Compose), and to scaffold a new sandbox app, which is just a demo app where they can work on a feature in isolation without having to build the whole thing. We have an explorer for our DI graph. We have an explorer for our architecture graph. We can recommend and enforce third-party plugins as well. And then a bunch of things like live templates and real-time UI updates. So, a lot of functionality that we put into the IDE directly. An interesting one we can dive into a bit more is the real-time UI updates. We built this tool called Quick UI for Android, which is kind of our cheap version of Live Edit. What we discovered is that a lot of developer builds are just UI iteration: they're adjusting an XML file. This is XML-only, so no Compose quite yet. They're tweaking it, adding a new margin or new colors or whatever, and every time they do that, it has to do a full build.
And we talked about how expensive that build is. So we thought, if they're just changing XML, how can we make that XML change without rebuilding the entire thing? So we built this tool, and it deploys the application with a custom wrapper around the layout inflater. That wrapper knows how to read both the standard layouts deployed in the application and layouts from a second APK that we can put on disk, kind of like how Espresso uses a second APK. Then in our build system, we detect whether only XML files were changed. If so, we just compile those XML files directly on the CLI and push them over with an adb command. They sit on the SD card, the layout inflater receives an intent, sees that something was updated, and re-renders that screen. This is much faster: it takes a couple of seconds to iterate on UI, as opposed to a minute or so to do a build. So I have a demo of this.
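The change-detection step described above can be sketched as a simple predicate. Path conventions here are illustrative assumptions, not Uber's real build integration:

```kotlin
// Hypothetical fast-path check for a Quick UI-style tool: only when every
// changed file is a resource XML can we skip the full build and just push
// recompiled resources to the device.
fun isXmlOnlyChange(changedFiles: List<String>): Boolean =
    changedFiles.isNotEmpty() && changedFiles.all { path ->
        path.endsWith(".xml") && "/res/" in path
    }

fun main() {
    println(isXmlOnlyChange(listOf("app/src/main/res/layout/checkout.xml")))  // true: fast path
    println(isXmlOnlyChange(listOf(
        "app/src/main/res/layout/checkout.xml",
        "app/src/main/java/Checkout.kt"
    )))                                                                       // false: full build
}
```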
Now, let's see if the video works. Okay, it's a little small; I wonder if the people in the back can see it if I full-screen this. Okay, maybe not. What we're doing here is editing XML attributes. We're changing some text, and we ran the build command; with that, it restarted almost immediately, so that took about a second, and the text is different. On the next one we're editing the margin: you can see we changed the margin in the XML, we run the build command, the application restarts pretty much instantly, it's back on the screen, and the margin has been adjusted. Keep in mind that a build of the full app averages 60 to 70 seconds, and with this iteration we're seeing changes reflected in about a second.

The other really interesting part of IDEs for us is how we're thinking about remote development and cloud IDEs. We've had local development for years, and we see the industry moving toward cloud IDEs and big servers. We've passed the limits where a big monorepo can work on a local laptop. So how do we use a big server to make development faster? You're probably using things like remote build execution on Bazel, but now we're starting to see IDEs migrate there as well. With these giant graphs, the IDE is very resource constrained. Even running it locally and punting your builds off to remote execution isn't enough for us; we're seeing large performance issues. So by putting the IDE on the server, we're able to reclaim a lot of that. So what are we doing here? Well, we have these beefy Linux dev servers, and we're provisioning them where all of our developers are. We have folks in the U.S. on the East Coast and West Coast, in India, and in Europe, so we have these dev servers distributed across all of our cloud regions, and it's a containerized dev environment. We manage the entire thing: we know the version of the JDK and the version of the Android toolchain, and we're able to install all our custom tools.
We're able to preload the cache for everything. For the IDE, we can preload the indices and just plop those directly down on disk; we can do the same for the build cache and the artifact cache. So developers are just spinning up machines that are instantly ready to go with the full environment. And we're running the IDE as a daemon with everything preindexed, so they can immediately start up a new dev environment and have everything ready. We allow these to be user-customized. They're running JetBrains Gateway, which is the new remote development client that connects to the IntelliJ backend. What's really cool, and what we keep hearing from developers, is running multiple of these at once. We know that context switching is really expensive: on a big monorepo, switching projects might mean builds and re-indexing that take 10 or 20 minutes. So people will just set up two or three dev servers, one for one branch, one for another app, with the IDEs instantly ready to go, and they'll hop between them as their dev flow changes.

There are a couple of interesting parts from the mobile side specifically that I want to talk about. One is how we work with emulators and devices. If you're thinking, hey, Android cloud IDEs sound nice, but I have my local device, so that's not going to work: well, we have some workarounds, and we're thinking about maturing this more. Right now we do enable local emulators and devices to be connected to the dev server. We have a custom wrapper that manages the SSH port forwarding with a socket; the developer runs that in the local environment, and it forwards the connection along. Then on the remote dev server, the emulator or physical device appears as if it's natively attached.
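A sketch of what such a wrapper's forwarding setup might look like; the host name and the use of adb's default server port (5037) are assumptions here, not details from the talk:

```kotlin
// Hypothetical sketch: build the SSH command that reverse-forwards the local
// adb server port to the dev pod, so remote tooling talks to the developer's
// local emulator/device as if it were natively attached.
fun adbForwardCommand(devPodHost: String): List<String> =
    listOf(
        "ssh", "-N",                  // forward only, run no remote command
        "-R", "5037:localhost:5037",  // remote port 5037 -> local adb server
        devPodHost
    )

fun main() {
    // The wrapper would hand this to ProcessBuilder and keep it running.
    println(adbForwardCommand("devpod.example.com").joinToString(" "))
    // prints: ssh -N -R 5037:localhost:5037 devpod.example.com
}
```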
So you can see it in the IDE, you can debug, you can install to it, and it works pretty well, besides a little bit of network latency when transferring large APKs. The other really cool part about this is the snapshotting I mentioned. We have a cron job running nightly that boots up one of these dev pods, sets up the entire thing, loads the IDE, and does the builds. It then has a list of all the specific locations where temporary files were put, and it archives them all up: things like the IDE indices, the Buck cache, the Bazel cache, and the Gradle cache. It uploads those to a data store. Then, when a developer sets up a new machine, or turns one on in the morning, we're able to load all of those archives, and they instantly have everything ready to go. To give a little example of what this does in practice: if you were to index our full monorepo, it takes about 38 minutes. If you were to use the JetBrains shared-indexes plugin, which uses indexes that you upload, it still has to relativize them, rewrite paths and toolchain info, and parse them all; that's about 30 minutes. Versus the approach we take, where we take those indexes right off disk and plop them onto another machine. Because we control the entire environment, including paths, and they're the same for every user, it takes a few seconds. So it's a drastic difference.
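The snapshot-and-restore idea boils down to archiving warm cache directories and unpacking them at the identical path on a fresh machine, so nothing needs relativizing. A minimal self-contained sketch (the directory names are assumptions, not Uber's actual layout):

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Archive a warm cache directory (e.g. IDE indices, build cache) into a zip,
// storing paths relative to the cache root.
fun snapshot(cacheDir: Path, archive: Path) {
    ZipOutputStream(Files.newOutputStream(archive)).use { zip ->
        Files.walk(cacheDir).filter { Files.isRegularFile(it) }.forEach { file ->
            zip.putNextEntry(ZipEntry(cacheDir.relativize(file).toString()))
            Files.copy(file, zip)
            zip.closeEntry()
        }
    }
}

// Unpack the archive onto a fresh machine. Because the environment is fully
// controlled, the destination path is identical on every dev pod, so tools
// find their caches exactly where they left them with no path rewriting.
fun restore(archive: Path, cacheDir: Path) {
    ZipInputStream(Files.newInputStream(archive)).use { zip ->
        generateSequence { zip.nextEntry }.forEach { entry ->
            val target = cacheDir.resolve(entry.name)
            Files.createDirectories(target.parent)
            Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING)
        }
    }
}
```

The nightly cron would call `snapshot` for each cache location and upload the archives; machine provisioning would download them and call `restore` before the developer connects.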
So I'll give a quick demo, and as long as they don't kick me off the stage for being a few minutes over, you can see what the dev pod experience looks like. In this example, we have an emulator up and running and we have a terminal. I just ran a create command, so I'm creating a new dev pod. Now I'm running the dev pod ps command, which lists all of the running ones I have. You can see I have six or seven; I'm using them for different flows, with different branches checked out and different apps indexed on each one. Now, I have a couple of tabs open here. One is a dev pod where I've set the project selection to the Eats app, and the other has the project selection for the Rides app. Here in the Eats dev pod, we can SSH in, and it gives me the output for how to connect the IDE. So I have a JetBrains Gateway link, as well as a VS Code link and an Android Studio link, so you can click those and get started right away. I click Gateway here, and this sets up the thin client in your local environment, which connects to the IDE running on the server. So this is instantly up, and it's fully indexed for the app. It's very responsive: it's using the Code With Me protocol, which is an asynchronous protocol for the text editing, and it's using Projector rendering, their synchronous rendering that sends Swing calls over the wire, for all the secondary panels. So I'm in a Kotlin file in the Eats app. Now let's switch over to the terminal for the Rides dev pod. I'll SSH in here, and it gives me a similar thing where I see the link. I click it, and it sets up a second thin client. So now I have two thin clients running, each connected to a different server, and both servers are fully indexed for our monorepo and two different apps. I can make different changes in each, and I'll be compiling here in a minute.
None of that is using my local machine's resources, and I'm able to context switch very quickly between the two. So let's install the app. Well, first we have to connect the emulator, because it's an Android app. We have a command that wraps the port forwarding, which developers can run: they just select the dev pod they want and run it. At this point you can see, if it were a little bigger or if you squint, that the emulator is connected here in IntelliJ. You can see the logs flowing past at the bottom, you can attach the debugger, all of that sort of stuff. So let's run a build command. We're going to build the Eats app and install it over the wire to the emulator. This is cached right now, so it's building from cache, which is going to be very fast; the majority of the latency is transferring the APK over the wire. I think in this demo it's 10 seconds or so, but it's really going to depend on your network bandwidth. So the build is almost done... installing... done. Now it launches the Eats app. So this was built on the remote server, transferred over the wire, and launched. I'm now back in the IDE; from a debugger perspective it sees the device, I can attach, and I can set breakpoints. I can switch over to the Rides app now, in the other dev pod, and I can also attach the emulator to the second machine, because you can port forward to multiple remote servers at once. Both of those servers see the emulator as a native device, and that can be really powerful when you have one emulator and you're doing multiple flows. So I'm installing the Rides app, which does the same thing and transfers over to the same emulator, and I can also see it in the other IDE. So that's about the end of the demo; I'm going to skip the last couple of seconds.
It just clicks back into the Rides app. But this gives you a concrete example of how you can have two different flows into a large monorepo you're working with while the local computer is using about one core; it didn't even have to use any of its resources. You can imagine a world where you set this up on a Chromebook or something. Very fun. Let's see. There we go. So that was the end of most of the content.
For a recap: we gave an overview of what mobile at Uber looks like and what the dev stack and workflow look like. We talked about how we measure things, everything from developer sentiment to LDA, and everything from builds to IDE performance. We covered how we think about making builds faster so that we can build large mobile applications, using ABI avoidance, compiler avoidance, and improving annotation processing. And we covered how we think about the IDE experience and getting folks to iterate quickly: Quick UI for deploying UI changes quickly, and our remote dev stack. So, here's a list of resources. You can take a photo, or I'll post the slides later. This is a lot of the stuff we talked about earlier: our architecture, our DI, and the performance profilers that we build. That includes the open-source tool I mentioned as an implementation of the annotation processing approach for Kotlin, if you're interested in that. And that is everything I had. So, only four minutes over; Rooz won't be mad.