All right, so next, please welcome Emily Glanz from Google's Federated Learning and Analytics team to talk about Federated Learning. I'm very excited to have Emily here because she has been giving a lot of talks over the years at our Seattle events, so it's great to see you here, Emily, at DevFest West Coast 2020.
Awesome, thank you. Thanks, everyone, for joining me today; I'll be talking about Federated Learning. I'd also like to quickly plug an event we're having this Friday: it will be a tutorial on TensorFlow Federated, or TFF. I'll be getting more into TensorFlow Federated later in this talk, but if you go ahead and go to that link, which I believe Margaret will also be posting in the livestream comments, you can register for the event and you'll get more details that way. Awesome. So let's jump into it. The goal of Federated Learning is to enable edge devices to do state-of-the-art machine learning without centralizing any of the user data, and with privacy by default. So let's provide some intuition behind why federated learning is so exciting. Hospitals contain tons of patient records, and these are highly sensitive. Hospitals don't really want to centralize this information due to privacy concerns. So we can imagine this being a scenario where federated learning could be used to enable hospitals to keep this information decentralized and private, but still allow hospitals to learn from this information in aggregate, through things like Federated Analytics and Federated Learning.
So today, I'll be talking about why this type of data is exciting, how we can make use of this data in a secure and private way through federated computations, specifically federated learning, and how to experiment with these concepts using TensorFlow Federated, or TFF.
So why is decentralized data interesting? Well, for one thing, decentralized data is everywhere. There are billions of phones and Internet of Things devices that are constantly generating data. Smartphones themselves, if you think about it, collectively form the world's largest supercomputer, so it's interesting to think of them as decentralized computing nodes. The decentralized data stored on these devices could enable better products and smarter models. So can we harness the power of these edge devices without centralizing the data, without collecting this data?
And why wouldn't we want to centralize that data, or why would we want to bring learning to the edge? Well, we get many benefits: we see an improved user experience through things like lower prediction latency and the ability to use the models when the phone itself is offline. Bringing learning to the edge is better for privacy concerns as well, because the learning is localized to the device itself.
This is also better for things like resource limitations, like battery life and data caps, since the phone isn't transmitting tons of training data or inference requests to a centralized server. Because we only perform on-device training when the device is plugged in and idle, we minimize the battery usage and performance impact on the device itself. So while bringing learning to the edge has tons of benefits, why don't we just bring all of the learning to the device? It's tempting to bring the whole machine learning process to the device itself and cut out the centralized server, but that approach has its own limitations.
If we don't have a server in the loop, how do we ask analytics questions about that decentralized data? How would we continue to improve the models based on data that other edge devices have?
And how do we seed phones with models that are already smart, so it doesn't take lots and lots of training on each device? So that's what we'll really be looking at in the context of federated learning.
So can we do cross-device machine learning and analytics without centralized data collection? It turns out we can, and that's through Federated Learning. Next, I'm going to walk through the basic process of how federated learning works. So let's jump into a brief, pretty simple example of how FL works.
So the initial model, as dictated by the model engineer, will be sent to the phones. So that's this guy here, though I guess it doesn't show up for the initial model.
Usually, zeros or a random initialization is sufficient for it, or if you have some relevant proxy data in the cloud, you could also start from a pre-trained model. Next, the device that received that initial model will compute an update to that model using its own local training data, and only this update is then sent back to the server to be aggregated. None of the raw training data that was used to produce the update leaves the device. Other devices will participate in this round of training as well, performing their own local rounds of learning, each computing an update using its own local training data. Some clients may drop out before uploading their updates to the server, but that's OK; our federated learning protocol is resilient. The server will then aggregate users' updates into a new model by averaging the updates together, which is where the name Federated Averaging comes from, and the updates will be discarded after use. The engineer will be monitoring the performance of the federated learning process through metrics that are themselves aggregated along with the model. And it's important to keep in mind that this process is iterative: new devices can check in with the server and participate in another round of training using the updated model parameters from the last round, and this produces another updated model that has learned from the new clients' aggregated data. Cool, so we've shown that federated learning works in production here at Google to improve model intelligence without the need for centralized training data or centralized learning. One example that I'd really like to highlight is how we've used Federated Learning at Google to improve our mobile keyboards. People don't really think much about their keyboard, but they spend hours on it each day, and in general, typing is 40 percent slower on a mobile keyboard than it is on a physical one. So this kind of shows how important it is to improve these models.
The intelligence behind the keyboard models is essential for tap typing, gesture typing, autocorrections, predictions, voice to text, and more, all of which greatly benefit from federated learning.
So what I've shown you here are the basic concepts of how federated learning works. In practice, the process is much more complicated, and federated learning is still an active area of research.
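To make that round structure concrete, here's a minimal, self-contained sketch of one round of federated averaging on a toy linear model. This is my own illustrative code, not the production protocol; the helper names like local_update and server_round are made up for this sketch, and real systems add client sampling, secure aggregation, dropout handling, and more.

```python
import numpy as np

def local_update(weights, x, y, lr=0.1, steps=5):
    """Run a few steps of gradient descent on one client's local data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * x.T @ (x @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w - weights  # only the model update leaves the "device"

def server_round(weights, client_datasets):
    """Average the client updates into a new global model (federated averaging)."""
    updates = [local_update(weights, x, y) for x, y in client_datasets]
    return weights + np.mean(updates, axis=0)  # updates are discarded after use

# Toy simulation: three clients, each holding its own private data.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(3)  # zeros are a sufficient initialization here
for round_num in range(10):
    w = server_round(w, clients)
```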
There are many extensions here like differential privacy, compression, quantization, secure aggregation, et cetera. You can think of things like federated distillation and so on, which are really exciting and on the frontier of research. I just briefly walked you through a very simple round of federated training, but there are many other federated computations available as well, like federated evaluation, federated analytics, et cetera, that provide insights into and make use of your decentralized data. So now that we're all stoked on what Federated Learning is, how can you yourselves experiment with federated learning? We've provided a community project for all to develop the building blocks of federated computations: TensorFlow Federated, or TFF. Here's the link if you'd like to explore some tutorials. It currently allows you to experiment with federated learning and other federated computations in simulation only, with more planned in the future, so stay tuned. Inspired by all of our experiences building our own production federated learning system here at Google, we've created TensorFlow Federated, but it's been generalized to be able to express any federated computation. TFF is an open source framework for federated learning and federated computations on decentralized data.
So jump in and have an influence on where the system goes. Because this is a relatively new project, it's rapidly evolving, and we encourage you to make contributions to it and impact the future of federated systems. So here's that GitHub link if you're interested in the code itself or if you want to contribute to it; there are many ways to get involved. We've made TFF an open system, and this system is designed for composability, to make it easy to express the kinds of computations that will enable better research. We really hope it will allow researchers to better understand what works and what doesn't work in a federated setting, and will help federated learning enthusiasts develop faster together and create an ecosystem for new applications and deployment environments. So speaking of deployment, TFF compiles all code into an abstract representation, meaning that it's architecture agnostic. Currently, this can only be run in simulation, but someday we'd like for it to be able to be deployed to real devices. So you can think of a world where we can take that same representation and actually deploy it in production.
So stay tuned for that. So here again is that tutorial link.
So TFF is about having fun, experimenting, and building your own set of computations. We've designed it in such a way that you don't have to worry about the major pain points that we faced when developing our own federated learning system.
And here are some of those pain points: interleaving different types of logic, tension between the order of construction versus execution, and global versus local perspectives on communication.
So TFF offers two main APIs: we have the Federated Learning API, or FL API, and the Federated Core API. The FL API is a higher-level API. It provides implementations of federated training, through Federated Averaging, and evaluation, which can be applied to your own existing Keras models for you to experiment with FL in simulation. This layer sits on top of the other layer, which is much lower level and gives you more generic expressions that allow you to design and simulate custom types of computations and really control your own orchestration. That is called the Federated Core API, which allows you to build your own federated computations. There's also a local runtime for running simulations. So again, there are many, many ways to get involved depending on what area of the stack interests you. This architecture allows for a clean separation of concerns, so if you specialize in a different area, you can spend your time in the part that really interests you, whether that be the machine learning, the computation theory, et cetera. Awesome. So let's jump into the Federated Learning, or FL, API. If you'd like more details on this, here is that link; I just encourage you to go to tensorflow.org/federated, and from there you can find all the other links I'll be presenting on today.
So in order to facilitate experimentation, we've seeded the TFF repository with a few datasets, including a federated version of the classic MNIST dataset. It contains a version of the original dataset, but it's been reprocessed using LEAF so that the data itself is keyed by the original writer of the digits. Since each writer has their own unique style, this dataset will exhibit the kind of non-IID behavior expected of a federated dataset.
So here's what the code looks like. For TFF, we provide datasets that are pretty good proxies for real federated data, and these can be accessed by using this load_data function. The load_data function returns a dataset which is an instance of the tff.simulation ClientData interface, which allows you to enumerate the set of users, to construct a tf.data.Dataset that represents the data of a particular client, and to query the structure of individual elements, because this is a simulation.
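Here's a small sketch of that data-loading flow, following the patterns in the TFF tutorials with the tff.simulation APIs:

```python
import tensorflow_federated as tff

# Load a federated dataset for simulation (federated EMNIST, keyed by writer).
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

# Enumerate the simulated users and pick one client.
example_client_id = emnist_train.client_ids[0]

# Construct a tf.data.Dataset that represents that client's local data.
example_dataset = emnist_train.create_tf_dataset_for_client(example_client_id)

# Query the structure of individual elements (possible because this is simulation).
print(example_dataset.element_spec)
```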
Here is where you will plug in your own Keras model. So if you have a Keras model, you can wrap it like this so it's consumable by TFF; almost all of the information that's required by TFF can be derived from the Keras interfaces, and there's that link. So let's actually take a look at what a model looks like. If you're familiar with Keras already, this probably looks pretty familiar to you. Here we're just creating a simple model that will solve this MNIST example for us; you see we're creating a sequential model.
It has one dense layer, and we'll be using a softmax to get our results. And here we're also attaching our loss, our optimizer, and our metrics.
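As a sketch, the model and wrapping step look roughly like this, following the TFF tutorials. The layer sizes are illustrative, preprocessed_example_dataset is assumed to come from a preprocessing step over the client data that isn't shown here, and in this version of the API the optimizers are supplied later, to the averaging process itself:

```python
import tensorflow as tf
import tensorflow_federated as tff

def create_keras_model():
    # A simple sequential model: one dense layer plus a softmax.
    return tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(784,)),
        tf.keras.layers.Dense(10, kernel_initializer='zeros'),
        tf.keras.layers.Softmax(),
    ])

def model_fn():
    # Wrap the Keras model so it's consumable by TFF, attaching the loss
    # and metrics; input_spec describes the structure of client examples
    # (preprocessed_example_dataset is assumed from a step not shown here).
    return tff.learning.from_keras_model(
        create_keras_model(),
        input_spec=preprocessed_example_dataset.element_spec,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
```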
So now that we have a model, which we've wrapped as a tff.learning.Model under the hood for use with TFF, we can construct a Federated Averaging algorithm by invoking this helper function here: tff.learning.build_federated_averaging_process. This is going to go ahead and construct that federated training algorithm just using the provided Keras model. So all you'd have to do on this end is provide a model, and now this will work in a federated environment on decentralized data. Let's invoke the initialize method of the iterative process we just built; this will get our initial server state. Then we can call next, which will run a round of federated training. This includes sending the current server state to each of the clients; each of the clients will run their own local round of training and send their updates back to the server, and back on the server we'll get that aggregated new global model that's been produced from the decentralized data of each of the clients. And finally, we can do an evaluation to understand the state of our trained model. tff.learning.build_federated_evaluation will provide a federated computation for evaluation for you, and again, you can just plug in your own model function here, that same Keras model that's been wrapped as a tff.learning.Model. Then you can run an evaluation and you'll get back metrics that will show you how your training is progressing. Again, that was a little bit fast and might have been kind of overwhelming; full tutorials are available at that link, tensorflow.org/federated.
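Putting the pieces together, a sketch of the whole flow, again following the TFF tutorials; federated_train_data and federated_test_data are assumed to be Python lists of per-client tf.data.Datasets sampled for the round:

```python
# Build the Federated Averaging process from the wrapped Keras model.
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))

# initialize() constructs the initial server state.
state = iterative_process.initialize()

for round_num in range(10):
    # next() runs one round: broadcast the model to the clients, train
    # locally on each, and aggregate the updates into a new global model.
    state, metrics = iterative_process.next(state, federated_train_data)
    print('round {}, metrics={}'.format(round_num, metrics))

# Federated evaluation of the trained global model.
evaluation = tff.learning.build_federated_evaluation(model_fn)
eval_metrics = evaluation(state.model, federated_test_data)
```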
Cool. So now let's jump into the Federated Core, or FC, API. This one is a lot lower level, so there's going to be a lot here; I hope to just cover some of the general components of it, but again, since this will be a lot, I encourage you to go to that tutorial page. The FC API is a lower-level API than the FL API; it's a language for constructing distributed systems. It introduces some abstract concepts, so let's go ahead and deep dive into those concepts that are introduced by TFF. First, I'm going to explain what a federated value is. Let's set up a scenario where we have a group of clients; in this case, each client is an individual smart temperature sensor. Each sensor is going to have its own local item of data; in this case, let's pretend it's the maximum temperature that that device saw in a single day. This is in Celsius, by the way. So we can see that each sensor has seen a different maximum temperature. Each local item of the clients' data has a type of float32, because this is a sensor reading. We refer to all of these sensor readings collectively as a federated value, or a multiset. Values like this are first-class citizens, meaning that they have a type. So here's what a federated type is: a type includes both where the value is located, which we call the placement,
and the actual type of each of the local items of data. So all of these values are located at the clients, and the type of each local item is float32. Now what happens when we throw our server into the mix? The server will also have a value, and this value's type is float32 at the server. This time we've dropped the curly braces to indicate that it's one value and not many values.
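In TFF code, these federated types can be written down directly; a small sketch of the notation as I understand the API:

```python
import tensorflow as tf
import tensorflow_federated as tff

# A federated type pairs a member type with a placement.
readings_type = tff.FederatedType(tf.float32, tff.CLIENTS)
print(readings_type)   # {float32}@CLIENTS -- curly braces: many values

threshold_type = tff.FederatedType(tf.float32, tff.SERVER)
print(threshold_type)  # float32@SERVER -- no braces: a single value
```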
So from all these sensor readings, how do we get our value back on the server? Let's introduce the concept of a distributed aggregation protocol that will connect the pieces of our federated system. Let's say that it computes the mean of all the client values to get that aggregated value back on the server. So now we've aggregated the sensor readings together to get the mean. In TFF, we can think of a federated operator as a function, even though its inputs and outputs live in different places in the system. So the input to this function is all the client values, placed at the clients, and the output is the mean, placed at the server.
Sorry, I cut out there for a second; thanks for bearing with me.
We provide a library of federated operators that represent the common distributed aggregations you would use, for example the federated mean, which averages the results from the clients back at the server. I'm going to now run through a brief code example using TFF. I'm not going to go too in depth, so it might look a little confusing, but at the end I'll put up a link that will lead you to the tutorials where you can walk through the code yourself. So here we'll be declaring a federated type that represents the input, and next we'll be passing this as an argument to the special function decorator that declares this a federated computation. We'll be invoking our federated operators here; so here we have that federated mean intrinsic.
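Here's roughly what that simple example looks like, in the style of the TFF custom-algorithms tutorial:

```python
import tensorflow as tf
import tensorflow_federated as tff

# Declare the federated input type and mark this as a federated
# computation with the decorator; the body invokes the federated
# mean intrinsic to average the client readings at the server.
@tff.federated_computation(tff.FederatedType(tf.float32, tff.CLIENTS))
def get_average_temperature(sensor_readings):
    return tff.federated_mean(sensor_readings)

# In simulation, the clients' values are just a Python list.
print(get_average_temperature([28.5, 30.3, 27.8]))  # ~28.87
```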
So that was a very simple example; let's do a more interesting one. This time we'd like to compute what fraction of sensors have a reading over a certain threshold. So we have a threshold that we input at the server. It will be broadcast to each of the clients. Here's the first federated operator we've introduced: federated broadcast, which will get the value from the server to the clients.
We next have a map step. You can think of this as the map step in MapReduce, performed with tff.federated_map, which will get all of our ones and zeros representing whether each client value was over the broadcast threshold. Then we'll perform another operator, the federated aggregation, which will get the result back on the server. So here again, we're using the federated mean.
So what does this look like in code? We'll again be declaring our inputs here.
So here we're adding a threshold type, and you can see it's a float32 and it lives at the server. And in the body, again, we'll be invoking all of our federated operators.
So here we're calling, you can see, that federated broadcast of the threshold, which gets it to all of the client devices. We're performing a federated map step to see, individually on each of the clients, whether their maximum value is over that threshold. And then we'll be doing a federated mean to get that value back on the server.
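Here's a sketch of the whole threshold computation in that style; the helper name exceeds_threshold is mine, and the structure follows the TFF tutorials:

```python
# The local computation each client runs on its own reading.
@tff.tf_computation(tf.float32, tf.float32)
def exceeds_threshold(reading, threshold):
    return tf.cast(reading > threshold, tf.float32)

@tff.federated_computation(
    tff.FederatedType(tf.float32, tff.CLIENTS),
    tff.FederatedType(tf.float32, tff.SERVER))
def fraction_over_threshold(readings, threshold):
    # Broadcast the server's threshold to every client.
    threshold_at_clients = tff.federated_broadcast(threshold)
    # Map: each client locally computes 1.0 or 0.0.
    over = tff.federated_map(exceeds_threshold,
                             [readings, threshold_at_clients])
    # Aggregate: average the indicators back at the server.
    return tff.federated_mean(over)

print(fraction_over_threshold([28.5, 30.3, 27.8], 29.0))  # ~0.33
```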
And here we've highlighted the local computation that each device will be performing on its own temperature reading; this is just the computation that each device will be executing locally. Cool. That was a whirlwind introduction to the Federated Core API. We welcome your contributions.
What you've seen here today is open source, again, and it's available on GitHub; there are those links. There are many ways to get involved: you can apply the FL API to existing ML models and data and see what happens, you can develop new federated algorithms, and you can help evolve the FC API itself as a foundation for TFF. For another look at the ideas we've introduced today, you can also check out this cool comic book that we have available at federated.withgoogle.com. We were really fortunate to work with two incredibly talented comic book artists to illustrate these concepts as graphic art. There are also corgis.
Thanks. Thank you, Emily, for the great talk. And so I think I saw a question earlier; I think this is the question: do you have a sample project showing federated learning for mobile app use in TFF? The examples you show are on mobile, but is there actually a sample people can look at?
Yes. So here at Google we actually have a federated learning system that works in production, with the computations being deployed to actual devices and back. TFF currently only supports a simulation runtime; it hasn't been extended to a production environment yet, that is all TBD. Again, we do welcome contributions if that interests you. Thanks for that question.
And here's another question: does the federated system account for active, malicious users?
Oh, cool, yeah, so this is one of the very interesting areas of open research in FL, where you do try to understand how malicious users interact with the system as a whole. There are a lot of interesting papers out there. I can get those links and put them in the livestream.
So, yeah, we can share links sometime during Q&A. There are some things we said we would share; we could either add them to the site or we can just compile them and put them under the YouTube recording. Yeah, that would be super cool. Mm hmm. Are there any other questions for Emily? OK, great. I don't see any other questions, so thank you so much, Emily, for joining us today, and we hope to see you again at our future events.
Yeah, yeah. Thanks, Margaret, for having me. All right.
Oh, here's another question, sorry: I have read about GAN-based attacks. Do you think that is a potent attack?
So, again, I'm one of the engineers on the team, but some of the researchers could probably give you a better idea on this. That is something interesting to think about, a GAN-based attack in a federated setting. But again, for federated learning, the idea is that you would have thousands and thousands of clients that you would do this training over, so you'd have to infect a very large portion of the clients participating in the federated computation. It is an interesting thing to think about; I can point you to some of the papers and the research that's gone into that.
Great. Sounds great.
OK, so if there are no more questions, we can perhaps figure out a way to connect everyone with the speakers so that, you know, if people have more questions, they can ask.
And thanks again to Emily and Heidi. Yeah, good to see you.