Search site6 years agoThis post is part of our Privacy-Preserving Data Science, Explained series. Update as of November 18, 2021: The version of PySyft mentioned in this post has been deprecated. Any implementations using this older version of PySyft are unlikely to work. Stay tuned for the release of PySyft 0.6.0, a data centric library for use in production targeted for release in early December.In this article of the introductory series on Private ML, we will introduce Federated Learning (FL), explaining what FL is, when to use it, and how to implement it with OpenMined tools. The information in this article will be digestible for a broad audience, but section by section, we will go more into the weeds to understand and use Federated Learning.For more info about the series, check out the intro article or take a look at the other posts to learn more about the techniques that can enable privacy-preserving ML with OpenMined’s libraries.Initially proposed in 2015, federated learning is an algorithmic solution that enables the training of ML models by sending copies of a model to the place where data resides  and performing training at the edge, thereby eliminating the necessity to move large amounts of data to a central server for training purposes.The data remains at its source devices, a.k.a. the clients, which receive a copy of the global model from the central server. This copy of the global model is trained locally with  the data of each device. The model weights are updated via local training, and then the local copy is  sent back to the central server. Once the server receives the updated model, it proceeds to aggregate the updates, improving the global model without revealing any of the private data on which it was trained.One of the first applications of FL was to improve word recommendation in Google’s Android keyboard without uploading the data, i.e. a user’s text, to the cloud. More recently, Apple has detailed how it employs federated learning to improve Siri‘s voice recognition. Besides, intuitively, keeping the data at its source is valuable in any privacy-preserving applications, especially when applied in healthcare or on confidential data in business and government.To get started we will use the classical MNIST data set that will stand in for our clients’ data, PySyft will provide all the components needed to demo federated learning and test it locally on this data set. If you want to imagine a reasonably close application, we could conceive that the MNIST characters are part of the digital signatures of our clients, produced when signing documents on their smartphone and we would like to use them to train a character recognition model. In this scenario we would like to provide strong privacy assurances to our users by not uploading their signatures to a central server. In PySyft, the clients’ devices, that is, the entities performing model training, are called workers.At this point in or mock example we have to send the data to the workers using PySyft’s Federated Data Loader.The Federated Data Loader, like the standard torch data loader on which is based, enables lazy loading during training and so in the train function, because the data batches are now distributed across Alice and bob, you need to send the model to the right location for each batch using model.send(…). Then you perform all the operations remotely with the same syntax as if you were doing everything locally on PyTorch. When you’re done, you get back the model updated and the loss that can be logged using the .get() method.At this point, the model has been trained on the data of both Alice and Bob by the respective workers, and their data has not left their devices.Federated learning alone, however, is not enough to ensure privacy because using the model updates, an “honest-but-curious” server could reconstruct the samples from which the updates were computed. This is where Secure MultiParty Computation, homomorphic encryption, and differential privacy come to provide stronger guarantees of security to data owners. We will explore all these topics in this series. As we mentioned above the larger vision for the technology goes beyond any single application or service. With the help of federated learning data owners can more easily maintain control of their data that can be used to train models without leaving the owners’ systems. These guarantees, besides being positive for all users of data-intensive applications, have the potential to make available whole new data sets in sectors like healthcare where, to follow HIPPA or the health related provisions of GDPR, privacy is the top priority. To contribute to making this vision a reality OpenMined is working on PyGrid a peer-to-peer platform that uses the PySyft framework for Federated Learning and data science. Data owners and data scientists can connect on the platform, where the data owners can feel safe in the knowledge that their data will never leave their node, and data scientists can perform their analysis without infringing on anyone’s privacy rights. Today, this type of interaction could take from weeks to months in sectors working on sensitive data, but with PyGrid it could all be just a few lines of code away. To learn more about PyGrid, here is a deeper dive in the platform and the use cases it enables. OpenMined would like to thank Antonio Lopardo, Emma Bluemke, Théo Ryffel, Nahua Kang, Andrew Trask, Jonathan Lebensold, Ayoub Benaissa, and Madhura Joshi, Shaistha Fathima, Nate Solon, Robin Röhm, Sabrina Steinert, Michael Höh and Ben Szymkow for their contributions to various parts of this series.Sign up to recieve an email when new content like this is posted.Want to write for OpenMined or help update a post?Let us know!By sending, you agree to our privacy policyand join the OpenMined Newsletter.January 5, 2026December 1, 2025Follow usLearn MoreSolutionsProgramsOur vision for the future is ambitious.
Here is how you can help:© 2026 OpenMined FoundationOpenMined is a 501(c)(3) non-profit foundation and a global community on a mission to create the public network for non-public information.With your support, we can unlock the world’s insights while making privacy accessible to everyone.We can do it, with your help.Secure Donation