Federated Learning

Posted by Sumeet Singh

When it comes to machine learning data pipelines, the approach we are most familiar with is a central server (on-premise or cloud) that hosts the trained model and serves predictions. The standard way to build machine learning models today is to gather all the training data in one place and then train the model on it. In practice, this means information is collected from every point in the chain and sent to the central server for processing. Every prediction and update therefore involves a round trip between the server and the various local devices, which makes learning in real time impractical.


But in this digital era there is another concern with the traditional approach: data privacy. Because data is the lifeblood of modern AI (Artificial Intelligence), data privacy issues play a significant, and often limiting, role in AI’s development.


Enter a new term: “privacy-preserving artificial intelligence”. These are methods that enable AI models to learn from various datasets without compromising the privacy of the data. One such method is known as Federated Learning (FL).


Federated Learning, in contrast, is an approach in which each device downloads the current model and computes an updated model locally, using its own data, a form of edge computing. These locally trained models then send their updates, not their data, back to the central server, where they are aggregated, for example by averaging weights, and a single consolidated, improved global model is sent back to the devices.
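To make the aggregation step concrete, here is a minimal Python sketch of weight averaging in the spirit of Google’s FedAvg algorithm. The function and variable names are illustrative, and real systems operate on full neural-network checkpoints rather than bare arrays:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Combine locally trained models into one global model (FedAvg-style).

    client_weights: one list of per-layer np.ndarrays per device
    client_sizes:   number of local training examples on each device
    """
    total = sum(client_sizes)
    global_weights = []
    for layer in range(len(client_weights[0])):
        # Weight each device's parameters by its share of the data, so
        # devices with more examples contribute proportionally more.
        global_weights.append(sum(
            (n / total) * w[layer]
            for w, n in zip(client_weights, client_sizes)
        ))
    return global_weights
```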


The idea of FL was first presented by scientists at Google in mid-2017. Over the past year, interest in federated learning has blossomed: more than 1,000 research papers on FL were published in the first half of 2020, compared with just 180 in all of 2018.


Federated learning flips the conventional approach to AI. It leaves the data where it is, distributed across numerous devices and servers on the edge, removing the need to pool it into one central location. Instead, many versions of the model are sent out, one to each device with training data, and trained locally on each subset of the data. The resulting model parameters, not the training data itself, are then sent back to the central server. When all these “mini-models” are aggregated, the result is one overall model that functions as if it had been trained on the entire dataset at once.


The original intent behind the development of federated learning was to train AI models on personal data distributed across billions of mobile devices. As those researchers summarized: “Modern mobile devices have access to a wealth of data suitable for machine learning models...However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center...We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating locally-computed updates.”


But it was quickly realized that this could apply to a multitude of other fields. One recent application of the technology is in the healthcare sector, and it is easy to understand why. There are an enormous number of valuable AI use cases in healthcare, yet healthcare data, especially patients’ personally identifiable information, is extremely sensitive, and a thicket of regulations like HIPAA restricts its use and movement. FL is seen as a possible means for researchers to develop life-saving healthcare AI tools without ever moving sensitive health records from their source or exposing them to privacy breaches.


This has led to the emergence of new startups such as Lynx.MD, Ferrum Health, and Secure AI Labs, to name just a few, which are taking the lead in developing this technology for the healthcare sector.


Going further, federated learning may one day play a central role in the development of any AI application that involves sensitive data: from personal information held by financial services to autonomous vehicles, from local and federal government use cases to consumer products of all kinds. Paired with other privacy-preserving techniques like differential privacy and homomorphic encryption, federated learning may provide the key to unlocking AI’s vast potential while mitigating the thorny challenge of data privacy.
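To give a flavor of how these techniques compose, here is a rough sketch of one common pairing: clipping a device’s model update and adding Gaussian noise, the core mechanism behind differentially private aggregation, before the update ever leaves the device. The clipping norm and noise scale below are illustrative placeholders, not tuned values:

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    # Clip the update to a maximum L2 norm so no single device can
    # dominate the average, then add Gaussian noise so individual
    # contributions are hard to reverse-engineer from the aggregate.
    # clip_norm and noise_std are illustrative, not tuned values.
    rng = rng or np.random.default_rng()
    flat = np.concatenate([layer.ravel() for layer in update])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))
    return [layer * scale + rng.normal(0.0, noise_std, layer.shape)
            for layer in update]
```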


The wave of data privacy legislation being enacted worldwide today (starting with GDPR and CCPA, with many similar laws coming soon) will only accelerate the need for these privacy-preserving techniques. Expect federated learning to become an important part of the AI technology stack in the years ahead.


Federated Learning would allow for much smarter models, lower latency, and less power consumption, all while preserving user privacy. But FL still has one more hidden card to play: because it is modular in nature, each instance can run more or less in real time on the local device.


The folks over at Google have been testing Federated Learning in Gboard, the Google Keyboard, on Android. According to them, “When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard’s query suggestion model”.
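At its simplest, the on-device side of this loop looks like ordinary local training on data that never leaves the phone. The hypothetical sketch below uses a plain linear model for brevity; Gboard’s actual models are far more complex, and all names here are made up:

```python
import numpy as np

def local_update(w, X, y, lr=0.01, epochs=5):
    # X, y hold this device's private interaction history and never
    # leave the device; only the updated weights w are reported back.
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of (half) the MSE
        w = w - lr * grad
    return w
```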


None of this means that federated learning is without its problems, or that it is suitable for every scenario.


One such challenge has to do with communication, a critical bottleneck in FL networks where the data generated on each device remains local. To train a model using that data, it is necessary to develop communication-efficient methods that reduce the total number of communication rounds and iteratively send small model updates as part of the training process, rather than the entire dataset.
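One widely studied way to shrink each update, sketched here under simplifying assumptions, is top-k sparsification: send only the largest-magnitude parameter changes and treat the rest as zero:

```python
import numpy as np

def sparsify_update(delta, k=1000):
    # Keep only the k largest-magnitude entries of the model update;
    # everything else is sent as (implicitly) zero, cutting bandwidth.
    flat = delta.ravel()
    if k >= flat.size:
        return delta
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of top-k entries
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.reshape(delta.shape)
```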


Things get more complicated still. FL methods must anticipate low levels of device participation, meaning that only a small fraction of the devices may be active at any given time. They must also tolerate variability in hardware that affects the storage, computational, and communication capabilities of each device in a federated network, all while handling devices that drop out of the network mid-round.
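A server loop that tolerates partial participation might look something like the sketch below, which samples a small cohort each round and aggregates whatever updates actually come back. Here train_fn and aggregate_fn stand in for the local-training and averaging steps; they are assumptions for illustration, not a real API:

```python
import random

def run_round(global_weights, devices, train_fn, aggregate_fn, fraction=0.05):
    # Ask only a small random cohort of devices to train this round.
    cohort = random.sample(devices, max(1, int(fraction * len(devices))))
    # train_fn returns None for devices that dropped out mid-round.
    updates = [u for u in (train_fn(d, global_weights) for d in cohort)
               if u is not None]
    # If every device dropped out, keep the current model and retry later.
    return aggregate_fn(updates) if updates else global_weights
```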



As stated earlier, federated learning does not apply to all machine learning applications. If the model is too large to run on user devices, the developer will need to find other workarounds to preserve user privacy. Such a scenario can occur, for example, in an IoT setup with highly constrained devices.


  • On the other hand, practitioners of this method have to consider the relevance of the data to their applications. The traditional machine learning development cycle involves intensive data-cleaning practices in which data engineers remove misleading data points and fill the gaps where data is missing. Caution must be taken not to train machine learning models on irrelevant data, which can do more harm than good.

  • Another point of note is evaluating the data and making sure it will be beneficial to the application. For this reason, federated learning is best limited to applications where the user data does not need preprocessing.

  • The limitations do not end there. Another issue for federated machine learning is data labeling. Most machine learning models are supervised, which means they require training examples that are manually labeled by human annotators. For example, the ImageNet dataset is a crowdsourced repository that contains millions of images and their corresponding classes.

  • In federated learning, unless outcomes can be inferred from user interactions (e.g., predicting the next word the user is typing), the developers can’t expect users to go out of their way to label training data for the machine learning model. Federated learning is better suited for unsupervised learning applications such as language modeling.

  • While sending trained model parameters to the server is less privacy-sensitive than sending raw user data, it doesn’t mean that the model parameters are completely free of private data.

While the flaws in FL are still being worked on, combining it with other privacy-preserving AI technologies might be a path toward truly securing sensitive data and user privacy.


Many experiments have shown that trained machine learning models may memorize user data, and membership inference attacks can, through trial and error, determine whether a given record was used to train a model.


One important remedy to the privacy concerns of federated learning is to discard the user-trained models after they are integrated into the central model. The cloud server doesn’t need to store individual models once it updates its base model.


Another measure that can help is to increase the pool of model trainers. For example, if a model needs to be trained on the data of 100 users, the engineers can increase their pool of trainers to 250 or 500 users. For each training iteration, the system will send the base model to 100 random users from the training pool. This way, the system doesn’t collect trained parameters from any single user constantly.


Finally, Federated Learning has the potential to help protect data generated on a device by sharing model updates, such as gradients, instead of raw data. But communicating model updates throughout the training process can still reveal sensitive information, as touched on earlier. So while FL is a promising direction, truly secure, privacy-oriented AI learning will come from combining it with other methods.

