Healthcare data collaboration: Beyond federated learning

Published on
February 20, 2025

Healthcare faces a dilemma: Combining patient data across institutions could revolutionize research and improve patient care, yet data privacy and security concerns make such collaboration increasingly difficult.

Federated learning (FL) emerged as the first attempt to resolve this tension. While groundbreaking, its practical limitations have led to the emergence of a more powerful approach: the combination of confidential computing and data clean rooms.

This article unpacks where federated learning still meets the needs of healthcare data collaboration and where data clean rooms powered by confidential computing can take it further.

The promise and reality of federated learning

FL represents an innovative approach to healthcare data collaboration: Instead of sharing raw patient data, hospitals can participate in machine learning projects by training models locally and sharing only model updates.

This approach resonates strongly with healthcare institutions, as it keeps sensitive data within hospital walls while enabling participation in large-scale research projects. Clinical teams can contribute to multi-center studies without compromising their commitment to patient privacy. Researchers can train AI models on diverse patient populations whose data would otherwise be inaccessible.
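The local-training-and-sharing loop described above can be sketched in a few lines. This is a minimal illustration of federated averaging on a toy linear model, not the implementation of any particular FL framework; the function names (`local_update`, `federated_average`, `run_rounds`), the learning rate, and the weighting by record count are all assumptions made for the example.

```python
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, float]]  # (features, target) pairs held by one site


def local_update(weights: Vector, data: Dataset, lr: float = 0.1, epochs: int = 5) -> Vector:
    """Run a few gradient-descent steps on one site's data; raw rows never leave this function."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def federated_average(updates: List[Tuple[Vector, int]]) -> Vector:
    """Combine site weights, weighting each site by its number of records."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]


def run_rounds(sites: List[Dataset], dim: int, rounds: int = 50) -> Vector:
    """Coordinator loop: send out the global model, collect updates, average them."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        # Only (weights, record count) is shared with the coordinator.
        updates = [(local_update(global_w, d), len(d)) for d in sites]
        global_w = federated_average(updates)
    return global_w
```

Note that the coordinator only ever sees weight vectors and record counts. The sketch also assumes every site uses the same feature layout, which is the constraint discussed later in this article.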

However, federated learning comes with inherent trade-offs that limit its practical application. Each participating hospital needs to install and maintain specialized hardware, adding complexity to already stretched IT resources. Privacy guarantees also depend heavily on the quality of the machine learning models, so without careful configuration, sensitive information can leak through model updates.

For research teams, the constraints are particularly challenging. FL works best when every participating site has identical data structures. This is a rarity in healthcare, where different institutions often record similar information in different ways. Data scientists can't explore the raw data to understand its quirks, can't transform datasets to align different formats, and are limited to specific types of machine learning approaches.

In many cases, these limitations are treated as a “necessary evil” of sensitive data collaboration, adding considerable friction to an already difficult process. However, this is not a standard the field needs to keep accepting. As security and privacy technologies mature, more flexible and equally secure paradigms are emerging.

Keeping the best of both worlds

The next evolution in healthcare data collaboration comes from combining two complementary innovations: confidential computing technology and data clean rooms. While each is powerful on its own, together they create something transformative.

This evolution moves data analysis from on-premises hardware to cloud infrastructure. However, there’s a crucial difference: confidential computing provides hardware-based security guarantees that protect data not just from external threats, but even from the cloud provider itself. Your sensitive data remains encrypted and inaccessible to anyone, including the platform operators, throughout the entire analysis process. Think of it as having your own secure vault that maintains its protections even when moved outside your walls.

Data clean rooms provide the collaboration framework that makes this security practical to use. They add the crucial governance layer, letting clinical teams control exactly how their data is used, allowing researchers to discover available datasets, and providing tools for secure analysis. It's like having a secure reading room with clear rules of engagement that everyone understands and trusts.

What this means for clinical teams

For hospitals and research institutions, this combination transforms how they can participate in research collaborations:

Complete control: Beyond just keeping data secure, clinical teams can precisely control how their data is used. They can review and approve specific analyses, monitor usage in real-time, and maintain detailed audit trails — all without needing specialized hardware or complex IT setups.

Flexible collaboration: Rather than needing to limit themselves to specific types of machine learning studies, clinical teams can support any kind of analysis their researchers need — from statistical analysis to AI model training — while maintaining rigorous privacy protection.
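To make the "review and approve specific analyses" idea concrete, here is a minimal sketch of an approval check with an audit trail. It is purely illustrative and does not represent Decentriq's API or any real clean room product; `DatasetPolicy`, `approve`, and `request_run` are hypothetical names, and identifying an analysis by the SHA-256 hash of its script is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


def fingerprint(script: str) -> str:
    """Identify an analysis by the SHA-256 hash of its exact source text."""
    return hashlib.sha256(script.encode("utf-8")).hexdigest()


@dataclass
class DatasetPolicy:
    """Hypothetical per-dataset policy controlled by the data owner."""
    owner: str
    approved: Set[str] = field(default_factory=set)
    audit_log: List[Dict] = field(default_factory=list)

    def approve(self, script: str) -> None:
        # The owner reviews the script and whitelists its fingerprint.
        self.approved.add(fingerprint(script))

    def request_run(self, requester: str, script: str) -> bool:
        # Every request is logged, whether or not it is allowed.
        digest = fingerprint(script)
        allowed = digest in self.approved
        self.audit_log.append(
            {"requester": requester, "script_sha256": digest, "allowed": allowed}
        )
        return allowed
```

Because the check is tied to the exact script text, any change to an approved analysis, however small, needs a fresh approval from the data owner.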

Implications for research teams

For data scientists and biostatisticians, this new paradigm removes fundamental limitations that made federated learning impractical for many real-world scenarios:

Native workflow support: No more learning specialized frameworks or limiting analyses to what federated systems allow. Whether using Python, R, or SQL, researchers can work with their familiar tools and methods. The system adapts to existing workflows, not the other way around.

Freedom to transform data: Federated learning faces a fundamental constraint — it can only train models on datasets that look identical across every participating site. In the real world of healthcare data, where different institutions often structure similar information in different ways, this creates a major roadblock.

The combination of confidential computing and data clean rooms removes this constraint. Researchers can explore data structure through privacy-preserving previews, develop transformation pipelines to harmonize different formats, and iteratively refine their analysis — all while the original data remains fully encrypted and under institutional control.

Consider a multi-center study of patient outcomes. With federated learning, analysis would be impossible unless every hospital harmonized its data in exactly the same way. With confidential computing and data clean rooms, researchers can harmonize data from different sources, test their transformations, and ensure they're properly combining the information before running their final analysis. This flexibility makes real-world healthcare data collaboration practical in ways federated learning never could.
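As a rough illustration of what harmonizing data from different sources can involve, the sketch below maps two differently structured record formats onto one common schema. The field names, units, and date formats for both sites are invented for the example, and in a clean room this kind of code would run inside the protected environment rather than on exported data.

```python
from datetime import datetime
from typing import Dict, List

LB_PER_KG = 2.20462  # pounds per kilogram

# Hypothetical target schema shared by all sites after harmonization.
COMMON_FIELDS = ("patient_id", "sex", "weight_kg", "admitted")


def harmonize_site_a(row: Dict) -> Dict:
    """Site A (assumed) already records kilograms and ISO dates."""
    return {
        "patient_id": f"A-{row['patient_id']}",
        "sex": row["sex"].upper(),
        "weight_kg": float(row["weight_kg"]),
        "admitted": datetime.strptime(row["admitted"], "%Y-%m-%d").date().isoformat(),
    }


def harmonize_site_b(row: Dict) -> Dict:
    """Site B (assumed) records pounds, day/month/year dates, and spelled-out sex."""
    sex_codes = {"female": "F", "male": "M"}
    return {
        "patient_id": f"B-{row['pid']}",
        "sex": sex_codes[row["gender"].lower()],
        "weight_kg": round(row["weight_lb"] / LB_PER_KG, 1),
        "admitted": datetime.strptime(row["admission_date"], "%d/%m/%Y").date().isoformat(),
    }


def combine(site_a_rows: List[Dict], site_b_rows: List[Dict]) -> List[Dict]:
    """Harmonize both sites and verify every row matches the common schema."""
    rows = [harmonize_site_a(r) for r in site_a_rows]
    rows += [harmonize_site_b(r) for r in site_b_rows]
    for row in rows:
        if tuple(row) != COMMON_FIELDS:
            raise ValueError(f"row does not match common schema: {row}")
    return rows
```

The schema check in `combine` stands in for the "test their transformations" step: a mismatch is caught before any final analysis runs on the combined rows.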

Real impact and future possibilities

This evolution beyond federated learning is already transforming healthcare research. A recent collaboration between European hospitals cut months off its research timeline by eliminating the usual technical hurdles. Rather than spending time deploying hardware and harmonizing data formats, clinical teams could focus on what mattered: defining clear research goals and ensuring appropriate data governance. Researchers could iterate quickly on their analysis approaches, test data transformations, and generate meaningful insights in weeks rather than months.

The impact goes beyond just speed. Clinical teams can now support types of research that were simply impossible with federated learning alone — from linking patient journeys across institutions to combining clinical trial data with real-world data. All while maintaining rigorous privacy controls.

What's next for healthcare data collaboration

As healthcare increasingly adopts AI technologies like large language models, the combination of confidential computing and data clean rooms isn't just solving today's challenges. It's enabling entirely new possibilities. Clinical teams can explore using these powerful tools on sensitive data without compromising their privacy commitments. Researchers can experiment with cutting-edge analysis techniques while maintaining security.

The good news is that you don't have to combine these technologies on your own. The Decentriq platform uses confidential computing and other privacy-enhancing technologies for data clean rooms that ensure raw data remains private and inaccessible to anyone other than its owner.

As new analysis techniques emerge or data privacy regulations evolve, this foundation ensures institutions can adapt quickly. We're moving toward a future where data collaboration in healthcare is both more secure and more practical than ever before.

Planning a new collaboration? Discover the practical data protection benefits of Decentriq’s platform in our whitepaper or contact us to explore seamless integration into your setup.
