Executive Summary

Why is Privacy Paramount?

In a time when technology is embedded in the daily lives of billions of people, vast amounts of data are captured and can be used to enhance people's lives and experiences, recommend products and services, or even help solve larger social and community-wide problems. While this has great potential for social good, it remains important to protect personal information in order to prevent misuse, bias, and the targeting of vulnerable audiences, and to offer individuals the basic freedom to explore and develop their identities without tracking or recrimination. This is why data privacy is recognized as a national and international priority: it enables public and private organizations to study, research, and use data in a safe and protected manner and to make informed and educated decisions.

Why Differential Privacy?

Several methods have been used over time to protect information about individuals when sharing sensitive datasets; however, these traditional approaches are known to be ineffective at protecting privacy and can often be reverse-engineered to glean information about users. Differential Privacy (DP), which has its roots in cryptography, is the gold standard for privacy preservation: it allows rich statistical analysis of sensitive data while still protecting personal data. While DP focuses on protecting the privacy of outputs that are published or released, other Privacy Enhancing Technologies (PETs) can be combined with DP to protect data in a wide array of compute, data-sharing, and distributed-learning settings, even while an analysis is being carried out across multiple data holders.
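
To make the guarantee concrete, the short sketch below illustrates the Laplace mechanism, the textbook building block of differential privacy, applied to a simple counting query. It is written in plain Python with NumPy purely for illustration and is not the OpenDP library's API; the function dp_count and the example data are hypothetical.

    # A minimal sketch of the Laplace mechanism for a counting query
    # (illustration only; this is NOT the vetted OpenDP library).
    import numpy as np

    def dp_count(values, predicate, epsilon):
        # A counting query has sensitivity 1: adding or removing one person's
        # record changes the count by at most 1, so Laplace noise with scale
        # 1/epsilon provides epsilon-differential privacy.
        true_count = sum(1 for v in values if predicate(v))
        noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
        return true_count + noise

    # Example (hypothetical data): release the number of respondents over 65
    # with privacy-loss parameter epsilon = 0.5.
    ages = [23, 67, 45, 71, 34, 66]
    print(dp_count(ages, lambda age: age > 65, epsilon=0.5))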

How can OpenDP help?

The vast majority of data about people, groups, societies, geographies, and countries is presently inaccessible to audiences such as scientific researchers, policy makers, or other analysts—locked up inside companies, governments, and other organizations. While it is understood that sharing this type of data would be greatly beneficial, it is difficult to do so in a trusted and compliant manner without running the risk of a severe privacy breach. OpenDP is a collaborative project in which the community of differential privacy experts and users has come together to produce a trustworthy repository of resources for deploying these technologies to open up sensitive data for research, analysis, and decision making. It is on its way to becoming the standard body of open-source implementations of differential privacy algorithms for statistical analysis and machine learning on sensitive data, and a pathway that will bring the newest developments in this rapidly advancing field to a wide array of practitioners.

Why is OpenDP a trustworthy resource?

The core of the OpenDP software is a library of algorithms for generating statistical releases with the strong protections of differential privacy. Its trustworthiness comes from the careful process by which the software is vetted, in which experts in the community verify (through mathematical proofs) that the library provides its claimed privacy guarantees. No other open-source library of differential privacy software has such a vetting process in place. OpenDP has partnered and engaged with state, federal, and international government agencies and global technology companies to help them share data in a safe and protected manner in order to conduct valuable research. Incubated by a team of expert researchers, developers, and staff at Harvard University, OpenDP has a rapidly growing cross-sector community of more than 600 researchers, practitioners, and privacy advocates from academia, industry, and government, including an advisory board of 30 distinguished thought leaders. Learn more about OpenDP on our website. [ADD LINK]