In December 1890, Samuel Warren and Louis Brandeis, concerned about the privacy implications of the new “instantaneous camera,” penned The Right to Privacy, in which they argued for protecting “all persons, whatsoever their position or station, from having matters which they may properly prefer to keep private, made public against their will.” 125 years later, our private information has become currency: we (or others) “spend” it by giving it to others in exchange for services or other information. Like currency, our data can be stolen from the stewardship of those to whom we give it.
783 corporate breaches were reported in 2014 alone, up 27% year over year.
Those stewards also sell our data to brokers. Acxiom alone boasts of offering for sale more than 1,500 data items on each of 200 million Americans. In at least one important way, our data is unlike currency: its theft from or sale by stewards is our problem, not theirs, because, as Brandeis foresaw, it can be used against us, forever.

In 125 years, we’ve stepped from Brandeis’ brave vision all the way back to the 1600s and 1700s, when Jean de La Bruyère wrote, “All confidence placed in another is dangerous if it is not perfect,” and Benjamin Franklin duly noted, “Three may keep a secret, if two of them are dead.”

How might we claim The Right to Privacy for information under our control? By never letting it leave our hands unsecured. Think about protecting data privacy as a triangle. One edge secures data in transit; another secures data at rest; the third secures data in use. For example, your buy or sell bid in a commodity auction might be encrypted between your phone and the auction house (first edge), while stored there on a server (second edge), and then remain encrypted while used to determine the auction outcome (third and final edge). All the auction house (or anyone) learns is the resulting fair market price. No bidder’s economic position is revealed to other buyers or sellers or the auction house. Sound impossible? Danish sugar beets have been priced this way since 2008.

We have pretty good solutions for data in transit: TLS is an example. Defects are still discovered in transit protocols and their implementations (FREAK, Heartbleed), but the technology is mainly sound and can be made verifiably correct (miTLS) using formal methods from computer science. We also have pretty good solutions for data at rest: symmetric-key block ciphers of sufficient strength, such as AES, aren’t breakable in practice and can also be formally verified. What about the third (currently missing) leg of the triangle?
We have component techniques for securing systems, such as secure hypervisors, sandboxing at multiple levels, and firewalling address spaces as Intel’s SGX aims to do. However, no composition of these approaches can fully secure application data while in use. Secure computation (computing on data while it remains encrypted) may be a solution. This tech is emerging from academic research and slowly becoming practical. The four techniques below round out most of this field – each has strengths and weaknesses.
1. Homomorphic encryption allows a single server to compute functions on data inputs encrypted and sent to it by a client. Operations on those encrypted inputs produce outputs that, when decrypted by the client, give the same answer as if the operation had been done “in the clear” on unencrypted inputs. The server learns nothing useful about inputs, intermediate values during computation, or the output of the computation.
2. Garbled circuits allow two participants to jointly compute a function while its inner workings and inputs remain encrypted. Alice encrypts her inputs and the function so Bob can’t understand her data or internal variables of the function during computation. Bob encrypts his inputs too, and lets Alice swizzle them so he can’t cheat to learn Alice’s secrets. Bob then computes the encrypted function result, and both can decrypt only that result. Neither Alice nor Bob learns anything about the other’s inputs except what can be inferred from the output after decryption.
3. Secret-sharing computation makes it possible for multiple servers, usually three, to compute on essentially random shares of inputs from multiple clients. These shares are chosen so that when added together, they equal the original input value. Each server operates on a single share of each client’s input, so no server can reassemble any complete input, intermediate variable during computation, or output. The servers’ result shares are returned to the client(s), where they are added together. Remarkably, the sum of those result shares is the correct result of the function.
4. Oblivious RAM (ORAM) allows secure calculation of a broader range of functions than the above techniques. ORAM conceals memory access patterns by a continuous process of shuffling and encrypting data on each access. The promise of ORAM is that an adversary observing program memory learns nothing about its contents or the access pattern of the program.
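To make the first technique concrete, here is a minimal sketch of additively homomorphic encryption using a toy version of the Paillier cryptosystem. Paillier is swapped in here purely for illustration: it supports only addition on encrypted values, not arbitrary functions, and the tiny primes and the function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`) are assumptions of this sketch, not part of any product or library mentioned above. It is not secure as written.

```python
import math
import secrets

# Toy primes for illustration only; a real deployment would use primes
# of 1024 bits or more. These values are an assumption of this sketch.
P, Q = 1009, 1013


def keygen(p, q):
    """Return (public modulus n, private key) for a toy Paillier instance."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid for g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Client side: encrypt an integer 0 <= message < n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n)
        if math.gcd(r, n) == 1:
            break
    # Ciphertext is g^m * r^n mod n^2 with generator g = n + 1.
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, ciphertext):
    """Client side: recover the plaintext with the private key."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


# The client encrypts two inputs; the server adds them without the key.
n, private_key = keygen(P, Q)
c_a = encrypt(n, 20)
c_b = encrypt(n, 22)
c_sum = add_encrypted(n, c_a, c_b)
total = decrypt(n, private_key, c_sum)  # 42, same as 20 + 22 in the clear
```

The server only ever calls `add_encrypted`, which needs the public modulus but not the private key. Decrypting `c_sum` on the client gives the same answer as adding the inputs in the clear.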
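The third technique can be sketched just as briefly. The example below assumes additive secret sharing modulo a fixed prime and three servers that only add the shares they hold. The modulus, the function names, and the two example inputs are hypothetical choices for this sketch. Real secret-sharing systems also handle multiplication, which requires the servers to exchange messages and is not shown here.

```python
import secrets

# Public modulus for all share arithmetic; this particular prime is an
# assumption of the sketch, chosen only because it is comfortably large.
MODULUS = 2**61 - 1


def share(value, n_servers=3):
    """Split value into n_servers random-looking shares that sum to it."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    # The last share is whatever makes the total come out right.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def server_add(share_a, share_b):
    """Run by one server: add the single share it holds of each input."""
    return (share_a + share_b) % MODULUS


def reconstruct(shares):
    """Run by the client: add the result shares to get the answer."""
    return sum(shares) % MODULUS


# Two clients each split a private input across three servers.
alice_shares = share(1200)
bob_shares = share(800)

# Server i sees only alice_shares[i] and bob_shares[i].
result_shares = [server_add(a, b) for a, b in zip(alice_shares, bob_shares)]

total = reconstruct(result_shares)  # 2000, the sum of the private inputs
```

Each individual share is uniformly random on its own, so a single server learns nothing about either input. Only the recombined result shares reveal the sum.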
What can be demonstrated today with these technologies besides beet auctions? Many-party, streaming VoIP where the server never decrypts voice data; tax fraud detection without revealing sensitive financial information; scanning of encrypted e-mail for sensitive content; secure databases in the cloud that learn nothing about queries or stored data; face recognition without revealing facial images; and others. Most are research prototypes, but they point to solutions for the third leg of the data privacy triangle. Making secure computation real may help us achieve a world where we can spend the currency of our private information without it being made public against our will. Identity theft can be curtailed because your personally identifiable information stays encrypted. Individuals can find out whether certain drugs are appropriate for them without revealing their medical details or genomes. Corporations can share sensitive cyber threat information without revealing their network structure or vulnerabilities. Maybe Warren, Brandeis and Franklin would be proud of such a future.
About the Author: David Archer is a Research Program Lead at Galois, Inc., where he directs research in computing on encrypted data, cryptography, and security-related provenance of data and computation. He holds a PhD in Computer Science from Portland State University (Portland, OR) as well as an MS in Electrical Engineering from the University of Illinois at Urbana-Champaign. Dr. Archer’s research interests also include cyber privacy and information assurance. He also has 25 years of experience in processor and computer system design, and in leading large hardware and software product design teams at Intel Corporation and Mentor Graphics Corporation.

Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.

Title image courtesy of ShutterStock