Dual query

practical private query release for high dimensional data

Marco Gaboardi, Emilio Jesús Gallego Arias, Justin Hsu, Aaron Roth, Zhiwei Steven Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

We present a practical, differentially private algorithm for answering a large number of queries on high dimensional datasets. Like all algorithms for this task, ours necessarily has worst-case complexity exponential in the dimension of the data. However, our algorithm packages the computationally hard step into a concisely defined integer program, which can be solved non-privately using standard solvers. We prove accuracy and privacy theorems for our algorithm, and then demonstrate experimentally that our algorithm performs well in practice. For example, our algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.

Original language: English
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society
Pages: 2908-2916
Number of pages: 9
Volume: 4
ISBN (Print): 9781634393973
Publication status: Published - 2014
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: 21 Jun 2014 to 26 Jun 2014
http://icml.cc/2014/

Publication series

Name: Journal of Machine Learning Research
Volume: 32

Conference

Conference: 31st International Conference on Machine Learning, ICML 2014
Abbreviated title: ICML 2014
Country: China
City: Beijing
Period: 21/06/14 to 26/06/14
Internet address: http://icml.cc/2014/

Cite this

Gaboardi, M., Arias, E. J. G., Hsu, J., Roth, A., & Wu, Z. S. (2014). Dual query: practical private query release for high dimensional data. In 31st International Conference on Machine Learning, ICML 2014 (Vol. 4, pp. 2908-2916). (Journal of Machine Learning Research; Vol. 32). International Machine Learning Society.
@inproceedings{bfd13c5e517445eabed0c9b68f6f6b5b,
title = "Dual query: practical private query release for high dimensional data",
author = "Marco Gaboardi and Arias, {Emilio Jes{\'u}s Gallego} and Justin Hsu and Aaron Roth and Wu, {Zhiwei Steven}",
year = "2014",
language = "English",
isbn = "9781634393973",
volume = "4",
series = "Journal of Machine Learning Research",
publisher = "International Machine Learning Society",
pages = "2908--2916",
booktitle = "31st International Conference on Machine Learning, ICML 2014",

}
