|Title||Secure and Scalable Statistical Computation of Questionnaire Data in R|
Collecting data via a questionnaire and analyzing them while preserving respondents’ privacy may increase the number of respondents and the truthfulness of their responses. It may also reduce the systematic differences between respondents and non-respondents. In this paper, we propose a privacy-preserving method for collecting and analyzing survey responses using secure multi-party computation (SMC). The method is secure under the semi-honest adversarial model.
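The abstract does not describe the secure protocols themselves. As a rough illustration of how a count can be computed without any single party seeing an individual answer, the following minimal sketch uses additive secret sharing, a common generic SMC building block. It is not the paper's protocol; the modulus, the function names, the number of parties, and the sample answers are all assumptions made for this example.

```python
import secrets

# Hypothetical modulus; it only needs to exceed any possible count.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_count(responses, n_parties=3):
    """Sum 0/1 responses so that each party only ever holds random-looking shares."""
    party_totals = [0] * n_parties
    for response in responses:
        # Each respondent sends one share to each computing party.
        for party, s in enumerate(share(response, n_parties)):
            party_totals[party] = (party_totals[party] + s) % MODULUS
    # Only the per-party aggregates are combined to reveal the total.
    return sum(party_totals) % MODULUS


# Illustrative yes/no answers (1 = yes), not data from the study.
answers = [1, 0, 1, 1, 0, 1]
total_yes = secure_count(answers)
```

In this sketch a semi-honest party that follows the procedure learns nothing about an individual answer from its own shares, because each share on its own is uniformly random.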
The proposed method computes a wide variety of statistics. Total and stratified statistical counts are computed using the secure protocols developed in this paper. Then, additional statistics, such as a contingency table, a chi-square test, an odds ratio, and logistic regression, are computed within the R statistical environment using the statistical counts as building blocks.
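To illustrate how stratified counts can serve as building blocks for further statistics, the sketch below derives a Pearson chi-square statistic and an odds ratio from a 2x2 contingency table. The paper performs this step in R; this Python version is only an illustration, and the counts, function names, and table layout are hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def odds_ratio_2x2(table):
    """Odds ratio (a*d)/(b*c); assumes no cell on the off-diagonal is zero."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts: rows are exposure (yes/no), columns are outcome (yes/no).
counts = [[30, 70], [10, 90]]

chi2 = chi_square_2x2(counts)
odds = odds_ratio_2x2(counts)
# Upper-tail p-value; this closed form holds for 1 degree of freedom only.
p_value = math.erfc(math.sqrt(chi2 / 2))
```

Only the four cell counts enter the computation, so once the counts are available no individual response is needed for these statistics.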
The method was evaluated on a questionnaire dataset of 3,158 respondents sampled for a medical study and on simulated questionnaire datasets of up to 50,000 respondents. The computation time for the statistical analyses scales linearly with the number of respondents. The results show that the method is efficient and scalable enough for practical use. It can also be used in other applications in which categorical data are collected.
|Secure Multi-Party Computation|
|Journal citation||4, pp. 4635-4645|
|Digital Object Identifier (DOI)||doi:10.1109/ACCESS.2016.2599851|
|Web address (URL)||http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7542506|
|Published||12 Aug 2016|