Volume 51 Issue 2
Feb. 2024

PICOTEES: a privacy-preserving online service of phenotype exploration for genetic-diagnostic variants from Chinese children cohorts

doi: 10.1016/j.jgg.2023.09.003
Funds:

This work was funded by the Shanghai Hospital Development Center (SHDC2020CR6028-002 to W. Zhou), National Key R&D Program of China (2020YFC2006402 to Y. Lu), National Key R&D Program of China (2022ZD0116003 to X. Dong), the Science and Technology Commission of Shanghai (22002400700 to S. Wu), Shanghai Municipal Science and Technology Major Project (20Z11900600 to W. Zhou), National Key Research and Development Program (2018YFC0116903 to W. Zhou), and Major Research Projects for Young and Middle-aged People of Fujian Province (2021ZQNZD017 to Y. Lu). This work is also supported by Key Lab Information Network Security, Ministry of Public Security (to H. Zheng and S. Wang), “Pioneer” and “Leading Goose” R&D Program of Zhejiang (No. 2022C01126 to Q. Sun and S. Wang), and National Key R&D Program of China (2021YFC2500802 and 2021YFC2500806 to H. Zheng and S. Wang). We thank all doctors and nurses in our hospital for their patient care and data collection. We are very grateful to the patient families for their trust in our lab.

  • Received Date: 2023-04-06
  • Accepted Date: 2023-09-03
  • Rev Recd Date: 2023-08-31
  • Publish Date: 2023-09-13
  • The growth in biomedical data resources has raised potential privacy concerns and risks of genetic information leakage. For instance, exome sequencing aids clinical decisions by comparing data through web services, but it requires significant trust between users and providers. To alleviate privacy concerns, the most commonly used strategy is to anonymize sensitive data. Unfortunately, studies have shown that anonymization is insufficient to protect against reidentification attacks. Recently, privacy-preserving technologies have been applied to preserve application utility while protecting the privacy of biomedical data. We present the PICOTEES framework, a privacy-preserving online service of phenotype exploration for genetic-diagnostic variants (https://birthdefectlab.cn:3000/). PICOTEES enables privacy-preserving queries of the phenotype spectrum for a single variant by utilizing trusted execution environment technology, which can protect the privacy of the user's query information, backend models, and data, as well as the final results. We demonstrate the utility and performance of PICOTEES by exploring a bioinformatics dataset. The dataset is from a cohort containing 20,909 genetic testing patients with 3,152,508 variants from the Children's Hospital of Fudan University in China, dominated by the Chinese Han population (>99.9%). Our query results yield a large number of unreported diagnostic variants and previously reported pathogenicity.