International Conference on Information Security Practice and Experience

ISPEC 2014: Information Security Practice and Experience, pp 28–41

Data Security and Privacy in the Cloud

  • Pierangela Samarati
  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNCS, volume 8434)

Achieving data security and privacy in the cloud means ensuring confidentiality and integrity of data and computations, and protection from unauthorized accesses. Satisfying these requirements entails nontrivial challenges because, by relying on external servers, owners lose control over their data. In this paper, we discuss the problems of guaranteeing proper data security and privacy in the cloud, and illustrate possible solutions for them.

  • Cloud computing
  • confidentiality
  • honest-but-curious servers
  • data fragmentation
  • private access
  • shuffle index
  • query integrity

Author information

Authors and affiliations.

Dipartimento di Informatica , Università degli Studi di Milano , Via Bramante 65, 26013, Crema, Italy

Pierangela Samarati

Editor information

Editors and affiliations.

School of Mathematics and Computer Science, Fujian Normal University, No. 32 Shangsan Road, 350007, Fuzhou, China

Xinyi Huang

Infocom Security Department, Institute for Infocomm Research, 1 Fusionopolis Way, #21-01 Connexis, South Tower, 138632, Singapore, Singapore

Jianying Zhou

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper.

Samarati, P. (2014). Data Security and Privacy in the Cloud. In: Huang, X., Zhou, J. (eds) Information Security Practice and Experience. ISPEC 2014. Lecture Notes in Computer Science, vol 8434. Springer, Cham. https://doi.org/10.1007/978-3-319-06320-1_4

DOI : https://doi.org/10.1007/978-3-319-06320-1_4

Publisher Name : Springer, Cham

Print ISBN : 978-3-319-06319-5

Online ISBN : 978-3-319-06320-1

eBook Packages : Computer Science Computer Science (R0)

Systematic Review Article

Securing Machine Learning in the Cloud: A Systematic Review of Cloud Machine Learning Security

  • 1 Information Technology University (ITU), Lahore, Pakistan
  • 2 AI4Networks Research Center, University of Oklahoma, Norman, OK, United States
  • 3 Social Data Science (SDS) Lab, Queen Mary University of London, London, United Kingdom
  • 4 School of Computing and Communications, Lancaster University, Lancaster, United Kingdom
  • 5 Hamad Bin Khalifa University (HBKU), Doha, Qatar

With the advances in machine learning (ML) and deep learning (DL) techniques, and the potency of cloud computing in offering services efficiently and cost-effectively, Machine Learning as a Service (MLaaS) cloud platforms have become popular. In addition, there is increasing adoption of third-party cloud services for outsourcing training of DL models, which requires substantial costly computational resources (e.g., high-performance graphics processing units (GPUs)). Such widespread usage of cloud-hosted ML/DL services opens a wide range of attack surfaces for adversaries to exploit the ML/DL system to achieve malicious goals. In this article, we conduct a systematic evaluation of the literature on cloud-hosted ML/DL models along the two important dimensions related to their security: attacks and defenses. Our systematic review identified a total of 31 related articles, out of which 19 focused on attack, six focused on defense, and six focused on both attack and defense. Our evaluation reveals increasing interest from the research community in attacking Machine Learning as a Service platforms and in defending them against different attacks. In addition, we identify the limitations and pitfalls of the analyzed articles and highlight open research issues that require further investigation.

1 Introduction

In recent years, machine learning (ML) techniques have been successfully applied to a wide range of applications, significantly outperforming previous state-of-the-art methods in various domains: for example, image classification, face recognition, and object detection. These ML techniques—in particular deep learning (DL)–based ML techniques—are resource intensive and require a large amount of training data to accomplish a specific task with good performance. Training DL models on large-scale datasets is usually performed using high-performance graphics processing units (GPUs) and tensor processing units. However, keeping in mind the cost of GPUs/Tensor Processing Units and the fact that small businesses and individuals cannot afford such computational resources, the training of deep models is typically outsourced to clouds, which is referred to in the literature as “Machine Learning as a Service” (MLaaS).

MLaaS refers to different ML services that are offered as a component of cloud computing services, for example, predictive analytics, face recognition, natural language services, and data modeling APIs. MLaaS allows users to upload their data and model for training on the cloud. In addition to training, cloud-hosted ML services can also be used for inference purposes, that is, models can be deployed in cloud environments; the system architecture of a typical MLaaS is shown in Figure 1.

FIGURE 1. An illustration of a typical cloud-based ML or machine learning as a service (MLaaS) architecture.

MLaaS can help reduce the entry barrier to the use of ML and DL through access to managed services of wide hardware heterogeneity and incredible horizontal scale. MLaaS is currently provided by several major organizations such as Google, Microsoft, and Amazon. For example, Google offers Cloud ML Engine, which allows developers and data scientists to upload training data and a model that is then trained on the cloud in the TensorFlow environment. Similarly, Microsoft offers Azure Batch AI, a cloud-based service for training DL models using different frameworks supported by both Linux and Windows operating systems, and Amazon offers a cloud service named Deep Learning AMI (DLAMI) that provides several pre-built DL frameworks (e.g., MXNet, Caffe, Theano, and TensorFlow) available in Amazon’s EC2 cloud computing infrastructure. Such cloud services are popular among researchers, as evidenced by the price of Amazon’s p2.16xlarge instance rising to the maximum possible two days before the deadline of NeurIPS 2017 (the largest research venue on ML), indicating that a large number of users requested to reserve instances.

In addition to MLaaS services that allow users to upload their model and data for training on the cloud, transfer learning is another strategy to reduce computational cost in which a pretrained model is fine-tuned for a new task (using a new dataset). Transfer learning is widely applied for image recognition tasks using a convolutional neural network (CNN). A CNN model learns and encodes features like edges and other patterns. The learned weights and convolutional filters are useful for image recognition tasks in other domains and state-of-the-art results can be obtained with a minimal amount of training even on a single GPU. Moreover, various popular pretrained models such as AlexNet ( Krizhevsky et al., 2012 ), VGG ( Simonyan and Zisserman, 2015 ), and Inception ( Szegedy et al., 2016 ) are available for download and fine-tuning online. Both of the aforementioned outsourcing strategies come with new security concerns. In addition, the literature suggests that different types of attacks can be realized on different components of the communication network as well ( Usama et al., 2020a ), for example, intrusion detection ( Han et al., 2020 ; Usama et al., 2020b ), network traffic classification ( Usama et al., 2019 ), and malware detection systems ( Chen et al., 2018 ). Moreover, adversarial ML attacks have also been devised for client-side ML classifiers, that is, Google’s phishing pages filter ( Liang et al., 2016 ).

Contributions of the article: In this article, we analyze the security of MLaaS and other cloud-hosted ML/DL models and provide a systematic review of associated security challenges and solutions. To the best of our knowledge, this article is the first effort on providing a systematic review of the security of cloud-hosted ML models and services. The following are the major contributions of this article:

(1) We conducted a systematic evaluation of 31 articles related to MLaaS attacks and defenses.

(2) We investigated five themes of approaches aiming to attack MLaaS and cloud-hosted ML services.

(3) We examined five themes of defense methods for securing MLaaS and cloud-hosted ML services.

(4) We identified the pitfalls and limitations of the examined articles. Finally, we have highlighted open research issues that require further investigation.

Organization of the article: The rest of the article is organized as follows. The methodology adopted for the systematic review is presented in Section 2. The results of the systematic review are presented in Section 3. Section 4 presents various security challenges associated with cloud-hosted ML models, and potential solutions for securing cloud-hosted ML models are presented in Section 5. The pitfalls and limitations of the reviewed approaches are discussed in Section 6. Various open research issues that require further investigation are highlighted in Section 7, and we briefly reflect on our methodology to identify any threats to validity in Section 8. Finally, we conclude the article in Section 9.

2 Review Methodology

In this section, we present the research objectives and the adopted methodology for the systematic review. The purpose of this article is to identify and systematically review the state-of-the-art research related to the security of cloud-based ML/DL techniques. The methodology followed for this study is depicted in Figure 2.

FIGURE 2. The methodology for systematic review.

2.1 Research Objectives

The following are the key objectives of this article.

O1: To build upon the existing work around the security of cloud-based ML/DL methods and present a broad overview of the existing state-of-the-art literature related to MLaaS and cloud-hosted ML services.

O2: To identify and present a taxonomy of different attack and defense strategies for cloud-hosted ML/DL models.

O3: To identify the pitfalls and limitations of the existing approaches in terms of research challenges and opportunities.

2.2 Research Questions

To achieve our objectives, we pose the two questions described below and conduct a systematic analysis of 31 articles to answer them.

Q1: What are the well-known attacks on cloud-hosted/third-party ML/DL models?

Q2: What are the countermeasures and defenses against such attacks?

2.3 Review Protocol

We developed a review protocol to conduct the systematic review; the details are described below.

2.3.1 Search Strategy and Searching Phase

To build a knowledge base and extract the relevant articles, eight major publishers and online repositories were queried, including ACM Digital Library, IEEE Xplore, ScienceDirect, International Conference on Machine Learning, International Conference on Learning Representations, Journal of Machine Learning Research, Neural Information Processing Systems, USENIX, and arXiv. As we added non-peer-reviewed articles from the electronic preprint archive (arXiv), we (AQ and AI) performed the critical appraisal using the AACODS checklist, which is designed to enable evaluation and appraisal of gray literature (Tyndall, 2010).

In the initial phase, we queried main libraries using a set of different search terms that evolved using an iterative process to maximize the number of relevant articles. To achieve optimal sensitivity, we used a combination of words: attack, poisoning, Trojan attack, contamination, model inversion, evasion, backdoor, model stealing, black box, ML, neural networks, MLaaS, cloud computing, outsource, third party, secure, robust, and defense. The combinations of search keywords used are depicted in Figure 3 . We then created search strategies with controlled or index terms given in Figure 3 . Please note that no lower limit for the publication date was applied; the last search date was June 2020. The researchers (WI and AI) searched additional articles through citations and by snowballing on Google Scholar. Any disagreement was adjudicated by the third reviewer (AQ). Finally, articles focusing on the attack/defense for cloud-based ML models were retrieved.
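As an illustration of how keyword lists like the one above can be turned into boolean search strings, the following is a minimal sketch. The grouping of the terms into attack, defense, model, and setting groups and the helper names are assumptions made for this example; the combinations actually used in the review are those given in the authors' figure.

```python
# Hypothetical grouping of the search terms listed in the text. The actual
# combinations used in the review are those shown in the authors' figure.
ATTACK_TERMS = ["attack", "poisoning", "Trojan attack", "contamination",
                "model inversion", "evasion", "backdoor", "model stealing",
                "black box"]
DEFENSE_TERMS = ["secure", "robust", "defense"]
MODEL_TERMS = ["ML", "neural networks", "MLaaS"]
SETTING_TERMS = ["cloud computing", "outsource", "third party"]


def or_group(terms):
    """Join terms into one parenthesized OR clause, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """AND together one OR clause per keyword group."""
    return " AND ".join(or_group(g) for g in groups)


attack_query = build_query(ATTACK_TERMS, MODEL_TERMS, SETTING_TERMS)
defense_query = build_query(DEFENSE_TERMS, MODEL_TERMS, SETTING_TERMS)

if __name__ == "__main__":
    print(attack_query)
    print(defense_query)
```

Each digital library has its own query syntax, so strings built this way would likely need adapting per source.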

FIGURE 3. Search queries used to identify publications to include in the systematic review.

2.3.2 Inclusion and Exclusion Criteria

The inclusion and exclusion criteria followed for this systematic review are defined below.

2.3.2.1 Inclusion Criteria

The following are the key points that we considered for screening retrieved articles as relevant for conducting a systematic review.

• We included all articles relevant to the research questions and published in the English language that discuss attacks on cloud-based ML services, for example, those offered by cloud computing service providers.

• We then assessed the eligibility of the relevant articles by identifying whether they discussed either attack or defense for cloud-based ML/DL models.

• We included comparative studies that compare attacks, and robustness against them, for different well-known attacks on cloud-hosted ML services (poisoning attacks, black box attacks, Trojan attacks, backdoor attacks, contamination attacks, inversion, stealing, and evasion attacks).

• Finally, we categorized the selected articles into three categories, that is, articles on attacks, articles on defenses, and articles on attacks and defenses.

2.3.2.2 Exclusion Criteria

The exclusion criteria are outlined below.

• Articles that are written in a language other than English.

• Articles not available in full text.

• Secondary studies (e.g., systematic literature reviews, surveys, editorials, and abstracts or short papers) are not included.

• Articles that do not discuss attacks and defenses for cloud-based/third-party ML services, that is, we only consider those articles which have proposed an attack or defense for a cloud-hosted ML or MLaaS service.
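The exclusion criteria above can be read as a simple filter over article metadata. The sketch below is one hypothetical encoding; the `Article` record and its field names are assumptions for illustration and do not come from the review itself.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # All field names are hypothetical; they mirror the criteria in the text.
    title: str
    language: str
    full_text_available: bool
    is_secondary_study: bool  # survey, review, editorial, abstract, short paper
    proposes_cloud_attack_or_defense: bool


def exclusion_reasons(article):
    """Return the exclusion criteria the article triggers (empty list if none)."""
    reasons = []
    if article.language.lower() != "english":
        reasons.append("not written in English")
    if not article.full_text_available:
        reasons.append("full text not available")
    if article.is_secondary_study:
        reasons.append("secondary study")
    if not article.proposes_cloud_attack_or_defense:
        reasons.append("no attack or defense for a cloud-hosted ML service")
    return reasons


def is_eligible(article):
    """An article is eligible only when no exclusion criterion applies."""
    return not exclusion_reasons(article)
```

Returning the reasons rather than a bare boolean makes it easy to document why each article was excluded, which a flow diagram of the selection process requires.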

2.3.3 Screening Phase

For the screening of articles, we employ two phases based on the content of the retrieved articles: 1) title and abstract screening and 2) full text of the publication. Please note that to avoid bias and to ensure that the judgment about the relevancy of articles is entirely based on the content of the publications, we intentionally do not consider authors, publication type (e.g., conference and journal), and publisher (e.g., IEEE and ACM). Titles and abstracts might not be true reflectors of the articles’ contents; however, we concluded that our review protocol is sufficient to avoid provenance-based bias.

It is very common for the same work to be published in multiple venues; for example, conference papers are often extended into journal articles. In such cases, we only consider the original article. In the screening phase, every article was screened by at least two authors of this article, who were tasked with annotating it as relevant, not relevant, or needing further investigation. Articles in the last category were discussed between the authors until each was marked either relevant or not relevant. Only original technical articles were selected, while survey and review articles were ignored. Finally, all selected publications were thoroughly read by the authors for categorization and thematic analysis.
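The annotation step can be sketched as a small decision rule over reviewer labels. The unanimity rule below is an assumption: the text only states that at least two authors screened each article and that unresolved cases were settled by discussion.

```python
from enum import Enum


class Label(Enum):
    RELEVANT = "relevant"
    NOT_RELEVANT = "not relevant"
    NEEDS_INVESTIGATION = "needs further investigation"


def screening_outcome(labels):
    """Combine the labels given by at least two reviewers.

    Returns a decisive label when all reviewers agree on it. Otherwise
    returns NEEDS_INVESTIGATION, meaning the article goes to discussion.
    The unanimity rule is an assumption made for this sketch.
    """
    if len(labels) < 2:
        raise ValueError("each article needs at least two reviewers")
    unique = set(labels)
    if len(unique) == 1 and Label.NEEDS_INVESTIGATION not in unique:
        return labels[0]
    return Label.NEEDS_INVESTIGATION
```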

3 Review Results

3.1 Overview of the Search and Selection Process Outcome

The search using the aforementioned strategy identified a total of 4,384 articles. After removing duplicate articles, title, and abstract screening, the overall number of articles reduced to 384. A total of 230 articles did not meet the inclusion criteria and were therefore excluded. From the remaining 154 articles, 123 articles did not discuss attack/defense for third-party cloud-hosted ML models and were excluded as well. Of the remaining articles, a total of 31 articles are identified as relevant. Reasons for excluding articles were documented and reported in a PRISMA flow diagram, depicted in Figure 4 . These articles were categorized into three classes, that is, articles that are specifically focused on attacks, articles that are specifically focused on defenses, and articles that considered both attacks and defenses containing 19, 6, and 6 articles each, respectively.
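The counts reported above can be checked with a few lines of arithmetic. The sketch below only restates numbers from the text; the variable names are chosen for this example.

```python
# Counts reported in the text for each stage of the selection process.
identified = 4384        # articles returned by the search
after_screening = 384    # after deduplication and title/abstract screening
failed_inclusion = 230   # did not meet the inclusion criteria
out_of_scope = 123       # no attack/defense for third-party cloud-hosted ML

remaining = after_screening - failed_inclusion
selected = remaining - out_of_scope

# Category sizes reported for the selected articles.
by_focus = {"attack": 19, "defense": 6, "attack and defense": 6}

# Articles dropped during deduplication and title/abstract screening.
removed_early = identified - after_screening

assert remaining == 154
assert selected == 31
assert sum(by_focus.values()) == selected
```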

FIGURE 4. Flowchart of systematic review and categorization.

3.2 Overview of the Selected Studies

The systematic review eventually identified a set of 31 articles related to cloud-based ML/DL models and MLaaS, which we categorized into three classes as mentioned above and shown in Figure 4. As shown in Figure 5, a significant portion of the selected articles were published in conferences (41.94%); comparatively, a much smaller proportion were published in journals or transactions (19.35%). The percentage of gray literature (i.e., non-peer-reviewed articles) is 25.81%. A very small proportion of publications appeared in symposia (6.45%), and the same percentage holds for workshop papers. The distribution of selected publications by their types over the years is shown in Figure 6. The figure shows that interest in the security of cloud-hosted ML/DL models increased in 2017, peaked in 2018, and was slightly lower in 2019. Also, the majority of the articles during these years were published in conferences. The distribution of selected publications by their publishers over the years is depicted in Figure 7; the figure shows that the majority of the publications have been published at IEEE, ACM, and arXiv. The number of articles in 2017, 2018, and 2019 follows the same trend discussed previously.
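The reported percentages are consistent with whole-number article counts out of 31. The sketch below infers those counts by rounding; the counts themselves are not stated in the text and are an inference, and the dictionary keys are labels chosen for this example.

```python
TOTAL_SELECTED = 31

# Percentages as reported in the text.
reported_share = {
    "conference": 41.94,
    "journal or transactions": 19.35,
    "gray literature": 25.81,
    "symposium": 6.45,
    "workshop": 6.45,
}

# Inferred counts: nearest integer to share * total (not stated in the text).
inferred_count = {
    kind: round(TOTAL_SELECTED * share / 100)
    for kind, share in reported_share.items()
}

# Round-trip check: recompute each share from the inferred count.
recomputed_share = {
    kind: round(100 * n / TOTAL_SELECTED, 2)
    for kind, n in inferred_count.items()
}

if __name__ == "__main__":
    for kind, n in inferred_count.items():
        print(f"{kind}: {n} of {TOTAL_SELECTED}")
```

Under this reading the inferred counts sum to 31 and reproduce every reported percentage to two decimal places.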

FIGURE 5. Distribution of selected publications according to their types.

FIGURE 6. Distribution of selected publications by types over years.

FIGURE 7. Distribution of selected publications by publishers over years.

3.3 Some Partially Related Non-Selected Studies: A Discussion

We have described our inclusion and exclusion criteria that help us to identify relevant articles. We note, however, that some seemingly relevant articles failed to meet the inclusion criteria. Here, we briefly describe a few such articles to explain why they were not included.

• Liang et al. (2016) investigated the security challenges for the client-side classifiers via a case study on the Google’s phishing pages filter, a very widely used classifier for automatically detecting unknown phishing pages. They devised an attack that is not relevant to the cloud-based service.

• Demetrio et al. (2020) presented WAF-A-MoLE, a tool that models the presence of an adversary. This tool leverages a set of mutation operators that alter the syntax of a payload without affecting the original semantics. Using the results, the authors demonstrated that ML-based WAFs are exposed to a concrete risk of being bypassed. However, this attack is not associated with any cloud-based services.

• Apruzzese et al. (2019) discussed adversarial attacks in which the machine learning model is compromised to induce an output favorable to the attacker. These attacks are realized in a setting outside the scope of this systematic review, as we only included articles that discuss an attack or defense when the cloud is outsourcing its services as MLaaS.

• Han et al. (2020) conducted the first systematic study of the practical traffic space evasion attack on learning-based network intrusion detection systems; again, it falls outside the inclusion criteria of our work.

• Chen et al. (2018) designed and evaluated three types of attackers targeting the training phases to poison the detection. To address this threat, the authors proposed a detection system, KuafuDet, and showed that it significantly reduces false negatives and boosts detection accuracy.

• Song et al. (2020) presented a federated defense approach for mitigating the effect of adversarial perturbations in a federated learning environment. This article can be potentially relevant for our study as they address the problem of defending cloud-hosted ML models; however, instead of using a third-party service, the authors conducted the experiments on a single computer system in a simulated environment; therefore, this study is not included in the analysis of this article.

• In a similar study, Zhang et al. (2019) presented a defense mechanism for defending adversarial attacks on cloud-aided automatic speech recognition (ASR); however, it is not explicitly stated that the cloud is outsourcing ML services and also which ML/DL model or MLaaS was used in experiments.

4 Attacks on Cloud-Hosted Machine Learning Models (Q1)

In this section, we present the findings from the systematically selected articles that aim at attacking cloud-hosted/third-party ML/DL models.

4.1 Attacks on Cloud-Hosted Machine Learning Models: Thematic Analysis

In ML practice, it is very common to outsource the training of ML/DL models to third-party services that provide high computational resources on the cloud. Such services enable ML practitioners to upload their models along with training data, which are then trained on the cloud. Although such services have clear benefits in reducing training and inference time, they can easily be compromised, and to this end, different types of attacks against these services have been proposed in the literature. In this section, we present the thematic analysis of 19 articles that are focused on attacking cloud-hosted ML/DL models. These articles are classified into five major themes: 1) attack type, 2) threat model, 3) attack method, 4) target model(s), and 5) dataset.

Attack type: A wide variety of attacks have been proposed in the literature. These are listed below with their descriptions provided in the next section.

• Adversarial attacks ( Brendel et al., 2017 );

• Backdoor attacks 6 ( Chen et al., 2017 ; Gu et al., 2019 );

• Cyber kill chain–based attack ( Nguyen, 2017 );

• Data manipulation attacks ( Liao et al., 2018 );

• Evasion attacks ( Hitaj et al., 2019 );

• Exploration attacks ( Sethi and Kantardzic, 2018 );

• Model extraction attacks ( Correia-Silva et al., 2018 ; Kesarwani et al., 2018 ; Joshi and Tammana, 2019 ; Reith et al., 2019 );

• Model inversion attacks ( Yang et al., 2019 );

• Model-reuse attacks ( Ji et al., 2018 );

• Trojan attacks ( Liu et al., 2018 ).

Threat model: The attacks in the reviewed articles are realized under the following threat models.

• Black box attacks (no knowledge) ( Brendel et al., 2017 ; Chen et al., 2017 ; Hosseini et al., 2017 ; Correia-Silva et al., 2018 ; Sethi and Kantardzic, 2018 ; Hitaj et al., 2019 );

• White box attacks (full knowledge) ( Liao et al., 2018 ; Liu et al., 2018 ; Gu et al., 2019 ; Reith et al., 2019 );

• Gray box attacks (partial knowledge) ( Ji et al., 2018 ; Kesarwani et al., 2018 ).

Attack method: In each article, a different type of method is proposed for attacking cloud-hosted ML/DL models; a brief description of these methods is presented in Table 1 and is discussed in detail in the next section.

TABLE 1 . Summary of the state-of-the-art attack types for cloud-based/third-party ML/DL models.

Target model(s): Considered studies have used different MLaaS services (e.g., Google Cloud ML Services ( Hosseini et al., 2017 ; Salem et al., 2018 ; Sethi and Kantardzic, 2018 ), ML models of BigML Platform ( Kesarwani et al., 2018 ), IBM’s visual recognition ( Nguyen, 2017 ), and Amazon Prediction APIs ( Reith et al., 2019 ; Yang et al., 2019 )).

Dataset: These attacks have been realized using different datasets ranging from small size datasets (e.g., MNIST ( Gu et al., 2019 ) and Fashion-MNIST ( Liu et al., 2018 )) to large size datasets (e.g., YouTube Aligned Face Dataset ( Chen et al., 2017 ), Project Wolf Eye ( Nguyen, 2017 ), and Iris dataset ( Joshi and Tammana, 2019 )). Other datasets include California Housing, Boston House Prices, UJIIndoorLoc, and IPIN 2016 Tutorial ( Reith et al., 2019 ), FaceScrub, CelebA, and CIFAR-10 ( Yang et al., 2019 ). A summary of thematic analyses of these attacks is presented in Table 1 and briefly described in the next section.

4.2 Taxonomy of Attacks on Cloud-Hosted Machine Learning Models

In this section, we present a taxonomy and description of different attacks described above in thematic analysis. A taxonomy of attacks on cloud-hosted ML/DL models is depicted in Figure 8 and is described next.

FIGURE 8 . Distribution of selected publications by publishers over years.

4.2.1 Adversarial Attacks

In recent years, DL models have been found vulnerable to carefully crafted imperceptible adversarial examples ( Goodfellow et al., 2014 ). For instance, a decision-based adversarial attack, namely, the boundary attack, against two black box ML models trained for brand and celebrity recognition and hosted at Clarifai.com is proposed in ( Brendel et al., 2017 ). The first model identifies brand names from natural images for 500 distinct brands, and the second model recognizes over 10,000 celebrities. To date, a variety of methods for generating adversarial examples have been proposed in the literature; interested readers are referred to recent survey articles for a detailed taxonomy of different types of adversarial attacks (e.g., Akhtar and Mian, 2018 ; Yuan et al., 2019 ; Qayyum et al., 2020b ; Demetrio et al., 2020 ).
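
As a rough illustration of what label-only (decision-based) access allows, the following Python sketch bisects the straight line between an original input and a known adversarial starting point until it sits next to the decision boundary. It is only an initialization-style step and not the full boundary attack of Brendel et al. (2017); the toy decision function `is_adversarial`, the two-dimensional inputs, and the step count are assumptions made for this example.

```python
import math

def is_adversarial(x):
    """Hypothetical black box decision: True when the returned label differs
    from the label of the original input (toy rule, assumed for this sketch)."""
    return x[0] + x[1] > 1.0

def bisect_to_boundary(original, adversarial, steps=30):
    """Shrink the distance to the original input while staying adversarial."""
    lo, hi = 0.0, 1.0  # interpolation weight toward the adversarial start point
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        candidate = [o + mid * (a - o) for o, a in zip(original, adversarial)]
        if is_adversarial(candidate):
            hi = mid  # still adversarial: move closer to the original
        else:
            lo = mid  # crossed the boundary: back off
    return [o + hi * (a - o) for o, a in zip(original, adversarial)]

original = [0.2, 0.2]   # not adversarial under the toy rule
start = [1.0, 1.0]      # adversarial starting point
closest = bisect_to_boundary(original, start)
```

Only the yes/no answer of the model is used, which is why attacks of this family apply to cloud-hosted models that expose nothing but the final decision.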

4.2.2 Exploratory Attacks

These attacks are inference time attacks in which an adversary attempts to evade the underlying ML/DL model, for example, by forcing the classifier (i.e., ML/DL model) to misclassify a positive sample as a negative one. Exploratory attacks do not harm the training data and only affect the model at test time. A data-driven exploratory attack that uses the Seed – Explore – Exploit strategy to evade Google’s cloud prediction API in black box settings is presented in ( Sethi and Kantardzic, 2018 ). The proposed framework was evaluated using 10 real-world datasets.

4.2.3 Model Extraction Attacks

In model extraction attacks, adversaries query the deployed ML model and use the query–response pairs to compromise future predictions; they can also potentially breach the privacy of the training data and steal the model by learning from extraction queries. In Kesarwani et al. (2018) , the authors presented a novel method for quantifying the extraction status of models for users with an increasing number of queries, which measures the model learning rate using the information gain observed in the query and response streams of users. The key objective of the authors was to design a cloud-based system for monitoring model extraction status and issuing warnings. The proposed method was evaluated using a decision tree model deployed on the BigML MLaaS platform for different adversarial attack scenarios. Similarly, a model extraction/stealing strategy is presented by Correia-Silva et al. (2018) . The authors queried the cloud-hosted DL model with random unlabeled samples and used its predictions to create a fake dataset. They then used the fake dataset to train a fake (copycat) model in an attempt to achieve performance similar to that of the target model.
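
The query-then-train loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method of Correia-Silva et al. (2018): the black box `target_api` is a hypothetical linear rule, and a small logistic regression trained by gradient descent stands in for the copycat network.

```python
import math
import random

def target_api(x):
    """Hypothetical black box: the attacker only sees the returned label."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0

def random_queries(n, rng):
    """Random unlabeled samples used as extraction queries."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def train_copycat(samples, labels, epochs=300, lr=0.5):
    """Fit a logistic-regression substitute with full-batch gradient descent."""
    w = [0.0, 0.0, 0.0]  # two weights and a bias
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (x0, x1), y in zip(samples, labels):
            z = w[0] * x0 + w[1] * x1 + w[2]
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad[0] += err * x0
            grad[1] += err * x1
            grad[2] += err
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w

def copycat_predict(w, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else 0

rng = random.Random(0)
queries = random_queries(500, rng)
stolen_labels = [target_api(x) for x in queries]   # the "fake dataset"
copycat = train_copycat(queries, stolen_labels)

# Measure how often the copycat agrees with the target on fresh inputs.
fresh = random_queries(1000, rng)
agreement = sum(copycat_predict(copycat, x) == target_api(x) for x in fresh) / len(fresh)
```

The attacker never sees the target's parameters or training data; the agreement rate on fresh inputs is one simple way to judge how much of the model's behavior has been copied.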

4.2.4 Backdooring Attacks

In backdooring attacks, an adversary maliciously creates a trained model that performs as well as expected on the users’ training and validation data but performs badly on attacker-chosen input samples. Backdooring attacks on deep neural networks (DNNs) are explored and evaluated in ( Gu et al., 2019 ). The authors first explored the properties of backdooring on a toy example by creating a backdoored model for a handwritten digit classifier and then demonstrated that backdoors are powerful for DNNs by creating a backdoored model for a United States street sign classifier. Two scenarios were considered: outsourced training of the model, and transfer learning in which an attacker can acquire a backdoored pretrained model online. In another similar study ( Chen et al., 2017 ), a targeted backdoor attack on two state-of-the-art face recognition models, that is, DeepID ( Sun et al., 2014 ) and VGG-Face ( Parkhi et al., 2015 ), is presented. The authors proposed two categories of backdoor poisoning attacks, that is, input–instance–key attacks and pattern–key attacks, which use two different data poisoning strategies, that is, input–instance–key strategies and pattern–key strategies, respectively.
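
To make the idea of a pattern-style backdoor concrete, the Python sketch below stamps a small trigger onto a fraction of the training images and relabels them with the attacker's target class. It is a generic illustration and not the exact procedure of Gu et al. (2019) or Chen et al. (2017); the trigger shape, the poisoning rate, the target label, and the helper names are all assumptions.

```python
import random

def stamp_trigger(image, value=1.0, size=2):
    """Return a copy of a 2D image with a small square set in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(len(stamped) - size, len(stamped)):
        for c in range(len(stamped[r]) - size, len(stamped[r])):
            stamped[r][c] = value
    return stamped

def poison_dataset(images, labels, target_label, rate, rng):
    """Stamp the trigger on a random fraction of samples and relabel them."""
    n_poison = int(len(images) * rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_trigger(img))
            out_labels.append(target_label)
        else:
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, chosen

rng = random.Random(1)
clean_images = [[[0.0] * 4 for _ in range(4)] for _ in range(20)]  # toy 4x4 images
clean_labels = [i % 3 for i in range(20)]
p_images, p_labels, poisoned_idx = poison_dataset(clean_images, clean_labels, 7, 0.25, rng)
```

A model trained on `p_images` and `p_labels` would be expected to behave normally on clean inputs while associating the corner pattern with label 7, which is what makes such attacks hard to spot with ordinary validation data.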

4.2.5 Trojan Attacks

In Trojan attacks, the attacker inserts malicious content into the system that looks legitimate but can take over control of the system. The purpose of Trojan insertion can vary, for example, stealing, disruption, misbehaving, or getting intended behavior. In Liu et al. (2018) , the authors proposed a stealth infection on neural networks, namely, SIN2, to realize practical supply chain triggered neural Trojan attacks. They also proposed a variety of Trojan insertion strategies for agile and practical Trojan attacks. As a proof of concept, the authors developed a prototype of the proposed neural Trojan attack (i.e., SIN2) in a Linux sandbox and used the Torch ( Collobert et al., 2011 ) ML/DL framework to build visual recognition models using the Fashion-MNIST dataset.

4.2.6 Model-Reuse Attacks

In model-reuse attacks, an adversary creates a malicious model (i.e., adversarial model) that influences the host model to misbehave on targeted inputs (i.e., triggers) in an extremely predictable fashion, that is, getting a sample classified into a specific (intended) class. For instance, model-reuse attacks on four pretrained primitive DL models (i.e., speech recognition, autonomous steering, face verification, and skin cancer screening) are experimentally evaluated by Ji et al. (2018) .

4.2.7 Data Manipulation Attacks

Those attacks in which training data are manipulated to get intended behavior by the ML/DL model are known as data manipulation attacks. Data manipulation attacks for stealthily manipulating traditional supervised ML techniques and logistic regression (LR) and CNN models are studied by Liao et al. (2018) . In the attack strategy, the authors added a new constraint on fully connected layers of the models and used gradient descent for retraining them, and other layers were frozen (i.e., were made non-trainable).

4.2.8 Cyber Kill Chain–Based Attacks

Kill chain is a term, usually used in the military, that defines the steps for attacking a target. In cyber kill chain–based attacks, the cloud-hosted ML/DL models are attacked; for example, a high-level threat model targeting the ML cyber kill chain is presented by Nguyen (2017) . The authors also provided a proof of concept through a case study using IBM visual recognition MLaaS (i.e., a cognitive classifier for classifying cats and female lions) and provided recommendations for ensuring secure and robust ML.

4.2.9 Membership Inference Attacks

In a typical membership inference attack, given input data and black box access to the ML model, an attacker attempts to figure out whether the given input sample was part of the training set. To realize a membership inference attack against a target model, a classification model is trained to distinguish between the target model’s predictions on inputs on which it was trained and its predictions on inputs on which it was not trained ( Shokri et al., 2017 ).
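
A heavily simplified version of this idea is sketched below in Python. Instead of training a full attack classifier as in Shokri et al. (2017), it fits a single cut-off on the top prediction confidence, on the assumption that models tend to be more confident on samples they were trained on. The confidence values are made up for illustration, and `fit_threshold` and `infer_membership` are hypothetical helper names.

```python
def fit_threshold(member_conf, nonmember_conf):
    """Pick the confidence cut-off that best separates known members from non-members."""
    best_t, best_acc = 0.0, -1.0
    total = len(member_conf) + len(nonmember_conf)
    for t in sorted(set(member_conf + nonmember_conf)):
        correct = sum(c >= t for c in member_conf) + sum(c < t for c in nonmember_conf)
        if correct / total > best_acc:
            best_t, best_acc = t, correct / total
    return best_t, best_acc

def infer_membership(confidence, threshold):
    """Guess 'member' when the target model is at least this confident."""
    return confidence >= threshold

# Made-up top-class confidences from a model the attacker controls,
# for which membership of each sample is known.
known_members = [0.99, 0.97, 0.95, 0.92, 0.90]
known_nonmembers = [0.91, 0.80, 0.72, 0.65, 0.55]
threshold, fit_acc = fit_threshold(known_members, known_nonmembers)
```

The fitted threshold would then be applied to confidences returned by the cloud-hosted target model, for example `infer_membership(0.98, threshold)`.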

4.2.10 Evasion Attacks

Evasion attacks are inference time attacks in which an adversary attempts to modify the test data for getting the intended outcome from the ML/DL model. Two evasion attacks against watermarking techniques for DL models hosted as MLaaS have been presented by Hitaj et al. (2019) . The authors used five publicly available models and trained them for distinguishing between watermarked and clean (non-watermarked) images, that is, binary image classification tasks.

4.2.11 Model Inversion Attacks

In model inversion attacks, an attacker tries to learn about the training data using the model’s outcomes. Two model inversion techniques have been proposed by Yang et al. (2019) , that is, training an inversion model using an auxiliary set composed by utilizing the adversary’s background knowledge, and a truncation-based method for aligning the inversion model. The authors evaluated their proposed methods on a commercial prediction MLaaS named Amazon Rekognition.

5 Toward Securing Cloud-Hosted Machine Learning Models (Q2)

In this section, we present the insights from the systematically selected articles that provide tailored defenses against specific attacks, and we report the articles that, along with creating attacks, propose countermeasures against them for cloud-hosted/third-party ML/DL models.

5.1 Defenses for Attacks on Cloud-Hosted Machine Learning Models: Thematic Analysis

Leveraging cloud-based ML services for computational offloading and minimizing the communication overhead is accepted as a promising trend. While cloud-based prediction services have significant benefits, sharing the model and the training data raises many privacy and security challenges. Several attacks can compromise the model and data integrity, as described in the previous section. To avoid such issues, users can download the model and make inferences locally. However, this approach has certain drawbacks: it raises confidentiality issues, service providers cannot update the models, adversaries can use the model to develop evading strategies, and the privacy of the user data is compromised. To outline the countermeasures against these attacks, we present the thematic analysis of six articles that are focused on defenses tailored to attacks on cloud-hosted ML/DL models or data. In addition, we provide the thematic analysis of six articles that, along with an attack, propose a defense against it. These articles are classified into five major themes: 1) attack type, 2) defense, 3) target model(s), 4) dataset, and 5) measured outcomes. The thematic analysis of these systematically reviewed articles is given below.

Considered attacks for developing defenses: The defenses proposed in the reviewed articles are developed against the following specific attacks.

• Extraction attacks ( Tramèr et al., 2016 ; Liu et al., 2017 );

• Inversion attacks ( Liu et al., 2017 ; Sharma and Chen, 2018 );

• Adversarial attacks ( Hosseini et al., 2017 ; Wang et al., 2018b ; Rouhani et al., 2018 );

• Evasion attacks ( Lei et al., 2020 );

• GAN attacks ( Sharma and Chen, 2018 );

• Privacy threat attacks ( Hesamifard et al., 2017 );

• Side channel and cache-timing attacks ( Jiang et al., 2018 );

• Membership inference attacks ( Shokri et al., 2017 ; Salem et al., 2018 ).

Most of the aforementioned attacks are elaborated in previous sections. However, in the selected articles that are identified as either defense or attack and defense articles, some attacks are specifically created, for instance, GAN attacks, side channel, cache-timing attack, privacy threats, etc. Therefore, the attacks are worth mentioning in this section to explain the specific countermeasures proposed against them in the defense articles.

Defenses against different attacks: To provide resilience against these attacks, the authors of selected articles proposed different defense algorithms, which are listed below against each type of attack.

• Extraction attacks: MiniONN ( Liu et al., 2017 ), rounding confidences, differential privacy, and ensemble methods ( Tramèr et al., 2016 );

• Adversarial attacks: ReDCrypt ( Rouhani et al., 2018 ) and Arden ( Wang et al., 2018b );

• Inversion attacks: MiniONN ( Liu et al., 2017 ) and image disguising techniques ( Sharma and Chen, 2018 );

• Privacy attacks: encryption-based defense ( Hesamifard et al., 2017 ; Jiang et al., 2018 );

• Side channel and cache-timing attacks: encryption-based defense ( Hesamifard et al., 2017 ; Jiang et al., 2018 );

• Membership inference attack: dropout and model stacking ( Salem et al., 2018 ).

Target model(s): Different cloud-hosted ML/DL models have been used for the evaluation of the proposed defenses, as shown in Table 2 .

TABLE 2 . Summary of attack types and corresponding defenses for cloud-based/third-party ML/DL models.

Dataset(s) used: The robustness of these defenses has been evaluated using various datasets ranging from small size datasets (e.g., MNIST ( Liu et al., 2017 ; Wang et al., 2018b ; Rouhani et al., 2018 ; Sharma and Chen, 2018 ) and CIFAR-10 ( Liu et al., 2017 ; Wang et al., 2018b ; Sharma and Chen, 2018 )) to large size datasets (e.g., Iris dataset ( Tramèr et al., 2016 ), fertility and climate dataset ( Hesamifard et al., 2017 ), and breast cancer ( Jiang et al., 2018 )). Other datasets include Crab dataset ( Hesamifard et al., 2017 ), Face dataset and Traffic signs dataset ( Tramèr et al., 2016 ), SVHN ( Wang et al., 2018b ), and Edinburgh MI, WI-Breast Cancer, and MONKs Prob ( Jiang et al., 2018 ). Each of the defense techniques discussed above is mapped in Table 2 to the specific attack for which it was developed.

Measured outcomes: The defenses are evaluated on the following measured outcomes: response latency and message sizes ( Liu et al., 2017 ; Wang et al., 2018b ), throughput comparison ( Rouhani et al., 2018 ), average cache miss rate per second ( Sharma and Chen, 2018 ), AUC and space complexity to demonstrate approximated storage costs ( Jiang et al., 2018 ), classification accuracy of the model as well as running time ( Hesamifard et al., 2017 ; Sharma and Chen, 2018 ), similarity index ( Lei et al., 2020 ), and training time ( Hesamifard et al., 2017 ; Jiang et al., 2018 ).

5.2 Taxonomy of Defenses on Cloud-Hosted Machine Learning Model Attacks

In this section, we present a taxonomy and summary of different defensive strategies against attacks on cloud-hosted ML/DL models, as described above in the thematic analysis. A taxonomy of these defense strategies is presented in Figure 9 and is described next.

FIGURE 9 . Taxonomy of different attacks realized on the third-party cloud-hosted machine learning (ML) or deep learning (DL) models.

5.2.1 MiniONN

DNNs are vulnerable to model inversion and extraction attacks. Liu et al. (2017) proposed that, without making any changes to the training phase of the model, it is possible to change the model into an oblivious neural network. They made nonlinear functions such as tanh and sigmoid more flexible, and by training the models on several datasets, the authors demonstrated significant results with minimal loss in accuracy. In addition, they implemented an offline precomputation phase to perform incremental encryption operations along with the SIMD batch processing technique.

5.2.2 ReDCrypt

A reconfigurable hardware-accelerated framework is proposed by Rouhani et al. (2018) , for protecting the privacy of deep neural models in cloud networks. The authors perform an innovative and power-efficient implementation of Yao’s Garbled Circuit (GC) protocol on FPGAs for preserving privacy. The proposed framework is evaluated for different DL applications, and it has achieved up to 57-fold throughput gain per core.

5.2.3 Arden

To offload a large portion of DNNs from mobile devices to the cloud while keeping the framework secure, a privacy-preserving mechanism, Arden, is proposed by Wang et al. (2018b) . Before the data are uploaded from the mobile device to the cloud, they are perturbed and noisy samples are included to make the data secure. To verify the robustness, the authors performed a rigorous analysis based on three image datasets and demonstrated that this defense is capable of preserving user privacy along with inference performance.

5.2.4 Image Disguising Techniques

While leveraging services from a cloud GPU server, an adversary can realize an attack by introducing maliciously crafted training data, performing model inversion, and using the model to obtain desirable incentives and outcomes. To protect against such attacks and to preserve the data as well as the model, Sharma and Chen (2018) proposed an image disguising mechanism. They developed a toolkit that can be leveraged to calibrate certain parameter settings. They claim that disguised images with block-wise permutation and transformations are resilient to GAN-based attacks and model inversion attacks.

5.2.5 Homomorphic Encryption

To secure the cloud services of outsourced MLaaS, Hesamifard et al. (2017) proposed a privacy-preserving framework using homomorphic encryption. They trained the neural network on encrypted data and then performed encrypted predictions. The authors demonstrated that by carefully choosing polynomial approximations of the activation functions to adapt the neural networks, it is possible to achieve the desired accuracy along with privacy-preserving training and classification.
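
Polynomials matter here because homomorphic encryption schemes typically support only additions and multiplications on ciphertexts, so an activation such as the sigmoid has to be replaced by a polynomial. As a hedged illustration, the Python sketch below compares the sigmoid with its degree-3 Taylor polynomial around zero on plaintext values; the choice of a Taylor expansion and of the interval [-1, 1] is an assumption for this example and is not necessarily the approximation used by Hesamifard et al. (2017).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def poly_sigmoid(x):
    """Degree-3 Taylor polynomial of the sigmoid around 0.
    Uses only additions and multiplications by constants."""
    return 0.5 + 0.25 * x - (x * x * x) * (1.0 / 48.0)

# Worst-case gap between the two functions on a grid over [-1, 1].
grid = [i / 100.0 for i in range(-100, 101)]
max_err = max(abs(sigmoid(x) - poly_sigmoid(x)) for x in grid)
```

The approximation is only accurate near zero, so in practice the inputs to the activation have to be kept in a bounded range, and the polynomial degree trades accuracy against the cost of encrypted multiplications.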

In a similar study, to preserve the privacy of outsourced biomedical data and computation on public cloud servers, Jiang et al. (2018) built a homomorphically encrypted model that reinforces the hardware security through Software Guard Extensions. They combined homomorphic encryption and Software Guard Extensions to devise a hybrid model for the security of the most commonly used model for biomedical applications, that is, LR. The robustness of the Secure LR framework is evaluated on various datasets, and the authors also compared its performance with state-of-the-art secure LR solutions and demonstrated its superior efficiency.

5.2.6 Pelican

Lei et al. (2020) proposed three mutation-based evasion attacks and a sample-based collision attack in white-, gray-, and black box scenarios. They evaluated the attacks and demonstrated a 100% success rate of attack on Google’s phishing page filter classifier, while a success rate of up to 81% for the transferability on Bitdefender TrafficLight. To deal with such attacks and to increase the robustness of classifiers, they proposed a defense method known as Pelican.

5.2.7 Rounding Confidences and Differential Privacy

Tramèr et al. (2016) presented the model extraction attacks against the online services of BigML and Amazon ML. The attacks are capable of model evasion, monetization, and can compromise the privacy of training data. The authors also proposed and evaluated countermeasures such as rounding confidences against equation-solving and decision tree pathfinding attacks; however, this defense has no impact on the regression tree model attack. For the preservation of training data, differential privacy is proposed; this defense reduces the ability of an attacker to learn insights about the training dataset. The impact of both defenses is evaluated on the attacks for different models, while the authors also proposed ensemble models to mitigate the impact of attacks; however, their resilience is not evaluated.
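
The rounding countermeasure is simple to express in code. The Python sketch below is a minimal illustration, assuming the service returns a list of class probabilities; `round_confidences` is a hypothetical helper and not an API of BigML or Amazon ML.

```python
def round_confidences(probs, decimals=2):
    """Coarsen the returned class probabilities to a fixed number of decimals."""
    return [round(p, decimals) for p in probs]

# Two different raw prediction vectors become indistinguishable after rounding.
raw_a = [0.7341, 0.2033, 0.0626]
raw_b = [0.7012, 0.2391, 0.0597]
out_a = round_confidences(raw_a, decimals=1)
out_b = round_confidences(raw_b, decimals=1)
```

Because many distinct raw outputs map to the same rounded vector, each query leaks less about the model's parameters; the predicted class is unchanged unless two classes fall into the same rounded value.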

5.2.8 Increasing Entropy and Reducing Precision

Membership inference attacks trained with shadow training techniques against black box models in the cloud-based Google Prediction API and Amazon ML are studied by Shokri et al. (2017) . The attack does not require prior knowledge of the training data distribution. The authors emphasize that countermeasures should be designed to protect the privacy of medical-related datasets or other public-related data. For instance, the prediction vector can be restricted to the top k classes, which prevents the leakage of important information, or the classification probabilities in the prediction can be rounded down or up. They show that regularization can be effective in coping with overfitting and in increasing the randomness of the prediction vector.
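
As a small illustration of the top k restriction mentioned above, the Python sketch below keeps the k largest class probabilities and reports the remaining classes as zero. The helper name `restrict_to_top_k` and the example vector are assumptions for this sketch.

```python
def restrict_to_top_k(probs, k):
    """Keep the k largest class probabilities and report the rest as 0."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    return [p if i in keep else 0.0 for i, p in enumerate(probs)]

full_vector = [0.05, 0.60, 0.10, 0.25]
top2 = restrict_to_top_k(full_vector, k=2)
top1 = restrict_to_top_k(full_vector, k=1)
```

With k equal to 1 the service effectively returns only the most likely class and its score, which is the most restrictive form of this countermeasure.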

5.2.9 Dropout and Model Stacking

In the study by Salem et al. (2018) , the authors created three diverse attacks and tested their applicability on eight datasets, six of which are the same as those used by Shokri et al. (2017) ; a news dataset and a face dataset are additionally included in this work. In the threat model, the authors considered black box access to the target model, which is a supervised ML classifier trained for binary classification. To mitigate the privacy threats, the authors proposed a dropout-based method that reduces the impact of an attack by randomly deleting a proportion of edges in each training iteration in a fully connected neural network. The second defense strategy is model stacking, which hierarchically organizes multiple ML models to avoid overfitting. After extensive evaluation, these defense techniques showed the potential to reduce the performance of the membership inference attack.
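
The edge-deletion step of the dropout-based defense can be sketched as follows in Python. This shows only the random masking of a weight matrix for one training iteration, under the assumption that each edge is dropped independently; it is not the training procedure of Salem et al. (2018), and `drop_edges` is a hypothetical helper.

```python
import random

def drop_edges(weights, rate, rng):
    """Zero each weight (edge) independently with probability `rate`.
    Returns a new matrix; the stored weights are left untouched."""
    return [[0.0 if rng.random() < rate else w for w in row] for row in weights]

rng = random.Random(0)
layer = [[1.0] * 50 for _ in range(20)]       # toy 20x50 fully connected layer
masked = drop_edges(layer, rate=0.5, rng=rng) # mask used for one iteration
n_dropped = sum(w == 0.0 for row in masked for w in row)
```

A fresh mask would be drawn in every training iteration, so the model cannot rely on any single edge, which tends to reduce the overfitting that membership inference exploits.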

5.2.10 Randomness to Video Analysis Algorithms

Hosseini et al. (2017) designed two attacks specifically to analyze the robustness of video classification and shot detection. The attacks can subtly manipulate the content of a video in such a way that the change is undetected by humans, while the output of the automatic video analysis method is altered. Because the API generates the video and shot labels by processing only the first video frame of every second, the attacks can successfully deceive the API. To deal with the shot removal and generation attacks, the authors proposed the inclusion of randomness for enhancing the robustness of the algorithms. Although the authors thoroughly evaluated the applicability of these attacks in different video settings, the proposed defense is not rigorously evaluated.
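
The difference between deterministic and randomized frame selection can be illustrated with the short Python sketch below. It assumes a fixed frame rate and models the attacker as tampering exactly with the first frame of every second; the function names and the 25 frames per second figure are assumptions for this example.

```python
import random

def first_frame_per_second(n_frames, fps):
    """Deterministic sampling: always the first frame of every second."""
    return list(range(0, n_frames, fps))

def random_frame_per_second(n_frames, fps, rng):
    """Randomized sampling: one frame chosen uniformly within every second."""
    return [rng.randrange(start, min(start + fps, n_frames))
            for start in range(0, n_frames, fps)]

n_frames, fps = 100, 25
tampered = set(first_frame_per_second(n_frames, fps))  # frames the attacker replaced

deterministic = first_frame_per_second(n_frames, fps)
randomized = random_frame_per_second(n_frames, fps, random.Random(0))

# Fraction of analyzed frames that are attacker-controlled in each case.
hit_rate_det = sum(i in tampered for i in deterministic) / len(deterministic)
hit_rate_rand = sum(i in tampered for i in randomized) / len(randomized)
```

With deterministic sampling every analyzed frame is attacker-controlled, whereas with random sampling the attacker would have to alter far more frames to achieve the same effect, at the cost of the change becoming visible to human viewers.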

5.2.11 Neuron Distance Threshold and Obfuscation

Transfer learning is an effective technique for quickly building DL Student models in which knowledge from a Teacher model is transferred to a Student model. However, Wang et al. (2018a) discussed that, due to the centralization of model training, the vulnerability to misclassification attacks for image recognition on black box Student models increases. The authors proposed several defenses to mitigate the impact of such an attack, such as changing the internal representation of the Student model from that of the Teacher model. Other defense methods include increasing dropout randomization, which alters the Student model training process, modifying input data before classification, adding redundancy, and using an orthogonal model against the transfer learning attack. The authors analyzed the robustness of these defenses and demonstrated that the neuron distance threshold is the most effective in obfuscating the identity of the Teacher model.

6 Pitfalls and Limitations

6.1 Lack of Attack Diversity

The attacks presented in the selected articles have limited scope and lack diversity, that is, they are limited to a specific setting, and the variability of attacks is limited as well. However, the diversity of attacks is an important consideration for developing robust attacks from the perspective of adversaries, as it makes the detection and prevention of the attacks difficult. The diversity of attacks ultimately helps in the development of robust defense strategies. Moreover, the empirical evaluation of attack variability can identify the potential vulnerabilities of cybersecurity systems. Therefore, to build a more robust defense solution, it is important to test the model robustness under a diverse set of attacks.

6.2 Lack of Consideration for Adaptable Adversaries

Most of the defenses in the systematically reviewed articles are proposed for a specific attack and do not consider adaptable adversaries. In practice, however, adversarial attacks are an arms race between attackers and defenders. That is, the attackers continuously evolve and enhance their knowledge and attacking strategies to evade the underlying defensive system. Therefore, the consideration of adaptable adversaries is crucial for developing a robust and long-lasting defense mechanism. If we do not consider this, the adversary will adapt to our defensive system over time and will bypass it to get the intended behavior or outcomes.

6.3 Limited Progress in Developing Defenses

From the systematically selected articles that are collected from different databases, only 12 articles have presented defense methods for the proposed attack as compared to the articles that are focused on attacks, that is, 19. In these 12 articles, six have only discussed/presented a defense strategy and six have developed a defense against a particular attack. This indicates that there is limited activity from the research community in developing defense strategies for already proposed attacks in the literature. In addition, the proposed defenses only mitigate or detect those attacks for which they have been developed, and therefore, they are not generalizable. On the contrary, the increasing interest in developing different attacks and the popularity of cloud-hosted/third-party services demand a proportionate amount of interest in developing defense systems as well.

7 Open Research Issues

7.1 Adversarially Robust Machine Learning Models

In recent years, adversarial ML attacks have emerged as a major threat to ML/DL models, and the systematically selected articles have highlighted the threat of these attacks for cloud-hosted ML/DL models as well. Moreover, the diversity of these attacks is increasing drastically compared with that of the defensive strategies, which can pose serious challenges and consequences for the security of cloud-hosted ML/DL models. Each defense method presented in the literature so far has been shown to be resilient to a particular attack realized in specific settings, and it fails to withstand stronger and unseen attacks. Therefore, the development of adversarially robust ML/DL models remains an open research problem, while the literature suggests that worst-case robustness analysis should be performed while considering adversarial ML settings ( Qayyum et al., 2020a ; Qayyum et al., 2020b ; Ilahi et al., 2020 ). In addition, it has been argued in the literature that most ML developers and security incident responders are unequipped with the required tools for securing industry-grade ML systems against adversarial ML attacks ( Kumar et al., 2020 ). This indicates the increasing need for the development of defense strategies for securing ML/DL models against adversarial ML attacks.

7.2 Privacy-Preserving Machine Learning Models

In cloud-hosted ML services, preserving user privacy is fundamentally important and is a matter of high concern. Also, it is desirable that ML models built using users’ data should not learn information that can compromise the privacy of the individuals. However, the literature on developing privacy-preserving ML/DL models or MLaaS is limited. On the other hand, one of the privacy-preserving techniques that have been used for privacy protection for building a defense system for cloud-hosted ML/DL models, that is, the homomorphic encryption-based protocol ( Jiang et al., 2018 ), has been shown vulnerable to model extraction attack ( Reith et al., 2019 ). Therefore, the development of privacy-preserving ML models for cloud computing platforms is another open research problem.

7.3 Proxy Metrics for Evaluating Security and Robustness

From the systematically reviewed literature on the security of cloud-hosted ML/DL models, we observe that the interest from the research community in the development of novel security-centric proxy metrics for the evaluation of security threats and model robustness of cloud-hosted models is very limited. However, with the increasing proliferation of cloud-hosted ML services (i.e., MLaaS) and with the development/advancements of different attacks (e.g., adversarial ML attacks), the development of effective and scalable metrics for evaluating the robustness of ML/DL models toward different attacks and defense strategies is required.

8 Threats to Validity

We now briefly reflect on our methodology in order to identify any threats to the validity of our findings. First, internal validity is maintained as the research questions we pose in Section 2.2 capture the objectives of the study. Construct validity relies on a sound understanding of the literature and how it represents the state of the field. A detailed study of the reviewed articles along with deep discussions between the members of the research team helped ensure the quality of this understanding. Note that the research team is of diverse skills and expertise in ML, DL, cloud computing, ML/DL security, and analytics. Also, the inclusion and exclusion criteria (Section 2.3) help define the remit of our survey. Data extraction is prone to human error as is always the case. This was mitigated by having different members of the research team review each reviewed article. However, we did not attempt to evaluate the quality of the reviewed studies or validate their content due to time constraints. In order to minimize selection bias, we cast a wide net in order to capture articles from different communities publishing in the area of MLaaS via a comprehensive set of bibliographical databases without discriminating based on the venue/source.

9 Conclusion

In this article, we presented a systematic review of literature that is focused on the security of cloud-hosted ML/DL models, also known as MLaaS. The relevant articles were collected from eight major publishers that include ACM Digital Library, IEEE Xplore, ScienceDirect, International Conference on Machine Learning, International Conference on Learning Representations, Journal of Machine Learning Research, USENIX, Neural Information Processing Systems, and arXiv. For the selection of articles, we developed a review protocol that includes inclusion and exclusion criteria, analyzed the selected articles that fulfill these criteria across two dimensions (i.e., attacks and defenses) on MLaaS, and provided a thematic analysis of these articles across five attack and five defense themes, respectively. We also identified the limitations and pitfalls from the reviewed literature, and finally, we highlighted various open research issues that require further investigation.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Author Contributions

AQ led the writing of the manuscript and also performed the annotation and analysis of the data. AI performed data acquisition, annotation, and analysis from four venues, and contributed to the paper write-up. MU contributed to writing a few sections, annotated papers, and helped with the analysis. WI performed data scraping, annotation, and analysis from four venues, and helped in developing graphics. The first four authors validated the data and analysis and contributed to the interpretation of the results. AQ and AI helped in developing and refining the methodology for this systematic review. JQ conceived the idea and supervised the overall work. JQ, YEK, and AF provided critical feedback and helped shape the research, analysis, and manuscript. All authors contributed to the final version of the manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1 We use MLaaS to cover both ML and DL as a Service cloud provisions.

2 https://cloud.google.com/ml-engine/ .

3 A popular Python library for DL.

4 https://azure.microsoft.com/en-us/services/machine-learning-service/ .

5 https://docs.aws.amazon.com/dlami/latest/devguide/AML2_0.html .

6 Backdoor attacks on cloud-hosted models can be further divided into three categories (Chen et al., 2020): 1) complete model–based attacks, 2) partial model–based attacks, and 3) model-free attacks.

Akhtar, N., and Mian, A. (2018). Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430. doi:10.1109/access.2018.2807385


Apruzzese, G., Colajanni, M., Ferretti, L., and Marchetti, M. (2019). “Addressing adversarial attacks against security systems based on machine learning,” in 2019 11th International conference on cyber conflict (CyCon) , Tallinn, Estonia , May 28–31, 2019 ( IEEE ), 900, 1–18


Brendel, W., Rauber, J., and Bethge, M. (2017). “Decision-based adversarial attacks: reliable attacks against black-box machine learning models,” in International Conference on Learning Representations (ICLR)

Chen, S., Xue, M., Fan, L., Hao, S., Xu, L., Zhu, H., et al. (2018). Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning approach. Comput. Secur. 73, 326–344. doi:10.1016/j.cose.2017.11.007

Chen, X., Liu, C., Li, B., Lu, K., and Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning. arXiv

Chen, Y., Gong, X., Wang, Q., Di, X., and Huang, H. (2020). Backdoor attacks and defenses for deep neural networks in outsourced cloud environments. IEEE Network 34 (5), 141–147. doi:10.1109/MNET.011.1900577

Collobert, R., Kavukcuoglu, K., and Farabet, C. (2011). “Torch7: a Matlab-like environment for machine learning,” in BigLearn, NIPS workshop .

Correia-Silva, J. R., Berriel, R. F., Badue, C., de Souza, A. F., and Oliveira-Santos, T. (2018). “Copycat CNN: stealing knowledge by persuading confession with random non-labeled data,” in 2018 International joint conference on neural networks (IJCNN) , Rio de Janeiro, Brazil , July 8–13, 2018 ( IEEE ), 1–8

Demetrio, L., Valenza, A., Costa, G., and Lagorio, G. (2020). “Waf-a-mole: evading web application firewalls through adversarial machine learning,” in Proceedings of the 35th annual ACM symposium on applied computing , Brno, Czech Republic , March 2020 , 1745–1752

Gong, Y., Li, B., Poellabauer, C., and Shi, Y. (2019). “Real-time adversarial attacks,” in Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI) , Macao, China , August 2019

Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv

Gu, T., Liu, K., Dolan-Gavitt, B., and Garg, S. (2019). BadNets: evaluating backdooring attacks on deep neural networks. IEEE Access 7, 47230–47244. doi:10.1109/access.2019.2909068

Han, D., Wang, Z., Zhong, Y., Chen, W., Yang, J., Lu, S., et al. (2020). Practical traffic-space adversarial attacks on learning-based nidss. arXiv

Hesamifard, E., Takabi, H., Ghasemi, M., and Jones, C. (2017). “Privacy-preserving machine learning in cloud,” in Proceedings of the 2017 on cloud computing security workshop , 39–43

Hilprecht, B., Härterich, M., and Bernau, D. (2019). “Monte Carlo and reconstruction membership inference attacks against generative models,” in Proceedings on Privacy Enhancing Technologies , Stockholm, Sweden , July 2019 , 2019, 232–249

Hitaj, D., Hitaj, B., and Mancini, L. V. (2019). “Evasion attacks against watermarking techniques found in MLaaS systems,” in 2019 sixth international conference on software defined systems (SDS) , Rome, Italy , June 10–13, 2019 ( IEEE )

Hosseini, H., Xiao, B., Clark, A., and Poovendran, R. (2017). “Attacking automatic video analysis algorithms: a case study of google cloud video intelligence API,” in Proceedings of the 2017 conference on multimedia Privacy and security (ACM) , 21–32

Ilahi, I., Usama, M., Qadir, J., Janjua, M. U., Al-Fuqaha, A., Hoang, D. T., et al. (2020). Challenges and countermeasures for adversarial attacks on deep reinforcement learning. arXiv

Ji, Y., Zhang, X., Ji, S., Luo, X., and Wang, T. (2018). “Model-reuse attacks on deep learning systems,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (New York, NY: ACM), December 2018, 349–363

Jiang, Y., Hamer, J., Wang, C., Jiang, X., Kim, M., Song, Y., et al. (2018). Securelr: secure logistic regression model via a hybrid cryptographic protocol. IEEE ACM Trans. Comput. Biol. Bioinf 16, 113–123. doi:10.1109/TCBB.2018.2833463

Joshi, N., and Tammana, R. (2019). “GDALR: an efficient model duplication attack on black box machine learning models,” in 2019 IEEE international Conference on system, computation, Automation and networking (ICSCAN) , Pondicherry, India , March 29–30, 2019 ( IEEE ), 1–6

Kesarwani, M., Mukhoty, B., Arya, V., and Mehta, S. (2018). Model extraction warning in MLaaS paradigm. In Proceedings of the 34th Annual Computer Security Applications Conference (ACM) , 371–380

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems , 1097–1105 Available at: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Kumar, R. S. S., Nyström, M., Lambert, J., Marshall, A., Goertzel, M., Comissoneru, A., et al. (2020). Adversarial machine learning–industry perspectives. arXiv . Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3532474

Lei, Y., Chen, S., Fan, L., Song, F., and Liu, Y. (2020). Advanced evasion attacks and mitigations on practical ml-based phishing website classifiers. arXiv

Liang, B., Su, M., You, W., Shi, W., and Yang, G. (2016). “Cracking classifiers for evasion: a case study on the google’s phishing pages filter,” in Proceedings of the 25th international conference on world wide web Montréal, Québec, Canada , 345–356

Liao, C., Zhong, H., Zhu, S., and Squicciarini, A. (2018). “Server-based manipulation attacks against machine learning models,” in Proceedings of the eighth ACM conference on data and application security and privacy (ACM) , New York, NY , March 2018 , 24–34

Liu, J., Juuti, M., Lu, Y., and Asokan, N. (2017). “Oblivious neural network predictions via minionn transformations,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, October 2017, 619–631

Liu, T., Wen, W., and Jin, Y. (2018). “SIN 2: stealth infection on neural network—a low-cost agile neural Trojan attack methodology,” in 2018 IEEE international symposium on hardware oriented security and trust (HOST), Washington, DC, April 30–May 4, 2018 (IEEE), 227–230

Nguyen, T. N. (2017). Attacking machine learning models as part of a cyber kill chain. arXiv

Parkhi, O. M., Vedaldi, A., Zisserman, A., et al. (2015). Deep face recognition. Bmvc 1, 6. doi:10.5244/C.29.41

Qayyum, A., Qadir, J., Bilal, M., and Al-Fuqaha, A. (2020a). Secure and robust machine learning for healthcare: a survey. IEEE Rev. Biomed. Eng. , 1. doi:10.1109/RBME.2020.3013489

Qayyum, A., Usama, M., Qadir, J., and Al-Fuqaha, A. (2020b). Securing connected & autonomous vehicles: challenges posed by adversarial machine learning and the way forward. IEEE Commun. Surv. Tutorials 22, 998–1026. doi:10.1109/comst.2020.2975048

Reith, R. N., Schneider, T., and Tkachenko, O. (2019). “Efficiently stealing your machine learning models,” in Proceedings of the 18th ACM workshop on privacy in the electronic society , November 2019 , 198–210

Rouhani, B. D., Hussain, S. U., Lauter, K., and Koushanfar, F. (2018). Redcrypt: real-time privacy-preserving deep learning inference in clouds using fpgas. ACM Trans. Reconfigurable Technol. Syst. 11, 1–21. doi:10.1145/3242899

Saadatpanah, P., Shafahi, A., and Goldstein, T. (2019). Adversarial attacks on copyright detection systems. arXiv .

Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., and Backes, M. (2018). ML-leaks: model and data independent membership inference attacks and defenses on machine learning models. arXiv .

Sehwag, V., Bhagoji, A. N., Song, L., Sitawarin, C., Cullina, D., Chiang, M., et al. (2019). Better the devil you know: an analysis of evasion attacks using out-of-distribution adversarial examples. arXiv .

Sethi, T. S., and Kantardzic, M. (2018). Data driven exploratory attacks on black box classifiers in adversarial domains. Neurocomputing 289, 129–143. doi:10.1016/j.neucom.2018.02.007

Sharma, S., and Chen, K. (2018). “Image disguising for privacy-preserving deep learning,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (ACM, Toronto, Canada), 2291–2293

Shokri, R., Stronati, M., Song, C., and Shmatikov, V. (2017). “Membership inference attacks against machine learning models,” in 2017 IEEE Symposium on Security and privacy (SP) , San Jose, CA , May 22–26, 2017 ( IEEE ), 3–18

Simonyan, K., and Zisserman, A. (2015). “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations (ICLR)

Song, Y., Liu, T., Wei, T., Wang, X., Tao, Z., and Chen, M. (2020). Fda3: federated defense against adversarial attacks for cloud-based iiot applications. IEEE Trans. Industr. Inform. , 1. doi:10.1109/TII.2020.3005969

Sun, Y., Wang, X., and Tang, X. (2014). “Deep learning face representation from predicting 10,000 classes,” in Proceedings of the IEEE conference on computer vision and pattern recognition , Columbus, OH , June 23–28, 2014 , ( IEEE ).

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, June 27–30, 2016 (IEEE), 2818–2826

Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., and Ristenpart, T. (2016). “Stealing machine learning models via prediction APIs,” in 25th USENIX security symposium (USENIX Security 16) , 601–618

Tyndall, J. (2010). AACODS checklist . Adelaide, Australia: Adelaide Flinders University

Usama, M., Mitra, R. N., Ilahi, I., Qadir, J., and Marina, M. K. (2020a). Examining machine learning for 5g and beyond through an adversarial lens. arXiv . Available at: https://arxiv.org/abs/2009.02473 .

Usama, M., Qadir, J., Al-Fuqaha, A., and Hamdi, M. (2020b). The adversarial machine learning conundrum: can the insecurity of ML become the achilles' heel of cognitive networks? IEEE Network 34, 196–203. doi:10.1109/mnet.001.1900197

Usama, M., Qayyum, A., Qadir, J., and Al-Fuqaha, A. (2019). “Black-box adversarial machine learning attack on network traffic classification,” in 2019 15th international wireless communications and mobile computing conference (IWCMC), Tangier, Morocco, June 24–28, 2019

Wang, B., Yao, Y., Viswanath, B., Zheng, H., and Zhao, B. Y. (2018a). “With great training comes great vulnerability: practical attacks against transfer learning,” in 27th USENIX security symposium (USENIX Security 18) , Baltimore, MD , August 2018 , 1281–1297

Wang, J., Zhang, J., Bao, W., Zhu, X., Cao, B., and Yu, P. S. (2018b). “Not just privacy: improving performance of private deep learning in mobile cloud,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining London, United Kingdom , January 2018 , 2407–2416

Yang, Z., Zhang, J., Chang, E.-C., and Liang, Z. (2019). “Neural network inversion in adversarial setting via background knowledge alignment,” in Proceedings of the 2019 ACM SIGSAC conference on computer and communications security , London, UK , November 2019 , 225–240

Yuan, X., He, P., Zhu, Q., and Li, X. (2019). Adversarial examples: attacks and defenses for deep learning. IEEE Trans. Neural. Netw. Learn. Syst. 30 (9), 2805–2824. doi:10.1109/TNNLS.2018.2886017

Zhang, J., Zhang, B., and Zhang, B. (2019). “Defending adversarial attacks on cloud-aided automatic speech recognition systems,” in Proceedings of the seventh international workshop on security in cloud computing, New York, 23–31. Available at: https://dl.acm.org/doi/proceedings/10.1145/3327962

Keywords: Machine Learning as a Service, cloud-hosted machine learning models, machine learning security, cloud machine learning security, systematic review, attacks, defenses

Citation: Qayyum A, Ijaz A, Usama M, Iqbal W, Qadir J, Elkhatib Y and Al-Fuqaha A (2020) Securing Machine Learning in the Cloud: A Systematic Review of Cloud Machine Learning Security. Front. Big Data 3:587139. doi: 10.3389/fdata.2020.587139

Received: 24 July 2020; Accepted: 08 October 2020; Published: 12 November 2020.


Copyright © 2020 Qayyum, Ijaz, Usama, Iqbal, Qadir, Elkhatib and Al-Fuqaha. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Adnan Qayyum, [email protected]

This article is part of the Research Topic

Safe and Trustworthy Machine Learning

Advances, Systems and Applications

  • Open access
  • Published: 15 February 2024

Investigation on storage level data integrity strategies in cloud computing: classification, security obstructions, challenges and vulnerability

  • Paromita Goswami 1 , 2 ,
  • Neetu Faujdar 2 ,
  • Somen Debnath 3 ,
  • Ajoy Kumar Khan 1 &
  • Ghanshyam Singh 4  

Journal of Cloud Computing volume 13, Article number: 45 (2024)


Cloud computing provides outsourcing of computing services at a lower cost, making it a popular choice for many businesses. In recent years, cloud data storage has gained significant success, thanks to its advantages in maintenance, performance, support, cost, and reliability compared to traditional storage methods. However, despite the benefits of disaster recovery, scalability, and resource backup, some organizations still prefer traditional data storage over cloud storage due to concerns about data correctness and security. Data integrity is a critical issue in cloud computing, as data owners need to rely on third-party cloud storage providers to handle their data. To address this, researchers have been developing new algorithms for data integrity strategies in cloud storage to enhance security and ensure the accuracy of outsourced data. This article aims to highlight the security issues and possible attacks on cloud storage, as well as discussing the phases, characteristics, and classification of data integrity strategies. A comparative analysis of these strategies in the context of cloud storage is also presented. Furthermore, the overhead parameters of auditing system models in cloud computing are examined, considering the desired design goals. By understanding and addressing these factors, organizations can make informed decisions about their cloud storage solutions, taking into account both security and performance considerations.

Introduction

Cloud computing’s appeal lies in its dynamic and flexible Service Level Agreement (SLA) based negotiable services, allowing users to access virtually limitless computing resources [ 1 ]. According to the National Institute of Standards and Technology (NIST), cloud computing offers a swiftly provisioned pay-per-use model, enabling on-demand, accessible, and configurable network access to a shared pool of resources, requiring minimal interaction with service providers and reduced management effort [ 2 ]. Cloud deployment models include private, public, hybrid, and community clouds, with services categorized into Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS providers like Google Compute Engine, Windows Azure Virtual Machines, and Amazon Elastic Compute Cloud offer network resources and computing storage, enhancing performance and reducing maintenance costs to meet specific customer demands [ 3 , 4 ]. This evolution in cloud computing has transformed various sectors. Businesses and healthcare organizations benefit from services like cost reduction through resource outsourcing [ 3 , 4 ], performance monitoring [ 5 , 6 ], resource management [ 7 ], and computing prediction [ 8 ]. Additionally, cloud computing facilitates tasks such as resource allocation [ 9 ], workload distribution [ 10 , 11 , 12 ], capacity planning [ 13 ], and job-based resource distribution [ 14 , 15 ]. This transformative impact underscores the significance of cloud computing in modern digital landscapes, giving organizations greater efficiency and scalability in resource utilization [ 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 ].

Despite the availability of various data services, data owners are apprehensive about entrusting their valuable data to cloud service providers (CSPs) for third-party cloud storage due to concerns about the integrity of the CSPs [ 13 , 16 , 17 ], and the shared nature of cloud storage environments. Cloud computing primarily encompasses data storage and computation, with Infrastructure as a Service (IaaS) closely linked to cloud storage. When accessing IaaS, cloud users often lack visibility into the precise location of their outsourced data within the cloud storage and the machines responsible for processing tasks. Consequently, data privacy within cloud storage is a significant security challenge, exacerbated by the presence of malicious users, resulting in data integrity and confidentiality issues. This poses a critical security challenge for cloud storage, and trust in remote cloud data storage is crucial for the success of cloud computing. Data integrity, encompassing completeness, correctness, and consistency, is vital in the context of Database Management Systems (DBMS) and the ACID (Atomicity, Consistency, Isolation, Durability) properties of transactions. The issue arises when CSPs cannot securely guarantee clients the accuracy and completeness of data in response to their queries [ 18 ].

Researchers are actively advancing the field of data integrity in cloud computing by refining data integrity verification techniques and strengthening privacy-preserving methods. These verification techniques primarily encompass Proof of Ownership (PoW), Provable Data Possession (PDP), and Proof of Retrievability (PoR). Notably, the introduction of a Message Authentication Code (MAC) computed with a unique random key marked a deterministic approach to data integrity verification, mitigating the inefficiencies of remote data integrity schemes that employed RSA-based encryption. This approach addressed the significant computation time and long hash value transfer times incurred for large files [ 19 ]. To enhance the security of data integrity schemes, PDP concepts were introduced to establish that a cloud server legitimately possesses the data. Subsequent research efforts have continually refined these algorithms, introducing innovations such as the Transparent PDP scheme [ 20 ], DHT-PDP [ 21 ], the Certificateless PDP Protocol for Multiple Copies [ 22 , 23 , 24 ], and Dynamic Multiple-Replica PDP [ 25 ]. Concurrently, the PoR concept was introduced in 2007 to address error localization and data recovery issues [ 26 ]. Additionally, PoW emerged in 2011 through the Merkle hash tree protocol to guard against malicious adversaries, leading to a large body of subsequent research with improved algorithms aimed at the same goals [ 27 , 28 , 29 ].
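As a rough illustration of the Merkle hash tree idea mentioned above, the following minimal sketch builds a tree over file blocks and checks an inclusion proof for one block against the root. It is not the protocol of any cited scheme: the function names (`build_levels`, `make_proof`, `verify_proof`), the choice of SHA-256, and the rule of pairing an odd node with itself are assumptions made for illustration, and the domain separation between leaves and inner nodes that a hardened design would use is omitted.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return every tree level, leaves first and the root level last.

    A node without a partner is paired with itself (illustrative convention).
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def make_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):      # unpaired node: its sibling is itself
            sibling = index
        node_is_left = index % 2 == 0
        proof.append((level[sibling], node_is_left))
        index //= 2
    return proof


def verify_proof(block, proof, root):
    """Recompute the root from one block and its sibling path."""
    node = h(block)
    for sibling, node_is_left in proof:
        node = h(node + sibling) if node_is_left else h(sibling + node)
    return node == root


if __name__ == "__main__":
    blocks = [b"block-0", b"block-1", b"block-2", b"block-3", b"block-4"]
    levels = build_levels(blocks)
    root = levels[-1][0]
    proof = make_proof(levels, 4)
    print(verify_proof(blocks[4], proof, root))    # genuine block
    print(verify_proof(b"tampered", proof, root))  # modified block
```

The verifier only needs the root and a number of sibling hashes that grows logarithmically with the number of blocks, which is why tree-based proofs are attractive when the full file is large.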

Fully homomorphic encryption (FHE) was proposed to preserve the privacy of outsourced data: the original data are converted into ciphertext by an encryption technique that supports multiplication and addition operations over the ciphertext [ 30 ]. The drawbacks of [ 22 ], such as being practically infeasible due to complex operations, were later addressed by the Somewhat Homomorphic Encryption (SHE) scheme of [ 31 ]. Many more research works have appeared in recent years, such as a biometric face recognition approach [ 32 ], a privacy-preserving auditing scheme for cloud storage using HLA [ 33 ], and an etiquette approach for preserving data [ 34 ].
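To make the idea of computing over ciphertext concrete, the sketch below uses a toy version of the Paillier cryptosystem. Paillier is swapped in here purely for illustration: it is only additively homomorphic (a partially homomorphic scheme), not FHE or SHE, and it is not the construction of any cited work. The tiny primes, the helper names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`), and the choice g = n + 1 are assumptions of this sketch and offer no real security.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key generation from two (tiny, insecure) primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # common simplification for the generator
    mu = pow(lam, -1, n)      # valid because g = n + 1 (needs Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    n, g = pub
    while True:               # pick random r coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(61, 53)            # illustrative primes only
    c1, c2 = encrypt(pub, 17), encrypt(pub, 25)
    total = add_encrypted(pub, c1, c2)    # computed without decrypting
    print(decrypt(pub, priv, total))
```

In this setting a server could add two encrypted values without ever seeing 17 or 25; FHE extends the same principle to both addition and multiplication, at a much higher computational cost.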

Recently, Google Cloud has introduced Zebra technologies based on a security command center (SCC) and security operation center (SOC) to detect harmful threats such as crypto-mining activity, data exfiltration, potential malware infections, and brute-force SSH attacks, in order to maintain the data integrity of business organizations’ information [ 35 ].

In recent years, numerous cloud data integrity schemes have emerged, along with several survey papers, albeit with limited parameters to comprehensively address specific aspects of data integrity. Some of these surveys include data auditing from single copies to multiple replicas [ 36 ], Proof of Retrievability [ 37 ], various data integrity techniques and verification types for cloud storage, and different data integrity protocols [ 38 ]. However, these surveys often fall short in providing a comprehensive understanding of data integrity strategies and their classification. A concise taxonomy of data integrity schemes was presented in a survey paper [ 39 ], which discussed a comparative analysis of existing data integrity schemes, their evolution from 2007 to 2015, and covered fewer physical storage issues, fewer security challenges, and design considerations. This survey paper aims to address this gap by offering an in-depth discussion on the security challenges within physical cloud storage, potential threats, attacks, and their mitigations. It will also categorize data integrity schemes, outline their phases and characteristics, provide a comparative analysis, and project future trends. This comprehensive approach underscores the significance of data integrity schemes in securing cloud storage.

Although several articles address similar issues, our research work differs from them in the following ways. Unlike [ 36 , 37 , 39 ], our work focuses on different types of storage-based attacks and covers up-to-date methods for resisting such attacks, which consistently violate data integrity schemes on physical cloud storage. Like [ 37 ], it includes storage-based security issues, threats, and their existing mitigation solutions. Unlike [ 36 , 37 , 39 ], our work focuses on the different proposals for data integrity verification, which are broadly classified into file-level verification, entire-block verification, metadata verification, and random block-level verification.

Unlike [ 37 ], our survey is not restricted to proof of retrievability (PoR). It covers all verification types, namely proof of ownership (PoW), proof of retrievability (PoR), and provable data possession (PDP). It also covers different auditing verification techniques to elaborate the roles of the third-party auditor (TPA) and the data owner (DO), and discusses how public auditing reduces the computational and communication overhead of the DO. Unlike [ 36 , 37 , 38 , 40 , 41 , 42 , 43 ], our survey reviews a wide range of quality features of data integrity schemes, each of which is of prime importance to cloud storage security. Unlike [ 36 , 37 , 41 ], we focus on different types of security challenges according to the existing symptoms, effects, and probable solutions of data integrity schemes. Like [ 42 , 43 , 44 ], we include a discussion of malicious insider attacks, forgery attacks, and dishonest TPAs and CSPs. Unlike [ 41 , 43 , 44 ], our comparative analysis introduces different performance analysis parameters of existing works based on their motivations and limitations, in addition to a discussion of public and private data auditing criteria. Like [ 32 ], we briefly include all existing data integrity methods in the Comparative analysis of data integrity strategies section.

Research gap

According to the above discussion, this research focuses on the following points to summarize the research gaps:

In contrast to [ 36 , 37 , 39 ], our research included current strategies to fend against storage-based attacks, which consistently compromise data integrity techniques on physical cloud storage.

Our research, in contrast to [ 36 , 37 , 39 ], concentrated on the various approaches to data integrity verification, which is categorised into four categories: file-level verification, full block verification, metadata verification, and randomized block-level verification.

Our survey study is not limited to proof of retrievability (PoR), in contrast to [ 37 ]. It includes all forms of verification, including provable data possession (PDP), proof of retrievability (PoR), and proof of ownership (PoW). Different key management techniques used to improve the security of cloud storage are also covered.

In contrast to [ 36 , 37 , 38 , 40 , 41 , 42 , 43 ], our survey work examines a variety of data integrity scheme quality features, each of which is crucial to the security of cloud storage.

In contrast to [ 36 , 37 , 41 ], we concentrated on various security issues based on the impacts, symptoms, and likely fixes of data integrity techniques.

In contrast to [ 41 , 43 , 44 ], we present here various performance analysis parameters of previous efforts based on the goals and constraints of the work together with a discussion of auditing criteria for both public and private data.
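The randomized block-level verification named in the list above can be sketched as a simple spot check. This is a minimal illustration under stated assumptions rather than any surveyed scheme: it uses plain HMAC-SHA256 tags that the data owner keeps a key for, so the server must return the challenged blocks in full, whereas PDP and PoR schemes typically use aggregatable tags to avoid that cost. The helper names (`tag_block`, `audit`, `detection_probability`) are hypothetical. The last function computes the chance that sampling `sampled` of `n` blocks without replacement hits at least one of `corrupted` damaged blocks.

```python
import hashlib
import hmac
import math
import random
import secrets


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """MAC over (index, block) so a block cannot be swapped into another slot."""
    msg = index.to_bytes(8, "big") + block
    return hmac.new(key, msg, hashlib.sha256).digest()


def audit(key, server_blocks, server_tags, sample_size, rng=None):
    """Challenge a random subset of block indices and verify their tags."""
    rng = rng or random.SystemRandom()   # challenges should be unpredictable
    indices = rng.sample(range(len(server_blocks)), sample_size)
    return all(
        hmac.compare_digest(tag_block(key, i, server_blocks[i]), server_tags[i])
        for i in indices
    )


def detection_probability(n: int, corrupted: int, sampled: int) -> float:
    """P(at least one corrupted block is among the sampled ones)."""
    return 1 - math.comb(n - corrupted, sampled) / math.comb(n, sampled)


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    blocks = [secrets.token_bytes(64) for _ in range(1000)]
    tags = [tag_block(key, i, b) for i, b in enumerate(blocks)]
    print(audit(key, blocks, tags, sample_size=50))
    # With 1% of 10,000 blocks corrupted, 460 samples detect it with
    # probability of about 0.99.
    print(detection_probability(10_000, 100, 460))
```

The design trade-off is visible in `detection_probability`: the auditor checks only a small, fixed number of blocks regardless of file size, at the price of a probabilistic rather than deterministic guarantee.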

Contribution

To the best of our knowledge, this is the first attempt to cover all the related issues of cloud data storage, with possible directions, in a single article. The key contributions of this research paper are summarized below:

Identification of possible attacks on storage level services that may arise on physical cloud storage, along with the mitigating solutions explored so far

Summary of the possible characteristics of data integrity strategies, used to examine data integrity auditing soundness, phases, classification, etc., in order to understand and analyse security loopholes

Literature review with a comparative analysis based on characteristics, motivation, limitation, accuracy, method, and probable attacks

Discussion of design goal issues along with security level issues of data integrity strategies, to analyse dynamic performance efficiency, different key management techniques for achieving security features, server attacks, etc.

Identification of security issues in data integrity strategies and their mitigation solutions

Discussion of the future direction of new data integrity schemes in cloud computing.

The remainder of this review article is organized as follows. The Issues of physical cloud storage section discusses issues of physical cloud storage and attacks in storage level services. The Key management techniques with regards to storage level in cloud section describes some existing key management techniques that enhance the security of cloud storage. The Potential attacks in storage level service section describes potential attacks on cloud storage. The Phases of data integrity technique section describes the phases of the data integrity scheme and summarizes the possible characteristics of the data integrity strategy. The Classification of data integrity strategy section describes a classification of data integrity strategies. The Characteristics of data integrity technique section describes the characteristics of data integrity techniques. The Challenges of data integrity technique in cloud environment section describes the challenges that data integrity techniques face in the cloud environment. The Desire design challenges of data integrity strategy section describes the desired design challenges of data integrity strategies. The Comparative analysis of data integrity strategies section presents a comparative analysis of existing research works on data integrity strategies. Finally, the Future trends in data integrity approaches section discusses design goal issues and future trends of cloud storage based on existing integrity schemes, using a timeline infographic from 2016 to 2022.

Issues of physical cloud storage

Generally, physical cloud storage offered as an IaaS service gives cloud users the opportunity to use computing resources at minimum cost without taking any responsibility for infrastructure maintenance. In practice, however, neither the CSP nor other authorized users can be assumed to be trusted actors in cloud computing. Hence, cloud storage is an attack-prone area due to the malicious intentions of CSPs and of insider and outsider attackers. We list cloud storage issues here along with possible attacks. Table  1 shows possible mitigating solutions.

Incapability of CSP: Managing large cloud storage may cause data loss for the CSP due to insufficient computational capacity, an occasional inability to meet users’ requirements, the absence of a user-friendly data serialization standard with easily readable and editable syntax, and life-cycle changes in the cloud environment [ 66 ].

Loss of control over cloud data in a distributed cloud environment may give unauthorized users the chance to manipulate the valuable data of legitimate users [ 67 ].

Lack of scalability of physical cloud storage: Scalability means that all hardware resources are merged to provide more resources to the distributed cloud system. This merging may make it easier for illegitimate users to access and modify cloud storage and physical data centers [ 68 ].

Unfair resource allocation strategy: Monitoring data is generally stored in a shared pool in a public cloud environment. This may not suit cloud users who do not want a public cloud-hosted software component to leave any footprint of their work distribution or data transmission, and it may later degrade the fetching of the original data [ 69 ].

Lack of performance monitoring of cloud storage: Monitoring data is generally stored in a shared pool in a public cloud, which may not suit cloud users who do not want a public cloud-hosted software component to leave any footprint of their work distribution or data transmission [ 70 ].

Data threat: Cloud users store sensitive personal or business information in cloud environments. Because cloud service providers lack data threat prevention techniques, data may be lost or damaged [ 64 , 71 ].

Malicious cloud storage provider: A lack of transparency and a lack of access control policies are the basic indicators that a cloud service provider is a malicious storage provider. When these two are missing, it is quite easy to disclose cloud users' confidential data to others for business profit [ 72 ].

Data Pooling: Resource pooling is an important aspect of cloud computing. Due to this aspect, data recovery policies and data confidentiality schemes are broken [ 73 ].

Data lock-in: Cloud storage providers do not share a standard format for storing data. Therefore, cloud users face a lock-in problem when dynamic changes in resource requirements make them want to move data from one provider to another [ 39 ].

Security against internal and external malicious attacks: Data might be lost or modified by insider or outsider attacks [ 49 , 74 , 75 , 76 ].

Key management techniques with regard to storage level in cloud

To prevent data leakage and increase the difficulty of attack, this paper presents a method combining data distribution and data encryption to improve data storage security. Listed here are some key management techniques used in cloud storage to enhance security and transparency between cloud storage and cloud users.

Hierarchical Key Technique: Some research articles [ 77 ] combine secret sharing and key hierarchy derivation with the user's password to enhance key security, protecting the key and preventing an attacker from using it to recover the data.

Private Key Update Technique: This identity-based technique [ 78 ] updates the private keys of the non-revoked group users instead of the authenticators of the revoked user, so the authenticators themselves are not updated, and it does away with the complex certificate administration found in standard PKI systems.

Key Separation Technique: This cryptographic method aids in maintaining the privacy of shared sensitive data while offering consumers effective and efficient storage services [ 79 ].

Attribute-based Encryption Key Technique: Without disclosing decryption keys, this method achieves the conventional notion of semantic security for data secrecy, whereas existing methods achieve only a weaker security notion [ 80 , 81 ]. It is used to share data with users in a confidential manner.

Multiple Key Technique: This k-NN query-based method improves security by letting the data owner (DO) and each query user maintain separate keys that are not shared [ 82 ]. Meanwhile, the DO uses its own key to encrypt and decrypt the data that have been outsourced.
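The key hierarchy idea mentioned for the hierarchical key technique can be illustrated with a minimal sketch. It is not the scheme of [ 77 ] and omits secret sharing entirely; it only assumes that a master key is stretched from the user's password and that lower-level keys are derived from their parent with a keyed hash. The function names, labels, and iteration count are hypothetical choices made for illustration.

```python
import hashlib
import hmac
import os


def derive_master_key(password, salt, iterations=200_000):
    """Stretch a user password into a 32-byte master key (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def derive_child_key(parent_key, label):
    """Derive one lower-level key from its parent key and a label (assumed layout)."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


# Hypothetical three-level hierarchy: password -> master -> file key -> block key.
salt = os.urandom(16)
master = derive_master_key("correct horse battery staple", salt)
file_key = derive_child_key(master, "file:report.pdf")
block_key = derive_child_key(file_key, "block:0")
```

Under this layout, learning a block key does not reveal the file key or the master key, because each derivation step is a one-way keyed hash.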

Potential attacks in storage level service

Storage-level service in cloud computing leases resource computation, virtual networks, and shared storage over the internet. It is more flexible and scalable than on-premise physical hardware. Because of these two aspects of the cloud, storage-level services can become the victim of malicious attacks that attempt to steal computing resources in order to publish original data or to exfiltrate data in data breaches. If attackers successfully enter the infrastructure services of an organization, they can use those parts to obtain access to other important parts of the enterprise architecture, causing data integrity security issues. The possible attacks on storage-level services are listed here.

DoS/DDoS: The ultimate purpose of this attack is to make the original services unavailable to users by overloading a single cloud server with a flood of spam requests. Under the high workload, the performance of cloud servers slumps and users lose access to their cloud services.

Phishing: Attackers steal important information in the form of a user's credentials, such as user name and password, after redirecting the user to a fraudulent webpage that imitates the original page.

Brute Force attack/Online dictionary attack: This is a type of cryptographic attack. Using exhaustive key search, malicious attackers can violate the privacy policy of the data integrity scheme in cloud storage.

MITC: In a man-in-the-cloud attack, attackers install their own synchronization token on the victim's machine in place of the victim's original synchronization token. When the target machine synchronizes using this token, it synchronizes with the attacker's machine, which gives attackers control over the target machine and the capability to execute any code on it.

Port scanning: Attackers perform port scanning methods to identify open ports or exposed server locations, analyze the security level of storage and break into the target system.

Identity theft: Using the password recovery method, attackers can obtain the account information of legitimate users, which causes the loss of the credential information of the user's account.

Risk spoofing: Resource workload balancing is a useful management feature of cloud storage, but because of this aspect of cloud computing, attackers can steal the credential data of cloud users, spread malware code in host machines, and create internal security issues.

Data loss/leakage: Data can be lost or manipulated by external adversaries during transmission, through the incapability of cloud service providers, by unauthorized users of the same cloud environment, or by internal malicious attackers.

Shared technology issue: Using hypervisors, cloud service providers can run multiple OSs concurrently as guests on a host computer. By exploiting the weakness of the hypervisor, attackers can take control of all virtual machines and cause data loss, insider malicious attacks, outsider attacks, loss of control over machines, and service disruption.

Phases of data integrity technique

Data integrity guarantees the consistency and accuracy of data in cloud storage. Its probabilistic nature and its ability to protect stored data from unauthorized access help cloud users gain the trust needed to outsource their data to remote clouds. The scheme involves three main actors: the data owner (DO), the cloud storage/service provider (CSP), and an optional third-party auditor (TPA) [ 39 ], as depicted in Fig. 1. The data owner produces data before uploading it to cloud storage to acquire financial profit. The CSP is a third-party organization offering Infrastructure as a Service (IaaS) to cloud users. The TPA relieves the DO of the burden of data management by checking the correctness and intactness of the outsourced data. The TPA also reduces the communication overhead and the computational cost of the data owner [ 83 , 84 ]. Sometimes, the DO itself takes responsibility for data integrity verification without TPA involvement. The three phases of a data integrity strategy are described below and in Table 2:

Data processing phase: In the data processing phase, data files are processed in several ways: the file is divided into blocks [ 60 ], an encryption technique is applied to the blocks [ 90 ], a message digest is generated [ 87 ], a random masking number is generated [ 88 ], and keys are generated and a signature is applied to each encrypted block [ 93 ]. Finally, the encrypted or obfuscated data are outsourced to cloud storage.

Acknowledgement Phase: This phase is optional but valuable, because a CSP might conceal a data loss or accidentally discard data in order to maintain its image [ 88 ]. Most research works nevertheless skip this step to minimize the computational overhead of acknowledgement verification.

Integrity verification phase: In this phase, the DO/TPA sends a challenge message to the CSP, and the CSP then sends a response message, as metadata or proof information, to the TPA/DO for data integrity verification. If the verification is done by the TPA, the audit result is sent to the DO.
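The processing and verification phases above can be sketched as a small private-auditing example. This is a simplified MAC-based illustration, not any of the cited schemes: it assumes the verifier holds the secret key, the CSP returns the challenged blocks together with their tags (so it is not blockless), and the names `tag_block`, `make_challenge`, `csp_respond`, and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 4  # unrealistically small, for demonstration only


def split_blocks(data, size=BLOCK_SIZE):
    """Data processing phase: divide the file into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def tag_block(key, index, block):
    """Tag a block; the index is bound into the tag so blocks cannot be swapped."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_challenge(num_blocks, sample_size):
    """Verifier side: pick distinct random block indices to audit."""
    return sorted(secrets.SystemRandom().sample(range(num_blocks), sample_size))


def csp_respond(store, challenge):
    """CSP side: return each challenged block with its stored tag as proof."""
    return [(i, store["blocks"][i], store["tags"][i]) for i in challenge]


def verify(key, proof):
    """Verifier side: recompute every tag and compare in constant time."""
    return all(hmac.compare_digest(tag_block(key, i, b), t) for i, b, t in proof)


# The data owner prepares and outsources the file.
key = secrets.token_bytes(32)
blocks = split_blocks(b"outsourced file contents")
tags = [tag_block(key, i, b) for i, b in enumerate(blocks)]
cloud_store = {"blocks": list(blocks), "tags": list(tags)}

# One audit round: challenge three random blocks and check the proof.
challenge = make_challenge(len(blocks), 3)
print(verify(key, csp_respond(cloud_store, challenge)))  # True while the store is intact
```

Schemes that use a TPA replace the shared secret with publicly verifiable signatures so that the auditor never needs the owner's key or the raw blocks.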

Fig. 1 Entire cycle of data integrity technique

Classification of data integrity strategy

Classification of data integrity depends on a variety of conceptual parameters and sub-parameters. Table 3 lists all parameters and sub-parameters with references to give a clear idea of the data integrity strategy. The deployment setup of a data integrity strategy depends on the environment of the proposed system. Clients can store their data in a public cloud setup [ 98 ], a multi-cloud setup [ 99 , 100 ], or a hybrid cloud setup [ 101 ]. If data are placed in a public cloud setup, clients lose access control visibility over the data because data management is handed to the CSP. As a result, data integrity problems arise, because neither the CSP nor public cloud storage can be assumed honest in practical scenarios. Multi-cloud means more than one cloud service from more than one vendor in the same heterogeneous cloud architecture. A hybrid cloud is a combination of private and public clouds. Hence, in the shared storage structure of multi-cloud and hybrid cloud environments, the security of data integrity is a genuine concern. The guarantee of a data integrity scheme can be of two types: deterministic and probabilistic. Probabilistic verification performs better than deterministic verification because it can detect and correct erroneous blocks without accessing the whole file and has low computational overhead [ 102 ]. However, the deterministic approach gives adequate accuracy of data integrity, whereas the probabilistic approach gives lower accuracy than the deterministic approach [ 39 ].
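The trade-off between the two guarantees can be made concrete with a short calculation. The sketch below uses the standard sampling argument: if t of n blocks are corrupted and the verifier challenges c distinct blocks chosen uniformly at random, the chance of hitting at least one corrupted block is 1 - C(n-t, c)/C(n, c). The function name and the example numbers are illustrative assumptions, not values taken from the cited works.

```python
from math import comb


def detection_probability(n, t, c):
    """Probability that c distinct random blocks out of n include one of t corrupted blocks."""
    if not (0 <= t <= n and 0 <= c <= n):
        raise ValueError("require 0 <= t <= n and 0 <= c <= n")
    if c > n - t:
        return 1.0  # more samples than intact blocks: a corrupted block must be hit
    return 1.0 - comb(n - t, c) / comb(n, c)


# Example: 10,000 blocks, 1% corrupted, 460 blocks challenged per audit.
p = detection_probability(10_000, 100, 460)
print(f"{p:.4f}")  # slightly above 0.99
```

A deterministic check corresponds to c = n, which always detects corruption but touches every block; the probabilistic check gives up a small amount of certainty for a much smaller challenge.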

Type of proposal

File level verification: This is a deterministic verification approach. Here, data integrity verification is generally done by either the TPA or the client. The client submits an encoded file to the storage server, and a verifier verifies the encoded file using the challenge key and the secret key chosen by the client [ 103 ].

Block Level Verification : This is a deterministic verification approach. First, the file is divided into blocks, the blocks are encrypted, a message digest is generated, and the encrypted blocks are sent to the CSP. Later, the CSP sends a response message to the TPA, and the TPA verifies all blocks by comparing the newly generated message digest with the old message digest generated by the client [ 87 ].

Random block level verification: This is a probabilistic verification approach. A file is divided into blocks, and then one signature, or a combination of any two signatures, from hash [ 86 ], BLS [ 88 ], HLA [ 124 ], random masking [ 88 ], or ZSS [ 97 ] is generated for all blocks; both the blocks and the signatures are submitted to cloud storage. Later, the TPA generates a challenge message for the randomly selected blocks that are to be checked for data integrity and sends it to the CSP. Next, the CSP sends a proof message to the TPA. The TPA verifies the proof message for the randomly selected blocks by generating new signatures and comparing the old and new signatures of those blocks [ 61 , 86 ].

Metadata verification: In this deterministic approach, cloud users first generate a secret key and use it to prepare metadata for the entire file through HMAC-MD5 authentication. The encrypted file is then sent to the CSP, and the metadata is sent to the TPA. This metadata is later used for integrity verification via the TPA [ 85 ].
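The metadata verification item can be illustrated with a whole-file HMAC-MD5 tag. This is only a sketch of the tagging and comparison steps: the source does not specify how the TPA checks the metadata without the secret key, so the example assumes the party recomputing the tag holds that key. MD5 appears here only because the text names HMAC-MD5; the helper names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_metadata(secret_key, file_bytes):
    """Compute an HMAC-MD5 tag over the entire (encrypted) file."""
    return hmac.new(secret_key, file_bytes, hashlib.md5).hexdigest()


def verify_metadata(secret_key, file_bytes, stored_metadata):
    """Recompute the tag over the retrieved file and compare it with the stored one."""
    return hmac.compare_digest(make_metadata(secret_key, file_bytes), stored_metadata)


secret_key = secrets.token_bytes(16)
encrypted_file = b"ciphertext bytes held by the CSP"  # placeholder content
metadata = make_metadata(secret_key, encrypted_file)   # kept by the auditor

print(verify_metadata(secret_key, encrypted_file, metadata))         # True
print(verify_metadata(secret_key, encrypted_file + b"!", metadata))  # False
```

Because the tag covers the whole file, any single changed byte is detected, which is what makes the approach deterministic, at the cost of reading the full file on every check.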

Category of data

Static data: Static data stored in cloud storage never needs to be modified. In [ 105 ], a basic RDPC scheme is proposed for verifying the integrity of static data. In remote cloud data storage, static files receive the main attention in state-of-the-art schemes, but in practical scenarios, giving the TPA permission to possess the original data file creates security problems. In [ 106 ], the RSASS scheme is introduced for static data verification by applying a secure hash signature (SHA1) to file blocks.

Dynamic Data: Data owners face no restriction on applying update, insertion, and deletion operations, any number of times, to outsourced data currently stored in remote cloud storage. In [ 111 ], a PDP scheme based on a rank-based skip list is introduced to support fully dynamic operations on data, overcoming the limited number of insertion and query operations of the scheme described in [ 118 ]. In [ 117 ], a dynamic data graph is used to limit the conflicts caused by the dynamic nature of large graph data applications.

Verification type

Proof of ownership verification: The proof of ownership (PoW) scheme is introduced into the data integrity scheme so that the original data owner can prove actual ownership of the data to the server, and to prevent valid but malicious users in the same cloud environment from gaining unauthorized access to the owner's outsourced data. The PoW scheme is combined with a data deduplication scheme to reduce the security issues caused by a malicious user's illegal attempts to access unauthorized data [ 27 ]. Three types of PoW scheme, s-PoW, s-PoW1, and s-PoW2, are defined in [ 29 ]. They have satisfactory computation and I/O efficiency on the user side, but the I/O burden on the remote cloud is significantly increased. This problem was overcome in [ 28 ] by establishing a balance between server-side and user-side efficiency.

Provable data possession: The provable data possession (PDP) scheme statistically guarantees the exactness of integrity verification of cloud data held on untrusted cloud servers without downloading the data, and it restricts data leakage attacks at cloud storage. In [ 104 ], the authors described aspects of the PDP technique from a variety of system design perspectives, such as computation efficiency, robust verification, and lightweight and constant communication cost. In [ 112 ], a certificateless PDP is proposed for public cloud storage to address the key escrow and key management problems of general public key cryptography and to solve the security problem of [ 113 , 120 ], in which verifiers were able to extract users' original data during integrity verification.

Proof of retrievability verification: Proof of retrievability (PoR) ensures data intactness in remote cloud storage. PoR and PDP perform similar functions, with the difference that a PoR scheme can recover faulty outsourced data, whereas PDP only supports data integrity and the availability of data to clients [ 108 ]. In [ 109 ], the IPOR scheme is introduced, which ensures a 100% retrieval probability for corrupted blocks of the original data file. The DIPOR scheme also supports the retrieval of partial health records along with data update operations [ 115 ].

Auditing verification: Verification of the cloud data outsourced by the data owner is known as the audit verification process. A data integrity scheme supports two types of verification: private auditing (verification is done between the CSP and the data owner, i.e. the cloud user) and public auditing (the cloud user hires a TPA to reduce its own computational and communication overhead, and verification is done between the CSP and the TPA) [ 122 ]. Types of public auditing schemes include privacy-preserving public auditing [ 83 , 122 ], certificateless public auditing [ 125 ], optimized public auditing [ 123 ], bitcoin-based public auditing [ 88 ], the S-audit public auditing scheme [ 108 ], shared data auditing [ 83 ], dynamic data public auditing [ 126 ], non-privacy-preserving public auditing [ 127 ], and digital signature (BLS, hash table, RSA, etc.) based public auditing [ 88 , 119 , 128 ]. A private auditing scheme, called the SW method, was first proposed in [ 110 ] and was further reviewed by other research works [ 87 , 116 ].

Characteristics of data integrity technique

This review article focuses on several quality features of data integrity, each of which is of prime importance in cloud storage security. These are:

Public Auditability: At the request of data owners, the TPA examines the accuracy of the data owner's outsourced data stored in cloud storage [ 94 , 95 ].

Audit correctness: The proof message of the CSP can pass the validation test of the TPA only if the CSP and the TPA are honest and the CSP and the data owner properly follow the pre-defined data storing process [ 89 , 78 ].

Auditing soundness: The only way for the CSP to pass the TPA's verification test is to store the data owner's entire outsourced data in cloud storage [ 90 ].

Error localization at block level: It helps to locate the erroneous blocks of a file in cloud storage during verification [ 89 ].

Data Correctness: It helps to rectify an erroneous data block using the information of the available replica blocks in cloud storage [ 89 ].

Stateless Auditor: During verification, a stateless auditor does not need to maintain, store, or update previous verification results for future use [ 88 , 95 ].

Storage Correctness: The CSP may prepare a report showing that all data are entirely stored in cloud storage even if the data have been partially tampered with or lost. Therefore, the system needs to guarantee data owners that their outsourced data are the same as what was previously stored [ 129 ].

Robustness: In a probabilistic data integrity strategy, errors in even small amounts of data should be identified and rectified [ 39 ].

Unforgeability: Only authenticated users can generate a valid signature/metadata on shared data [ 129 ].

Data Dynamic support: It allows data owners to insert, edit, and delete data in cloud storage while maintaining the same level of integrity verification support as before [ 89 ].

Dependability: Data should remain available while all the file blocks are being managed [ 89 ].

Replica Auditability: It allows the TPA to examine the replicas of the data file stored in cloud storage on demand from data owners [ 89 ].

Light Weight: Because the system contains a large number of blocks and multiple users, the signature processing time should be short to reduce the computational overhead of clients [ 88 , 97 ].

Auditing Correctness: It ensures that the response message from the CSP can pass the TPA's verification only when the CSP properly stores the outsourced data in cloud storage [ 97 ].

Privacy Protection: During verification, the auditing scheme should not expose a user's identity information to an adversary [ 90 , 97 ].

Efficient User Revocation: Revoked users can no longer upload any data to cloud storage and are no longer authorized users [ 78 ].

Batch Auditing: In a public auditing scheme, the batch auditing method lets the TPA perform multiple auditing tasks from different cloud users at once [ 95 ].

Data Confidentiality: The TPA cannot acquire the actual data during data integrity verification [ 90 ].

Boundless Verification: Data owners never impose on the TPA a fixed limit on the number of data integrity verification interactions [ 88 ].

Efficiency: The size of the test metadata and the test time on multi-owner outsourced data in cloud computing are both independent of the number of data owners [ 95 ].

Private Key Correctness: A private key can pass the cloud user's verification test only if the Private Key Generator (PKG) sends the correct private key to the cloud user [ 90 ].

Blockless Verification: The TPA does not need to download entire blocks from cloud storage for verification [ 95 ].

Challenges of data integrity technique in cloud environment

Security challenges of data integrity technique in cloud computing always come with some fundamental questions:

How will outsourced data be kept safe on a remote server, and how will the data be protected from any loss, damage, or alteration in cloud storage?

How will the security of cloud data be assured if a malicious user is present inside the cloud?

At which location in the shared storage will outsourced data be stored?

Will access to the cloud data be limited to authorized users, with complete audit verification available?

All the above questions relate to privacy preservation in data integrity schemes, which is why data integrity in cloud computing remains a rapidly growing challenge. Refer to Table 4 for the security challenges and the corresponding solutions of data integrity techniques.

Risk to integrity of data : This security risk is divided into three parts:

When accessed globally, cloud services are hampered by many malicious attacks if the integrity of the database, network, etc. is not properly maintained.

Data availability and integrity problems occur if the CSP makes unauthorized changes to the data.

The segregation of data among cloud users in cloud storage is another data integrity problem. Therefore, an SLA-based patch management policy, a standard validation technique against unauthorized use, and adequate security parameters need to be included in the data integrity technique [ 131 ].

Dishonest TPA : A dishonest TPA has two prime intentions:

The TPA can damage the reputation of the CSP by generating incorrect integrity verification messages.

The TPA can exploit confidential information, with the help of malicious attackers, through repeated verification interaction messages with cloud storage.

Hence, an audit message verification method must be included in a data integrity verification scheme to continuously analyze the intentional behavior of the TPA.

Dishonest CSP : An adversarial CSP has three motives: i) The CSP tries to retrieve either the original content of the entire data file or all block information of the data file, and uses this leaked information for business profit. ii) The CSP can modify the actual content of a file and use it for personal reasons. In both cases, the data owner cannot detect the actual culprit. iii) The CSP always tries to maintain its business reputation even if the owner's outsourced data have been partially tampered with or lost. For that reason, an acknowledgement verification method, an error data detection method, and an error data recovery method should be included in the data integrity scheme to maintain the intactness and confidentiality of data [ 89 , 132 ].

Forgery Attack at Cloud Storage : An outsider attacker may forge the proof message that the CSP generates, for the blocks indicated by the challenge message, in response to the TPA. Malicious auditors may forge an audit proof that passes the data integrity verification [ 88 , 90 ].

Data modification by an insider malicious user into cloud storage : An insider malicious user can subvert or modify a data block at will and, by altering the interaction messages in the network channel, can fool the auditor and the data owner into trusting that the data blocks are well maintained in cloud storage. Hence, a data confidentiality scheme or a data obfuscation technique should be included in the data integrity technique [ 92 ].

Desired design challenges of data integrity strategy

Below are the main design issues for data integrity schemes:

Communication overhead : The outsourced data transferred from the client to the storage server, the challenge message sent to the CSP, the proof message sent to the TPA, and the audit message sent to the client all count as communication overhead. Table 5 compares the communication overhead incurred during public auditing by DO, LCSP, and RTPA. Since the DO always sends its original file, an encrypted file, or an encrypted file with a signature to a cloud server, most articles consider only the communication overhead of challenge messages and challenge-response messages, and the upload is not included in the DO's communication overhead.

Computational overhead : Data preprocessing, signature generation, and audit message verification on the data owner or trusted agent side; challenge message generation, data integrity verification, and audit message generation on the TPA side; and proof message generation on the CSP side all count as computational overhead. In [ 97 ], the computational overhead of the client, the CSP, and the TPA is lower than in [ 124 ] because the ZSS signature requires less exponentiation and hash calculation than the BLS signature. Table 6 compares the computational overhead incurred during public auditing by DO, LCSP, and RTPA. Here, Pair denotes a bilinear pairing operation, Hash a hash function, Mul a multiplication operation, ADD an addition operation, Exp an exponentiation operation, Inv an inverse operation, Encrypt an encryption operation, decrypt a decryption operation, and Sub a subtraction operation.

Storage overhead : The entire file or its blocks, the metadata, the signatures, and the replica blocks must be stored in cloud storage and on the client side, depending on the policy of the system model. The cloud user's storage overhead during auditing verification should be small [ 36 ].

Cost overhead : It denotes the summarized cost of communication overhead, computational overhead, and storage overhead.

Data Dynamic Analysis : Data stored in cloud storage are not always static. Alteration of data, deletion of data, and addition of new data to old data are basic functions that arise in practice because of the dynamic demands of clients. Therefore, data integrity verification should be performed after all dynamic operations on stored data. In [ 93 ], the insertion, deletion, and update times for a growing number of data blocks are lower than in [ 123 ] because the authenticated structure of the dynamic data integrity auditing scheme has a smaller depth.

Comparative analysis of data integrity strategies

integrity checking scheme

This section presents a comparative study of data integrity strategies. Table 7 shows a comparative analysis of the data integrity strategies of cloud storage in terms of expected design methods and limitations. Zang et al. [ 88 ] introduced a random masking technique into a public auditability scheme during the computation of proof information. Because of the linear relationship between the data block and the proof information, malicious adversaries are capable of nullifying the effectiveness of the SWP scheme. In the SWP scheme, the CSP generates proof information and sends it to the TPA for verification. An uncertain situation may arise when the CSP is compromised by an external malicious adversary that can alter every data block's information. To deceive the TPA and pass the verification test, a malicious adversary can eavesdrop on the challenge message and intercept the proof message. Therefore, the SWP scheme has to assume that the TPA is a trustworthy element, which is not possible in practice. To defend against external malicious adversaries without a protected channel, the authors proposed a nonlinear disturbance code as a random masking technique that turns the linear relationship between data blocks and proof messages into a nonlinear one. The authors applied a BLS hash signature to each block to help the verifier perform random block verification. This public auditability verification technique assures boundless, effective, stateless auditing and soundness, with two limitations: because verification of the data storage acknowledgement is missing, the reputation of the cloud service may be damaged, and the scheme is applicable only to static data.

M. Thangavel et al. [ 89 ] proposed a novel auditing framework that protects cloud storage from malicious attacks. This technique is based on a ternary hash tree and a replica-based ternary hash tree, which support dynamic block updating, data correctness with error localization, data insertion, and data deletion. W. Shen et al. [ 90 ] introduced an identity-based data auditing scheme that hides sensitive information at the block level to secure cloud storage during data sharing. In this scheme, a sanitizer sanitizes the data blocks that contain sensitive information. Chameleon hash and unforgeable chameleon hash signatures do not provide blockless auditing and require high computational overhead. Hence, this PKG-based signature method is used to assure blockless verification and reduce computational overhead. This public auditability verification technique assures auditing soundness, private key correctness, and sensitive information hiding; its one limitation is that, because audit messages are missing, the TPA can deceive users about data verification. S. Mohanty et al. [ 85 ] introduced a confidentiality-preserving auditing scheme by which cloud users can easily verify the risk of the service they use from the audit report maintained by the TPA. This scheme has two benefits. First, it helps to check the integrity of cloud users' data. Second, it verifies the TPA's authentication and repudiation. In this scheme, the authors proposed a system model that supports the basic criteria of cloud security: auditing, confidentiality, and availability. The HMAC-MD5 technique is applied to the metadata to maintain data privacy on the TPA side. Chen et al. [ 61 ] proposed a MAC-oriented data integrity technique based on the metadata verification method, which reinforces auditing correctness. This technique helps to protect data stored in cloud storage from MitM and replay attacks. However, this scheme needs improvement because, after repeated passes of challenge-proof messages, the CSP is able to obtain the actual block elements of the user's confidential data.

S. Hiremath et al. [ 87 ] established a public blockless data integrity scheme that takes a fixed time to check the data of files of variable size. The authors use the AES algorithm for data encryption and the SHA-2 algorithm for the data auditing scheme. They use random masking and Homomorphic Linear Authenticator (HLA) techniques to ensure the confidentiality of stored data during auditing. However, this scheme is applicable only to static data stored in cloud storage and needs to be extended to dynamic data operations. T. Subha et al. [ 86 ] introduced the idea of public auditability to check the correctness of data stored in cloud storage, assuming that the TPA is a trusted entity. Data privacy mechanisms such as Knox and Oruta have been proposed to raise the security level of cloud storage and resist active adversary attacks. The authors use the Merkle hash tree to hash data block elements. B. Shao et al. [ 93 ] established a hierarchical Multiple Branches Tree (HMBT) that secures the auditing correctness of users' data, fulfills the crypto criterion of data privacy, and protects against forgery and replay attacks. The scheme uses a special hash function to place a BLS signature on block elements and supports public auditing. DCDV is a concept based on a hierarchical time tree and a Merkle hash tree. Simultaneous execution of access control and data auditing mechanisms rarely happens in attribute-based cryptography. Hence, the Dual Control and Data Variable (DCDV) data integrity scheme is proposed in [ 132 ]. This scheme solves the private data leakage problem by means of the user's secret key and assures the correctness of the auditing scheme.
A PDP technique is proposed for a data integrity verification scheme that supports dynamic data update operations, reduces the communication overhead of fine-grained dynamic updates of big data, increases the protection level of data stored in cloud storage, resists collusion attacks, and supports batch auditing [ 114 ]. Another novel public auditing scheme, based on identity-based cryptography, ensures low computational overhead when users are revoked during the possession of all file blocks. It fulfills the crypto criteria of soundness, correctness, security, and efficient user revocation [ 78 ].
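Several of the schemes above build on a Merkle hash tree. As a minimal sketch of that structure only (not of any cited scheme), the function below hashes every block and then repeatedly hashes adjacent pairs until a single root remains. Duplicating the last node on odd-sized levels is one common convention and is an assumption here, as are the function names.

```python
import hashlib


def sha256(data):
    """Return the SHA-256 digest of data as raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Compute the Merkle root over a non-empty list of byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [sha256(block) for block in blocks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last node
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


original = merkle_root([b"block-0", b"block-1", b"block-2"])
modified = merkle_root([b"block-0", b"block-1", b"block-X"])
print(original != modified)  # True: changing any block changes the root
```

A verifier that stores only the root can check a single block with a logarithmic number of sibling hashes, which is why tree-based structures keep dynamic update costs low.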

Some research works introduced the BLS cryptographic signature, which has the shortest length among all available signatures [ 88 ]. This signature is based on a special hash function that is probabilistic, not deterministic. It also has a higher overhead of exponentiation and hash calculation. To improve signature efficiency and reduce computational overhead, a new signature, ZSS, is proposed [ 97 ]. This integrity scheme supports crypto criteria such as privacy protection, public auditing correctness, and resistance to message forgery. An attribute-based data auditing scheme is proposed in [ 137 ], which proves data correctness and soundness based on the discrete logarithm and the Diffie-Hellman key exchange algorithm. This scheme maintains the privacy of cloud users' confidential data and resists collusion attacks on blocks during auditing verification. An ID-based remote data auditing scheme (ID-PUIC) is introduced in [ 98 ], which secures efficiency, security, and flexibility with the help of the Diffie-Hellman problem. It also supports ID-based proxy data upload when users are restricted from accessing public cloud servers. It shows a lower computation cost for the server and the TPA than [ 107 ]. Both research works [ 105 , 126 ] address public checking of the intactness of outsourced data and reduce the communication and computational cost of the verifier. They also support dynamic data auditing, blockless verification, and privacy preservation.

Future trends in data integrity approaches

As further research work, we discuss here the future directions of data integrity schemes, to enlarge the scope of cloud data security and support the continuity of research. Emerging trends in data integrity schemes are listed below. In [ 39 ], the authors discussed the evolutionary trends of data integrity schemes through a timeline covering 2007 to 2015, which presented possible scopes of data integrity strategies. Hence, we show a visual representation of the probable trends of integrity schemes from 2016 to 2022 in the timeline infographic of Fig. 2.

Blockchain data-based integrity : Blockchain is a decentralized, peer-to-peer technology. It supports scalable, distributed environments in which data are stored as transparent blocks, each containing the cryptographic hash of the previous block and a timestamp, so that a single block cannot be altered without modifying all subsequent linked blocks. This property improves the performance of cloud storage and maintains the trust of data owners by increasing data privacy through the Merkle tree concept. In [ 138 ], a distributed virtual agent model based on mobile agent technology is proposed to maintain the reliability of cloud data and to ensure multi-tenant trust verification of cloud data. In [ 139 ], a generic blockchain-based framework is proposed to increase the security of provenance data in cloud storage, which is important for securely accessing the log information of cloud data. The works in [ 140 , 141 , 142 ] share the aim of using blockchain technology to enhance data privacy and maintain data integrity in cloud storage. Table 8 shows how blockchain technology is used to overcome some issues of cloud storage.
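The tamper-evidence property described above can be illustrated with a minimal hash-chain sketch. This is not the construction of any cited scheme: the block fields, the all-zero genesis hash, and the function names block_hash, append_block, and verify_chain are assumptions chosen for illustration, and consensus, signatures, and Merkle trees are omitted.

```python
import hashlib
import json
import time

GENESIS_PREV = "0" * 64  # assumed placeholder hash for the first block


def block_hash(index, timestamp, data, prev_hash):
    """Hash all block fields, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "timestamp": timestamp, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, data, timestamp=None):
    """Append a block that commits to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    ts = time.time() if timestamp is None else timestamp
    block = {"index": index, "timestamp": ts, "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(index, ts, data, prev_hash)
    chain.append(block)
    return block


def verify_chain(chain):
    """Return the index of the first inconsistent block, or -1 if the chain is intact."""
    prev_hash = GENESIS_PREV
    for block in chain:
        expected = block_hash(
            block["index"], block["timestamp"], block["data"], block["prev_hash"]
        )
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return block["index"]
        prev_hash = block["hash"]
    return -1
```

In this sketch, changing the data of one block makes verify_chain report that block. If an attacker also recomputes that block's hash, the next block's stored prev_hash no longer matches, so every subsequent block would have to be rewritten as well.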

Data integrity in fog computing : Generally, privacy protection schemes aim to resist insider attacks in cloud storage. In [ 147 ], a fog computing-based three-layer storage (TLS) framework is proposed to maintain the privacy of data in fog servers. Fog computing is an extension of cloud computing that was first introduced in 2011 [ 148 ]. Its three advantages are high real-time performance, low latency, and broad geographical distribution. Combined with cloud computing, it helps ensure the privacy of data on fog servers and is a powerful supplement for privacy preservation in cloud storage.

Distributed Machine Learning Oriented Data Integrity : In artificial intelligence, maintaining the integrity of training data in a distributed machine learning environment is a rapidly growing challenge because of network forgery attacks. In [ 136 ], a distributed machine learning-oriented data integrity verification scheme (DML-DIV) is introduced to assure the intactness of training data and to secure the training data model. It adopts a PDP sampling auditing algorithm to resist tampering and forgery attacks. The scheme relies on the discrete logarithm problem (DLP) to preserve the privacy of training data during the TPA's challenge verification. To reduce the key escrow problem and certificate cost, it uses identity-based cryptography and key generation technology.
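To make the idea of sampling-based auditing concrete, the sketch below computes the chance that a random spot check hits at least one corrupted block. It is a generic illustration, not the DML-DIV algorithm: it assumes the auditor samples c of n blocks uniformly without replacement and that t blocks are corrupted, and the function names detection_probability and blocks_needed are hypothetical.

```python
from math import comb


def detection_probability(n, t, c):
    """Probability that at least one of c sampled blocks is among the t corrupted ones.

    Assumes uniform sampling without replacement from n blocks:
    P = 1 - C(n - t, c) / C(n, c).
    """
    if not (0 <= t <= n and 0 <= c <= n):
        raise ValueError("require 0 <= t <= n and 0 <= c <= n")
    return 1.0 - comb(n - t, c) / comb(n, c)


def blocks_needed(n, t, target):
    """Smallest sample size c whose detection probability reaches target, or None."""
    for c in range(n + 1):
        if detection_probability(n, t, c) >= target:
            return c
    return None  # e.g. t == 0: nothing is corrupted, so nothing can be detected
```

Under these assumptions, with 1% of 10,000 blocks corrupted, challenging 460 blocks already gives a detection probability above 99%, which is why sampling keeps the audit cost far below that of checking every block.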

Data Integrity in Edge Computing : Edge computing is an extension of distributed computing. Cache data integrity is a new concept in edge computing, developed on the basis of cloud computing, that optimizes data retrieval latency on edge servers and addresses the centralization problems of cloud storage servers. The edge data integrity (EDI) concept is first proposed in [ 149 ] to handle the auditing of vendor apps' cache data on edge servers effectively, which is challenging in dynamic, distributed, and volatile edge environments. That work proposes the EDI-V model, which uses a variable Merkle hash tree (VMHT) structure to audit cache data on large-scale servers by generating integrity proofs for the data replicas. In [ 150 ], the EDI-S model is introduced to verify the integrity of edge data and to localize corrupted data on edge servers by generating a digital signature for each edge replica.
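As background for the Merkle hash tree mentioned above, the following is a minimal sketch of a plain binary Merkle tree with membership proofs. It is not the variable Merkle hash tree (VMHT) of [ 149 ]; the leaf and node prefixes, the duplication of the last node on odd levels, and the function names merkle_root, merkle_proof, and verify_proof are illustrative assumptions.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaves(blocks):
    if not blocks:
        raise ValueError("need at least one block")
    # Prefix 0x00 for leaves and 0x01 for inner nodes (domain separation).
    return [_h(b"\x00" + b) for b in blocks]


def _parent_level(level):
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blocks):
    """Compute the Merkle root over a non-empty list of byte blocks."""
    level = _leaves(blocks)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _parent_level(level)
    return level[0]


def merkle_proof(blocks, index):
    """Return the sibling path of blocks[index] as (sibling_hash, sibling_is_left) pairs."""
    if not 0 <= index < len(blocks):
        raise IndexError("block index out of range")
    level = _leaves(blocks)
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = _parent_level(level)
        index //= 2
    return proof


def verify_proof(block, proof, root):
    """Recompute the root from one block and its sibling path."""
    node = _h(b"\x00" + block)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
    return node == root
```

A verifier that stores only the root can check a single replica block with a logarithmic number of hashes, which is the property that makes tree-based structures attractive for auditing many cached replicas.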

Fig. 2 Timeline infographic of data integrity

With the continuously growing popularity of attractive, cost-optimized cloud services, assuring data owners of the intactness of their outsourced data in cloud storage environments has become a critical security challenge. We have highlighted several issues and the corresponding solution approaches for cloud data integrity, which will provide researchers with an overview as well as clear directions. The current state of the art in this research field will provide further milestones in areas such as cloud-based sensitive health care, secure financial services, and the management of social media platforms. In this paper, we have discussed the phases of data integrity, the characteristics of data integrity schemes, the classification of data integrity strategies based on the type of proposal, the nature of the data, and the type of verification scheme, and the desired design challenges of data integrity strategies based on performance overhead. We have also identified issues in physical cloud storage and attacks on storage-level services, along with mitigating solutions. Lastly, we have presented a timeline infographic of a variety of data integrity schemes and the future aspects of data integrity strategies, to explore the security directions of cloud storage.

Availability of data and materials

Not applicable.

Buyya R, Broberg J, Goscinski AM (2010) Cloud computing: Principles and paradigms, vol 87. Wiley

Mell P, Grance T, et al (2011) The nist definition of cloud computing

Wu C, Buyya R, Ramamohanarao K (2019) Cloud pricing models: Taxonomy, survey, and interdisciplinary challenges. ACM Comput Surv (CSUR) 52(6):1–36

Dimitri N (2020) Pricing cloud iaas computing services. J Cloud Comput 9(1):1–11

Roy SS, Garai C, Dasgupta R (2015) Performance analysis of parallel cbar in mapreduce environment. In: 2015 International Conference on Computing, Communication and Security (ICCCS). IEEE, pp 1–7

Singhal S, Sharma A (2020) Load balancing algorithm in cloud computing using mutation based pso algorithm. In: Advances in Computing and Data Sciences: 4th International Conference. Springer, pp 224–233

Luong NC, Wang P, Niyato D, Wen Y, Han Z (2017) Resource management in cloud networking using economic analysis and pricing models: A survey. IEEE Commun Surv Tutorials 19(2):954–1001

Goswami P, Roy SS, Dasgupta R (2017) Design of an architectural framework for providing quality cloud services. In: International Conference on Grid, Cloud, & Cluster Computing. pp 17–23

Anuradha V, Sumathi D (2014) A survey on resource allocation strategies in cloud computing. In: International Conference on Information Communication and Embedded Systems (ICICES2014). IEEE, pp 1–7

Magalhaes D, Calheiros RN, Buyya R, Gomes DG (2015) Workload modeling for resource usage analysis and simulation in cloud computing. Comput Electr Eng 47:69–81

Singhal S, Sharma A (2021) Mutative aco based load balancing in cloud computing. Eng Lett 29(4)

Chandramohan D, Vengattaraman T, Dhavachelvan P, Baskaran R, Venkatachalapathy V (2014) Fewss-framework to evaluate the service suitability and privacy in a distributed web service environment. Int J Model Simul Sci Comput 5(01):1350016

Klosterboer L (2011) ITIL capacity management. Pearson Education

Majumdar A, Roy SS, Dasgupta R (2017) Job migration policy in a structured cloud framework. In: 2017 International Conference on Computational Science and Computational Intelligence (CSCI). IEEE, pp 1529–1534

Singhal S, Sharma A (2021) A job scheduling algorithm based on rock hyrax optimization in cloud computing, vol 103. Springer, pp 2115–2142

Dong Y, Sun L, Liu D, Feng M, Miao T (2018) A survey on data integrity checking in cloud. In: 2018 1st International Cognitive Cities Conference (IC3). IEEE, pp 109–113

Bian G, Fu Y, Shao B, Zhang F (2022) Data integrity audit based on data blinding for cloud and fog environment. IEEE Access 10:39743–39751. https://doi.org/10.1109/ACCESS.2022.3166536

Iqbal A, Saham H (2014) Data integrity issues in cloud servers. Int J Comput Sci Issues (IJCSI) 11(3):118

Caronni G, Waldvogel M (2003) Establishing trust in distributed storage providers. In: Proceedings Third International Conference on Peer-to-Peer Computing (P2P2003). IEEE, pp 128–133

Ogiso S, Mohri M, Shiraishi Y (2020) Transparent provable data possession scheme for cloud storage. In: 2020 International Symposium on Networks, Computers and Communications (ISNCC). IEEE, pp 1–5

Masood R, Pandey N, Rana Q (2020) Dht-pdp: A distributed hash table based provable data possession mechanism in cloud storage. In: 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). IEEE, pp 275–279

Bian G, Chang J (2020) Certificateless provable data possession protocol for the multiple copies and clouds case. IEEE Access 8:102958–102970

Zhang X, Wang X, Gu D, Xue J, Tang W (2022) Conditional anonymous certificateless public auditing scheme supporting data dynamics for cloud storage systems. IEEE Trans Netw Serv Manag 19(4):5333–5347. https://doi.org/10.1109/TNSM.2022.3189650

Li J, Yan H, Zhang Y (2021) Certificateless public integrity checking of group shared data on cloud storage. IEEE Trans Serv Comput 14(1):71–81. https://doi.org/10.1109/TSC.2018.2789893

Yuan Y, Zhang J, Xu W (2020) Dynamic multiple-replica provable data possession in cloud storage system. IEEE Access 8:120778–120784

Juels A, Kaliski Jr BS (2007) Pors: Proofs of retrievability for large files. In: Proceedings of the 14th ACM conference on Computer and communications security. ACM, pp 584–597

González-Manzano L, Orfila A (2015) An efficient confidentiality-preserving proof of ownership for deduplication. J Netw Comput Appl 50:49–59

Yu CM, Chen CY, Chao HC (2015) Proof of ownership in deduplicated cloud storage with mobile device efficiency. IEEE Netw 29(2):51–55

Di Pietro R, Sorniotti A (2012) Boosting efficiency and security in proof of ownership for deduplication. In: Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security. ACM, pp 81–82

Gentry C (2009) Fully homomorphic encryption using ideal lattices. In: Proceedings of the forty-first annual ACM symposium on Theory of computing. ACM, pp 169–178

Enoch SY, Hong JB, Kim DS (2018) Time independent security analysis for dynamic networks using graphical security models. In: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE, pp 588–595

Kumar S, Singh SK, Singh AK, Tiwari S, Singh RS (2018) Privacy preserving security using biometrics in cloud computing. Multimed Tools Appl 77(9):11017–11039

Sirohi P, Agarwal A (2015) Cloud computing data storage security framework relating to data integrity, privacy and trust. In: 2015 1st international conference on next generation computing technologies (NGCT). IEEE, pp 115–118

Prasad D, Singh BR, Akuthota M, Sangeetha M (2014) An etiquette approach for public audit and preserve data at cloud. Int J Comput Trends Technol (IJCTT) 16

Skibitzki B (2021) How zebra technologies manages security & risk using security command center. https://cloud.google.com/blog/products/identity-security/how-zebra-technologies

Li A, Chen Y, Yan Z, Zhou X, Shimizu S (2020) A survey on integrity auditing for data storage in the cloud: from single copy to multiple replicas. IEEE Trans Big Data 8(5):1428–1442.

Tan CB, Hijazi MHA, Lim Y, Gani A (2018) A survey on proof of retrievability for cloud data integrity and availability: Cloud storage state-of-the-art, issues, solutions and future trends. J Netw Comput Appl 110:75–86

Pujar SR, Chaudhari SS, Aparna R (2020) Survey on data integrity and verification for cloud storage. In: 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). IEEE, pp 1–7

Zafar F, Khan A, Malik SUR, Ahmed M, Anjum A, Khan MI, Javed N, Alam M, Jamil F (2017) A survey of cloud computing data integrity schemes: Design challenges, taxonomy and future trends. Comput Secur 65:29–49

Debnath S, Bhuyan B (2019) Large universe attribute based encryption enabled secured data access control for cloud storage with computation outsourcing. Multiagent Grid Syst 15(2):99–119

Hsien WF, Yang CC, Hwang MS (2016) A survey of public auditing for secure data storage in cloud computing. Int J Netw Secur 18(1):133–142

Zhou L, Fu A, Yu S, Su M, Kuang B (2018) Data integrity verification of the outsourced big data in the cloud environment: A survey. J Netw Comput Appl 122:1–15

Liu CW, Hsien WF, Yang CC, Hwang MS (2016) A survey of public auditing for shared data storage with user revocation in cloud computing. Int J Netw Secur 18(4):650–666

Garg N, Bawa S (2016) Comparative analysis of cloud data integrity auditing protocols. J Netw Comput Appl 66:17–32

Sutradhar MR, Sultana N, Dey H, Arif H (2018) A new version of kerberos authentication protocol using ecc and threshold cryptography for cloud security. In: 2018 Joint 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR). IEEE, pp 239–244

Patel SC, Singh RS, Jaiswal S (2015) Secure and privacy enhanced authentication framework for cloud computing. In: 2015 2nd International Conference on Electronics and Communication Systems (ICECS). IEEE, pp 1631–1634

Hong H, Sun Z, Xia Y (2017) Achieving secure and fine-grained data authentication in cloud computing using attribute based proxy signature. In: 2017 4th International Conference on Information Science and Control Engineering (ICISCE). IEEE, pp 130–134

Wang W, Ren L, Chen L, Ding Y (2019) Intrusion detection and security calculation in industrial cloud storage based on an improved dynamic immune algorithm. Inf Sci 501:543–557

Yan Q, Yu FR, Gong Q, Li J (2015) Software-defined networking (sdn) and distributed denial of service (ddos) attacks in cloud computing environments: A survey, some research issues, and challenges. IEEE Commun Surv Tutor 18(1):602–622

Dong S, Abbas K, Jain R (2019) A survey on distributed denial of service (ddos) attacks in sdn and cloud computing environments. IEEE Access 7:80813–80828

Thirumallai C, Mekala MS, Perumal V, Rizwan P, Gandomi AH (2020) Machine learning inspired phishing detection (pd) for efficient classification and secure storage distribution (ssd) for cloud-iot application. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, pp 202–210

Mary BF, Amalarethinam DG (2017) Data security enhancement in public cloud storage using data obfuscation and steganography. In: 2017 World Congress on Computing and Communication Technologies (WCCCT). IEEE, pp 181–184

Nakouri I, Hamdi M, Kim TH (2017) A new biometric-based security framework for cloud storage. In: 2017 13th International Wireless Communications and Mobile Computing Conference (IWCMC). IEEE, pp 390–395

Meddeb-Makhlouf A, Zarai F, et al (2018) Distributed firewall and controller for mobile cloud computing. In: 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA). IEEE/ACS, pp 1–9

Fu Y, Au MH, Du R, Hu H, Li D (2020) Cloud password shield: A secure cloud-based firewall against ddos on authentication servers. In: 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). IEEE, pp 1209–1210

Zeidler C, Asghar MR (2018) Authstore: Password-based authentication and encrypted data storage in untrusted environments. In: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE, pp 996–1001

Erdem E, Sandıkkaya MT (2018) Otpaas-one time password as a service. IEEE Trans Inf Forensic Secur 14(3):743–756

Chandramohan D, Vengattaraman T, Rajaguru D, Baskaran R, Dhavachelvan P (2013) Emppc-an evolutionary model based privacy preserving technique for cloud digital data storage. In: 2013 3rd IEEE International Advance Computing Conference (IACC). IEEE, pp 89–95

Bakas A, Dang HV, Michalas A, Zalitko A (2020) The cloud we share: Access control on symmetrically encrypted data in untrusted clouds. IEEE Access 8:210462–210477

Rukavitsyn AN, Borisenko KA, Holod II, Shorov AV (2017) The method of ensuring confidentiality and integrity data in cloud computing. In: 2017 XX IEEE International Conference on Soft Computing and Measurements (SCM). IEEE, pp 272–274

Chen Y, Li L, Chen Z (2017) An approach to verifying data integrity for cloud storage. In: 2017 13th International Conference on Computational Intelligence and Security (CIS). IEEE, pp 582–585

Alneyadi S, Sithirasenan E, Muthukkumarasamy V () A survey on data leakage prevention systems. J Netw Comput Appl 62:137–152

Baloch FS, Muhammad TA, Waqas L, Mehmet B, Muhammad AN, Gönül Cömertpay, Nergiz Çoban et al (2023) "Recent advancements in the breeding of sorghum crop: current status and future strategies for marker-assisted breeding." Frontiers in Genetics 14:1150616.

Rakotondravony N, Taubmann B, Mandarawi W, Weishäupl E, Xu P, Kolosnjaji B, Protsenko M, De Meer H, Reiser HP (2017) Classifying malware attacks in iaas cloud environments. J Cloud Comput 6(1):1–12

Perez-Botero D, Szefer J, Lee RB (2013) Characterizing hypervisor vulnerabilities in cloud computing servers. In: Proceedings of the 2013 international workshop on Security in cloud computing. ACM, pp 3–10

Tunc C, Hariri S, Merzouki M, Mahmoudi C, De Vaulx FJ, Chbili J, Bohn R, Battou A (2017) Cloud security automation framework. In: 2017 IEEE 2nd International Workshops on Foundations and Applications of Self Systems. IEEE, pp 307–312

Maithili K, Vinothkumar V, Latha P (2018) Analyzing the security mechanisms to prevent unauthorized access in cloud and network security. J Comput Theor Nanosci 15(6–7):2059–2063

Somasundaram TS, Prabha V, Arumugam M (2012) Scalability issues in cloud computing. In: 2012 Fourth International Conference on Advanced Computing (ICoAC). IEEE, pp 1–5

Yousafzai A, Gani A, Noor RM, Sookhak M, Talebian H, Shiraz M, Khan MK (2017) Cloud resource allocation schemes: review, taxonomy, and opportunities. Knowl Inf Syst 50(2):347–381

Natu M, Ghosh RK, Shyamsundar RK, Ranjan R (2016) Holistic performance monitoring of hybrid clouds: Complexities and future directions. IEEE Cloud Comput 3(1):72–81

Mahajan A, Sharma S (2015) The malicious insiders threat in the cloud. Int J Eng Res Gen Sci 3(2):245–256

Liao X, Alrwais S, Yuan K, Xing L, Wang X, Hao S, Beyah R (2018) Cloud repository as a malicious service: challenge, identification and implication. Cybersecurity 1(1):1–18

Singh A, Chatterjee K (2017) Cloud security issues and challenges: A survey. J Netw Comput Appl 79:88–115

Daniel E, Durga S, Seetha S (2019) Panoramic view of cloud storage security attacks: an insight and security approaches. In: 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC). IEEE, pp 1029–1034

Devi BK, Subbulakshmi T (2017) Ddos attack detection and mitigation techniques in cloud computing environment. In: 2017 International Conference on Intelligent Sustainable Systems (ICISS). IEEE, pp 512–517

Yusop ZM, Abawajy J (2014) Analysis of insiders attack mitigation strategies. Procedia-Soc Behav Sci 129:581–591

Song H, Li J, Li H (2021) A cloud secure storage mechanism based on data dispersion and encryption. IEEE Access 9:63745–63751. https://doi.org/10.1109/ACCESS.2021.3075340

Zhang Y, Yu J, Hao R, Wang C, Ren K (2020) Enabling efficient user revocation in identity-based cloud storage auditing for shared big data. IEEE Trans Dependable Secure Comput 17(3):608–619. https://doi.org/10.1109/TDSC.2018.2829880

Zuo C, Shao J, Liu JK, Wei G, Ling Y (2018) Fine-grained two-factor protection mechanism for data sharing in cloud storage. IEEE Trans Inf Forensic Secur 13(1):186–196. https://doi.org/10.1109/TIFS.2017.2746000

Cui H, Deng RH, Li Y, Wu G (2019) Attribute-based storage supporting secure deduplication of encrypted data in cloud. IEEE Trans Big Data 5(3):330–342. https://doi.org/10.1109/TBDATA.2017.2656120

Sun S, Ma H, Song Z, Zhang R (2022) Webcloud: Web-based cloud storage for secure data sharing across platforms. IEEE Trans Dependable Secure Comput 19(3):1871–1884. https://doi.org/10.1109/TDSC.2020.3040784

Cheng K, Wang L, Shen Y, Wang H, Wang Y, Jiang X, Zhong H (2021) Secure k-NN query on encrypted cloud data with multiple keys. IEEE Trans Big Data 7(4):689–702. https://doi.org/10.1109/TBDATA.2017.2707552

Wang B, Li B, Li H (2014) Oruta: Privacy-preserving public auditing for shared data in the cloud. IEEE Trans Cloud Comput 2(1):43–56

Indhumathil T, Aarthy N, Devi VD, Samyuktha V (2017) Third-party auditing for cloud service providers in multicloud environment. In: 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM). IEEE, pp 347–352

Mohanty S, Pattnaik PK, Kumar R (2018) Confidentiality preserving auditing for cloud computing environment. In: 2018 International Conference on Research in Intelligent and Computing in Engineering (RICE). IEEE, pp 1–4

Subha T, Jayashri S (2017) Efficient privacy preserving integrity checking model for cloud data storage security. In: 2016 Eighth International Conference on Advanced Computing (ICoAC). IEEE, pp 55–60

Hiremath S, Kunte S (2017) A novel data auditing approach to achieve data privacy and data integrity in cloud computing. In: 2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT). IEEE, pp 306–310

Zhang Y, Xu C, Li H, Liang X (2016) Cryptographic public verification of data integrity for cloud storage systems. IEEE Cloud Comput 3(5):44–52

Thangavel M, Varalakshmi P (2019) Enabling ternary hash tree based integrity verification for secure cloud data storage. IEEE Trans Knowl Data Eng 32(12):2351–2362

Shen W, Qin J, Yu J, Hao R, Hu J (2018) Enabling identity-based integrity auditing and data sharing with sensitive information hiding for secure cloud storage. IEEE Trans Inf Forensic Secur 14(2):331–346

Singh P, Saroj SK (2020) A secure data dynamics and public auditing scheme for cloud storage. In: 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS). IEEE, pp 695–700

Ni J, Yu Y, Mu Y, Xia Q (2013) On the security of an efficient dynamic auditing protocol in cloud storage. IEEE Trans Parallel Distrib Syst 25(10):2760–2761

Shao B, Bian G, Wang Y, Su S, Guo C (2018) Dynamic data integrity auditing method supporting privacy protection in vehicular cloud environment. IEEE Access 6:43785–43797

Shen J, Liu D, He D, Huang X, Xiang Y (2017) Algebraic signatures-based data integrity auditing for efficient data dynamics in cloud computing. IEEE Trans Sustain Comput 5(2):161–173

Wang B, Li H, Liu X, Li F, Li X (2014) Efficient public verification on the integrity of multi-owner data in the cloud. J Commun Netw 16(6):592–599

Yu Y, Li Y, Yang B, Susilo W, Yang G, Bai J (2017) Attribute-based cloud data integrity auditing for secure outsourced storage. IEEE Trans Emerg Top Comput 8(2):377–390

Zhu H, Yuan Y, Chen Y, Zha Y, Xi W, Jia B, Xin Y (2019) A secure and efficient data integrity verification scheme for cloud-iot based on short signature. IEEE Access 7:90036–90044

Wang H, He D, Tang S (2016) Identity-based proxy-oriented data uploading and remote data integrity checking in public cloud. IEEE Trans Inf Forensic Secur 11(6):1165–1176

Thakur AS, Gupta P (2014) Framework to improve data integrity in multi cloud environment

Zhang C, Xu Y, Hu Y, Wu J, Ren J, Zhang Y (2021) A blockchain-based multi-cloud storage data auditing scheme to locate faults. IEEE Trans Cloud Comput 10(4):2252–2263.

Subha T, Jayashri S (2014) Data integrity verification in hybrid cloud using ttpa. In: Networks and communications (NetCom2013). Springer, pp 149–159

Mao J, Zhang Y, Li P, Li T, Wu Q, Liu J (2017) A position-aware merkle tree for dynamic cloud data integrity verification. Soft Comput 21(8):2151–2164

Han S, Liu S, Chen K, Gu D (2014) Proofs of retrievability based on mrd codes. In: International Conference on Information Security Practice and Experience. Springer, pp 330–345

Kaaniche N, El Moustaine E, Laurent M (2014) A novel zero-knowledge scheme for proof of data possession in cloud storage applications. In: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, pp 522–531

Khedr WI, Khater HM, Mohamed ER (2019) Cryptographic accumulator-based scheme for critical data integrity verification in cloud storage. IEEE Access 7:65635–65651

Khatri TS, Jethava G (2013) Improving dynamic data integrity verification in cloud computing. In: 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT). IEEE, pp 1–6

Wang H (2012) Proxy provable data possession in public clouds. IEEE Trans Serv Comput 6(4):551–559

Apolinário F, Pardal M, Correia M (2018) S-audit: Efficient data integrity verification for cloud storage. In: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing and Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE, pp 465–474

Li Y, Fu A, Yu Y, Zhang G (2017) Ipor: An efficient ida-based proof of retrievability scheme for cloud storage systems. In: 2017 IEEE International Conference on Communications (ICC). IEEE, pp 1–6

Shacham H, Waters B (2008) Compact proofs of retrievability. In: International conference on the theory and application of cryptology and information security. Springer, pp 90–107

Erway CC, Küpçü A, Papamanthou C, Tamassia R (2015) Dynamic provable data possession. ACM Trans Inf Syst Secur (TISSEC) 17(4):1–29

He D, Kumar N, Wang H, Wang L, Choo KKR (2017) Privacy-preserving certificateless provable data possession scheme for big data storage on cloud. Appl Math Comput 314:31–43

Wang B, Li B, Li H, Li F (2013) Certificateless public auditing for data integrity in the cloud. In: 2013 IEEE conference on communications and network security (CNS). IEEE, pp 136–144

Liu C, Chen J, Yang LT, Zhang X, Yang C, Ranjan R, Kotagiri R (2013) Authorized public auditing of dynamic big data storage on cloud with efficient verifiable fine-grained updates. IEEE Trans Parallel Distrib Syst 25(9):2234–2244

Fu A, Li Y, Yu S, Yu Y, Zhang G (2018) Dipor: An ida-based dynamic proof of retrievability scheme for cloud storage systems. J Netw Comput Appl 104:97–106

Xu J, Chang EC (2012) Towards efficient proofs of retrievability. In: Proceedings of the 7th ACM symposium on information, computer and communications security. pp 79–80

Lu Y, Hu F (2019) Secure dynamic big graph data: Scalable, low-cost remote data integrity checking. IEEE Access 7:12888–12900

Ateniese G, Di Pietro R, Mancini LV, Tsudik G (2008) Scalable and efficient provable data possession. In: Proceedings of the 4th international conference on Security and privacy in communication netowrks. ACM, pp 1–10

Tian H, Chen Y, Chang CC, Jiang H, Huang Y, Chen Y, Liu J (2015) Dynamic-hash-table based public auditing for secure cloud storage. IEEE Trans Serv Comput 10(5):701–714

He D, Zeadally S, Wu L (2015) Certificateless public auditing scheme for cloud-assisted wireless body area networks. IEEE Syst J 12(1):64–73

Yoosuf MS, Anitha R (2022) LDuAP: lightweight dual auditing protocol to verify data integrity in cloud storage servers. J Ambient Intell Humanized Comput 13(8):3787–3805

Tian H, Nan F, Chang CC, Huang Y, Lu J, Du Y (2019) Privacy-preserving public auditing for secure data storage in fog-to-cloud computing. J Netw Comput Appl 127:59–69

Singh AP, Pasupuleti SK (2016) Optimized public auditing and data dynamics for data storage security in cloud computing. Procedia Comput Sci 93:751–759

Wang C, Chow SS, Wang Q, Ren K, Lou W (2011) Privacy-preserving public auditing for secure cloud storage. IEEE Trans Comput 62(2):362–375

Zhang Y, Xu C, Lin X, Shen XS (2019) Blockchain-based public integrity verification for cloud storage against procrastinating auditors. IEEE Trans Cloud Comput 9(3):923–937.

Shen J, Shen J, Chen X, Huang X, Susilo W (2017) An efficient public auditing protocol with novel dynamic structure for cloud data. IEEE Trans Inf Forensic Secur 12(10):2402–2415

Oualha N, Leneutre J, Roudier Y (2012) Verifying remote data integrity in peer-to-peer data storage: A comprehensive survey of protocols. Peer-to-Peer Netw Appl 5(3):231–243

Xu Z, Wu L, Khan MK, Choo KKR, He D (2017) A secure and efficient public auditing scheme using rsa algorithm for cloud storage. J Supercomput 73(12):5285–5309

Sookhak M, Gani A, Talebian H, Akhunzada A, Khan SU, Buyya R, Zomaya AY (2015) Remote data auditing in cloud computing environments: a survey, taxonomy, and open issues. ACM Comput Surv (CSUR) 47(4):1–34

Mohammed A, Vasumathi D (2019) Locality parameters for privacy preserving protocol and detection of malicious third-party auditors in cloud computing. In: International Conference on Intelligent Computing and Communication. Springer, pp 67–76

Carroll M, Van Der Merwe A, Kotze P (2011) Secure cloud computing: Benefits, risks and controls. In: 2011 Information Security for South Africa. IEEE, pp 1–9

Zhang Q, Wang S, Zhang D, Wang J, Zhang Y (2019) Time and attribute based dual access control and data integrity verifiable scheme in cloud computing applications. IEEE Access 7:137594–137607

Li Y, Yu Y, Yang B, Min G, Wu H (2018) Privacy preserving cloud data auditing with efficient key update. Futur Gener Comput Syst 78:789–798

Shen W, Qin J, Yu J, Hao R, Hu J, Ma J (2021) Data integrity auditing without private key storage for secure cloud storage. IEEE Trans Cloud Comput 9(4):1408–1421. https://doi.org/10.1109/TCC.2019.2921553

Garg N, Bawa S, Kumar N (2020) An efficient data integrity auditing protocol for cloud computing. Futur Gener Comput Syst 109:306–316

Zhao XP, Jiang R (2020) Distributed machine learning oriented data integrity verification scheme in cloud computing environment. IEEE Access 8:26372–26384. https://doi.org/10.1109/ACCESS.2020.2971519

Yu Y, Au MH, Ateniese G, Huang X, Susilo W, Dai Y, Min G (2016) Identity-based remote data integrity checking with perfect data privacy preserving for cloud storage. IEEE Trans Inf Forensic Secur 12(4):767–778

Wei P, Wang D, Zhao Y, Tyagi SKS, Kumar N (2020) Blockchain data-based cloud data integrity protection mechanism. Futur Gener Comput Syst 102:902–911

Sifah EB, Xia Q, Agyekum KOBO, Xia H, Smahi A, Gao J (2021) A blockchain approach to ensuring provenance to outsourced cloud data in a sharing ecosystem. IEEE Syst J 16(1):1673–1684.

Huang P, Fan K, Yang H, Zhang K, Li H, Yang Y (2020) A collaborative auditing blockchain for trustworthy data integrity in cloud storage system. IEEE Access 8:94780–94794

Pise R, Patil S (2021) Enhancing security of data in cloud storage using decentralized blockchain. In: 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV). IEEE, pp 161–167

Sharma P, Jindal R, Borah MD (2019) Blockchain-based integrity protection system for cloud storage. In: 2019 4th Technology Innovation Management and Engineering Science International Conference (TIMES-iCON). IEEE, pp 1–5

Miao Y, Huang Q, Xiao M, Li H (2020) Decentralized and privacy-preserving public auditing for cloud storage based on blockchain. IEEE Access 8:139813–139826. https://doi.org/10.1109/ACCESS.2020.3013153

Cui H, Wan Z, Wei X, Nepal S, Yi X (2020) Pay as you decrypt: Decryption outsourcing for functional encryption using blockchain. IEEE Trans Inf Forensic Secur 15:3227–3238. https://doi.org/10.1109/TIFS.2020.2973864

Duan H, Du Y, Zheng L, Wang C, Au MH, Wang Q (2023) Towards practical auditing of dynamic data in decentralized storage. IEEE Trans Dependable Secure Comput 20(1):708–723. https://doi.org/10.1109/TDSC.2022.3142611

Sasikumar A, Ravi L, Kotecha K, Abraham A, Devarajan M, Vairavasundaram S (2023) A secure big data storage framework based on blockchain consensus mechanism with flexible finality. IEEE Access 11:56712–56725. https://doi.org/10.1109/ACCESS.2023.3282322

Wang T, Zhou J, Chen X, Wang G, Liu A, Liu Y (2018) A three-layer privacy preserving cloud storage scheme based on computational intelligence in fog computing. IEEE Trans Emerg Top Comput Intell 2(1):3–12

Bonomi F, Milito R, Zhu J, Addepalli S (2012) Fog computing and its role in the internet of things. In: Proceedings of the first edition of the MCC workshop on Mobile cloud computing. pp 13–16

Li B, He Q, Chen F, Jin H, Xiang Y, Yang Y (2020) Auditing cache data integrity in the edge computing environment. IEEE Trans Parallel Distrib Syst 32(5):1210–1223.

Li B, He Q, Chen F, Jin H, Xiang Y, Yang Y (2021) Inspecting edge data integrity with aggregated signature in distributed edge computing environment. IEEE Trans Cloud Comput 10(4):2691–2703.


Acknowledgements

Author information

Authors and Affiliations

Department of Computer Engineering, Mizoram University, Aizawl, MZ, 796004, India

Paromita Goswami & Ajoy Kumar Khan

Department of Computer Engineering and Application, GLA University, Mathura, UP, 281406, India

Paromita Goswami & Neetu Faujdar

Department of Computer Science and Engineering, Tripura University, Agartala, Tripura, 796022, India

Somen Debnath

Centre for Smart Information and Communication Systems, Department of Electrical and Electronic Engineering Science, University of Johannesburg, Auckland Park Campus, PO Box 524, Johannesburg, 2006, South Africa

Ghanshyam Singh


Contributions

Paromita Goswami, Neetu Faujdar and Somen Debnath devised the proposed methodology and wrote the main manuscript text. Ajoy Kumar Khan prepared the tables and Ghanshyam Singh prepared the figures. Ghanshyam Singh and Ajay Kumar Singh also wrote the literature review, and all authors reviewed the whole manuscript.

Corresponding author

Correspondence to Neetu Faujdar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Goswami, P., Faujdar, N., Debnath, S. et al. Investigation on storage level data integrity strategies in cloud computing: classification, security obstructions, challenges and vulnerability. J Cloud Comp 13, 45 (2024). https://doi.org/10.1186/s13677-024-00605-z


Received: 06 May 2023

Accepted: 30 January 2024

Published: 15 February 2024

DOI: https://doi.org/10.1186/s13677-024-00605-z


  • Cloud computing
  • Data integrity
  • Security attacks
  • Cloud storage
  • Data auditing
  • Security challenges


Computer Science > Cryptography and Security

Title: Enhancing Data Security for Cloud Computing Applications through Distributed Blockchain-based SDN Architecture in IoT Networks

Abstract: Blockchain (BC) and Software Defined Networking (SDN) are some of the most prominent emerging technologies in recent research. These technologies provide security, integrity, as well as confidentiality in their respective applications. Cloud computing has also been a popular comprehensive technology for several years. Confidential information is often shared with the cloud infrastructure to give customers access to remote resources, such as computation and storage operations. However, cloud computing also presents substantial security threats, issues, and challenges. Therefore, to overcome these difficulties, we propose integrating Blockchain and SDN in the cloud computing platform. In this research, we introduce the architecture to better secure clouds. Moreover, we leverage a distributed Blockchain approach to convey security, confidentiality, privacy, integrity, adaptability, and scalability in the proposed architecture. BC provides a distributed or decentralized and efficient environment for users. Also, we present an SDN approach to improving the reliability, stability, and load balancing capabilities of the cloud infrastructure. Finally, we provide an experimental evaluation of the performance of our SDN and BC-based implementation using different parameters, also monitoring some attacks in the system and proving its efficacy.


The Research Study on Identification of Threats and Security Techniques in Cloud Environment


IMAGES

  1. (PDF) DATA SECURITY IN CLOUD COMPUTING

  2. (PDF) Privacy Protection and Data Security in Cloud Computing: A Survey

  3. (PDF) An Overview on Data Security in Cloud Computing

  4. (PDF) Data Security Implementations in Cloud Computing: A Critical Review

  5. (PDF) A Data Security Self-Attribute System in Cloud Computing

  6. (PDF) A Survey Paper on Data security in Cloud Computing

VIDEO

  1. Integrated Privacy Framework

  2. Cloud computing Question paper & Answers 20CS53I#Question Paper #Diploma Question paper July 2023

  3. Panel Discussion: Future of Cloud Security

  4. 16 Essential IT Security Policies for Business #shorts #cybersecurity #datasecurity #cloudsecurity

  5. Module 4

  6. Three core components of data security,cloud infrastructure monitoring

COMMENTS

  1. A Systematic Literature Review on Cloud Computing Security: Threats and

    This systematic literature review (SLR) is aimed to review the existing research studies on cloud computing security, threats, and challenges. This SLR examined the research studies published between 2010 and 2020 within the popular digital libraries.

  2. (PDF) DATA SECURITY ON CLOUD COMPUTING

    This paper addresses data security in cloud computing. The recent development in Computing has drastically changed everyone's perception of the Infrastructure Architecture and...

  3. Security and privacy protection in cloud computing: Discussions and

    We discuss the research progress of several technologies, such as access control; ciphertext policy attribute-based encryption (CP-ABE); key policy attribute-based encryption (KP-ABE); the fine-grain, multi-authority, proxy re-encryption (PRE); hierarchical encryption; searchable encryption (SE).

  4. LITERATURE REVIEW ON DATA SECURITY IN CLOUD COMPUTING

    LITERATURE REVIEW ON DATA SECURITY IN CLOUD COMPUTING. Authors: Abraham Ekow Dadzie, Sharda University. Abstract: Cloud Computing is now a worldwide concept which is being utilized by majority of...

  5. Data Security and Privacy in Cloud Computing

    In this paper, we make a comparative research analysis of the existing research work regarding the data security and privacy protection techniques used in cloud computing. 1. Introduction: Cloud computing has been envisioned as the next generation paradigm in computation.

  6. (PDF) Data Security in Cloud Computing

    This paper discusses the security of data in cloud computing. It is a study of data in the cloud and aspects related to it concerning security. The paper will go in to details of...

  7. Exploring Data Security Issues and Solutions in Cloud Computing

    This paper explores the different data security issues in cloud computing in a multi-tenant environment and proposes methods to overcome the security issues. This paper also describes Cloud computing models such as the deployment models and the service delivery models.

  8. [2108.09508] Data Security and Privacy in Cloud Computing: Concepts and

    Data security and privacy are inevitable requirements of the cloud environment. Massive usage and sharing of data among users opens the door to security loopholes. This paper envisages a discussion of the cloud environment, its utilities, challenges, and emerging research trends confined to secure processing and sharing of data.

  9. A survey on security challenges in cloud computing: issues, threats

    Cloud computing has gained huge attention over the past decades because of continuously increasing demands. There are several advantages to organizations moving toward cloud-based data storage solutions. These include simplified IT infrastructure and management, remote access from effectively anywhere in the world with a stable Internet connection and the cost efficiencies that cloud computing ...

  10. Privacy Protection and Data Security in Cloud Computing: A Survey

    In recent years, there are many research schemes of cloud computing privacy protection based on access control, attribute-based encryption (ABE), trust and reputation, but they are scattered and lack unified logic. In this paper, we systematically review and analyze relevant research achievements.

  11. A new lightweight data security system for data security in the cloud

    1. Introduction. Cloud computing is a standard for massive computation, where several scattered and parallel designs are integrated. Utility computing, virtualization, server systems, and parallel computing provide services like networks, space, and connectivity gear, which are expected to be paid for and beyond [1]. Cloud storage security solutions emphasize data encryption from design to ...

  12. Data Security and Privacy in the Cloud

    Abstract. Achieving data security and privacy in the cloud means ensuring confidentiality and integrity of data and computations, and protection from non authorized accesses. Satisfaction of such requirements entails non trivial challenges, as relying on external servers, owners lose control on their data. In this paper, we discuss the problems ...

  13. An Overview on Data Security in Cloud Computing

    This paper is an overview of data security issues in cloud computing. Its objective is to highlight the principal issues related to data security that are raised by the cloud environment. To do this, these issues were classified into three categories: 1-data security issues raised by single ...

  14. Data security in cloud computing

    Abstract: This paper discusses the security of data in cloud computing. It is a study of data in the cloud and aspects related to it concerning security. The paper will go in to details of data protection methods and approaches used throughout the world to ensure maximum data protection by reducing risks and threats.

  15. (PDF) Data Security in Cloud Computing

    ... Vaidya et al. [8] analyzed the data security problem in cloud storage, which is mostly a distributed storage system. To verify erasure-coded data using RSA encryption, existing methods...

  16. Securing Machine Learning in the Cloud: A Systematic Review of Cloud

    With the advances in machine learning (ML) and deep learning (DL) techniques, and the potency of cloud computing in offering services efficiently and cost-effectively, Machine Learning as a Service (MLaaS) cloud platforms have become popular. In addition, there is increasing adoption of third-party cloud services for outsourcing training of DL models, which requires substantial costly ...

  17. PDF Data Security in Cloud Computing

    DATA SECURITY IN CLOUD COMPUTING. R. SANGEETHA, M.M.E.S WOMEN'S ARTS AND SCIENCE COLLEGE, MELVISHRAM. ABSTRACT: Cloud computing is a model which enables widespread access to a shared pool of resources including the characteristics of scalability, virtualization and many others.

  18. Investigation on storage level data integrity strategies in cloud

    Cloud computing provides outsourcing of computing services at a lower cost, making it a popular choice for many businesses. In recent years, cloud data storage has gained significant success, thanks to its advantages in maintenance, performance, support, cost, and reliability compared to traditional storage methods. However, despite the benefits of disaster recovery, scalability, and resource ...

  19. Research on Data Security in Big Data Cloud Computing Environment

    This paper delivers an overview of conceptions, characteristics and advanced technologies for big data cloud computing. Security issues of data quality and privacy control are elaborated pertaining to data access, data isolation, data integrity, data destruction, data transmission and data sharing.

  20. Enhancing Data Security for Cloud Computing Applications through

    Blockchain (BC) and Software Defined Networking (SDN) are some of the most prominent emerging technologies in recent research. These technologies provide security, integrity, as well as confidentiality in their respective applications. Cloud computing has also been a popular comprehensive technology for several years. Confidential information is often shared with the cloud infrastructure to ...

  21. (PDF) Cloud Security

    This paper reviews the cloud security issues and concerns, while addressing various key topics like vulnerabilities, threats and mitigations, and cloud models. Discover the world's...

  22. Adoption of cloud computing as innovation in the organization

    The computing infrastructure in a private cloud is specifically provided for an organization and is not allowed to be shared with other organizations. 15 When enterprises are not able to host their data remotely, both cloud computing providers and users have the optimal infrastructure and security management. They elected to utilize private ...

  23. Review Paper on Data Security in Cloud Computing Environment

    Abstract: In today's scenario, cloud computing is a growing concept in the computer and information technology field for storing data, because huge amounts of sensitive or confidential data cannot be stored on a physical storage device.

  24. (PDF) Analysis of Security Algorithms in Cloud

    This paper discusses the comparison of various cryptographic encryption algorithms with their various key features & then later discusses their performance cost based on the encryption time,...

  25. Gartner Emerging Technologies and Trends Impact Radar for 2024

    Use this year's Gartner Emerging Tech Impact Radar to: ☑️Enhance your competitive edge in the smart world ☑️Prioritize prevalent and impactful GenAI use cases that already deliver real value to users ☑️Balance stimulating growth and mitigating risk ☑️Identify relevant emerging technologies that support your strategic product roadmap Explore all 30 technologies and trends: www ...

  26. The Research Study on Identification of Threats and Security Techniques

    Cloud computing allows you to share advantages over the Internet with the use of Internet composites, group of jobs, extra room and various programming initiatives. Cloud master associations can lease different benefits to customer needs and companies can pay for cloud client businesses. Nonetheless, the various security issues related to all cloud benefits, programming, virtualization ...