• Survey Paper
  • Open access
  • Published: 01 July 2020

Cybersecurity data science: an overview from machine learning perspective

  • Iqbal H. Sarker   ORCID: orcid.org/0000-0003-1740-5517 1 , 2 ,
  • A. S. M. Kayes 3 ,
  • Shahriar Badsha 4 ,
  • Hamed Alqahtani 5 ,
  • Paul Watters 3 &
  • Alex Ng 3  

Journal of Big Data volume  7 , Article number:  41 ( 2020 ) Cite this article

134k Accesses

218 Citations

51 Altmetric

Metrics details

In a computing context, cybersecurity is undergoing massive shifts in technology and its operations in recent days, and data science is driving the change. Extracting security incident patterns or insights from cybersecurity data and building corresponding data-driven model , is the key to make a security system automated and intelligent. To understand and analyze the actual phenomena with data, various scientific methods, machine learning techniques, processes, and systems are used, which is commonly known as data science. In this paper, we focus and briefly discuss on cybersecurity data science , where the data is being gathered from relevant cybersecurity sources, and the analytics complement the latest data-driven patterns for providing more effective security solutions. The concept of cybersecurity data science allows making the computing process more actionable and intelligent as compared to traditional ones in the domain of cybersecurity. We then discuss and summarize a number of associated research issues and future directions . Furthermore, we provide a machine learning based multi-layered framework for the purpose of cybersecurity modeling. Overall, our goal is not only to discuss cybersecurity data science and relevant methods but also to focus the applicability towards data-driven intelligent decision making for protecting the systems from cyber-attacks.

Introduction

Due to the increasing dependency on digitalization and Internet-of-Things (IoT) [ 1 ], various security incidents such as unauthorized access [ 2 ], malware attack [ 3 ], zero-day attack [ 4 ], data breach [ 5 ], denial of service (DoS) [ 2 ], social engineering or phishing [ 6 ] etc. have grown at an exponential rate in recent years. For instance, in 2010, there were less than 50 million unique malware executables known to the security community. By 2012, they were double around 100 million, and in 2019, there are more than 900 million malicious executables known to the security community, and this number is likely to grow, according to the statistics of AV-TEST institute in Germany [ 7 ]. Cybercrime and attacks can cause devastating financial losses and affect organizations and individuals as well. It’s estimated that, a data breach costs 8.19 million USD for the United States and 3.9 million USD on an average [ 8 ], and the annual cost to the global economy from cybercrime is 400 billion USD [ 9 ]. According to Juniper Research [ 10 ], the number of records breached each year to nearly triple over the next 5 years. Thus, it’s essential that organizations need to adopt and implement a strong cybersecurity approach to mitigate the loss. According to [ 11 ], the national security of a country depends on the business, government, and individual citizens having access to applications and tools which are highly secure, and the capability on detecting and eliminating such cyber-threats in a timely way. Therefore, to effectively identify various cyber incidents either previously seen or unseen, and intelligently protect the relevant systems from such cyber-attacks, is a key issue to be solved urgently.

figure 1

Popularity trends of data science, machine learning and cybersecurity over time, where x-axis represents the timestamp information and y axis represents the corresponding popularity values

Cybersecurity is a set of technologies and processes designed to protect computers, networks, programs and data from attack, damage, or unauthorized access [ 12 ]. In recent days, cybersecurity is undergoing massive shifts in technology and its operations in the context of computing, and data science (DS) is driving the change, where machine learning (ML), a core part of “Artificial Intelligence” (AI) can play a vital role to discover the insights from data. Machine learning can significantly change the cybersecurity landscape and data science is leading a new scientific paradigm [ 13 , 14 ]. The popularity of these related technologies is increasing day-by-day, which is shown in Fig.  1 , based on the data of the last five years collected from Google Trends [ 15 ]. The figure represents timestamp information in terms of a particular date in the x-axis and corresponding popularity in the range of 0 (minimum) to 100 (maximum) in the y-axis. As shown in Fig.  1 , the popularity indication values of these areas are less than 30 in 2014, while they exceed 70 in 2019, i.e., more than double in terms of increased popularity. In this paper, we focus on cybersecurity data science (CDS), which is broadly related to these areas in terms of security data processing techniques and intelligent decision making in real-world applications. Overall, CDS is security data-focused, applies machine learning methods to quantify cyber risks, and ultimately seeks to optimize cybersecurity operations. Thus, the purpose of this paper is for those academia and industry people who want to study and develop a data-driven smart cybersecurity model based on machine learning techniques. Therefore, great emphasis is placed on a thorough description of various types of machine learning methods, and their relations and usage in the context of cybersecurity. This paper does not describe all of the different techniques used in cybersecurity in detail; instead, it gives an overview of cybersecurity data science modeling based on artificial intelligence, particularly from machine learning perspective.

The ultimate goal of cybersecurity data science is data-driven intelligent decision making from security data for smart cybersecurity solutions. CDS represents a partial paradigm shift from traditional well-known security solutions such as firewalls, user authentication and access control, cryptography systems etc. that might not be effective according to today’s need in cyber industry [ 16 , 17 , 18 , 19 ]. The problems are these are typically handled statically by a few experienced security analysts, where data management is done in an ad-hoc manner [ 20 , 21 ]. However, as an increasing number of cybersecurity incidents in different formats mentioned above continuously appear over time, such conventional solutions have encountered limitations in mitigating such cyber risks. As a result, numerous advanced attacks are created and spread very quickly throughout the Internet. Although several researchers use various data analysis and learning techniques to build cybersecurity models that are summarized in “ Machine learning tasks in cybersecurity ” section, a comprehensive security model based on the effective discovery of security insights and latest security patterns could be more useful. To address this issue, we need to develop more flexible and efficient security mechanisms that can respond to threats and to update security policies to mitigate them intelligently in a timely manner. To achieve this goal, it is inherently required to analyze a massive amount of relevant cybersecurity data generated from various sources such as network and system sources, and to discover insights or proper security policies with minimal human intervention in an automated manner.

Analyzing cybersecurity data and building the right tools and processes to successfully protect against cybersecurity incidents goes beyond a simple set of functional requirements and knowledge about risks, threats or vulnerabilities. For effectively extracting the insights or the patterns of security incidents, several machine learning techniques, such as feature engineering, data clustering, classification, and association analysis, or neural network-based deep learning techniques can be used, which are briefly discussed in “ Machine learning tasks in cybersecurity ” section. These learning techniques are capable to find the anomalies or malicious behavior and data-driven patterns of associated security incidents to make an intelligent decision. Thus, based on the concept of data-driven decision making, we aim to focus on cybersecurity data science , where the data is being gathered from relevant cybersecurity sources such as network activity, database activity, application activity, or user activity, and the analytics complement the latest data-driven patterns for providing corresponding security solutions.

The contributions of this paper are summarized as follows.

We first make a brief discussion on the concept of cybersecurity data science and relevant methods to understand its applicability towards data-driven intelligent decision making in the domain of cybersecurity. For this purpose, we also make a review and brief discussion on different machine learning tasks in cybersecurity, and summarize various cybersecurity datasets highlighting their usage in different data-driven cyber applications.

We then discuss and summarize a number of associated research issues and future directions in the area of cybersecurity data science, that could help both the academia and industry people to further research and development in relevant application areas.

Finally, we provide a generic multi-layered framework of the cybersecurity data science model based on machine learning techniques. In this framework, we briefly discuss how the cybersecurity data science model can be used to discover useful insights from security data and making data-driven intelligent decisions to build smart cybersecurity systems.

The remainder of the paper is organized as follows. “ Background ” section summarizes background of our study and gives an overview of the related technologies of cybersecurity data science. “ Cybersecurity data science ” section defines and discusses briefly about cybersecurity data science including various categories of cyber incidents data. In “  Machine learning tasks in cybersecurity ” section, we briefly discuss various categories of machine learning techniques including their relations with cybersecurity tasks and summarize a number of machine learning based cybersecurity models in the field. “ Research issues and future directions ” section briefly discusses and highlights various research issues and future directions in the area of cybersecurity data science. In “  A multi-layered framework for smart cybersecurity services ” section, we suggest a machine learning-based framework to build cybersecurity data science model and discuss various layers with their roles. In “  Discussion ” section, we highlight several key points regarding our studies. Finally,  “ Conclusion ” section concludes this paper.

In this section, we give an overview of the related technologies of cybersecurity data science including various types of cybersecurity incidents and defense strategies.

  • Cybersecurity

Over the last half-century, the information and communication technology (ICT) industry has evolved greatly, which is ubiquitous and closely integrated with our modern society. Thus, protecting ICT systems and applications from cyber-attacks has been greatly concerned by the security policymakers in recent days [ 22 ]. The act of protecting ICT systems from various cyber-threats or attacks has come to be known as cybersecurity [ 9 ]. Several aspects are associated with cybersecurity: measures to protect information and communication technology; the raw data and information it contains and their processing and transmitting; associated virtual and physical elements of the systems; the degree of protection resulting from the application of those measures; and eventually the associated field of professional endeavor [ 23 ]. Craigen et al. defined “cybersecurity as a set of tools, practices, and guidelines that can be used to protect computer networks, software programs, and data from attack, damage, or unauthorized access” [ 24 ]. According to Aftergood et al. [ 12 ], “cybersecurity is a set of technologies and processes designed to protect computers, networks, programs and data from attacks and unauthorized access, alteration, or destruction”. Overall, cybersecurity concerns with the understanding of diverse cyber-attacks and devising corresponding defense strategies that preserve several properties defined as below [ 25 , 26 ].

Confidentiality is a property used to prevent the access and disclosure of information to unauthorized individuals, entities or systems.

Integrity is a property used to prevent any modification or destruction of information in an unauthorized manner.

Availability is a property used to ensure timely and reliable access of information assets and systems to an authorized entity.

The term cybersecurity applies in a variety of contexts, from business to mobile computing, and can be divided into several common categories. These are - network security that mainly focuses on securing a computer network from cyber attackers or intruders; application security that takes into account keeping the software and the devices free of risks or cyber-threats; information security that mainly considers security and the privacy of relevant data; operational security that includes the processes of handling and protecting data assets. Typical cybersecurity systems are composed of network security systems and computer security systems containing a firewall, antivirus software, or an intrusion detection system [ 27 ].

Cyberattacks and security risks

The risks typically associated with any attack, which considers three security factors, such as threats, i.e., who is attacking, vulnerabilities, i.e., the weaknesses they are attacking, and impacts, i.e., what the attack does [ 9 ]. A security incident is an act that threatens the confidentiality, integrity, or availability of information assets and systems. Several types of cybersecurity incidents that may result in security risks on an organization’s systems and networks or an individual [ 2 ]. These are:

Unauthorized access that describes the act of accessing information to network, systems or data without authorization that results in a violation of a security policy [ 2 ];

Malware known as malicious software, is any program or software that intentionally designed to cause damage to a computer, client, server, or computer network, e.g., botnets. Examples of different types of malware including computer viruses, worms, Trojan horses, adware, ransomware, spyware, malicious bots, etc. [ 3 , 26 ]; Ransom malware, or ransomware , is an emerging form of malware that prevents users from accessing their systems or personal files, or the devices, then demands an anonymous online payment in order to restore access.

Denial-of-Service is an attack meant to shut down a machine or network, making it inaccessible to its intended users by flooding the target with traffic that triggers a crash. The Denial-of-Service (DoS) attack typically uses one computer with an Internet connection, while distributed denial-of-service (DDoS) attack uses multiple computers and Internet connections to flood the targeted resource [ 2 ];

Phishing a type of social engineering , used for a broad range of malicious activities accomplished through human interactions, in which the fraudulent attempt takes part to obtain sensitive information such as banking and credit card details, login credentials, or personally identifiable information by disguising oneself as a trusted individual or entity via an electronic communication such as email, text, or instant message, etc. [ 26 ];

Zero-day attack is considered as the term that is used to describe the threat of an unknown security vulnerability for which either the patch has not been released or the application developers were unaware [ 4 , 28 ].

Beside these attacks mentioned above, privilege escalation [ 29 ], password attack [ 30 ], insider threat [ 31 ], man-in-the-middle [ 32 ], advanced persistent threat [ 33 ], SQL injection attack [ 34 ], cryptojacking attack [ 35 ], web application attack [ 30 ] etc. are well-known as security incidents in the field of cybersecurity. A data breach is another type of security incident, known as a data leak, which is involved in the unauthorized access of data by an individual, application, or service [ 5 ]. Thus, all data breaches are considered as security incidents, however, all the security incidents are not data breaches. Most data breaches occur in the banking industry involving the credit card numbers, personal information, followed by the healthcare sector and the public sector [ 36 ].

Cybersecurity defense strategies

Defense strategies are needed to protect data or information, information systems, and networks from cyber-attacks or intrusions. More granularly, they are responsible for preventing data breaches or security incidents and monitoring and reacting to intrusions, which can be defined as any kind of unauthorized activity that causes damage to an information system [ 37 ]. An intrusion detection system (IDS) is typically represented as “a device or software application that monitors a computer network or systems for malicious activity or policy violations” [ 38 ]. The traditional well-known security solutions such as anti-virus, firewalls, user authentication, access control, data encryption and cryptography systems, however might not be effective according to today’s need in the cyber industry

[ 16 , 17 , 18 , 19 ]. On the other hand, IDS resolves the issues by analyzing security data from several key points in a computer network or system [ 39 , 40 ]. Moreover, intrusion detection systems can be used to detect both internal and external attacks.

Intrusion detection systems are different categories according to the usage scope. For instance, a host-based intrusion detection system (HIDS), and network intrusion detection system (NIDS) are the most common types based on the scope of single computers to large networks. In a HIDS, the system monitors important files on an individual system, while it analyzes and monitors network connections for suspicious traffic in a NIDS. Similarly, based on methodologies, the signature-based IDS, and anomaly-based IDS are the most well-known variants [ 37 ].

Signature-based IDS : A signature can be a predefined string, pattern, or rule that corresponds to a known attack. A particular pattern is identified as the detection of corresponding attacks in a signature-based IDS. An example of a signature can be known patterns or a byte sequence in a network traffic, or sequences used by malware. To detect the attacks, anti-virus software uses such types of sequences or patterns as a signature while performing the matching operation. Signature-based IDS is also known as knowledge-based or misuse detection [ 41 ]. This technique can be efficient to process a high volume of network traffic, however, is strictly limited to the known attacks only. Thus, detecting new attacks or unseen attacks is one of the biggest challenges faced by this signature-based system.

Anomaly-based IDS : The concept of anomaly-based detection overcomes the issues of signature-based IDS discussed above. In an anomaly-based intrusion detection system, the behavior of the network is first examined to find dynamic patterns, to automatically create a data-driven model, to profile the normal behavior, and thus it detects deviations in the case of any anomalies [ 41 ]. Thus, anomaly-based IDS can be treated as a dynamic approach, which follows behavior-oriented detection. The main advantage of anomaly-based IDS is the ability to identify unknown or zero-day attacks [ 42 ]. However, the issue is that the identified anomaly or abnormal behavior is not always an indicator of intrusions. It sometimes may happen because of several factors such as policy changes or offering a new service.

In addition, a hybrid detection approach [ 43 , 44 ] that takes into account both the misuse and anomaly-based techniques discussed above can be used to detect intrusions. In a hybrid system, the misuse detection system is used for detecting known types of intrusions and anomaly detection system is used for novel attacks [ 45 ]. Beside these approaches, stateful protocol analysis can also be used to detect intrusions that identifies deviations of protocol state similarly to the anomaly-based method, however it uses predetermined universal profiles based on accepted definitions of benign activity [ 41 ]. In Table 1 , we have summarized these common approaches highlighting their pros and cons. Once the detecting has been completed, the intrusion prevention system (IPS) that is intended to prevent malicious events, can be used to mitigate the risks in different ways such as manual, providing notification, or automatic process [ 46 ]. Among these approaches, an automatic response system could be more effective as it does not involve a human interface between the detection and response systems.

  • Data science

We are living in the age of data, advanced analytics, and data science, which are related to data-driven intelligent decision making. Although, the process of searching patterns or discovering hidden and interesting knowledge from data is known as data mining [ 47 ], in this paper, we use the broader term “data science” rather than data mining. The reason is that, data science, in its most fundamental form, is all about understanding of data. It involves studying, processing, and extracting valuable insights from a set of information. In addition to data mining, data analytics is also related to data science. The development of data mining, knowledge discovery, and machine learning that refers creating algorithms and program which learn on their own, together with the original data analysis and descriptive analytics from the statistical perspective, forms the general concept of “data analytics” [ 47 ]. Nowadays, many researchers use the term “data science” to describe the interdisciplinary field of data collection, preprocessing, inferring, or making decisions by analyzing the data. To understand and analyze the actual phenomena with data, various scientific methods, machine learning techniques, processes, and systems are used, which is commonly known as data science. According to Cao et al. [ 47 ] “data science is a new interdisciplinary field that synthesizes and builds on statistics, informatics, computing, communication, management, and sociology to study data and its environments, to transform data to insights and decisions by following a data-to-knowledge-to-wisdom thinking and methodology”. As a high-level statement in the context of cybersecurity, we can conclude that it is the study of security data to provide data-driven solutions for the given security problems, as known as “the science of cybersecurity data”. Figure 2 shows the typical data-to-insight-to-decision transfer at different periods and general analytic stages in data science, in terms of a variety of analytics goals (G) and approaches (A) to achieve the data-to-decision goal [ 47 ].

figure 2

Data-to-insight-to-decision analytic stages in data science [ 47 ]

Based on the analytic power of data science including machine learning techniques, it can be a viable component of security strategies. By using data science techniques, security analysts can manipulate and analyze security data more effectively and efficiently, uncovering valuable insights from data. Thus, data science methodologies including machine learning techniques can be well utilized in the context of cybersecurity, in terms of problem understanding, gathering security data from diverse sources, preparing data to feed into the model, data-driven model building and updating, for providing smart security services, which motivates to define cybersecurity data science and to work in this research area.

Cybersecurity data science

In this section, we briefly discuss cybersecurity data science including various categories of cyber incidents data with the usage in different application areas, and the key terms and areas related to our study.

Understanding cybersecurity data

Data science is largely driven by the availability of data [ 48 ]. Datasets typically represent a collection of information records that consist of several attributes or features and related facts, in which cybersecurity data science is based on. Thus, it’s important to understand the nature of cybersecurity data containing various types of cyberattacks and relevant features. The reason is that raw security data collected from relevant cyber sources can be used to analyze the various patterns of security incidents or malicious behavior, to build a data-driven security model to achieve our goal. Several datasets exist in the area of cybersecurity including intrusion analysis, malware analysis, anomaly, fraud, or spam analysis that are used for various purposes. In Table 2 , we summarize several such datasets including their various features and attacks that are accessible on the Internet, and highlight their usage based on machine learning techniques in different cyber applications. Effectively analyzing and processing of these security features, building target machine learning-based security model according to the requirements, and eventually, data-driven decision making, could play a role to provide intelligent cybersecurity services that are discussed briefly in “ A multi-layered framework for smart cybersecurity services ” section.

Defining cybersecurity data science

Data science is transforming the world’s industries. It is critically important for the future of intelligent cybersecurity systems and services because of “security is all about data”. When we seek to detect cyber threats, we are analyzing the security data in the form of files, logs, network packets, or other relevant sources. Traditionally, security professionals didn’t use data science techniques to make detections based on these data sources. Instead, they used file hashes, custom-written rules like signatures, or manually defined heuristics [ 21 ]. Although these techniques have their own merits in several cases, it needs too much manual work to keep up with the changing cyber threat landscape. On the contrary, data science can make a massive shift in technology and its operations, where machine learning algorithms can be used to learn or extract insight of security incident patterns from the training data for their detection and prevention. For instance, to detect malware or suspicious trends, or to extract policy rules, these techniques can be used.

In recent days, the entire security industry is moving towards data science, because of its capability to transform raw data into decision making. To do this, several data-driven tasks can be associated, such as—(i) data engineering focusing practical applications of data gathering and analysis; (ii) reducing data volume that deals with filtering significant and relevant data to further analysis; (iii) discovery and detection that focuses on extracting insight or incident patterns or knowledge from data; (iv) automated models that focus on building data-driven intelligent security model; (v) targeted security  alerts focusing on the generation of remarkable security alerts based on discovered knowledge that minimizes the false alerts, and (vi) resource optimization that deals with the available resources to achieve the target goals in a security system. While making data-driven decisions, behavioral analysis could also play a significant role in the domain of cybersecurity [ 81 ].

Thus, the concept of cybersecurity data science incorporates the methods and techniques of data science and machine learning as well as the behavioral analytics of various security incidents. The combination of these technologies has given birth to the term “cybersecurity data science”, which refers to collect a large amount of security event data from different sources and analyze it using machine learning technologies for detecting security risks or attacks either through the discovery of useful insights or the latest data-driven patterns. It is, however, worth remembering that cybersecurity data science is not just about a collection of machine learning algorithms, rather,  a process that can help security professionals or analysts to scale and automate their security activities in a smart way and in a timely manner. Therefore, the formal definition can be as follows: “Cybersecurity data science is a research or working area existing at the intersection of cybersecurity, data science, and machine learning or artificial intelligence, which is mainly security data-focused, applies machine learning methods, attempts to quantify cyber-risks or incidents, and promotes inferential techniques to analyze behavioral patterns in security data. It also focuses on generating security response alerts, and eventually seeks for optimizing cybersecurity solutions, to build automated and intelligent cybersecurity systems.”

Table  3 highlights some key terms associated with cybersecurity data science. Overall, the outputs of cybersecurity data science are typically security data products, which can be a data-driven security model, policy rule discovery, risk or attack prediction, potential security service and recommendation, or the corresponding security system depending on the given security problem in the domain of cybersecurity. In the next section, we briefly discuss various machine learning tasks with examples within the scope of our study.

Machine learning tasks in cybersecurity

Machine learning (ML) is typically considered as a branch of “Artificial Intelligence”, which is closely related to computational statistics, data mining and analytics, data science, particularly focusing on making the computers to learn from data [ 82 , 83 ]. Thus, machine learning models typically comprise of a set of rules, methods, or complex “transfer functions” that can be applied to find interesting data patterns, or to recognize or predict behavior [ 84 ], which could play an important role in the area of cybersecurity. In the following, we discuss different methods that can be used to solve machine learning tasks and how they are related to cybersecurity tasks.

Supervised learning

Supervised learning is performed when specific targets are defined to reach from a certain set of inputs, i.e., task-driven approach. In the area of machine learning, the most popular supervised learning techniques are known as classification and regression methods [ 129 ]. These techniques are popular to classify or predict the future for a particular security problem. For instance, to predict denial-of-service attack (yes, no) or to identify different classes of network attacks such as scanning and spoofing, classification techniques can be used in the cybersecurity domain. ZeroR [ 83 ], OneR [ 130 ], Navies Bayes [ 131 ], Decision Tree [ 132 , 133 ], K-nearest neighbors [ 134 ], support vector machines [ 135 ], adaptive boosting [ 136 ], and logistic regression [ 137 ] are the well-known classification techniques. In addition, recently Sarker et al. have proposed BehavDT [ 133 ], and IntruDtree [ 106 ] classification techniques that are able to effectively build a data-driven predictive model. On the other hand, to predict the continuous or numeric value, e.g., total phishing attacks in a certain period or predicting the network packet parameters, regression techniques are useful. Regression analyses can also be used to detect the root causes of cybercrime and other types of fraud [ 138 ]. Linear regression [ 82 ], support vector regression [ 135 ] are the popular regression techniques. The main difference between classification and regression is that the output variable in the regression is numerical or continuous, while the predicted output for classification is categorical or discrete. Ensemble learning is an extension of supervised learning while mixing different simple models, e.g., Random Forest learning [ 139 ] that generates multiple decision trees to solve a particular security task.

Unsupervised learning

In unsupervised learning problems, the main task is to find patterns, structures, or knowledge in unlabeled data, i.e., data-driven approach [ 140 ]. In the area of cybersecurity, cyber-attacks like malware stays hidden in some ways, include changing their behavior dynamically and autonomously to avoid detection. Clustering techniques, a type of unsupervised learning, can help to uncover the hidden patterns and structures from the datasets, to identify indicators of such sophisticated attacks. Similarly, in identifying anomalies, policy violations, detecting, and eliminating noisy instances in data, clustering techniques can be useful. K-means [ 141 ], K-medoids [ 142 ] are the popular partitioning clustering algorithms, and single linkage [ 143 ] or complete linkage [ 144 ] are the well-known hierarchical clustering algorithms used in various application domains. Moreover, a bottom-up clustering approach proposed by Sarker et al. [ 145 ] can also be used by taking into account the data characteristics.

Besides, feature engineering tasks like optimal feature selection or extraction related to a particular security problem could be useful for further analysis [ 106 ]. Recently, Sarker et al. [ 106 ] have proposed an approach for selecting security features according to their importance score values. Moreover, Principal component analysis, linear discriminant analysis, pearson correlation analysis, or non-negative matrix factorization are the popular dimensionality reduction techniques to solve such issues [ 82 ]. Association rule learning is another example, where machine learning based policy rules can prevent cyber-attacks. In an expert system, the rules are usually manually defined by a knowledge engineer working in collaboration with a domain expert [ 37 , 140 , 146 ]. Association rule learning on the contrary, is the discovery of rules or relationships among a set of available security features or attributes in a given dataset [ 147 ]. To quantify the strength of relationships, correlation analysis can be used [ 138 ]. Many association rule mining algorithms have been proposed in the area of machine learning and data mining literature, such as logic-based [ 148 ], frequent pattern based [ 149 , 150 , 151 ], tree-based [ 152 ], etc. Recently, Sarker et al. [ 153 ] have proposed an association rule learning approach considering non-redundant generation, that can be used to discover a set of useful security policy rules. Moreover, AIS [ 147 ], Apriori [ 149 ], Apriori-TID and Apriori-Hybrid [ 149 ], FP-Tree [ 152 ], and RARM [ 154 ], and Eclat [ 155 ] are the well-known association rule learning algorithms that are capable to solve such problems by generating a set of policy rules in the domain of cybersecurity.

Neural networks and deep learning

Deep learning is a part of machine learning in the area of artificial intelligence, which is a computational model that is inspired by the biological neural networks in the human brain [ 82 ]. Artificial Neural Network (ANN) is frequently used in deep learning and the most popular neural network algorithm is backpropagation [ 82 ]. It performs learning on a multi-layer feed-forward neural network consists of an input layer, one or more hidden layers, and an output layer. The main difference between deep learning and classical machine learning is its performance on the amount of security data increases. Typically deep learning algorithms perform well when the data volumes are large, whereas machine learning algorithms perform comparatively better on small datasets [ 44 ]. In our earlier work, Sarker et al. [ 129 ], we have illustrated the effectiveness of these approaches considering contextual datasets. However, deep learning approaches mimic the human brain mechanism to interpret large amount of data or the complex data such as images, sounds and texts [ 44 , 129 ]. In terms of feature extraction to build models, deep learning reduces the effort of designing a feature extractor for each problem than the classical machine learning techniques. Beside these characteristics, deep learning typically takes a long time to train an algorithm than a machine learning algorithm, however, the test time is exactly the opposite [ 44 ]. Thus, deep learning relies more on high-performance machines with GPUs than classical machine-learning algorithms [ 44 , 156 ]. The most popular deep neural network learning models include multi-layer perceptron (MLP) [ 157 ], convolutional neural network (CNN) [ 158 ], recurrent neural network (RNN) or long-short term memory (LSTM) network [ 121 , 158 ]. In recent days, researchers use these deep learning techniques for different purposes such as detecting network intrusions, malware traffic detection and classification, etc. in the domain of cybersecurity [ 44 , 159 ].

Other learning techniques

Semi-supervised learning can be described as a hybridization of supervised and unsupervised techniques discussed above, as it works on both the labeled and unlabeled data. In the area of cybersecurity, it could be useful, when it requires to label data automatically without human intervention, to improve the performance of cybersecurity models. Reinforcement techniques are another type of machine learning that characterizes an agent by creating its own learning experiences through interacting directly with the environment, i.e., environment-driven approach, where the environment is typically formulated as a Markov decision process and take decision based on a reward function [ 160 ]. Monte Carlo learning, Q-learning, Deep Q Networks, are the most common reinforcement learning algorithms [ 161 ]. For instance, in a recent work [ 126 ], the authors present an approach for detecting botnet traffic or malicious cyber activities using reinforcement learning combining with neural network classifier. In another work [ 128 ], the authors discuss about the application of deep reinforcement learning to intrusion detection for supervised problems, where they received the best results for the Deep Q-Network algorithm. In the context of cybersecurity, genetic algorithms that use fitness, selection, crossover, and mutation for finding optimization, could also be used to solve a similar class of learning problems [ 119 ].

Various types of machine learning techniques discussed above can be useful in the domain of cybersecurity, to build an effective security model. In Table  4 , we have summarized several machine learning techniques that are used to build various types of security models for various purposes. Although these models typically represent a learning-based security model, in this paper, we aim to focus on a comprehensive cybersecurity data science model and relevant issues, in order to build a data-driven intelligent security system. In the next section, we highlight several research issues and potential solutions in the area of cybersecurity data science.

Research issues and future directions

Our study opens several research issues and challenges in the area of cybersecurity data science to extract insight from relevant data towards data-driven intelligent decision making for cybersecurity solutions. In the following, we summarize these challenges ranging from data collection to decision making.

Cybersecurity datasets : Source datasets are the primary component to work in the area of cybersecurity data science. Most of the existing datasets are old and might insufficient in terms of understanding the recent behavioral patterns of various cyber-attacks. Although the data can be transformed into a meaningful understanding level after performing several processing tasks, there is still a lack of understanding of the characteristics of recent attacks and their patterns of happening. Thus, further processing or machine learning algorithms may provide a low accuracy rate for making the target decisions. Therefore, establishing a large number of recent datasets for a particular problem domain like cyber risk prediction or intrusion detection is needed, which could be one of the major challenges in cybersecurity data science.

Handling quality problems in cybersecurity datasets : The cyber datasets might be noisy, incomplete, insignificant, imbalanced, or may contain inconsistency instances related to a particular security incident. Such problems in a data set may affect the quality of the learning process and degrade the performance of the machine learning-based models [ 162 ]. To make a data-driven intelligent decision for cybersecurity solutions, such problems in data is needed to deal effectively before building the cyber models. Therefore, understanding such problems in cyber data and effectively handling such problems using existing algorithms or newly proposed algorithm for a particular problem domain like malware analysis or intrusion detection and prevention is needed, which could be another research issue in cybersecurity data science.

Security policy rule generation : Security policy rules reference security zones and enable a user to allow, restrict, and track traffic on the network based on the corresponding user or user group, and service, or the application. The policy rules including the general and more specific rules are compared against the incoming traffic in sequence during the execution, and the rule that matches the traffic is applied. The policy rules used in most of the cybersecurity systems are static and generated by human expertise or ontology-based [ 163 , 164 ]. Although, association rule learning techniques produce rules from data, however, there is a problem of redundancy generation [ 153 ] that makes the policy rule-set complex. Therefore, understanding such problems in policy rule generation and effectively handling such problems using existing algorithms or newly proposed algorithm for a particular problem domain like access control [ 165 ] is needed, which could be another research issue in cybersecurity data science.

Hybrid learning method : Most commercial products in the cybersecurity domain contain signature-based intrusion detection techniques [ 41 ]. However, missing features or insufficient profiling can cause these techniques to miss unknown attacks. In that case, anomaly-based detection techniques or hybrid technique combining signature-based and anomaly-based can be used to overcome such issues. A hybrid technique combining multiple learning techniques or a combination of deep learning and machine-learning methods can be used to extract the target insight for a particular problem domain like intrusion detection, malware analysis, access control, etc. and make the intelligent decision for corresponding cybersecurity solutions.

Protecting the valuable security information : Another issue of a cyber data attack is the loss of extremely valuable data and information, which could be damaging for an organization. With the use of encryption or highly complex signatures, one can stop others from probing into a dataset. In such cases, cybersecurity data science can be used to build a data-driven impenetrable protocol to protect such security information. To achieve this goal, cyber analysts can develop algorithms by analyzing the history of cyberattacks to detect the most frequently targeted chunks of data. Thus, understanding such data protecting problems and designing corresponding algorithms to effectively handling these problems, could be another research issue in the area of cybersecurity data science.

Context-awareness in cybersecurity : Existing cybersecurity work mainly originates from the relevant cyber data containing several low-level features. When data mining and machine learning techniques are applied to such datasets, a related pattern can be identified that describes it properly. However, a broader contextual information [ 140 , 145 , 166 ] like temporal, spatial, relationship among events or connections, dependency can be used to decide whether there exists a suspicious activity or not. For instance, some approaches may consider individual connections as DoS attacks, while security experts might not treat them as malicious by themselves. Thus, a significant limitation of existing cybersecurity work is the lack of using the contextual information for predicting risks or attacks. Therefore, context-aware adaptive cybersecurity solutions could be another research issue in cybersecurity data science.

Feature engineering in cybersecurity : The efficiency and effectiveness of a machine learning-based security model has always been a major challenge due to the high volume of network data with a large number of traffic features. The large dimensionality of data has been addressed using several techniques such as principal component analysis (PCA) [ 167 ], singular value decomposition (SVD) [ 168 ] etc. In addition to low-level features in the datasets, the contextual relationships between suspicious activities might be relevant. Such contextual data can be stored in an ontology or taxonomy for further processing. Thus how to effectively select the optimal features or extract the significant features considering both the low-level features as well as the contextual features, for effective cybersecurity solutions could be another research issue in cybersecurity data science.

Remarkable security alert generation and prioritizing : In many cases, the cybersecurity system may not be well defined and may cause a substantial number of false alarms that are unexpected in an intelligent system. For instance, an IDS deployed in a real-world network generates around nine million alerts per day [ 169 ]. A network-based intrusion detection system typically looks at the incoming traffic for matching the associated patterns to detect risks, threats or vulnerabilities and generate security alerts. However, to respond to each such alert might not be effective as it consumes relatively huge amounts of time and resources, and consequently may result in a self-inflicted DoS. To overcome this problem, a high-level management is required that correlate the security alerts considering the current context and their logical relationship including their prioritization before reporting them to users, which could be another research issue in cybersecurity data science.

Recency analysis in cybersecurity solutions : Machine learning-based security models typically use a large amount of static data to generate data-driven decisions. Anomaly detection systems rely on constructing such a model considering normal behavior and anomaly, according to their patterns. However, normal behavior in a large and dynamic security system is not well defined and it may change over time, which can be considered as an incremental growing of dataset. The patterns in incremental datasets might be changed in several cases. This often results in a substantial number of false alarms known as false positives. Thus, a recent malicious behavioral pattern is more likely to be interesting and significant than older ones for predicting unknown attacks. Therefore, effectively using the concept of recency analysis [ 170 ] in cybersecurity solutions could be another issue in cybersecurity data science.

The most important work for an intelligent cybersecurity system is to develop an effective framework that supports data-driven decision making. In such a framework, we need to consider advanced data analysis based on machine learning techniques, so that the framework is capable to minimize these issues and to provide automated and intelligent security services. Thus, a well-designed security framework for cybersecurity data and the experimental evaluation is a very important direction and a big challenge as well. In the next section, we suggest and discuss a data-driven cybersecurity framework based on machine learning techniques considering multiple processing layers.

A multi-layered framework for smart cybersecurity services

As discussed earlier, cybersecurity data science is data-focused, applies machine learning methods, attempts to quantify cyber risks, promotes inferential techniques to analyze behavioral patterns, focuses on generating security response alerts, and eventually seeks for optimizing cybersecurity operations. Hence, we briefly discuss a multiple data processing layered framework that potentially can be used to discover security insights from the raw data to build smart cybersecurity systems, e.g., dynamic policy rule-based access control or intrusion detection and prevention system. To make a data-driven intelligent decision in the resultant cybersecurity system, understanding the security problems and the nature of corresponding security data and their vast analysis is needed. For this purpose, our suggested framework not only considers the machine learning techniques to build the security model but also takes into account the incremental learning and dynamism to keep the model up-to-date and corresponding response generation, which could be more effective and intelligent for providing the expected services. Figure 3 shows an overview of the framework, involving several processing layers, from raw security event data to services. In the following, we briefly discuss the working procedure of the framework.

figure 3

A generic multi-layered framework based on machine learning techniques for smart cybersecurity services

Security data collecting

Collecting valuable cybersecurity data is a crucial step, which forms a connecting link between security problems in cyberinfrastructure and corresponding data-driven solution steps in this framework, shown in Fig.  3 . The reason is that cyber data can serve as the source for setting up ground truth of the security model that affect the model performance. The quality and quantity of cyber data decide the feasibility and effectiveness of solving the security problem according to our goal. Thus, the concern is how to collect valuable and unique needs data for building the data-driven security models.

The general step to collect and manage security data from diverse data sources is based on a particular security problem and project within the enterprise. Data sources can be classified into several broad categories such as network, host, and hybrid [ 171 ]. Within the network infrastructure, the security system can leverage different types of security data such as IDS logs, firewall logs, network traffic data, packet data, and honeypot data, etc. for providing the target security services. For instance, a given IP is considered malicious or not, could be detected by performing data analysis utilizing the data of IP addresses and their cyber activities. In the domain of cybersecurity, the network source mentioned above is considered as the primary security event source to analyze. In the host category, it collects data from an organization’s host machines, where the data sources can be operating system logs, database access logs, web server logs, email logs, application logs, etc. Collecting data from both the network and host machines are considered a hybrid category. Overall, in a data collection layer the network activity, database activity, application activity, and user activity can be the possible security event sources in the context of cybersecurity data science.

Security data preparing

After collecting the raw security data from various sources according to the problem domain discussed above, this layer is responsible to prepare the raw data for building the model by applying various necessary processes. However, not all of the collected data contributes to the model building process in the domain of cybersecurity [ 172 ]. Therefore, the useless data should be removed from the rest of the data captured by the network sniffer. Moreover, data might be noisy, have missing or corrupted values, or have attributes of widely varying types and scales. High quality of data is necessary for achieving higher accuracy in a data-driven model, which is a process of learning a function that maps an input to an output based on example input-output pairs. Thus, it might require a procedure for data cleaning, handling missing or corrupted values. Moreover, security data features or attributes can be in different types, such as continuous, discrete, or symbolic [ 106 ]. Beyond a solid understanding of these types of data and attributes and their permissible operations, its need to preprocess the data and attributes to convert into the target type. Besides, the raw data can be in different types such as structured, semi-structured, or unstructured, etc. Thus, normalization, transformation, or collation can be useful to organize the data in a structured manner. In some cases, natural language processing techniques might be useful depending on data type and characteristics, e.g., textual contents. As both the quality and quantity of data decide the feasibility of solving the security problem, effectively pre-processing and management of data and their representation can play a significant role to build an effective security model for intelligent services.

Machine learning-based security modeling

This is the core step where insights and knowledge are extracted from data through the application of cybersecurity data science. In this section, we particularly focus on machine learning-based modeling as machine learning techniques can significantly change the cybersecurity landscape. The security features or attributes and their patterns in data are of high interest to be discovered and analyzed to extract security insights. To achieve the goal, a deeper understanding of data and machine learning-based analytical models utilizing a large number of cybersecurity data can be effective. Thus, various machine learning tasks can be involved in this model building layer according to the solution perspective. These are - security feature engineering that mainly responsible to transform raw security data into informative features that effectively represent the underlying security problem to the data-driven models. Thus, several data-processing tasks such as feature transformation and normalization, feature selection by taking into account a subset of available security features according to their correlations or importance in modeling, or feature generation and extraction by creating new brand principal components, may be involved in this module according to the security data characteristics. For instance, the chi-squared test, analysis of variance test, correlation coefficient analysis, feature importance, as well as discriminant and principal component analysis, or singular value decomposition, etc. can be used for analyzing the significance of the security features to perform the security feature engineering tasks [ 82 ].

Another significant module is security data clustering that uncovers hidden patterns and structures through huge volumes of security data, to identify where the new threats exist. It typically involves the grouping of security data with similar characteristics, which can be used to solve several cybersecurity problems such as detecting anomalies, policy violations, etc. Malicious behavior or anomaly detection module is typically responsible to identify a deviation to a known behavior, where clustering-based analysis and techniques can also be used to detect malicious behavior or anomaly detection. In the cybersecurity area, attack classification or prediction is treated as one of the most significant modules, which is responsible to build a prediction model to classify attacks or threats and to predict future for a particular security problem. To predict denial-of-service attack or a spam filter separating tasks from other messages, could be the relevant examples. Association learning or policy rule generation module can play a role to build an expert security system that comprises several IF-THEN rules that define attacks. Thus, in a problem of policy rule generation for rule-based access control system, association learning can be used as it discovers the associations or relationships among a set of available security features in a given security dataset. The popular machine learning algorithms in these categories are briefly discussed in “  Machine learning tasks in cybersecurity ” section. The module model selection or customization is responsible to choose whether it uses the existing machine learning model or needed to customize. Analyzing data and building models based on traditional machine learning or deep learning methods, could achieve acceptable results in certain cases in the domain of cybersecurity. However, in terms of effectiveness and efficiency or other performance measurements considering time complexity, generalization capacity, and most importantly the impact of the algorithm on the detection rate of a system, machine learning models are needed to customize for a specific security problem. Moreover, customizing the related techniques and data could improve the performance of the resultant security model and make it better applicable in a cybersecurity domain. The modules discussed above can work separately and combinedly depending on the target security problems.

Incremental learning and dynamism

In our framework, this layer is concerned with finalizing the resultant security model by incorporating additional intelligence according to the needs. This could be possible by further processing in several modules. For instance, the post-processing and improvement module in this layer could play a role to simplify the extracted knowledge according to the particular requirements by incorporating domain-specific knowledge. As the attack classification or prediction models based on machine learning techniques strongly rely on the training data, it can hardly be generalized to other datasets, which could be significant for some applications. To address such kind of limitations, this module is responsible to utilize the domain knowledge in the form of taxonomy or ontology to improve attack correlation in cybersecurity applications.

Another significant module recency mining and updating security model is responsible to keep the security model up-to-date for better performance by extracting the latest data-driven security patterns. The extracted knowledge discussed in the earlier layer is based on a static initial dataset considering the overall patterns in the datasets. However, such knowledge might not be guaranteed higher performance in several cases, because of incremental security data with recent patterns. In many cases, such incremental data may contain different patterns which could conflict with existing knowledge. Thus, the concept of RecencyMiner [ 170 ] on incremental security data and extracting new patterns can be more effective than the existing old patterns. The reason is that recent security patterns and rules are more likely to be significant than older ones for predicting cyber risks or attacks. Rather than processing the whole security data again, recency-based dynamic updating according to the new patterns would be more efficient in terms of processing and outcome. This could make the resultant cybersecurity model intelligent and dynamic. Finally, response planning and decision making module is responsible to make decisions based on the extracted insights and take necessary actions to prevent the system from the cyber-attacks to provide automated and intelligent services. The services might be different depending on particular requirements for a given security problem.

Overall, this framework is a generic description which potentially can be used to discover useful insights from security data, to build smart cybersecurity systems, to address complex security challenges, such as intrusion detection, access control management, detecting anomalies and fraud, or denial of service attacks, etc. in the area of cybersecurity data science.

Although several research efforts have been directed towards cybersecurity solutions, discussed in “ Background ” , “ Cybersecurity data science ”, and “ Machine learning tasks in cybersecurity ” sections in different directions, this paper presents a comprehensive view of cybersecurity data science. For this, we have conducted a literature review to understand cybersecurity data, various defense strategies including intrusion detection techniques, different types of machine learning techniques in cybersecurity tasks. Based on our discussion on existing work, several research issues related to security datasets, data quality problems, policy rule generation, learning methods, data protection, feature engineering, security alert generation, recency analysis etc. are identified that require further research attention in the domain of cybersecurity data science.

The scope of cybersecurity data science is broad. Several data-driven tasks such as intrusion detection and prevention, access control management, security policy generation, anomaly detection, spam filtering, fraud detection and prevention, various types of malware attack detection and defense strategies, etc. can be considered as the scope of cybersecurity data science. Such tasks based categorization could be helpful for security professionals including the researchers and practitioners who are interested in the domain-specific aspects of security systems [ 171 ]. The output of cybersecurity data science can be used in many application areas such as Internet of things (IoT) security [ 173 ], network security [ 174 ], cloud security [ 175 ], mobile and web applications [ 26 ], and other relevant cyber areas. Moreover, intelligent cybersecurity solutions are important for the banking industry, the healthcare sector, or the public sector, where data breaches typically occur [ 36 , 176 ]. Besides, the data-driven security solutions could also be effective in AI-based blockchain technology, where AI works with huge volumes of security event data to extract the useful insights using machine learning techniques, and block-chain as a trusted platform to store such data [ 177 ].

Although in this paper, we discuss cybersecurity data science focusing on examining raw security data to data-driven decision making for intelligent security solutions, it could also be related to big data analytics in terms of data processing and decision making. Big data deals with data sets that are too large or complex having characteristics of high data volume, velocity, and variety. Big data analytics mainly has two parts consisting of data management involving data storage, and analytics [ 178 ]. The analytics typically describe the process of analyzing such datasets to discover patterns, unknown correlations, rules, and other useful insights [ 179 ]. Thus, several advanced data analysis techniques such as AI, data mining, machine learning could play an important role in processing big data by converting big problems to small problems [ 180 ]. To do this, the potential strategies like parallelization, divide-and-conquer, incremental learning, sampling, granular computing, feature or instance selection, can be used to make better decisions, reducing costs, or enabling more efficient processing. In such cases, the concept of cybersecurity data science, particularly machine learning-based modeling could be helpful for process automation and decision making for intelligent security solutions. Moreover, researchers could consider modified algorithms or models for handing big data on parallel computing platforms like Hadoop, Storm, etc. [ 181 ].

Based on the concept of cybersecurity data science discussed in the paper, building a data-driven security model for a particular security problem and relevant empirical evaluation to measure the effectiveness and efficiency of the model, and to asses the usability in the real-world application domain could be a future work.

Motivated by the growing significance of cybersecurity and data science, and machine learning technologies, in this paper, we have discussed how cybersecurity data science applies to data-driven intelligent decision making in smart cybersecurity systems and services. We also have discussed how it can impact security data, both in terms of extracting insight of security incidents and the dataset itself. We aimed to work on cybersecurity data science by discussing the state of the art concerning security incidents data and corresponding security services. We also discussed how machine learning techniques can impact in the domain of cybersecurity, and examine the security challenges that remain. In terms of existing research, much focus has been provided on traditional security solutions, with less available work in machine learning technique based security systems. For each common technique, we have discussed relevant security research. The purpose of this article is to share an overview of the conceptualization, understanding, modeling, and thinking about cybersecurity data science.

We have further identified and discussed various key issues in security analysis to showcase the signpost of future research directions in the domain of cybersecurity data science. Based on the knowledge, we have also provided a generic multi-layered framework of cybersecurity data science model based on machine learning techniques, where the data is being gathered from diverse sources, and the analytics complement the latest data-driven patterns for providing intelligent security services. The framework consists of several main phases - security data collecting, data preparation, machine learning-based security modeling, and incremental learning and dynamism for smart cybersecurity systems and services. We specifically focused on extracting insights from security data, from setting a research design with particular attention to concepts for data-driven intelligent security solutions.

Overall, this paper aimed not only to discuss cybersecurity data science and relevant methods but also to discuss the applicability towards data-driven intelligent decision making in cybersecurity systems and services from machine learning perspectives. Our analysis and discussion can have several implications both for security researchers and practitioners. For researchers, we have highlighted several issues and directions for future research. Other areas for potential research include empirical evaluation of the suggested data-driven model, and comparative analysis with other security systems. For practitioners, the multi-layered machine learning-based model can be used as a reference in designing intelligent cybersecurity systems for organizations. We believe that our study on cybersecurity data science opens a promising path and can be used as a reference guide for both academia and industry for future research and applications in the area of cybersecurity.

Availability of data and materials

Not applicable.

Abbreviations

  • Machine learning

Artificial Intelligence

Information and communication technology

Internet of Things

Distributed Denial of Service

Intrusion detection system

Intrusion prevention system

Host-based intrusion detection systems

Network Intrusion Detection Systems

Signature-based intrusion detection system

Anomaly-based intrusion detection system

Li S, Da Xu L, Zhao S. The internet of things: a survey. Inform Syst Front. 2015;17(2):243–59.

Google Scholar  

Sun N, Zhang J, Rimba P, Gao S, Zhang LY, Xiang Y. Data-driven cybersecurity incident prediction: a survey. IEEE Commun Surv Tutor. 2018;21(2):1744–72.

McIntosh T, Jang-Jaccard J, Watters P, Susnjak T. The inadequacy of entropy-based ransomware detection. In: International conference on neural information processing. New York: Springer; 2019. p. 181–189

Alazab M, Venkatraman S, Watters P, Alazab M, et al. Zero-day malware detection based on supervised learning algorithms of api call signatures (2010)

Shaw A. Data breach: from notification to prevention using pci dss. Colum Soc Probs. 2009;43:517.

Gupta BB, Tewari A, Jain AK, Agrawal DP. Fighting against phishing attacks: state of the art and future challenges. Neural Comput Appl. 2017;28(12):3629–54.

Av-test institute, germany, https://www.av-test.org/en/statistics/malware/ . Accessed 20 Oct 2019.

Ibm security report, https://www.ibm.com/security/data-breach . Accessed on 20 Oct 2019.

Fischer EA. Cybersecurity issues and challenges: In brief. Congressional Research Service (2014)

Juniper research. https://www.juniperresearch.com/ . Accessed on 20 Oct 2019.

Papastergiou S, Mouratidis H, Kalogeraki E-M. Cyber security incident handling, warning and response system for the european critical information infrastructures (cybersane). In: International Conference on Engineering Applications of Neural Networks, p. 476–487 (2019). New York: Springer

Aftergood S. Cybersecurity: the cold war online. Nature. 2017;547(7661):30.

Hey AJ, Tansley S, Tolle KM, et al. The fourth paradigm: data-intensive scientific discovery. 2009;1:

Cukier K. Data, data everywhere: A special report on managing information, 2010.

Google trends. In: https://trends.google.com/trends/ , 2019.

Anwar S, Mohamad Zain J, Zolkipli MF, Inayat Z, Khan S, Anthony B, Chang V. From intrusion detection to an intrusion response system: fundamentals, requirements, and future directions. Algorithms. 2017;10(2):39.

MATH   Google Scholar  

Mohammadi S, Mirvaziri H, Ghazizadeh-Ahsaee M, Karimipour H. Cyber intrusion detection by combined feature selection algorithm. J Inform Sec Appl. 2019;44:80–8.

Tapiador JE, Orfila A, Ribagorda A, Ramos B. Key-recovery attacks on kids, a keyed anomaly detection system. IEEE Trans Depend Sec Comput. 2013;12(3):312–25.

Tavallaee M, Stakhanova N, Ghorbani AA. Toward credible evaluation of anomaly-based intrusion-detection methods. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 40(5), 516–524 (2010)

Foroughi F, Luksch P. Data science methodology for cybersecurity projects. arXiv preprint arXiv:1803.04219 , 2018.

Saxe J, Sanders H. Malware data science: Attack detection and attribution, 2018.

Rainie L, Anderson J, Connolly J. Cyber attacks likely to increase. Digital Life in. 2014, vol. 2025.

Fischer EA. Creating a national framework for cybersecurity: an analysis of issues and options. LIBRARY OF CONGRESS WASHINGTON DC CONGRESSIONAL RESEARCH SERVICE, 2005.

Craigen D, Diakun-Thibault N, Purse R. Defining cybersecurity. Technology Innovation. Manag Rev. 2014;4(10):13–21.

Council NR. et al. Toward a safer and more secure cyberspace, 2007.

Jang-Jaccard J, Nepal S. A survey of emerging threats in cybersecurity. J Comput Syst Sci. 2014;80(5):973–93.

MathSciNet   MATH   Google Scholar  

Mukkamala S, Sung A, Abraham A. Cyber security challenges: Designing efficient intrusion detection systems and antivirus tools. Vemuri, V. Rao, Enhancing Computer Security with Smart Technology.(Auerbach, 2006), 125–163, 2005.

Bilge L, Dumitraş T. Before we knew it: an empirical study of zero-day attacks in the real world. In: Proceedings of the 2012 ACM conference on computer and communications security. ACM; 2012. p. 833–44.

Davi L, Dmitrienko A, Sadeghi A-R, Winandy M. Privilege escalation attacks on android. In: International conference on information security. New York: Springer; 2010. p. 346–60.

Jovičić B, Simić D. Common web application attack types and security using asp .net. ComSIS, 2006.

Warkentin M, Willison R. Behavioral and policy issues in information systems security: the insider threat. Eur J Inform Syst. 2009;18(2):101–5.

Kügler D. “man in the middle” attacks on bluetooth. In: International Conference on Financial Cryptography. New York: Springer; 2003, p. 149–61.

Virvilis N, Gritzalis D. The big four-what we did wrong in advanced persistent threat detection. In: 2013 International Conference on Availability, Reliability and Security. IEEE; 2013. p. 248–54.

Boyd SW, Keromytis AD. Sqlrand: Preventing sql injection attacks. In: International conference on applied cryptography and network security. New York: Springer; 2004. p. 292–302.

Sigler K. Crypto-jacking: how cyber-criminals are exploiting the crypto-currency boom. Comput Fraud Sec. 2018;2018(9):12–4.

2019 data breach investigations report, https://enterprise.verizon.com/resources/reports/dbir/ . Accessed 20 Oct 2019.

Khraisat A, Gondal I, Vamplew P, Kamruzzaman J. Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity. 2019;2(1):20.

Johnson L. Computer incident response and forensics team management: conducting a successful incident response, 2013.

Brahmi I, Brahmi H, Yahia SB. A multi-agents intrusion detection system using ontology and clustering techniques. In: IFIP international conference on computer science and its applications. New York: Springer; 2015. p. 381–93.

Qu X, Yang L, Guo K, Ma L, Sun M, Ke M, Li M. A survey on the development of self-organizing maps for unsupervised intrusion detection. In: Mobile networks and applications. 2019;1–22.

Liao H-J, Lin C-HR, Lin Y-C, Tung K-Y. Intrusion detection system: a comprehensive review. J Netw Comput Appl. 2013;36(1):16–24.

Alazab A, Hobbs M, Abawajy J, Alazab M. Using feature selection for intrusion detection system. In: 2012 International symposium on communications and information technologies (ISCIT). IEEE; 2012. p. 296–301.

Viegas E, Santin AO, Franca A, Jasinski R, Pedroni VA, Oliveira LS. Towards an energy-efficient anomaly-based intrusion detection engine for embedded systems. IEEE Trans Comput. 2016;66(1):163–77.

Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C. Machine learning and deep learning methods for cybersecurity. IEEE Access. 2018;6:35365–81.

Dutt I, Borah S, Maitra IK, Bhowmik K, Maity A, Das S. Real-time hybrid intrusion detection system using machine learning techniques. 2018, p. 885–94.

Ragsdale DJ, Carver C, Humphries JW, Pooch UW. Adaptation techniques for intrusion detection and intrusion response systems. In: Smc 2000 conference proceedings. 2000 IEEE international conference on systems, man and cybernetics.’cybernetics evolving to systems, humans, organizations, and their complex interactions’(cat. No. 0). IEEE; 2000. vol. 4, p. 2344–2349.

Cao L. Data science: challenges and directions. Commun ACM. 2017;60(8):59–68.

Rizk A, Elragal A. Data science: developing theoretical contributions in information systems via text analytics. J Big Data. 2020;7(1):1–26.

Lippmann RP, Fried DJ, Graf I, Haines JW, Kendall KR, McClung D, Weber D, Webster SE, Wyschogrod D, Cunningham RK, et al. Evaluating intrusion detection systems: The 1998 darpa off-line intrusion detection evaluation. In: Proceedings DARPA information survivability conference and exposition. DISCEX’00. IEEE; 2000. vol. 2, p. 12–26.

Kdd cup 99. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html . Accessed 20 Oct 2019.

Tavallaee M, Bagheri E, Lu W, Ghorbani AA. A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE symposium on computational intelligence for security and defense applications. IEEE; 2009. p. 1–6.

Caida ddos attack 2007 dataset. http://www.caida.org/data/ passive/ddos-20070804-dataset.xml/ . Accessed 20 Oct 2019.

Caida anonymized internet traces 2008 dataset. https://www.caida.org/data/passive/passive-2008-dataset . Accessed 20 Oct 2019.

Isot botnet dataset. https://www.uvic.ca/engineering/ece/isot/ datasets/index.php/ . Accessed 20 Oct 2019.

The honeynet project. http://www.honeynet.org/chapters/france/ . Accessed 20 Oct 2019.

Canadian institute of cybersecurity, university of new brunswick, iscx dataset, http://www.unb.ca/cic/datasets/index.html/ . Accessed 20 Oct 2019.

Shiravi A, Shiravi H, Tavallaee M, Ghorbani AA. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput Secur. 2012;31(3):357–74.

The ctu-13 dataset. https://stratosphereips.org/category/datasets-ctu13 . Accessed 20 Oct 2019.

Moustafa N, Slay J. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS). IEEE; 2015. p. 1–6.

Cse-cic-ids2018 [online]. available: https://www.unb.ca/cic/ datasets/ids-2018.html/ . Accessed 20 Oct 2019.

Cic-ddos2019 [online]. available: https://www.unb.ca/cic/datasets/ddos-2019.html/ . Accessed 28 Mar 2019.

Jing X, Yan Z, Jiang X, Pedrycz W. Network traffic fusion and analysis against ddos flooding attacks with a novel reversible sketch. Inform Fusion. 2019;51:100–13.

Xie M, Hu J, Yu X, Chang E. Evaluating host-based anomaly detection systems: application of the frequency-based algorithms to adfa-ld. In: International conference on network and system security. New York: Springer; 2015. p. 542–49.

Lindauer B, Glasser J, Rosen M, Wallnau KC, ExactData L. Generating test data for insider threat detectors. JoWUA. 2014;5(2):80–94.

Glasser J, Lindauer B. Bridging the gap: A pragmatic approach to generating insider threat data. In: 2013 IEEE Security and Privacy Workshops. IEEE; 2013. p. 98–104.

Enronspam. https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/enron-spam/ . Accessed 20 Oct 2019.

Spamassassin. http://www.spamassassin.org/publiccorpus/ . Accessed 20 Oct 2019.

Lingspam. https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/lingspampublic.tar.gz/ . Accessed 20 Oct 2019.

Alexa top sites. https://aws.amazon.com/alexa-top-sites/ . Accessed 20 Oct 2019.

Bambenek consulting—master feeds. available online: http://osint.bambenekconsulting.com/feeds/ . Accessed 20 Oct 2019.

Dgarchive. https://dgarchive.caad.fkie.fraunhofer.de/site/ . Accessed 20 Oct 2019.

Zago M, Pérez MG, Pérez GM. Umudga: A dataset for profiling algorithmically generated domain names in botnet detection. Data in Brief. 2020;105400.

Zhou Y, Jiang X. Dissecting android malware: characterization and evolution. In: 2012 IEEE Symposium on security and privacy. IEEE; 2012. p. 95–109.

Virusshare. http://virusshare.com/ . Accessed 20 Oct 2019.

Virustotal. https://virustotal.com/ . Accessed 20 Oct 2019.

Comodo. https://www.comodo.com/home/internet-security/updates/vdp/database . Accessed 20 Oct 2019.

Contagio. http://contagiodump.blogspot.com/ . Accessed 20 Oct 2019.

Kumar R, Xiaosong Z, Khan RU, Kumar J, Ahad I. Effective and explainable detection of android malware based on machine learning algorithms. In: Proceedings of the 2018 international conference on computing and artificial intelligence. ACM; 2018. p. 35–40.

Microsoft malware classification (big 2015). arXiv:org/abs/1802.10135/ . Accessed 20 Oct 2019.

Koroniotis N, Moustafa N, Sitnikova E, Turnbull B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: bot-iot dataset. Future Gen Comput Syst. 2019;100:779–96.

McIntosh TR, Jang-Jaccard J, Watters PA. Large scale behavioral analysis of ransomware attacks. In: International conference on neural information processing. New York: Springer; 2018. p. 217–29.

Han J, Pei J, Kamber M. Data mining: concepts and techniques, 2011.

Witten IH, Frank E. Data mining: Practical machine learning tools and techniques, 2005.

Dua S, Du X. Data mining and machine learning in cybersecurity, 2016.

Kotpalliwar MV, Wajgi R. Classification of attacks using support vector machine (svm) on kddcup’99 ids database. In: 2015 Fifth international conference on communication systems and network technologies. IEEE; 2015. p. 987–90.

Pervez MS, Farid DM. Feature selection and intrusion classification in nsl-kdd cup 99 dataset employing svms. In: The 8th international conference on software, knowledge, information management and applications (SKIMA 2014). IEEE; 2014. p. 1–6.

Yan M, Liu Z. A new method of transductive svm-based network intrusion detection. In: International conference on computer and computing technologies in agriculture. New York: Springer; 2010. p. 87–95.

Li Y, Xia J, Zhang S, Yan J, Ai X, Dai K. An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Syst Appl. 2012;39(1):424–30.

Raman MG, Somu N, Jagarapu S, Manghnani T, Selvam T, Krithivasan K, Sriram VS. An efficient intrusion detection technique based on support vector machine and improved binary gravitational search algorithm. Artificial Intelligence Review. 2019, p. 1–32.

Kokila R, Selvi ST, Govindarajan K. Ddos detection and analysis in sdn-based environment using support vector machine classifier. In: 2014 Sixth international conference on advanced computing (ICoAC). IEEE; 2014. p. 205–10.

Xie M, Hu J, Slay J. Evaluating host-based anomaly detection systems: Application of the one-class svm algorithm to adfa-ld. In: 2014 11th international conference on fuzzy systems and knowledge discovery (FSKD). IEEE; 2014. p. 978–82.

Saxena H, Richariya V. Intrusion detection in kdd99 dataset using svm-pso and feature reduction with information gain. Int J Comput Appl. 2014;98:6.

Chandrasekhar A, Raghuveer K. Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset. In: 2014 international conference on communication and signal processing. IEEE; 2014. p. 672–76.

Shapoorifard H, Shamsinejad P. Intrusion detection using a novel hybrid method incorporating an improved knn. Int J Comput Appl. 2017;173(1):5–9.

Vishwakarma S, Sharma V, Tiwari A. An intrusion detection system using knn-aco algorithm. Int J Comput Appl. 2017;171(10):18–23.

Meng W, Li W, Kwok L-F. Design of intelligent knn-based alarm filter using knowledge-based alert verification in intrusion detection. Secur Commun Netw. 2015;8(18):3883–95.

Dada E. A hybridized svm-knn-pdapso approach to intrusion detection system. In: Proc. Fac. Seminar Ser., 2017, p. 14–21.

Sharifi AM, Amirgholipour SK, Pourebrahimi A. Intrusion detection based on joint of k-means and knn. J Converg Inform Technol. 2015;10(5):42.

Lin W-C, Ke S-W, Tsai C-F. Cann: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl Based Syst. 2015;78:13–21.

Koc L, Mazzuchi TA, Sarkani S. A network intrusion detection system based on a hidden naïve bayes multiclass classifier. Exp Syst Appl. 2012;39(18):13492–500.

Moon D, Im H, Kim I, Park JH. Dtb-ids: an intrusion detection system based on decision tree using behavior analysis for preventing apt attacks. J Supercomput. 2017;73(7):2881–95.

Ingre, B., Yadav, A., Soni, A.K.: Decision tree based intrusion detection system for nsl-kdd dataset. In: International conference on information and communication technology for intelligent systems. New York: Springer; 2017. p. 207–18.

Malik AJ, Khan FA. A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput. 2018;21(1):667–80.

Relan NG, Patil DR. Implementation of network intrusion detection system using variant of decision tree algorithm. In: 2015 international conference on nascent technologies in the engineering field (ICNTE). IEEE; 2015. p. 1–5.

Rai K, Devi MS, Guleria A. Decision tree based algorithm for intrusion detection. Int J Adv Netw Appl. 2016;7(4):2828.

Sarker IH, Abushark YB, Alsolami F, Khan AI. Intrudtree: a machine learning based cyber security intrusion detection model. Symmetry. 2020;12(5):754.

Puthran S, Shah K. Intrusion detection using improved decision tree algorithm with binary and quad split. In: International symposium on security in computing and communication. New York: Springer; 2016. p. 427–438.

Balogun AO, Jimoh RG. Anomaly intrusion detection using an hybrid of decision tree and k-nearest neighbor, 2015.

Azad C, Jha VK. Genetic algorithm to solve the problem of small disjunct in the decision tree based intrusion detection system. Int J Comput Netw Inform Secur. 2015;7(8):56.

Jo S, Sung H, Ahn B. A comparative study on the performance of intrusion detection using decision tree and artificial neural network models. J Korea Soc Dig Indus Inform Manag. 2015;11(4):33–45.

Zhan J, Zulkernine M, Haque A. Random-forests-based network intrusion detection systems. IEEE Trans Syst Man Cybern C. 2008;38(5):649–59.

Tajbakhsh A, Rahmati M, Mirzaei A. Intrusion detection using fuzzy association rules. Appl Soft Comput. 2009;9(2):462–9.

Mitchell R, Chen R. Behavior rule specification-based intrusion detection for safety critical medical cyber physical systems. IEEE Trans Depend Secure Comput. 2014;12(1):16–30.

Alazab M, Venkataraman S, Watters P. Towards understanding malware behaviour by the extraction of api calls. In: 2010 second cybercrime and trustworthy computing Workshop. IEEE; 2010. p. 52–59.

Yuan Y, Kaklamanos G, Hogrefe D. A novel semi-supervised adaboost technique for network anomaly detection. In: Proceedings of the 19th ACM international conference on modeling, analysis and simulation of wireless and mobile systems. ACM; 2016. p. 111–14.

Ariu D, Tronci R, Giacinto G. Hmmpayl: an intrusion detection system based on hidden markov models. Comput Secur. 2011;30(4):221–41.

Årnes A, Valeur F, Vigna G, Kemmerer RA. Using hidden markov models to evaluate the risks of intrusions. In: International workshop on recent advances in intrusion detection. New York: Springer; 2006. p. 145–64.

Hansen JV, Lowry PB, Meservy RD, McDonald DM. Genetic programming for prevention of cyberterrorism through dynamic and evolving intrusion detection. Decis Supp Syst. 2007;43(4):1362–74.

Aslahi-Shahri B, Rahmani R, Chizari M, Maralani A, Eslami M, Golkar MJ, Ebrahimi A. A hybrid method consisting of ga and svm for intrusion detection system. Neural Comput Appl. 2016;27(6):1669–76.

Alrawashdeh K, Purdy C. Toward an online anomaly intrusion detection system based on deep learning. In: 2016 15th IEEE international conference on machine learning and applications (ICMLA). IEEE; 2016. p. 195–200.

Yin C, Zhu Y, Fei J, He X. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access. 2017;5:21954–61.

Kim J, Kim J, Thu HLT, Kim H. Long short term memory recurrent neural network classifier for intrusion detection. In: 2016 international conference on platform technology and service (PlatCon). IEEE; 2016. p. 1–5.

Almiani M, AbuGhazleh A, Al-Rahayfeh A, Atiewi S, Razaque A. Deep recurrent neural network for iot intrusion detection system. Simulation Modelling Practice and Theory. 2019;102031.

Kolosnjaji B, Zarras A, Webster G, Eckert C. Deep learning for classification of malware system call sequences. In: Australasian joint conference on artificial intelligence. New York: Springer; 2016. p. 137–49.

Wang W, Zhu M, Zeng X, Ye X, Sheng Y. Malware traffic classification using convolutional neural network for representation learning. In: 2017 international conference on information networking (ICOIN). IEEE; 2017. p. 712–17.

Alauthman M, Aslam N, Al-kasassbeh M, Khan S, Al-Qerem A, Choo K-KR. An efficient reinforcement learning-based botnet detection approach. J Netw Comput Appl. 2020;150:102479.

Blanco R, Cilla JJ, Briongos S, Malagón P, Moya JM. Applying cost-sensitive classifiers with reinforcement learning to ids. In: International conference on intelligent data engineering and automated learning. New York: Springer; 2018. p. 531–38.

Lopez-Martin M, Carro B, Sanchez-Esguevillas A. Application of deep reinforcement learning to intrusion detection for supervised problems. Exp Syst Appl. 2020;141:112963.

Sarker IH, Kayes A, Watters P. Effectiveness analysis of machine learning classification models for predicting personalized context-aware smartphone usage. J Big Data. 2019;6(1):1–28.

Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn. 1993;11(1):63–90.

John GH, Langley P. Estimating continuous distributions in bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc.; 1995. p. 338–45.

Quinlan JR. C4.5: Programs for machine learning. Machine Learning, 1993.

Sarker IH, Colman A, Han J, Khan AI, Abushark YB, Salah K. Behavdt: a behavioral decision tree learning to build user-centric context-aware predictive model. Mobile Networks and Applications. 2019, p. 1–11.

Aha DW, Kibler D, Albert MK. Instance-based learning algorithms. Mach Learn. 1991;6(1):37–66.

Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK. Improvements to platt’s smo algorithm for svm classifier design. Neural Comput. 2001;13(3):637–49.

Freund Y, Schapire RE, et al: Experiments with a new boosting algorithm. In: Icml, vol. 96, p. 148–156 (1996). Citeseer

Le Cessie S, Van Houwelingen JC. Ridge estimators in logistic regression. J Royal Stat Soc C. 1992;41(1):191–201.

Watters PA, McCombie S, Layton R, Pieprzyk J. Characterising and predicting cyber attacks using the cyber attacker model profile (camp). J Money Launder Control. 2012.

Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

Sarker IH. Context-aware rule learning from smartphone data: survey, challenges and future directions. J Big Data. 2019;6(1):95.

MacQueen J. Some methods for classification and analysis of multivariate observations. In: Fifth Berkeley symposium on mathematical statistics and probability, vol. 1, 1967.

Rokach L. A survey of clustering algorithms. In: Data Mining and Knowledge Discovery Handbook. New York: Springer; 2010. p. 269–98.

Sneath PH. The application of computers to taxonomy. J Gen Microbiol. 1957;17:1.

Sorensen T. method of establishing groups of equal amplitude in plant sociology based on similarity of species. Biol Skr. 1948;5.

Sarker IH, Colman A, Kabir MA, Han J. Individualized time-series segmentation for mining mobile phone user behavior. Comput J. 2018;61(3):349–68.

Kim G, Lee S, Kim S. A novel hybrid intrusion detection method integrating anomaly detection with misuse detection. Exp Syst Appl. 2014;41(4):1690–700.

MathSciNet   Google Scholar  

Agrawal R, Imieliński T, Swami A. Mining association rules between sets of items in large databases. In: ACM SIGMOD Record. ACM; 1993. vol. 22, p. 207–16.

Flach PA, Lachiche N. Confirmation-guided discovery of first-order rules with tertius. Mach Learn. 2001;42(1–2):61–95.

Agrawal R, Srikant R, et al: Fast algorithms for mining association rules. In: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, 1994, vol. 1215, p. 487–99.

Houtsma M, Swami A. Set-oriented mining for association rules in relational databases. In: Proceedings of the eleventh international conference on data engineering. IEEE; 1995. p. 25–33.

Ma BLWHY. Integrating classification and association rule mining. In: Proceedings of the fourth international conference on knowledge discovery and data mining, 1998.

Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In: ACM Sigmod Record. ACM; 2000. vol. 29, p. 1–12.

Sarker IH, Salim FD. Mining user behavioral rules from smartphone data through association analysis. In: Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Melbourne, Australia. New York: Springer; 2018. p. 450–61.

Das A, Ng W-K, Woon Y-K. Rapid association rule mining. In: Proceedings of the tenth international conference on information and knowledge management. ACM; 2001. p. 474–81.

Zaki MJ. Scalable algorithms for association mining. IEEE Trans Knowl Data Eng. 2000;12(3):372–90.

Coelho IM, Coelho VN, Luz EJS, Ochi LS, Guimarães FG, Rios E. A gpu deep learning metaheuristic based model for time series forecasting. Appl Energy. 2017;201:412–8.

Van Efferen L, Ali-Eldin AM. A multi-layer perceptron approach for flow-based anomaly detection. In: 2017 International symposium on networks, computers and communications (ISNCC). IEEE; 2017. p. 1–6.

Liu H, Lang B, Liu M, Yan H. Cnn and rnn based payload classification methods for attack detection. Knowl Based Syst. 2019;163:332–41.

Berman DS, Buczak AL, Chavis JS, Corbett CL. A survey of deep learning methods for cyber security. Information. 2019;10(4):122.

Bellman R. A markovian decision process. J Math Mech. 1957;1:679–84.

Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res. 1996;4:237–85.

Sarker IH. A machine learning based robust prediction model for real-life mobile phone data. Internet of Things. 2019;5:180–93.

Kayes ASM, Han J, Colman A. OntCAAC: an ontology-based approach to context-aware access control for software services. Comput J. 2015;58(11):3000–34.

Kayes ASM, Rahayu W, Dillon T. An ontology-based approach to dynamic contextual role for pervasive access control. In: AINA 2018. IEEE Computer Society, 2018.

Colombo P, Ferrari E. Access control technologies for big data management systems: literature review and future trends. Cybersecurity. 2019;2(1):1–13.

Aleroud A, Karabatis G. Contextual information fusion for intrusion detection: a survey and taxonomy. Knowl Inform Syst. 2017;52(3):563–619.

Sarker IH, Abushark YB, Khan AI. Contextpca: Predicting context-aware smartphone apps usage based on machine learning techniques. Symmetry. 2020;12(4):499.

Madsen RE, Hansen LK, Winther O. Singular value decomposition and principal component analysis. Neural Netw. 2004;1:1–5.

Qiao L-B, Zhang B-F, Lai Z-Q, Su J-S. Mining of attack models in ids alerts from network backbone by a two-stage clustering method. In: 2012 IEEE 26th international parallel and distributed processing symposium workshops & Phd Forum. IEEE; 2012. p. 1263–9.

Sarker IH, Colman A, Han J. Recencyminer: mining recency-based personalized behavior from contextual smartphone data. J Big Data. 2019;6(1):49.

Ullah F, Babar MA. Architectural tactics for big data cybersecurity analytics systems: a review. J Syst Softw. 2019;151:81–118.

Zhao S, Leftwich K, Owens M, Magrone F, Schonemann J, Anderson B, Medhi D. I-can-mama: Integrated campus network monitoring and management. In: 2014 IEEE network operations and management symposium (NOMS). IEEE; 2014. p. 1–7.

Abomhara M, et al. Cyber security and the internet of things: vulnerabilities, threats, intruders and attacks. J Cyber Secur Mob. 2015;4(1):65–88.

Helali RGM. Data mining based network intrusion detection system: A survey. In: Novel algorithms and techniques in telecommunications and networking. New York: Springer; 2010. p. 501–505.

Ryoo J, Rizvi S, Aiken W, Kissell J. Cloud security auditing: challenges and emerging approaches. IEEE Secur Priv. 2013;12(6):68–74.

Densham B. Three cyber-security strategies to mitigate the impact of a data breach. Netw Secur. 2015;2015(1):5–8.

Salah K, Rehman MHU, Nizamuddin N, Al-Fuqaha A. Blockchain for ai: review and open research challenges. IEEE Access. 2019;7:10127–49.

Gandomi A, Haider M. Beyond the hype: big data concepts, methods, and analytics. Int J Inform Manag. 2015;35(2):137–44.

Golchha N. Big data-the information revolution. Int J Adv Res. 2015;1(12):791–4.

Hariri RH, Fredericks EM, Bowers KM. Uncertainty in big data analytics: survey, opportunities, and challenges. J Big Data. 2019;6(1):44.

Tsai C-W, Lai C-F, Chao H-C, Vasilakos AV. Big data analytics: a survey. J Big data. 2015;2(1):21.

Download references

Acknowledgements

The authors would like to thank all the reviewers for their rigorous review and comments in several revision rounds. The reviews are detailed and helpful to improve and finalize the manuscript. The authors are highly grateful to them.

Author information

Authors and affiliations.

Swinburne University of Technology, Melbourne, VIC, 3122, Australia

Iqbal H. Sarker

Chittagong University of Engineering and Technology, Chittagong, 4349, Bangladesh

La Trobe University, Melbourne, VIC, 3086, Australia

A. S. M. Kayes, Paul Watters & Alex Ng

University of Nevada, Reno, USA

Shahriar Badsha

Macquarie University, Sydney, NSW, 2109, Australia

Hamed Alqahtani

You can also search for this author in PubMed   Google Scholar

Contributions

This article provides not only a discussion on cybersecurity data science and relevant methods but also to discuss the applicability towards data-driven intelligent decision making in cybersecurity systems and services. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Iqbal H. Sarker .

Ethics declarations

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Sarker, I.H., Kayes, A.S.M., Badsha, S. et al. Cybersecurity data science: an overview from machine learning perspective. J Big Data 7 , 41 (2020). https://doi.org/10.1186/s40537-020-00318-5

Download citation

Received : 26 October 2019

Accepted : 21 June 2020

Published : 01 July 2020

DOI : https://doi.org/10.1186/s40537-020-00318-5

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Decision making
  • Cyber-attack
  • Security modeling
  • Intrusion detection
  • Cyber threat intelligence

cyber security research paper outline

  • Search Menu
  • Editor's Choice
  • Author Guidelines
  • Submission Site
  • Open Access
  • About Journal of Cybersecurity
  • Editorial Board
  • Advertising and Corporate Services
  • Journals Career Network
  • Self-Archiving Policy
  • Journals on Oxford Academic
  • Books on Oxford Academic

Issue Cover

Editors-in-Chief

Tyler Moore

About the journal

Journal of Cybersecurity publishes accessible articles describing original research in the inherently interdisciplinary world of computer, systems, and information security …

Latest articles

Cybersecurity Month

Call for Papers

Journal of Cybersecurity is soliciting papers for a special collection on the philosophy of information security. This collection will explore research at the intersection of philosophy, information security, and philosophy of science.

Find out more

CYBERS High Impact 480x270.png

High-Impact Research Collection

Explore a collection of freely available high-impact research from 2020 and 2021 published in the Journal of Cybersecurity .

Browse the collection here

submit

Submit your paper

Join the conversation moving the science of security forward. Visit our Instructions to Authors for more information about how to submit your manuscript.

Read and publish

Read and Publish deals

Authors interested in publishing in Journal of Cybersecurity may be able to publish their paper Open Access using funds available through their institution’s agreement with OUP.

Find out if your institution is participating

Related Titles

cybersecurityandcyberwar

Affiliations

  • Online ISSN 2057-2093
  • Print ISSN 2057-2085
  • Copyright © 2024 Oxford University Press
  • About Oxford Academic
  • Publish journals with us
  • University press partners
  • What we publish
  • New features  
  • Open access
  • Institutional account management
  • Rights and permissions
  • Get help with access
  • Accessibility
  • Advertising
  • Media enquiries
  • Oxford University Press
  • Oxford Languages
  • University of Oxford

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide

  • Copyright © 2024 Oxford University Press
  • Cookie settings
  • Cookie policy
  • Privacy policy
  • Legal notice

This Feature Is Available To Subscribers Only

Sign In or Create an Account

This PDF is available to Subscribers Only

For full access to this pdf, sign in to an existing account, or purchase an annual subscription.

Accessibility Links

  • Skip to content
  • Skip to search IOPscience
  • Skip to Journals list
  • Accessibility help
  • Accessibility Help

Click here to close this panel.

Purpose-led Publishing is a coalition of three not-for-profit publishers in the field of physical sciences: AIP Publishing, the American Physical Society and IOP Publishing.

Together, as publishers that will always put purpose above profit, we have defined a set of industry standards that underpin high-quality, ethical scholarly communications.

We are proudly declaring that science is our only shareholder.

Cyber Security Challenges and its Emerging Trends on Latest Technologies

K. M Rajasekharaiah 1 , Chhaya S Dule 2 and E Sudarshan 3

Published under licence by IOP Publishing Ltd IOP Conference Series: Materials Science and Engineering , Volume 981 , International Conference on Recent Advancements in Engineering and Management (ICRAEM-2020) 9-10 October 2020, Warangal, India Citation K. M Rajasekharaiah et al 2020 IOP Conf. Ser.: Mater. Sci. Eng. 981 022062 DOI 10.1088/1757-899X/981/2/022062

Article metrics

11593 Total downloads

Share this article

Author e-mails.

[email protected]

Author affiliations

1 Principal and Professor-CSE, Kshatriya College of Engineering, Armoor, Nizamabad Dist. Telangana, India

2 Assistant Professor-CSE, Dayananda Sagar University, Bangalore, Karnataka, India

3 Sumathi Reddy Institute of Technology for Women, Warangal, India

Buy this article in print

Today, due to the modern life style people have joined technology life and using more technology for shopping as well as financial transactions in their cyber space. At the same time, safeguarding of knowledge has become increasingly difficult. In addition, the heavy use and growth of social media, online crime or cybercrime has increased. In the world of information technology, data security plays a significant role. The information security has become one of today's main challenges. Whenever we think of cyber security, we first of all think of 'cybercrimes,' which expand tremendously every day. Different government and businesses take various steps to avoid this form of cybercrime. In addition to numerous cyber protection initiatives, many people are also very worried about it. This paper focuses primarily on cyber security concerns related to the new technology. It also concentrates on the new technologies for cyber security, ethics and developments that impact cyber security.

Export citation and abstract BibTeX RIS

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence . Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Information Security:A Review of Information Security Issues and Techniques

Ieee account.

  • Change Username/Password
  • Update Address

Purchase Details

  • Payment Options
  • Order History
  • View Purchased Documents

Profile Information

  • Communications Preferences
  • Profession and Education
  • Technical Interests
  • US & Canada: +1 800 678 4333
  • Worldwide: +1 732 981 0060
  • Contact & Support
  • About IEEE Xplore
  • Accessibility
  • Terms of Use
  • Nondiscrimination Policy
  • Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. © Copyright 2024 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.

AI hype as a cyber security risk: the moral responsibility of implementing generative AI in business

  • Original Research
  • Open access
  • Published: 23 February 2024

Cite this article

You have full access to this open access article

  • Declan Humphreys   ORCID: orcid.org/0009-0008-2693-7340 1 ,
  • Abigail Koay   ORCID: orcid.org/0000-0002-4130-9931 1 ,
  • Dennis Desmond   ORCID: orcid.org/0000-0003-1278-6306 1 &
  • Erica Mealy   ORCID: orcid.org/0000-0002-8119-151X 1  

7 Altmetric

Explore all metrics

This paper examines the ethical obligations companies have when implementing generative Artificial Intelligence (AI). We point to the potential cyber security risks companies are exposed to when rushing to adopt generative AI solutions or buying into “AI hype”. While the benefits of implementing generative AI solutions for business have been widely touted, the inherent risks associated have been less well publicised. There are growing concerns that the race to integrate generative AI is not being accompanied by adequate safety measures. The rush to buy into the hype of generative AI and not fall behind the competition is potentially exposing companies to broad and possibly catastrophic cyber-attacks or breaches. In this paper, we outline significant cyber security threats generative AI models pose, including potential ‘backdoors’ in AI models that could compromise user data or the risk of ‘poisoned’ AI models producing false results. In light of these the cyber security concerns, we discuss the moral obligations of implementing generative AI into business by considering the ethical principles of beneficence, non-maleficence, autonomy, justice, and explicability. We identify two examples of ethical concern, overreliance and over-trust in generative AI, both of which can negatively influence business decisions, leaving companies vulnerable to cyber security threats. This paper concludes by recommending a set of checklists for ethical implementation of generative AI in business environment to minimise cyber security risk based on the discussed moral responsibilities and ethical concern.

Avoid common mistakes on your manuscript.

1 Introduction

The recent hype around AI has seen many companies rush to incorporate generative AI to their business strategy. A recent IBM study found that nearly 80% of UK businesses have already deployed generative AI in their business or are planning to within the next year [ 1 ]. The message to industry seems clear “Organizations are seizing the generative AI moment to capture opportunities … Those that don’t will be stuck in the control tower wondering why they’ve fallen behind.” [ 2 ].

Generative AI models take large amounts of data and are then trained to produce data that resembles the most commonly found elements. A Large Language Model (LLM) is a type of generative AI model that assigns statistical probabilities to a sequence of words. These probabilities help to generate human like responses in natural language processing tasks [ 3 ]. Companies are using these LLMs such as ChatGPT, LLaMA, Claude, and Gemini to aid many areas of business. The areas which are most likely to see the potential of generative AI to improve businesses are areas such as sales, marketing, software engineering, customer service and product research and development [ 4 ]. The benefits of its implementation are still being tested, but there is early evidence that AI-based assistants can improve the performance of novice or low-skilled workers [ 5 ].

However, there are growing concerns that the race to integrate generative AI is not being accompanied by adequate guardrails or safety evaluations [ 6 ]. A recent global survey on AI found that few companies were fully prepared for the widespread use of generative AI [ 7 ]. The rush to buy into the hype of generative AI, and not fall behind the competition, is potentially exposing organisations to broad and possibly catastrophic cyber-attacks or breaches. In the growing area of cyber security ethics, the hype around AI presents a novel risk, one which could lead companies to fail in their moral obligation to keep company and individual’s data safe and secure.

We have already seen Microsoft AI researchers accidently leak 38 TB of private training data [ 8 ]; Samsung employees inputting sensitive source code into ChatGPT, [ 9 ]; and a bug in ChatGPT exposing active user’s chat history [ 10 ]. Beyond the risk due to accidents or human error, there are more malicious threats posed by generative AI. Imagined scenarios could see targeted manipulation of the data driving a company’s model to spread misinformation or influence business decisions [ 11 ]. Risks are also increased with the reliance on third-party AI providers, with more than half (55%) of AI related failures stemming from third-party tools, companies can be left vulnerable to unmitigated risks [ 12 ].

It is evident that generative AI poses new and novel threats to business security. A recent IBM survey found that 96% of surveyed business executives expect that adopting generative AI will make a security breach likely in the next three years [ 11 ]. However, this report noted a “glaring disconnect between the understanding of generative AI cyber security needs and the implementation of cyber security measures” [ 11 ]. Reportedly, only 24% of generative AI projects will include a cyber security component within the next 6 months, with 69% of executives saying that innovation takes precedence over cyber security for generative AI [ 11 ]. A separate study found that 53% of organisations saw cyber security as a generative AI-related risk, with only 38% working to mitigate that risk [ 7 ].

The hype around generative AI in business, therefore, presents an area of ethical concern. Ethics is at the core of cyber security, as it is increasingly required to prevent harm to people, not just information, and to protect our ability to live well [ 13 , 14 , 15 ]. Companies have a duty of care toward their users, customers, and employees with regard to protecting the data they hold [ 16 ]. The world is now so reliant on secure networks and systems to protect identities, personal information, and livelihoods that breaches to can have major disruptions and disastrous effects on individual’s lives [ 17 ]. Beyond the effect on the public, it is in the financial interest of companies to focus on cyber security with the average cost of a data breach in 2023 being USD 4.45 million [ 18 ].

As our analysis of potential threats to generative AI models, such as LLMs, will show businesses need to be aware of the increased risk to privacy and security. While companies tout the vast benefits of generative AI for business productivity, there needs to be a greater focus on effective mitigation of threats posed to and by generative AI models [ 6 ]. Conversations of these risks have generally been kept within cyber security industry professionals, but there needs to be a wider understanding of the vulnerabilities which generative AI is susceptible to before organisations jump to using them. There is an ethical responsibility for business to consider the cyber security risk associated with generative AI, and for this information to be shared with the general public.

2 Cyber security as an area of ethical inquiry

As more and more data and information is stored online, and more services move to digital operations, the threat to the security and risk of harm also increases. The definition of cyber security has evolved over time and it often contested [ 19 ]. There remains the question as to whether cyber security is a role, a field, a discipline, or a practical application encompassing a combination of information security, operational security, network and communications security or other security focused disciplines.

A thorough and systematic review of historical definitions of cyber security by Schatz, et al. [ 19 ] arrived at a definition of cyber security that includes the key aspects of protecting information as the core asset. To wit: “ The approach and actions associated with security risk management processes followed by organizations and states to protect confidentiality, integrity and availability of data and assets used in cyber space. The concept includes guidelines, policies and collections of safeguards, technologies, tools and training to provide the best protection for the state of the cyber environment and its users.” [ 19 ].

Schatz’ inclusion of basic protection of the confidentiality, integrity and availability of information has become prescient with the advent of AI generated deepfakes, celebrity images, and AI journalism employing automated authors. This has also led to a greater focus on the ethical implications of cyber security processes and policy. Integrity, for example, is defined as guarding against improper information modification and includes ensuring information authenticity [ 20 ].

Cyber security is a growing field of ethical investigation, with developing literature into the ethical challenges, risks and issues associated [ 14 , 21 , 22 ]. Whether monitoring information flows of individuals, intrusive measures to identify child sexual exploitation material, or restricting access to online sites to deter terrorism and extremism, cyber security can be both intrusive and violate norms of privacy.

One issue faced by the cyber security ethicist is the broad nature of the field of cyber security. There has been a distinction made between the ethics of national or state based cyber security and business or commercial cyber security [ 14 ]. The former of these takes in topics such as the application of just war theory to cyberwar and espionage [ 23 , 24 , 25 ]. However, it is questionable whether cyberwar and espionage do fall under the purview of cyber security or whether cyber security provides a supporting capability to ensure their success.

Alternatively, in the private sector, there are numerous areas of inquiry that fit under the broad umbrella of cyber security ethics. Recent work has focused on the ethics of conducting cyber security research [ 26 ]; the ethical balance between needing internet traffic to be monitored for security, but also wanting it to be private [ 27 ]; the concept of “ethical hacking” to test security of networks or employees [ 28 ]; as well as the ethical obligations of businesses to protect their data [ 16 , 29 ].

We will concentrate on the new ethical challenges presented by generative AI and the resulting cyber security implications for an organisation. To narrow the ethical focus of this paper, we will concentrate on the moral responsibility businesses have to protect their assets as well as user and employee data. It will be shown that the ethical considerations for cyber security on business have clear crossovers for the implementation of generative AI.

Whereas generative AI for public consumption is a relatively new phenomenon, many of the ethical considerations can be derived from previous research and applications of ethics to cyber security activities. The ability to apply ethical considerations to emerging technologies will continue to challenge cyber security professionals as new applications appear and see mainstream adoption.

3 Literature review

In this section we look at the background literature related to AI in cyber security as well as the growing literature on the ethical issues around generative AI tools, such as ChatGPT. We will conclude by showing where the gaps in the literature lie and clearly note the contributions this paper makes to the field. We note that, while there is literature around the risks of generative AI tools, such as ChatGPT, this has not yet translated into the discourse of business ethics. This paper takes the unique angle of framing the implementation of generative AI as a question of business ethics and cyber security ethics.

3.1 AI in cyber security

The relationship between AI and cyber security is not new, with autonomous or semi-autonomous systems for cyber security defence being on the market for a number of years. In 2017, for example, DarkLight was released in what was then called “first of its kind” artificial intelligence tool to enhance cyber security defence [ 30 ]. There has since been literature highlighting the beneficial uses of AI in cyber security defence.

Early uses of AI in cyber security were based around developing discriminative-based machine learning (ML) or deep learning (DL) AI models. ML tools are capable of discriminating data through classifying information, and recognising specific patterns [ 31 ]. Though powerful, ML is also limited in terms of threat detection as it acts according to pre-defined features, meaning that any features not pre-defined will evade detection [ 31 ]. DL models, a subset of ML, on the other hand are able to learn high-level abstract characteristics, or deeper features of given data, making them excel at things like image and speech recognition, text analysis and natural language processing [ 32 ]. This benefits cyber security as it enables the detection of unknown attackers or novel forms of malware. AI assists in cyber security through constructing models for malware classification, intrusion detection and threat intelligence sensing [ 18 ]. Because AI has the ability to extract patterns from large datasets, and adapt to new information, it can accurately make predictions to improve cyber security [ 33 ].

3.2 Cyber security of AI

While the benefits of AI in cyber security have become evident in the preceding years, the malicious threats to AI models have also been recognised. ML and DL models used in AI systems such as recommendation systems or facial recognition are susceptible to ‘poisoning’ or manipulation, potentially undermining their integrity and useability [ 3 , 6 , 34 ]. In practical terms, injecting misleading or incorrect data into an AI model used for cyber security defence can skew its decision making causing it to overlook vulnerabilities or misidentify threats [ 33 ].

Since the increased popularity of generative AI, spurred by the release of ChatGPT in 2022, new discussions have surfaced on the usefulness and risks of such technology. Generative AI is a branch of ML and DL which is capable of creating new data that is similar to its training data set [ 35 ]. Large language models, such as ChatGPT, use text as their dataset, and have caused a boom in AI interest and hype.

The use of generative AI been explored in areas such as healthcare [ 36 , 37 ], education [ 38 ], academia [ 39 ], creative industries [ 40 ], journalism, and media [ 41 ]. At the time of writing, empirical study of the effect of generative AI within work and business is in its infancy, yet its far-ranging impacts are being explored. Studies have so far looked at the effect of generative AI in areas such as call centres [ 5 ], on knowledge worker productivity and quality [ 42 ], risk management and finance [ 43 ] and on operations and supply chains [ 44 ].

3.3 Ethical concerns and risks in AI

For all the new applications and advances in efficiency which generative AI is showing, it has also undoubtedly brought concern with recent work focusing on the ethics around generative AI and ChatGPT [ 45 ]. Some of this literature focuses on the threat which generative AI will have for jobs [ 46 , 47 ], bias in training data affecting its output [ 48 , 49 ], or a diminishing of critical thinking and problem-solving skills amongst users [ 50 ]. Other concerns circle around the threat of disinformation [ 51 ], manipulation of public sentiment [ 52 ], and a widening socio-economic inequalities [ 46 ].

With regards to cyber security, recent work has highlighted the risks to generative AI models such as ChatGPT, and their susceptibility to data poisoning and manipulation [ 3 , 6 , 34 ] similar to earlier ML or DL models. Companies making AI models, such as Open AI and Google, have published their own findings on the risks associated with these models and the techniques they used to train them [ 53 , 54 ]. Generative AI has also reduced the barrier of entry for cybercriminals, helping in malware creation and phishing attacks [ 55 ].

Literature on the cyber security risk of generative AI for business is beginning, with ChatGPT in particular being cited as a potential risk. This includes the risk of data breaches or unauthorised access to user conversations as well as the risk of staff putting sensitive information into the program [ 56 ]. However, there is still a gap in literature translating the technical threats of generative AI into a business setting.

While we have noted some of the ethical issues raised by generative AI, limited work has been done in systematically applying ethical frameworks or lenses to these issues. Schlagwein and Willocks [ 57 ] apply deontological and teleological lenses to judge the ethical use of AI in research and science. Illia et al. [ 58 ] apply a stakeholder theory approach to the ethics of using AI for text-generation in business. The latter, arguing that the use of AI agents diminishes direct communication between stakeholders, potentially causing misunderstandings and leading to a decreased level of trust between parties.

Our paper will look at issues in generative AI in business, through the lens of ethical principles similar to those found in bio-ethics, namely: beneficence, non-maleficence, autonomy, justice and explicability. This builds on work in applying ethical principles both to AI [ 59 ] and to cyber security [ 14 ].We note that not all are convinced of the efficacy of a principlist approach to AI ethics, Bruschi and Diomede [ 60 ] provide a useful summary of this argument. However, while our paper focusses on generative AI, it also does so by looking at it as a technological innovation in the workplace. Thus, we build upon literature which applies ethical principles to the introduction of new technology in society and into the workplace [ 61 , 62 ].

From this review we can see that there is growing literature outlining the risk which could befall Generative AI models. However, this concern has not yet been translated into discourse around the ethical implementation of generative AI for business. This is evidenced by the lack of awareness or concern around the cyber security risk of gen AI amongst business leaders [ 11 ]. This paper therefore makes the following contributions:

Supports the case for cyber security being an ethical obligation for business, using normative ethical principles.

Highlights literature on the cyber security risks associated with generative AI, including the risks of poisoning, manipulation, and data leakage.

Demonstrates how the risks associated with generative AI can threaten business operations and their responsibilities to stakeholders.

Makes the case that businesses have an ethical obligation to consider the cyber security risk of generative AI and provides suggestions based on ethical considerations and analysis.

4 Cyber security of AI as an ethical obligation for business

While many have recognised the need for ethics in cyber security, there has been little clear consensus about the most appropriate framework from which to investigate ethical issues the field. Some advocate for the use of traditional frameworks of deontology, utilitarianism and virtue ethics [ 21 ] while others have proposed using a principlist approach adopted from areas such as bio-ethics [ 14 ].

While broad moral theories of utilitarianism or deontology provide guidance, their effectiveness falls when applied to situations which require pragmatic solutions [ 14 ]. The contextual nuances of cyber security provide difficulty in applying such general theories. For example, some have noted the substantial difficulty in applying a general theory of consequentialism or deontology to a process such as tracking a hacker through the machines of innocent persons [ 63 ].

Greater success has been found in applying ethical principles like those adopted in the field of bioethics. To analyse the ethical obligation of implementing generative AI in business with respect to cyber security concerns, we propose a combination of the ethical framework for a Good AI society from Floridi et al. [ 59 ] and the principlist ethics for cyber security from Formosa et al. [ 14 ].

It is our contention that the application of the moral principles of beneficence, non-maleficence, autonomy, justice and explicability are the most suitable to analyse the ethical concerns regarding the cyber security risks of generative AI for companies. Because adoption of generative AI in business combines both issues of ethical AI and ethics of cyber security, there is utility in applying such a set of principles.

It is evident now that generative AI will have a major impact on the way companies do business, but there are still questions around the opportunities and risks associated with its adoption. An ethical adoption of generative AI should also take into consideration the cyber security risk associated with its implementation. In the next few subsections, we present how the ethical principles of beneficence, non-maleficence, autonomy, justice and explicability relate to businesses adoption of generative AI considering cyber security.

4.1 Beneficence

A core principle of bioethics, beneficence concerns promoting well-being or “doing good”. Implementing a technology such as AI should be for the common good and to generally promote the well-being of people [ 59 ]. Similarly, beneficence in cyber security means protecting privacy and personal data, which subsequently promotes well-being of the public [ 14 ]. Good cyber security also has the added benefit of enhancing the reputation of a company and building trust among their customers.

While AI presents certain risks as we will outline in the following sections, it also opens beneficial opportunities for business such as the potential to increase productivity and reduce workloads on staff [ 5 ]. In cyber security, for example, generative AI can increase threat detection, automate repetitive tasks, scan for threats and learn to detect threat patterns to detect malicious traffic on a network [ 56 ].

It should be noted there is an issue of value judgements when identifying benefits of adopting a new technology. What is best for a company in terms of their bottom line might be different to what is best for individual workers and what is best for the company’s customers.

4.2 Non-maleficence

Non-maleficence or the “do no harm” principle, warns against causing harm or making our lives worse-off overall [ 14 ]. Regarding the development of AI, there should be caution “against the many potentially negative consequences of overusing or misusing AI technologies” [ 59 ].

Similarly, steps must be taken in cyber security to prevent unduly increasing threats or harms to business or other stakeholders. Cyber security practices focus on three core principles: confidentiality, availability and integrity (known as the CIA triad) [ 22 ]. Where confidentiality is broken, information is made unavailable, or the integrity of data is compromised then harm can follow [ 14 ].

In both digital ethics and cyber security, any technology which is implemented in an organisation must be done so with the consideration of the type of harm which could occur and the likelihood of such harm occurring. Accordingly, introducing generative AI must also be done without increasing the risks of harm through breaches in cyber security. Harms can include economic and psychological harms to individuals who, for example, have to go through the stress of being victims of theft or identity fraud [ 17 ]. Harm can also come in the form of financial or reputational loss for organisations [ 17 ]. Organisational planning and work to prevent such harm occurring falls under the principle of non-maleficence [ 14 ].

4.3 Autonomy

In medical ethics, autonomy refers to ability for everyone to have a right to decide for themselves about their own treatment. Autonomy in relation to AI becomes more complex, as we willingly give over forms of control over decision-making power to machines [ 59 ]. Autonomy means balancing what we decide to do for ourselves, and what we give over or delegate to systems and machines [ 59 ]. It can refer to the ability for human agents to be able to choose when to implement, or what decisions to take based on AI recommendations.

There is a crossover here with ethics in cyber security, as autonomy requires the ability for individuals to have access to their data and systems [ 14 ]. Cyber security can prevent unauthorised access to our data but should also give some control over user privacy [ 14 ].

Generative AI provides a distinct ethical consideration regarding autonomy. Data scraping for training AI models takes away the autonomy of individuals to choose to have their data used, possibly infringing on privacy and intellectual property rights. One such example is an artist having their work used to train a model which can subsequently generate new simulated works matching their unique style [ 64 ]. The nature of generative AI models means that once data has been used in its training, there is no option to ‘take-it-back’ or withdraw consent later without deleting the model and starting from a new training set. As we will see when we look at risk factors of generative AI, this could leave individual data exposed to malicious actors with little in the way of protection.

Businesses incorporating generative AI must consider how the data used to their model was trained or sourced. If it is based on customer data for example, should those customers need to give specific informed consent for their data to be used in AI training?

4.4 Justice

There are many conceptions of justice, most of which revolve around promoting fairness and equality. It can also refer to the distribution of benefits and harms, considering their impacts on the least advantaged groups [ 14 ].

Justice with regard to AI means acting to eliminate unfair discrimination, create shared benefits, and prevent the undermining of social structures [ 59 ]. AI development, while bringing many opportunities for innovation, also has the risk of maintaining social inequalities rather than improving them. A feature of LLMs is their propensity to maintain stereotypes and bias [ 65 ]. Businesses implementing AI or generative AI must consider the wider social or justice implications of such technology.

Justice in cyber security should also consider the protection of property, data, and privacy rights [ 14 ]. As much as control over digital privacy is a matter of preserving autonomy , it is also a matter of justice and procedural fairness. Those who are affected by a technology should have a fair opportunity to challenge it. Some questions which will soon come to the fore regarding generative AI are around whether customers have a capability to opt out of their data being used to train AI models. If their data is exposed in a generative AI hack, who is responsible? What legal avenues could they pursue? This will be a matter for law and policy to decide, however, no business will want to be known as the first to have a data breach due to a generative AI attack.

4.5 Explicability

As a feature of procedural fairness, Floridi et al. [ 59 ] point out that there is a need to be able to understand and hold to account decision making in AI, considering both explicability and accountability. “Explicability” can broadly be considered as an answer to the question “how does it work?”, while “accountability” an answer to “who is responsible for the way it works?” [ 59 ]. As with autonomy, there are ethical issues around transparency and accountability.

Formosa et al. [ 14 ] point out that explicability in cyber security also includes procedures for holding people and organisations accountable for failures. The rapid incorporation of AI technologies into the workplace and society broadly, has also led to a rush of people trying to understand the capabilities and limitations of these technologies. Implementing an AI solution into business should also come with relevant training as it should be clear who is accountable and responsible for its use. If a company uses a third party to create a generative AI model, that somehow becomes a threat or leaks valuable information, whose responsibility is it? The company implementing it, the user who utilised it for that task, or the one designing and training the model?

5 Business implementations of AI and large language models – buying into the hype

While some might see cyber security as a technical field meant only for the protection of systems and networks, ultimately the aim of the cyber security professional is to protect the well-being of the public at large [ 13 ]. As an ever-increasing amount of data is gathered and stored about us, there is also an increasing obligation for companies to keep that information secure. The spate of large-scale hacks where private and sensitive information has been leaked has sparked calls for greater responsibility to be taken by companies who handle and store such data [ 66 ]. The implementation of generative AI expands the threat horizon. As companies rush to implement AI, they also have an obligation to understand and work to minimise the threats and subsequent harms this technology could bring.

To many, the main threat AI tools present lie in their ability to replace workers or eliminate traditional human-centred roles. To others, replacing humans with AI tools removes flexibility and responsiveness and takes out the humanity of traditional, customer-oriented services. However, to early adopters, AI is seen as a panacea of efficiency and effectiveness, removing the barriers to improving customer service and business while expanding business opportunities into previously unknown areas. To these business owners, AI tools work 24/7, do not ask for time off, can be modified at will, and do not suffer from the traditional personal and professional challenges of human employees. Where AI tools have not replaced human employees, AI tools are seen as enhancements to human-centric jobs and can improve their performance and responsiveness significantly.

However, with the adoption of early generative AI tools come higher error rates and challenges in fine tuning them to support traditional business models. A lack of understanding of how proprietary company data, once fed into an LLM, exposes the company to potential IP issues. Further, as many users have discovered, generative AI output is only as good as the data used to train the model. Generative AI results have often yielded biased, racist, and often incorrect information owing to ineffectual model tuning, limited cross validation process and operationalisation. Therefore, owing to a lack of critical thinking and analysis skills in the corporate sector may result in both poor performance and embarrassing results.

While long term expectations are that AI tools will undoubtedly result in business efficiencies, reduced labour costs, and the ability to increase the number of customers served, the short-term prognosis for their use has been mixed. Positively, the advent and adoption of AI tools has meant the creation of new job positions such as prompt engineers, Machine Learning trainers and validators, AI deployment specialists, and coders. We would also expect that new positions as AI ethicists and data control and evaluation specialists would also be a part of the new technology explosion.

5.1 The cyber-threat of AI adoption

The mass adoption of generative AI will amplify existing cyber and information security threats bringing new areas of concern. In the cyber security field, hackers and cyber criminals have also adopted AI to support hacking, online scams, and phishing emails [ 56 ]. AI serves as a force multiplier while enhancing the skills of previously mediocre cyber criminals. Despite numerous controls and safety measures, entire websites are devoted to circumventing these controls and jailbreaking existing tools. In some cases, Darkweb hackers now offer tailored AI tools to support online criminal enterprises [ 67 ]. Hackers have also traded in stolen ChatGPT login credentials, creating targets for information theft as ChatGPT profiles store a history of queries and responses [ 68 ].

Owing to its rapid deployment and universal adoption throughout the public and private sectors, there is a greater risk that generative AI could be ‘hacked’ or otherwise misappropriated. While most software applications are traditionally extensively evaluated for security and vulnerabilities, this has been lacking in generative AI. In traditional software development models we can trace a “bug” back to its cause, even if that cause is a complex interrelation with other programs, libraries services or even time itself, but generative AI adds another dimension since it’s based on such large data sets, The creative use of seemingly innocuous applications such as generative AI by criminals and adversarial nation states often results in technology surprise and creates new lines of exploitation. Whereas policy and regulatory controls are often lacking with these new technologies, their adoption without due consideration places organisations at risk. This exacerbates the potential risk with the rapid implementation of AI in workplaces, without sufficient thought or oversight.

6 Cyber security risk factors for generative AI and large language models

The following threats have been identified by cyber security researchers, and as of yet have not been known to be maliciously exploited. Even though some of these threats remain speculative in their possibility, they give reason to consider the safety of generative AI models.

6.1 Data poisoning

Firstly, there is a risk that bad actors could manipulate training data which is used to create generative AI models like LLMs. LLMs are trained on data sets scraped from across the internet, a malicious actor could store altered or ‘poisoned’ information waiting for that model to scrape the training data as it is updated [ 54 ]. This poisoned data would then surface in responses given by the model. This is especially true with the recent creation by OpenAI of personal GPTs [ 69 ]. Personal GPTs can be created by anyone to operate alongside of OpenAI’s ChatGPT and may be narrowly focused on one field or topic area. These GPT models are trained and validated the same way as other GPTs but with a narrowly defined set of input data. If the data is skewed or biased, the resulting output will reflect the ingested data. Not only could this lead to incorrect or skewed data, but it could also be used to support extremist viewpoints or to exploit vulnerable user groups.

Historically, data seeding has been used to influence Internet users through data propagation and search engine optimisation [ 70 ]. This strategy has now evolved to influence AI LLMs by prepopulating websites, social media and databases with information that data training will ingest and incorporate into AI results. A recent report by Google outlined an example where an attacker might want to influence public sentiment about a politician, so that whenever the model is queried about that politician it gives a positive response [ 54 ]. The researchers pointed out that is possible for an attacker to buy expired domains that used to have content about a politician, modifying it to be more positive [ 54 ]. The follow-on effect being that an LLM which scraped those sites would proceed to give those favourable results when asked. Further research indicated that an attacker only needs to control 0.01% of a dataset to poison it, which could be done for a cost of just US$60 [ 34 ]. If this is correct for all datasets, then there is a low barrier for someone able to poison any dataset and undermine the reliability of the subsequent model.

While influence operations have historically been the purview of governments, the integration of AI tools used by the masses makes disinformation campaigns and influence operations available to anyone. As we’ve seen recently, companies training AI have run afoul of copyright claims, but their tool flexibility and ease of access may also violate the CIA triad identified by Schatz et al. [ 19 ]. The use of autonomous tools designed to respond to human interrogatories with false, private or biased information is not generally addressed within our traditional view of cyber security. Unless we treat AI as a potential bad actor, those actions, controlled by complex rulesets and instigated by prompt engineers, may simply be viewed as anomalous and not worthy of consideration as a separate entity within our definition of cyber security.

Others have similarly argued that disinformation meets the conditions to be considered a cyber security risk due to the threat to business reputation, calling into question the integrity of data, and the psychological threat to individuals due to distrust [ 71 ]. Whether or not disinformation is directly an issue of cyber security, it has nonetheless been seen as a business risk to consider, due to the potential of influencing investment decisions or causing supply chain disruptions [ 72 , 73 ].

OpenAI specifically addresses the potential misuse of language models for disinformation campaigns by various actors including “ propagandists for hire ” [ 74 ]. Potential solutions to mitigate the impact of propaganda and disinformation campaigns include improved fact-sensitive models, tagging information for easier tracking, government control over data collection and AI hardware.

6.2 Training data extraction

Early test attacks on GTP-2 showed that it was possible for adversaries to extract specific examples of training data just by querying large language models [ 3 ]. The test showed the possibility of extracting exact words and phrases used in the training of the model, alarmingly this included public personally identifiable information such as names, phone numbers and email addresses [ 3 ]. This information only needed to appear once in the training data. In February 2023, a Harvard University student used a ‘prompt injection’ attack on Bing chat to gain access to a document otherwise hidden to users [ 75 ]. This could be a risk as many companies are training their own internal LLMs. A company which is training its own LLM with proprietary information could run the risk of having sensitive information exposed through such an attack.

6.3 Backdooring the model

More alarmingly, is the risk for indirect prompt injection [ 6 ]. In this case attackers can strategically inject prompts into training data, which can then allow attackers to indirectly exploit or completely take control of a system, without the need to access the model itself [ 6 ]. Similar to the example of data poisoning, a training data set could include malicious content that, instead of providing misinformation, could provide specific coded instructions for the model to follow.

Google researchers have pointed out that a model could be built with hidden outputs when a specific “trigger” is activated [ 54 ]. This code could, for example, trigger a download of malicious code onto the user’s device or control certain outputs of the model, changing the response or action the model takes. The researchers give the example of an attacker uploading a new kind of AI image classification tool to GitHub. While the program appears to run smoothly, the attacker could have stored malicious code to download malware on a device after a certain trigger is activated [ 54 ].

These are just some of the examples of ways in which malicious actors might be able to manipulate and otherwise affect the reliability of AI models.

6.4 Adversarial prompting

LLMs are generally built with safeguards around generating contents that are harmful and misaligned with common moral and ethical standards. However, several researchers have demonstrated that using specific or augmented prompts can bypass the safety measures and trick these models into providing harmful content. Typically referred to as “jailbreaking,” there are numerous online resources that provide instruction to users on developing prompts that will bypass the controls of the AI engine [ 76 ]. A jailbreak prompt instructs the AI engine to ignore any previous coded instructions, emulate another, less restrictive engine, or incorporate specific attributes to respond to the user’s instructions. An example that has been used previously by users is to invoke the Do Anything Now (DAN) mode in ChatGPT. While in DAN mode, ChatGPT is more responsive to user requests that potentially violate its rules.

7 Ethical implications of generative AI risk

We now turn to the ethical implications for the risks mentioned in Sect. 6. As some of the examples in Sect. 6 have shown there are multiple attacks which AI models could be vulnerable to. It is important that businesses who are planning to implement such tools within their organisation recognise and be alert to the potential harm that could come from such use. We will use the ethical principles for cyber security outlined in Sect. 4, to show what ethical concerns businesses must consider in light of the cyber security risk of generative AI models.

These threats outlined in Sect. 6 are enabled or exacerbated in two ways, by users either (a) over-relying on the output of an AI program or (b) over-trusting what information they give over to it. Firstly, by over-relying on the output of a generative AI model, employees risk making potentially harmful decisions or exposing systems to malware through phishing scam attacks. Secondly, by over-trusting the security of training data or the information put into an LLM model, there is the increased risk for data leak or theft.

7.1 Overreliance

There is evidence to suggest that people are susceptible to overreliance on AI decision making, even when it is detrimental to their work [ 77 , 78 ]. Instead of combining critical thinking and their own insights into a problem along with an AI model, people frequently over-rely on the AI even if they would have made a better choice on their own [ 77 ]. This is also known as ‘automation bias’. Pilots have been shown to place trust in incorrect automated processes, even if they would not have done so without automated recommendations [ 79 , 80 ]. Pilots must go through special training to overcome these types of automation bias. When generative AI solutions are implemented in business, there must be a consideration of what training will be sufficient to combat overreliance or automation bias.

One solution proposed to combat overreliance has been explainable AI (XAI) where a system gives reasons for its decision. The idea being that if a system can give people an explanation for how it came to a decision, they might be more easily be able to spot errors, reducing overreliance. However, it is debatable whether explainable AI does reduce overreliance and more research is being done on what circumstances explainable AI could be effective [ 78 ].

It is widely recognised that generative AI systems have the capacity to hallucinate, casting doubt on “the whole information environment” [ 53 ]. Beyond hallucinations, as the above analysis of cyber threats show, the capacity for malicious actors to purposely poison output from such models to give incorrect information gives extra cause for concern. There is a risk that hackers could change the data driving a company’s AI model, potentially influencing business decisions with targeted manipulation or misinformation [ 11 ].

In line with the principle of beneficence, ethical implementation of generative AI in business should be of benefit to employees, promote well-being and make the workplace better overall. Guardrails should be in place to ensure that its implementation is not providing more avenues for employees to make mistakes, which could potentially lead to cyber security risks.

The introduction of generative AI must also be done without risking increasing threats or harms to business or other stakeholders. Non-maleficence warns against the negative consequences of overusing or misusing AI technologies [ 59 ]. The adoption of generative AI within a company should be done while recognising the increased risk of a cyber security incident. For example, over-relying on generative AI in coding can serve as a more immediate cyber security threat, as past versions of GitHub’s Copilot were found to recommend insecure and vulnerable code to developers [ 81 ]. However, few companies are prioritising protection against the cyber security risk of generative AI [ 11 ].

The level of overreliance on the system as a source of truth, where users are not trained or used to questioning its output can increase the threat of cyber security breaches and subsequent harm. Overreliance could also be exploited by indirect prompt injection, with researchers demonstrating the possibility for a ‘hacked’ LLM to elicit information from a user [ 6 ]. By injecting instructions into an LLM, researchers were able to have the model ask users questions, enabling them to gain information such as the user’s real name [ 6 ]. If workers over-rely on a generative AI system, they might give over such information in a conversation without thinking of it as being a risk.

The issue of AI literacy, education and equality must be emphasised when integrating generative AI. Once trust and ubiquity of generative AI in business has been built, businesses should consider whether there will be a threat of delegating too much to machines, thereby threatening the autonomy of workers. Moreover, what impact could this have to erode the capacity of workers to make choices, especially in significant decision making? Over time, and once AI models become engrained in the operations of a workplace, employee capacity to judge the lines between what the AI can and cannot do might become blurred. For example, is a new staff member, going to understand when to rely on an AI decision and when not to? This will also be a challenge as there is evidence that LLMs change their behaviour over time [ 82 ].

We earlier defined justice as it relates to AI as promoting fairness, equality, and shared benefits. As a matter of justice, the displacement of jobs is a recognised threat of AI integration, threatening fairness and equality [ 46 ]. Generative AI can be used in internal business process such as human resource management (HRM) for training and development initiatives, resource allocation and employee engagement [ 83 ]. But HRM decisions also have an impact on individuals, such as who gets hired or fired, who gets better appraisals, or who is put on preferred projects [ 84 ]. These type of decisions all have psychological impacts on employees [ 85 ]. If generative AI is used in the process of evaluating staff performance it must be done so in light of distributive justice (everyone is treated the same way by the system) and procedural justice (the processes employed to reach a decision are transparent) [ 84 , 86 ]. This last point concerns the principle of explicability . When implementing a generative AI system, its use and capabilities should be explainable to all users. Management and employees should know why certain systems are used, how they make their decisions and on what information in order to reduce possible overreliance.

7.2 Over-trust

The second factor we identify is what we term as over-trust in generative AI systems. This refers to the degree to which users trust a model with sensitive information, or trust that it is safe and secure. Studies have found that a proportion of employees have pasted sensitive information into ChatGPT [ 87 ]. Companies such as Samsung moved to ban employees using ChatGPT as a result of company proprietary material being placed into the program [ 9 ]. There is also increasing trust placed in third-party AI providers, without always a consideration of the cyber security risks [ 12 ].

Some companies have moved to create their own in-house AI models trained on company data and information to assist staff with queries. BloombergGPT, for example, an LLM that was purpose built from the scratch for finance by Bloomberg [ 88 ]. The training and use of such a model brings its own security challenges, as we have seen such models are susceptible to data extraction attacks [ 3 ]. Bloomberg, for their part, chose not to release their model citing security concerns of a model trained on so much company data being potentially exposed through nefarious means increasing risk for harm [ 88 ]. Training such a model is cost intensive and not something which is an option for many businesses.

Large companies such as Morgan Stanley are using cloud-based systems only accessible to its employees. While some argue a leak of confidential or private information “should not be a problem” [ 89 ] this thinking ignores the risk of internal threats and of actors trying to use attacks such as training data extraction. It also ignores the risk of the model being otherwise leaked, as happened with Meta’s AI language model LLaMA [ 90 ]. There is also the risk of accidental data leaks, such as the recent 38 TB of data accidentally exposed by Microsoft AI researchers [ 8 ].

Companies such as Salesforce have touted promises of plugging the AI “trust gap”, promoting services to protect company information while using AI tools, a package which will reportedly cost businesses $360,000 per year to implement [ 91 ].

With companies implementing domain specific LLMs, in an unregulated market, ethical considerations should still be implemented to protect the security of data. By applying the ethical principles from Sect. 4, we can see how over-trust in a new and untested technology presents ethical issues for companies.

In terms of beneficence, there are many positive benefits for the training of generative AI and large language models on proprietary content or knowledge [ 89 ]. This can be useful in assisting customer-facing employees find information about company policy, solving customer problems, or keeping employee knowledge when they leave the organisation [ 89 ]. In implementing such a strategy, companies and staff must have best practises in mind, and continually revise its use. Morgan Stanley reportedly used 1,000 financial managers to fine tune its model for safety and use [ 92 ]. However, this kind of resource intensive safeguard is not something that is practical for all businesses.

By trusting generative AI systems to store and process data, organisations could also be exposing themselves to added security threats. Non-maleficence (“do no harm”) in this case does not just mean intentional harm, but also means preventing accidental harm or the harm from the “unpredicted behaviour of machines” [ 59 ]. Placing data or sensitive information into generative AI models, could increase the threat of infringing upon personal privacy by increasing a company’s exposure to cyber-risk. Generative AI models can be susceptible to attacks such as prompt injection attacks or data extraction attacks, both of which have the potential to leak sensitive data [ 6 ]. If we consider that IBM estimates that only 24% of generative AI projects will include a cyber security component within the next 6 months [ 11 ], then this rush to adopt AI is leading to users being exposed to unnecessary consequences.

A new question raised by generative AI is what autonomy do customers have over their information being stored or used in a model which potentially has flaws in security? If generative AI programs become widespread and ubiquitous in business, should customers have to give their consent for their information to be either (a) be used in the training set of a model; or (b) to be inputted into the finished model?

Regulations about the business use of these models is on the horizon, but there are many questions still to be considered. If users or customers have a right for their data to be erased from a database, such as under the rules of the GDPR, similar protections cannot be offered once a model has been trained a person’s data. There is also no option to later withdraw consent once a model has been trained. Mechanisms and best practice around the use of customer information, which could threaten autonomy, must be taken into consideration. The ethical AI guidelines Floridi et al. [ 59 ] point out that the autonomy of humans should be promoted, while also limiting the autonomy of machines, and making them intrinsically reversable. The problem with LLMs is that they are lacking in the capacity to be reversable.

Justice in both AI and cyber security encompasses the protection of rights, in particular the right to privacy over data. In using and training models with data taken from users, for example, there must be a consideration for the protection of this data. The susceptibility of models to attacks can include the threat of information or data theft [ 6 ].

Justice can also refer to recourse available when something goes wrong with AI systems or in cyber security. As more companies use LLMs the greater the risk becomes of data being leaked. Without clear guidelines or regulation in place, what recourse do users have if their data is used in a training model and then subsequently exposed? If a company is using a third-party AI provider, is it clear where the responsibility for any failures lies?

A follow on from the ethical considerations of justice, is the principle of explicability. With the rapid implementation of generative AI, are customers being informed whether their data is being used to train new company models? Large companies such as Facebook, Amazon and X (formerly Twitter) all have plans to train LLMs using user data [ 93 ]. Amazon plans to train its LLM using voice data from Alexa conversations [ 93 ]. Do customers need to opt-in to their data being used to train generative AI models? If their data is exposed in a generative AI hack, who is responsible? What legal avenues could they pursue? Explicability entails who is made accountable for failures in cyber security, in the result of a breach due to generative AI, do companies know who would be at fault or where the responsibility lies?

8 Ethical implementation of generative AI

The above analysis shows the many ethical questions which are raised by thinking about cyber security and generative AI for business. We argue that cyber security needs to be an ethical consideration for businesses implementing generative AI. As such, we offer five key recommendations which companies can adopt to ensure that the security risk of using AI models is limited.

A secure and ethical AI model design.

When designing an AI model, companies should ensure that their designs take into consideration the principles such as beneficence and non-maleficence. This means considering the potential harms and security risks which could be exposed through the model. Each design should also include non-discriminatory principles to avoid biases and unexpected outcomes from the AI models. Following the principle of explicability, companies should ensure their AI training is easily explainable and transparent in its design.

A trusted and fair data collection process.

Companies need to ensure data collected is accurate, fair, representative, and legally sourced. As the principle of autonomy demonstrates, there should be considerations of how much users can have a say about how their data is used in the training of a model. Companies should consider whether they will need to have an opt-in or opt-out systems to protect the privacy of users or customers.

A secure data storage.

Companies will need to adhere to the privacy best practices for all the data stored, whether it is training data or input data from users. This should also be done while considering the risk of leaks through hacks such as training data extraction. With regulation of generative AI on the horizon, companies must now prepare by putting in place their own policies over what data is used, while considering the risk that this data could be exposed. This takes into consideration the principle of justice, in the prevention of possible data leaks.

Ethical AI model retraining and maintenance.

To maintain model currency and accuracy, AI models require retraining from time to time. Companies need to perform sufficient checks and tests after retraining the AI model and updating the generative AI applications to ensure it maintains its ethical standards and accuracy. In terms of cyber security, this also means constant monitoring for signs of influence, malware or the AI focused attacks as outlined in this paper. New defence training and policies will be needed to monitor for these threats.

Upskilling, training staff and managing staff.

One of the biggest pain points for business is upskilling and training staff. When implementing a strategy with generative AI, companies should consider what benefit the AI is bringing, while also considering the human impact this will have on staff. If staff are being asked to work with, train or implement models, they might be concerned that they will soon be replaced by these models. Upskilling and training will also be essential to mitigate the potential threats from overreliance and over-trust in new generative AI models.

9 Conclusion

We have seen that implementation of generative AI comes with considerable cyber security risk for businesses. When rushing to implement generative AI and not fall behind others in industry, companies are also increasing the risk for cyber security breaches. While there is a great momentum toward incorporating generative AI, there also needs to be a consideration of the ethical responsibility toward the protection of data and prevention against threats.

A major risk with the rush to market of generative AI is its adoption by workers without guidance or understanding of how various generative AI tools are produced, managed or of the risks they pose. This lack of understanding can leave companies open to cyber security threats. We point out two ways in which this can happen: overreliance and over-trust in generative AI systems. While these two are related, each offers distinct risks and ethical challenges.

The ethical principles of beneficence, non-maleficence, autonomy, justice and explicability are useful lenses through which business can view their obligations when planning to implement data-safe and cyber-secure generative AI solutions.

The rapid adoption of generative AI seems to be moving faster than the industry’s understanding of the technology and its inherent ethical and cyber security risks. Companies will need to manage the risk from new vulnerabilities due to generative AI, requiring new forms of governance and regulatory frameworks. Employee training, procedures and managed implementation are an ethical responsibility to protect workers, sensitive company information and the public. Companies now have the opportunity to prevent expensive and unnecessary consequences of generative AI, by addressing the ethical and cyber security threats and investing in data protection measures.

IBM: Leadership in the age of AI. IBM: (2023)

IBM: The CEO’s Guide to Generative AI: Supply chain. IBM: (2023)

Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T.B., Song, D.X., Erlingsson, Ú., Oprea, A., Raffel, C.: Extracting Training Data from Large Language Models. In: USENIX Security Symposium. (2020)

McKinsey & Company: The Economic Potential of Generative AI: The next Productivity Frontier. McKinsey & Company (2023)

Brynjolfsson, E., Li, D., Raymond, L.: Generative AI at Work. National Bureau of Economic Research (2023)

Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., Fritz, M.: More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models. arXiv preprint arXiv:2302.12173 (2023)

Chui, M., Yee, L., Singla, A., Sukharevsky, A.: The State of AI in 2023: Generative AI’s Breakout year. McKinsey & Company (2023)

Ben-Sasson, H., Greenberg, R.: 38 TB of data accidentally exposed by Microsoft AI researchers (2023). https://www.wiz.io/blog/38-terabytes-of-private-data-accidentally-exposed-by-microsoft-ai-researchers . Accessed 22 November 2023

Park, K.: Samsung bans use of generative AI tools like ChatGPT after April internal data leak (2023). https://techcrunch.com/2023/05/02/samsung-bans-use-of-generative-ai-tools-like-chatgpt-after-april-internal-data-leak/ . Accessed 22 November 2023

OpenAI: March 20 ChatGPT outage: Here’s what happened: (2023). https://openai.com/blog/march-20-chatgpt-outage

IBM: The CEO’s guide to generative AI: Cybersecurity. IBM: (2023)

Renieris, E.M., Kiron, D., Mills, S.: Building Robust RAI Programs as Third-Party AI tools proliferate. MIT Sloan Manage. Rev. (2023)

Vallor, S.: An Introduction to Cybersecurity Ethics. Markkula Center for Applied Ethics (2018). https://www.scu.edu/media/ethics-center/technology-ethics/IntroToCybersecurityEthics.pdf

Formosa, P., Wilson, M., Richards, D.: A principlist framework for cybersecurity ethics. Computers Secur. 109 , 102382 (2021). https://doi.org/10.1016/j.cose.2021.102382

Article   Google Scholar  

Blanken-Webb, J., Palmer, I., Campbell, R.H., Burbules, N.C., Bashir, M.: Cybersecurity Ethics. Foundations of Information Ethics, pp. 91–101. American Library Association (2019)

Morgan, G., Gordijn, B.: A care-based stakeholder approach to ethics of cybersecurity in business. In: Christen, M., Gordijn, B., Loi, M. (eds.) The ethics of cybersecurity https://doi.org/ (2020). https://doi.org/10.1007/978-3-030-29053-5_6 , pp. 119–138

Agrafiotis, I., Nurse, J.R.C., Goldsmith, M., Creese, S., Upton, D.: A taxonomy of cyber-harms: Defining the impacts of cyber-attacks and understanding how they propagate. J. Cybersecur. 4 (2018). https://doi.org/10.1093/cybsec/tyy006

IBM: Cost of a Data Breach Report 2023. IBM: (2023)

Schatz, D., Bashroush, R., Wall, J.: Towards a more representative definition of Cyber Security. J. Digit. Forensics Se. 12 , 53–74 (2017)

Google Scholar  

National Institute of Standards and Technology:, https://csrc.nist.gov/glossary/term/integrity

Manjikian, M.: Cybersecurity Ethics: An Introduction. Routledge, London (2023)

Christen, M., Gordijn, B., Loi, M.: The Ethics of Cybersecurity. The International Library of Ethics. Law Technol. (2020). https://doi.org/10.1007/978-3-030-29053-5

Finlay, C.J.: Just War, Cyber War, and the Concept of Violence. Philos. Technol. 31 , 357–377 (2018). https://doi.org/10.1007/s13347-017-0299-6

Taddeo, M.: Information Warfare: A Philosophical Perspective. The Ethics of Information Technologies10.4324/9781003075011-35, pp. 461–476. Routledge (2020)

Taddeo, M.: An analysis for a just cyber warfare. 4th Int. Conf. Cyber Confl. (CYCON 2012). pp 1–10 , Tallinn–Estonia (2012). (2012)

Macnish, K., van der Ham, J.: Ethics in cybersecurity research and practice. Technol. Soc. 63 (2020). https://doi.org/10.1016/j.techsoc.2020.101382

Van De Poel, I.: Core Values and Value Conflicts in Cybersecurity: Beyond Privacy Versus Security, pp. 45–71. Springer International Publishing (2020). https://doi.org/10.1007/978-3-030-29053-5_3

Jaquet-Chiffelle, D.-O., Loi, M.: Ethical and Unethical Hacking, pp. 179–204. Springer International Publishing (2020). https://doi.org/10.1007/978-3-030-29053-5_9

Brey, P.: Ethical Aspects of Information Security and Privacy, pp. 21–36. Springer, Berlin Heidelberg (2007). https://doi.org/10.1007/978-3-540-69861-6_3

Book   Google Scholar  

Riley, S.: DarkLight Offers First of its Kind Artificial Intelligence to Enhance Cybersecurity Defenses. (2017). https://www.businesswire.com/news/home/20170726005117/en/DarkLight-Offers Kind-Artificial-Intelligence-Enhance-Cybersecurity. Business Wire Accessed 05 February 2024

Li, J.H.: Cyber security meets artificial intelligence: A survey. Front. Inf. Tech. El. 19 , 1462–1474 (2018). https://doi.org/10.1631/Fitee.1800573

Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature. 521 , 436–444 (2015). https://doi.org/10.1038/nature14539

Article   ADS   CAS   PubMed   Google Scholar  

Kumar, S., Gupta, U., Singh, A.K., Singh, A.K.: Artificial Intelligence: Revolutionizing Cyber Security in the Digital era. J. Computers Mech. Manage. 2 , 31–42 (2023). https://doi.org/10.57159/gadl.jcmm.2.3.23064

Carlini, N., Jagielski, M., Choquette-Choo, C.A., Paleka, D., Pearce, W., Anderson, H., Terzis, A., Thomas, K., Tramèr, F.: Poisoning Web-Scale Training Datasets is Practical. (2023). ArXiv abs/2302.10149

Foster, D.: Generative deep Learning. O’Reilly Media, Inc. (2022)

Sallam, M.: In: Healthcare (ed.) ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns, p. 887. MDPI (2023)

Cascella, M., Montomoli, J., Bellini, V., Bignami, E.: Evaluating the feasibility of ChatGPT in healthcare: An analysis of multiple clinical and research scenarios. J. Med. Syst. 47 , 33 (2023)

Article   PubMed   PubMed Central   Google Scholar  

Lo, C.K.: What is the impact of ChatGPT on education? A rapid review of the literature. Educ. Sci. 13 , 410 (2023)

Article   ADS   Google Scholar  

Stokel-Walker, C.: ChatGPT listed as author on research papers: Many scientists disapprove. Nature. 613 , 620–621 (2023)

Hutson, J., Harper-Nichols, M.: Generative AI and Algorithmic Art: Disrupting the Framing of Meaning and Rethinking the Subject-Object Dilemma. Global Journal of Computer Science and Technology: D 23, (2023)

Pavlik, J.V.: Collaborating with ChatGPT: Considering the implications of generative artificial intelligence for journalism and media education. Journalism Mass. Communication Educ. 78 , 84–93 (2023)

Dell’Acqua, F., McFowland, E., Mollick, E.R., Lifshitz-Assaf, H., Kellogg, K., Rajendran, S., Krayer, L., Candelon, F., Lakhani, K.R.: Navigating the jagged technological frontier: Field experimental evidence of the effects of AI on knowledge worker productivity and quality. Harvard Business School Technology & Operations Mgt. Unit Working Paper (2023)

Chen, B., Wu, Z., Zhao, R.: From fiction to fact: The growing role of generative AI in business and finance. J. Chin. Economic Bus. Stud. 21 , 471–496 (2023). https://doi.org/10.1080/14765284.2023.2245279

Wamba, S.F., Queiroz, M.M., Jabbour, C.J.C., Shi, C.V.: Are both generative AI and ChatGPT game changers for 21st-Century operations and supply chain excellence? Int. J. Prod. Econ. 265 , 109015 (2023)

Stahl, B.C., Eke, D.: The ethics of ChatGPT–Exploring the ethical issues of an emerging technology. Int. J. Inf. Manag. 74 , 102700 (2024)

Krzysztof Wach, C.D.D., Joanna Ejdys, R., Kazlauskaitė, P., Korzynski, G., Mazurek: Joanna Paliszkiewicz, Ewa Ziemba: The dark side of Generative Artificial Intelligence: A critical analysis of controversies and risks of ChatGPT

Zarifhonarvar, A.: Economics of chatgpt: A labor market view on the occupational impact of artificial intelligence. J. Electron. Bus. Digit. Econ. (2023)

Gross, N.: What chatGPT tells us about gender: A cautionary tale about performativity and gender biases in AI. Social Sci. 12 , 435 (2023)

Ray, P.P.: ChatGPT: A Comprehensive Review on Background, Applications, key Challenges, bias, Ethics, Limitations and Future Scope. Internet of Things and Cyber-Physical Systems (2023)

Rahman, M.M., Watanobe, Y.: ChatGPT for Education and Research: Opportunities, threats, and strategies. Appl. Sci. 13 , 5783 (2023), https://doi.org/10.3390/app13095783

De Angelis, L., Baglivo, F., Arzilli, G., Privitera, G.P., Ferragina, P., Tozzi, A.E., Rizzo, C.: ChatGPT and the rise of large language models: The new AI-driven infodemic threat in public health. Front. Public. Health. 11 , 1166120 (2023)

Ferrara, E.: Social bot detection in the age of ChatGPT: Challenges and opportunities. First Monday (2023)

OpenAI: GPT-4 System Card. OpenAI: (2023)

Fabian, D., Crisp, J.: Why Red Teams Play a Central Role in Helping Organizations Secure AI Systems. Google (2023)

Sebastian, G.: Do ChatGPT and other AI Chatbots pose a cybersecurity risk? Int. J. Secur. Priv. Pervasive Comput. 15 , 1–11 (2023). https://doi.org/10.4018/ijsppc.320225

Gupta, M., Akiri, C., Aryal, K., Parker, E., Praharaj, L.: From ChatGPT to ThreatGPT: Impact of generative AI in cybersecurity and privacy. IEEE Access. (2023)

Schlagwein, D., Willcocks, L.: ChatGPT et al.’: The ethics of using (generative) artificial intelligence in research and science. J. Inform. Technol. 38 , 232–238 (2023). https://doi.org/10.1177/02683962231200411

Illia, L., Colleoni, E., Zyglidopoulos, S.: Ethical implications of text generation in the age of artificial intelligence. Bus. Ethics Environ. Responsib. 32 , 201–210 (2023). https://doi.org/10.1111/beer.12479

Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin, R., Pagallo, U., Rossi, F., Schafer, B., Valcke, P., Vayena, E.: AI4People—An ethical Framework for a good AI society: Opportunities, risks, principles, and recommendations. Mind. Mach. 28 , 689–707 (2018). https://doi.org/10.1007/s11023-018-9482-5

Bruschi, D., Diomede, N.: A framework for assessing AI ethics with applications to cybersecurity. AI Ethics. 3 , 65–72 (2023). https://doi.org/10.1007/s43681-022-00162-8

Van De Poel, I.: An ethical Framework for evaluating Experimental Technology. Sci Eng. Ethics. 22 , 667–686 (2016). https://doi.org/10.1007/s11948-015-9724-3

Article   PubMed   Google Scholar  

Hosseini, Z., Nyholm, S., Le Blanc, P.M., Preenen, P.T.Y., Demerouti, E.: Assessing the artificially intelligent workplace: An ethical framework for evaluating experimental technologies in workplace settings. AI Ethics. (2023). https://doi.org/10.1007/s43681-023-00265-w

Himma, K.E.: The Ethics of tracing Hacker attacks through the machines of innocent persons. Int. Rev. Inform. Ethics. 2 (2004). https://doi.org/10.29173/irie256

Franceschelli, G., Musolesi, M.: Copyright in generative deep learning. Data Policy 4 , e17 (2022)

Kirk, H.R., Jun, Y., Volpin, F., Iqbal, H., Benussi, E., Dreyer, F., Shtedritski, A., Asano, Y.: Bias out-of-the-box: An empirical analysis of intersectional occupational biases in popular generative language models. Adv. Neural. Inf. Process. Syst. 34 , 2611–2624 (2021)

Spinello, R.A.: Corporate Data breaches: A Moral and Legal Analysis. J. Inform. Ethics. 30 , 12–32 (2021). https://doi.org/https://doi.org/ https://doi.org/10.2307/JIE.30.1.12

Erzberger, A.: WormGPT and FraudGPT – The Rise of Malicious LLMs (2023). https://www.trustwave.com/en-us/resources/blogs/spiderlabs-blog/wormgpt-and-fraudgpt-the-rise-of-malicious-llms/ . Accessed 27 November 2023

Group-IB: Group-IB Discovers 100K + Compromised ChatGPT Accounts on Dark Web Marketplaces: (2023). https://www.group-ib.com/media-center/press-releases/stealers-chatgpt-credentials/ . Accessed 27 November 2023

OpenAI: Introducing GPTs: (2023). https://openai.com/blog/introducing-gpts

Gelper, S., van der Lans, R., van Bruggen, G.: Competition for attention in online social networks: Implications for seeding strategies. Manage. Sci. 67 , 1026–1047 (2021)

Caramancion, K.M.: An exploration of disinformation as a cybersecurity threat. In: 2020 3rd International Conference on Information and Computer Technologies (ICICT), pp. 440–444. IEEE, (2020)

Petratos, P.N., Faccia, A.: Fake news, misinformation, disinformation and supply chain risks and disruptions: Risk management and resilience using blockchain. Ann. Oper. Res. 327 , 735–762 (2023). https://doi.org/10.1007/s10479-023-05242-4

Petratos, P.N.: Misinformation, disinformation, and fake news: Cyber risks to business. Bus. Horiz. 64 , 763–774 (2021)

Goldstein, J.A., Sastry, G., Musser, M., DiResta, R., Gentzel, M., Sedova, K.: Generative language models and automated influence operations: Emerging threats and potential mitigations. arXiv Preprint arXiv:230104246 (2023)

Edwards, B.: AI-powered Bing Chat spills its secrets via prompt injection attack (2023). https://arstechnica.com/information-technology/2023/02/ai-powered-bing-chat-spills-its-secrets-via-prompt-injection-attack/ . Accessed 27 November 2023

Boxleitner, A.: Pushing Boundaries or Crossing Lines? The Complex Ethics of ChatGPT Jailbreaking. The Complex Ethics of ChatGPT JailbreakingOctober 17, (2023) (2023)

Buçinca, Z., B Malaya, M., Z Gajos, K.: To trust or to think: Cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. Proc. ACM Hum. -Comput Interact. 5 , Article188 (2021). https://doi.org/10.1145/3449287

Vasconcelos, H., Jörke, M., Grunde-Mclaughlin, M., Gerstenberg, T., Bernstein, M.S., Krishna, R.: Explanations Can Reduce Overreliance on AI Systems During Decision-Making. Proceedings of the ACM on Human-Computer Interaction 7, 1–38 (2023). https://doi.org/10.1145/3579605

Skitka, L.J., Mosier, K.L., Burdick, M.: Does automation bias decision-making? Int. J. Hum-Comput St. 51 , 991–1006 (1999). https://doi.org/10.1006/ijhc.1999.0252

Cummings, M.: Automation bias in intelligent time critical decision support systems. In: AIAA 1st intelligent systems technical conference, pp. 6313. (2004)

Pearce, H., Ahmad, B., Tan, B., Dolan-Gavitt, B., Karri, R.: Asleep at the keyboard? assessing the security of github copilot’s code contributions. In: 2022 IEEE Symposium on Security and Privacy (SP), pp. 754–768. IEEE, (2022)

Chen, L., Zaharia, M., Zou, J.: How is ChatGPT’s behavior changing over time? arXiv preprint arXiv:2307.09009 (2023)

Ooi, K.-B., Tan, G.W.-H., Al-Emran, M., Al-Sharafi, M.A., Capatina, A., Chakraborty, A., Dwivedi, Y.K., Huang, T.-L., Kar, A.K., Lee, V.-H., Loh, X.-M., Micu, A., Mikalef, P., Mogaji, E., Pandey, N., Raman, R., Rana, N.P., Sarker, P., Sharma, A., Teng, C.-I., Wamba, S.F., Wong, L.-W.: The potential of Generative Artificial Intelligence Across disciplines: Perspectives and future directions. J. Comput. Inform. Syst. 1–32 (2023). https://doi.org/10.1080/08874417.2023.2261010

Tambe, P., Cappelli, P., Yakubovich, V.: Artificial intelligence in human resources management: Challenges and a path forward. Calif. Manag. Rev. 61 , 15–42 (2019)

Varma, A., Dawkins, C., Chaudhuri, K.: Artificial intelligence and people management: A critical assessment through the ethical lens. Hum. Resource Manage. Rev. 33 , 100923 (2023)

Robert, L.P., Pierce, C., Marquis, L., Kim, S., Alahmad, R.: Designing fair AI for managing employees in organizations: A review, critique, and design agenda. Human–Computer Interact. 35 , 545–575 (2020). https://doi.org/10.1080/07370024.2020.1735391

Cameron, C.: 11% of data employees paste into ChatGPT is confidential (2023). https://www.cyberhaven.com/blog/4-2-of-workers-have-pasted-company-data-into-chatgpt . Accessed 23 November 2023

Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D., Mann, G.: Bloomberggpt: A large language model for finance. arXiv Preprint arXiv:230317564 (2023)

Davenport, T., Alavi, M.: How to Train Generative AI Using Your Company’s Data (2023). https://hbr.org/2023/07/how-to-train-generative-ai-using-your-companys-data . Accessed 27 November 2023

Vincent, J.: Meta’s powerful AI language model has leaked online — what happens now? (2023). https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse . Accessed 27 November 2023

Lin, B., Loten, A.: Salesforce Aims to Plug ‘AI Trust Gap’ With New Tech Tools (2023). https://www.wsj.com/articles/salesforce-aims-to-plug-ai-trust-gap-with-new-tech-tools-19e11750 . Accessed 27 November 2023

Bautzer, T., Nguyen, L.: Morgan Stanley to launch AI chatbot to woo wealthy (2023). https://www.reuters.com/technology/morgan-stanley-launch-ai-chatbot-woo-wealthy-2023-09-07/ . Accessed 27 November 2023

Leffer, L., Your Personal Information Is Probably Being Used to Train Generative AI Models: (2023). https://www.scientificamerican.com/article/your-personal-information-is-probably-being-used-to-train-generative-ai-models/ . Accessed 27 November 2023

Download references

Open Access funding enabled and organized by CAUL and its Member Institutions

Author information

Authors and affiliations.

School of Science, Technology and Engineering, University of the Sunshine Coast, Sunshine Coast, Queesland, Australia

Declan Humphreys, Abigail Koay, Dennis Desmond & Erica Mealy

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Declan Humphreys .

Ethics declarations

Competing interests.

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Humphreys, D., Koay, A., Desmond, D. et al. AI hype as a cyber security risk: the moral responsibility of implementing generative AI in business. AI Ethics (2024). https://doi.org/10.1007/s43681-024-00443-4

Download citation

Received : 30 November 2023

Accepted : 15 February 2024

Published : 23 February 2024

DOI : https://doi.org/10.1007/s43681-024-00443-4

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Cyber security
  • Business ethics
  • Large language models
  • Generative AI ethics
  • Find a journal
  • Publish with us
  • Track your research

Questions? Call us: 

Email: 

  • How it works
  • Testimonials

Essay Writing

  • Essay service
  • Essay writers
  • College essay service
  • Write my essay
  • Pay for essay
  • Essay topics

Term Paper Writing

  • Term paper service
  • Buy term papers
  • Term paper help
  • Term paper writers
  • College term papers
  • Write my term paper
  • Pay for term paper
  • Term paper topic

Research Paper Writing

  • Research paper service
  • Buy research paper
  • Research paper help
  • Research paper writers
  • College research papers
  • Write my research paper
  • Pay for research paper
  • Research paper topics

Dissertation Writing

  • Dissertation service
  • Buy dissertation
  • Dissertation help
  • Dissertation writers
  • College thesis
  • Write my dissertation
  • Pay for dissertation
  • Dissertation topics

Other Services

  • Custom writing services
  • Speech writing service
  • Movie review writing
  • Editing service
  • Assignment writing
  • Article writing service
  • Book report writing
  • Book review writing

Popular request:

Writing an outstanding cyber security research paper.

January 10, 2020

In the world today, technology has evolved so much, and the bulk of data is stored in cyberspace. But this has also brought about the serious issue of cyber security. In one of the recent cases, in 2017, a malware known as Wannacry ransomware attacked companies across the globe and took over their data. The malware encrypted data and the cybercriminals demanded to be paid ransoms in Bitcoin. That was just one example. In reality, the cybersecurity threat is everywhere!

cyber security research topics

The serious nature of cyber security threats in the world has made teachers in computing and related disciplines to ask students to write related research papers regularly. These papers are technical and require an inherent understanding of the subject and impeccable writing skills. In this post, we will give you some great tips for writing a security research paper. We will also list the top 50 cyber security topics for research.

What is a Cyber Security Research Paper?

This is a type of academic writing where the student is required to write on a topic related to cyber security. The assignments are given to students, especially those in computing-related studies, to help them research and come up with solutions on cyber security.

Note that since cyber threats target everyone, even teachers in other disciplines can give students cyber security related topics as assignments. Although most cybersecurity research topics tend to focus on prevention, there are others that might require you to go deeper into the architecture of malware.

50 Best Cyber Security Research Topics for You

What are the best topics in cybersecurity for my research paper today? Here are the top 50 options that you should consider. Pick and use them as they are or tweak to suit your interest:

Computer and software administration cyber security topics for research

  • Evaluating how antimalware operate to prevent cyber attacks.
  • Evaluating the history of ransomware.
  • Is the technology evolving too fast and making us unable to counter malware?
  • Encrypting viruses: How do they work?
  • Analyzing security measures used in Windows operating system.
  • MacOS VS Windows: Which is more secure?
  • Are hardware components of a computer free from attacks?
  • How does firewall work to prevent malware attacks?
  • Phishing: What methods can we use to stop it?
  • Analyzing key cyber security threats for people using social media.
  • The biggest cyber security threats in the world today.
  • What are the main challenges of cyber security in the globe today?
  • What are the common causes of cyber crimes?
  • Importance of software updates in cyber security.
  • Demystifying blackmailing and revenge porn online.
  • Online identity theft.
  • Importance of staff training on cyber security.

Cyber security paper topics on system management

  • Evaluating the US legal framework for cyber security.
  • Ethical hacking: What are its implications?
  • Social engineering: What is its importance?
  • Demystifying white and black hat hackers.
  • Unified user profiles: What are the merits and demerits?
  • What are the top five cyber security protection methods for multinational companies?
  • A closer look at the crucial components of good data governance.
  • Steps for responding to hacking in a company system.
  • Two-factor authentication: How effective is it?
  • What are the motivations behind cybercrimes?
  • Evaluating the use of machine learning for cyber security intrusion detection.
  • Key challenges of big data in enhancing cyber security.
  • Why is it crucial to have a cyber security administrator every second of your system operation?
  • Data backup: How does it help in cyber security?
  • What is the best method of managing multiple threat possibilities?
  • Analyzing the most difficult part in cyber security administration.

Security research topics on cryptography

  • The importance of cryptography in cyber security enhancement.
  • How does malware attack personal data with the assistance of cryptographers?
  • Should you use cryptographers in the event of an attack?
  • Evaluating the process of decoding encrypted data after a malware attack.
  • Anomalous communion detection systems.
  • Integrating wireless sensors into IoT to enhance the security of your system.
  • Cyber security and blockchain.
  • Analyzing cybersecurity of critical infrastructure networks.

Security research topics on recent events and technologies

  • Analyzing the efficiency of RFID security systems.
  • Dark web: How does it propagate organized crime?
  • Reverse engineering.
  • Analyzing the best authorization infrastructures.
  • Analyzing the application of steganalysis.
  • How significant is computer forensics in the digital era?
  • Regular password changes: Can it help to predict cyber attacks?
  • What are the best strategies for barring cyber attacks?
  • Analyzing the best forensic tools for cyber threats detection.

How to Write a Great Cyber Security Research Paper

When you are faced with the task of writing a cyber security research paper, how do you go about it? Here are the top five steps to follow.

  • Start by reading widely about the subject.
  • Pick the right cybersecurity research topics. Make sure to pick a topic that has not been explored by other researchers.
  • Write a great introduction that captures the attention of the reader.
  • Develop a good cyber security thesis.
  • Ensure to support your arguments well to make the research paper interesting.
  • Ensure to use the latest resources when writing your paper.

Special Tips to Help Make Your Cyber Security Research Paper Stand Out

Now that you know how to write a cyber security research paper, the next question is: “How do you make it sparkle?” Here are some tested and proven tips that you should use:

  • Make sure to only go for the cyber security thesis topics of interest. This will help you to avoid getting bored midway.
  • Read other top cyber security research papers to understand how pros do it.
  • Start working on the research paper right away as opposed to waiting for the last moment.
  • Ensure to carefully follow instructions from your department.
  • Start by writing a draft and finally revise it to make the final copy.
  • Make sure to carefully proofread your paper to clear typos and cliches before submitting the final paper.

Cyber Security Problems Can Be Tackled Easily!

Once you have selected the preferred research paper topic, it is time to get down to writing your paper. But writing cyber security papers, to many students, is always a challenge. If the deadline is tight or you have other engagements, completing the research paper could turn into a serious nightmare. But hold on! A solution is only a click away: seeking writing help from professionals . They have the experience, all the needed resources, and are willing to help you get the top grades.

cyber security research paper outline

Take a break from writing.

Top academic experts are here for you.

  • How To Write An Autobiography Guideline And Useful Advice
  • 182 Best Classification Essay Topics To Learn And Write About
  • How To Manage Stress In College: Top Practical Tips  
  • How To Write A Narrative Essay: Definition, Tips, And A Step-by-Step Guide
  • How To Write Article Review Like Professional
  • Great Problem Solution Essay Topics
  • Creating Best Stanford Roommate Essay
  • Costco Essay – Best Writing Guide
  • How To Quote A Dialogue
  • Wonderful Expository Essay Topics
  • Research Paper Topics For 2020
  • Interesting Persuasive Essay Topics

For enquiries call:

+1-469-442-0620

banner-in1

60+ Latest Cyber Security Research Topics for 2024

Home Blog Security 60+ Latest Cyber Security Research Topics for 2024

Play icon

The concept of cybersecurity refers to cracking the security mechanisms that break in dynamic environments. Implementing Cyber Security Project topics and cyber security thesis topics /ideas helps overcome attacks and take mitigation approaches to security risks and threats in real-time. Undoubtedly, it focuses on events injected into the system, data, and the whole network to attack/disturb it.

The network can be attacked in various ways, including Distributed DoS, Knowledge Disruptions, Computer Viruses / Worms, and many more. Cyber-attacks are still rising, and more are waiting to harm their targeted systems and networks. Detecting Intrusions in cybersecurity has become challenging due to their Intelligence Performance. Therefore, it may negatively affect data integrity, privacy, availability, and security. 

This article aims to demonstrate the most current Cyber Security Topics for Projects and areas of research currently lacking. We will talk about cyber security research questions, cyber security research questions, cyber security topics for the project, best cyber security research topics, research titles about cyber security and web security research topics.

Cyber Security Research Topics

List of Trending Cyber Security Research Topics for 2024

Digital technology has revolutionized how all businesses, large or small, work, and even governments manage their day-to-day activities, requiring organizations, corporations, and government agencies to utilize computerized systems. To protect data against online attacks or unauthorized access, cybersecurity is a priority. There are many Cyber Security Courses online where you can learn about these topics. With the rapid development of technology comes an equally rapid shift in Cyber Security Research Topics and cybersecurity trends, as data breaches, ransomware, and hacks become almost routine news items. In 2024, these will be the top cybersecurity trends.

A) Exciting Mobile Cyber Security Research Paper Topics

  • The significance of continuous user authentication on mobile gadgets. 
  • The efficacy of different mobile security approaches. 
  • Detecting mobile phone hacking. 
  • Assessing the threat of using portable devices to access banking services. 
  • Cybersecurity and mobile applications. 
  • The vulnerabilities in wireless mobile data exchange. 
  • The rise of mobile malware. 
  • The evolution of Android malware.
  • How to know you’ve been hacked on mobile. 
  • The impact of mobile gadgets on cybersecurity. 

B) Top Computer and Software Security Topics to Research

  • Learn algorithms for data encryption 
  • Concept of risk management security 
  • How to develop the best Internet security software 
  • What are Encrypting Viruses- How does it work? 
  • How does a Ransomware attack work? 
  • Scanning of malware on your PC 
  • Infiltrating a Mac OS X operating system 
  • What are the effects of RSA on network security ? 
  • How do encrypting viruses work?
  • DDoS attacks on IoT devices 

C) Trending Information Security Research Topics

  • Why should people avoid sharing their details on Facebook? 
  • What is the importance of unified user profiles? 
  • Discuss Cookies and Privacy  
  • White hat and black hat hackers 
  • What are the most secure methods for ensuring data integrity? 
  • Talk about the implications of Wi-Fi hacking apps on mobile phones 
  • Analyze the data breaches in 2024
  • Discuss digital piracy in 2024
  • critical cyber-attack concepts 
  • Social engineering and its importance 

D) Current Network Security Research Topics

  • Data storage centralization
  • Identify Malicious activity on a computer system. 
  • Firewall 
  • Importance of keeping updated Software  
  • wireless sensor network 
  • What are the effects of ad-hoc networks  
  • How can a company network be safe? 
  • What are Network segmentation and its applications? 
  • Discuss Data Loss Prevention systems  
  • Discuss various methods for establishing secure algorithms in a network. 
  • Talk about two-factor authentication

E) Best Data Security Research Topics

  • Importance of backup and recovery 
  • Benefits of logging for applications 
  • Understand physical data security 
  • Importance of Cloud Security 
  • In computing, the relationship between privacy and data security 
  • Talk about data leaks in mobile apps 
  • Discuss the effects of a black hole on a network system. 

F) Important Application Security Research Topics

  • Detect Malicious Activity on Google Play Apps 
  • Dangers of XSS attacks on apps 
  • Discuss SQL injection attacks. 
  • Insecure Deserialization Effect 
  • Check Security protocols 

G) Cybersecurity Law & Ethics Research Topics

  • Strict cybersecurity laws in China 
  • Importance of the Cybersecurity Information Sharing Act. 
  • USA, UK, and other countries' cybersecurity laws  
  • Discuss The Pipeline Security Act in the United States 

H) Recent Cyberbullying Topics

  • Protecting your Online Identity and Reputation 
  • Online Safety 
  • Sexual Harassment and Sexual Bullying 
  • Dealing with Bullying 
  • Stress Center for Teens 

I) Operational Security Topics

  • Identify sensitive data 
  • Identify possible threats 
  • Analyze security threats and vulnerabilities 
  • Appraise the threat level and vulnerability risk 
  • Devise a plan to mitigate the threats 

J) Cybercrime Topics for a Research Paper

  • Crime Prevention. 
  • Criminal Specialization. 
  • Drug Courts. 
  • Criminal Courts. 
  • Criminal Justice Ethics. 
  • Capital Punishment.
  • Community Corrections. 
  • Criminal Law. 

Research Area in Cyber Security

The field of cyber security is extensive and constantly evolving. Its research covers a wide range of subjects, including: 

  • Quantum & Space  
  • Data Privacy  
  • Criminology & Law 
  • AI & IoT Security

How to Choose the Best Research Topics in Cyber Security

A good cybersecurity assignment heading is a skill that not everyone has, and unfortunately, not everyone has one. You might have your teacher provide you with the topics, or you might be asked to come up with your own. If you want more research topics, you can take references from Certified Ethical Hacker Certification, where you will get more hints on new topics. If you don't know where to start, here are some tips. Follow them to create compelling cybersecurity assignment topics. 

1. Brainstorm

In order to select the most appropriate heading for your cybersecurity assignment, you first need to brainstorm ideas. What specific matter do you wish to explore? In this case, come up with relevant topics about the subject and select those relevant to your issue when you use our list of topics. You can also go to cyber security-oriented websites to get some ideas. Using any blog post on the internet can prove helpful if you intend to write a research paper on security threats in 2024. Creating a brainstorming list with all the keywords and cybersecurity concepts you wish to discuss is another great way to start. Once that's done, pick the topics you feel most comfortable handling. Keep in mind to stay away from common topics as much as possible. 

2. Understanding the Background

In order to write a cybersecurity assignment, you need to identify two or three research paper topics. Obtain the necessary resources and review them to gain background information on your heading. This will also allow you to learn new terminologies that can be used in your title to enhance it. 

3. Write a Single Topic

Make sure the subject of your cybersecurity research paper doesn't fall into either extreme. Make sure the title is neither too narrow nor too broad. Topics on either extreme will be challenging to research and write about. 

4. Be Flexible

There is no rule to say that the title you choose is permanent. It is perfectly okay to change your research paper topic along the way. For example, if you find another topic on this list to better suit your research paper, consider swapping it out. 

The Layout of Cybersecurity Research Guidance

It is undeniable that usability is one of cybersecurity's most important social issues today. Increasingly, security features have become standard components of our digital environment, which pervade our lives and require both novices and experts to use them. Supported by confidentiality, integrity, and availability concerns, security features have become essential components of our digital environment.  

In order to make security features easily accessible to a wider population, these functions need to be highly usable. This is especially true in this context because poor usability typically translates into the inadequate application of cybersecurity tools and functionality, resulting in their limited effectiveness. 

Writing Tips from Expert

Additionally, a well-planned action plan and a set of useful tools are essential for delving into Cyber Security Research Topics. Not only do these topics present a vast realm of knowledge and potential innovation, but they also have paramount importance in today's digital age. Addressing the challenges and nuances of these research areas will contribute significantly to the global cybersecurity landscape, ensuring safer digital environments for all. It's crucial to approach these topics with diligence and an open mind to uncover groundbreaking insights.

  • Before you begin writing your research paper, make sure you understand the assignment. 
  • Your Research Paper Should Have an Engaging Topic 
  • Find reputable sources by doing a little research 
  • Precisely state your thesis on cybersecurity 
  • A rough outline should be developed 
  • Finish your paper by writing a draft 
  • Make sure that your bibliography is formatted correctly and cites your sources. 
Discover the Power of ITIL 4 Foundation - Unleash the Potential of Your Business with this Cost-Effective Solution. Boost Efficiency, Streamline Processes, and Stay Ahead of the Competition. Learn More!

Studies in the literature have identified and recommended guidelines and recommendations for addressing security usability problems to provide highly usable security. The purpose of such papers is to consolidate existing design guidelines and define an initial core list that can be used for future reference in the field of Cyber Security Research Topics.

The researcher takes advantage of the opportunity to provide an up-to-date analysis of cybersecurity usability issues and evaluation techniques applied so far. As a result of this research paper, researchers and practitioners interested in cybersecurity systems who value human and social design elements are likely to find it useful. You can find KnowledgeHut’s Cyber Security courses online and take maximum advantage of them.

Frequently Asked Questions (FAQs)

Businesses and individuals are changing how they handle cybersecurity as technology changes rapidly - from cloud-based services to new IoT devices. 

Ideally, you should have read many papers and know their structure, what information they contain, and so on if you want to write something of interest to others. 

The field of cyber security is extensive and constantly evolving. Its research covers various subjects, including Quantum & Space, Data Privacy, Criminology & Law, and AI & IoT Security. 

Inmates having the right to work, transportation of concealed weapons, rape and violence in prison, verdicts on plea agreements, rehab versus reform, and how reliable are eyewitnesses? 

Profile

Mrinal Prakash

I am a B.Tech Student who blogs about various topics on cyber security and is specialized in web application security

Avail your free 1:1 mentorship session.

Something went wrong

Upcoming Cyber Security Batches & Dates

Course advisor icon

  • How It Works
  • All Projects
  • Write my essay
  • Buy essay online
  • Custom coursework
  • Creative writing
  • Custom admission essay
  • College essay writers
  • IB extended essays
  • Buy speech online
  • Pay for essays
  • College papers
  • Do my homework
  • Write my paper
  • Custom dissertation
  • Buy research paper
  • Buy dissertation
  • Write my dissertation
  • Essay for cheap
  • Essays for sale
  • Non-plagiarized essays
  • Buy coursework
  • Term paper help
  • Buy assignment
  • Custom thesis
  • Custom research paper
  • College paper
  • Coursework writing
  • Edit my essay
  • Nurse essays
  • Business essays
  • Custom term paper
  • Buy college essays
  • Buy book report
  • Cheap custom essay
  • Argumentative essay
  • Assignment writing
  • Custom book report
  • Custom case study
  • Doctorate essay
  • Finance essay
  • Scholarship essays
  • Essay topics
  • Research paper topics
  • Top queries link

Best Hacking Essay Examples

Cyber security outline.

705 words | 3 page(s)

Information on topic

This paper will be identifying what exactly identity theft is, and how criminals who steal identities are able to get away with this cybercrime. Two ways in which people commit cybercrimes is through phishing and ransomware. Both cybercrimes are used to steal identities online.

Use your promo and get a custom paper on "Cyber Security Outline".

Abstract The problem at hand is that thousands of people every year have their identities stolen or fall victim to phishing and ransomware due to lack of knowledge about these crimes. Research into several types of malware will provide insight on how to prevent cyber-attacks, which will in turn provide information on how to avoid identity theft from occurring.

Background Society is becoming more dependent on technology every day, which in turn means more information is being put online for criminals to steal. How does this happen, and what are some ways to prevent it from happening? Why is it easier for criminals to steal information? It is important to research because everyone is turning to the digital age for everyday life and it”s becoming easier to steal information.

Approach Several types of malware used in ransomware and phishing to determine how easy it is to commit identity theft will be researched. Individual phishing attacks will be researched, in addition to attacks on companies, like the Equifax breach in 2017 and the Boeing security breach in 2018. The paper will compare how the different attacks affected the companies, and how they contrasted against each other. For example, Boeing was hit with malware and fell victim to a ransomware attack. The Equifax breach on the other hand was simply caused by poor management not updating software. Additionally, the paper will be exploring how each company handled the data breach after the fact. For cybercrimes committed against individuals, the paper will also be comparing different types of cybercrimes, and how people can avoid them.

Literature Review “User preference of cyber security awareness delivery methods” gives a general overview of cybersecurity and how important it is to prevent cybercrimes. “Is Identity Theft Really Theft?” gives information about multiple types of identity theft and why and if they are considered criminal offenses. It also provides information about what a person should do after falling victim to identity theft. “Facts+Statistics: Identity Theft and Cybercrime” details statistics regarding cybercrimes and how frequently people fall victim to cybercrimes. “Training to Mitigate Phishing Attacks Using Mindfulness Techniques” provides information in the corporate setting on how to prevent phishing from negatively impacting a business, and how employees can use these techniques to lessen the chance of falling for phishing techniques. “Ransomware, Social Engineering and Organization Liability” provides details on what ransomware is and how it uses malware to corrupt and steal information. It also gives information on how to prevent ransomware.

Solution: The solution will be that more people need to be vigilant when going online and giving out information, both for personal use online and in the work setting. Ways to identity different types of malware will also be addressed, as well as the importance of good online protection on both personal computers, and software for larger companies.

Discussion In general, older generations are more susceptible to cybercrimes because they are not as tech-savvy as the current generation who has grown up immersed in technology. Many people do not choose to educate themselves about cybersecurity as they feel like identity theft is unlikely to happen to them. Many people and companies also do not bother to update their security even though it is an extremely easy way to prevent their computers falling victim to malware. Overall, the research will benefit most people, as everyone can and should learn proper techniques for internet safety.

Recommendations and Conclusion Some of the outcomes will be learning additional information that people can follow to prevent identity theft and hacking from occurring and learning about previous security breaches that have occurred to provide information on what to do after a company loses personal client information. This paper should be successful at providing all necessary information for learning about cybersecurity and threats to personal information. Current information about cybersecurity breaches and hacks will be relayed in the paper, and further recommendations to use in personal and work life will also be incorporated.

Have a team of vetted experts take you to the top, with professionally written papers in every area of study.

The Federal Register

The daily journal of the united states government, request access.

Due to aggressive automated scraping of FederalRegister.gov and eCFR.gov, programmatic access to these sites is limited to access to our extensive developer APIs.

If you are human user receiving this message, we can add your IP address to a set of IPs that can access FederalRegister.gov & eCFR.gov; complete the CAPTCHA (bot test) below and click "Request Access". This process will be necessary for each IP address you wish to access the site from, requests are valid for approximately one quarter (three months) after which the process may need to be repeated.

An official website of the United States government.

If you want to request a wider IP range, first request access for your current IP, and then use the "Site Feedback" button found in the lower left-hand side to make the request.

IMAGES

  1. Advanced Cyber Security and its Methodologies

    cyber security research paper outline

  2. (PDF) Cyber security and artificial intelligence

    cyber security research paper outline

  3. Cybersecurity Research Paper

    cyber security research paper outline

  4. Research paper on cyber security

    cyber security research paper outline

  5. (PDF) Cyber Security

    cyber security research paper outline

  6. FREE 10+ Cyber Security Proposal Samples [ Project, Training, Audit ]

    cyber security research paper outline

VIDEO

  1. Cyber Security Final Project Presentation

  2. Network security Research paper

  3. Cyber Security tutorials by Mr. Shoaib Ahmed Sir

  4. Advanced Cyber Security Research Lab- Doon University Dehradun

  5. Prioritizing Cyber Security Projects

COMMENTS

  1. Research paper A comprehensive review study of cyber-attacks and cyber security; Emerging trends and recent developments

    Information technology Cyber-attacks Cyber security Emerging trends Key management 1. Introduction For more than two decades, the Internet has played a significant role in global communication and has become increasingly integrated into the lives of people around the world.

  2. A Study of Cyber Security Issues and Challenges

    The paper first explains what cyber space and cyber security is. Then the costs and impact of cyber security are discussed. The causes of security vulnerabilities in an organization and the challenging factors of protecting an organization from cybercrimes are discussed in brief.

  3. (PDF) A Systematic Literature Review on the Cyber Security

    This paper offers a comprehensive overview of current research into cyber security. We commence, section 2 provides the cyber security related work, in section 3, by introducing about cyber security.

  4. A STUDY OF CYBER SECURITY AND ITS CHALLENGES IN THE SOCIETY

    This paper mainly focuses on challenges faced by cyber security on the latest technologies .It also focuses on latest about the cyber security techniques, ethics and the trends changing the face of cyber security. Keywords: cyber security, cyber crime, cyber ethics, social media, cloud computing, android apps. 1. INTRODUCTION.

  5. Cybersecurity data science: an overview from machine learning

    In a computing context, cybersecurity is undergoing massive shifts in technology and its operations in recent days, and data science is driving the change. Extracting security incident patterns or insights from cybersecurity data and building corresponding data-driven model, is the key to make a security system automated and intelligent. To understand and analyze the actual phenomena with data ...

  6. Cyber risk and cybersecurity: a systematic review of data ...

    32 Altmetric 2 Mentions Explore all metrics Abstract Cybercrime is estimated to have cost the global economy just under USD 1 trillion in 2020, indicating an increase of more than 50% since 2018.

  7. PDF CYBERSECURITY: HOW SAFE ARE WE AS A NATION?

    our national security depends on a strong cyber culture. To stay ahead of the threat, cybersecurity needs to be steeped into the national consciousness through education, sustained messaging and increased cooperation among businesses and government. My research examines the vulnerabilities, and cites case studies, national policy and expert ...

  8. (PDF) A Study of Cyber Security Threats, Challenges in ...

    This paper review 27 articles on cyber security and cybercrimes and it showed that cyber security is a complex task that relies on domain knowledge and requires cognitive abilities to determine ...

  9. Journal of Cybersecurity

    Latest Issue Volume 10 Issue 1 2024 Impact Factor 3.9 Editors-in-Chief Tyler Moore David Pym About the journal Journal of Cybersecurity publishes accessible articles describing original research in the inherently interdisciplinary world of computer, systems, and information security … Find out more Latest articles

  10. A systematic literature review of cyber-security data repositories and

    This section provides the details of the methodology we followed. To achieve our goal of reviewing the datasets and evaluation metrics used in the applications of SSL techniques to cyber-security, we followed the standard systematic literature review guidelines outlined in [] for assessing the search's completeness.The entire process was done on Covidence [], an online tool for systematic ...

  11. A Systematic Literature Review on Cyber Threat Intelligence for ...

    Cyber threat intelligence (CTI) enhances organizational cybersecurity resilience by obtaining, processing, evaluating, and disseminating information about potential risks and opportunities inside the cyber domain. This research investigates how companies can employ CTI to improve their precautionary measures against security breaches.

  12. (PDF) Cybersecurity: trends, issues, and challenges

    Cybersecurity involves various mechanisms, tools or configuration that facilitate or goes a long way down to mitigate the malicious impact of cyberattack on individual, businesses or government ...

  13. Cyber Security Challenges and its Emerging Trends on Latest

    In addition to numerous cyber protection initiatives, many people are also very worried about it. This paper focuses primarily on cyber security concerns related to the new technology. It also concentrates on the new technologies for cyber security, ethics and developments that impact cyber security.

  14. Information Security:A Review of Information Security Issues and

    Abstract: Currently, corporations are more into using distributed systems and relying on networks and communication facilities for transmitting critical and important information that needs to be secured. Therefore, protecting corporations' information becomes more important, and information security is essential to maintain. Information security is defined as protecting the information, the ...

  15. Cyber Security Research Papers

    Cyber Security Research Papers. Master's degree candidates at SANS.edu conduct research that is relevant, has real world impact, and often provides cutting-edge advancements to the field of cybersecurity, all under the guidance and review of our world-class instructors. Cloud Security.

  16. AI hype as a cyber security risk: the moral responsibility of

    This paper examines the ethical obligations companies have when implementing generative Artificial Intelligence (AI). We point to the potential cyber security risks companies are exposed to when rushing to adopt generative AI solutions or buying into "AI hype". While the benefits of implementing generative AI solutions for business have been widely touted, the inherent risks associated ...

  17. Cyber Security Research Paper: 50 Amazing Topics

    January 10, 2020 In the world today, technology has evolved so much, and the bulk of data is stored in cyberspace. But this has also brought about the serious issue of cyber security. In one of the recent cases, in 2017, a malware known as Wannacry ransomware attacked companies across the globe and took over their data.

  18. (PDF) Research Paper on Cyber Security

    ... An effective cybersecurity process involves numerous layers of protecting tools across networks, computers, programs, or information. The methods, the people, and the tools must all accompany...

  19. 60+ Latest Cyber Security Research Topics for 2024

    27th Dec, 2023 Views Read Time 9 Mins In this article The concept of cybersecurity refers to cracking the security mechanisms that break in dynamic environments. Implementing Cyber Security Project topics and cyber security thesis topics /ideas helps overcome attacks and take mitigation approaches to security risks and threats in real-time.

  20. Outline for cyber security paper.docx

    ITCC 231 Marcusmann08151986 4/5/2023 A look at the Trends In Cyber Security and The Progress of The latest Technology Outline 1. Introduction 2. Cyber Security a. Definition of Cyber Security b. Why cyber security is important c. Incident s d. Android security e. Mobile device security 3. Cyber Crime a. Definition 4. Trends in Cyber Security a.

  21. Cyber Security Outline

    Approach Several types of malware used in ransomware and phishing to determine how easy it is to commit identity theft will be researched. Individual phishing attacks will be researched, in addition to attacks on companies, like the Equifax breach in 2017 and the Boeing security breach in 2018.

  22. (PDF) Artificial Intelligence in Cyber Security

    Artificial Intelligence in Cyber Security. July 2021; Journal of Physics Conference Series 1964(4):042072 ... • Papers with titles belonging to subjects outside the scope of this research paper.

  23. How to craft cyber-risk statements that work, with examples

    By following the outlined steps and considering the above cyber-risk statement examples, cybersecurity professionals can effectively communicate cyber-risks and strengthen their organization's defensive posture. Jerald Murphy is senior vice president of research and consulting with Nemertes Research.

  24. Cybersecurity in the Marine Transportation System

    Start Preamble Start Printed Page 13404 AGENCY: Coast Guard, Department of Homeland Security (DHS). ACTION: Notice of proposed rulemaking. SUMMARY: The Coast Guard proposes to update its maritime security regulations by adding regulations specifically focused on establishing minimum cybersecurity requirements for U.S.-flagged vessels, Outer Continental Shelf facilities, and U.S. facilities ...

  25. PDF Research Paper on Cyber Security

    Disadvantages. 2. -. 2. More pandemic-related phishing. continue to use the COVID-19 pandemic as a theme for their phishing campaigns. 2. New kinks on the "Nigerian Prince" fiddle.