By: Arijit Dutta, Director of Data Science
Welcome to the Unraveling Cyber Defense Model Secrets series, where we shine a light on Adlumin’s Data Science team, explore the team’s latest detections, and learn how to navigate the cyberattack landscape.
The growing threat landscape facing organizations has forced cybersecurity teams to embrace digital transformation. The COVID-19 pandemic has further complicated matters by accelerating the adoption of cloud services, leading to a proliferation of cloud providers and a surge in the number of IoT devices transmitting data to the cloud.
This complex web of interconnections has brought greater scale, connectivity, and speed to our digital lives but has also created a larger attack surface for cybercriminals. In response, cybersecurity teams are turning to AI-powered automation, especially machine learning, to uncover, evaluate, and counter threats to systems, networks, and data. Understanding the role of AI in cybersecurity is critical for organizations to protect themselves effectively against malicious cyber activity.
In this blog, we explore the current technologies available, the exciting developments on the horizon, and the transformative impact of AI.
Current, Upcoming, and Future AI Technology
As in most industries, AI technology is indispensable in organizations today for distilling actionable intelligence from the massive amounts of data ingested from customers and generated by employees. Organizations can choose from various data mining and AI methods depending on the desired outcomes and the data available. For example, if the goal is to evaluate each customer's suitability for digital marketing of a new product, “supervised” methods such as logistic regression or a decision-tree classifier could be trained on customer data.
These use cases require customer data on prior actions, such as historical responses to marketing emails. For a customer segmentation problem, “unsupervised” methods such as density-based clustering (DBSCAN) or dimensionality reduction with principal component analysis (PCA) are called for: rather than imposing prior observations on specific customer actions, we group customers according to machine-learned similarity measurements. More advanced methods, such as artificial neural networks (ANNs), are deployed when the use case depends on learning complex interactions among numerous factors, such as customer service call volume and outcome evaluation, or even the customer classification and clustering problems mentioned earlier. The data volume, frequency, and compute capacity requirements are typically heavier for ANNs than for other machine learning techniques.
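To make the supervised case above concrete, here is a minimal sketch of logistic regression trained on prior marketing responses. It uses only the Python standard library and plain gradient descent; the feature names, data values, learning rate, and epoch count are all hypothetical choices for illustration, not a description of any production model.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, x):
    """Estimated probability that a customer responds to the campaign."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias with batch gradient descent on log loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_probability(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Hypothetical training data: [past email open rate, scaled purchase history]
customers = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.2],
             [0.7, 0.6], [0.8, 0.9], [0.9, 0.7]]
# The "supervision": 1 if the customer responded to an earlier campaign
responded = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(customers, responded)
likely = predict_probability(weights, bias, [0.85, 0.8])
unlikely = predict_probability(weights, bias, [0.15, 0.05])
print(f"engaged customer: {likely:.2f}, disengaged customer: {unlikely:.2f}")
```

The labels in `responded` are what make this supervised: without historical responses, there is nothing to train against, and an unsupervised method such as clustering would be the fallback.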
The most visible near-term evolution in the field is the spread of Large Language Models (LLMs) and Generative AI, such as ChatGPT. The methods behind these emergent AI technologies are also based on the ANNs mentioned above, only with far more complex neural network architectures and computationally expensive learning algorithms. Adaptation and adoption of these methods for customer classification, segmentation, and interaction-facilitation problems will be a trend to follow in the years ahead.
Cybersecurity Solutions That Use AI
At Adlumin, we develop AI applications for cyber defense, bringing all the techniques above to bear. The central challenge for AI in cyber applications is to find “needle in a haystack” anomalies among billions of data points that mostly appear indistinguishable. The applications in this domain are usefully grouped under the term User and Entity Behavior Analytics, which involves mathematically baselining users and devices on a computer network and then using machines to identify suspicious deviations from that baseline.
To skim the surface, here are two solutions cybersecurity teams use that incorporate AI:
Two Automation Cybersecurity Solutions for Organizations
User and Entity Behavior Analytics (UEBA)
UEBA is a machine learning-based cybersecurity process and analytical tool usually included with security operations platforms. It gathers insight into users’ daily activities and flags any behavior that deviates from an employee’s normal activity patterns. For example, if a user usually downloads four megabytes of assets weekly and then suddenly downloads 15 gigabytes of data in one day, your team would immediately be alerted because this is abnormal behavior.
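The download example above can be sketched as a simple baseline-and-deviation check. This is a deliberately minimal illustration using a z-score against a user's own history; the function name, the sample history, and the threshold of three standard deviations are assumptions made for the sketch, and real UEBA products use far richer models.

```python
from statistics import mean, stdev


def is_anomalous(history_mb, observed_mb, threshold=3.0):
    """Return True if observed_mb sits more than `threshold` standard
    deviations above the user's historical average."""
    baseline = mean(history_mb)
    spread = stdev(history_mb)
    if spread == 0:
        # A perfectly constant history: any increase is a deviation.
        return observed_mb > baseline
    z_score = (observed_mb - baseline) / spread
    return z_score > threshold


# Hypothetical baseline: about four megabytes downloaded per week
weekly_downloads_mb = [3.8, 4.1, 4.0, 4.3, 3.9, 4.2]

# 15 gigabytes in a single day already dwarfs the weekly baseline
print(is_anomalous(weekly_downloads_mb, 15_000))  # True
# A slightly busier week stays within normal variation
print(is_anomalous(weekly_downloads_mb, 4.4))     # False
```

Because the baseline is computed per user, the same 15 gigabyte download could be routine for one account and alarming for another.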
The foundation of UEBA is straightforward. A cybercriminal could easily steal the credentials of one of your employees and gain access, but it is much more difficult for them to mimic that employee’s daily behavior well enough to go unseen. Without UEBA, an organization cannot tell whether an attack has occurred, since the cybercriminal is using the employee’s legitimate credentials. Having a dedicated Managed Detection and Response team to alert you can give an organization visibility beyond its boundaries.
Threat Intelligence
Threat intelligence gathers raw and curated data from multiple sources about existing threat actors and their tactics, techniques, and procedures (TTPs). This helps cybersecurity analysts understand how cybercriminals penetrate networks so they can identify signs early in the attack process. For example, a campaign that uses stolen lawsuit information to target law firms could be adapted to target other organizations using stolen litigation documents.
Threat intelligence professionals proactively hunt for suspicious activity that indicates network compromise or malicious behavior. This is often a manual process backed by automated searches and correlation of previously collected network data. Other detection methods, by contrast, can only detect known, categorized threats.
AI Risks and Pitfalls to Be Aware of
When building viable and valuable AI applications, data quality and availability are top of mind. Machines must train on reliable data for their output to be actionable. Great attention is therefore required in building a robust infrastructure for sourcing, processing, storing, and querying the data. Without a secure chain of custody for input data, AI applications are at risk of generating misleading output.
Awareness of any machine-learned prediction’s limitations and “biases” is also critical. Organizational leadership needs to maintain visibility into AI model characteristics like “prediction accuracy tends to falter beyond a certain range of input values” or “some customer groups were underrepresented in the training data.”
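The two limitations quoted above can be surfaced with very simple checks. The sketch below is one possible way to do so, assuming a single numeric input and a single group label per training record; the function names, the sample data, and the 10% representation cutoff are hypothetical.

```python
from collections import Counter


def training_range(values):
    """Smallest and largest value the model saw during training."""
    return min(values), max(values)


def outside_training_range(value, value_range):
    """True if a new input falls outside what the model was trained on."""
    low, high = value_range
    return value < low or value > high


def underrepresented_groups(group_labels, min_share=0.10):
    """Groups that make up less than `min_share` of the training data."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return sorted(group for group, count in counts.items()
                  if count / total < min_share)


# Hypothetical training data
training_incomes = [32_000, 45_000, 51_000, 60_000, 78_000]
training_segments = (["retail"] * 12 + ["small_business"] * 7
                     + ["enterprise"] * 1)

income_range = training_range(training_incomes)
print(outside_training_range(250_000, income_range))  # True: treat prediction with caution
print(underrepresented_groups(training_segments))     # ['enterprise']
```

Reporting results like these alongside model output is one way to keep leadership aware of where predictions are less trustworthy.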
Operationally, an excellent way to proceed is to build and deploy a series of increasingly complex AI applications rather than being wedded to a very ambitious design at the get-go. Iteratively adding functionality and gradually incorporating more data fields can make measuring performance easier and avoid costly mistakes.
Organizations Embracing AI
Organizations need to build a cybersecurity infrastructure that embraces the power of AI, deep learning, and machine learning to handle the scale of analysis and data. Beyond being one of the most used buzzwords of recent years, AI has emerged as a required technology for cybersecurity teams. People alone can no longer scale to protect the complex attack surfaces of organizations. So, when evaluating security operations platforms, organizations need to know how AI can help identify and prioritize risk and spot intrusions as early as possible.