In the ever-evolving fight against data loss, data breaches, and data theft in the 21st century, organizations worldwide have turned to a number of cybersecurity solutions, services, and software in an attempt to keep their data safe and secure from threats.
One such solution is behavioral analytics, more specifically User and Entity Behavior Analytics (UEBA). UEBA utilizes algorithms and machine learning to track anomalous behavior not just from users within a certain network but also from the routers, servers, and endpoints making up that network. UEBA has been growing for some time, and a 2022 Market Data Forecast report predicts its global market size will grow from $890.7 million in 2019 to $1.1 billion by 2025. UEBA is also increasingly becoming a feature in core cybersecurity products like SIEM and EDR, so it’s growing in ways that standalone market figures don’t completely capture.
Impressive as these numbers may be, they don’t answer the question of whether UEBA actually works as promised to stop cybersecurity threats. And what about other behavioral analytics methods in cybersecurity? How much good can those accomplish? Those are some of the questions we’ll try to answer here.
What Is Behavioral Analytics?
Despite the name evoking images of psychological or sociological analyses, behavioral analytics’ origins can be found not in academia but in the worlds of business and statistics. Behavioral analytics is essentially a subgenre of business analytics, the iterative investigation of past business performance to generate insight when making decisions.
There are a number of different ways to perform this sort of investigation. Whether it’s studying the performance of your direct competitors, using predictive analytics to determine what the future may hold for your industry, or analyzing employee performance and making optimization decisions based on that information, the entire point is to take data in and use it to make better-informed decisions.
Behavioral analytics specifically combines machine learning and big data analytics in concert to take in users’ behavioral data and identify trends, anomalies, and patterns based on this data. “Users” in this case can mean your employees, your customers, or just anyone who directly interfaces with your business and your business’s data on a regular enough basis to generate patterns.
A common use case for behavioral analytics is on eCommerce or media platforms. From the Netflix algorithm that provides you with recommendations on what to watch next based on what you’ve already watched to a meal-delivery platform like GrubHub or DoorDash offering you restaurant-specific discounts based on your ordering history, there are a number of ways that companies leverage your past activity on their platforms to generate insights and predictions in order to keep you spending money with them.
Why Use Behavioral Analytics in Cybersecurity?
Data gathering and analysis can be beneficial in cybersecurity too, of course. Information is one of the most powerful tools users and enterprises have when combatting data breaches, leaks, and data loss.
Additionally, behavioral analytics is uniquely suited to the goals of many organizations’ cybersecurity plans. Cybercriminals, much like criminals in the physical world, tend to look for the path of least resistance when infiltrating an area. In the world of cybersecurity, the path of least resistance has consistently been shown to be the human element, specifically user accounts with enough access privileges or credentials for the cybercriminal to execute their plan.
According to a 2020 study conducted by the Ponemon Institute and sponsored by IBM Security, 40% of what the study calls “malicious incidents” occurred due to stolen/compromised employee credentials or cloud misconfigurations. Compromised employee account login information was also the costliest infection vector for enterprises. On average, malicious incidents cost companies $3.86 million per breach according to the study, but when stolen credentials were involved, that number jumped to $4.77 million.
By tracking user behavior, as well as anomalies within other parts of a network like servers or routers, companies have more opportunities to stop a data breach before it happens and potentially save a business millions of dollars. This is part of the common sales pitch of top companies within the UEBA space like Cynet, IBM, Splunk, or Microsoft, but as with any cybersecurity offering, the technology isn’t foolproof.
Does UEBA Actually Work?
Many companies tout their UEBA product as being “accurate.” This is common with software that utilizes machine learning or AI algorithms for classification purposes. For example, if a UEBA solution sounds the alarm on 10 anomalous instances in a day, and even 1 of them turned out to be a cybersecurity threat, the solution could be described as being able to accurately predict potential threats. Accuracy is absolutely important, but it isn’t the only measurement needed for success with machine learning. Precision, the share of a model’s alerts that turn out to be true positives rather than false positives, is just as vital.
For UEBA specifically, false negatives or, more accurately, the lack of false negatives (what machine learning practitioners call recall) is often the most important metric of all. Producing a number of false positives (being imprecise) is often preferable to allowing a false negative to slip by and cause your business to potentially lose millions. So, how good is UEBA at avoiding false negatives? The short answer is: “it’s complicated.”
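To make the difference between these measurements concrete, here is a minimal sketch in Python. The event counts are invented for illustration and do not come from any vendor or study; the function name is likewise hypothetical. It shows how a detector can score well on accuracy while still being imprecise and missing half of the real threats.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    # Accuracy: share of all events that were classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of alerts that were real threats.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real threats that triggered an alert.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical day: 1,000 events and 10 alerts. One alert was a real
# threat (tp=1), nine were false alarms (fp=9), one real threat was
# missed (fn=1), and the remaining 989 benign events were left alone.
metrics = classification_metrics(tp=1, fp=9, fn=1, tn=989)
print(metrics)
```

In this made-up scenario, accuracy comes out at 99% even though only 1 in 10 alerts was real and 1 of the 2 actual threats went undetected, which is why accuracy alone says little about a UEBA product.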
In 2019, researchers from Southern Methodist University conducted a study using behavioral analytics algorithms on network traffic to detect DDoS attacks. In their findings, the performance of these algorithms varied wildly by type. Random Forest was the most accurate type of algorithm discussed, scoring 99% in both accuracy and precision. Meanwhile, Naive Bayes scored a dismal 26% in accuracy and 66% in precision.
The type of anomaly being detected also affected performance. While most algorithms performed well against the HULK DDoS tool, none of them were able to accurately identify bot-generated DDoS attacks. This might be due in part to the small sample size of bot attacks that researchers had access to, however.
A 2018 paper published by the Institute of Electrical and Electronics Engineers (IEEE) highlights a specific flaw with UEBA:
“The negative part of applying machine learning in UEBA is the same drawbacks that any machine learning brings. Machine learning has limitations dealing with privileged users, developers, and knowledgeable insiders. Those users represent a unique situation because their job functions often require irregular behaviours. This cause[s] difficulties for statistical analysis to create a baseline [for] the algorithms. Another drawback is that UEBA can’t indicate the long-term sophisticated ‘low and slow’ as attacks because they [do] not have day to day impact and become as if non-existent.”
In other words, UEBA, like other machine learning-based solutions, is at its best when the tasks it is trying to accomplish are simple, predictable, and have easy-to-identify patterns. Users who need to operate in harder-to-predict patterns, like developers or executives or subject matter experts, will give UEBA a tougher time.
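The baseline problem can be illustrated with a simple statistical check. The sketch below is an assumption-laden toy, not how any particular UEBA product works: it uses a z-score against each user’s own history, and the email counts, the `is_anomalous` helper, and the threshold of 3 are all hypothetical.

```python
from statistics import mean, pstdev


def zscore(history, observed):
    """How many standard deviations `observed` sits from the user's mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is infinitely unusual.
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma


def is_anomalous(history, observed, threshold=3.0):
    """Flag the observation if it is more than `threshold` deviations out."""
    return abs(zscore(history, observed)) > threshold


# Hypothetical daily email counts for two users with the same average.
clerk_emails = [40, 42, 38, 41, 39, 40, 43, 37]       # steady routine
developer_emails = [2, 85, 5, 60, 0, 90, 10, 70]      # irregular routine

# The same observation (90 emails in a day) is judged very differently.
clerk_flag = is_anomalous(clerk_emails, 90)
developer_flag = is_anomalous(developer_emails, 90)
print(clerk_flag, developer_flag)
```

Both users average roughly 40 emails a day, but the developer’s history is so spread out that a 90-email day falls well inside the normal range. A burst that would stand out for the clerk passes unnoticed for the developer, which is the kind of blind spot the paper describes.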
David Movshovitz, co-founder and CTO of Israeli cybersecurity firm RevealSecurity, told eSecurity Planet:
“In classical UEBA, you try to look at each operation by itself… You try to learn statistical quantities like… ‘the average number of emails you send’… We claim that this average is a mathematical quantity but has no meaning from a real behavior perspective… And if I may, I would give an example. I would talk about myself as a CTO. There are days where I am busy preparing a presentation. So, very few emails, working mainly with [Microsoft] Office. Other days, I’m working on some bug in the product, and then I will rarely do emails, don’t touch files, only working on the system, on the logs, analyzing…”
It should be noted, however, that Movshovitz is shunning UEBA in favor of his company’s own behavioral analysis product which, according to them, better tracks cybersecurity threats by building what they call “user journeys.” User journeys essentially track the sequencing of user actions to provide what they claim is a more accurate picture of potential cybersecurity threats. This is in contrast to UEBA, which generally treats each user action as its own individual data point laid out on a timeline.
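To show the general idea of looking at sequences rather than isolated actions, here is a toy sketch. It is not RevealSecurity’s method, whose details are not public in this article; the action names, the session data, and the `unseen_transition_ratio` function are all invented for illustration. It simply measures how many consecutive action pairs in a new session have never been seen before.

```python
def transitions(actions):
    """Return the set of consecutive action pairs in a session."""
    return set(zip(actions, actions[1:]))


def unseen_transition_ratio(known_sessions, new_session):
    """Share of the new session's transitions absent from past sessions."""
    known = set()
    for session in known_sessions:
        known |= transitions(session)
    new = transitions(new_session)
    if not new:
        return 0.0
    return len(new - known) / len(new)


# Hypothetical past sessions for one user of a business application.
history = [
    ["login", "open_record", "edit_record", "logout"],
    ["login", "search", "open_record", "logout"],
]

# A new ordering of familiar steps versus a sequence never seen before.
normal = ["login", "search", "open_record", "edit_record", "logout"]
suspicious = ["login", "export_all", "export_all", "logout"]

normal_score = unseen_transition_ratio(history, normal)
suspicious_score = unseen_transition_ratio(history, suspicious)
print(normal_score, suspicious_score)
```

The first session scores 0.0 because every step follows a step it has followed before, while the second scores 1.0 because none of its transitions appear in the user’s history. A per-action average would not capture that difference in ordering.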
When asked for data to support the company’s claims of superior performance, Movshovitz replied:
“This is not a theoretical claim. We have working systems on customers’ business applications, on-prem, custom-built, and SaaS, and I can give you numbers. For example, we are monitoring a Salesforce application on one of the large insurance companies in Israel, and because we are layering these profiles, we are generating about one alert once a week or even once every two weeks… We are monitoring Microsoft 365 users by a very large bank in Europe… and we are generating about 10-12 alerts a week, something like once or twice a day.”
Although RevealSecurity didn’t provide data to support its approach, Movshovitz’s criticisms of UEBA have some validity. While UEBA is good at detecting certain types of threats like DDoS attacks, it becomes less effective the more it deals with actual human users whose on-network activities can vary wildly between days. For some types of businesses, this won’t be an issue. For others, it can be a complete dealbreaker.
Essentially, UEBA can work, but it won’t necessarily work as a one-size-fits-all solution to your cybersecurity threat detection needs. As with other human-facing AI products such as chatbots, it struggles to provide meaningful insight when confronted with situations outside of its expected datasets, situations which human users can provide all too often. However, using UEBA in concert with other solutions, as well as expert staff, can make it a relatively effective tool in the right scenarios.
Should You Use Behavioral Analytics?
There is some discussion to be had regarding the ethical considerations surrounding behavioral analytics in cybersecurity. UEBA can run into some of the same issues as other cybersecurity solutions, such as employee monitoring, and those issues are compounded by its use of AI and machine learning.
To focus on the cybersecurity aspect first, there is the problem of what data your UEBA solution is taking in. If it’s just data collected during the user’s work hours or while they are using company hardware/software, it’s probably fine as long as you make that monitoring clear to the user in advance. Transparency is key whenever you’re collecting user data. As long as the user is fully aware of what data is being collected and what that data is being used for, it’s much easier to develop trust with that user.
Now, let’s turn to the ethical problems AI and machine learning specifically can bring into the mix. This quote from a 2021 ScienceDirect article discussing ethical guidelines within a hypothetical AI insurance system sums it up nicely: “The AI system is often treated as a discrete technical system or even as a black box, which presents an almost intractable problem because it is so complicated and therefore difficult to explain.”
The “black box” problem, as it’s called, is common in the AI space and directly relevant to UEBA and UEBA-related products. In fact, RevealSecurity co-founder and CEO Doron Hendler stated in a 2021 interview with IsraelDefense that his company’s dream for its product was “to have a black box where you can upload your logs and receive answers.”
Black box solutions are difficult to effectively deploy in cybersecurity. While users can usually view the input and output of a black box, the internal processes are obscured, and this obscurity harms the trustworthiness of the product. While a security vendor may rightly wish to protect the proprietary information and intellectual property contained in a black box solution, transparency is often the best way to assure employees, customers, and any others who interface with your network that your cybersecurity technology is not potentially being used to abuse the people it’s meant to monitor. It’s also best for customers in a market where some reports show 90% of buyers aren’t getting the effects they were promised.
Tips for Implementing Behavioral Analytics in Cybersecurity
To understand how to implement behavioral analytics in cybersecurity, a good place to start is to first understand the ways in which UEBA is different from another popular form of analytics: cohort analytics.
Cohort analytics can look very similar to UEBA at first blush. It takes data from product or service usage, like a streaming or eCommerce platform, and organizes that data into a series of groups based on related characteristics. Called cohorts, these groups are usually measured by their shared common characteristic over a specified length of time, such as what time of day Netflix users aged 18-49 tend to watch TV shows vs. what time of day they tend to watch movies.
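A minimal sketch of that kind of grouping follows. The viewing events, the age band, and the `peak_hour_by_cohort` function are hypothetical and stand in for whatever usage data a platform actually collects.

```python
from collections import Counter, defaultdict


def peak_hour_by_cohort(events):
    """Return the most common viewing hour for each (age band, content type)."""
    hours = defaultdict(Counter)
    for age_band, content_type, hour in events:
        # Each cohort is defined by its shared characteristics.
        hours[(age_band, content_type)][hour] += 1
    return {cohort: counts.most_common(1)[0][0] for cohort, counts in hours.items()}


# Hypothetical viewing events: (age band, content type, hour of day).
events = [
    ("18-49", "tv_show", 20), ("18-49", "tv_show", 20), ("18-49", "tv_show", 22),
    ("18-49", "movie", 21), ("18-49", "movie", 21), ("18-49", "movie", 19),
]

peaks = peak_hour_by_cohort(events)
print(peaks)
```

The output describes what a group tends to do. It says nothing about whether any single viewer behaved unusually.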
The key differences between cohort analysis and UEBA are twofold. First, cohort analysis is typically used in marketing and advertising circles, and its success or failure will not entirely sink a business. By contrast, UEBA being used in cybersecurity makes it more essential to keeping the company afloat by hopefully preventing damaging data breaches.
Second, UEBA tends to be a bit more granular in the way it parses data. UEBA is focused on detecting anomalies and tracking patterns that don’t conform to whatever statistically expected patterns are set for it. So while it might group things together based on specific characteristics like login time or time spent within a certain application, its goal is to flag the individual behaviors that deviate from those groupings rather than to describe the groups themselves.
In terms of implementation, data collection is key to making effective use of UEBA. Pairing it with big data analytics can be a great way to boost UEBA capabilities. UEBA relies on machine learning and AI to process and analyze the datasets it is given, and the more data it has, the better it can find patterns and anomalies that might otherwise escape notice, similar to SIEM systems that have begun storing log and security data in data lakes.
Next, make sure you have experts in both machine learning and cybersecurity working together to manage your UEBA solution. Whether it’s a platform bought from a major enterprise or one of the many great open source machine-learning tools available, the best way to maximize the benefits of UEBA is to make sure you have the best people you can find working on it and fine-tuning it.
Finally, be flexible. Cybercriminals are, naturally, trying to avoid detection the best they can, and like any good criminals, they’re constantly evolving their methods and looking for new vulnerabilities to exploit. As such, it’s important that you be open to iterating on your detection and analytical methods whenever necessary to stay as far ahead of hackers as you can. Make sure UEBA is backed up by other cybersecurity solutions and experts to ensure your data is as safe as possible.