Manual vs. Autonomous Anomaly Detection: why only the latter scales

Say you’re looking for a smart product to detect anomalies in your organization’s IT environment. A sales rep drops by and shows you all kinds of great artificial intelligence (AI) features with fancy-sounding algorithms. It sounds very impressive, and it seems like there is a lot of valuable AI in the product. But the opposite may well be true: what you are looking at is a manual AI product in a deceptive wrapper.

Let me tell you more.

At StackState, we actually started with manual anomaly detection and then built an autonomous AI engine. Autonomous anomaly detection is the first thing it can do. Taking this step meant a big upfront investment and required a lot of dedication and knowledge, but we haven’t looked back. Hopefully, you won’t either once you finish reading this article.

Manual AI

We don’t really use the term ‘manual AI’, but we’ll use it here to set it apart from autonomous AI. There are many anomaly detection products out there, and they require you to supply inputs such as which algorithm to use and which z-score thresholds to react to. It all seems relatively simple. Just a couple of knobs to turn, and you get to see a nice preview of how your anomaly detector is configured. Press a few buttons and you’re done.
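To make those ‘knobs’ concrete, here is a minimal sketch of a rolling z-score detector in Python. It is an illustration only, not any vendor’s implementation: the function name `zscore_anomalies`, the `window` and `threshold` parameters, and the sample latency values are all assumptions made for this example.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return the indices of points whose z-score, computed against the
    preceding `window` points, exceeds `threshold` in absolute value."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation of the window
        if sigma == 0:
            # A perfectly flat window gives no scale to compare against.
            continue
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical page latency samples in milliseconds, with one spike.
latency_ms = [100, 102, 98, 101, 99, 100, 250, 101, 100, 99]
flagged = zscore_anomalies(latency_ms, window=5, threshold=3.0)
print(flagged)
```

Even in this tiny example, someone has to choose `window` and `threshold` for each stream, and revisit those choices when the stream’s behavior changes.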

But is it really that simple?

  • What algorithms will you use for what metric streams? How will you configure these algorithms? At what point do you re-evaluate your choice?

  • How much compute power will you invest in anomaly detection? Do you put anomaly detection on everything? If not, which streams should you prioritize?

  • How will you handle changes to the environment, such as streams being added or removed?

To answer these questions continuously, you’re going to need a team. And that team must have all the skills and capabilities the job demands: subject matter experts who deeply and intuitively understand the data and know why the squiggly line looks the way it does, data scientists who know which algorithms to pick and how to keep them updated, and data engineers who know how to deploy those algorithms against streaming data pipelines. And, of course, the more data, the bigger the team.

If you can’t build such a cross-functional team (and most organizations can’t), then you need the second option for detecting anomalies throughout your IT infrastructure: autonomous AI. In terms of effort and effectiveness, the difference between manual and autonomous anomaly detection is like the difference between shoveling coal into the furnace of a locomotive and flipping a switch on the Shinkansen bullet train.

Why do all of the hard work to get your organization from A to Z?

Autonomous AI

With autonomous AI, rather than specifying all of the algorithmic requirements and procedures, you just need to set the specific business end goal. Much like using a car’s GPS navigation system, or Google search, the algorithms are already there and users don’t need to think about them.

StackState’s AI anomaly detection works by tracking the IT system itself – where it changes, what components are most active and most central to business services. It also follows users, monitoring their behavior, i.e., where people look most within their infrastructure for analytics and incidents. So, if people are constantly looking, for example, for the root cause behind slow page loading times or failed user access, the engine reasons from that and starts detecting anomalies for those metrics. It thinks: ‘oh, that metric must be more important than others, so it needs to be tracked.’
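The idea of deciding which metrics deserve anomaly detection can be sketched as a simple ranking problem. The Python below is a hypothetical illustration, not StackState’s actual engine: the `MetricStream` fields, the weights in `priority`, the `budget` parameter, and the sample numbers are all assumptions chosen to show how user attention and topology signals might be combined.

```python
from dataclasses import dataclass


@dataclass
class MetricStream:
    name: str
    views_last_week: int  # assumed signal: how often users looked at this stream
    dependents: int       # assumed signal: components that depend on its source
    recent_changes: int   # assumed signal: recent changes to its source component


def priority(stream, w_views=1.0, w_dependents=2.0, w_changes=0.5):
    """Combine the signals into a single score (weights are illustrative)."""
    return (w_views * stream.views_last_week
            + w_dependents * stream.dependents
            + w_changes * stream.recent_changes)


def select_streams(streams, budget):
    """Return the names of the `budget` highest-priority streams."""
    ranked = sorted(streams, key=priority, reverse=True)
    return [s.name for s in ranked[:budget]]


streams = [
    MetricStream("page_load_time", views_last_week=40, dependents=5, recent_changes=2),
    MetricStream("login_failures", views_last_week=25, dependents=8, recent_changes=1),
    MetricStream("disk_temperature", views_last_week=1, dependents=0, recent_changes=0),
    MetricStream("batch_job_duration", views_last_week=3, dependents=1, recent_changes=4),
]

# With compute for only two detectors, the most watched and most central win.
selected = select_streams(streams, budget=2)
print(selected)
```

In a sketch like this, the ranking is recomputed whenever the signals change, so nobody has to decide by hand which streams get a detector.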

This form of autonomous anomaly detection is far less dependent on humans than manual AI. Humans have blind spots and make mistakes. They also have limited knowledge and limited capacity, and today’s IT environments are highly dynamic. Should a human have to decide whether to apply or reapply anomaly detection after every change in the IT environment? No, that is not necessary.

In addition to this automatic learning mechanism, we've also pre-configured many best practices based on the most common features and processes that organizations seek to monitor. Problems can be more quickly identified, which means they can be more quickly resolved and then prevented down the road. For these reasons, autonomous anomaly detection is the only AI that scales as organizations’ business demands grow and their systems get more complex.

So my question to you is: which train do you choose? The labor-intensive, coal-powered train or the smart, swift Shinkansen?

About Lodewijk Bogaards

Lodewijk Bogaards is a StackState co-founder and CTO. Lodewijk combines deep technical skills with high-level technical vision. With over 20 years of experience in IT, he specializes in monitoring, AI, graph databases, and software development.

