Cisco Develops AI Agents for Troubleshooting and Compliance in Cisco IQ

May 1, 2026 · Juan Netapp · 9 min read

This video from Outshift by Cisco covered a lot of ground. Seventeen segments stood out as worth your time. Everything below links directly to the timestamp in the original video.

This development could transform how organizations handle system diagnostics and compliance, potentially reducing the time and resources spent on identifying complex technical issues.

Cisco Develops AI Agents for Troubleshooting and Compliance in Cisco IQ

Cisco is integrating advanced AI capabilities into its Cisco IQ platform, transitioning from traditional workflow-based solutions to sophisticated deep agent systems. These agents leverage inductive, data-driven reasoning to identify root causes in complex, previously unseen scenarios across diverse domains and vendors. This approach moves beyond classical expert systems that are limited to known issues, enabling the AI to troubleshoot much like a human, even if the results are probabilistic and may involve some speculation.

"I can go and do inductive reasoning, just data-driven, much like I do deductive reasoning using classic expert systems in the past."

▶ Watch this segment — 55:46

Time Series Foundational Models Revolutionize Forecasting with Zero-Shot Capabilities

Time series foundational models are bringing a "massive" transformation to data analysis, enabling zero-shot forecasting across multiple domains without the need for dataset-specific tuning. These models utilize a decoder-only transformer architecture, enhanced with patching and long context windows, to efficiently learn and predict temporal patterns. This advancement eliminates the tedious, dataset-by-dataset tuning that previously consumed significant time for analysts.

"This is really massive guys. This helps us big time. And it's this one thing when I saw this where I said, 'Boy, I wasted 10 years of my life fiddling around with all these individual like per data set tuning things.'"

▶ Watch this segment — 40:27

Cisco's New Time Series Model Boosts Observability Benchmarks

Cisco has introduced a new time series model that significantly lowers the mean absolute error on observability benchmarks while maintaining comparable performance on general benchmarks. This model, which will be available in Splunk with a single command, integrates both coarse and fine time blocks with resolution embeddings to handle varying data frequencies and ensure accurate forecasting. The system aims to simplify complex time series analysis by providing a highly accurate, streamlined solution.
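
For readers unfamiliar with the metric, mean absolute error (MAE) is the average size of the gap between forecast and observed values, so lower is better. The sketch below uses made-up readings and two made-up forecasts purely to show the arithmetic; none of the numbers come from the benchmarks discussed in the video.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one point")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical utilisation readings and two competing forecasts.
observed = [40.0, 42.0, 45.0, 50.0]
forecast_a = [41.0, 42.0, 44.0, 48.0]
forecast_b = [38.0, 45.0, 49.0, 55.0]

mae_a = mean_absolute_error(observed, forecast_a)  # (1 + 0 + 1 + 2) / 4
mae_b = mean_absolute_error(observed, forecast_b)  # (2 + 3 + 4 + 5) / 4
print(f"forecast A MAE = {mae_a}, forecast B MAE = {mae_b}")
```

In this toy case forecast A would rank ahead of forecast B because its average miss is smaller.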

"We come in lower than anybody else. So that means if you're coming up with the right architecture and the right training data, we can undercut everybody."

▶ Watch this segment — 38:00

Agent-Driven System Resolves 'Vote App' Issue, Provides Audit Trail

A prototype system demonstrated an agent-driven investigation successfully resolving a "vote app not reachable" issue, tracing the root cause to disk pressure and associated image pull back-off (ImagePullBackOff) errors. The system started from the original problem statement, then iteratively generated and evaluated hypotheses, gathered evidence, and converged on a solution, much like a human troubleshooter. This process included an audit trail of its investigative steps and the capability to generate a detailed report for management.

"We have an investigation for exactly this vote app. And um, what the system came back with here is like this is the boring piece. This is the ultimate resolution."

▶ Watch this segment — 52:27

Deep Agent System Diagnoses 'Vote App' Outage to Disk Pressure

A deep agent system successfully diagnosed a "vote application not reachable" issue, tracing the root cause to disk pressure and ephemeral storage problems within a Kubernetes environment. The system operates with a planner agent and multiple evidence agents that can generate code to ingest and analyze data, identify dependencies, and iteratively refine hypotheses. An investigation graph visualized the process, demonstrating how the agents moved from initial symptoms to the ultimate problem through a series of logical steps.
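
The planner and evidence-agent split can be pictured with a toy, single-pass sketch. Everything here is hypothetical: the class and function names, the keyword matching that stands in for an evidence agent, and the four log strings, which are invented for illustration and are not real Kubernetes output. The system shown in the video generates code and iterates over many rounds, which this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    keywords: tuple
    score: float = 0.0
    evidence: list = field(default_factory=list)


def gather_evidence(hypothesis, log_lines):
    """Stand-in for an evidence agent: keep lines that mention any keyword."""
    return [line for line in log_lines
            if any(k in line.lower() for k in hypothesis.keywords)]


def investigate(problem, hypotheses, log_lines):
    """Stand-in for the planner: score each hypothesis and keep an audit trail."""
    audit_trail = [f"problem: {problem}"]
    for h in hypotheses:
        h.evidence = gather_evidence(h, log_lines)
        h.score = len(h.evidence) / len(log_lines) if log_lines else 0.0
        audit_trail.append(
            f"checked '{h.statement}': {len(h.evidence)} supporting lines")
    best = max(hypotheses, key=lambda h: h.score)
    audit_trail.append(f"converged on: {best.statement}")
    return best, audit_trail


# Invented log lines, loosely paraphrasing the symptoms named in the article.
logs = [
    "vote service health check timed out",
    "vote app failed to be admitted due to disk pressure",
    "ephemeral storage usage high on worker node",
    "image pull failed, backing off",
]
candidates = [
    Hypothesis("network path to vote app is broken", ("route", "interface down")),
    Hypothesis("worker node is under disk pressure",
               ("disk pressure", "ephemeral storage")),
]

best, trail = investigate("vote app not reachable", candidates, logs)
print("\n".join(trail))
```

The audit trail is just the list of steps taken, which is the same idea as the investigation graph: each conclusion can be traced back to the evidence that supported it.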

"So I'm forming, I gather new data and I'm reformulating things... Then I find something like the vote app failed to be admitted to a pod due to disk pressure. Ah we're getting close, right?"

▶ Watch this segment — 47:32

AWS and Datadog Advance Time Series Forecasting with Novel AI Architectures

AWS and Datadog are employing distinct, advanced approaches to time series forecasting. AWS quantizes time series data into a fixed vocabulary of tokens, treating them like words for transformer model training, and has integrated group attention blocks to capture correlations across different time series. Datadog, meanwhile, has scaled its training data to over two trillion points and developed models that prioritize time attention over group attention, reflecting the belief that historical trends within a single series are more indicative of future behavior.
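
To make "quantizing into a fixed vocabulary" concrete, here is a minimal sketch that scales a series and maps each value to one of a fixed number of integer token ids. The mean scaling, the uniform bin edges, the range of -5 to 5, and the vocabulary size of 10 are all assumptions chosen for readability, not the settings of any AWS model.

```python
def quantize(series, num_bins=10, low=-5.0, high=5.0):
    """Mean-scale a series, then map each value to an integer token id."""
    if not series:
        raise ValueError("series must not be empty")
    # Scale by the mean absolute value so series of different magnitude
    # share the same vocabulary; fall back to 1.0 for an all-zero series.
    scale = sum(abs(x) for x in series) / len(series) or 1.0
    width = (high - low) / num_bins
    tokens = []
    for x in series:
        idx = int((x / scale - low) // width)
        tokens.append(min(max(idx, 0), num_bins - 1))  # clip to the vocabulary
    return tokens, scale


def dequantize(tokens, scale, num_bins=10, low=-5.0, high=5.0):
    """Map token ids back to the centre of their bin, in original units."""
    width = (high - low) / num_bins
    return [(low + (t + 0.5) * width) * scale for t in tokens]


tokens, scale = quantize([10.0, 20.0, 30.0])
approx = dequantize(tokens, scale)
print(tokens, scale, approx)
```

Once values are token ids, a transformer can be trained on them with the same next-token objective used for text. The price is quantization error, which shrinks as the vocabulary grows.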

"Datadog went crazy, like, well, they have data like we have with Splunk, they have their own observability system, so they said, well, we have that data, let's go and use it. So they trained on more than two trillion data points, so size matters, right."

▶ Watch this segment — 33:31

Cisco Develops Time Series Model for Observability with Massive Data Scale

Cisco has developed a new time series foundational model, leveraging a decoder-only approach and training on 300 billion time points, combined with extensive data cleaning. This model specifically addresses observability challenges by integrating both hourly rollups and granular observations, mimicking human memory by summarizing historical data without retaining every detail. The architecture is designed to handle different input frequencies and focus on predicting future events in dynamic environments.
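
One way to picture combining rollups with granular observations is a context made of averaged blocks for older history followed by raw recent points, with a resolution id per position that a model could map to an embedding. The function name, the block sizes, and the id scheme below are hypothetical; the video does not spell out Cisco's implementation at this level.

```python
def build_context(points, fine_len=4, rollup=3):
    """Return (values, resolution_ids): coarse rollup means, then raw recent points."""
    if fine_len <= 0 or rollup <= 0:
        raise ValueError("fine_len and rollup must be positive")
    fine = points[-fine_len:]    # most recent points, kept as-is
    older = points[:-fine_len]   # everything before them
    # Drop the oldest leftover points so rollup blocks end where the fine block starts.
    older = older[len(older) % rollup:]
    coarse = [sum(older[i:i + rollup]) / rollup
              for i in range(0, len(older), rollup)]
    # 0 marks a coarse (rolled-up) position, 1 marks a fine (raw) position.
    resolution_ids = [0] * len(coarse) + [1] * len(fine)
    return coarse + fine, resolution_ids


history = list(range(1, 11))  # 1, 2, ..., 10 as a stand-in metric
values, res_ids = build_context(history)
print(values)
print(res_ids)
```

Six older points collapse into two averages here, so the context covers the same time span with fewer positions while the latest readings keep full detail.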

"We went up to 300 billion individual time points and we cleaned the data big time to go and make sure that the entropy is nicely balanced."

▶ Watch this segment — 36:11

Patching Method Reduces Transformer Computational Cost for Time Series

A significant architectural advancement for transformers involves treating groups of data points, known as patches, as single tokens to manage the high computational cost associated with long time series. Because attention compares every token with every other token, its cost grows with the square of the number of tokens, so patching cuts the cost from the square of the sequence length to the square of the number of patches. For instance, processing 10,000 data points as individual tokens would involve 100 million pairwise computations, but with 32-point patches (about 313 patches) this falls to roughly 100,000, a saving that is repeated for every attention head.
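
The arithmetic can be checked in a few lines. The sketch counts token pairs only and ignores constant factors such as head count and embedding width; the function names are illustrative.

```python
import math


def attention_pairs(num_tokens):
    """Number of token-to-token comparisons in one full attention pass."""
    return num_tokens * num_tokens


def num_patches(sequence_length, patch_length):
    """Patches needed to cover the sequence; the last one may be partial."""
    return math.ceil(sequence_length / patch_length)


points = 10_000
patch_len = 32

per_point = attention_pairs(points)        # every data point is a token
patches = num_patches(points, patch_len)   # every 32 points form one token
per_patch = attention_pairs(patches)

print(f"{per_point:,} pairs without patching")
print(f"{patches} patches -> {per_patch:,} pairs with patching")
print(f"reduction factor ~ {per_point / per_patch:.0f}x")
```

The reduction is close to the patch length squared (32 × 32 = 1,024), which is why even modest patch sizes pay off on long series.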

"So it's the sequence length times the sequence length. So it's the square of the sequence length i.e. the number of tokens in your context window."

▶ Watch this segment — 26:20

AI Agents Proposed for Collaborative Troubleshooting, Mimicking Human Problem-Solving

Traditional troubleshooting methods, often involving data consolidation in data lakes or cross-departmental meetings, struggle with the complexity of modern IT environments. A new approach proposes an agentic system where AI agents collaborate, mirroring human problem-solving. These agents form hypotheses, gather evidence like logs and commands, evaluate relationships between data points, and iteratively refine their understanding to pinpoint root causes, moving away from rigid, computer-centric processes.

"How about we do this like humans do? So we bring these different domains together and we have these guys behave like humans but this is no longer humans but this is agents that would discuss and troubleshoot like humans would do."

▶ Watch this segment — 45:04

ChatGPT Inspires New Era of Generic Time Series Foundational Models

Inspired by the success of ChatGPT's scaling, the field of time series analysis is shifting towards foundational models designed to learn generic temporal patterns from massive datasets. First explored by Nixtla in late 2023 using transformer architectures, these models aim to predict across diverse scenarios, from networking data to financial markets, using a single, pre-trained model. This approach promises to replace repetitive, one-off analyses with a more cost-effective, inference-only process after initial training.

"It was scale. It was really the number of parameters plus the size of the training set that made transformers perform really well. We all remember that, right? The ChatGPT moment happened because some engineers decided to go and scale the overall thing."

▶ Watch this segment — 21:35

New AI Forecasting Models Enhance Accuracy with Larger Patches and Synthetic Data

To improve forecasting accuracy and mitigate accumulated error, new AI models are adopting larger output patches, which reduces the number of prediction steps required to forecast future points. Additionally, these models complement real-world training data with synthetic time series to fill gaps and enhance data balance, totaling an extra three billion data points across three million synthetic series. The architecture leverages a decoder-only, auto-regressive transformer, similar to GPT, optimizing it for efficient sequential prediction in time series analysis.
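
The effect of a larger output patch on the number of autoregressive steps can be sketched as follows. Reaching 512 points in four steps implies 128 points per step; the 32-point comparison and the `repeat_last` stand-in "model" are assumptions for illustration only, not part of any model described in the video.

```python
import math


def decode_steps(horizon, output_patch_len):
    """Autoregressive steps needed to cover the forecast horizon."""
    return math.ceil(horizon / output_patch_len)


def forecast(history, horizon, output_patch_len, predict_patch):
    """Roll a patch predictor forward, feeding each output back as context."""
    context = list(history)
    out = []
    steps = 0
    while len(out) < horizon:
        patch = predict_patch(context, output_patch_len)
        out.extend(patch)
        context.extend(patch)  # errors in this patch carry into the next step
        steps += 1
    return out[:horizon], steps


def repeat_last(context, n):
    """Hypothetical stand-in predictor: repeats the last observed value."""
    return [context[-1]] * n


print(decode_steps(512, 128))  # large output patches
print(decode_steps(512, 32))   # small output patches
prediction, steps_taken = forecast([1.0, 2.0], 512, 128, repeat_last)
print(len(prediction), steps_taken)
```

Each step feeds its own output back in as input, so fewer steps means fewer chances for an early error to compound.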

"How about I don't do that and I do like bigger steps. So I've taken larger output patches and I can reach 512 points into the future with only four steps."

▶ Watch this segment — 28:09

Researchers Develop AI Method to Identify Key System Features from Metric Changes and Sensor Names

Researchers have developed an AI-driven method to automatically identify representative features in complex systems by analyzing metric changes and the rarity of tokens in sensor path names. The approach considers how metrics change—such as step functions, spikes, or variance shifts—and scores the uniqueness of terms within sensor names, inspired by TF-IDF and entropy concepts. This allows the system to highlight highly important tokens that signal significant system events, like an "admin up down" status, which is rare but critical when it occurs.
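
The token-rarity half of the idea resembles the inverse document frequency term of TF-IDF. A minimal sketch, assuming each sensor path is treated as one "document" and split on non-alphanumeric characters; the four paths are invented for illustration and are not real sensor paths, and the exact scoring used by the researchers is not given in the video.

```python
import math
import re
from collections import Counter


def tokenize(path):
    """Split a sensor path into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", path.lower()) if t]


def token_rarity(sensor_paths):
    """IDF-style score: tokens that appear in few paths score higher."""
    n = len(sensor_paths)
    doc_freq = Counter()
    for path in sensor_paths:
        doc_freq.update(set(tokenize(path)))  # count each token once per path
    return {tok: math.log(n / df) for tok, df in doc_freq.items()}


# Hypothetical sensor paths.
paths = [
    "interfaces/interface/counters/in-octets",
    "interfaces/interface/counters/out-octets",
    "interfaces/interface/state/admin-status",
    "bfd/session/counters/packets-out",
]

rarity = token_rarity(paths)
for tok in ("interfaces", "counters", "admin", "bfd"):
    print(f"{tok:12s} {rarity[tok]:.3f}")
```

Common tokens such as "interfaces" score low, while tokens seen in a single path, such as "admin" or "bfd", score highest, matching the intuition that rare names flag events worth attention.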

"Something that is frequently happening in this document but hardly ever used outside or very rarely used that sounds like important and I want to go and put these up as keywords."

▶ Watch this segment — 14:56

Time Series Foundational Models Evolve to Address Real-World Complexities

Early time series foundational models, while performing well on benchmarks, faced significant challenges with real-world complexities such as local seasonality, non-stationary behavior, and heterogeneous inputs. These limitations prompted a wave of architectural and training data advancements aimed at improving generalization, rather than relying on time-consuming fine-tuning for each dataset. Major companies, including Cisco, have since joined the effort to develop their own time series models, introducing innovations to better handle diverse input frequencies and data scales.

"What they've proven is it can be made to work. And then yeah, how do you generalize this? You can obviously say, well, let's go and fix this by fine-tuning. Well, fine-tuning just means you're again tuning things on a per case basis. So, you're back to square one. Not a good idea."

▶ Watch this segment — 23:28

Cisco's AI-Driven Telemetry Prioritizes Relevant Time Series with Combined Metric Analysis

Cisco has integrated a feature ranking mechanism into its AI-driven telemetry in IOS XR 7.3.1 that combines metric behavior changes with token rarity from names to identify and stream only relevant time series data. This system can hierarchically rank features, for example, highlighting BFD-related counters when a BFD session breaks, even if it's a rare event. By analyzing both the change in metric behavior (like spikes or step functions) and the uniqueness of terms in sensor names, the system intelligently filters data, ensuring that only important information is streamed for monitoring.
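
A rough sketch of how a behavior-change score and a name-rarity score might be combined into one ranking follows. The half-versus-half shift measure, the multiplication of the two scores, the counter names, and the rarity values are all assumptions made for illustration; the actual scoring in Cisco's telemetry is not described in the video.

```python
from statistics import mean, pstdev


def change_score(values):
    """Shift of the second-half mean from the first-half mean,
    in units of the first-half spread (1.0 if the first half is flat)."""
    if len(values) < 2:
        return 0.0
    half = len(values) // 2
    before, after = values[:half], values[half:]
    spread = pstdev(before) or 1.0
    return abs(mean(after) - mean(before)) / spread


def rank_features(series_by_name, rarity_by_name):
    """Order feature names by change score weighted by name rarity."""
    scores = {name: change_score(vals) * rarity_by_name.get(name, 0.0)
              for name, vals in series_by_name.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical counters: a step change, a flat series, and a slow drift.
series = {
    "bfd-session-drops": [0, 0, 0, 0, 5, 5, 5, 5],
    "cpu-util": [40, 41, 40, 41, 40, 41, 40, 41],
    "in-octets": [100, 102, 101, 103, 102, 104, 103, 105],
}
# Hypothetical name-rarity scores (higher means a rarer name).
name_rarity = {"bfd-session-drops": 1.4, "cpu-util": 0.3, "in-octets": 0.3}

ranked, scores = rank_features(series, name_rarity)
print(ranked)
```

Only the top of such a ranking would need to be streamed, which is the filtering effect the segment describes.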

"If I do this for like where I broke a BFD session and I hierarchically rank all the features then all counters are spit out that have the name BFD in there. Not a surprise, but this is a cool feature well filtering mechanism that can help me identify what matters in my system if I have no idea."

▶ Watch this segment — 17:44

Time Series Foundational Models Offer Generic Solution for Data Analysis

Inspired by the success of models like ChatGPT, time series foundational models are emerging as a generic solution to simplify the repetitive and costly process of analyzing time series data. These models promise a one-time training investment followed by only inference costs, aiming to provide a "one-size-fits-all" approach that can generalize across various datasets and prediction scenarios. This represents a significant shift from the traditional method of building and tuning individual models for each specific time series problem.

"Can't we have this one-size-fits-all rather than I do this over and over and over again? Yeah, time series foundational models. There is light at the end of the tunnel."

▶ Watch this segment — 20:39

Vote App Outage Highlights Challenges of Root Cause Analysis in Data-Rich Environments

After predicting future events with time series models, the critical next step is root cause analysis, which often involves sifting through vast amounts of data like logs and show commands. A presented scenario illustrates a vote application outage in a Kubernetes environment, caused by disk space issues on a worker node. The sheer volume of data, including 127 files across 65 directories of Kubernetes, system, and router logs, underscores the overwhelming challenge faced by IT operations engineers when trying to identify the source of such intermittent outages.

"So, imagine you're an IT ops engineer and you receive the troubleshooting ticket saying, well, my vote application is not reachable anymore."

▶ Watch this segment — 41:52

Researchers Face Practical Hurdles in Advanced Time Series Forecasting Experiments

Researchers have explored various experimental approaches to handle diverse time series problems, including different projection systems for multi-frequency inputs and concatenating multiple time series into one long string for multivariate forecasting. They also experimented with using different output distributions to better match varied data characteristics. However, many of these advanced techniques, while promising in theory, proved difficult to implement effectively in practice, leading some features to be scaled back in subsequent model versions.

"So they tried a load of things and if you look at Moirai 2.0, um, multi-frequency pro support gone, um, multivariate support gone."

▶ Watch this segment — 31:12

Summarised from Outshift by Cisco · 59:56. All credit belongs to the original creators. Streamed.News summarises publicly available video content.