AI Analyzes Failed Tests in DevOps Pipeline for Root Cause Identification

May 1, 2026 · Juan Netapp · 7 min read

This video from Outshift by Cisco covered a lot of ground. Eleven segments stood out as worth your time, and each one below links directly to its timestamp in the original video.

Imagine your code fails and, instead of sifting through complex logs, you get an AI-written explanation of what went wrong. Wiring an LLM into the pipeline this way could drastically cut debugging time and speed up software development cycles.

AI Analyzes Failed Tests in DevOps Pipeline for Root Cause Identification

A large language model (LLM) has been integrated into a DevOps pipeline to automatically analyze failed Robot Framework tests and provide root cause analysis. The system extracts the XML output from failed tests, sends it to a cloud API such as Anthropic's, and then posts the generated analysis as a comment directly in the pull request. The analysis runs only when tests fail, which keeps token usage down.
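The extraction step might look roughly like the sketch below. It is not the speaker's script: the video describes feeding the XML file to the LLM, whereas this version first narrows the file down to the failed tests to keep the prompt small. The function names, the prompt wording, and the sample XML are all made up for illustration; the only thing assumed about Robot Framework is that each test element in output.xml carries a status child whose text holds the failure message.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def failed_tests(output_xml: str) -> List[Dict[str, str]]:
    """Return the name and failure message of each failed test in output.xml."""
    root = ET.fromstring(output_xml)
    failures = []
    for test in root.iter("test"):
        # find() only looks at direct children, so keyword-level statuses are ignored.
        status = test.find("status")
        if status is not None and status.get("status") == "FAIL":
            failures.append({
                "name": test.get("name", ""),
                "message": (status.text or "").strip(),
            })
    return failures


def build_prompt(failures: List[Dict[str, str]]) -> str:
    """Turn the failures into a short prompt (hypothetical wording)."""
    lines = [
        "These Robot Framework tests failed. Suggest a likely root cause for each.",
        "",
    ]
    for failure in failures:
        lines.append(f"- {failure['name']}: {failure['message']}")
    return "\n".join(lines)


# Made-up sample that mimics the shape of Robot Framework's output.xml.
SAMPLE = """<robot>
  <suite name="ACL">
    <test name="Create ACL">
      <kw name="Send Payload"><status status="FAIL"/></kw>
      <status status="FAIL">Invalid value in payload.json</status>
    </test>
    <test name="Update ACL">
      <status status="PASS"/>
    </test>
  </suite>
</robot>"""

if __name__ == "__main__":
    print(build_prompt(failed_tests(SAMPLE)))
```

The resulting prompt string is what would be sent to whichever model the pipeline uses.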

The integration significantly enhances troubleshooting efficiency by offering developers immediate, actionable insights into code failures. For organizations with strict data privacy requirements, the speaker noted the possibility of running a local LLM on the runner device, suggesting tools like Ollama, to keep sensitive data within the internal infrastructure rather than sending it to external cloud services.
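For the local option, a minimal sketch could talk to an Ollama server on the runner over its HTTP API. The speaker only named Ollama as a possibility, so everything here is an assumption: the model name "llama3" is a placeholder, the helper names are hypothetical, and the server is assumed to listen on Ollama's default port.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the runner.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request.

    The model name is a placeholder, not something named in the video.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_locally(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_ollama_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Keeping request construction separate from sending makes the payload easy to inspect without a running server.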

"This is a simple usage, but it gives so much value. I don't know why we didn't think about this before. This is an amazing usage for LLM."

▶ Watch this segment — 34:12

AI Streamlines Troubleshooting with Automated Root Cause Analysis in Development Pipelines

A large language model (LLM) is being used to automatically generate summaries and root cause analyses from failed test results within a development pipeline. When a pull request for a service package fails, the LLM processes the XML output from Robot Framework tests, providing developers with immediate, actionable insights such as identifying invalid values within JSON files. This quick analysis allows engineers to efficiently begin troubleshooting without sifting through extensive logs manually.
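One way to get such a summary into the pull request is GitHub's REST endpoint for issue comments, which also serves pull requests. The sketch below only builds the request. The repository name, PR number, token, comment heading, and function name are illustrative assumptions, and the video does not say which mechanism the speaker's script uses.

```python
import json
import urllib.request


def build_pr_comment_request(
    repo: str, pr_number: int, analysis: str, token: str
) -> urllib.request.Request:
    """Build a request that would post the analysis as a PR comment.

    `repo` is "owner/name". Pull requests share the issue comments endpoint.
    """
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    # The heading text is a made-up convention for this sketch.
    body = {"body": "### AI analysis of failed tests\n\n" + analysis}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; in a GitHub Actions job the token would typically come from the workflow's environment.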

This application of AI dramatically speeds up the debugging process, allowing development teams to identify and resolve issues more rapidly. By simplifying complex test outputs into clear explanations, the LLM integration enhances overall development efficiency and helps maintain faster release cycles, particularly for intricate network services.

"You can see here I have a summary. This summary is entirely built with AI. Like I have here a very small, simple script. I take this XML file from Robot, which has the test results, and I feed it to this LLM."

▶ Watch this segment — 29:12

Expert Recommends Three Core Test Types for Robust Software Development

An expert recommends three essential types of tests for ensuring software quality: smoke tests, end-to-end tests, and regression tests. Smoke tests verify that all dependencies and libraries are correctly deployed and compatible, preventing issues like unexpected parameter changes or missing sub-libraries that can break services in production. End-to-end tests validate that new user stories function as intended, while regression tests ensure that new changes do not disrupt previously working functionalities.
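As a rough illustration of the smoke-test idea, the sketch below checks that expected modules can be found and reads an installed package version using only the Python standard library. It is an assumption about what such a check could contain, not the speaker's test suite, and the helper names are hypothetical.

```python
from importlib import metadata, util
from typing import List, Optional


def missing_modules(modules: List[str]) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if util.find_spec(name) is None]


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Example module list; a real pipeline would list its own dependencies.
    print("missing:", missing_modules(["json", "xml"]))
```

A smoke stage could fail the build when `missing_modules` returns anything, or when a version differs from the one the service was developed against.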

For implementing these tests, the expert suggests using tools such as Robot Framework and pyATS. Robot Framework, an open-source project with contributions from Cisco, is praised for its human-readable test syntax that simplifies debugging. pyATS, initially an internal Cisco project now open for contributions, is highlighted for its capabilities beyond test automation, making it a comprehensive solution for network device testing.

"If you ask me, these are the minimum three requirements for a proper testbed. That's what I would recommend you to implement in your own use cases."

▶ Watch this segment — 13:04

DevOps Pillars Transform Network Automation, Streamlining Deployment

DevOps culture, characterized by continuous integration, continuous delivery, and continuous deployment, is fundamentally reshaping network automation. Continuous integration fosters collaboration among developers on a shared codebase, ensuring all changes are integrated smoothly. Continuous delivery focuses on bringing these changes into production seamlessly and without disruption, while continuous deployment automates this process entirely, offering an optional closed-loop approach.

This cultural shift bridges the traditional divide between development and operations teams, enabling phased and controlled changes to networks. For NSO (Network Services Orchestrator) services, this means moving services from development repositories to production servers with enhanced security, quality, and efficiency, thereby reducing potential disruptions and improving overall network reliability.

"We want to bridge the gap in between those who develop and those who deploy."

▶ Watch this segment — 4:33

Automated Pipeline Triggers and AI Analysis for Network Service Deployment

The demonstration showcases an automated pipeline for network service deployment, triggered by events such as new branches, commits, or pull requests in a GitHub repository. The pipeline uses Docker images for Network Services Orchestrator (NSO), Robot Framework for comprehensive testing, and NetSims for simulated network environments. All components are containerized to ensure consistent and reproducible testing conditions.

Upon test completion, particularly in cases of failure, the results are fed into a simple large language model (LLM). This LLM provides contextual information about what went wrong, enabling developers to quickly understand and address issues. This integration of AI aims to streamline troubleshooting and accelerate the development cycle for network services, providing developers with immediate feedback on their code changes.
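Gating the LLM call on failure can be as simple as checking the return code of the `robot` command, which reports the number of failed tests (capped at 250) and reserves higher codes for tool errors. The helper name below is hypothetical, and the video does not show how the speaker's pipeline makes this decision.

```python
def should_run_llm(robot_rc: int) -> bool:
    """Decide whether the failure analysis should run.

    Robot Framework return codes: 0 means every test passed, 1-250 is the
    number of failed tests (250 means 250 or more), and codes above 250
    signal tool-level problems such as invalid options or an interrupt.
    """
    return 1 <= robot_rc <= 250


if __name__ == "__main__":
    for rc in (0, 3, 252):
        print(rc, should_run_llm(rc))
```

Skipping the call on a return code of 0 avoids spending tokens on a green run, and skipping it on tool errors avoids analyzing an output file that may not exist.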

"All the failed tests, I'm going to pass them through a very, very simple LLM. I'm going to show you how. So, it can give me a little bit of context of what went wrong."

▶ Watch this segment — 18:45

Modular Design, Ephemeral Environments, and Automated Releases Key to DevOps Success

A modular approach built on Makefile targets is recommended for development pipelines, because it keeps them easy to maintain, update, and troubleshoot. Basing staging environments on ephemeral containers is equally important: each pipeline run can wipe the environment completely and rebuild it, eliminating overlaps and leftover data that could distort test results.
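The wipe-and-rebuild idea can be sketched as a context manager around Docker Compose. This is an illustration under assumptions rather than the speaker's setup, which relies on Makefile targets: the compose file name and the function names are hypothetical, and the command runner is injectable so the sequence can be checked without Docker installed.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List

Runner = Callable[[List[str]], None]


def _run(cmd: List[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


@contextmanager
def ephemeral_stack(compose_file: str, run: Runner = _run) -> Iterator[None]:
    """Bring the stack up for one pipeline run, then remove it entirely."""
    base = ["docker", "compose", "-f", compose_file]
    run(base + ["up", "-d"])
    try:
        yield
    finally:
        # Remove containers and volumes so nothing leaks into the next run.
        run(base + ["down", "--volumes", "--remove-orphans"])
```

Tests would run inside a `with ephemeral_stack("compose.yaml"):` block, and the teardown happens even when they raise.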

Finally, automated releases are highlighted as a critical step for reliability. The speaker considers manual release creation unreliable and inefficient, and recommends automated mechanisms, such as those available in GitHub, to streamline deployment. Together, these practices make the software development lifecycle more robust, efficient, and consistent.

"Implement the automated releases in your system, right? ... We don't want to be creating manual releases every single time because that's not reliable."

▶ Watch this segment — 30:48

Establishing Robust Testbeds for NSO Package Validation

Establishing a robust testbed is crucial for validating NSO packages and ensuring network services behave as intended with various payloads. The process involves onboarding NSO packages and setting up a topology of simulated network devices. For basic syntax and integration testing, NetSims offer a straightforward option, providing a simple abstraction of network device data models.

For more complex scenarios, such as testing BGP services or interface statuses, more advanced tools like Containerlab and Cisco Modeling Labs (CML) are recommended. These tools allow topologies to be defined in YAML files, generated, and deployed within the pipeline, so services are tested thoroughly before deployment. This tiered approach to simulation helps confirm that service payloads achieve the desired results across different network complexities.

"We are onboarding these packages, right? I'm putting those packages in my NSO. I'm doing my packages reload. Perfect. Then I am going to set up my testbed."

▶ Watch this segment — 11:03

NSO Services Navigate Staged Deployment with Optional Real Device Testing

NSO services typically follow a structured journey from development branches through staging and testing to deployment, whether manual or automated. This pipeline involves multiple stages to ensure reliability, including dedicated development branches and comprehensive testing. An optional, yet highly recommended, enhancement for critical network components is the integration of a staging lab equipped with actual devices.

This real-device staging lab provides an additional layer of validation, running the same or different test batteries on physical hardware to mirror production conditions more accurately. This crucial second check offers greater certainty before a release, allowing teams to either proceed with deployment or revert to the design phase if issues are detected. The decision to incorporate real device testing depends on the criticality of the use case and available resources.

"Something that it's more and more used whenever the team has the capacity or the budget to do that is to have a staging lab with actual devices."

▶ Watch this segment — 17:14

Successful Pipeline Run Validates ACL Service Deployment and Configuration

A successful pipeline run, triggered by a pull request, demonstrates the efficacy of automated testing and deployment for an Access Control List (ACL) service. All checks within the pipeline passed, indicating successful environment construction, test execution, and artifact generation. The process culminated in a detailed HTML report, confirming that all tests for the ACL service showed green, signifying flawless operation.

This validated deployment involved using a payload to create two ACLs on a target device, followed by verification of the configurations via a 'show run config' command. The system also successfully tested updates to a pre-existing ACL. This robust testing procedure confirms the correct configuration and functionality of the ACL service, providing confidence in its deployment and any subsequent changes.

"Everything's green, green is good."

▶ Watch this segment — 26:01

GitHub Actions Pipeline Initializes Staging Environment for NSO Services

A new branch in a GitHub repository triggers an automated GitHub Actions pipeline, initiating the setup of a staging environment for Network Services Orchestrator (NSO) services. This pipeline prioritizes consistency by utilizing the containerized version of NSO, ensuring that all development teams operate within identical testing conditions, thereby eliminating discrepancies caused by varying NSO versions, Python libraries, or other dependencies.

The containerized NSO approach provides a stable and repeatable environment for pulling and deploying all necessary service packages. This method, now mature enough for daily testing and operations, supports various processor types and architectures, integrates seamlessly with Docker Compose, and is based on Red Hat, offering a reliable foundation for consistent and surprise-free integration and production deployments.

"The moment we do that, we're going to trigger this pipeline, the pipeline that we have there in GitHub Actions, and magic is going to start happening."

▶ Watch this segment — 9:16

Defining the Essential Stages of a Modern Software Development Pipeline

A software development pipeline is a meticulously designed sequence of steps that guides code from initial development through to production deployment. This process typically encompasses several key phases: code development, building the code in a staging environment, rigorous testing, delivery to operations, and optionally, automated deployment into production. These stages represent the minimum requirements for an effective and robust pipeline.

This structured approach ensures that code changes are systematically managed, tested, and delivered, minimizing risks and maximizing efficiency. By formalizing these steps, organizations can achieve greater consistency, reliability, and speed in their software delivery, effectively bridging the gap between developers and deployment teams.

"That's the minimum steps for a proper pipeline, if you ask me."

▶ Watch this segment — 6:37

Summarised from Outshift by Cisco · 38:52. All credit belongs to the original creators. Streamed.News summarises publicly available video content.
