v2.6.0

netdatabot released this 17 Jul 14:48
· 4 commits to master since this release

Release Summary

This release brings AI-powered monitoring intelligence and expanded platform support to all Netdata users.

| Feature | What's New |
|---|---|
| AI Integration | • MCP server support enables AI assistants to query your infrastructure • Natural language questions in AI Insights ("What went wrong at 3 PM?") |
| Enterprise Integration | • Full SCIM group provisioning for Okta • Automatic space/room access based on Okta groups |
| Network Monitoring | • SNMP profile-based collection with 100+ device profiles (alpha) • Auto-detection for Cisco, Palo Alto, F5, and more |
| Platform Expansion | • Native packages for RHEL 10, AlmaLinux 10, Rocky Linux 10 • systemd journal support for static builds via Rust implementation |

Release Highlights

Model Context Protocol (MCP) Server Integration

Every Netdata Agent and Parent now functions as an MCP server, enabling AI assistants like Claude Desktop to query and analyze your infrastructure monitoring data through a built-in WebSocket interface.

What MCP Enables

AI assistants gain read-only access to your monitoring data:

  • Infrastructure Discovery: Hardware specs, OS details, and streaming topology
  • Metric Intelligence: Full-text search across all contexts, instances, dimensions, and labels
  • System Insights: Execute functions for processes, network connections, systemd journals, and Windows events
  • Alert Analysis: View real-time alerts and complete alert history
  • Advanced Analytics: Complex metric queries with ML-powered anomaly detection
  • Root Cause Analysis: Correlate metrics and anomaly scores to identify issues

Security First

  • Sensitive functions (logs, process monitoring) require temporary API keys
  • Existing Netdata permissions control all data access
  • WebSocket connections need explicit configuration in AI clients
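
As a rough illustration of what an AI client sends once it is configured, here is a minimal sketch of the JSON-RPC 2.0 framing that MCP uses. The endpoint URL is an assumption (default Netdata port, a hypothetical `/mcp` path), and the helper name `jsonrpc_request` is made up for this example. Nothing here opens a connection; it only builds the message.

```python
import itertools
import json

# Assumed endpoint shape: default Netdata port, hypothetical /mcp path.
MCP_URL = "ws://localhost:19999/mcp"

_ids = itertools.count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request, the message framing MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# "tools/list" is the MCP method a client calls to discover available tools.
list_tools = jsonrpc_request("tools/list")
print(MCP_URL)
print(json.dumps(list_tools))
```

A real client would send this payload over the WebSocket connection and match the response by its `id`.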

Scalable Visibility

AI assistant visibility scales with your connection point:

| Connection Point | Visibility Scope |
|---|---|
| Netdata Child/Standalone | Single node only |
| Netdata Parent | Parent + all connected children |
| Netdata Cloud | Full infrastructure (coming soon) |
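
The scoping rule can be sketched in a few lines. The topology, node names, and the `visible_nodes` helper below are all hypothetical and only model the rule described here, not how the Agent implements it.

```python
# Hypothetical streaming topology: each parent maps to its connected children.
TOPOLOGY = {"parent-1": ["child-a", "child-b"]}


def visible_nodes(connection_point, topology):
    """Return the nodes an assistant can see from a given agent."""
    # A Parent exposes itself plus every child streaming to it;
    # a child or standalone agent exposes only itself.
    return [connection_point] + list(topology.get(connection_point, []))


print(visible_nodes("parent-1", TOPOLOGY))
print(visible_nodes("child-a", TOPOLOGY))
```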

AI Insights: Enhanced with Natural Language Investigation

AI Insights now understands your questions. Simply ask "What went wrong yesterday at 3 PM?" and get a comprehensive report targeting your specific concern—no more manual metric correlation or dashboard hunting during incidents.

Available Reports

| Report Type | Analysis Period | Answers Questions Like |
|---|---|---|
| Infrastructure Summary | 24 hours - 1 month | "How healthy is my infrastructure?" |
| Capacity Planning | 3 months - 2 years | "When will I run out of resources?" |
| Performance Optimization | 24 hours - 1 quarter | "Where are my bottlenecks?" |
| Anomaly Analysis | 6 hours - 7 days | "What caused the outage?" |
| Investigation (NEW) | Custom timeframe | "Why did latency spike at 3 PM?" |
| Alert Troubleshooting | Real-time | "How do I fix this alert?" (Preview) |

What's New

  • Natural Language Queries: Ask questions in plain English about any timeframe or issue
  • Targeted Analysis: Get reports focused on your specific problem, not generic overviews
  • Alert Resolution Guidance: Coming soon—automated investigation of active alerts with fix recommendations

Privacy and Limits

  • Reports are generated on-demand and immediately disposed
  • Your infrastructure data is never used for AI training
  • All report types share a single monthly limit of 10 reports

Note

Alert Troubleshooting is currently in preview and will be gradually rolled out to all users.

Okta Integration: Full SCIM Group Provisioning Support

The Okta integration now supports complete SCIM group provisioning, enabling automatic synchronization of both users and groups between Okta and Netdata Cloud.

What's New

Previously limited to user provisioning, the integration now includes:

| Capability | Before | Now |
|---|---|---|
| User Provisioning | ✅ Create, update, deactivate users | ✅ Create, update, deactivate users |
| Group Sync | ❌ Manual group management | ✅ Automatic group synchronization |
| Space/Room Access | ❌ Manual assignment | ✅ Auto-assignment based on Okta groups |

Automated Access Management

When you add users to or remove them from groups in Okta, the changes are reflected in Netdata Cloud immediately. This enables powerful automation scenarios:

  • Assign users to specific Netdata spaces based on their Okta department groups
  • Grant room access automatically based on team membership
  • Revoke access immediately when users leave groups
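
For readers unfamiliar with SCIM, group membership changes travel as PatchOp messages defined by SCIM 2.0 (RFC 7644). The sketch below builds such message bodies; the helper names and user IDs are hypothetical, and the exact requests exchanged between Okta and Netdata Cloud are not described in these notes.

```python
import json

PATCH_OP_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def add_members(user_ids):
    """Build a SCIM 2.0 PatchOp body that adds users to a group."""
    return {
        "schemas": [PATCH_OP_SCHEMA],
        "Operations": [
            {
                "op": "add",
                "path": "members",
                "value": [{"value": uid} for uid in user_ids],
            }
        ],
    }


def remove_members(user_ids):
    """Build a SCIM 2.0 PatchOp body that removes users from a group."""
    # RFC 7644 selects the member to remove with a filter in the path.
    return {
        "schemas": [PATCH_OP_SCHEMA],
        "Operations": [
            {"op": "remove", "path": f'members[value eq "{uid}"]'}
            for uid in user_ids
        ],
    }


# Hypothetical user IDs, for illustration only.
print(json.dumps(add_members(["user-1", "user-2"]), indent=2))
print(json.dumps(remove_members(["user-1"]), indent=2))
```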

Learn how to configure SCIM group provisioning in our documentation or explore the Netdata integration in Okta's marketplace.

Automated SNMP Monitoring with Device Profiles

Netdata v2.6.0 adds SNMP profile-based collection (alpha), transforming complex SNMP monitoring into a plug-and-play experience. The profile system makes enterprise network monitoring accessible to everyone, from home labs to data centers, with the simplicity Netdata is known for.

Getting started is simple:

  • Existing users: Profiles are automatically enabled and your devices will be detected and monitored with no additional configuration
  • New users: Just configure SNMP credentials, and Netdata handles the rest

Important

As an alpha release, expect rapid improvements and possible profile format changes in future versions.

What's New

| Before | Now with Profiles |
|---|---|
| Manual OID configuration | Auto-detection with 100+ device profiles |
| Limited to IF-MIB metrics | Full device metrics: CPU, memory, temperature, status |
| Complex setup per device | Drop-in YAML profiles |
| No vendor intelligence | Vendor-specific metrics and transformations |
| Fixed monitoring only | Support for custom profiles for specialized devices |
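
These notes do not spell out how auto-detection picks a profile. A common approach, and the assumption behind this sketch, is to match the device's sysObjectID against per-profile patterns. The profile names and the `match_profile` helper are hypothetical; the OID prefixes are the registered enterprise numbers for each vendor.

```python
from fnmatch import fnmatchcase

# Hypothetical profile table: profile name -> sysObjectID patterns.
PROFILES = {
    "cisco": ["1.3.6.1.4.1.9.*"],
    "palo-alto": ["1.3.6.1.4.1.25461.*"],
    "f5-big-ip": ["1.3.6.1.4.1.3375.*"],
}


def match_profile(sys_object_id, profiles=PROFILES):
    """Return the first profile whose pattern matches, or None."""
    for name, patterns in profiles.items():
        if any(fnmatchcase(sys_object_id, pattern) for pattern in patterns):
            return name
    return None


print(match_profile("1.3.6.1.4.1.9.1.1208"))
print(match_profile("1.3.6.1.4.1.99999.1"))
```

A device that matches no pattern returns `None` here; in practice that is the case where a custom profile would be needed.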

Extensive Device Coverage

Netdata ships with profiles for major network vendors, adapted from Datadog’s battle-tested definitions:

| Category | Vendors Included |
|---|---|
| Switches & Routers | Cisco (Catalyst, Nexus, ASR, ISR), Arista, Juniper, HP/HPE, Dell, Extreme |
| Firewalls | Palo Alto, Fortinet FortiGate, Cisco ASA, Checkpoint, SonicWall |
| Wireless | Aruba, Cisco WLC, Ubiquiti, Alcatel-Lucent |
| Load Balancers | F5 BIG-IP, Citrix NetScaler, A10 Thunder |
| Infrastructure | APC UPS/PDU, Dell servers, standard MIBs (BGP, OSPF, TCP/UDP) |

Tip

This is just the beginning. We're actively expanding coverage based on user feedback. Missing metrics for your devices? Let us know!

Native Package Support for RHEL 10 and Derivatives

Netdata now provides native packages for RHEL 10, AlmaLinux 10, and Rocky Linux 10.
These packages ensure seamless integration with corporate deployment tools, automated updates, and compliance requirements typical in enterprise environments. Whether you're running RHEL 10 in production or using AlmaLinux or Rocky Linux as alternatives, you get the same reliable, optimized Netdata experience.

Rust-Based systemd-journal Plugin for Static Builds

Static build users can now access systemd journal logs directly from the Netdata dashboard. Previously, the journal plugin was available only in package, source, and Docker installations because it required libsystemd, which made it incompatible with Alpine-based static builds.

Feature Parity at Last

The new Rust implementation eliminates the last major feature gap between static builds and other installation methods. Static builds are popular for their simplicity - download, extract, and run without package managers or dependencies. However, the Alpine Linux base lacks libsystemd support, forcing users to sacrifice integrated log analysis for portability.
This trade-off is now history. The Rust implementation brings full systemd journal capabilities to static builds without compromising their core advantages. Users on systems without package managers or those preferring self-contained installations finally have access to the same powerful troubleshooting tools as everyone else.

Zero Migration Required

If you're upgrading a static build installation, the journal plugin simply becomes available in your Logs tab. No configuration changes needed. The Rust implementation maintains full compatibility with the existing plugin interface.

Acknowledgments

  • @de-authority for fixing a typo in kickstart.sh.
  • @felipecrs for fixing Docker socket group ID conflicts when host's docker group ID matches an existing container group.
  • @n0099 for clarifying go.d.plugin debug command paths based on installation location in documentation.
  • @tobias-richter for fixing httpcheck collector documentation to use correct header_match parameter name.
  • @andrewm4894 for being a great person and adding Claude CLI instructions for configuring Netdata MCP server.

Contributions

Collectors

Improvements
  • Add SNMP profile-based collection to go.d/snmp collector using adapted Datadog profiles for automated metric discovery and chart creation (@ilyam8)
  • Implement custom update intervals for Windows plugin collectors by running collectors with different frequencies in separate threads (windows.plugin) (#20580, #20672, @thiagoftsm @stelfrag)
  • Enable smartctl collector to work on Windows by allowing direct smartctl execution on non-Linux systems (#20574, #20567, @ilyam8)
  • Add PerflibNUMA collector for NUMA node monitoring in Windows plugin (windows/PerflibNUMA) (#20573, @thiagoftsm)
  • Add configurable concurrent device scanning to smartctl collector for improved performance with multiple devices (go.d/smartctl) (#20569, @ilyam8)
  • Add GetPowerSupply module for power supply monitoring in Windows plugin (windows/GetPowerSupply) (#20522, @thiagoftsm)
  • Add PerflibASP collector for ASP.NET monitoring in Windows plugin (windows/PerflibASP) (#20485, @thiagoftsm)
  • Add bearer token file authentication support to HTTP request configuration for all go.d collectors (go.d.plugin) (#20476, @ilyam8)
  • Add Exchange Server monitoring support via PerflibExchange collector in Windows plugin (windows/PerflibExchange) (#20454, @thiagoftsm)
  • Improve apps.plugin grouping with case-insensitive matching and add Windows process name support (apps.plugin) (#20386, @ktsaou)
  • Add autodetection_retry configuration option to go.d collectors for improved automatic discovery handling (go.d.plugin) (#20357, @ilyam8)
  • Improve Prometheus exporter detection in go.d service discovery (go.d.plugin) (#20348, @ilyam8)
  • Add Rust-based systemd-journal plugin to enable journal log viewing in static builds (#20345, @vkalintiris)
  • Add MSSQL wait statistics and resource locks metrics to PerflibMSSQL collector in Windows plugin (windows/PerflibMSSQL) (#20307, @thiagoftsm)
  • Add system page table entries and processor queue length metrics to PerflibMemory collector (windows/PerflibMemory) (#20277, @thiagoftsm)
  • Add IIS W3SVC and W3WP metrics to PerflibWebService collector in Windows plugin (windows/PerflibWebService) (#20245, @thiagoftsm)
Bug Fixes
  • Fix MariaDB User CPU Time calculation to apply workaround only for specific affected versions (go.d/mysql) (#20262, @ilyam8)
  • Add missing configuration properties to go.d collector config schemas for UI configuration support (#20489, #20490, @ilyam8)
Other
  • Migrate iprange package from legacy net package to modern net/netip for improved IP address handling (go.d.plugin) (#20636, @ilyam8)
  • Fix golangci-lint warnings in go.d plugin code (go.d.plugin) (#20360, @ilyam8)

Packaging/Installation

All changes
  • Add informational dialog to Windows installer explaining subscription requirements and limitations for free users (#20593, @thiagoftsm)
  • Add Rocky Linux 10 to CI and package builds, enabling support for RHEL 10, AlmaLinux 10, and Rocky Linux 10 (#20578, @Ferroin)
  • Discontinue POWER8+ builds due to minimal user adoption and high maintenance overhead (#20518, @Ferroin)
  • Fix patch processing script to gracefully skip already-applied patches instead of failing (#20480, @Ferroin)
  • Enable Rust-based systemd journal plugin in static builds, providing journal monitoring capability without libsystemd dependency (#20477, @Ferroin)
  • Add flex as a required dependency in install-required-packages.sh to fix source build failures (#20322, @ilyam8)
  • Fix Docker socket group ID conflicts when host's docker group ID matches an existing container group (#20288, @felipecrs)
  • Add "unix://" scheme prefix to DOCKER_HOST environment variable in run.sh to prevent incorrect tcp://localhost:2375 default (#20286, @ilyam8)

Documentation

All changes
  • Add comprehensive welcome document introducing Netdata's architecture, design philosophy, and capabilities for new users and evaluators (#20669, @ktsaou)
  • Add documentation for analyzing anomaly detection accuracy in machine learning features (#20663, @ktsaou)
  • Update Netdata Cloud documentation with improved UX, enhanced API token guidance, better structure and information flow, and new diagram color palette (#20661, #20665, @kanelatechnical)
  • Fix httpcheck collector documentation to use correct header_match parameter name instead of headers_match (#20652, @tobias-richter)
  • Update Cloud OIDC documentation to use space ID instead of issuer URL in authorization setup (#20643, @car12o)
  • Add Model Context Protocol (MCP) to the Distributed Observability Pipeline diagram in documentation (#20637, @ktsaou)
  • Add NIDL Framework Documentation (#20629, #20630, #20632, #20634, @ktsaou)
  • Add comprehensive Enterprise Evaluation Guide providing an at-a-glance overview of Netdata's capabilities (#20627, @kanelatechnical)
  • Reorganize AI and Machine Learning documentation for better feature discoverability and capability showcase (#20600, @ktsaou)
  • Fix broken link to Netdata's architecture documentation in repository README (#20597, @ilyam8)
  • Add documentation for switching between Netdata installation types and release channels (#20564, @kanelatechnical)
  • Update MCP documentation with limitations disclaimer and generalized real-world usage examples (#20563, @kanelatechnical)
  • Add migration instructions for switching between stable and nightly Netdata agent versions (#20551, @kanelatechnical)
  • Add guide for removing nodes from Netdata Cloud (#20549, @kanelatechnical)
  • Rename machine learning documentation category to better reflect Netdata's comprehensive AI capabilities including MCP and Insights (#20514, @kanelatechnical)
  • Add announcement for Netdata MCP Server preview (#20513, @ilyam8)
  • Add MCP documentation (#20469, @kanelatechnical)
  • Add Claude CLI instructions for configuring Netdata MCP server connection in nd-mcp documentation (#20440, @andrewm4894)
  • Update MSSQL Server monitoring configuration instructions (#20429, @thiagoftsm)
  • Add Netdata Insights documentation and restructure AI content to highlight complementary features with improved diagrams (#20425, @kanelatechnical)
  • Remove sizing-netdata-parents.md (#20421, @ilyam8)
  • Improve documentation for metrics centralization configuration and deployment (#20412, @kanelatechnical)
  • Add debugging example for specific jobs in go.d plugin documentation (#20399, @ilyam8)
  • Improve maintenance documentation (#20398, @kanelatechnical)
  • Improve DynCfg documentation (#20384, @kanelatechnical)
  • Add automatic updates section to Windows installer documentation (#20358, @kanelatechnical)
  • Update health configuration reference with improved visuals, structure, and enhanced readability (#20347, @kanelatechnical)
  • Update SCIM documentation (#20330, #20451, #20495, #20588, @juacker)
  • Update agent alerting and notification documentation with blog content, examples, and visuals (#20329, @kanelatechnical)
  • Add SOC 2 Type 1 attestation blog link to Netdata Cloud security documentation (#20325, @kanelatechnical)
  • Enhance AI and machine learning documentation with improved structure, detailed explanations, visual aids, and marketing-aware content (#20309, @kanelatechnical)
  • Update security documentation to reflect achieved SOC2 Type 1 compliance status (#20300, @shyamvalsan)
  • Update centralized cloud notifications documentation with improved visuals, user-friendly language, and enhanced readability (#20292, #20334, @kanelatechnical)
  • Improve StatsD collector documentation with enhanced content (#20282, @kanelatechnical)
  • Reword go.d plugin troubleshooting section for improved clarity (#20259, @ilyam8)
  • Clarify go.d.plugin debug command paths based on installation location in documentation (#20258, @n0099)
  • Enhance native DEB/RPM package documentation with technical details, improved organization, and installation requirements (#20257, @kanelatechnical)
  • Update main README with improved structure and organization (#20251, #20265, @kanelatechnical)

Other Notable Changes

Improvements
Other
  • Clean up codebase by removing analytics code, fixing warnings, adding command pool tests, and increasing metadata thread shutdown timeout (#20673, @stelfrag)
  • Fix race condition during datafile creation by adding missing lock (#20662, @stelfrag)
  • Improve job completion handling with enhanced timeout mechanism (#20657, @stelfrag)
  • Remove legacy analytics submission in favor of new agent events system (#20654, @stelfrag)
  • Improve ACLK proxy configuration with better memory handling and support for passwordless authentication (#20639, @stelfrag)
  • Fix ACLK connection handling to prevent double-free errors and memory leaks when using proxies (#20625, @stelfrag)
  • Fix packet ID generation to ensure unique, non-zero values with thread-safe atomic operations (#20624, @stelfrag)
  • Improve journal v2 file creation performance by preventing processor yields during startup (#20619, @stelfrag)
  • Fix datafile deletion during indexing by ensuring proper cache cleanup before removal (#20607, @stelfrag)
  • Fix compilation on Windows (#20602, @stelfrag)
  • Update sqlite version to 3.50.2 (#20601, @stelfrag)
  • Ensure metadata workers properly respond to shutdown signals during operation (#20598, @stelfrag)
  • Optimize datafile storage by replacing linked lists with JudyL arrays for improved performance (#20581, @stelfrag)
  • Fix alert version table rebuild to handle duplicate entries gracefully during snapshot creation (#20579, @stelfrag)
  • Add SQL query definitions for cleaning up expired agent events and orphaned health log records (#20570, @stelfrag)
  • Simplify MRG loading by using a single operation instead of separate first/last file loads (#20562, @stelfrag)
  • Add database validity checks before SQL execution and extend db_execute API to return SQLite error codes for improved error handling (#20560, @stelfrag)
  • Improve SQLite shutdown handling to prevent multiple database close attempts during agent termination (#20559, @stelfrag)
  • Add node-update-info CLI command to trigger on-demand node information updates to Netdata Cloud (#20558, @stelfrag)
  • Fix ACLK sync shutdown to verify thread initialization before attempting cleanup (#20555, @stelfrag)
  • Prevent saving alert configuration transitions during agent shutdown to avoid incomplete state persistence (#20553, @stelfrag)
  • Fix MQTT disconnect reason parsing to correctly store reason codes in the proper field (#20540, @stelfrag)
  • Fix thread safety by acquiring lock before accessing SQL statement pool (#20536, @stelfrag)
  • Improve datafile rotation and indexing during shutdown with staged tier stopping, deferred journal indexing, and quota enforcement (#20464, @stelfrag)
  • Clean up orphaned journal files on startup and report (but not delete) unknown files in dbengine directories (#20462, @stelfrag)
  • Optimize database event loop by consolidating callback checks and preventing duplicate indexing operations (#20459, @stelfrag)
  • Improve metadata sync thread shutdown with proper worker cleanup, timeout handling, and service unregistration (#20455, @stelfrag)
  • Fix typo in kickstart.sh (#20417, @de-authority)
  • Rename nd-mcp to nd-mcp.exe on Windows (#20404, @stelfrag)
  • Optimize retention calculation with reduced dataset scanning, early termination detection, and shutdown-aware processing (#20350, @stelfrag)
  • Clean up metadata thread code with fixed shutdown timeouts, simplified worker handling, and improved variable naming (#20323, @stelfrag)
  • Abort health system initialization early if agent shutdown is requested (#20318, @stelfrag)
  • Optimize event loop performance by implementing worker and command pools to reduce memory allocations (#20306, @stelfrag)
  • Fix issues identified by Coverity (#20290, #20511, #20656, @stelfrag)
  • Improve agent shutdown by properly signaling service termination and ensuring all threads are joined (#20280, @stelfrag)
  • Replace pthread implementation with libuv threads for consistent cross-platform thread management (#20250, @stelfrag)
  • Ensure all threads are joinable and properly joined during agent shutdown for clean termination (#20228, @stelfrag)

Deprecation notice

Changed in this release

No changes.

Important Changes in Next Major Release

Deprecated Components

| Component Type | Versions Being Deprecated |
|---|---|
| APIs | v1, v2 |

What This Means

Only the v3 API and v3 Dashboard will be supported starting with the next major release. These newer versions offer improved performance, enhanced features, and better security.

Important Changes in Next Minor Release

SNMP Legacy Collection Deprecation

We are phasing out legacy SNMP data collection methods (hardcoded IF-MIB and manual custom OID configurations) in favor of the new profile-based system:

Timeline:

  • Next minor release: disable_legacy_collection will default to yes (currently no)
  • Following release: Legacy collection code will be completely removed

What this means for you:

  • Important: Profile-based collection uses different chart IDs than legacy collection. Historical data from legacy metrics will not be migrated and will no longer be visible after switching
  • Start testing profile-based collection now to ensure a smooth transition
  • If any metrics are missing from default profiles, please provide feedback so we can update them

Migration path:

  1. Test profile-based collection alongside legacy collection (current default behavior)
  2. Verify that profiles collect all metrics you need
  3. Report any missing metrics - we'll add them to default profiles
  4. Disable legacy collection when ready using disable_legacy_collection: yes
  5. Legacy users who don't migrate will lose SNMP functionality when legacy code is removed
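
The default flip matters most for jobs that never set the option explicitly. The sketch below models that under stated assumptions: the release labels, job dictionaries, and `legacy_collection_enabled` helper are hypothetical, and only the option name `disable_legacy_collection` and its current and upcoming defaults come from these notes.

```python
# Defaults per the timeline: "no" in v2.6.0, "yes" in the next minor release.
DEFAULT_DISABLE_LEGACY = {"v2.6.0": False, "next-minor": True}


def legacy_collection_enabled(job, release):
    """Return True if legacy SNMP collection would run for this job."""
    disabled = job.get(
        "disable_legacy_collection", DEFAULT_DISABLE_LEGACY[release]
    )
    return not disabled


job_implicit = {"name": "switch-1"}  # hypothetical job that omits the option
job_pinned = {"name": "switch-2", "disable_legacy_collection": False}

for release in ("v2.6.0", "next-minor"):
    print(release, legacy_collection_enabled(job_implicit, release))
    print(release, legacy_collection_enabled(job_pinned, release))
```

Pinning the option to `no` only postpones the change: once the legacy code is removed in the following release, the setting no longer has anything to enable.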

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

  • Premium Support: Customers who wish to have a direct channel with Netdata and prioritized support with defined SLAs can contact us.
  • Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
  • GitHub Issues: Use the Netdata repository to report bugs or open a new feature request.
  • GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
  • Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
  • Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 2000 engineers are already using it!