Releases · netdata/netdata

Release Summary

This release brings AI-powered monitoring intelligence and expanded platform support to all Netdata users.

Feature	What's New
AI Integration	• MCP server support enables AI assistants to query your infrastructure • Natural language questions in AI Insights ("What went wrong at 3 PM?")
Enterprise Integration	• Full SCIM group provisioning for Okta • Automatic space/room access based on Okta groups
Network Monitoring	• SNMP profile-based collection with 100+ device profiles (alpha) • Auto-detection for Cisco, Palo Alto, F5, and more
Platform Expansion	• Native packages for RHEL 10, AlmaLinux 10, Rocky Linux 10 • systemd journal support for static builds via Rust implementation

Release Highlights

Model Context Protocol (MCP) Server Integration

Every Netdata Agent and Parent now functions as an MCP server, enabling AI assistants like Claude Desktop to query and analyze your infrastructure monitoring data through a built-in WebSocket interface.

What MCP Enables

AI assistants gain read-only access to your monitoring data:

Infrastructure Discovery: Hardware specs, OS details, and streaming topology
Metric Intelligence: Full-text search across all contexts, instances, dimensions, and labels
System Insights: Execute functions for processes, network connections, systemd journals, and Windows events
Alert Analysis: View real-time alerts and complete alert history
Advanced Analytics: Complex metric queries with ML-powered anomaly detection
Root Cause Analysis: Correlate metrics and anomaly scores to identify issues

Security First

Sensitive functions (logs, process monitoring) require temporary API keys
Existing Netdata permissions control all data access
WebSocket connections need explicit configuration in AI clients
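
MCP is built on JSON-RPC 2.0, so a configured AI client ultimately exchanges messages like the one sketched below. This is a minimal illustration rather than Netdata's own client code: `build_mcp_request` is a hypothetical helper, it only builds the message text, and the endpoint, transport setup, and API key handling are left out because these notes do not specify them.

```python
import json
from itertools import count

# Hypothetical helper: builds a JSON-RPC 2.0 request of the kind an MCP
# client sends. It does not open a connection; transport is up to the client.
_request_ids = count(1)

def build_mcp_request(method, params=None):
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# "tools/list" is the standard MCP method for discovering available tools.
print(build_mcp_request("tools/list"))
```

A real session begins with an `initialize` exchange before any `tools/list` call; that handshake is omitted here for brevity.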

Scalable Visibility

AI assistant visibility scales with your connection point:

Connection Point	Visibility Scope
Netdata Child/Standalone	Single node only
Netdata Parent	Parent + all connected children
Netdata Cloud	Full infrastructure (coming soon)

AI Insights: Enhanced with Natural Language Investigation

AI Insights now understands your questions. Simply ask "What went wrong yesterday at 3 PM?" and get a comprehensive report targeting your specific concern—no more manual metric correlation or dashboard hunting during incidents.

Available Reports

Report Type	Analysis Period	Answers Questions Like
Infrastructure Summary	24 hours - 1 month	"How healthy is my infrastructure?"
Capacity Planning	3 months - 2 years	"When will I run out of resources?"
Performance Optimization	24 hours - 1 quarter	"Where are my bottlenecks?"
Anomaly Analysis	6 hours - 7 days	"What caused the outage?"
Investigation (NEW)	Custom timeframe	"Why did latency spike at 3 PM?"
Alert Troubleshooting	Real-time	"How do I fix this alert?" (Preview)

What's New

Natural Language Queries: Ask questions in plain English about any timeframe or issue
Targeted Analysis: Get reports focused on your specific problem, not generic overviews
Alert Resolution Guidance: Coming soon—automated investigation of active alerts with fix recommendations

Privacy and Limits

Reports are generated on-demand and immediately disposed
Your infrastructure data is never used for AI training
All report types count toward a single monthly limit of 10 reports

Note

Alert Troubleshooting is currently in preview and will be gradually rolled out to all users.

Okta Integration: Full SCIM Group Provisioning Support

The Okta integration now supports complete SCIM group provisioning, enabling automatic synchronization of both users and groups between Okta and Netdata Cloud.

What's New

Previously limited to user provisioning, the integration now includes:

Capability	Before	Now
User Provisioning	✅ Create, update, deactivate users	✅ Create, update, deactivate users
Group Sync	❌ Manual group management	✅ Automatic group synchronization
Space/Room Access	❌ Manual assignment	✅ Auto-assignment based on Okta groups

Automated Access Management

When you add or remove users from groups in Okta, these changes are instantly reflected in Netdata Cloud. This enables powerful automation scenarios:

Assign users to specific Netdata spaces based on their Okta department groups
Grant room access automatically based on team membership
Revoke access immediately when users leave groups
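
Group membership changes in SCIM 2.0 travel as PatchOp requests. The sketch below shows the general shape of such a request body, assuming the standard SCIM message schema; `group_membership_patch` and the user ID are hypothetical, and the exact payloads an identity provider sends may differ.

```python
import json

# Standard SCIM 2.0 schema URN for PATCH requests (RFC 7644).
PATCH_OP_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"

def group_membership_patch(op, user_ids):
    """Build a SCIM 2.0 PatchOp body that adds or removes group members."""
    if op not in ("add", "remove"):
        raise ValueError("op must be 'add' or 'remove'")
    return {
        "schemas": [PATCH_OP_SCHEMA],
        "Operations": [
            {
                "op": op,
                "path": "members",
                "value": [{"value": uid} for uid in user_ids],
            }
        ],
    }

# Hypothetical user ID, for illustration only.
print(json.dumps(group_membership_patch("add", ["user-123"]), indent=2))
```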

Learn how to configure SCIM group provisioning in our documentation or explore the Netdata integration in Okta's marketplace.

Automated SNMP Monitoring with Device Profiles

Netdata v2.6.0 adds SNMP profile-based collection (alpha), transforming complex SNMP monitoring into a plug-and-play experience. The profile system makes enterprise network monitoring accessible to everyone, from home labs to data centers, with the simplicity Netdata is known for.

Getting started is simple:

Existing users: Profiles are automatically enabled and your devices will be detected and monitored with no additional configuration
New users: Just configure SNMP credentials, and Netdata handles the rest

Important

As an alpha release, expect rapid improvements and possible profile format changes in future versions.

What's New

Before	Now with Profiles
Manual OID configuration	Auto-detection with 100+ device profiles
Limited to IF-MIB metrics	Full device metrics: CPU, memory, temperature, status
Complex setup per device	Drop-in YAML profiles
No vendor intelligence	Vendor-specific metrics and transformations
Fixed monitoring only	Support for custom profiles for specialized devices
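
These notes do not describe how auto-detection picks a profile. One plausible approach, sketched below, matches the device's sysObjectID against per-profile patterns and prefers the most specific match, as Datadog-style profiles do. The profile names and the `PROFILES` table are hypothetical; only the Cisco (`1.3.6.1.4.1.9`) and F5 (`1.3.6.1.4.1.3375`) enterprise OID prefixes are real.

```python
from fnmatch import fnmatchcase

# Hypothetical profile table: profile name -> sysObjectID patterns.
PROFILES = {
    "generic-device": ["1.3.6.1.4.1.*"],
    "cisco": ["1.3.6.1.4.1.9.*"],
    "f5-big-ip": ["1.3.6.1.4.1.3375.*"],
}

def select_profile(sys_object_id, profiles=PROFILES):
    """Return the profile whose matching pattern is the longest, or None."""
    best_name, best_len = None, -1
    for name, patterns in profiles.items():
        for pattern in patterns:
            if fnmatchcase(sys_object_id, pattern) and len(pattern) > best_len:
                best_name, best_len = name, len(pattern)
    return best_name

print(select_profile("1.3.6.1.4.1.9.1.1208"))
```

Choosing the longest matching pattern lets a vendor-specific profile win over a generic fallback without any explicit priority field.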

Extensive Device Coverage

Netdata ships with profiles for major network vendors, adapted from Datadog’s battle-tested definitions:

Category	Vendors Included
Switches & Routers	Cisco (Catalyst, Nexus, ASR, ISR), Arista, Juniper, HP/HPE, Dell, Extreme
Firewalls	Palo Alto, Fortinet FortiGate, Cisco ASA, Checkpoint, SonicWall
Wireless	Aruba, Cisco WLC, Ubiquiti, Alcatel-Lucent
Load Balancers	F5 BIG-IP, Citrix NetScaler, A10 Thunder
Infrastructure	APC UPS/PDU, Dell servers, standard MIBs (BGP, OSPF, TCP/UDP)

Tip

This is just the beginning. We're actively expanding coverage based on user feedback. Missing metrics for your devices? Let us know!

Native Package Support for RHEL 10 and Derivatives

Netdata now provides native packages for RHEL 10, AlmaLinux 10, and Rocky Linux 10.
These packages ensure seamless integration with corporate deployment tools, automated updates, and compliance requirements typical in enterprise environments. Whether you're running RHEL 10 in production or using AlmaLinux or Rocky Linux as alternatives, you get the same reliable, optimized Netdata experience.

Rust-Based systemd-journal Plugin for Static Builds

Static build users can now access sy...

Netdata v2.5.4 is a patch release to address issues discovered since v2.5.3.

This patch release provides the following bug fixes and updates:

Improved label sanitization in Go plugins by removing null bytes from values (commit, @ilyam8)
Improved Go plugin startup performance by loading SNMP profiles only when used instead of all at startup (commit, @ilyam8)
Added -NoProfile parameter to Windows installer PowerShell execution for cleaner environment setup (#20550, @thiagoftsm)
Optimized memory usage by switching label structures to use the ARAL allocator and reducing memory footprint (#20502, @stelfrag)
Fixed CPU architecture matching for Go plugin builds in 32-bit static builds (#20502, @Ferroin)
Fixed Redis collector to properly maintain TLS configuration for rediss connections (#20478, @ilyam8)
Fixed Go weblog collector to exclude HTTP 429 status codes from 4xx error category (#20443, @Slind14)
Fixed registry save operation by correcting integer overflow issues and adding exponential backoff for failed save attempts (#20437, @ktsaou)
Improved agent shutdown responsiveness by reducing streaming connection timeout from 1000ms to 250ms (#20434, @stelfrag)
Fixed database statement handling with improved thread cleanup and validation before finalization (#20433, @stelfrag)
Fixed memory corruption issue in query progress updates by preventing access to freed web client structures (#20431, @ktsaou)
Added vendored Protobuf and Abseil libraries to static builds with necessary patches for cross-platform compatibility (#17774, @Ferroin)
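
The registry fix above mentions exponential backoff for failed save attempts. As a general illustration of the technique (not the Agent's actual code, and with hypothetical parameter values), each retry waits a multiple of the previous delay, up to a cap:

```python
def backoff_delays(base_seconds=1.0, factor=2.0, max_seconds=60.0, attempts=6):
    """Return the wait before each retry: base * factor**n, capped at max_seconds."""
    return [min(base_seconds * factor ** n, max_seconds) for n in range(attempts)]

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```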

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
GitHub Issues: Make use of the Netdata repository to report bugs or open a new feature request.
GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 2000 engineers are already using it!

Netdata v2.5.3 is a patch release to address issues discovered since v2.5.2.

This patch release provides the following bug fixes and updates:

Fixed context update handling by adjusting conditions for hub queue management (#20416, @stelfrag)
Added ability to debug individual jobs in go.d.plugin instead of all jobs within a module (#20394, @ilyam8)
Added debug logging for HTTP response validation in go.d.plugin HTTP check collector (#20392, @ilyam8)
Fixed duplicate name handling in go.d.plugin dynamic configuration userconfig action (#20346, @ilyam8)
Fixed Oracle database collector to correctly calculate tablespace usage percentages and prevent negative values (#20373, #20378, @ilyam8)
Fixed database engine performance by optimizing file rotation and indexing operations with better job scheduling and concurrency handling (#20354, @stelfrag)

Netdata v2.5.2 is a patch release to address issues discovered since v2.5.1.

This patch release provides the following bug fixes and updates:

Fixed crash by preventing dynamic configuration initialization for virtual nodes (#20324, @ilyam8)
Updated eBPF library dependency to version 1.5.1 (#20316, @thiagoftsm)
Fixed dynamic configuration issue that incorrectly assigned plugin configurations to virtual nodes instead of the localhost context (#20312, @ktsaou)
Changed user transition log messages from debug to info level (#20308, @ilyam8)
Fixed memory issue by preventing use-after-free when accessing parent information (#20305, @ktsaou)
Fixed use-after-free memory issue in plugins.d inflight function handling (#20304, @ktsaou)
Fixed metadata synchronization shutdown to proceed even when event loop command submission fails (#20303, @stelfrag)
Fixed SNMP collector to properly format system information (#20293, #20301, @ilyam8)
Fixed database maintenance scheduling to properly sequence journal indexing after file rotation operations (#20264, @stelfrag)
Fixed various minor issues including improved shutdown logging, division by zero protection, and updated dimension status messages (#20263, @stelfrag)
Improved MSSQL collector performance by moving database queries to a separate thread (#20230, @thiagoftsm)

Netdata v2.5.1 is a patch release to address issues discovered since v2.5.0.

This patch release provides the following bug fixes and updates:

Fixed obsolete chart cleanup to properly handle virtual nodes (#20254, @ilyam8)
Fixed SNMP collector to use 32-bit counters for network interfaces when 64-bit counters aren't available (#20249, @ilyam8)
Fixed SNMP collector to fall back to interface description (ifDescr) when interface name (ifName) is empty (#20248, @ilyam8)
Fixed SNMP discovery by correcting SNMPv3 credential parameter names to match expected values (#20247, #20256, @ilyam8)
Fixed compilation on older distributions by removing uv_sleep function call that isn't available in older libuv versions (#20243, @stelfrag)
Fixed claiming in Docker by improving detection of localhost environments and providing correct claim command instructions (#20240, @stelfrag)
Added user configuration option to override default thread stack size (#20236, @stelfrag)
Fixed CouchDB collector to use correct units (bytes instead of KiB) for database size charts (#20235, @ilyam8)
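
The two SNMP interface fixes above amount to simple fallbacks. The sketch below illustrates them using the standard IF-MIB OIDs; `values` is a hypothetical dictionary of OID to value as returned by some SNMP walk, and the wrap-aware `counter_delta` helper is an addition for illustration, not something described in these notes.

```python
# Standard IF-MIB column OIDs; the interface index is appended to each.
IF_NAME = "1.3.6.1.2.1.31.1.1.1.1"
IF_DESCR = "1.3.6.1.2.1.2.2.1.2"
IF_HC_IN_OCTETS = "1.3.6.1.2.1.31.1.1.1.6"  # 64-bit counter
IF_IN_OCTETS = "1.3.6.1.2.1.2.2.1.10"       # 32-bit counter

def interface_label(values, index):
    """Prefer ifName; fall back to ifDescr when ifName is missing or empty."""
    return values.get(f"{IF_NAME}.{index}") or values.get(f"{IF_DESCR}.{index}", "")

def in_octets(values, index):
    """Return (counter, bits), preferring the 64-bit counter when present."""
    hc = values.get(f"{IF_HC_IN_OCTETS}.{index}")
    if hc is not None:
        return hc, 64
    return values.get(f"{IF_IN_OCTETS}.{index}"), 32

def counter_delta(previous, current, bits):
    """Difference between two samples, allowing for a single counter wrap."""
    return (current - previous) % (2 ** bits)

# Hypothetical device that exposes neither ifName nor 64-bit counters.
sample = {f"{IF_NAME}.1": "", f"{IF_DESCR}.1": "GigabitEthernet0/1", f"{IF_IN_OCTETS}.1": 100}
print(interface_label(sample, 1), in_octets(sample, 1))
```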

Release Summary

Netdata v2.5.0 continues our commitment to stability with significant improvements to system robustness. This release focuses on eliminating potential crashes, resolving memory issues, and enhancing thread management across the codebase. We've implemented comprehensive deadlock detection, improved resource cleanup procedures, and added protection against corrupted data files.

Acknowledgments

@barracuda156 for fixing compilation on macOS versions earlier than 11.
@luiizaferreirafonseca for fixing grammar in the main README file.
@rhoriguchi for fixing filtering of systemd-nspawn container payload in cgroups monitoring.

Contributions

Collectors

Improvements

Added default filtering for systemd-nspawn container payload in cgroups monitoring (#20155, #20168, @ilyam8, @rhoriguchi)
Added per-database lock metrics to Windows MSSQL collector (#20141, @thiagoftsm)

Other

Reorganized code in Windows plugin IIS module for better maintainability (#20182, @thiagoftsm)
Added initial work-in-progress implementation of Netdata exporter for OpenTelemetry (#20171, #20199, @ilyam8)
Removed legacy code that handled the WMI to Windows collector renaming in Go module configurations (#20166, @ilyam8)
Cleaned up SNMP collector by removing unused code from vendored Datadog profile components (#20164, @ilyam8)
Added detailed UPS response logging in debug mode for APC UPS collector (#20157, @ilyam8)
Improved test coverage for OpenTelemetry journald exporter remote client functionality (#20143, @ilyam8)
Added metric descriptions and proper unit definitions to SNMP collector profiles for improved chart rendering (#20100, #20163, @Ancairon)

Packaging/Installation

All changes

Updated Windows installer to use a unified license agreement page instead of multiple separate license pages (#20134, @Ferroin)
Updated supported platforms for CI and package builds, adding Alpine 3.21, CentOS Stream 10, Ubuntu 25.04, and Fedora 42 (#20119, #20177, @Ferroin)

Documentation

All changes

Added documentation for centralizing and managing namespaced logs (#20217, @ktsaou)
Improved security and privacy design documentation (#20208, @kanelatechnical)
Added comprehensive documentation for the Dynamic Configuration system, including component usage guidelines and developer information (#20187, #20232, @ktsaou)
Improved systemd journal logs documentation (#20184, @kanelatechnical)
Updated platform support documentation to reflect current compatibility with the latest FreeBSD and macOS versions (#20165, @ilyam8)
Improved dashboard and charts documentation with better formatting, consistent language, and enhanced visual elements for easier navigation (#20162, @kanelatechnical)
Fixed grammar and improved clarity in the main README file (#20144, @luiizaferreirafonseca)
Changed installation documentation to use proper admonition syntax for informational blocks (#20136, @kanelatechnical)
Changed deployment documentation title from singular to plural form (#20133, @kanelatechnical)
Updated installation documentation with improved structure, user-friendly language, visual aids, and proper Docusaurus syntax (#20122, @kanelatechnical)

Other Notable Changes

Bug Fixes

Fixed potential crashes by adding null pointer checks when accessing journal and data files (#20226, @stelfrag)
Added local collection of analytics data to support API information requests, while still respecting telemetry preferences for external reporting (#20221, @stelfrag)
Fixed exporting engine issues including crash on shutdown in static builds and timeout handling when waiting for threads to exit (#20212, @ktsaou)
Fixed thread allocation to consider system memory constraints, preventing crashes during startup on systems with high CPU counts but limited RAM (#20192, @ktsaou)
Fixed potential crash during thread termination in exporting engine (#20191, @ktsaou)
Fixed signal handling to ignore maintenance signals during shutdown process to prevent conflicts (#20190, @ktsaou)
Fixed potential crash when handling repeating alerts that were not properly queued (#20186, @stelfrag)
Fixed race condition when logging pending messages by ensuring atomic operations (#20185, #20188, #20189, @ktsaou)
Fixed health configuration schema parameter for database lookup absolute option to prevent UI validation failures (#20161, @ilyam8)
Fixed multiple memory issues including optimized context queues, buffer overflow protection, thread synchronization for metadata transitions, improved dictionary cleanup, and proper ML model resource management (#20159, @ktsaou)
Fixed label memory accounting to prevent negative values in memory tracking (#20158, @stelfrag)
Fixed crash in Windows MSSQL collector during performance data processing (#20131, #20032, @thiagoftsm)
Fixed database engine startup to safely handle corrupted journal files by skipping them during metrics registry population (#20128, @stelfrag)
Fixed memory leak by properly freeing ACLK message payloads when MQTT connection is unavailable (#20125, @stelfrag)
Fixed memory leaks and improved cleanup procedures across multiple modules, including plugins.d threads, diskspace plugin, and pattern arrays (#20120, @ktsaou)

Other

Improved system type detection by fixing mini-PC identification and adding Proxmox server detection (#20229, @ktsaou)
Fixed potential crashes with additional val...

Release Summary

Netdata v2.4.0 is a stability-focused release that addresses many issues that were identified thanks to the new agent reporting system introduced in v2.3.0. This release significantly improves reliability by fixing multiple crash scenarios and memory leaks throughout the codebase.

Key Highlights

Category	Improvements
Memory Optimization	• Resolved significant memory leaks in container monitoring systems, particularly affecting Kubernetes deployments • Fixed memory leaks across database engine components, health alarm entries, and alert pattern matching • Improved SQLite memory management with maximum heap limits and dynamic memory release under system pressure
Stability Improvements	• Fixed numerous crashes in the Windows performance counters handling and container monitoring systems • Improved error handling when dbengine files reside on disks with errors • Enhanced journal file handling with better error logging • Optimized shutdown sequences to prevent resource leaks and crashes • Fixed ACLK synchronization issues to properly handle dynamic host configuration changes
New Features	• Windows Service Monitoring: Added capability to track running states (running, stopped, pending, paused) of Windows services through the windows.plugin/PerflibServices collector (disabled by default, requires manual activation)

Acknowledgments

@dave818 for fixing a cron job syntax error in the updater script by correcting the time format.
@ycdtosa for adding missing --offline-install-source option documentation to kickstart script usage information, adding Synology-specific user and group creation commands to kickstart script for improved DSM compatibility, and updating Synology installation documentation to clearly differentiate steps required for older DSM versions.

Contributions

Collectors

Improvements

Added Windows service monitoring to track running states including running, stopped, pending, and paused services (windows.plugin/PerflibServices) (#19990, @thiagoftsm)

Bug fixes

Fixed Prometheus collector to use appropriate units instead of "ratio" for measurements (go.d/prometheus) (#20069, @ilyam8)
Fixed crash in Windows Hyper-V collector caused by unpopulated shared buffer values (windows.plugin/PerflibHyperV) (#20060, @thiagoftsm)
Fixed MegaCLI collector to properly handle adapter configurations with no connected drives (go.d/megacli) (#20046, @ilyam8)

Other

Added socket and remote client capabilities to OpenTelemetry journald exporter (#20038, #20033, #20121, @ilyam8)
Added hostname labels to virtual nodes in Go-based collectors (#20030, @ilyam8)
Added preliminary support for custom YAML files in SNMP collector that will be used for single metrics in future releases (go.d/snmp) (#20020, @Ancairon)

Packaging/Installation

All changes

Fixed cron job syntax error in updater script by correcting the time format (#20039, @dave818)
Added missing --offline-install-source option documentation to kickstart script usage information (#20025, @ycdtosa)
Added Synology-specific user and group creation commands to kickstart script for improved DSM compatibility (#20024, @ycdtosa)
Added Docker tag rotation system to track the four most recent nightly builds with relative numeric identifiers (#19734, #20089, @Ferroin)

Documentation

All changes

Improved clarity, structure, and examples throughout the Alerts & Notifications documentation (#20085, @kanelatechnical)
Updated documentation to provide clearer guidance on transitioning to static builds for end-of-life platforms (#20075, #20110, @ralphm)
Added documentation for the remove-stale-node command in the Nodes Ephemerality guide (#20057, @ralphm)
Fixed code block formatting in Log2Journal documentation to comply with MDX 3 requirements (#20056, @Ancairon)
Simplified OIDC configuration by removing parameters no longer needed after adding Discovery support (#20053, @juacker)
Improved documentation for observability centralization, including streaming, replication, and node management, with clearer language and structure (#20052, #20073, @kanelatechnical)
Removed on-premises documentation files relocated to a dedicated repository (#20023, @Ancairon)
Improved Windows installer and Machine Learning documentation with simpler language and better organization (#20021, @kanelatechnical)
Improved deployment guides with clearer explanations of standalone installations and centralization options (#20004, @kanelatechnical)
Updated Synology installation documentation to clearly differentiate steps required for older DSM versions (#19989, #19993, #20010, @ycdtosa)
Improved installation documentation with more concise instructions for macOS, offline installation, IPv4 configuration, native packages, and Docker deployment (#19987, @kanelatechnical)
Improved installation documentation for Ansible, Azure, AWS, Kickstart script, and Kubernetes deployments with better organization and clarity (#19981, @kanelatechnical)
Fixed documentation order to provide a more logical top-to-bottom reading flow in kickstart installation guide (#19975, @kanelatechnical)
Updated SCIM documentation to include new Groups support functionality (#19969, @juacker)

Other Notable Changes

Bug Fixes

Fi...

Netdata v2.3.2 is a patch release to address issues discovered since v2.3.1.

This patch release provides the following bug fixes and updates:

Fixed journal file creation reliability with improved error handling and simplified allocation process (#20018, @ktsaou)
Fixed leakage of build environment identifiers by blacklisting GitHub runner machine IDs (#20016, @ktsaou)
Fixed potential memory access violations by adding validation for journal file headers and page boundaries (#20013, @stelfrag)
Fixed a rare crash condition by properly reinitializing data collection for obsolete or archived dimensions (#20007, @ktsaou)
Fixed MegaCLI collector to properly handle missing battery backup units (#20008, @ilyam8)
Changed UUID generation to use version 4 format for better uniqueness (#20002, @ktsaou)
Added detection for additional CI environment variables to automatically disable telemetry (#19999, @ktsaou)
Fixed Agent status system to handle null UUIDs and improved tracking of shutdown time, crash counts, and connection states (#19996, #20003, #20011, @ktsaou)
Added detailed worker thread status information and enhanced crash diagnostics capabilities (#19992, @ktsaou)
Fixed potential crash in Windows perflib collector when handling null pointers (#19985, @ktsaou)
Fixed error reporting to preserve errno values during out-of-memory conditions (#19984, @ktsaou)
Fixed potential crashes when handling empty data arrays (#19983, @ktsaou)
Fixed Agent shutdown by properly joining ACLK and metadata threads before closing database connections (#19980, @stelfrag)
Fixed random crashes during shutdown by avoiding precompiled database statements for host metadata (#19978, @stelfrag)
Limited maximum database file size to 1GB to optimize memory usage during file operations (#19977, @stelfrag)
Fixed crash in variable lookup function when processing search results with scores (#19972, @ktsaou)
Fixed ACLK synchronization thread shutdown with better termination sequence and timeout handling for stuck operations (#19966, @stelfrag)
Improved Parent node startup performance by preloading UUIDs into metrics registry for faster initialization (#19964, @ktsaou)
Fixed Windows installer to properly manage configuration files and handle upgrades correctly (#19962, @thiagoftsm)
Fixed potential crash in health alarm cleanup when unlinking alerts from charts (#19956, @stelfrag)
Fixed buffer overflow when processing cloud rooms during Agent claiming on startup (#19954, @stelfrag)
Updated Agent status reporting system with enhanced crash diagnostics, anonymized stack traces, and ACLK connection status tracking (#19953, #19957, #19959, @ktsaou)
Fixed thread creation issues by adding retry logic when system resource limits are temporarily reached (#19951, @stelfrag)
Added monitoring of IIS Application Pool metrics to Windows collector (#19950, @thiagoftsm)
Improved metadata thread stability with better shutdown handling and enhanced event loop management (#19929, @stelfrag)
Fixed potential deadlocks by processing alert configuration database operations asynchronously through the metadata thread (#19885, @stelfrag)
Reworked shared memory management in eBPF plugin for more reliable interprocess communication (#19844, @thiagoftsm)
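
For readers unfamiliar with the UUID change above: a version 4 UUID is built from random bits rather than from timestamps or hardware identifiers. The snippet below illustrates the format using Python's standard library; it is not the Agent's implementation, and `node_id` is a hypothetical name.

```python
import uuid

# uuid4() fills the identifier with random bits; the few remaining bits
# are fixed and encode the version (4) and the RFC 4122 variant.
node_id = uuid.uuid4()
print(node_id, node_id.version)
```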

Netdata v2.3.1 is a patch release to address issues discovered since v2.3.0.

This patch release provides the following bug fixes and updates:

Fixed debug information handling by including it in default builds while disabling separate debuginfo packages for Debian-based distributions (#19946, #19948, @Ferroin)
Fixed static build configuration to avoid unnecessary libunwind compilation (#19939, @Ferroin)
Improved detection of low memory conditions with more aggressive monitoring (#19938, @ktsaou)
Fixed installation path for updater script crontab configuration (#19935, @ralphm)
Fixed validation of database page size limits for 32-bit compression format (#19932, @stelfrag)
Fixed compilation issues when building without database engine support or with address sanitizer enabled (#19930, @stelfrag)
Added additional system resource metrics to status file including memory usage and enhanced out-of-memory protection information (#19928, #19937, @ktsaou)
Fixed security issue by preventing exposure of absolute file paths in web server responses (#19925, @ktsaou)
Fixed security vulnerability in daemon status file handling by using file descriptor-based permissions to prevent race conditions (#19924, @Ferroin)
Removed insecure SVG generation endpoint to prevent potential code injection vulnerabilities (#19919, @ilyam8)
Fixed unaligned memory access in socket message buffer by properly aligning memory structures (#19917, @vkalintiris)
Fixed ACLK synchronization by ensuring thread initialization completes before proceeding with startup (#19916, @stelfrag)
Fixed issue where commands could be queued before ACLK initialization was complete (#19914, @ktsaou)
Fixed potential crash when database engine encounters null data files during range operations (#19913, @ktsaou)
Fixed Agent status reporting to handle first-run scenarios when no previous status file exists (#19912, @ktsaou)
Added initial implementation of libbacktrace for improved crash diagnostics (#19910, @ktsaou)
Fixed reliability calculation to properly handle normal Agent exit cases (#19909, @ktsaou)
Added enhanced shutdown diagnostics with timeouts and improved system information in crash reports including cloud provider details (#19903, @ktsaou)

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
GitHub Issues: Make use of the Netdata repository to report bugs or open a new feature request.
GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 2000 engineers are already using it!

This success drives rapid adoption among enterprises, reflecting the growing recognition of Netdata as the go-to observability solution for both cloud-native and on-premises environments. Our commitment remains steadfast: to deliver cutting-edge, AI-powered observability with unmatched performance and simplicity—all while being significantly more affordable.

We are also proud to see our users and customers run high-scale deployments with Netdata, reliably and effortlessly sustaining multiple millions of samples per second while streamlining their operations.

As we evolve, our focus on empowering businesses with higher-fidelity AI insights ensures Netdata remains the easiest and fastest way to optimize infrastructure and applications at any scale. 🚀

Do you like Netdata? Give Netdata a ⭐ too, on GitHub!

Release Summary

Netdata 2.3 delivers significant enhancements to monitoring reliability and scalability:

Crash Handling & Reporting: A zero-sampling system that captures and analyzes agent crashes with complete diagnostic information, significantly improving reliability across diverse environments.
Extreme Cardinality Protection: Automatic safeguards that maintain performance in high-scale environments with millions of time series while intelligently managing metadata retention.
Nodes Ephemerality & Streaming Alerts: A sophisticated approach to handling node connections in distributed environments, reducing alert noise by distinguishing between permanent and ephemeral nodes.
SNMP Service Discovery: A new system automatically finds and monitors SNMP-enabled devices on configured networks, eliminating manual configuration.

Release Highlights

Nodes Ephemerality & Streaming Alerts

Netdata 2.3 implements a more sophisticated approach to handling node connections in distributed environments. We now define ephemeral nodes as "nodes that are expected to disconnect without raising alerts", enabling smarter monitoring of dynamic infrastructure.

Feature	Description
Smart Node Classification	Distinguish between permanent infrastructure (servers) and ephemeral resources (containers, auto-scaling instances)
Targeted Alerting	Disconnection alerts trigger only for permanent nodes, reducing alert noise and focusing attention on genuine issues
Dynamic Infrastructure Support	Configure auto-scaling cloud instances, containers, and test environments as ephemeral to prevent unnecessary alerts
Simple Configuration	Mark nodes as ephemeral with a single setting in netdata.conf: `is ephemeral node = yes`
Automated Cleanup	Configurable retention periods to automatically remove disconnected ephemeral nodes from dashboards
Selective Cloud Notifications	Netdata Cloud now sends node-unreachable notifications exclusively for permanent nodes
Node Management CLI	Use `netdatacli mark-stale-nodes-ephemeral` to clear alerts for permanently offline nodes

Learn more about managing ephemeral nodes.
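The targeted-alerting rule in the table above can be illustrated with a small Python sketch. This is only an illustration of the decision logic as described in these notes, not Netdata's implementation; the `Node` class and `should_alert_on_disconnect` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified view of a streaming node."""
    hostname: str
    is_ephemeral: bool  # mirrors `is ephemeral node = yes` in netdata.conf
    connected: bool


def should_alert_on_disconnect(node: Node) -> bool:
    """Raise a disconnection alert only for permanent nodes that are offline."""
    return (not node.connected) and (not node.is_ephemeral)


nodes = [
    Node("db-server-1", is_ephemeral=False, connected=False),   # permanent, offline
    Node("ci-runner-42", is_ephemeral=True, connected=False),   # ephemeral, offline
    Node("web-1", is_ephemeral=False, connected=True),          # permanent, online
]

alerts = [n.hostname for n in nodes if should_alert_on_disconnect(n)]
print(alerts)
```

Only the permanent node that went offline ends up in `alerts`; the disconnected ephemeral node is ignored, which is the noise reduction the feature aims for.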

Extreme Cardinality Protection

Netdata 2.3 introduces automatic protection against extreme cardinality, a problem that arises when high-dimensional metrics are combined with long retention periods. The system provides:

Feature	Description
Intelligent Detection	Automatically identifies contexts with excessive ephemeral metrics (≥1000 instances with >50% ephemerality)
Balanced Protection	Preserves all actively collected metrics while selectively clearing retention for ephemeral ones
Resource Optimization	Prevents memory bloat and performance degradation from abandoned time-series metadata
Configurable Thresholds	Adjustable settings for instance count and ephemerality percentage to match your environment
Transparent Operation	Detailed logging of all protection activities for easy monitoring and verification

This protection maintains Netdata's performance even in high-scale environments with millions of time series, while still allowing unlimited cardinality for high-resolution data. Learn more about configuring this feature.
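To make the detection thresholds concrete, here is a minimal Python sketch of the rule quoted above (at least 1000 instances, more than 50% of them ephemeral). The function name, its parameters, and the way instance counts are supplied are assumptions made for illustration; how Netdata actually counts ephemeral instances is not specified in these notes.

```python
def needs_cardinality_protection(
    total_instances: int,
    ephemeral_instances: int,
    min_instances: int = 1000,            # threshold quoted in the release notes
    min_ephemerality_pct: float = 50.0,   # threshold quoted in the release notes
) -> bool:
    """Return True when a context crosses both thresholds.

    Hypothetical helper: both counts are assumed to be provided by the caller.
    """
    if total_instances < min_instances:
        return False
    ephemerality_pct = 100.0 * ephemeral_instances / total_instances
    return ephemerality_pct > min_ephemerality_pct


# A context with 5000 instances, 4000 of which are no longer collected.
print(needs_cardinality_protection(5000, 4000))
# A small context is never flagged, however ephemeral it is.
print(needs_cardinality_protection(999, 999))
```

Keeping both thresholds as parameters reflects the "Configurable Thresholds" row: the defaults follow the release notes, while an environment with unusual churn could pass different values.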

Crash Handling & Reporting

We've implemented a powerful, zero-sampling crash monitoring system that captures and analyzes agent restarts and crashes with complete diagnostic information. This solution leverages systemd's journal for flexible, scalable event tracking without additional licensing costs. With anonymous telemetry enabled, this system helps us identify critical issues across diverse environments, significantly improving Netdata's reliability for all users. Read more about our approach in this blog post.

Feature	Description
Zero-Sampling Collection	Captures every single crash event without sampling, providing complete visibility into system behavior
Comprehensive Diagnostics	Records detailed stack traces, error messages, and system context for accurate root cause analysis
Efficient Deduplication	Intelligent system that prevents redundant reporting (only one crash type per agent per day)
Privacy-Focused	No IP addresses collected, only anonymous telemetry with user opt-out option
Lightweight Implementation	Minimal performance impact, only activates when Agent starts, stops, or crashes
Cost-Effective Architecture	Leverages existing systemd journal infrastructure instead of expensive third-party solutions
High Scalability	Processes up to 20,000 events per second per instance with horizontal scaling capability
Flexible Analysis	Transforms complex JSON data into flattened journal entries for powerful filtering and correlation
Proven Results	Already identified and resolved dozens of critical issues across diverse environments
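Two rows of the table above, deduplication and JSON flattening, lend themselves to a short sketch. The Python below is an illustration under stated assumptions, not the actual pipeline: it assumes "one crash type per agent per day" means each crash type is reported at most once per agent per calendar day, and the field-naming scheme (upper-case keys joined by underscores) is a guess at a journal-friendly layout. All names are hypothetical.

```python
from datetime import date


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a flat {FIELD_NAME: "value"} mapping."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return {prefix: str(obj)}  # scalar leaf
    fields = {}
    for key, value in items:
        name = f"{prefix}_{key}" if prefix else str(key)
        fields.update(flatten(value, name.upper()))
    return fields


class CrashDeduplicator:
    """Remember (agent, crash type, day) keys that were already reported."""

    def __init__(self):
        self._seen = set()

    def should_report(self, agent_id: str, crash_type: str, day: date) -> bool:
        key = (agent_id, crash_type, day)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


# Hypothetical crash report; the structure is invented for this example.
report = {
    "agent": {"id": "a1", "version": "v2.3.0"},
    "fatal": {"signal": "SIGSEGV", "stack": ["f1", "f2"]},
}
dedup = CrashDeduplicator()
today = date(2025, 3, 1)
if dedup.should_report("a1", "SIGSEGV", today):
    print(flatten(report))
print(dedup.should_report("a1", "SIGSEGV", today))  # same agent, type, and day
```

Flattened fields such as `FATAL_SIGNAL` can be filtered and correlated individually, which is the benefit the "Flexible Analysis" row describes.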

SNMP Discovery

Netdata 2.3 adds an SNMP service discovery system that automatically finds and monitors SNMP-enabled devices on your networks.

Feature	Description
Automated Device Detection	Scans configured networks to discover SNMP-enabled devices without manual configuration
Flexible Network Configuration	Supports various IP range formats including single IPs, ranges, and CIDR notation (up to 512 IPs per subnet)
Customizable Credentials	Configure multiple credential sets with support for SNMPv2c and SNMPv3 with various security levels
Performance Optimization	Controls network impact through concurrent scan limits and configurable caching of discovery results
Seamless Integration	Automatically...
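
As a rough sketch of the network configuration formats listed above (single IPs, ranges, and CIDR notation with up to 512 IPs per subnet), the following Python uses the standard `ipaddress` module to expand a target specification into individual addresses. The `expand_target` function, the `start-end` range syntax, and the choice to reject oversized subnets with an error are assumptions for illustration; the discovery configuration itself may use a different syntax and behavior.

```python
import ipaddress

MAX_IPS_PER_SUBNET = 512  # limit quoted in the release notes


def expand_target(spec: str) -> list[str]:
    """Expand a single IP, an 'a-b' range, or a CIDR block into IPv4 addresses."""
    spec = spec.strip()
    if "-" in spec:  # assumed range syntax: "10.0.0.1-10.0.0.20"
        start_s, end_s = spec.split("-", 1)
        start = ipaddress.IPv4Address(start_s.strip())
        end = ipaddress.IPv4Address(end_s.strip())
        if end < start:
            raise ValueError(f"range end precedes start: {spec}")
        count = int(end) - int(start) + 1
        if count > MAX_IPS_PER_SUBNET:
            raise ValueError(f"range too large ({count} addresses): {spec}")
        return [str(ipaddress.IPv4Address(int(start) + i)) for i in range(count)]
    if "/" in spec:  # CIDR notation
        network = ipaddress.ip_network(spec, strict=False)
        if network.num_addresses > MAX_IPS_PER_SUBNET:
            raise ValueError(f"subnet too large ({network.num_addresses} addresses): {spec}")
        return [str(host) for host in network.hosts()]
    return [str(ipaddress.IPv4Address(spec))]  # single IP


print(expand_target("192.168.1.0/30"))
print(expand_target("10.0.0.1-10.0.0.3"))
print(expand_target("10.0.0.7"))
```

Checking `num_addresses` before calling `hosts()` avoids materializing a very large address list for something like a `/8`, so the size limit also acts as a guard on the scan's network impact.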
