OpenNMS Meridian Development Team

Tarus Balog <tarus@opennms.org>

David Hustace <david@opennms.org>

Benjamin Reed <ranger@opennms.org>

Copyright © 2004-2019 The OpenNMS Group, Inc.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts and with no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html.

OpenNMS is the creation of numerous people and organizations, operating under the umbrella of the OpenNMS project. The source code is published under the GNU Affero GPL, version 3 or later, and is Copyright © 2002-2020 The OpenNMS Group, Inc.

The current corporate sponsor of OpenNMS is The OpenNMS Group, which also owns the OpenNMS trademark.

Please report any omissions or corrections to this document by creating an issue at http://issues.opennms.org.

OpenNMS Meridian 2019

System Requirements

  • Java 8 through 11: OpenNMS Meridian 2019 runs on JDK 8 through 11. We recommend the most recent version of OpenJDK 11.

  • Default Heap Size: The default heap size is now 2GB.

  • PostgreSQL 10 or higher: Meridian 2019 requires any supported version of PostgreSQL 10 or higher.
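
The version requirements above can be expressed as a small pre-flight check. The following is a minimal sketch and not part of Meridian: the function names and the version-string handling are assumptions made for illustration only.

```python
def java_major(version: str) -> int:
    """Return the major Java version from strings such as '1.8.0_232' or '11.0.5'."""
    parts = version.split(".")
    # Java 8 and earlier report themselves as '1.<major>'.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


def postgres_major(version: str) -> int:
    """Return the major PostgreSQL version from strings such as '10.11' or '12.1'."""
    return int(version.split(".")[0])


def meets_requirements(java_version: str, postgres_version: str) -> bool:
    """Check versions against the Meridian 2019 requirements listed above."""
    java_ok = 8 <= java_major(java_version) <= 11
    postgres_ok = postgres_major(postgres_version) >= 10
    return java_ok and postgres_ok
```

For example, `meets_requirements("11.0.5", "10.11")` is true, while a PostgreSQL 9.6 server or a JDK newer than 11 fails the check. Whether a given PostgreSQL release is still a "supported version" is not something this sketch can determine.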

What’s New in Meridian 2019

Since Meridian 2018, we have introduced a large number of features, most notably Telemetryd (for processing streaming telemetry like NetFlow and sFlow), the Sentinel (for horizontal scaling of telemetry and other processing), and ALEC (for alarm correlation).

On top of that, there have been many other improvements and bug fixes since Meridian 2018.

Meridian 2019 roughly matches the feature set available in Horizon 25.

Architecture for Learning Enabled Correlation

Horizon 23 introduced support for correlation of alarms into meta-alarms called "situations" using an engine called the Architecture for Learning Enabled Correlation.

Situations are OpenNMS alarms that contain one or more triggering alarms, which allows them to be browsed, acknowledged, and unacknowledged just like any other alarm.
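
As a rough illustration of this relationship, a situation can be modeled as an alarm that also carries references to the alarms that triggered it. The class and field names below are hypothetical and do not reflect the OpenNMS data model or the ALEC API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alarm:
    """A simplified alarm; the fields are illustrative assumptions."""
    alarm_id: int
    description: str
    acknowledged: bool = False

    def acknowledge(self) -> None:
        self.acknowledged = True

    def unacknowledge(self) -> None:
        self.acknowledged = False


@dataclass
class Situation(Alarm):
    """An alarm that groups one or more triggering alarms."""
    related_alarms: List[Alarm] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.related_alarms:
            raise ValueError("a situation needs at least one triggering alarm")


# Usage: the situation is acknowledged through the same interface as any alarm.
link_down = Alarm(1, "link down on eth0")
bgp_down = Alarm(2, "BGP peer down")
situation = Situation(100, "upstream outage", related_alarms=[link_down, bgp_down])
situation.acknowledge()
```

In this sketch, acknowledging the situation does not change the state of the alarms it contains; how acknowledgement propagates in OpenNMS itself is not covered here.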

A high-level overview of the goal and implementation of correlation can be seen on the ALEC web site.

Changes to the Alarm Lifecycle

Alarm Clearing

Traditionally, OpenNMS has created and resolved alarms in pairs, with one alarm representing the triggering event (or events), and then a second alarm representing the resolution. Horizon 23 changes this default behavior to use a single alarm to track the problem state, incrementing the alarm count when the problem occurs again while the alarm is already in a problem state, or when the alarm moves from resolved back into a problem state. Additionally, you can configure OpenNMS to create a new alarm if a problem happens again.

These behaviors are controlled by two new settings in the opennms.properties file:

org.opennms.alarmd.legacyAlarmState

This setting reverts to the old (pre-23) behavior of creating separate alarms for a problem and its resolution.

org.opennms.alarmd.newIfClearedAlarmExists

This setting forces Alarmd to create a new alarm if a problem reoccurs, rather than incrementing an existing alarm. (Note: this is ignored if legacyAlarmState is set to true.)
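
To make the interaction between the two settings concrete, the following sketch models how a problem event might be handled under each combination. It is a simplified illustration rather than Alarmd's implementation: the function, its parameters, and its return strings are hypothetical, and it assumes that both settings default to false.

```python
from typing import Optional


def handle_problem_event(
    existing_alarm_state: Optional[str],
    legacy_alarm_state: bool = False,
    new_if_cleared_alarm_exists: bool = False,
) -> str:
    """Decide what happens when a problem event arrives.

    existing_alarm_state is None when no alarm exists for the problem,
    "problem" when an alarm is active, or "cleared" when it was resolved.
    """
    if legacy_alarm_state:
        # Pre-23 behavior: problems and resolutions are tracked as
        # separate alarms, and newIfClearedAlarmExists is ignored.
        return "legacy: use separate problem and resolution alarms"
    if existing_alarm_state is None:
        return "create new alarm"
    if existing_alarm_state == "cleared" and new_if_cleared_alarm_exists:
        # The problem reoccurred after being resolved.
        return "create new alarm"
    # Single-alarm tracking: reuse the alarm and bump its count.
    return "increment count on existing alarm"
```

With the assumed defaults, `handle_problem_event("cleared")` reuses the existing alarm, while passing `new_if_cleared_alarm_exists=True` yields a new alarm instead. Setting `legacy_alarm_state=True` takes precedence over the other setting, matching the note above.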

These improvements are covered in a lunch and learn video we published recently, if you would like to learn more.

Alarmd Architecture

To facilitate the implementation of ALEC, alarmd has been rearchitected to use Drools to manage the alarm lifecycle, rather than Vacuumd automations, triggers, and actions.

If you are migrating changes to vacuumd-configuration.xml from an earlier Meridian release, we strongly recommend that you port them to the new Alarmd Drools context. The Drools rules are in the $OPENNMS_HOME/etc/alarmd/drools-rules.d/ directory.

Additionally, we no longer generate alarmCreated, alarmEscalated, alarmCleared, alarmUncleared, alarmUpdatedWithReducedEvent, and alarmDeleted events. Instead, it is recommended that you add Drools rules to react to alarm changes.

For more complicated integrations, we also have a new API — AlarmLifecycleListener — for reacting to alarm changes.

Kafka Data Collection Sync

In addition to publishing events, alarms, and node inventory to Kafka, we now publish collected time-series data to the Kafka bus as well.

Sentinel

In addition to the Minion, we have added a new container-based subsystem called "Sentinel." The Sentinel is a Karaf container that can be configured to run a subset of OpenNMS daemons as a standalone tool, to aid in horizontal scaling and/or high availability.

Sentinel is designed to run our Karaf/Camel/SQS-based messaging bus, syslog listener, telemetry receiver, and Newts and Elasticsearch persistence.

Node and Interface Metadata

There is now support for associating arbitrary metadata with nodes and interfaces, including configuring arbitrary metadata in the requisition UI.

For details on using the metadata APIs, see the Admin Guide and the Developer Guide.

Elasticsearch 7.x Support

All of the features that integrate with Elasticsearch (event and alarm history, flows, and situation feedback) have been updated to support Elasticsearch 7.x. Elasticsearch versions before 7.x are no longer supported.

Given the pace of changes and the number of breaking changes between major versions of Elasticsearch, we will focus on supporting a single major version of Elasticsearch per release moving forward.

Other Improvements

Since Meridian 2019 is based on Horizon 25, it contains all the fixes and updates that have occurred since Meridian 2018 was created from the Horizon 21 codebase.

For a more complete list of changes included in this release, see the "What’s New" documentation for the Horizon releases after Horizon 21, up to and including Horizon 25.

Release Meridian-2019.1.5

Release 2019.1.5 is the sixth release in the Meridian 2019 series.

It fixes a few more security issues and a number of other bugs, and adds a couple of enhancements. Hat tip to Johannes Moritz for the security report.

The codename for 2019.1.5 is Saturn.

Bug
  • SNMP Remove from definitions fails for definitions with profile label (Issue NMS-12413)

  • persisted defaultCalendarReport database reports are broken (Issue NMS-12438)

  • Security issue disclosures, 31 Jan 2020 (Issue NMS-12513)

  • Selecting an Icon on Topology Map breaks the map (Issue NMS-12532)

  • Description: Cannot create monitored-service with JSON via ReST (Issue NMS-12625)

  • Confd download fails silently on Docker install (Issue NMS-12642)

Enhancement
  • Event documentation is missing tokens (Issue NMS-12228)

  • Splitting Docker documentation in Horizon, Minion and Sentinel (Issue NMS-12529)

  • Improve OIA performance when mapping alarms (Issue NMS-12581)

  • Events not balanced across partitions when using opennms-kafka-producer (Issue NMS-12616)

Release Meridian-2019.1.4

Release 2019.1.4 is the fifth release in the Meridian 2019 series.

It fixes an HQL injection bug, as well as a few other issues. Hat tip to Johannes Moritz for the security report.

The codename for 2019.1.4 is Jupiter.

Bug
  • Cannot process SNMPv3 Informs due to random Engine ID associated with users (Issue NMS-12473)

  • Downtime model change was not updated in the docs (Issue NMS-12520)

  • HQL Injection (Issue NMS-12572)

Enhancement
  • Support signing code in CircleCI (Issue NMS-12557)

Release Meridian-2019.1.3

Release 2019.1.3 is the fourth release in the Meridian 2019 series.

It contains a few bug fixes, most notably a fix for some NPEs as well as a performance issue in topology processing.

The codename for 2019.1.3 is Mars.

Bug
  • changing GUI date/timeformat breaks requisition update/import date/time display (Issue NMS-12428)

  • Inefficient locking in the TopologyUpdater class (Issue NMS-12443)

  • MIB Compiler fails with Null Pointer Exception (Issue NMS-12459)

  • The Karaf poller:test command is not location aware (Issue NMS-12460)

  • NPE while compiling a MIB (Issue NMS-12472)

Release Meridian-2019.1.2

Release 2019.1.2 is the third release in the Meridian 2019 series.

It contains a number of alarm classification bug fixes and performance improvements, flow enhancements, and more.

The codename for 2019.1.2 is Earth.

Bug
  • possible issue in JCIFS Monitor - contiously increase of threads - finally heap dump (Issue NMS-12407)

  • Wrong links in the Help/Support page (Issue NMS-12418)

  • Classification Engine reload causes OOM when defining a bunch of rules (Issue NMS-12429)

  • Cannot define a specific layer in topology app URL (Issue NMS-12431)

  • Classification UI: Error responses are not shown properly (Issue NMS-12432)

  • Classification Engine: The end of the range is excluded, which is not intuitive (Issue NMS-12433)

  • Ticket-creating automations are incorrectly enabled by default (Issue NMS-12439)

  • Enable downtime model-based node deletion to happen when unmanaged interfaces exist (Issue NMS-12442)

  • Improve alarmd Drools engine performance by using STREAM mode (Issue NMS-12455)

Enhancement
  • Refactoring of the Cassandra installation instructions (Issue NMS-12397)

  • Allow telemetry flows to balance across Kafka partitions (Issue NMS-12427)

  • Add system test for IpfixTcpParser (Issue NMS-12434)

  • Associate exporter node using Observation Domain Id (Issue NMS-12435)

Release Meridian-2019.1.1

Release 2019.1.1 is the second release in the Meridian 2019 series.

It contains a number of bug fixes, mostly related to alarm and event processing and potential resource leaks, as well as provisioning enhancements to SNMP profiles.

The codename for 2019.1.1 is Venus.

Bug
  • Readiness probe with Minion in Kubernetes with health:check does not work (Issue NMS-12120)

  • Cannot use poller:poll karaf command with WsManMonitor through Minions (Issue NMS-12365)

  • Strange behavior on used threads and file descriptors on Minion (Issue NMS-12366)

  • Upstream Drools Bug: From with modify fires unexpected rule (Issue NMS-12367)

  • "Page Not Found" in alarm-list when choosing number of alarms in dropdown-list (Issue NMS-12379)

  • Build failure during release for 25.1.0 in CircleCI (Issue NMS-12380)

  • backport missing patches from 25.1.0 to foundation-2019 (Issue NMS-12384)

  • Discovery does not honor exclude-range inside a definition (Issue NMS-12385)

  • Discovery exclude-range is not location-aware (Issue NMS-12386)

  • Update OpenJDK 11.0.4 to 11.0.5 (Issue NMS-12387)

  • Elasticsearch event forwarder manipulates in-flight event (Issue NMS-12390)

  • send-event.pl is broken after OpenNMS 25.1.0 update (Issue NMS-12392)

  • SNMP profile fitting is not triggered in some cases when MINION is involved (Issue NMS-12399)

  • Alarmd fails intermittently and OOMs (Issue NMS-12412)

  • SNMP Remove from definitions fails for definitions with profile label (Issue NMS-12413)

Enhancement
  • Create a step-by-step guide how to setup Kafka for Minions (Issue NMS-12368)

  • Enhance new snmp profiles to allow fitting to nodes inside requisitions without SNMP service associated to any IPs configured (Issue NMS-12396)

Release Meridian-2019.1.0

Release 2019.1.0 is the first release in the Meridian 2019 series.

The codename for 2019.1.0 is Mercury.

Bug
  • removed service will break BSM web ui (Issue NMS-9322)

  • Event parameters no longer preserve ordering (Issue NMS-9827)

  • The JMX-Cassandra service goes down for all the cluster when a single instance is down. (Issue NMS-10027)

  • deleting a BSM monitor while an alarm is active doesn’t clear the alarm (Issue NMS-10184)

  • default event description is incorrect (Issue NMS-10346)

  • Config tester doesn’t detect missing xml datacollection file (Issue NMS-10396)

  • BSM alarm severity is not being updated (Issue NMS-10578)

  • snmp authentication error traps with Enhanced Linkd / bridge discovery (Issue NMS-10582)

  • Zooming with Backshift is broken (Issue NMS-10635)

  • Karaf shell history thrown out with bathwater on upgrade (Issue NMS-10664)

  • Node detail page renders with no content when invalid node ID specified (Issue NMS-10679)

  • Apparent memory leak in JMX collector, possibly restricted to "weird" JMX transports (Issue NMS-10684)

  • Elasticsearch forwarding fails to recover after outage (Issue NMS-10697)

  • Flow rest results for top N queries are not returned in the correct order (Issue NMS-12104)

  • karaf.log appears on the root file system when running Minion/Sentinel on Ubuntu/Debian. (Issue NMS-12125)

  • WS-MAN doesn’t work with JDK 11 (Issue NMS-12235)

  • ReST API for meta-data doesn’t support JSON (Issue NMS-12272)

  • UI for meta-data is only present when using the horizontal layout (Issue NMS-12273)

  • Groups disappear in classification UI (Issue NMS-12291)

  • BSM simulation mode does not reset the last state (Issue NMS-12302)

  • Web Assets Dependency Rollup 2019-09-24 (Issue NMS-12320)

  • Memory leak in Drools engine for alarmd (Issue NMS-12322)

  • Threshold state keys do not incorporate the collected resource’s instance label (Issue NMS-12329)

  • Reportd generated reports cause: "No bean named '' is defined" in Persisted Reports (Issue NMS-12337)

  • InterfaceNodeCache doesn’t remove deleted nodes immediately (Issue NMS-12338)

  • Delivering a report with "-" in local part of email address is not working (Issue NMS-12342)

  • Install guide for R-core is broken for CentOS 8 (Issue NMS-12352)

  • Karaf feature install issue with opennms-core-tracing-jaeger (Issue NMS-12359)

  • Fix requisition cache when accessing the Requisitions UI via "Edit in Requisition" (Issue NMS-12360)

Enhancement
  • Refactor the compatibility matrix in the documentation (Issue NMS-9684)

  • Be able to change the number of rows for the pagination control on the Requisitions UI (Issue NMS-9793)

  • Documentation typo for /rest/ifservices on the developers guide (Issue NMS-9842)

  • Remove alarm-change-notifier plugin (Issue NMS-10658)

  • Add OpenTracing support for Camel (JMS) RPC (Issue NMS-10961)

  • Support large buffer sizes in Kafka Sink Layer (Issue NMS-11126)

  • Investigate OpenTracing for our RPC communications (Issue NMS-11177)

  • RPC Metrics (Issue NMS-11517)

  • Sink Metrics (Issue NMS-11540)

  • Add a command to show configuration diffs (Issue NMS-12129)

  • Add Web-Hook as delivery option (Issue NMS-12153)

  • Add reply-to field to notification emails (Issue NMS-12224)

  • Refactor Event Timestamps to ISO-8601 Format (Breaking Change) (Issue NMS-12263)

  • Improve robustness of CassandraBlobStore for async operations (Issue NMS-12274)

  • Clearing threshold states via shell should take effect immediately and not require restart (Issue NMS-12277)

  • BSM configuration breaks without being notifed (Issue NMS-12288)

  • List Kafka RPC/Sink topics, Expose Metrics on Karaf shell (Issue NMS-12294)

  • Create proper systemd files for OpenNMS, Minion and Sentinel (Issue NMS-12299)

  • Add ability to update definitions when SNMP profile changes (Issue NMS-12307)

  • Fix security vulnerability with jackson-databind (Issue NMS-12308)

  • Availability boxes on node pages including sub pages differ (Issue NMS-12321)

  • OpenNMS 25 Dependency Still Allows Old PostgreSQL Versions (Issue NMS-12341)

  • Update base container image to use CentOS 8 (Issue NMS-12353)

  • Remove floating OpenJDK dependencies in OCI build (Issue NMS-12354)

  • Detect and help resolve Karaf bootstrap issues (Issue NMS-12356)

  • Update CISCO-ENTITY-SENSOR-MIB threshold trap events to include alarm-data (Issue NMS-12362)

  • switch core/web-assets from yarn to npm (Issue NMS-12363)

  • Collect and display file descriptor statistics via JMX for OpenNMS and Minion (Issue NMS-12364)