Introduction

Because RabbitMQ is open source and constantly updated, it is very hard for developers to master its configuration for their business needs. That’s why Peregrine Connect has invested years of work building Neuron ESB with a user-friendly pub-sub system designed to simplify and enhance RabbitMQ integration for you.

Neuron ESB sets intelligent default values for retry mechanisms and for other settings such as transactions, publish confirms, quorum queues and clustering, so that your solution just works without you having to learn a lot about RabbitMQ, messaging, and quality of service options. Neuron also offers more advanced features on top of the RabbitMQ transport, such as complex conditional subscriptions, Poison Message handling, Monitoring and Alerting. The dynamic management of RabbitMQ infrastructure by Neuron ESB minimizes your direct interaction with the RabbitMQ servers, lowers your operational costs and makes you more productive.

We stand out from the competition by exposing nearly all RabbitMQ features for fine-tuning your messaging infrastructure when you’re ready. We also ensure our software stays up-to-date with the latest advances, security patches, and bug fixes as RabbitMQ evolves.

Best of all, our transport-agnostic pub-sub system enables you to develop and test solutions without RabbitMQ, using built-in transports (MSMQ and TCP). Upgrading to RabbitMQ transport in higher environments is as simple as flipping a switch.

With Peregrine Management Suite, you can monitor your messaging, application integration, APIs and data infrastructure from a single central portal, eliminating the need to switch between multiple UIs for database, queuing, integration, and log monitoring.

Overview

Neuron ESB employs a hierarchical, topic-based publish and subscribe model for message routing between parties (publishers and subscribers hosted in .NET applications), adapter, service, and workflow endpoints. Topic-based messaging is a way of abstracting endpoints from one another, enabling easy modification of existing solutions without disrupting other endpoints’ processing. This works effectively with Neuron ESB endpoints in one-way messaging scenarios. Although a Neuron ESB RabbitMQ Topic supports request/reply communication, that pattern should be avoided in high-concurrency and high-throughput situations.

Neuron ESB’s unique feature is its adaptability, allowing you to change the Quality of Service and underlying transport of topics to align with specific use cases. You can achieve durable, guaranteed, reliable messaging and in-memory routing by simply changing the transport property of a Neuron ESB topic as shown below:

When changes are made, the underlying Party API detects the changes made at the server and automatically reconfigures itself to accommodate the Topic configuration changes, even when deployed and hosted on remote machines. This means that developers using the Party API do not need to know about, work directly with, or individually program the underlying transport configuration, transaction, or Quality of Service (QoS) requirements. This is all controlled and managed at the server level. For the user who needs to publish messages, the code is as simple as this:

// Create an instance of a publisher

using (Publisher publisher = new Publisher("MyPublisher"))
{
    // catch any exceptions that may occur while connecting
    // to each individual topic
    PartyConnectExceptions exceptions = publisher.Connect();

    if (exceptions.Count < 1)
    {
        publisher.Send("MyTopic", "<Test>My Request</Test>");
    }
    else
    {
        // log the errors
    }
}

Neuron ESB provides several Transports that users can select for Topics, including Named Pipes, TCP, RabbitMQ and MSMQ. Some, such as MSMQ and RabbitMQ, provide durable, guaranteed, and reliable messaging. RabbitMQ-based Topics add features that enhance the manageability and flexibility of the underlying Transport as well as its fault tolerance, reliability, and security.

Version Support

Neuron ESB (as of version 3.7.5) supports RabbitMQ version 3.12.2 and Erlang 26 and higher. Some features, like Non-Blocking Poison Message Handling, require RabbitMQ version 3.9 or higher. Neuron ESB also uses version 6.5 of the RabbitMQ Client library. Using the Neuron ESB installer, users can optionally choose to install RabbitMQ and Erlang during the setup process, or can download these from the locations listed in our readme.html file or directly from the RabbitMQ web site.

Configuration

A Neuron ESB Solution (i.e., Application) must be configured with at least one RabbitMQ server before Topics can be configured to use the RabbitMQ Transport. The RabbitMQ configuration for a Neuron ESB Solution can be found on the RabbitMQ tab of a Deployment Group by navigating to Deployment->Environments->Deployment Groups within the Neuron Explorer. If using a Clustered RabbitMQ deployment or Quorum queues, multiple RabbitMQ machines can be entered for a deployment group.

When registering RabbitMQ servers for a Neuron ESB Deployment Group the following information is required:

Property | Description
Server | Name of the RabbitMQ server
Port | Port of the RabbitMQ server. -1 uses the default RabbitMQ port, 5672
Mgmt Port | Port of the RabbitMQ Management Portal. Default is 15672
vHost | The configured vHost of the RabbitMQ server. Default value is “/”. A unique vHost should be used for each solution and each deployment group
Username | Username to access RabbitMQ
Password | Password to access RabbitMQ

Although some properties may seem straightforward, others may not. For instance, the “Mgmt Port” represents the port of the RabbitMQ Management Plugin. Neuron ESB requires this plugin to be installed and configured for every registered RabbitMQ server, because it uses the plugin to query the health and message throughput rates that appear in both Endpoint Health and the RabbitMQ Message Management console in the Neuron ESB Explorer. When RabbitMQ is installed through the Neuron ESB installer option, Neuron ESB will automatically install and configure the plugin with the RabbitMQ service. More information regarding the RabbitMQ Management Plugin and how to install it can be found here: https://www.rabbitmq.com/management.html

When connecting to RabbitMQ, the default username “guest” cannot be used against the RabbitMQ management plugin if the server’s name is anything other than “localhost”. It is always recommended to create a dedicated username and password in RabbitMQ for Neuron ESB to use.

High Availability and Failover

Neuron ESB allows users to enter multiple instances of RabbitMQ servers to support the mirroring of the underlying queues that Neuron ESB will use for the Publishers and Subscribers created within the Neuron ESB Explorer. RabbitMQ can use mirroring, rather than Windows Failover Clustering, to achieve High Availability of messages. Quorum Queues can also be used for HA. More information regarding RabbitMQ HA and its configuration can be found here: https://www.rabbitmq.com/ha.html

When multiple servers are configured, at runtime Neuron will use the first server it can connect to as the primary message server. If for any reason that server becomes unavailable, Neuron ESB will automatically fail over to the next server in the list, continuing until it finds one with which it can establish a connection. Once a connection is established, publish and subscribe activities continue undisrupted at runtime; messages will not be lost, and the failover will be invisible to the processes and users of Neuron ESB. Internally (if configured), Neuron ESB detects connection issues, then caches and resends messages for which the original acks/nacks were never received from the RabbitMQ servers.
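The failover order described above can be pictured with a short sketch. It is written in Python purely for illustration and is not Neuron ESB's implementation; the function name `connect_first_available`, the `connect` callable, and the host names are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple, TypeVar

T = TypeVar("T")


def connect_first_available(
    servers: Sequence[str], connect: Callable[[str], T]
) -> Tuple[str, T]:
    """Try each registered server in list order; return the first that connects."""
    errors: Dict[str, Exception] = {}
    for server in servers:
        try:
            return server, connect(server)
        except ConnectionError as exc:
            # Remember why this server was skipped, then try the next one.
            errors[server] = exc
    raise ConnectionError(f"no RabbitMQ server reachable: {errors}")


if __name__ == "__main__":
    unavailable = {"rabbit-1"}  # hypothetical host that is currently down

    def fake_connect(server: str) -> str:
        # Stand-in for a real client connection attempt.
        if server in unavailable:
            raise ConnectionError(f"{server} is unavailable")
        return f"connection to {server}"

    print(connect_first_available(["rabbit-1", "rabbit-2", "rabbit-3"], fake_connect))
```

The same function can be called again with the same list whenever the current connection drops, which reproduces the "next server in the list" behavior.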

In addition to the HA configuration, Neuron ESB also supports Quorum Queues, which are described later in this document.

Multiple Environments and VHosts

A common practice among RabbitMQ users is to establish specific vHosts to mirror their deployment environments. For example, there may be vHosts named “Development”, “QA” and “Production”. Neuron ESB’s support for vHosts allows Neuron ESB deployment groups to be mapped to their respective vHost environments. This functionally isolates the underlying queues and exchanges that Neuron ESB creates for one deployment group from those of another on the same RabbitMQ server instance.
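One way to prepare a per-environment vHost and a dedicated account is through the RabbitMQ management HTTP API. The Python sketch below only builds the three PUT requests (create vHost, create user, grant permissions) and does not send them. The helper names, the account names, the `monitoring` tag, and the wide-open `.*` permission patterns are assumptions chosen for illustration, not requirements stated by Neuron ESB.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request


def mgmt_request(base_url, path, admin_user, admin_password, body=None):
    """Build a PUT request for the RabbitMQ management HTTP API (not sent here)."""
    token = base64.b64encode(f"{admin_user}:{admin_password}".encode()).decode()
    return Request(
        f"{base_url}/api/{path}",
        data=json.dumps(body or {}).encode(),
        method="PUT",
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
    )


def provisioning_requests(base_url, vhost, user, password, admin_user, admin_password):
    """Requests that create a vHost, a dedicated user, and that user's permissions."""
    # Names go into the URL path, so they must be percent-encoded ("/" becomes %2F).
    v, u = quote(vhost, safe=""), quote(user, safe="")
    return [
        mgmt_request(base_url, f"vhosts/{v}", admin_user, admin_password),
        mgmt_request(base_url, f"users/{u}", admin_user, admin_password,
                     {"password": password, "tags": "monitoring"}),  # tag is an assumption
        mgmt_request(base_url, f"permissions/{v}/{u}", admin_user, admin_password,
                     {"configure": ".*", "write": ".*", "read": ".*"}),
    ]
```

Each request can then be sent with `urllib.request.urlopen(request)` by an administrator account, once per environment such as “Development”, “QA” and “Production”.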

Once the RabbitMQ servers are registered with a Deployment Group, Topics and Parties can be configured to use the RabbitMQ Transport.

On startup of the solution, the Neuron ESB runtime will attempt to ensure the underlying RabbitMQ infrastructure is consistent with the running solution. It will attempt to delete any Exchanges and Queues that the current solution no longer needs and will add those required to support newly added Topics or Parties. It will not delete or modify any existing infrastructure created by other Neuron ESB solutions or third-party applications.
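The startup reconciliation can be thought of as a set difference. The Python sketch below is illustrative only: `plan_reconciliation` is a hypothetical helper, and treating a name prefix (such as the `Neuron.<InstanceName>.` prefix from the naming conventions in this document) as the test for "owned by this solution" is an assumption, since the actual ownership check is not described here.

```python
def plan_reconciliation(existing, required, prefix):
    """Return (to_create, to_delete) for one solution's RabbitMQ objects.

    Only names starting with `prefix` are ever candidates for deletion, so
    objects owned by other solutions or third-party applications are untouched.
    """
    existing, required = set(existing), set(required)
    owned = {name for name in existing if name.startswith(prefix)}
    to_create = sorted(required - existing)
    to_delete = sorted(owned - required)
    return to_create, to_delete


if __name__ == "__main__":
    on_broker = ["Neuron.DEFAULT.Orders", "Neuron.DEFAULT.Retired", "thirdparty.audit"]
    needed = ["Neuron.DEFAULT.Orders", "Neuron.DEFAULT.Billing"]
    print(plan_reconciliation(on_broker, needed, "Neuron.DEFAULT."))
```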

Neuron ESB Topic Configuration

The RabbitMQ Transport configuration is located on the Networking tab of the Topic by navigating to Messaging->Publish and Subscribe->Topics within the Neuron Explorer.

Neuron ESB RabbitMQ Transport Properties

Property | Category | Description
Queue Mode | General | Queue Mode can be set to either Default or Lazy. Lazy writes messages out to disk as soon as possible to keep memory pressure low, even if Persistence is not enabled. However, messages survive a broker restart only if Persistence is enabled. Ideally, Persistence and Lazy should be used together to reduce both the chance of message loss and memory consumption.
Quorum Queues | General | If true, Quorum Queues will be used. If this property is changed, it is highly recommended to check that the queues for this topic are empty before applying and saving the solution. Changing and saving this property while Neuron is running requires a restart of the topic’s RabbitMQ Publishing service before connectivity is restored. This can be done through Endpoint Health.
  -Quorum Initial Size | General | The number of nodes/replicas that the topic’s queues should be replicated to. There should be at least 3 nodes and the number of nodes should be odd. These nodes should also be registered in the Deployment Group.
  -Delivery Limit | General | The maximum number of times delivery of a message will be retried before the message is sent to the dead letter exchange.
Delivery Mode | Publish | Persistent (i.e. Durable) or NonPersistent. Persistence ensures the message is written to disk. If using Transactions, Persistence should be used. The tradeoff between Persistent and NonPersistent is memory versus CPU and disk I/O.
Transaction Type | Publish | Controls the level of reliability for messages. Either ‘None’, ‘PublisherConfirms’ or ‘Transactional’ can be selected. ‘PublisherConfirms’ uses an asynchronous Ack/Nack protocol and is essentially a batch transaction model, while ‘Transactional’ forces a commit/rollback on each message published. ‘None’ provides the greatest throughput, followed by ‘PublisherConfirms’; ‘Transactional’ provides the least.
  -Batch Size | Publish | Only for use with the PublisherConfirms transaction type. The number of messages that will be published to RabbitMQ in one PublisherConfirms batch.
  -Batch Confirm Timeout | Publish | Only for use with the PublisherConfirms transaction type. The number of seconds to wait, after the batch of messages has been published, to receive all ACKs/NACKs from the RabbitMQ server. Should be a value between 1 and 60.
  -Inactivity Timeout | Publish | Only for use with the PublisherConfirms transaction type. The number of minutes to wait after the last message sent before checking whether all ACKs/NACKs from RabbitMQ have been received. Should be a value between 1 and 5.
  -Resubmit UnAck’d Messages | Publish | Only for use with the PublisherConfirms transaction type. If true, messages that were published but not acknowledged will be republished. This should only occur when dealing with clustered/mirrored RabbitMQ instances. The republish occurs when a connection is reestablished with another server in the cluster. If set to true, Detect Duplicates should also be set to true if once-only delivery of a message needs to be enforced.
Time to Live | Publish | Default value is 1440. A value in minutes that specifies how long messages are valid for delivery before they expire (dead letter) and are transferred as failed messages into the Neuron Audit database.
Must be Routable | Publish | If set to true, the message must be routable by RabbitMQ. If the message cannot be routed to a destination queue, it will be stored as a failed message in the Neuron ESB database. Note: this will incur a significant performance penalty. It should never be needed, as Neuron ESB will dynamically create any missing queue or exchange on startup.
Prefetch Size | Receive | The number of messages that will be prefetched from the queue to the transport layer. A value of 0 means unlimited. A value of 1 will enforce ordered delivery to a specific subscriber as long as the Multi Threaded receive property is set to false.
Multi Threaded | Receive | If False, all subscribers (i.e. Parties) will use the default single-threaded consumer. If True, multiple threads will be used for delivery to the subscriber, each thread essentially representing a RabbitMQ consumer.
  -Number of Threads | Receive | The number of threads to use to dispatch to the Subscriber. Should not exceed 5. Default is 2.
Poison Message Enabled | Receive | Determines whether Poison Messages will be processed. If true, users can specify the retry cycle delay as well as the number of attempts to make before the message is sent to the failure audit database.
  -Handler Type | Receive | Determines whether the processing of Poison Messages by the subscriber will block the continued receipt of non-Poison Messages remaining in the subscriber’s underlying Queue. Blocking is the default and should be selected if Ordered Messaging is required. When set to Non Blocking, Neuron ESB uses the RabbitMQ Delayed Exchange. Rather than blocking at the subscriber level, the Poison Messages are published to a custom RabbitMQ Exchange and stored on disk for the duration of the Delay. This allows the subscriber to continue to receive non-Poison Messages.
  -Retry Cycle Delay | Receive | The delay between retries. This is the length of time that the system will wait before attempting to redeliver the message and send either an ack or nack. The default global limit for RabbitMQ is 30 minutes, at which point the consumer would shut down with a timeout error. If this error occurs, reduce the Delay or consider increasing the global limit on the RabbitMQ Server.
  -Max Retry Cycles | Receive | The maximum number of retries. This is the number of delivery attempts made after the first failure. The minimum value is 1.
Detect Duplicates | Receive | Only for use with the PublisherConfirms transaction type. If true, duplicate messages received by the Neuron ESB Party will be discarded. This can be used to provide once-only delivery.
  -Detection Window | Receive | Only for use with the PublisherConfirms transaction type. The amount of time (in minutes) for which metadata of previously received messages is kept in memory to search against for duplicate messages.
  -Report Duplicates | Receive | Only for use with the PublisherConfirms transaction type. If true, a discovered duplicate message is logged as a Warning in the Neuron ESB Windows Event log.
Shutdown Handler | Receive | Determines how to handle any messages currently being processed when the Party or Hosting Endpoint does a controlled shutdown or disconnect. ‘ReQueue’ will return the message to the underlying queue, whereas ‘AuditFailure’ will remove the message from the queue and record it as a Failed Message in the Neuron ESB Failed Message report.
SSL Enabled | Security | Connect to the RabbitMQ Server using only SSL.
  -Port | Security | SSL Port for all RabbitMQ server connections.
  -SSL Protocol | Security | SSL Protocol to use for all RabbitMQ server connections.
  -Client Authentication | Security | Require Neuron ESB to provide the RabbitMQ Server with a client certificate to authenticate against.
     -Certificate | Security | Select a client certificate configured in the Security section of the Neuron ESB Explorer to authenticate against the RabbitMQ Server.
     -Passphrase | Security | Passphrase for the client certificate, if one exists.

The values of these properties control how the RabbitMQ transport functions for each Publisher or Subscriber that has a subscription to the configured Topic. In most cases the internal implementation of these properties has changed to provide better performance, scalability, reliability and fault tolerance.
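To make the interaction between Resubmit UnAck’d Messages, Detect Duplicates and Detection Window more concrete, here is a minimal Python sketch of time-windowed duplicate detection. It is not Neuron ESB's implementation; the class name, the use of a message ID as the lookup key, and the injectable clock are assumptions made so the idea is easy to follow and test.

```python
import time


class DuplicateDetector:
    """Remember message IDs for a fixed window and flag repeats within it."""

    def __init__(self, window_minutes, clock=time.monotonic):
        self._window = window_minutes * 60.0  # the window is configured in minutes
        self._clock = clock
        self._seen = {}  # message id -> time first seen (dicts keep insertion order)

    def is_duplicate(self, message_id):
        now = self._clock()
        # Evict entries older than the window; the oldest entries come first.
        for known_id in list(self._seen):
            if now - self._seen[known_id] <= self._window:
                break
            del self._seen[known_id]
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        return False


if __name__ == "__main__":
    detector = DuplicateDetector(window_minutes=5)
    for message_id in ["m1", "m2", "m1"]:
        action = "discard" if detector.is_duplicate(message_id) else "process"
        print(message_id, action)
```

A republished message that arrives after the window has elapsed is treated as new, which is why the window should be at least as long as the longest expected failover and resubmission delay.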

Publishers and Subscribers

Once a Neuron ESB Topic is configured to use RabbitMQ as an underlying transport, any number of sub topics can be added. Sub topics allow for a more granular and descriptive pub/sub model for business applications as shown below.

Within the Neuron ESB Explorer publishers and subscribers to Topics and/or sub topics can be created that support more complex subscription patterns, conditions and restrictions than what the RabbitMQ Transport alone supports. It’s one of the benefits of using Neuron ESB with RabbitMQ. Publishers and Subscribers can be created within the Neuron ESB Explorer:

Each line item in the grid represents a subscription to either a Topic or Subtopic. Clicking the “Edit Subscriptions” toolbar button launches the Edit Subscription dialog where subscriptions can be further defined and modified.

For instance, the picture above depicts a Publisher’s subscription properties. Any Topic, subtopic or Wildcard can be selected and added to the subscriptions section on the right-hand side. In this depiction the root topic, “Account”, and therefore all subtopics and wildcards derived from it are configured to use the RabbitMQ transport. In Neuron, any mix of Topics can be added, regardless of their underlying transport. In Neuron ESB, each subscription for a Publisher or Subscriber is composed of a Topic or subtopic that may or may not include wildcards and permission restrictions. That subscription can be further refined by either selecting a previously created “condition”, or creating a condition on the fly using the dialog box pictured below:

Conditions can be simple or complex and can include custom message properties, Neuron ESB Message Header properties or anything in the body of the message. The line items in a condition can be combined with AND, OR, or XOR. After a Publisher/Subscriber reads a message from its underlying queue, the condition on its subscription is evaluated to determine whether the message should be discarded or processed.
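As a rough illustration of how such a condition might be evaluated after a message is read from the queue, consider the Python sketch below. The message shape, the `evaluate_condition` helper, and strict left-to-right evaluation without operator precedence are all assumptions; the actual evaluation rules used by Neuron ESB are not described in this document.

```python
import operator

COMBINERS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def evaluate_condition(message, clauses):
    """Fold clauses left to right; each clause is a (combiner, predicate) pair.

    The combiner of the first clause is ignored. A subscription with no
    clauses accepts every message.
    """
    result = None
    for combiner, predicate in clauses:
        value = bool(predicate(message))
        result = value if result is None else COMBINERS[combiner](result, value)
    return True if result is None else result


if __name__ == "__main__":
    message = {"headers": {"Priority": "High"}, "body": "<Order><Total>250</Total></Order>"}
    clauses = [
        ("and", lambda m: m["headers"].get("Priority") == "High"),  # header property
        ("and", lambda m: "<Total>" in m["body"]),                  # message body
    ]
    print("process" if evaluate_condition(message, clauses) else "discard")
```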

Management

The Neuron ESB RabbitMQ Transport channel dynamically creates the necessary underlying RabbitMQ based infrastructure (e.g. Queues, Exchanges, Bindings, etc.) anytime a Neuron ESB Topic starts up. This eliminates the need for administrators to manually create and maintain the RabbitMQ infrastructure Neuron ESB requires. Consequently, this makes the underlying RabbitMQ Transport virtually invisible to anyone using the Neuron ESB Party API, and alleviates the need for administrators to manage additional infrastructure requirements.

Neuron ESB also keeps the infrastructure in sync with changes made within the Neuron ESB Explorer for a specific solution. For example, if Topics/Parties are either renamed or deleted, their respective RabbitMQ Queues and Exchanges will be modified to reflect the changes. For deletions, Neuron ESB will delete its respective underlying RabbitMQ Queues only if there are no messages remaining in the Queue.

Naming Conventions

Neuron ESB uses a general naming convention for all underlying RabbitMQ Queues and Exchanges that it maps to Topics and Parties. Queues and Exchanges use the following naming conventions:

RabbitMQ Exchanges:

    Neuron.<InstanceName>.<Topic>
    Neuron.<InstanceName>.<Topic>_DeadLetterExchange

RabbitMQ Delayed Poison Message Exchanges:

    Neuron.<InstanceName>.<Topic>.P

RabbitMQ Queues:

    Neuron.<InstanceName>.<Topic>.<Party>
    Neuron.<InstanceName>.<Topic>.<Party>_DeadLetters

Where:

  • InstanceName = Name of the Neuron ESB runtime Instance running the solution
  • Topic = The name of the Neuron ESB Topic within the solution
  • Party = The name of the Neuron ESB Party within the solution
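The conventions above can be expressed as two small helpers, which may be handy when searching for Neuron-created objects in the RabbitMQ management UI or in scripts. This Python sketch simply applies the patterns listed above; the function names and dictionary keys are hypothetical, and how subtopics are encoded in the `<Topic>` segment is not covered here.

```python
def exchange_names(instance, topic):
    """Exchange names used for a Topic, following the patterns listed above."""
    base = f"Neuron.{instance}.{topic}"
    return {
        "exchange": base,
        "dead_letter_exchange": f"{base}_DeadLetterExchange",
        "delayed_poison_exchange": f"{base}.P",
    }


def queue_names(instance, topic, party):
    """Queue names used for a Party's subscription to a Topic."""
    base = f"Neuron.{instance}.{topic}.{party}"
    return {"queue": base, "dead_letter_queue": f"{base}_DeadLetters"}


if __name__ == "__main__":
    # "DEFAULT", "Account" and "MyPublisher" are example values only.
    print(exchange_names("DEFAULT", "Account"))
    print(queue_names("DEFAULT", "Account", "MyPublisher"))
```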

Monitoring

Neuron ESB Explorer

Neuron ESB provides basic monitoring of its RabbitMQ based Topics and their respective Publishers and Subscribers. Internally Topics are mapped to RabbitMQ Exchanges while Publishers and Subscribers are mapped to underlying Queues.

The Queues that represent the Neuron ESB Publishers and Subscribers can be directly monitored and managed through the Neuron ESB Explorer by navigating to the Deployment -> Manage -> RabbitMQ screen as shown below.

The Manage RabbitMQ Queues screen provides a grid containing the following live Queue Metrics:

  • Queue Name
  • Durable
  • Memory (Bytes)
  • Messages (Ready)
  • Messages (Unacknowledged)
  • Messages Total
  • Connections
  • Incoming (msg/sec)
  • Delivery (msg/sec)
  • Acknowledged (msg/sec)

The metrics are obtained by regularly querying the RabbitMQ Server Management Plugin over HTTP. Additionally, users can purge all the messages by selecting a Queue and then selecting “Purge Messages” from the right click context menu.
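For readers who want to see where such figures come from, the Python sketch below queries the management plugin's `/api/queues/{vhost}` endpoint and maps the returned fields to the grid columns. The mapping between RabbitMQ field names and the column names is an assumption made for illustration (the Connections column is omitted because no obvious per-queue counterpart is assumed here), and the host and credentials are placeholders.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def queue_row(q):
    """Map one queue object from GET /api/queues/{vhost} to grid-style columns."""
    stats = q.get("message_stats", {})

    def rate(key):
        # Rate details are absent until the queue has seen that kind of activity.
        return stats.get(key, {}).get("rate", 0.0)

    return {
        "Queue Name": q["name"],
        "Durable": q.get("durable", False),
        "Memory (Bytes)": q.get("memory", 0),
        "Messages (Ready)": q.get("messages_ready", 0),
        "Messages (Unacknowledged)": q.get("messages_unacknowledged", 0),
        "Messages Total": q.get("messages", 0),
        "Incoming (msg/sec)": rate("publish_details"),
        "Delivery (msg/sec)": rate("deliver_get_details"),
        "Acknowledged (msg/sec)": rate("ack_details"),
    }


def fetch_queue_rows(host, vhost, user, password, mgmt_port=15672):
    """Query the management plugin over HTTP and return one row per queue."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # The vHost is part of the URL path, so "/" must be encoded as %2F.
    url = f"http://{host}:{mgmt_port}/api/queues/{quote(vhost, safe='')}"
    request = Request(url, headers={"Authorization": f"Basic {token}"})
    with urlopen(request, timeout=10) as response:
        return [queue_row(q) for q in json.load(response)]
```

Calling `fetch_queue_rows("rabbit-1", "/", "neuron", "secret")` on a timer would approximate the regular polling described above.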

Another monitoring feature is provided by the Neuron ESB Endpoint Health screen. This can be accessed by navigating to Activity -> Health -> Endpoint Health and clicking the “Start Monitoring” toolbar button. Neuron ESB Endpoint Health displays real-time activity monitoring for the Endpoint Host environments as well as the different endpoints that they host. The interface lets users group and sort, and lists all machines in the selected deployment group. Endpoint Health has a resizable horizontal divider that separates Neuron ESB Messaging entities such as Topics, Adapter and Service Endpoints hosted in the Neuron ESB Runtime service from those hosted in the Endpoint Host environment. A context menu at the entity level allows users to restart any selected Topic, Endpoint or Endpoint Host.

All RabbitMQ Topics will appear in the top monitoring pane. For example, in the picture above, all the RabbitMQ Topics are red, their Status has changed to FAILED and both Errors and Warnings columns are showing positive numbers. This was generated when the RabbitMQ server was shut down. All the details of the Errors and Warnings are always reported directly in the Neuron ESB Log Files as well as the Neuron ESB Windows Event Log as shown in the example below:

Users can also access all the information exposed in the Endpoint Health Screen through an HTTP API URL http://{Machine}:51002/help/index#/EndpointHealth.

Peregrine Connect Management Suite

Introduction

The Peregrine Connect Management Suite provides a comprehensive web-based portal that allows organizations to securely manage and monitor all Neuron ESB deployment environments, the resources within them, and the applications deployed to them. The Management Suite stands well above all other competitors in the field when it comes to the daily management and monitoring of deployed solutions. Whether these environments are deployed on premise or in the cloud, once managed, API Resources can be created and secured, Business Processes scheduled, Alerts operationalized and subscribed to, and historical and real-time monitoring made available for the applications deployed. Dashboards can be created for swift and easy access to the features used the most. The Management Suite provides organizations with hawk-like visibility into current and historical performance metrics.

Management Suite Dashboard – Users can create custom Dashboards within Management Suite to provide convenient access and navigation to features, applications and endpoints they use the most. For example, a user may add a dashboard that shows Application Health Information about specific endpoints such as Ordering and Shipping APIs.

Once Neuron ESB environments are configured for management through the Peregrine Connect Management Suite, new capabilities are provided to organizations using Neuron ESB:

  • Task based User and Role Security
  • Environment Management and Monitoring
  • Application Monitoring and Reporting
  • Alerting and Notifications
  • Job Scheduling
  • API Management

One of the many benefits of using the Management Suite with a Neuron ESB Solution is the advanced context-based monitoring and alerting provided for Topics and Endpoints that are configured to use the underlying RabbitMQ Transport. This advanced alerting and monitoring for RabbitMQ-based endpoints and Topics is included in the Management Suite 2.0 release.

RabbitMQ Monitoring

The Peregrine Management Suite provides extended monitoring for any Topics, Parties, or Endpoints that are configured to use the RabbitMQ transport. Extended RabbitMQ monitoring is available in several sections, including Operations, Endpoint Health, and Application Monitoring. For example, operational administrators in charge of the resources in an environment (i.e., machines, runtimes, etc.) can access the RabbitMQ monitoring on a dedicated tab for any specific environment (i.e., Dev, Stage, QA, Production, etc.). The top half of that screen presents an overview of RabbitMQ, showing various essential graphs and data points.

The displayed graphs in the top section represent the following data points and counts:

  • Ready Messages: This graph displays the total number of messages ready for consumption in the queues.
  • Incoming Messages: The number of incoming messages to the system is shown in this graph.
  • Publish Rate: This graph illustrates the rate at which messages are being published to the queues.
  • Queues: The total number of queues in the RabbitMQ system is shown in this graph.
  • Unacknowledged Messages: The graph presents the total number of messages that have not been acknowledged by consumers.
  • Outgoing Messages: This graph provides the count of messages sent from the RabbitMQ system.
  • Connections: The total number of connections established with the server is represented in this graph.
  • Consumers: The graph indicates the total number of consumers registered with RabbitMQ.
  • Channels: This graph displays the total number of communication channels in use.

In addition to the graphs, the overview section also includes other essential RabbitMQ data:

  • Memory Consumption: This data is accompanied by the high watermark threshold, offering insights into memory usage, and potential resource constraints.
  • Disk Free Space: The amount of free disk space is provided, along with the low watermark threshold, which helps in monitoring disk utilization.
  • Uptime: This shows the duration for which RabbitMQ has been running since its last start.
  • IP Address & Port: The network information used by RabbitMQ is specified, including the IP address, and port number.
  • RabbitMQ Version: The version of RabbitMQ running on the system is mentioned.
  • Erlang Version: This information indicates the version of Erlang that RabbitMQ is using.
  • Node Name: The node name assigned to the RabbitMQ instance is included.

The Bottom Section of the same page provides details of the RabbitMQ Exchanges and Queues that are maintained as part of the Neuron Solution architecture. Users can drill into each to get specific real time information as well as to create alerts. When a user clicks on the “Exchanges” hyperlink, they are presented with the following section where a table displays a list of the Exchanges that are represented by their Neuron Topic names along with various real time metrics.

The table includes the following fields:

  • Error: This field shows the total number of errors associated with the specific topic represented by the exchange.
  • Warning: The number of warnings related to the topic is presented in this field.
  • No. of Incoming Messages: The total count of incoming messages received by the exchange is displayed in this field.
  • No. of Outgoing Messages: The total count of outgoing messages sent by the exchange is displayed in this field.
  • Message Rate for Incoming Messages: The rate at which incoming messages are processed by the exchange is visualized in this graph.
  • Message Rate for Outgoing Messages: This graph illustrates the rate at which outgoing messages are being sent by the exchange.

Users can click on the expansion icon on the far right to navigate to an expanded view of the Exchanges table that displays several inline time-indexed charts of various metrics as shown below.

Just as with Exchanges, when a user clicks on the “Queues” hyperlink, they are presented with the following section where a table displays a list of the Queues that are represented by their Neuron Party (subscriber/publisher) and the Neuron Topic that they have a subscription to. 

The table includes the following fields:

  • Status: Indicates whether the queue is currently idle or running.
  • Instance: Specifies the Neuron Instance associated with the queue.
  • Topic: Represents the name of the topic associated with the queue.
  • Party: Refers to the name of the party associated with the queue.
  • Durable: Indicates whether the queue is durable (able to survive server restarts).
  • Memory: Shows the amount of memory utilized by the queue.
  • Ready: Displays the number of messages that are ready for consumption in the queue.
  • Unacked: Indicates the number of unacknowledged messages in the queue.
  • Total: Represents the total number of messages present in the queue.
  • Consumers: Shows the number of consumers currently connected to the queue.
  • Incoming Message Rate: Illustrates the rate at which incoming messages are arriving at the queue.
  • Delivered Message Rate: Represents the rate at which messages are being delivered from the queue.

Users can click on the expansion icon on the far right to navigate to an expanded view of the Queues table that displays several inline time-indexed charts of various metrics as shown below.

Each Neuron Party and Topic pairing represents a RabbitMQ Queue. By clicking the expansion icon, users can drill directly into the Party to get an overall view of all the real-time and historical activity of the underlying queue being monitored as pictured below:

This expanded view presents detailed information and graphs for a single RabbitMQ queue based on the party name. It includes the following values and graphs:

Ready Messages:

  • Count: Total number of messages ready for consumption in the queue.
  • Rate: Rate at which ready messages are being processed.
  • Size: Size of the messages that are ready for consumption.

Unacknowledged Messages:

  • Count: Total number of messages that have not been acknowledged by consumers.
  • Rate: Rate at which unacknowledged messages are being processed.
  • Size: Size of the unacknowledged messages.

Total Messages:

  • Count: Total number of messages currently present in the queue.
  • Rate: Rate at which messages are being processed in the queue.
  • Size: Total size of all messages in the queue.

Consumer:

  • Count: Number of connected consumers to the queue.
  • Utilization: Consumer utilization rate, indicating how actively consumers are processing messages.

Publish Rate:

  • Messages Count: Total number of messages being published to the queue.
  • Rate: Rate at which messages are being published to the queue.

Outgoing Messages:

  • Count: Total number of outgoing messages from the queue.
  • Rate: Rate at which outgoing messages are being dispatched from the queue.

The Management Suite provides similar monitoring for the Event Process System, which also uses RabbitMQ to move events from the Neuron runtimes to the Elasticsearch storage of the Management Suite.

RabbitMQ Alerts

The Peregrine Connect Management Suite provides operational and application-level alerting for the Neuron ESB runtime environment and its dependencies. Alerting is a powerful feature that organizations can use proactively to be notified when anything within an environment exceeds the thresholds defined by “Alert Rules”. For instance, operations staff may want to be notified if the servers or runtime processes that the application depends on exceed a certain level of memory or CPU consumption. Alternatively, the business may want to know when a specific service in an application exceeds N requests over a specified period or has been idle. Alert notifications can be sent via email or SMS text messages.

Alert Rules can be created, edited, and managed within the Alert Management section of Peregrine Management Suite. The image below shows the Alert Management module where Alert Rules have yet to be created.

Alert Rules

An Alert Rule can be thought of as a combination of properties, attributes, and a Condition. On the first page of the wizard, users can select the appropriate Severity level of the alert, its visibility to other users (i.e., Public or Private), and the specific Environment that the rule should be applied against. Once the Environment is selected, the Event Source that the Alert Rule will be created against can be selected. The following Event Sources are currently supported; the last six are specific to RabbitMQ:

  • Neuron ESB Machines
  • Neuron ESB Runtime
  • Neuron ESB Endpoint Host
  • EPS Machine
  • EPS Service
  • Topics
  • Adapter Endpoints
  • Service Endpoints
  • Workflow Endpoints
  • Peregrine Scheduler
  • Neuron RabbitMQ Queue
  • Neuron RabbitMQ Machine
  • EPS RabbitMQ Machine
  • EPS RabbitMQ Queue
  • Neuron Exchange
  • EPS Exchange

Depending on the Event Source the user selects, they can choose a specific application, or all applications deployed to the selected Environment.  The Event Source also controls the source entity drop down located beneath it. For example, if a user selected Endpoint Host or Adapter Endpoint as an Event Source, they could in turn select the name of the Endpoint Host or Adapter Endpoint (Operands) to create the Condition against. A user could also elect to apply the Alert Rule to all the entities that match the Source type.

The last row of the Create Alert Rule Wizard is used to define the Condition that causes a notification to fire. The Condition can contain one or two Operands: the name of the entity to apply the Condition against, and the property of the entity to evaluate against the Condition. Conditions always evaluate to either True or False. The Operands are represented by the first two dropdown boxes as shown below. In some cases, users can select “Any” instead of a specific entity name. This can be useful if a user wants a rule to be applied at the entity level rather than just against a named entity:

For example, if “Neuron RabbitMQ Queue” is selected as the source, the user would then select the appropriate Topic and Party that the underlying queue is mapped to within the application. From there, the user would select the property to evaluate the Condition against. In this example, the user could select one of the following properties:

  • Consumer Count: Consumer count of the selected queue in the RabbitMQ service configured with Neuron.
  • Total Message: Total number of messages in the selected queue in the RabbitMQ service configured with Neuron.
  • Unacknowledged Message: Number of unacknowledged messages in the selected queue in the RabbitMQ service configured with Neuron.
  • Consumer Utilization Percentage: Percentage of the time that the queue is able to immediately deliver messages to consumers.
  • Memory Consumed: Memory consumed by the selected queue in the RabbitMQ service configured with Neuron.
  • No Of Ready Message: Number of messages ready to be delivered in the selected queue in the RabbitMQ service configured with Neuron.
  • Depth Message Change Per Second: How much the queue depth has changed per second in the most recent sampling interval.

In some cases, the selection of the Property may define the entire Condition, as in the case of Online or Offline. In most cases, however, users will need to select a specific Operator so that the Condition defines a property exceeding or falling below a certain threshold. For example, an organization may want to be notified if an Endpoint Host exceeds 75% CPU utilization for longer than a 2-minute period. Once the property is selected, any available Operators, such as Greater Than or Less Than, will be displayed in the drop down located to the right of the property field.
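To make the shape of a Condition concrete, the following minimal Python sketch shows how a property, an Operator, and a threshold could combine into a True/False result, including the "for longer than" window from the CPU example. The names `AlertCondition`, `evaluate`, and `holds_for` are hypothetical and are not part of the Management Suite. Only the two Operators named above are modelled, and how the product actually samples and evaluates metrics is not described in this document.

```python
import operator
from dataclasses import dataclass
from typing import Sequence

# Operators named in the text; this mapping is an illustrative assumption.
OPERATORS = {"Greater Than": operator.gt, "Less Than": operator.lt}


@dataclass(frozen=True)
class AlertCondition:
    property_name: str   # e.g. "CPU Usage"
    operator_name: str   # key into OPERATORS
    threshold: float

    def evaluate(self, value: float) -> bool:
        """Return True when a single sampled value satisfies the condition."""
        return OPERATORS[self.operator_name](value, self.threshold)

    def holds_for(self, samples: Sequence[float]) -> bool:
        """Return True only if every sample in the window satisfies the condition."""
        return len(samples) > 0 and all(self.evaluate(v) for v in samples)


# Hypothetical rule: CPU usage above 75% for every sample in the window.
cpu_rule = AlertCondition("CPU Usage", "Greater Than", 75.0)
print(cpu_rule.evaluate(80.0))
print(cpu_rule.holds_for([80.0, 91.5, 76.2]))
print(cpu_rule.holds_for([80.0, 60.0, 76.2]))
```

Requiring every sample in the window to match is one simple way to express "for longer than a 2-minute period"; a single dip below the threshold resets the condition.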

Each Event Source has its own set of Alert Rule Conditions that can be selected as listed below:

Neuron/EPS RabbitMQ Machine:

  • Offline: RabbitMQ service configured with EPS is offline.
  • Online: RabbitMQ service configured with EPS is online.
  • CPU Usage: CPU usage of the system.
  • Memory Usage: Memory usage of the system.
  • Disk Usage: Disk usage of the system.
  • File Descriptors: Percentage of File Descriptors in use in the RabbitMQ service configured with EPS.
  • Connection Count: Total number of connections to the RabbitMQ service configured with EPS.
  • TCP Sockets: Percentage of TCP Sockets in use in the RabbitMQ service configured with EPS.
  • Deadletters Length: Total number of messages in Deadletters queue in the RabbitMQ service configured with EPS.

Neuron/EPS RabbitMQ Queue:

  • Consumer Count: Consumer count of the selected queue in the RabbitMQ service configured with Neuron.
  • Total Message: Total number of messages in the selected queue in the RabbitMQ service configured with Neuron.
  • Unacknowledged Message: Number of unacknowledged messages in the selected queue in the RabbitMQ service configured with Neuron.
  • Consumer Utilization Percentage: Percentage of the time that the queue is able to immediately deliver messages to consumers.
  • Memory Consumed: Memory consumed by the selected queue in the RabbitMQ service configured with Neuron.
  • No Of Ready Message: Number of messages ready to be delivered in the selected queue in the RabbitMQ service configured with Neuron.
  • Depth Message Change Per Second: How much the queue depth has changed per second in the most recent sampling interval.
  • Deadletters Length: Total number of messages in Deadletters queue in the RabbitMQ service configured with EPS.

Neuron/EPS RabbitMQ Exchange:

  • Publish In Count: Count of messages published “in” to an exchange, i.e. not taking account of routing.
  • Publish In Rate: How much the exchange publish-in count has changed per second in the most recent sampling interval.
  • Publish Out Count: Count of messages published “out” of an exchange, i.e. taking account of routing.
  • Publish Out Rate: How much the exchange publish-out count has changed per second in the most recent sampling interval.

There are also “canned” system-level alerts that ship preconfigured, out of the box, for RabbitMQ servers configured in a Neuron solution. Global administrators are automatically subscribed to these alerts. They are as follows:

  • EPS RabbitMQ High Memory Usage Alarm: RabbitMQ configured to EPS has triggered its memory alarm, which fires when memory use goes above the configured watermark (limit).
  • EPS RabbitMQ Low Disk Free Available Alarm: RabbitMQ configured to EPS has triggered its low disk free available alarm, which fires when free disk space drops below the configured watermark (limit).
  • Neuron RabbitMQ High Memory Usage Alarm: RabbitMQ configured to Neuron Instance has triggered its memory alarm, which fires when memory use goes above the configured watermark (limit).
  • Neuron RabbitMQ Low Disk Free Available Alarm: RabbitMQ configured to Neuron Instance has triggered its low disk free available alarm, which fires when free disk space drops below the configured watermark (limit).

Although Alert rules can be created in the Alert Management section, they can also be created from the ellipsis action menu accessed directly from a monitoring page as shown below:

Notification Messages

Once the Alert Details have been entered, the user can define the message that will be sent out if the Condition evaluates to True. This is done on the Message page of the Create Alert Rule Wizard. This page lets the user choose the email body type (i.e., Plain Text or HTML), the subject line (which can be different for SMS text messages), and the actual Body of the message to send out.

Monitoring

Once Alert Rules have been created, Administrators (and others who have specific security rights) can view Alert Rule activation activity within the Alerts Reporting section of Peregrine Management Suite.

The full-page view of the Alert Reporting section offers a detailed picture of all the Alert Rule activations across all Applications deployed to an organization’s environment.  The image below shows the full-page view of the Alert Reporting section, displaying Alert Rule activations in a chart view. Alert activations are grouped by Application, Source, Severity, and Metric. Users can filter Alerts by application and/or their own subscription.

Actions and Logging

In many areas within the Management Suite, users can select a RabbitMQ Queue and choose to purge the contents of the queue. Similarly, they can select any Endpoint or Topic and choose to restart or recycle it from the action context menu, as shown below:

Additionally, all RabbitMQ Server log files can be read and searched within the Management Suite portal. Users no longer need to find, search, and parse RabbitMQ server logs on their own. The logs are now presented in context and time indexed with every element of the solution.

Performance and Reliability

Neuron ESB provides several features to the underlying RabbitMQ Transport to enhance reliability, error reporting, and functionality while at the same time significantly increasing the performance of certain operations and message throughput.

Serialization Format

Neuron ESB uses a custom serialization method in which only the body of the Neuron ESB Message is published, while the necessary internal Neuron ESB Message headers are serialized as RabbitMQ custom header properties. This has several advantages: it reduces CPU utilization, allows the message body to be accessed without proprietary methods, and reduces the overall payload size that the underlying RabbitMQ infrastructure must work with.

RabbitMQ Quorum Queues

Neuron Topics using RabbitMQ as the transport include support for the Quorum queue type. The Quorum queue type offers increased data safety and equal or better throughput for the underlying topic’s queues when compared with classic, durable, and mirrored queues.

Quorum queues are implemented by RabbitMQ using a durable, replicated FIFO queue based on the Raft consensus algorithm (more information on Raft can be read here: https://raft.github.io/).  They are desirable when data safety is a top priority and should be used as the default option where replicated queues are desired.

More information about RabbitMQ and their Quorum Queue implementation can be found here: https://www.rabbitmq.com/quorum-queues.html  

Requirements  

  1. A RabbitMQ server(s) with minimum version 3.8.0 
  2. A RabbitMQ cluster with at least 3 RabbitMQ nodes 
  3. A Neuron ESB installation with minimum version 3.7.5.327 

Setting up the Neuron Solution 

The following screenshot shows a new property for RabbitMQ Topics in the Networking tab called “Quorum Queues.” By setting this property to true, Neuron will attempt to create the underlying topic’s queues using the “x-queue-type” RabbitMQ argument set to “quorum” (as opposed to “classic”). 

After enabling the Quorum Queues property, the properties “Quorum Initial Size” and “Delivery Limit” will be shown: 

Quorum Queue Properties

  • Quorum Queues: This setting determines if Quorum Queues will be used for the underlying Topic’s RabbitMQ queues. This does not affect the type of queue used for the Dead Letter Exchange. If a Quorum Queue is desired for the Neuron Instance’s Dead Letter Exchange, the setting “UseQuorumQueueForDeadLetters” found in the appSettings.config file should be set to true.
  • Quorum Initial Size: The number of nodes that will participate in the Quorum and the number of nodes that the queues will be replicated to initially.
  • Delivery Limit: Used for poison message handling. This is the number of attempts that will be performed to redeliver a message if the first attempt to deliver fails. This is implemented as a Queue Policy on the Neuron Topic’s underlying queues.
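For readers who want to see what these settings look like at the RabbitMQ level, here is a small Python sketch that builds the optional queue arguments a client would pass when declaring a quorum queue. The “x-queue-type” argument comes from the text above. Mapping Quorum Initial Size to RabbitMQ's “x-quorum-initial-group-size” argument is an assumption about how Neuron applies the setting, and the function name is hypothetical.

```python
def quorum_queue_arguments(initial_size: int) -> dict:
    """Build RabbitMQ queue-declare arguments for a quorum queue.

    initial_size stands in for the "Quorum Initial Size" property (assumed mapping).
    """
    if initial_size < 1:
        raise ValueError("initial_size must be at least 1")
    return {
        "x-queue-type": "quorum",                      # as opposed to "classic"
        "x-quorum-initial-group-size": initial_size,   # initial number of replicas
    }


args = quorum_queue_arguments(3)
print(args)
```

With a client library, a dictionary like this would typically be supplied as the `arguments` of a durable queue declaration. Neuron performs the declaration itself, so this is only an illustration of the underlying mechanism.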

If the Quorum Queues property is changed, applied, and saved in a running Neuron Solution, the Neuron Topic’s RabbitMQ Publishing will automatically be restarted. It can also be manually restarted either in Neuron Explorer’s Endpoint Health page, or by restarting the entire ESB Service.  Otherwise, any Neuron Parties subscribed to the topic will remain disconnected and any Parties attempting to connect to the topic will not succeed.  Any messages published during this time will fail and be audited to the Failed Messages database table.   

After changing queue types, restarting the Publishing service will cause the existing queues to be deleted and recreated with the desired type if there are no messages still in the queue. If there are messages in the queue, the queues will not be deleted, and the topic will fail to reconfigure.  Therefore, it is highly recommended to drain the queue before saving the Neuron Solution if the queue type has changed (i.e., the “Quorum Queues” property was switched). 

After the Publishing Service restarts, any Neuron Parties that were disconnected will automatically reconnect. 

Delivery Limit 

The Delivery Limit mechanism is used for poison message handling. A poison message is one that a consumer repeatedly requeues (possibly due to a consumer failure), so that the message is never completely consumed and positively acknowledged, and therefore can never be marked for deletion by RabbitMQ. This property is implemented in RabbitMQ as a Queue Policy (the policy property is called “delivery-limit” and is of type Number). Neuron handles the creation and deletion of this policy automatically. The policy can’t be used for Classic type queues.

The format of the created policy’s name is as follows: 

    NEURON.<Neuron Instance Name>.<Topic Name>_DeliveryLimitPolicy 

The policy’s definition only includes the “delivery-limit” property.  The policy’s matching pattern (which determines which queues to apply it to) is of the format: 

    NEURON.<Neuron Instance Name>.<Topic Name>_.* 

There will be a policy created for every Quorum Queue type Neuron Topic. 
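The naming and matching rules above can be sketched in a few lines of Python. The policy name and pattern follow the formats given in the text. The dictionary layout (`pattern`, `definition`, `apply-to`) mirrors the body accepted by the RabbitMQ management HTTP API for policies, and the `apply-to` value is an assumption, since the text does not say what Neuron sets. The function name and the example values are hypothetical.

```python
import json
import re
from typing import Dict, Tuple


def delivery_limit_policy(instance: str, topic: str, delivery_limit: int) -> Tuple[str, Dict]:
    """Return (policy name, policy body) following the naming format in the text."""
    prefix = f"NEURON.{instance}.{topic}"
    name = f"{prefix}_DeliveryLimitPolicy"
    body = {
        "pattern": f"{prefix}_.*",                       # which queues the policy applies to
        "definition": {"delivery-limit": delivery_limit},
        "apply-to": "queues",                            # assumption: targets queues only
    }
    return name, body


name, body = delivery_limit_policy("DEFAULT", "HR_playa", 3)
print(name)
print(json.dumps(body, indent=2))
# The pattern is a regular expression, so it matches a Party queue on this topic.
print(bool(re.match(body["pattern"], "NEURON.DEFAULT.HR_playa_PlayaSub")))
```

Because the pattern is a regular expression, the unescaped dots also match any single character; the sketch keeps the pattern exactly as documented rather than escaping it.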

Once a message has exhausted the number of retries specified in the Delivery Limit, the message will move to the Neuron dead letter queue.

     NOTE: RabbitMQ does not expose any property to control the delay between its delivery attempts.

     NOTE: If the “Poison Message Handling” property of the transport is set to True, it will take precedence over the built-in Delivery Limit of the Quorum Queue.

Other Properties 

All other properties should behave the same as when using classic type queues. 

Dead Letter Exchange  

The Neuron Instance’s Dead Letter Exchange for RabbitMQ Topics can also be set to use a Quorum Queue.  This is not controlled by a Neuron Solution setting but rather an entry in the appSettings.config file found in the Neuron Instance’s directory (default installation path: “C:\Program Files\Neudesic\Neuron ESB v3\DEFAULT”). 

    UseQuorumQueueForDeadLetters 

This appsetting property determines if a Quorum Queue will be used for the Dead Letter Exchange.  Unlike the property in a RabbitMQ Topic’s Network properties, switching this setting has no effect until the ESBService is restarted since the Dead Letter Exchange is per Neuron Instance. 

However, like the Topic level property, the Queue will be automatically managed upon starting/restarting the Neuron Instance. 

    DeadLetterQuorumInitialSize 

Similar to the Topic level property, this appsetting property determines the number of nodes that will participate in the Quorum and the number of nodes the queue will be replicated to initially. 

Other Considerations 

The number of RabbitMQ nodes in the RabbitMQ cluster should be an odd number, because a Quorum needs a majority of its nodes to agree for the voting mechanism to work. An even number of nodes is allowed, but it tolerates no more node failures than the next lower odd number, and the Quorum cannot make progress whenever a majority cannot be established (for example, when the nodes split evenly).
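The majority arithmetic behind this recommendation is easy to check. The short Python sketch below computes, for a given number of nodes, how many nodes form a majority and how many node failures can be tolerated. The function names are illustrative only.

```python
def quorum_majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still available."""
    return nodes - quorum_majority(nodes)


for n in range(3, 8):
    print(f"nodes={n} majority={quorum_majority(n)} tolerated_failures={tolerated_failures(n)}")
```

Going from 3 to 4 nodes, or from 5 to 6, raises the majority size without increasing the number of failures that can be tolerated, which is why odd cluster sizes are preferred.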

Also, the guidance from RabbitMQ states that seven nodes should be the upper limit for the Quorum size and that performance drops above five nodes. Please see the Performance Characteristics section found here: https://www.rabbitmq.com/quorum-queues.html#performance

Publish Confirm Transactions

The Neuron ESB RabbitMQ Transport supports both Transaction types that RabbitMQ offers: the channel-based Transaction model and the batched-style Transaction model, i.e., Publish Confirms. Both are Acknowledge, Negative Acknowledgement (ack/nack) based models. Users can learn more about Publish Confirms here: https://www.rabbitmq.com/confirms.html , as well as why RabbitMQ introduced them: http://www.rabbitmq.com/blog/2011/02/10/introducing-publisher-confirms .

Neuron ESB provides enhancements to the Publish Confirm model to make it both more reliable and more performant. Neuron ESB handles the ack (acknowledgement) and nack (negative acknowledgement) reconciliation process efficiently, especially where multiple acks/nacks are received on a single event, avoiding unnecessary locking on the collection of messages it must maintain internally. It also provides additional properties, such as “Batch Confirm Timeout” and “Inactivity Timeout”, that can be used to fine-tune the performance, throughput, and reliability of the batch transaction. These properties force Neuron ESB to call into RabbitMQ so that it finishes sending any pending acks/nacks; only then does Neuron ESB resubmit the messages for which it has received neither an ack nor a nack. Messages that are nacked, or that RabbitMQ reports as undeliverable, are automatically moved into the Neuron ESB Failed database table.
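The reconciliation described above can be illustrated with a small Python sketch. It is not Neuron ESB's implementation, and the class and method names are hypothetical. It only shows the general technique: published messages are held by delivery tag until the broker confirms them, and a single confirm marked as `multiple` settles every outstanding tag up to and including the one it carries.

```python
class PendingConfirms:
    """Holds published messages until the broker acks or nacks them."""

    def __init__(self):
        self._pending = {}  # delivery tag -> message body

    def published(self, delivery_tag: int, message: bytes) -> None:
        self._pending[delivery_tag] = message

    def _take(self, delivery_tag: int, multiple: bool) -> list:
        """Remove and return the messages covered by one ack/nack."""
        if multiple:
            tags = sorted(t for t in self._pending if t <= delivery_tag)
        else:
            tags = [delivery_tag] if delivery_tag in self._pending else []
        return [self._pending.pop(t) for t in tags]

    def on_ack(self, delivery_tag: int, multiple: bool = False) -> int:
        """Drop confirmed messages; return how many were settled."""
        return len(self._take(delivery_tag, multiple))

    def on_nack(self, delivery_tag: int, multiple: bool = False) -> list:
        """Return rejected messages so the caller can record them as failed."""
        return self._take(delivery_tag, multiple)

    def unconfirmed(self) -> list:
        """Messages with neither an ack nor a nack yet (resubmit candidates)."""
        return [self._pending[t] for t in sorted(self._pending)]


confirms = PendingConfirms()
for tag in range(1, 6):
    confirms.published(tag, f"msg-{tag}".encode())

settled = confirms.on_ack(3, multiple=True)  # settles tags 1, 2 and 3 at once
failed = confirms.on_nack(4)                 # tag 4 was rejected by the broker
print(settled, failed, confirms.unconfirmed())
```

Whatever remains in `unconfirmed()` after a timeout corresponds to the messages that have received neither an ack nor a nack.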

Publish Confirm Transactions can be used instead of the Transactional setting when high message throughput is required. Publish Confirm Transactions provide up to a 10X throughput advantage over the single Transactional model. When choosing to use Publish Confirm Transactions, Batch Size should be adjusted based on the expected message size, since Neuron ESB must maintain a copy of these messages in memory (regardless of the Persistence setting). The messages are maintained in memory until all the acks or nacks are received from RabbitMQ. An ack indicates that a message was successfully delivered to the Exchange, whereas a nack indicates that it was not deliverable. Neuron ESB then reconciles these acks and nacks with the messages stored in memory. Only then are the messages removed from the in-memory collection. Given a specific Batch Size, the average message size determines the memory usage for each Publisher or Subscriber. The default Batch Size is 20, which should be fine for small messages under 100 KB. Larger messages, however, will have a larger impact on memory. Depending on the number of endpoints and the number of messages being processed, users may want to either lower or raise the Batch Size.
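A rough way to reason about the memory impact is to multiply Batch Size by the average message size and by the number of endpoints holding a batch. The Python sketch below does exactly that. It is a back-of-the-envelope estimate under the assumption that each endpoint holds at most one full batch of unconfirmed messages; it ignores headers and runtime overhead, and the function name and the example sizes are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def confirm_buffer_bytes(batch_size: int, avg_message_bytes: int, endpoints: int = 1) -> int:
    """Estimate bytes held in memory while waiting for acks/nacks."""
    return batch_size * avg_message_bytes * endpoints


# Default Batch Size of 20 with 100 KB messages on a single endpoint.
small = confirm_buffer_bytes(20, 100 * KB)
# The same Batch Size with 5 MB messages across 10 endpoints.
large = confirm_buffer_bytes(20, 5 * MB, endpoints=10)

print(f"{small / MB:.2f} MB")
print(f"{large / MB:.2f} MB")
```

The second case shows why a Batch Size that is comfortable for small messages may need to be lowered when messages are large or many endpoints publish at once.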

The Batch Confirm Timeout (default 10 seconds) is used in conjunction with Batch Size. This is the maximum amount of time Neuron ESB will wait to receive the acks/nacks for all the messages that were published to RabbitMQ within the batch. Usually acks/nacks are received asynchronously from RabbitMQ as each message in the batch is published. During this time, the Neuron ESB Publisher will not publish a new batch of messages. If acks/nacks are still missing for some of the messages after the timeout, Neuron ESB will do one of two things: discard the message or attempt to republish it. It republishes only if the Resubmit Unacknowledged Messages property on the Topic is set to true. If timeouts occur regularly, it’s most likely due to resource issues on the RabbitMQ server.

Resubmitting unacknowledged messages is useful where RabbitMQ outages occur and processing rolls over to another RabbitMQ server in the cluster, assuming RabbitMQ HA mirroring or Quorum Queues are configured. In this case, Neuron ESB will automatically reconnect to the new RabbitMQ server in the cluster and resubmit the unacknowledged messages. When setting the Resubmit Unacknowledged Messages property to true, the Detect Duplicates property of the Topic must also be set to true. Neuron ESB makes this a requirement to ensure once-only delivery of the message. Detecting Duplicates is an operation that takes place within the Party subscribing to the messages being published.
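To show why duplicate detection pairs naturally with resubmission, here is a minimal Python sketch of subscriber-side de-duplication by message ID. It is a generic technique and not a description of how the Detect Duplicates property is implemented. The class name, the bounded-history approach, and the capacity value are all assumptions made for illustration.

```python
from collections import OrderedDict


class DuplicateDetector:
    """Remembers recently seen message IDs so resubmitted copies can be skipped."""

    def __init__(self, capacity: int = 1000):
        self._seen = OrderedDict()   # message ID -> None, oldest first
        self._capacity = capacity

    def is_duplicate(self, message_id: str) -> bool:
        if message_id in self._seen:
            return True
        self._seen[message_id] = None
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return False


detector = DuplicateDetector(capacity=2)
first = detector.is_duplicate("da8ec78b")    # new message
second = detector.is_duplicate("da8ec78b")   # resubmitted copy of the same message
print(first, second)
```

A bounded history keeps memory use predictable, at the cost of no longer recognising a duplicate once its ID has been evicted.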

The last property related to Publish Confirm Transactions is the Inactivity Period property, which is set to 1 minute by default. It functions as a failsafe cleanup process: if nothing is published for the Inactivity Period but messages are still pending in memory, Neuron ESB will attempt to republish them or discard them.

If the Neuron ESB Publisher and/or Subscriber goes offline or disconnects from the server, any messages that are missing acks or nacks will be forwarded to Neuron ESB’s Failed Message report database. From there users can search, filter, and view messages, as well as edit and resubmit them.

Prefetch Size

Neuron ESB exposes a Prefetch Size property that affects every Party that subscribes to messages from a RabbitMQ-based Topic. The value specifies how many unacknowledged messages can be sent at the same time to the underlying consumer that the Party uses. In other words, the Prefetch Size defines the maximum number of unacknowledged deliveries that are permitted on a channel. Setting a limit on this buffer caps the number of messages received before the broker waits for an acknowledgment.

Neuron ESB RabbitMQ Topics do not support publishing and subscribing to messages that don’t require acknowledgments (i.e., Auto Acknowledgements) as it’s a model of messaging that is unreliable (https://wheleph.gitlab.io/posts/2015-06-27-on-rabbitmq-automatic-ack-and-reliable-message-processing/ ).

Messages in RabbitMQ are pushed from the broker to the consumers. The RabbitMQ default prefetch setting gives clients an unlimited buffer, meaning that RabbitMQ, by default, sends as many messages as it can to any consumer that appears ready to accept them. It is, therefore, possible to have more than one message “in flight” on a channel at any given moment.

To clarify, the Prefetch Size is a value that tells RabbitMQ how many messages to send to a consumer before waiting for acknowledgements. If it is set to 1, the consumer must acknowledge the received message before it receives the next one. If it is set to 0, a consumer will receive all queued messages, and only then will RabbitMQ expect acknowledgements. If the prefetch count is set to any other number, then once that many messages are unacknowledged, RabbitMQ will not send more until at least one of them has been acknowledged.
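The rule in the previous paragraph can be modelled as a simple window over unacknowledged deliveries. The Python sketch below is an illustration of that rule only; it does not talk to RabbitMQ, and the class and method names are hypothetical.

```python
class PrefetchWindow:
    """Models the broker-side limit on unacknowledged deliveries for one consumer."""

    def __init__(self, prefetch: int):
        if prefetch < 0:
            raise ValueError("prefetch must be zero or greater")
        self.prefetch = prefetch   # 0 means unlimited
        self.unacked = set()       # delivery tags sent but not yet acknowledged

    def can_deliver(self) -> bool:
        return self.prefetch == 0 or len(self.unacked) < self.prefetch

    def deliver(self, delivery_tag: int) -> None:
        if not self.can_deliver():
            raise RuntimeError("prefetch limit reached; waiting for an acknowledgement")
        self.unacked.add(delivery_tag)

    def ack(self, delivery_tag: int) -> None:
        self.unacked.discard(delivery_tag)


window = PrefetchWindow(prefetch=2)
window.deliver(1)
window.deliver(2)
blocked = not window.can_deliver()   # two messages are outstanding
window.ack(1)
resumed = window.can_deliver()       # one acknowledgement frees one slot
print(blocked, resumed)
```

With a prefetch of 0 the window never closes, which is the unlimited behaviour described above.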

Messages are cached by the RabbitMQ client library (in the consumer) until processed. All pre-fetched messages are invisible to other consumers and are listed as unacked messages in the RabbitMQ management interface.

A larger prefetch count generally improves the rate of message delivery. The broker does not need to wait for acknowledgments as often and the communication between the broker and consumers decreases. However, smaller prefetch values can be ideal for distributing messages across larger systems if there are multiple instances of the same Neuron ESB Party subscribing to messages. Smaller values maintain the evenness of message consumption. A value of one helps ensure equal message distribution and, if there is only one subscribing Party, the result is Ordered Message Delivery.

A prefetch count that is set too small may hurt performance, since RabbitMQ might end up in a state where the broker is waiting for permission to send more messages. A large prefetch count, on the other hand, could take many messages off the queue and deliver all of them to a single consumer, leaving the other consumers idle when multiple instances of the same Party subscribe to the Topic.

Neuron ESB sets the default value of the Prefetch Size to 20. A value of 0 is treated as infinite, allowing any number of unacknowledged messages. When setting this value, it’s also important to consider the size of the messages that the Party will be receiving. If these are very large messages (i.e., 1MB or larger), configure a smaller Prefetch Size, because larger messages will consume more memory in each Party that’s consuming them. Avoid the common mistake of having an unlimited prefetch (i.e., 0), where one Party receives all messages, runs out of memory, and crashes, causing all the messages to be redelivered. A good performance blog to read is the following: https://blog.rabbitmq.com/posts/2012/05/some-queuing-theory-throughput-latency-and-bandwidth

Receiving messages

The underlying Neuron ESB RabbitMQ Transport receives messages for its respective Neuron ESB Parties (or for Dead Letter processing) by creating an underlying channel and event based consumer. The native RabbitMQ Transport provides single threaded consumption of messages via consumers to subscribing Parties.

Multithreaded Receive

Neuron ESB offers a Multithreaded receive capability for users who have reached a threshold of how many messages a Party can receive given a specific transaction and prefetch setting. By setting the Multi Threaded Receive property to True and setting the Number of Threads property to a value greater than 1, Neuron ESB will create a dedicated channel/consumer for receiving messages for each thread. For example, if the Number of Threads property was set to 5, then Neuron ESB would create 5 sets of channel/consumers for each subscribing Party. This means that each Party would be receiving messages off the queue from 5 wired events. Under various circumstances using these settings can significantly increase subscriber throughput when using RabbitMQ based Topics. However, this property should be set to false if configuring for complex or request/reply patterns.

If Ordered Messaging is required, this property should be set to False and the Prefetch Size property should be set to 1.

Poison Message Handling

Neuron ESB provides two types of Poison Message Handling: Blocking and Non-Blocking.

Neuron ESB’s RabbitMQ Transport implements Poison Message handling for unhandled exceptions that are thrown from subscribing endpoints (i.e., hosted Parties, Workflow, Adapter and Service Endpoints) or Business Processes assigned to those endpoints. If enabled, unhandled exceptions will not be immediately audited or logged as errors when they occur. Instead, they will be logged as an initial warning from Neuron ESB’s RabbitMQ client channel, as displayed in the example below:

Poison Message: Failed to send Message received from ‘PlayaPub’ to ‘PlayaSubFinance’ on RabbitMQ Topic ‘HR_playa’ with ESB Message ID ‘da8ec78b-9145-4fab-9923-160bbf69f171’. 2 attempts will be made to deliver the message before it is sent to the failed message database. Each Retry attempt will occur after the ‘Retry Cycle Delay’ timespan of ’00:40:00′. Original Exception: On Receive Event – An unhandled exception occurred while executing ‘throw exception’ pipeline on step ‘C#’. Message ID ‘da8ec78b-9145-4fab-9923-160bbf69f171’, Topic ‘HR_playa.Finance’, Source Party ‘PlayaPub’. The code step “C#” failed due to an error: This is a new exception from a Business Process.

Once the exception is detected, Neuron ESB will wait for the period specified in the “Retry Cycle Delay” property of the RabbitMQ Topic Transport. Once the wait has ended, the message will be released once more to the subscribing endpoint. Each unhandled exception that occurs will increment the internal counter until the number of retries reaches the number specified in the “Max Retry Cycles” property of the RabbitMQ Topic Transport. Each retry will log a warning message like the one below:

Poison Message: Retry 1 of 3. Message received from ‘DfatPub’ on RabbitMQ Topic ‘DFAT with Consumer Tag of ‘amq.ctag-2cxHV77ntnkHIVglehk1_A’ and ESB Message ID ‘1e5a7a76-f3dc-4046-b452-27dba1d6e445’. On Receive Event – An unhandled exception occurred while executing ‘throw exception’ pipeline on step ‘C#’. Message ID ‘1e5a7a76-f3dc-4046-b452-27dba1d6e445’, Topic ‘DFAT’, Source Party ‘DfatPub’. The code step “C#” failed due to an error: This is a new exception from a Business Process.

Once all the retries are exhausted, the original message with the exception information will be forwarded to the Failed Message Audit Report and be removed from the subscribing queue. This will result in an error message being logged like the one below:

A message has been saved to the audit failed database. Topic: DFAT, Party: DfatPub, Id: 1e5a7a76-f3dc-4046-b452-27dba1d6e445

FailureDetail – Poison Message Retries have exhausted. Message received on ‘DFAT’ RabbitMQ Topic with Consumer Tag of ‘amq.ctag-2cxHV77ntnkHIVglehk1_A’. Neuron.Pipelines.PipelineException: On Receive Event – An unhandled exception occurred while executing ‘throw exception’ pipeline on step ‘C#’. Message ID ‘1e5a7a76-f3dc-4046-b452-27dba1d6e445’, Topic ‘DFAT’, Source Party ‘DfatPub’. The code step “C#” failed due to an error: This is a new exception from a Business Process.

FailureType – Poison Message
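The retry cycle described above can be summarised in a short Python sketch: deliver the message, and on an unhandled exception wait for the Retry Cycle Delay and try again, up to Max Retry Cycles, before recording the message as failed. This is an illustration of the described behaviour and not Neuron ESB code. The function name, its parameters, and the list standing in for the Failed Message database are hypothetical, and the sleep function is injectable so that the example runs instantly.

```python
import time


def deliver_with_retries(handler, message, max_retry_cycles, retry_cycle_delay,
                         failed_messages, sleep=time.sleep):
    """Call handler(message); on failure wait and retry up to max_retry_cycles times.

    Returns True on success. After the last retry fails, the message and the
    error text are appended to failed_messages and False is returned.
    """
    retries_used = 0
    while True:
        try:
            handler(message)
            return True
        except Exception as exc:  # any unhandled exception counts as a failure
            if retries_used == max_retry_cycles:
                failed_messages.append((message, str(exc)))
                return False
            retries_used += 1
            sleep(retry_cycle_delay)  # stands in for the Retry Cycle Delay


# Example: a handler that fails twice and then succeeds.
attempts = []


def flaky_handler(message):
    attempts.append(message)
    if len(attempts) < 3:
        raise RuntimeError("temporary failure")


failed = []
delays = []
ok = deliver_with_retries(flaky_handler, "order-1", 3, 40 * 60, failed, sleep=delays.append)
print(ok, len(attempts), delays, failed)
```

In this run the message succeeds on the third attempt, so nothing is recorded as failed. A handler that always throws would be attempted once plus Max Retry Cycles times before the message is recorded.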

The Poison Message can be viewed in the Failed Message Viewer as illustrated below. From there it can be edited and resubmitted either to a publishing Party or directly to an endpoint. In fact, users can select multiple messages and submit them at the same time (bulk resubmit):

If the subscribing Endpoint is shut down/stopped in a controlled manner, or the internal consumer throws a Shutdown event, a Cancellation Event will occur and one of two Informational log entries will be made, like this:

Cancellation Event Occurred. Received message with message ID ‘da8ec78b-9145-4fab-9923-160bbf69f171’ to ‘PlayaSubFinance’ Party from the RabbitMQ channel for ‘HR_playa’ topic. Will be returned to the Source Queue. RabbitMQ delivery tag of 1. The operation was canceled.

OR

Cancellation Event Occurred. Received message with message ID ‘da8ec78b-9145-4fab-9923-160bbf69f171’ to ‘PlayaSubFinance’ Party from the RabbitMQ channel for ‘HR_playa’ topic. Message will go to the Neuron ESB Failed Audit database. RabbitMQ delivery tag of 1. The operation was canceled.

The informational message that is generated is determined by the “Shutdown Handler” property.

The following would generate a Cancellation Event:

  • The ESB Service is stopped through the Service Control Manager or through Neuron ESB Explorer.
  • The Endpoint or Endpoint Host is stopped via Endpoint Health.
  • Dispose() is called on a hosted Party.
  • An internal RabbitMQ Consumer generated a Shutdown event.

However, if an unexpected exception occurs, or the subscribing Endpoint or the Endpoint Host hosting it crashes, the in-flight messages currently blocked within the channel are forwarded to the Neuron ESB Topic’s Dead Letter Queue. From there, Neuron ESB transfers each message to the Failed Message Audit Report and removes it from the Dead Letter Queue. An error message like the one below is logged:

A message has been saved to the audit failed database. Topic: HR_playa, Party: PlayaSub, Id: b815967d-2dd9-4893-a186-aa2013607867

FailureDetail – The message was retrieved from the configured RabbitMQ dead letter queue.

RabbitMQ x-death header

  Key: count : 1

  Key: reason : rejected

  Key: queue : NEURON.DEFAULT.HR_playa_PlayaSub

  Key: time : 3/17/2023 12:35:21 PM

  Key: exchange : NEURON.DEFAULT.HR_playa

  Key: original-expiration : 86400000

FailureType – Dead Letter
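The x-death entry shown above can be read programmatically. The following is a minimal Python sketch, not part of Neuron ESB, that summarizes one entry. It assumes the entry has already been decoded into a dictionary by whatever client library is in use; the function name and the wording of the reason descriptions are illustrative. Note that the original-expiration of 86400000 ms corresponds to a one-day Time to Live.

```python
from datetime import timedelta

# Reasons RabbitMQ records in x-death entries; the descriptions are paraphrased.
REASONS = {
    "rejected": "consumer rejected or nacked the message with requeue=false",
    "expired": "message Time to Live elapsed",
    "maxlen": "queue length limit exceeded",
    "delivery_limit": "quorum queue delivery limit exceeded",
}


def describe_x_death(entry):
    """Return a one-line summary of a single x-death entry (a dict)."""
    reason = entry.get("reason", "unknown")
    parts = [
        f"dead-lettered {entry.get('count', 1)}x from queue {entry.get('queue')}",
        f"reason={reason} ({REASONS.get(reason, 'unrecognized reason')})",
    ]
    ttl_ms = entry.get("original-expiration")
    if ttl_ms is not None:
        # original-expiration is the per-message TTL in milliseconds.
        parts.append(f"original TTL={timedelta(milliseconds=int(ttl_ms))}")
    return "; ".join(parts)


# Values copied from the log entry above.
entry = {
    "count": 1,
    "reason": "rejected",
    "queue": "NEURON.DEFAULT.HR_playa_PlayaSub",
    "exchange": "NEURON.DEFAULT.HR_playa",
    "original-expiration": "86400000",
}
summary = describe_x_death(entry)
print(summary)
```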

The Dead Letter Message can be viewed in the Failed Message Viewer as illustrated below. From there it can be edited and resubmitted either to a publishing Party or directly to an endpoint. In fact, users can select multiple messages and submit them at the same time (bulk resubmit):

Blocking vs Non Blocking

Neuron ESB allows the configuration of either Blocking or Non Blocking Poison Messaging by setting the “Handler Type” property. Blocking can be useful if Ordered delivery of messages must be maintained, or if no other messages may be processed until all retries are exhausted on the message currently being delivered.

It’s important to note that the default behavior of the RabbitMQ Transport is Ordered Messaging. Ordered Messaging can be achieved for the Neuron ESB Topic using the RabbitMQ Transport by setting the Multi-Threaded property to false and the Prefetch Size property to 1. 

In many scenarios Ordered delivery of messages is not required, and continued processing of the remaining messages in the underlying queue takes priority. Those scenarios can be accommodated by setting the Handler Type property to Non Blocking.

Blocking

The default value for the Handler Type is Blocking. Blocking can only be guaranteed if the Prefetch Size property is set to 1 and the Multi-Threaded receive property is set to false (i.e., Ordered Messaging). If these properties are set to any other value, messages directly behind the poison message that would pass successfully through the subscribing endpoint could be processed on another thread.

However, the number of messages processed would be limited to the available threads. Imagine the Prefetch Size is set to 20 and the Multi-Threaded receive property is set to true. Under the covers there would be 5 consumers (by default), and RabbitMQ would use multiple threads to send each consumer’s event handler 20 unacknowledged cached messages (i.e., 5 times 20 = 100 total cached messages). At least a few threads would be available across the consumers for message processing. But if messages were becoming poisoned because a target system was offline (i.e., no possibility of having “good” messages in the queue), all the messages in cache, as well as those waiting to be retrieved, would become poison messages. They would tie up the consumers’ available threads and CPU cycles until all retries were exhausted and the messages were moved to the Neuron ESB Failed Audit system. After the first set of messages from the cache was moved, the next batch of 3 or 4 prefetched messages would be processed, going through all the retries and delays specified by the Retry Cycle Delay and Max Retry Cycles properties.

When using Blocking, the Retry Cycle Delay wait time is implemented within the Neuron ESB Party channel as a Manual Reset Event. Because of this, it can block all other messages from being processed until all the retries have been exhausted. While the poison message is being blocked, an informational message like the one below is logged.

Poison Message: Blocking for the ‘Retry Cycle Delay’ timespan of ’00:40:00′. After the wait expires Retry 1 of 2 will be attempted. Message received from ‘PlayaPub’ on RabbitMQ Topic ‘HR_playa with Consumer Tag of ‘amq.ctag-azbAqkugsvrCx9g0zK6aVg’ and UNIQUE ID ‘4426f009-f277-46f0-9764-c2bcd4a61d1c’

After each retry, a negative acknowledgement is sent, which immediately rolls the message back to the queue (in reality, it never left the queue). The consumer immediately retrieves the message again and, if it still qualifies for a retry, it is blocked again in the channel using the Manual Reset Event.

30-minute threshold

RabbitMQ has a default message Acknowledgement Timeout of 30 minutes. That timer starts the moment RabbitMQ retrieves the messages from the Queue and places them in cache for the consumers to process. Given this, if the Retry Cycle Delay were set to 1 minute and the number of retries were set to 3, each message being processed would take a minimum of 3 minutes (not including the time the processing took to throw the exception). 3 minutes is well under the 30-minute threshold. However, the next message in the cache now has only 27 minutes before hitting the timeout, since it had to wait the 3+ minutes for the message before it. This continues until some of the messages remaining in the cache eventually exceed RabbitMQ’s default Acknowledgement Timeout of 30 minutes, and the following error is thrown:

Unexpected Shutdown event from RabbitMQ originated from Consumer Channel ‘2’ assigned to ‘PlayaSubFinance’ Party on Topic ‘HR_playa’. Initiating source = Peer; Reply Code = 406; Reason = PRECONDITION_FAILED – delivery acknowledgement on channel 2 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more

Although Neuron ESB will automatically recover from the error and reconnect all the faulted consumers, it will result in the message either being returned to the queue or forwarded to the Neuron ESB Audit Failed database, depending on the value set for the “Shutdown Handler”.
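The arithmetic behind the 30-minute threshold can be sketched as follows. This is a simplified Python model, not Neuron ESB code: it assumes one consumer works through its prefetched cache one message at a time, that every cached message is a poison message, and that processing time itself is negligible. The function names are hypothetical.

```python
import math


def first_timed_out_position(retry_delay_min, max_retries, ack_timeout_min=30.0):
    """1-based cache position of the first message whose cumulative wait
    exceeds the acknowledgement timeout, or None if nothing is delayed."""
    per_message = retry_delay_min * max_retries  # minimum minutes spent per poison message
    if per_message <= 0:
        return None
    # Message k is finished after k * per_message minutes; find the smallest k
    # for which that total is strictly greater than the timeout.
    return math.floor(ack_timeout_min / per_message) + 1


def timed_out_count(prefetch, retry_delay_min, max_retries, ack_timeout_min=30.0):
    """How many of the prefetched messages would exceed the timeout."""
    first = first_timed_out_position(retry_delay_min, max_retries, ack_timeout_min)
    if first is None:
        return 0
    return max(0, prefetch - first + 1)


# Example from the text: 1-minute delay, 3 retries, Prefetch Size of 20.
print(first_timed_out_position(1, 3))
print(timed_out_count(20, 1, 3))
```

Under these assumptions the eleventh cached message is the first to pass the 30-minute mark, so half of a 20-message prefetch cache would time out. With a Retry Cycle Delay of 40 minutes, even the first message exceeds the threshold.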

The best way to avoid this situation, where extended downtime of a system causes hundreds if not thousands of poison messages, is either to ensure Ordered Message Delivery is enabled or to set the Handler Type to Non Blocking.

If truly necessary, RabbitMQ’s default message Acknowledgement Timeout can be extended or even disabled. The following article discusses how: https://www.mailerq.com/blog/disable-consumer-timeouts-in-rabbitmq-3-8-15-and-higher

Non Blocking

Non Blocking Poison Message Handling is activated when the Handler Type property of the RabbitMQ Transport of the Topic is set to Non Blocking. Non Blocking is useful in cases where Ordered Messaging is not required. This Neuron ESB feature is built on top of, and depends on, the RabbitMQ Delayed Message Exchange plugin found here: https://github.com/rabbitmq/rabbitmq-delayed-message-exchange. The minimum version of RabbitMQ required is 3.9.x.

The advantage of this feature over the Blocking type is that it does not implement the wait time specified in the Retry Cycle Delay property in the Neuron ESB Party channel. Instead, the wait time is implemented within an Mnesia table via an Exchange, so it survives restarts of both the RabbitMQ broker and Neuron ESB.

Internally, when Neuron ESB detects the first failure (i.e., poison message), instead of sending a negative acknowledgement, it publishes the original message it received to the RabbitMQ Delayed Message Exchange. After the successful publication of the message, an acknowledgement is sent, permanently removing the message from the Party’s underlying queue. The RabbitMQ Delayed Message Exchange handles the firing of the timers associated with each poison message and Neuron ESB tracks the retries. When the message is redelivered, it is ONLY redelivered to the underlying queue of the Neuron Party that threw the error.  
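The republish step described above can be illustrated with a small Python sketch. The exchange type "x-delayed-message", the "x-delayed-type" declaration argument and the per-message "x-delay" header (in milliseconds) come from the Delayed Message Exchange plugin. Everything else is an assumption made for illustration: the "x-retry-count" header and the function names are hypothetical, and how Neuron ESB actually tracks retries is not documented here. The sketch only builds the values a client would pass to its publish call; it does not talk to a broker.

```python
# Exchange type registered by the Delayed Message Exchange plugin.
DELAYED_EXCHANGE_TYPE = "x-delayed-message"


def delayed_exchange_arguments(routing_type="direct"):
    """Arguments for declaring a delayed exchange; x-delayed-type selects
    the routing behaviour the exchange applies once the delay expires."""
    return {"x-delayed-type": routing_type}


def retry_headers(headers, retry_delay_seconds, max_retries,
                  retry_header="x-retry-count"):  # hypothetical header name
    """Return headers for republishing a poison message with a delay,
    or None when the retries are exhausted."""
    attempt = int(headers.get(retry_header, 0)) + 1
    if attempt > max_retries:
        return None  # caller would move the message to failed-message storage
    updated = dict(headers)
    updated[retry_header] = attempt
    updated["x-delay"] = int(retry_delay_seconds * 1000)  # plugin expects milliseconds
    return updated


first = retry_headers({}, retry_delay_seconds=60, max_retries=2)
second = retry_headers(first, retry_delay_seconds=60, max_retries=2)
third = retry_headers(second, retry_delay_seconds=60, max_retries=2)
print(first, second, third)
```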

There are a few disadvantages. First, there is no way to determine how many poison messages need to be retried since there is no Queue that they reside in. They are stored in a custom table used by Erlang. Second, if the subscribing endpoint goes down, the retries will start from zero again for all those messages that were in flight. Third, the only way to purge what poison messages there may be is to temporarily disable and then reenable the plugin. There are other limitations specific to the plugin that can be found on its web site.

Lastly, if a Topic is configured to use Non Blocking Poison Message Handling but the delayed exchange plugin is either not installed or not enabled, the Topic reports this as a warning, both the Topic and the Party still start, and the Poison Message Handler is downgraded to the Blocking type. A warning like the following is logged:

Unable to create the ‘NEURON.DEFAULT.DFAT.P’ Poison Message Exchange for ‘DFAT’ Topic on the RabbitMQ Server ‘DESKTOP-6V9K70R’ on Port ‘-1’. This requires that the RabbitMQ Delayed Message Plugin is installed and enabled on the RabbitMQ server. Neuron ESB will temporarily DOWNGRADE the Poison Message Handler Type to ‘Blocking’. The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=503, text=’COMMAND_INVALID – invalid exchange type ‘x-delayed-message”, classId=40, methodId=10

Enabling/Disabling

To enable the RabbitMQ Delayed Message Exchange plugin, open a command prompt and navigate to the sbin folder of the RabbitMQ Server installation directory. The default path is:

“C:\Program Files\RabbitMQ Server\rabbitmq_server-<version>\sbin”

From there, execute the following command:

rabbitmq-plugins enable rabbitmq_delayed_message_exchange 

If for any reason the messages pending delivery need to be cleared out/removed, the only way to do so is to disable the plugin and then reenable it. Disabling the plugin can be done at the command line:

rabbitmq-plugins disable rabbitmq_delayed_message_exchange

Dead Letter Processing

Neuron ESB’s RabbitMQ Topic transport implementation includes a custom Dead Letter processor, Exchange and Queue to handle messages delivered to Subscribers in any of the following cases:

  • The message violates a Message Time to Live configured via a RabbitMQ policy (https://www.rabbitmq.com/ttl.html). This policy is automatically configured by Neuron ESB.
  • An exception occurs within the Neuron ESB RabbitMQ transport channel before the message is delivered to the subscriber.
  • The Shutdown Handler is set to Audit Failed and a cancellation event occurs.
  • The message is dropped because its queue exceeded a length limit set via a RabbitMQ policy (https://www.rabbitmq.com/maxlength.html).

These messages are automatically detected and moved to the RabbitMQ Dead Letter Queue preconfigured by Neuron ESB. Neuron ESB monitors and retrieves all messages from this queue and moves them into the Neuron ESB Failed Message database table.

Message Level Time to Live can be configured directly within the Transport Properties of the Neuron ESB Topic. The default value is 1440 minutes (i.e., 1 day).

Once moved, these messages can be queried, viewed, modified, and resubmitted by using the “Failed Message Viewer” window launched from the Failed Messages report located by navigating to Activity->Database Reports->Failed Messages within the Neuron ESB Explorer.

The messages recorded will have an Exception Type of “Dead Letter”. The failure message will contain failure details such as the name of the underlying Queue and the associated Exchange that the message expired in as well as the date time stamp indicating when the message expired.

Unfortunately, messages received by the Neuron ESB Dead Letter processor will not carry the original error information, because a RabbitMQ message and its headers cannot be modified in this process per the AMQP and RabbitMQ specifications. However, the original information can be found in either the Neuron ESB log files or the Windows Neuron ESB Event Log, recorded as an AmqpChannelException. This log entry contains the Party, Topic and Message ID of the message, which can be used to find the Dead Letter message in the Neuron ESB Failed Audit report.

Performance Metrics

Neuron ESB’s Topic configured with the RabbitMQ Transport was tested on a laptop with 32 GB RAM, a solid-state drive, 64-bit Windows 11 and an 8-core CPU. A single Neuron ESB Test Client was used to publish messages, and a single Neuron ESB Test Client was used to subscribe to the published messages. The results of the testing can be found in the table below. As a side note, all Publish rates matched or exceeded the Message Receive Rate.

 

| Configuration | Batch Size | Prefetch Size | Messages Sent/Received | Message Size (Bytes) | Message Receive Rate (msg/sec) |
| --- | --- | --- | --- | --- | --- |
| No Persistence/No Transactions | | 50 | 1,000,000 | 100 | 8,200 |
| | | 100 | 1,000,000 | 100 | 18,800 |
| | | 0 | 1,000,000 | 100 | 18,300 |
| | | 50 | 1,000,000 | 1024 | 8,100 |
| | | 100 | 1,000,000 | 1024 | 18,800 |
| | | 0 | 1,000,000 | 1024 | 17,900 |
| | | 1 | 1,000,000 | 1024 | 4,383 |
| | | 50 | 1,000,000 | 10240 | 6,900 |
| | | 100 | 1,000,000 | 10240 | 7,300 |
| | | 0 | 1,000,000 | 10240 | 15,342 |
| Persistence/No Transactions | | 0 | 1,000,000 | 100 | 15,900 |
| | | 50 | 1,000,000 | 100 | 17,700 |
| | | 1 | 1,000,000 | 1024 | 3,266 |
| | | 0 | 1,000,000 | 1024 | 17,800 |
| | | 50 | 1,000,000 | 1024 | 16,300 |
| | | 0 | 1,000,000 | 10240 | 13,000 |
| Persistence/Transactions | | 0 | 1,000,000 | 100 | 888 |
| | | 0 | 1,000,000 | 1024 | 907 |
| | | 0 | 1,000,000 | 10240 | 517 |
| | | 1 | 1,000,000 | 1024 | 922 |
| | | 50 | 1,000,000 | 100 | 929 |
| | | 50 | 1,000,000 | 1024 | 911 |
| | | 50 | 1,000,000 | 10240 | 525 |
| Persistence/Publish Confirm Transactions | 20 | 50 | 1,000,000 | 1024 | 6,554 |
| | 50 | 50 | 1,000,000 | 1024 | 10,100 |
| | 100 | 50 | 1,000,000 | 1024 | 12,544 |
| | 50 | 0 | 1,000,000 | 1024 | 10,188 |

Observations

No Persistence/No Transactions

When persistence and transactions are disabled, the Prefetch value significantly affects throughput. Memory consumption is the highest in this configuration, in both the Erlang process and the process hosting the publishing client (in our case, the Neuron ESB Test Client). It was not unusual for both the publishing client and Erlang to increase to 5 GB of memory usage during the testing. Message sizes above 1K (i.e., 10K and up) reduced throughput more significantly. Lastly, throughput decreased anywhere from 50 to 75% when using Ordered Messaging (i.e., Prefetch set to 1), depending on which other Prefetch value it is compared against.

Persistence/No Transactions

Memory usage of the Erlang process and the process hosting the publishing client remained under 1 GB when persistence was enabled. Surprisingly, with a Prefetch value of 50, message throughput for smaller messages was greater with persistence than without. Increasing the message size also had far less negative impact on throughput.

Persistence/Transactions

There was very little variation in message rate between message sizes of 100 bytes and 1 KB when single Transactions are enabled (i.e., NOT Publish Confirm Transactions). However, the rate decreased by over 40% when a 10K message was used. Memory usage of the Erlang process and the process hosting the publishing client was barely noticeable. Overall throughput was roughly 10 to 20 times lower than with transactions disabled. Increasing the Prefetch value from 0 to 1 (i.e., Ordered Messaging) or to 50 had only a marginal effect on throughput.

Persistence/Publish Confirm Transactions

Publish Confirm Transactions are RabbitMQ’s implementation of a batch transaction and generally provide about 10 times the throughput of the single Transaction mode. In this mode, the Prefetch value has almost no impact on performance, and users should not modify the Prefetch default value of 20. Batch Size has the greatest impact on throughput. Memory usage of the Erlang process and the process hosting the publishing client was barely noticeable.
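The “about 10 times” figure can be checked against the results table. The short Python calculation below uses only rates copied from the table (1024-byte messages, Prefetch Size of 50); the variable names are illustrative.

```python
# Rates in msg/sec copied from the results table (1024-byte messages, Prefetch 50).
single_transaction_rate = 911
publish_confirm_rates = {20: 6554, 50: 10100, 100: 12544}  # keyed by Batch Size

# Ratio of each Publish Confirm rate to the single Transaction rate.
speedups = {
    batch: round(rate / single_transaction_rate, 1)
    for batch, rate in publish_confirm_rates.items()
}
print(speedups)
```

For these rows the ratio ranges from roughly 7 times at a Batch Size of 20 to roughly 14 times at a Batch Size of 100.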

CPU Utilization

  • Using Persistence
    • CPU utilization for the Erlang process was approximately 25%.
    • The publishing client was approximately 11%
    • The subscribing client was approximately 30%
  • When using Publish Confirm transactions with Persistence:
    • CPU utilization for the Erlang process was approximately 16%.
    • The publishing client was approximately 11%
    • The subscribing client was approximately 14%

Thread pool Auto-tuning at Startup

Neuron ESB 3.7 and 3.7.5 introduced Auto Tuning of the thread pool for the Neuron Endpoint Host processes as well as the ESB Service process. This can be used to reduce long startup times of solutions, especially if RabbitMQ topics are used. On startup of the Endpoint Host process, if “StartupAutoTuneThreadpools” is true in the appSettings.config, Neuron ESB monitors the completion time of a System.Threading.Timer callback. If the callback completes at least 0.25 seconds later than expected (clock speed and core count are used to calculate the timespan that must be exceeded to trigger a thread pool tune, but it is never shorter than 0.25 seconds), the thread pool’s minimum worker thread setting is increased by roughly the value of the appSettings.config key “NumThreadsToAddPerTune”. An easy way to configure the Neuron runtimes to use Auto Tuning is to simply check the box in the Configure Server dialog as pictured below:

The “NumThreadsToAddPerTune” value in the appSettings.config is approximate because it is multiplied by an amount equal to “(1 + <seconds that the callback execution missed by> + <cpu cores> + <processor speed>) / 10”. This allows more threads to be added if the callback missed its expected time by a larger margin or if the CPU is more powerful.

The default value for “NumThreadsToAddPerTune” is 300. This value can be decreased if thread pool tuning is adding too many threads, or increased if the number of threads being added is still not enough. The recommended step is ±100, but smaller adjustments can be made if desired.
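As a rough illustration of the multiplier described above, the following Python sketch evaluates the formula for one set of inputs. It is not Neuron ESB code. The function name is hypothetical, the unit of processor speed is assumed here to be GHz (the text does not state it), and any rounding Neuron ESB applies to the result is unknown, so the raw value is returned.

```python
def threads_added_per_tune(num_threads_to_add, seconds_missed, cpu_cores, processor_speed):
    """Approximate number of minimum worker threads added by one tune.

    processor_speed is ASSUMED to be in GHz; the source does not give the unit.
    """
    return num_threads_to_add * (1 + seconds_missed + cpu_cores + processor_speed) / 10


# Default NumThreadsToAddPerTune of 300, a callback that ran 0.25 s late,
# and a hypothetical 8-core, 3.0 GHz machine.
added = threads_added_per_tune(300, 0.25, 8, 3.0)
print(added)
```

Under these assumptions a single tune would add about 367 threads. Lowering “NumThreadsToAddPerTune” to 200 would scale that to about 245.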

The Endpoint Host logs will contain final thread tuning counts for each endpoint host at the end of the Neuron startup.  These can be used as rough estimates for calculating and setting the .NET threadpool overrides in the Neuron ESB server’s configuration.

By default, the thread pool minimum worker thread value is set to the number of CPU cores. This is usually not sufficient for a Neuron ESB Solution that uses RabbitMQ for topics: if the number of RabbitMQ topics plus the number of party/client instances that will connect to those topics exceeds the core count, the thread pool minimum worker threads value should be increased.

An alternative method to the auto tune feature to estimate the min worker threads setting needed for solutions running RabbitMQ topics is to use the RabbitMQ web management interface.  Start the Neuron ESB service with the desired solution (thread pool auto tuning can be enabled if the solution takes too long to start), then go to the RabbitMQ web management interface and note how many channels are being used by the instance of Neuron ESB (this step is easiest if no other applications are using RabbitMQ).  This number is roughly the desired minimum worker threads setting.  The number might seem high depending on the number of RabbitMQ topics but is a good estimation to use since the threadpool might become exhausted for a given Endpoint Host process during RabbitMQ server failures/outages.

Handshake Continuation Timeout

If a solution has a large number of RabbitMQ Topics and parties, a Handshake Continuation Timeout Exception may occur if the user’s machine could not process the RabbitMQ Connections fast enough. Neuron ESB has extended the client-side Handshake Continuation Timeout from 10 to 30 seconds to reduce the occurrence of this exception. However, users should also modify the RabbitMQ Server configuration entry to match as indicated below:

handshake_timeout: should be increased to 30 seconds

More information can be found at: https://www.rabbitmq.com/networking.html

If Handshake timeouts are occurring, they appear in the RabbitMQ Service’s log file as entries similar to those below:

2023-03-22 16:03:48.832000+01:00 [info] <0.2411.0> accepting AMQP connection <0.2411.0> ([::1]:55638 -> [::1]:5672)

2023-03-22 16:03:58.845000+01:00 [error] <0.2411.0> closing AMQP connection <0.2411.0> ([::1]:55638 -> [::1]:5672):

2023-03-22 16:03:58.845000+01:00 [error] <0.2411.0> {handshake_timeout,frame_header}
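Log entries of this shape can be scanned for automatically. The Python sketch below is an illustration only: it assumes the log line layout shown above (timestamp, level, Erlang process id, message), and the function name and regular expression are not part of RabbitMQ or Neuron ESB. It pairs each “accepting AMQP connection” line with a later handshake_timeout line from the same process id and reports the elapsed time.

```python
import re
from datetime import datetime

# timestamp, [level], <pid>, remaining text; layout assumed from the sample above.
LINE = re.compile(r"^(\S+ \S+) \[(\w+)\] (<[\d.]+>) (.*)$")


def find_handshake_timeouts(lines):
    """Return (pid, seconds_between_accept_and_timeout) for each timed-out handshake."""
    accepted = {}
    timeouts = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        stamp, _level, pid, text = match.groups()
        when = datetime.fromisoformat(stamp)
        if text.startswith("accepting AMQP connection"):
            accepted[pid] = when
        elif "handshake_timeout" in text and pid in accepted:
            elapsed = (when - accepted.pop(pid)).total_seconds()
            timeouts.append((pid, elapsed))
    return timeouts


sample = [
    "2023-03-22 16:03:48.832000+01:00 [info] <0.2411.0> accepting AMQP connection <0.2411.0> ([::1]:55638 -> [::1]:5672)",
    "2023-03-22 16:03:58.845000+01:00 [error] <0.2411.0> closing AMQP connection <0.2411.0> ([::1]:55638 -> [::1]:5672):",
    "2023-03-22 16:03:58.845000+01:00 [error] <0.2411.0> {handshake_timeout,frame_header}",
]
result = find_handshake_timeouts(sample)
print(result)
```

For the sample entries the connection was closed about 10 seconds after it was accepted, which is consistent with a server-side handshake timeout still at 10 seconds rather than the recommended 30.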

Security

Neuron ESB supports SSL when using RabbitMQ Server 3.8.5 and above. SSL can be enabled in the Transport Properties located on the Networking tab of the Topic by navigating to Messaging->Publish and Subscribe->Topics within the Neuron Explorer.

Once SSL has been enabled, the RabbitMQ SSL port needs to be provided and the SSL Protocol to use must be selected. Although Neuron ESB and RabbitMQ support SSL2, SSL3, TLS, TLS 1.1 and TLS 1.2, RabbitMQ disables SSL3 support by default to avoid POODLE attacks. More about RabbitMQ and its SSL support can be found here: http://www.rabbitmq.com/ssl.html

Client Authentication can also be enabled by providing a Certificate (registered within the Security section of the Neuron ESB Explorer).

Configuring SSL support is done by modifying the RabbitMQ configuration file as well as registering several important Environment Variables for the machine. More information can be found here: http://www.rabbitmq.com/configure.html

About the Author

Marty Wasznicky

President/CTO

Marty has almost 30 years of experience in the software development industry. He joined Peregrine Connect after six years as a Regional Program Manager in the Connected Systems Division at Microsoft. His responsibilities there included building out Microsoft’s BizTalk Server product integration business, managing a team of SOA/ESB/BPM field specialists and building strategic partner alliances. Marty created the Microsoft Virtual Technical Specialist program and owned the development of Microsoft’s Enterprise Service Bus Toolkit.