Troubleshooting
This section can help you resolve common issues encountered with the Data Hub.
If issues are encountered during installation, run the installer again with logging turned on by adding /L*V "<logfile name>" to the command line. For example:
PreEmptive.Analytics.Data.Hub.exe /L*V "install_log.txt"
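Verbose installer logs tend to be long, so it can help to jump straight to the failing step. The following Python sketch scans a log for lines containing "Return value 3", which Windows Installer writes when an action fails. It assumes that the Data Hub installer passes /L*V through to Windows Installer and that the log may be UTF-16 encoded; neither is confirmed by this guide. The names decode_log and find_failures are hypothetical helpers, not part of the Data Hub.

```python
from typing import List, Tuple

# Windows Installer writes this text when an action fails.
FAILURE_MARKER = "Return value 3"


def decode_log(raw: bytes) -> str:
    """Decode log bytes; assumes UTF-16 when a byte order mark is present."""
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16")
    return raw.decode("utf-8", errors="replace")


def find_failures(log_text: str, context: int = 2) -> List[Tuple[int, List[str]]]:
    """Return (line_number, preceding_lines_plus_hit) for each failure marker."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if FAILURE_MARKER in line:
            start = max(0, i - context)
            hits.append((i + 1, lines[start:i + 1]))
    return hits
```

A possible use is find_failures(decode_log(open("install_log.txt", "rb").read())), then reading the lines just before each hit for the error that caused the failure.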
If an issue prevents a message from being accepted by a destination, the Data Hub categorizes these issues as offline or error responses, depending on the specific scenario. This section offers suggested steps for discovering the type of problem that is preventing message delivery.
When dispatching to durable destinations, the Dispatch Service will queue unsuccessful messages for an offline retry or error retry at a later time. This prevents data loss while the issue with the destination is being diagnosed, though if the issue is not resolved, this queuing may eventually lead to disk-full or memory-full scenarios.
Typically, such situations should be diagnosed by examining the Event Log for messages that will explain the response received from the destination. You may also want to monitor the relevant WMI counters to ensure the messages are being received and routed as you expect them to be. If these tools are insufficient, you may need to use an HTTP proxy to trace the actual HTTP request and response, or directly view the queued message(s) that are causing the problem.
The Dispatch Service can be configured to use an HTTP proxy (e.g. Fiddler) to see the exact requests and responses that are transmitted between the Data Hub and a destination.
Before enabling an HTTP proxy to troubleshoot message retries, you may want to:
Set the <dispatchFromEndpoint> setting to false.
... true.
To enable such a proxy:
Open [Application folder]\DispatchService\HubDispatchService.exe.config.
In the <system.net> section, add the following, replacing the proxyaddress attribute appropriately:
<defaultProxy enabled="true">
  <proxy proxyaddress="http://127.0.0.1:8888" bypassonlocal="False"/>
</defaultProxy>
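If the proxy is switched on and off often, the edit can be scripted. The following Python sketch adds or removes the <defaultProxy> element in the text of a configuration file. It assumes the file already contains a <system.net> section directly under the root element, as the steps above imply. The function names are hypothetical, and the standard-library parser used here drops XML comments, so work on a backup copy.

```python
import xml.etree.ElementTree as ET


def _system_net(root: ET.Element) -> ET.Element:
    """Return the <system.net> child of the root element."""
    for child in root:
        if child.tag == "system.net":
            return child
    raise ValueError("no <system.net> section found")


def enable_proxy(config_xml: str, address: str = "http://127.0.0.1:8888") -> str:
    """Return the config text with a single <defaultProxy> element added."""
    root = ET.fromstring(config_xml)
    section = _system_net(root)
    # Remove any existing <defaultProxy> so the element is not duplicated.
    for old in section.findall("defaultProxy"):
        section.remove(old)
    default_proxy = ET.SubElement(section, "defaultProxy", {"enabled": "true"})
    ET.SubElement(default_proxy, "proxy",
                  {"proxyaddress": address, "bypassonlocal": "False"})
    return ET.tostring(root, encoding="unicode")


def disable_proxy(config_xml: str) -> str:
    """Return the config text with any <defaultProxy> element removed."""
    root = ET.fromstring(config_xml)
    section = _system_net(root)
    for old in section.findall("defaultProxy"):
        section.remove(old)
    return ET.tostring(root, encoding="unicode")
```

Both functions take and return the file contents as a string, so reading and writing HubDispatchService.exe.config is left to the caller.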
To disable such a proxy:
Open [Application folder]\DispatchService\HubDispatchService.exe.config.
In the <system.net> section, remove the <defaultProxy> element.
In some cases, inspecting the contents of the messages at the front of a queue may be sufficient to understand an issue.
To view queued messages:
In the RabbitMQ management console, open the relevant queue for the destination:
pahub.<destination id>_offline for offline responses.
pahub.<destination id>_errored for error responses.
Select Yes for the Requeue field. Incorrectly choosing No will cause data loss.
Click Get Message(s).
When dispatching to non-durable destinations, the Dispatch Service only tries messages once, and never queues messages. Because messages that receive error or offline responses are never retried, it can be harder to troubleshoot problems with non-durable destinations.
Some suggestions when dealing with non-durable destinations:
In the event that disk space reaches a critical level, the Data Hub will no longer be able to queue incoming messages, causing an outage at the Endpoint.
First, in such a situation, check the Queues page of the RabbitMQ management console to see which (if any) queues are excessively large. Based on that information, there are three possible outcomes:
This typically means that new messages are not being processed by the Dispatch Service. This could be because it is not running, because it is not configured with any destinations, or because it has been set not to dispatch messages from the endpoint queue.
If the Dispatch Service is not running, all incoming messages will remain queued in the endpoint queue, filling the disk.
To check for this scenario:
... 0.
Check the Services console (services.msc) to ensure that the service is actually started.
Ways to resolve:
... pahub.endpoint queue. This causes data loss.
If the Dispatch Service is running but has no configured destinations, all incoming messages will remain queued in the endpoint queue, filling the disk.
To check for this scenario:
... _all, _durable, and _non_durable.
To resolve:
... pahub.endpoint queue. This causes data loss.
For troubleshooting purposes, it is possible to start the Dispatch Service, but not enable messages from the endpoint queue to be dispatched, instead only permitting retries.
To check for this scenario:
Starting dispatch service with "dispatchFromEndpoint" set to "false". No messages will be processed from the endpoint queue.
To resolve:
Set <dispatchFromEndpoint> to true and restart the Dispatch Service.
** If you cannot start the Dispatch Service, check the Event Log for any messages explaining why the service could not start, then resolve the issues and try again.
This typically means that a destination has been offline for a long time, or that the destination is returning error responses for most messages.
If a destination is offline, messages will remain queued for that destination until it comes online again. If a destination is offline for an extended period, these messages will build up and may eventually fill the disk.
To check for this scenario:
... 0 for an extended period of time, aside from _all, _durable, and _non_durable.
Ways to resolve:
... <all/>. This means new messages will never be delivered to this destination.
... (pahub.<destination id>_offline). This causes data loss.
If a large number of messages for a particular destination are receiving error responses, there may be a larger problem with the destination or the connection between the Data Hub and the destination.
To check for this scenario:
... 0 for an extended period of time.
Ways to resolve:
Reduce the <errorGiveup> setting, so the Data Hub will discard the oldest messages from the error queue. Note that you will probably also want to reduce the <errorRetry> setting (e.g. to 5 minutes) so that the queue is processed quickly.
... <all/>. This means new messages will never be delivered to this destination.
... (pahub.<destination id>_errored). This causes data loss.
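The queue-based diagnosis in this section can be sketched in Python as follows. The input is assumed to be a list of queue records shaped like those returned by the RabbitMQ management HTTP API queue listing, each with "name" and "messages" fields. The threshold parameter is hypothetical, since this guide does not define when a queue is excessively large, and triage_queues is an illustrative name only.

```python
from typing import Dict, Iterable, List

ENDPOINT_QUEUE = "pahub.endpoint"


def triage_queues(queues: Iterable[Dict], threshold: int) -> List[str]:
    """Map oversized Data Hub queues to the scenario each one suggests."""
    findings = []
    for q in queues:
        name = q.get("name", "")
        depth = q.get("messages", 0)
        # Skip queues that are small or that do not belong to the Data Hub.
        if depth < threshold or not name.startswith("pahub."):
            continue
        if name == ENDPOINT_QUEUE:
            findings.append(f"{name}: {depth} messages; "
                            "new messages are not being dispatched")
        elif name.endswith("_offline"):
            findings.append(f"{name}: {depth} messages; "
                            "destination may have been offline for a long time")
        elif name.endswith("_errored"):
            findings.append(f"{name}: {depth} messages; "
                            "destination is returning error responses")
    return findings
```

An empty result means no Data Hub queue exceeded the chosen threshold, in which case the disk usage likely has another cause.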
The Data Hub is designed to run well within memory limits on any host with the minimum memory recommendation, but under certain scenarios (e.g. very large message sizes) it is possible for it to consume all available memory or for individual components to hit their internal memory limits. In such cases, throughput may be reduced, and in extreme cases this can cause an outage at the endpoint.
To see which component is using the most memory, open Task Manager (taskmgr.exe). Check the following processes and refer to the associated sections for assistance:
erl.exe process: RabbitMQ.
PreEmptive Analytics Data Hub Dispatch Service or HubDispatchService.exe process: Dispatch Service.
IIS Worker Process or w3wp.exe process: Endpoint Web Service.
If none of the processes appear to be particularly memory-intensive, check the RabbitMQ logs ([RabbitMQ data folder]\log) for memory alarms. If they appear, see the RabbitMQ subsection for assistance.
If none of the above appear to be abnormal, the problem may lie outside of the Data Hub components.
If the Erlang process (erl.exe) is using excessive memory (more than 1GB) or if the log ([RabbitMQ data folder]\log) shows memory alarms, then RabbitMQ is likely experiencing problems.
Ways to resolve:
... [RabbitMQ data folder]\rabbitmq.config.
Then, re-check the memory usage of all components.
If the Dispatch Service is using excessive memory (more than 1GB), it is probably a temporary issue related to message size and data rate. If necessary, restart the Dispatch Service. Then, re-check the memory usage of all components.
If a more permanent solution is needed, try reducing the maximum outgoing connections and Dispatch Service concurrency settings.
If an IIS Worker Process is using more memory than expected (more than 500MB), it is probably a temporary issue related to message size and data rate. To resolve, restart the Endpoint Web Service; this will cause an outage on the Endpoint. Then, re-check the memory usage of all components.
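As an alternative to reading Task Manager by eye, the memory thresholds above can be checked with a short Python sketch. It parses text in the format produced by the Windows command tasklist /FO CSV /NH and reports which components exceed their threshold. Several points are assumptions: that the fifth CSV column holds memory in KB (for example "12,345 K"), that 1GB means 1024 MB, and that the largest single process per image name is the figure of interest. The function names are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Image name (lowercase) -> (component, threshold in KB).
# Thresholds follow the guidance above; 1GB is taken as 1024 MB (assumption).
LIMITS = {
    "erl.exe": ("RabbitMQ", 1024 * 1024),
    "hubdispatchservice.exe": ("Dispatch Service", 1024 * 1024),
    "w3wp.exe": ("Endpoint Web Service", 500 * 1024),
}


def parse_tasklist_csv(text: str) -> Dict[str, int]:
    """Return the largest memory usage (KB) seen per lowercase image name."""
    usage: Dict[str, int] = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue
        image = row[0].lower()
        # Keep digits only, dropping thousands separators and the unit suffix.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[image] = max(usage.get(image, 0), int(digits))
    return usage


def over_limit(usage: Dict[str, int]) -> List[str]:
    """Return the components whose process exceeds its threshold."""
    return [component for image, (component, limit) in LIMITS.items()
            if usage.get(image, 0) > limit]
```

The number formatting in tasklist output varies by locale, so the digit extraction may need adjusting on non-English systems.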
Data Hub User Guide Version 1.3.0. Copyright © 2014 PreEmptive Solutions, LLC