PreEmptive Analytics Data Hub User Guide

Overview

The PreEmptive Analytics Data Hub is server software that accepts, queues, and dispatches analytics messages to configurable destinations. It is designed to function as the "one endpoint" for all your applications instrumented with PreEmptive Analytics tools. By pointing all instrumentation to the Data Hub, you can change where your data goes without having to update and redistribute instrumented applications, so routing changes take effect without waiting for users to install updated applications. Additionally, you can filter which messages go to each destination, for instance to ensure that a particular application's data is always kept within your network. Finally, the Data Hub provides resilience during outages by queuing and retrying messages until they are successfully delivered to all destinations.

System Overview

The Data Hub consists of three distinct parts: the Endpoint Web Service, the Dispatch Service, and RabbitMQ, a third-party queuing technology.

Internal components of the Data Hub

The Endpoint Web Service is an Internet Information Services (IIS) web service that is responsible for accepting incoming HTTP requests from clients. These messages are queued in the RabbitMQ instance for later processing by the Dispatch Service. The endpoint will operate without any special configuration, but you may wish to configure SSL or other settings.

The Dispatch Service is a Windows Service that processes queued messages and dispatches them to configured destinations. The Dispatch Service must be configured with the target destinations. You may also want to configure the default timeout and retry settings.

The third-party RabbitMQ instance is used to store messages on disk before they are dispatched. This component contains various queues: a single queue for incoming messages that have not been processed by the Dispatch Service (the endpoint queue), and two additional queues for each destination (the error and offline queues) that the Dispatch Service uses to handle messages whose previous attempts failed.
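
The queue layout described above can be sketched in a few lines of Python. This is only an illustration of the "one endpoint queue plus two queues per destination" arrangement; the queue names, the destination names, and the build_queues helper are hypothetical and are not the names the Data Hub or RabbitMQ actually use.

```python
from collections import deque

# Hypothetical name for the single queue of incoming, unprocessed messages.
ENDPOINT_QUEUE = "endpoint"


def build_queues(destinations):
    """Return a dict of queue name -> empty FIFO queue.

    One endpoint queue is shared by all destinations; each destination
    additionally gets its own error queue and offline queue.
    The naming scheme is an assumption made for this sketch.
    """
    queues = {ENDPOINT_QUEUE: deque()}
    for name in destinations:
        queues[f"{name}.error"] = deque()
        queues[f"{name}.offline"] = deque()
    return queues


# Two hypothetical destinations yield five queues in total (1 + 2 * 2).
queues = build_queues(["workbench", "archive"])
```

With N configured destinations, this layout always contains 1 + 2N queues.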

Terminology

The following definitions will be helpful when using this guide or reading log messages.

  • Message: An HTTP entity body received by the Data Hub, typically generated by an application instrumented with PreEmptive Analytics tools.
    • Other PreEmptive Analytics products may refer to this as an envelope or a batch, because client applications batch individual messages together before delivering them (for example, to a Data Hub). The Data Hub nevertheless calls these messages, because it does not process the individual parts of each envelope/batch and because RabbitMQ uses the word message for each entity in a queue.
  • Endpoint: The interface on the Data Hub which accepts incoming messages using the Endpoint Web Service.
  • Client: An application that sends messages to the Data Hub over HTTP. Typically this is the application that generated the message, but it may also be an intermediary host, including other Data Hubs.
  • Dispatch: To attempt to deliver a message to a destination over HTTP.
  • Destination: An HTTP service that receives messages from the Data Hub.
    • Durable destination: A destination for which messages will be queued and retried if they are not successfully delivered (this is the default).
    • Non-durable destination: A destination for which messages will be dropped after the first delivery attempt, regardless of success or failure.
  • Queue: A First-In-First-Out (FIFO) structure, maintained by RabbitMQ, to store messages before or between dispatch attempts.
    • Endpoint queue: The queue where all messages are first placed after being accepted by the Endpoint Web Service.
    • Offline queue: A queue where messages are stored after receiving offline responses. Each destination has one of these queues, from which messages will be periodically retried. Messages in this queue never expire.
    • Error queue: A queue where messages are stored after receiving error responses. Each destination has one of these queues, from which messages will be periodically retried. Messages in this queue eventually expire and are dropped if they continue to receive error responses.
  • Response: The result of a dispatch attempt. (See HTTP Details for what constitutes a success, offline, or error response.)
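
The relationship between responses, durability, and queues defined above can be summarized as a small decision function. The following Python sketch is an assumption-laden illustration, not the Dispatch Service's implementation: the Response and Action types and the next_action function are hypothetical names, and the expiry of messages in the error queue is deliberately not modeled because this guide section does not specify how it is determined.

```python
from enum import Enum


class Response(Enum):
    """Possible results of a dispatch attempt (see the Response definition)."""
    SUCCESS = "success"
    OFFLINE = "offline"
    ERROR = "error"


class Action(Enum):
    """What happens to a message after a dispatch attempt (hypothetical names)."""
    DONE = "delivered"
    DROP = "dropped"
    TO_OFFLINE_QUEUE = "stored in the destination's offline queue"
    TO_ERROR_QUEUE = "stored in the destination's error queue"


def next_action(response, durable=True):
    """Decide what to do with a message after one dispatch attempt.

    Durable destinations (the default) queue failed messages for retry.
    Non-durable destinations drop the message after the first attempt.
    """
    if response is Response.SUCCESS:
        return Action.DONE
    if not durable:
        return Action.DROP
    if response is Response.OFFLINE:
        return Action.TO_OFFLINE_QUEUE
    return Action.TO_ERROR_QUEUE
```

For example, next_action(Response.OFFLINE) selects the offline queue, while next_action(Response.OFFLINE, durable=False) drops the message.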

Data Hub User Guide Version 1.3.0. Copyright © 2014 PreEmptive Solutions, LLC