PreEmptive Analytics Data Hub User Guide

Choosing a Host

Please review the requirements and recommendations below before installing the Data Hub.

Hardware and VM Recommendations

The Data Hub can be run on either physical hardware or a virtual machine. Below are suggested specifications for either type of host.

Minimum Recommendations

  • CPU: 64-bit, 2 cores, 2.0 GHz each
  • RAM: 4 GB
  • Disk: A drive with 50 GB free; see the next section for details

Disk Space Planning

In normal operation, the Data Hub requires very little free disk space on its host, because all data is delivered as soon as it is received. When downstream destinations are offline or returning errors, though, the Data Hub needs enough free disk space to store every message received for each offline or failing destination until that destination comes back online. The amount of disk space required depends directly on the rate at which messages come in and on the size of those messages.

You can estimate the maximum disk space needed for queued data (in bytes) using the following formula:

offlineDuration * durableDestCount * incomingRate * messageSize

where:

  • offlineDuration is the longest time you want to be able to queue messages while a downstream destination is offline, in seconds.
  • durableDestCount is the number of durable destinations you plan to configure, each of which will receive all messages.
  • incomingRate is the average incoming message rate, in messages per second.
    • A high-volume estimate for this value would be 10 messages per second.
  • messageSize is the average incoming message size, in bytes per message.
    • Past experience across all users of the Runtime Intelligence Service shows an average of 4 KB.

For example, the minimum recommendation assumes 2 durable destinations that could both be offline for up to a week:

1 week (604800 seconds) * 2 destinations * 10 messages/second * 4000 bytes/message

which is approximately 48 GB for queued data.
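
The same estimate can be scripted to try other values. The following is a minimal sketch of the formula above, not part of the Data Hub itself; the function names are hypothetical, and the second function simply rearranges the formula to ask how long a given amount of free disk space would last.

```python
# Sketch of the disk space formula from this section.
# Function names are illustrative only; they are not a Data Hub API.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800


def max_queue_bytes(offline_duration, durable_dest_count, incoming_rate, message_size):
    """Estimate the maximum disk space (bytes) needed for queued data.

    offline_duration   -- longest outage to cover, in seconds
    durable_dest_count -- number of durable destinations
    incoming_rate      -- average incoming messages per second
    message_size       -- average message size, in bytes
    """
    return offline_duration * durable_dest_count * incoming_rate * message_size


def max_offline_seconds(free_bytes, durable_dest_count, incoming_rate, message_size):
    """Rearranged formula: how long all destinations can stay offline
    before the given amount of free disk space is used up."""
    return free_bytes / (durable_dest_count * incoming_rate * message_size)


# The worked example: 2 destinations, 1 week, 10 messages/second, 4000 bytes each.
needed = max_queue_bytes(SECONDS_PER_WEEK, 2, 10, 4000)
print(f"{needed / 1e9:.1f} GB")  # 48.4 GB
```

With the same inputs, max_offline_seconds(50e9, 2, 10, 4000) gives 625000 seconds, or a little over 7 days, which matches the 50 GB minimum recommendation.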

Software Requirements

The Data Hub requires a 64-bit OS from the following list:

  • Windows Server 2012, or 2012 R2 (recommended)
  • Windows Server 2008 R2
  • Windows 8 or 8.1
  • Windows 7 (64-bit)

It also depends on the following Windows features, which must be installed in this order:

  1. Internet Information Services (IIS): 7.5 or 8.0
    • IIS Management Console
  2. .NET Framework 4.5
    • ASP.NET 4.5

Memory Usage

In normal operation, with an average message size of 4 KB and an incoming message rate under 200 messages/second, the Data Hub requires very little memory (less than 1 GB). In extreme cases with larger messages and higher incoming message rates, using the default Dispatch Service Concurrency Settings, we have found that:

  • The dispatch service can use up to 2 GB of memory.
  • RabbitMQ tries to limit its memory usage to 1 GB but can use up to 2 GB of memory.
  • IIS manages its own memory use and is not affected by the Dispatch Service Concurrency Settings. It has not exceeded 1 GB under normal operation in our testing.

These tests were run on the host configuration listed under Observed Performance.

Network Accessibility

In order to receive analytics data, the Data Hub must be deployed in a location that is reachable by instrumented applications and other upstream clients, now and in the future. If the instrumented applications communicate over the internet, rather than an internal network, the Data Hub should be placed in a DMZ.

For similar reasons, the external hostname of the machine should be stable; otherwise, previously instrumented applications pointing to that location would fail to deliver their messages.

High-volume Performance Considerations

For most use cases, any host meeting the minimum system requirements can handle any normal rate or volume of data. In rare cases related to destination throughput, you may need to tune the runtime performance settings of the Data Hub; please see the Performance Tuning page for details.

If you anticipate having very high rates of throughput (e.g. 200+ messages per second), we recommend choosing a RabbitMQ data folder location on a secondary (non-OS) drive. We have also found that an SSD on the OS drive can improve throughput. We have not found a significant throughput improvement from using an SSD for the RabbitMQ drive.

Observed Performance

PreEmptive Solutions has put the Data Hub through extensive performance testing with the following host configuration:

  • Physical hardware
  • CPU: Intel Core i7-2670QM 2.2GHz
  • RAM: 16GB
  • Disk: Hitachi HTS725025A9A364 (250GB, 7200RPM, 8MB Cache, SATA, 300 MBps)
  • Network: Gigabit Ethernet
  • OS: Windows Server 2012
  • 8 upstream load-generating clients
  • 4 destinations

This host was able to accept and successfully deliver an overall incoming throughput of 500 messages/second (4 KB messages) to multiple destinations under a variety of runtime conditions (e.g. offline destinations, error responses, large queues, high-latency destinations) without completely consuming any system resource (e.g. CPU, memory, disk, network).

Note that this test was done over an internal network; if it had been done over the Internet, network bandwidth would likely have limited the overall throughput.
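
To get a rough sense of the bandwidth involved, the payload traffic for a given message rate can be estimated as below. This is a back-of-the-envelope sketch, not a measurement from the test above. It assumes 4 KB means 4000 bytes (as in the disk space example), that every message is forwarded to every destination, and that protocol overhead is ignored; the function name and the fan_out parameter are hypothetical.

```python
# Rough payload bandwidth estimate. Ignores HTTP/TLS and other protocol
# overhead, so real traffic will be somewhat higher.
# The function name and fan_out parameter are illustrative assumptions.


def payload_mbps(messages_per_second, message_size_bytes, fan_out=1):
    """Return payload bandwidth in megabits per second.

    fan_out is the number of copies of each message sent, e.g. the number
    of destinations if every destination receives every message.
    """
    bits_per_second = messages_per_second * message_size_bytes * fan_out * 8
    return bits_per_second / 1e6


# 500 messages/second at 4000 bytes each, as in the test described above.
incoming = payload_mbps(500, 4000)
# Assumption: all 4 destinations receive every message.
outgoing = payload_mbps(500, 4000, fan_out=4)

print(f"incoming: {incoming:.0f} Mbit/s")  # incoming: 16 Mbit/s
print(f"outgoing: {outgoing:.0f} Mbit/s")  # outgoing: 64 Mbit/s
```

Under these assumptions the payload fits comfortably within Gigabit Ethernet, but it could approach or exceed the capacity of a slower Internet uplink.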


Data Hub User Guide Version 1.3.0. Copyright © 2014 PreEmptive Solutions, LLC