The ExtraHop system helps you monitor network activity and all of your applications. For example, you can learn how well applications are consuming network resources, how systems and devices are communicating with each other, and how to identify transactions that are flowing across the data link layer (L2) to the application layer (L7) in your network.
This guide explains how the ExtraHop system functions so that you can understand how your data is collected and analyzed. We also provide a list of learning resources and some activities to get you started.
- First, learn about our appliances and how they work together.
- Then, learn how the Discover appliance collects data from transactions observed on your wire data capture feed or from machine data through NetFlow, sFlow, IPFIX, and AppFlow traffic on remote flow networks.
- Then, learn how devices that are actively communicating on the network are discovered and classified, which provides you with over 4,000 built-in metrics for dozens of protocols.
- Finally, learn how software deduplication removes unnecessary duplicates from your ExtraHop metric data.
The ExtraHop platform comprises a suite of appliances—Discover, Explore, Trace, and Command—that are designed to passively monitor the network traffic in your environment in real time. Each appliance provides you with different types of information about your network, which you can analyze to determine where problems in your network might be developing.
The ExtraHop Discover appliance (EDA) provides top-level and detailed metrics about transactions and traffic between devices. The Discover appliance includes tools to analyze and visualize all of your network, application, client, infrastructure, and business data.
The Discover appliance passively collects unstructured wire data—all of the transactions on your network—and transforms this data into structured wire data.
Deploy a single Discover appliance, either physical or virtual, anywhere in your
The ExtraHop Explore appliance (EXA) integrates with the ExtraHop Discover appliance to store transaction and flow records sent from the Discover appliance. You can see, save, and search the structured flow and transaction information about events on your network with a simple, unified UI, with no modifications to your existing applications or infrastructure. Deploy a cluster of three or more Explore appliances to take advantage of data redundancy and performance
The ExtraHop Trace appliance (ETA) continuously collects network packets and integrates with the ExtraHop Discover and Command appliances. You can quickly retrieve all packets that match a set of search criteria within a given time interval. You can then download the packet capture file for further inspection in a packet analyzer, such as Wireshark.
Deploy a Trace appliance when you need access to more than the summary data collected by the
The ExtraHop Command appliance (ECA) provides centralized management and reporting across multiple ExtraHop Discover, Explore, and Trace appliances that are distributed across datacenters, branch offices, and the public cloud.
You can pair an Explore appliance or cluster to multiple Discover appliances, and then query the records stored by each Discover appliance from the Command appliance.
When you add a Trace appliance, you can search, download, and analyze the collected packets to gain further insight about the information flowing across your network.
For most large ExtraHop deployments, a dedicated Command appliance is the most efficient way to manage all of your remote appliances.
The ExtraHop Discover appliance collects data and generates metrics from two types of data sources: wire data and machine data, such as flow data.
Wire data is observed in real time, which provides information about what’s happening on your network. With wire data, the ExtraHop system passively collects a copy of unstructured packets through a port mirror or tap and stores the data in the appliance datastore. The copied data goes through real-time stream processing, which transforms the packets into structured wire data through the following stages:
- TCP state machines are recreated to perform full-stream reassembly.
- Packets are constructed into flows.
- The structured data is analyzed and processed in the following ways:
  - Transactions are identified.
  - Devices are automatically discovered by MAC and IP address and then classified by their activity.
  - Metrics are generated and associated with protocols and sources, and the metric data is then aggregated into metric cycles.
- As new metrics are generated and stored, and the datastore becomes full, the oldest existing metrics are overwritten according to the first-in first-out (FIFO) principle.
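The first-in, first-out behavior described in the last stage can be illustrated with a minimal Python sketch. The `MetricStore` class, its capacity, and the metric names are hypothetical and are not part of the ExtraHop system; the sketch only shows how the oldest entries are overwritten once a fixed-size store is full.

```python
from collections import deque


class MetricStore:
    """Toy fixed-capacity store (hypothetical, not the ExtraHop datastore)."""

    def __init__(self, capacity):
        # A deque with maxlen drops its oldest entry when a new entry is
        # appended to a full deque, which gives FIFO overwrite behavior.
        self._cycles = deque(maxlen=capacity)

    def write(self, cycle):
        self._cycles.append(cycle)

    def contents(self):
        return list(self._cycles)


store = MetricStore(capacity=3)
for cycle_id in range(1, 6):
    store.write({"cycle": cycle_id, "http_requests": cycle_id * 10})

# Only the three most recent metric cycles remain in the store.
oldest_retained = store.contents()[0]["cycle"]
```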
Flow data, a type of machine data, can also be collected from a network device and sent to the Discover appliance for analysis or storage. Flow data is an alternative option if wire data cannot be collected from a remote network.
A flow is a set of packets that are part of a single transaction between two endpoints. Similar to how the ExtraHop system can identify flows from wire data, flows from machine data on remote networks can be sent to a Discover appliance for analysis. Flows are identified through their unique combination of IP protocol (TCP/UDP), source and destination IP addresses, and source and destination ports.
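As a rough illustration of how packets can be grouped by that combination of values, the following Python sketch builds a flow key from each packet. The `Packet` type and the sample addresses are hypothetical, and treating both directions of a conversation as one flow is an assumption of this sketch rather than a statement about how the Discover appliance works.

```python
from collections import defaultdict, namedtuple

# Hypothetical packet summary used only for this illustration.
Packet = namedtuple("Packet", "proto src_ip src_port dst_ip dst_port length")


def flow_key(pkt):
    # Sort the two endpoints so that both directions map to the same key
    # (an assumption of this sketch).
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (pkt.proto,) + tuple(sorted([a, b]))


def group_into_flows(packets):
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        stats = flows[flow_key(pkt)]
        stats["packets"] += 1
        stats["bytes"] += pkt.length
    return dict(flows)


packets = [
    Packet("TCP", "10.0.0.5", 49152, "10.0.0.9", 80, 60),
    Packet("TCP", "10.0.0.9", 80, "10.0.0.5", 49152, 1500),
    Packet("UDP", "10.0.0.5", 53000, "10.0.0.2", 53, 72),
]
flows = group_into_flows(packets)
```

Here the two TCP packets share one key and the UDP packet forms a second flow.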
The Discover appliance can analyze the following types of flow data:
- NetFlow v5
  - The Cisco proprietary protocol that defines a flow as a unidirectional flow of packets all sharing the following values: ingress interface, source and destination IP address, IP protocol, source and destination ports, and the type of service. NetFlow v5 has a fixed record format with 20 fields and cannot be customized.
- NetFlow v9
  - An adapted version of NetFlow v5 where the record format is based on a template. NetFlow v9 has 60+ fields in the records and can be customized. In the Discover appliance, these records are only partially parsed until the template packet is detected.
- IPFIX
  - An open standard based on the NetFlow v9 standard. ExtraHop supports only the native format; formats where the Enterprise bit is set outside of a trigger are not supported.
- AppFlow
  - The Citrix implementation of IPFIX with customized extensions to include application-level information such as HTTP URLs, HTTP request methods, status codes, and so on.
- sFlow
  - A sampling technology for monitoring traffic in data networks. sFlow samples every nth packet and sends it to the collector, whereas NetFlow sends data from every flow to the collector. The primary difference between sFlow and NetFlow is that sFlow is network layer independent and can sample anything. NetFlow v5 is IP based, but v9 and IPFIX can also look at Layer 2.
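To make the difference between sampling and per-flow export concrete, here is a small Python sketch of 1-in-n packet sampling. It is a simplified, deterministic illustration: the function names are hypothetical, and scaling the sampled bytes by n to estimate the total is a common technique assumed here, not something taken from the ExtraHop system.

```python
def sample_every_nth(packet_lengths, n):
    """Keep every nth packet length (the nth, the 2nth, and so on)."""
    return [
        length
        for i, length in enumerate(packet_lengths, start=1)
        if i % n == 0
    ]


def estimate_total_bytes(samples, n):
    # Each sample stands in for n packets, so scale the sampled bytes by n.
    return sum(samples) * n


lengths = [100] * 1000          # 1,000 packets of 100 bytes each
samples = sample_every_nth(lengths, 10)
estimate = estimate_total_bytes(samples, 10)
```

With uniform traffic the estimate matches the true total; with mixed packet sizes it is only an approximation, which is the trade-off of sampling.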
The Discover appliance enables you to add any of the above flow data sources. You can then view metrics for each flow network (a network device that sends information about the flows that are seen across the device) and its interfaces.
With the Discover appliance working as a flow collector and analyzer, you can collect the flow network traffic through the following stages:
- Flow exporters detect and format traffic, caching information about the flow, including source and destination IP addresses, port, IP protocol, and number of bytes and packets.
- The flow exporter sends the cached information from the flow network to the Discover appliance, which acts as a collector and analyzer for the flow data.
- The flow network traffic is analyzed, flows are identified, and metrics are aggregated for the total number of bytes and total number of packets in each flow.
For example, when a client initiates a request to a server, the packet is sent to the router, which directs the packet to the destination server through the network topology. If that router is configured as a flow network exporter, information about the flow is then formatted and sent to the Discover appliance for analysis.
By analyzing flows of network traffic, such as NetFlow traffic, an administrator can identify the top network flows (most bytes consumed), top network talkers (highest throughput), total number of bytes, and the total number of packets per router interface.
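The following Python sketch shows one way such summaries could be computed from exported flow records. The record fields and sample values are hypothetical, and ranking talkers by total bytes sent is used here as a simple stand-in for throughput.

```python
from collections import Counter

# Hypothetical flow records, as a collector might hold them after parsing.
records = [
    {"src": "10.0.0.5", "dst": "10.0.0.9", "interface": 1, "bytes": 1200, "packets": 10},
    {"src": "10.0.0.5", "dst": "10.0.0.9", "interface": 1, "bytes": 800, "packets": 6},
    {"src": "10.0.0.7", "dst": "10.0.0.9", "interface": 2, "bytes": 5000, "packets": 40},
    {"src": "10.0.0.5", "dst": "10.0.0.2", "interface": 1, "bytes": 300, "packets": 3},
]


def summarize(records):
    bytes_by_flow = Counter()       # (src, dst) -> total bytes
    bytes_by_host = Counter()       # src -> total bytes sent
    totals_by_interface = {}        # interface -> byte and packet totals
    for r in records:
        bytes_by_flow[(r["src"], r["dst"])] += r["bytes"]
        bytes_by_host[r["src"]] += r["bytes"]
        totals = totals_by_interface.setdefault(
            r["interface"], {"bytes": 0, "packets": 0}
        )
        totals["bytes"] += r["bytes"]
        totals["packets"] += r["packets"]
    return bytes_by_flow, bytes_by_host, totals_by_interface


bytes_by_flow, bytes_by_host, totals_by_interface = summarize(records)
top_flow = bytes_by_flow.most_common(1)[0]
top_talker = bytes_by_host.most_common(1)[0]
```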
The ExtraHop system automatically discovers devices based on what is happening on the network. There are two device discovery modes: layer 2 (L2) discovery and layer 3 (L3) discovery. The default discovery mode is L3 discovery.
In L2 discovery, a device entry is added for every locally observed MAC address over the wire. All IP addresses associated with a MAC address are aggregated into one device.
In L3 discovery, a device entry is added for an IP address observed over the wire:
- when a device responds to an Address Resolution Protocol (ARP) request for the IP address, allowing the ExtraHop appliance to associate the IP address with a MAC address.
- when the associated MAC address is not the MAC address of an L3-routing device.
In addition to creating L3 devices, the Discover appliance also creates an L2 device for each unique MAC address. If the MAC address and IP address are associated with the same device, the Discover appliance links the parent L2 device and the child L3 device. The IP address and MAC address for a device are displayed in the overview section on the Device page in the Metrics section of the Web UI.
- L2 metrics that cannot be associated with a particular child L3 device (for example, L2 broadcast traffic) are associated with the parent L2 device.
- In the device list view in the Metrics section of the Web UI, you can filter the full device list for L2 devices only, L3 devices only, or both types of devices.
- L2 devices that exist solely as parents to L3 child devices do not count against licensed device count limits.
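The parent and child relationship between L2 and L3 devices can be sketched in Python as follows. This is a simplified model and not the ExtraHop implementation: the `discover` function and its inputs are hypothetical, and the sketch assumes that an L3 device is created only when an ARP response has been seen and the responding MAC address does not belong to a router.

```python
def discover(arp_replies, router_macs):
    """Build toy L2 and L3 device tables.

    arp_replies: iterable of (mac, ip) pairs taken from observed ARP responses.
    router_macs: set of MAC addresses known to belong to L3-routing devices.
    """
    l2_devices = {}   # MAC address -> list of child IP addresses
    l3_devices = {}   # IP address -> parent MAC address
    for mac, ip in arp_replies:
        # Every observed MAC address gets an L2 device entry.
        children = l2_devices.setdefault(mac, [])
        if mac in router_macs:
            # Addresses behind a router are not turned into L3 devices here.
            continue
        if ip not in l3_devices:
            l3_devices[ip] = mac      # link the child L3 device to its parent
            children.append(ip)
    return l2_devices, l3_devices


replies = [
    ("aa:bb:cc:00:00:01", "10.0.0.5"),
    ("aa:bb:cc:00:00:01", "10.0.0.6"),
    ("aa:bb:cc:00:00:ff", "192.168.1.1"),
]
l2_devices, l3_devices = discover(replies, router_macs={"aa:bb:cc:00:00:ff"})
```

In this example the first MAC address becomes a parent L2 device with two child L3 devices, while the router MAC address is recorded only as an L2 device.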
After a device is discovered, the ExtraHop system tracks all of the wire data traffic associated with the device. Device names are then discovered by passively monitoring naming protocols, including DNS, DHCP, NetBIOS, and Cisco Discovery Protocol (CDP). A device can be identified by multiple names, which are all searchable. If a name is not discovered through a naming protocol, the default name is derived from device attributes (the MAC address for L2 devices and the IP address for L3 devices). You can also create a custom name for a device.
Note: If a device name does not include a hostname, the ExtraHop system has not yet observed naming protocol traffic associated with that device. The ExtraHop system does not perform DNS lookups for device names.
Based on the type of traffic associated with the device, the ExtraHop system assigns a role to the device, such as a gateway, file server, database, or load balancer.
The ExtraHop system automatically discovers local L3 devices based on observed ARP traffic that is associated with IP addresses. By default, all IP addresses that are observed outside of locally-monitored broadcast domains are aggregated at one of the incoming routers in your network. To identify and learn about individual devices outside of these routers, which are beyond your local network, you can create custom devices and enable reporting on these devices. For example, you can create a single device that encompasses several known IP addresses for a remote site or cloud service.
Note: If you have a proxy ARP configured in your network, the ExtraHop system might automatically discover remote devices. For more information, see this ExtraHop forum post.
- Configure remote discovery in the ExtraHop Admin UI to discover L3 devices for a range of IP addresses that are not on the local network.
- Create a custom device to collect metrics for a remote IP address or a range of IP addresses into one device. For example, you can create a single device that collects metrics for several known IP addresses that belong to remote sites or cloud services.
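Conceptually, a custom device collects traffic for a set of addresses into a single entry. The Python sketch below models that idea with the standard `ipaddress` module. The `CustomDevice` class, its name, and the address ranges (taken from the reserved documentation ranges) are hypothetical and do not reflect how custom devices are configured in the ExtraHop system.

```python
import ipaddress


class CustomDevice:
    """Toy model of one device that covers several remote IP ranges."""

    def __init__(self, name, cidrs):
        self.name = name
        self.networks = [ipaddress.ip_network(c) for c in cidrs]
        self.bytes = 0

    def matches(self, ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.networks)

    def record(self, ip, nbytes):
        # Count the traffic only if the address belongs to this device.
        if self.matches(ip):
            self.bytes += nbytes
            return True
        return False


remote_site = CustomDevice("remote-site", ["203.0.113.0/28", "198.51.100.7/32"])
in_range = remote_site.record("203.0.113.5", 500)
single_host = remote_site.record("198.51.100.7", 250)
unrelated = remote_site.record("192.0.2.1", 999)
```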
By default, the ExtraHop system removes duplicate L2 and L3 frames and packets when metrics are collected and aggregated from your network activity. L2 deduplication removes identical Ethernet frames (where the Ethernet header and the entire IP packet must match); L3 deduplication removes TCP or UDP packets with identical IP ID fields on the same flow (where only the IP packet must match).
The ExtraHop system checks for duplicates and removes only the immediately previous packet, either on the flow (for L3 deduplication) or globally (for L2 deduplication), if the duplicate arrives within 1 millisecond of the original packet.
By default, the same packet traversing different VLANs is removed by L3 deduplication. In addition, packets must have the same length and the same IP ID, and TCP packets also must have the same TCP checksum.
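The L3 criteria above can be sketched in Python as follows. This is an illustrative model only, not the ExtraHop implementation: the packet dictionaries and the `deduplicate` function are hypothetical, timestamps are integer microseconds, and the sketch assumes that the later copy is the one discarded when it matches the immediately previous packet on the same flow within 1 millisecond.

```python
WINDOW_US = 1000  # 1 millisecond, expressed in microseconds


def deduplicate(packets):
    """Drop a packet that repeats the previous packet on its flow within 1 ms.

    packets: iterable of dicts with 'ts_us', 'flow', 'ip_id', 'length',
    and 'tcp_checksum' keys, in arrival order.
    """
    last_seen = {}   # flow -> (signature, timestamp) of the previous packet
    kept = []
    for pkt in packets:
        signature = (pkt["ip_id"], pkt["length"], pkt["tcp_checksum"])
        previous = last_seen.get(pkt["flow"])
        is_duplicate = (
            previous is not None
            and previous[0] == signature
            and pkt["ts_us"] - previous[1] <= WINDOW_US
        )
        # Only the immediately previous packet on the flow is remembered.
        last_seen[pkt["flow"]] = (signature, pkt["ts_us"])
        if not is_duplicate:
            kept.append(pkt)
    return kept


packets = [
    {"ts_us": 0, "flow": "f1", "ip_id": 100, "length": 60, "tcp_checksum": 0xABCD},
    {"ts_us": 200, "flow": "f1", "ip_id": 100, "length": 60, "tcp_checksum": 0xABCD},
    {"ts_us": 5000, "flow": "f1", "ip_id": 100, "length": 60, "tcp_checksum": 0xABCD},
    {"ts_us": 5100, "flow": "f1", "ip_id": 101, "length": 60, "tcp_checksum": 0x1234},
]
kept = deduplicate(packets)
```

The second packet is dropped because it repeats the first within the window; the third has the same fields but arrives too late to count as a duplicate.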
L2 duplication usually only exists if the exact same packet is seen through the data feed, which is typically related to an issue with port mirroring. L3 duplication is often the result of mirroring the same traffic across multiple interfaces of the same router, which can show up as extraneous TCP retransmissions in the ExtraHop system.
The System Health page in the ExtraHop Web UI contains charts that display the L2 and L3 duplicate packets that were removed by the ExtraHop system. Deduplication works across 10 Gbps ports by default and across 1 Gbps ports if software RSS is enabled. L3 deduplication is currently supported only for IPv4, not IPv6.
Check out the following guides and resources that are designed to familiarize new users with our top features.
- Learn how to monitor website performance in our dashboard walkthrough.
- Learn how to identify potential DNS server issues in our metrics walkthrough.
- Learn about wire data fundamentals (online training).
- Learn about getting started with ExtraHop (online training).
- Visit our forums to communicate with other ExtraHop users.
- Contact ExtraHop Support if you need additional help.