Data centers are the epitome of infrastructure automation, and their modern manifestation, the cloud, provides an almost magical platform for its users. To construct clouds, separating concerns into layers of abstraction, like network overlays and service API encapsulations, helps enable service agility and innovation. But do these layers curb complexity, or merely mask it?
The truth is, it’s a struggle to understand how all the magic happens behind the curtain of cloud infrastructure. Willfully blind reliability can be a house of cards, with applications stacked upon services, stacked upon a cloud platform, stacked upon data center infrastructure. If the foundation of the cloud architecture—the network—wobbles, or doesn’t live up to its SLA metrics, then issues reverberate all the way up the stack.
Demystifying this magic to identify root causes is a deeply complex problem faced by all data center operators. To thwart and unmask such complexity in the data center, we have engineered a solution that will shine a light on some of the most elusive troubleshooting and analytics issues faced today.
Introducing Contrail Insights
Juniper’s Contrail Insights simplifies multicloud operations with monitoring, troubleshooting and optimizing functions based on telemetry collection, policy rules, artificial intelligence and an intuitive user interface for analysis and observability. It works with VMware, OpenStack, Kubernetes and public cloud environments, as well as private cloud data center infrastructure, and it provides visibility across the network, servers and workloads.
Contrail Insights is available standalone or as part of Contrail Enterprise Multicloud, where it’s combined with Contrail Networking and Contrail Security. It’s also available with Contrail Cloud for service providers. This next evolution of AppFormix has now been fully merged into the Contrail Command user interface and Contrail APIs, completing the trifecta of Contrail Networking, Security and Insights products.
In our new release of Contrail Insights, we have greatly expanded the analytics and observability features well beyond what AppFormix previously offered.
Seeing below the tip of the infrastructure iceberg
As revealed in Juniper’s 2019 State of Network Automation Report, monitoring is the most time-consuming day-to-day task for network and security operations teams. As teams automate more, monitoring increasingly becomes the cornerstone of operations, since there are fewer changes to perform manually.
Contrail Insights is now doing for monitoring and troubleshooting what Contrail Networking and Contrail Security did for data center, cloud and cloud-native orchestration. Its instrumentation makes it possible to see through the layers of entangled automation when monitoring and troubleshooting.
Let’s take a closer look at some of the other new features:
An intuitive tour of topology
A good place to start is the topology view. Contrail Insights shows the fabric of spine and leaf switches and links, all the way down to the servers and their hosted workloads.
The topology viewer works well for any size of data center. There are smart arrangement presets, as well as the ability to customize the display through an intuitive user interface. The user can drag and drop, select and move groups of nodes and links at once, and zoom in and out. With large topologies, this gives broad visibility and lets the user quickly narrow the focus as needed.
Visualize and analyze with a heatmap
Switch, link and server resources show up in the topology view with a configurable heatmap. Heatmaps can be based on switch resource usage, server resource usage or link usage. For example, the heatmap color scale for links can be based on bytes, packets, or relative utilization.
The right panel provides further controls. Analysis can be done contextually through the topology: mouse-over tooltips give a quick view, and clicking a resource presents detailed, customizable analytics in the right panel as charts, graphs and tables. Contrail Insights provides statistics both in real time and historically. Using the calendar to navigate back to a certain period is immensely valuable for troubleshooting past issues like microbursts or intermittent hot spots.
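To make the idea of a link heatmap based on relative utilization concrete, here is a minimal Python sketch. It is not Contrail Insights code or its API: the `LinkSample` fields, the link names, and the 50% and 80% color thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    name: str            # hypothetical link label, e.g. "leaf1-spine1"
    bytes_sent: int      # bytes observed during the sampling interval
    interval_s: float    # length of the sampling interval, in seconds
    capacity_bps: float  # link capacity, in bits per second


def relative_utilization(sample: LinkSample) -> float:
    """Return the fraction of link capacity used over the interval."""
    bits = sample.bytes_sent * 8
    return bits / (sample.interval_s * sample.capacity_bps)


def heat_color(utilization: float) -> str:
    """Map utilization onto an illustrative three-step color scale."""
    if utilization < 0.5:
        return "green"
    if utilization < 0.8:
        return "yellow"
    return "red"


# Made-up one-minute samples on 10 Gbps links.
links = [
    LinkSample("leaf1-spine1", 15_000_000_000, 60.0, 10e9),
    LinkSample("leaf1-spine2", 45_000_000_000, 60.0, 10e9),
    LinkSample("leaf2-spine1", 66_000_000_000, 60.0, 10e9),
]
heat = {link.name: heat_color(relative_utilization(link)) for link in links}
```

With these sample values the three links land at roughly 20%, 60% and 88% utilization, so each falls into a different color bucket.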
Powerful querying and root-cause analysis with drill downs
The top-N view features a larger-scale bar chart or table in the main panel of the user interface, allowing operators to explore multi-parameter queries in more detail and to sort the top N results.
To build a query in this mode, the user sets the query filters in the right panel. All query fields are populated with drop-down results, so the user doesn’t have to guess or remember resource names. This makes it easy to find traffic in a virtual network or between two points. Alternatively, the user can select points or links in the topology view and then click the top-N button there, which opens the top-N interface with filters preset to match the selection.
Using the drill-down button in the table of results will recurse through the search results. This can aid in sifting through traffic volumes to find an exact flow or traffic group that may be an issue. For example, if a link is running hot, the user can query between the source and destination nodes and then drill down through the traffic volumes of culprit overlays, protocols, and flows.
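The top-N and drill-down workflow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's query engine: the flow records, the field names in `FIELDS`, and the `top_n` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical flow records: (virtual_network, protocol, src_ip, dst_ip, bytes)
FIELDS = ("virtual_network", "protocol", "src_ip", "dst_ip")
FLOWS = [
    ("vn-blue", "tcp", "10.1.0.5", "10.1.0.9", 700),
    ("vn-blue", "udp", "10.1.0.5", "10.1.0.9", 100),
    ("vn-blue", "tcp", "10.1.0.6", "10.1.0.9", 400),
    ("vn-red", "tcp", "10.2.0.3", "10.2.0.7", 300),
]


def top_n(flows, group_by, n, filters=None):
    """Sum bytes per value of `group_by` for flows matching `filters`,
    then return the n largest groups as (value, total_bytes) pairs."""
    filters = filters or {}
    group_idx = FIELDS.index(group_by)
    totals = Counter()
    for flow in flows:
        if all(flow[FIELDS.index(k)] == v for k, v in filters.items()):
            totals[flow[group_idx]] += flow[4]
    return totals.most_common(n)


# Step 1: which virtual networks carry the most traffic?
by_vn = top_n(FLOWS, "virtual_network", 2)

# Step 2: drill down into the top virtual network by protocol.
top_vn = by_vn[0][0]
by_proto = top_n(FLOWS, "protocol", 2, {"virtual_network": top_vn})

# Step 3: drill down again to the source IPs behind the top protocol.
by_src = top_n(
    FLOWS, "src_ip", 2,
    {"virtual_network": top_vn, "protocol": by_proto[0][0]},
)
```

Each drill-down step reuses the previous result as an extra filter, which is how a hot link can be narrowed to a culprit overlay, then a protocol, then individual sources.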
Troubleshooting with path finder
For each search result row in the top-N view, there’s also a simple find-path button to jump into the Contrail Insights path finder tool. This is a simple way to get into the path finder interface with the right-panel filters all preset to match the context of the row in the previous view, but you can also build path finder queries from scratch.
The path finder tool is ideal for troubleshooting in a visual way, with the heatmap-contextual topology. It displays the path through the network topology for a specific flow or particular set of traffic parameters, and it presents an elegant solution to the problems of overlay-underlay correlation.
Traffic groups—for example a given 3-tuple of source IP, destination IP, and protocol—can be balanced across multiple paths in a data center fabric. Path finder highlights the breakdown of the amount of traffic per path. In the right panel, paths are broken down across a bar chart, showing their relative share and allowing selection of those bars for individual subgroups of the traffic taking one path through the network. Also in the right panel, there is a line graph showing the traffic bandwidth over the selected time window.
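As a rough sketch of the per-path breakdown, the Python below computes each path's relative share of one traffic group. The sample data and the `path_shares` helper are hypothetical, and the way path finder derives its figures internally is not described here, so treat this purely as an illustration of the arithmetic.

```python
from collections import defaultdict

# Hypothetical samples for one traffic group (same source IP, destination IP
# and protocol): each entry is (path as a tuple of hops, bytes observed).
samples = [
    (("leaf1", "spine1", "leaf2"), 600),
    (("leaf1", "spine2", "leaf2"), 300),
    (("leaf1", "spine1", "leaf2"), 100),
]


def path_shares(samples):
    """Return each path's fraction of the traffic group's total bytes."""
    totals = defaultdict(int)
    for path, nbytes in samples:
        totals[path] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {path: nbytes / grand_total for path, nbytes in totals.items()}


shares = path_shares(samples)
```

Here the path through `spine1` carries 70% of the bytes and the path through `spine2` carries 30%, which is the kind of split a bar chart of paths would show.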
Overlay to underlay correlation
Imagine troubleshooting an incident in which an application team is experiencing issues. In such cases, the network engineer knows only the overlay information, such as the workload endpoints (i.e., source and destination IP addresses), and needs to find the path in the underlay network. This requires correlating the overlay networks with the physical underlay data center networks.
Path finder shows the topology with the link path highlighted for the end-to-end path, workload to workload, traversing the server hosts and switching fabric. Because Contrail is overlay and underlay aware, it has all the context to filter on the appropriate domain, tenant, or virtual network, as well as the source and destination of the workload IPs. This is easily filtered in the path finder right panel to reveal the path through the fabric topology shown in the main panel. When the pressure is on for network engineers to show network innocence or find a problem, path finder is a leap forward in troubleshooting.
Underlay to overlay correlation
In the reverse scenario, imagine the NetOps team must determine which applications are using the most bandwidth between two points in the physical network. The network engineer knows only the underlay physical switch IP addresses or interfaces and would like to know the top workloads whose overlay traffic is using the path between those two points.
From the top-N view, the user can select the overlay source and destination, along with other fields of interest, to present in the results. Then, in the right panel as query parameters, the user sets the filter to match the underlay source and destination switches or interfaces. The table view, or particularly its bar-chart view, shows the distribution of top overlay flows between the two switches. Now, to illustrate the result in a topology view, the user simply clicks the flow result’s find-path button in the given row. Presto! Contrail will render the path finder view for the end-to-end flow, clearly illustrating which part of the switch fabric the traffic is taking.
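The underlay-to-overlay query above amounts to filtering on underlay endpoints and ranking by overlay endpoints. The Python sketch below shows that logic, assuming flow records that carry both sets of fields. The `FlowRecord` type, its field names, and the sample data are hypothetical and do not represent the Contrail data model.

```python
from collections import Counter
from typing import NamedTuple


class FlowRecord(NamedTuple):
    overlay_src: str   # workload source IP
    overlay_dst: str   # workload destination IP
    underlay_src: str  # hypothetical switch name on the underlay
    underlay_dst: str
    nbytes: int


RECORDS = [
    FlowRecord("10.1.0.5", "10.1.0.9", "leaf1", "leaf2", 500),
    FlowRecord("10.1.0.6", "10.1.0.9", "leaf1", "leaf2", 900),
    FlowRecord("10.1.0.5", "10.1.0.9", "leaf1", "leaf2", 200),
    FlowRecord("10.2.0.3", "10.2.0.7", "leaf3", "leaf2", 5000),
]


def top_overlay_flows(records, underlay_src, underlay_dst, n):
    """Filter on the underlay switch pair, then rank overlay
    (source, destination) pairs by total bytes."""
    totals = Counter()
    for rec in records:
        if rec.underlay_src == underlay_src and rec.underlay_dst == underlay_dst:
            totals[(rec.overlay_src, rec.overlay_dst)] += rec.nbytes
    return totals.most_common(n)


top = top_overlay_flows(RECORDS, "leaf1", "leaf2", 2)
```

The large flow entering from `leaf3` is excluded by the underlay filter, so the ranking reflects only the overlay traffic between the two chosen switches.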
Resource consumption for a given tenant’s virtual network
In this use case, a data center operator wants to know the server and network resource consumption across a tenant or to drill down into more specific consumption at some points.
Starting in the topology view, the user can set up the heatmap configuration for a given time range and filter on just one tenant at the level of the source virtual network field. The topology heatmap highlights will activate for all links, servers and network nodes participating in that virtual network. Hovering the mouse over the highlighted resources shows a quick tooltip view of the resource consumption contextual to our single virtual network. To drill down further, simply click on any resource in the map and the right panel will present the default charts and tables that can be reconfigured to suit the search. For more detailed analysis the user can contextually launch into the top-N and path finder tools.
By now this illustrative blog has given you a good taste of the power of Contrail Insights.
If you’re joining us this week, be sure to check out the breakout session on “Insights and Operational Simplicity” and the demos in the Enterprise Multicloud kiosks. You can also binge Contrail demos to your heart’s content in our YouTube on Contrail Enterprise Multicloud. When you’re ready to judge for yourself, ask your Juniper account team or partner for a demo.