This blog was originally published on the Apstra website. Juniper Networks acquired Apstra in 2021.
As we release AOS® 3.0, it’s time for the industry to consider taking a hard look at legacy and its crippling effects. It’s also a fitting opportunity to propose a path forward that enables organizations to eliminate legacy forever.
Everyone hates legacy, at least as the term applies to infrastructure management. So why does it happen, and how do you eliminate it? Legacy is created when growing complexity in your infrastructure prevents you from making the changes your business requires, with the agility it requires and without unreasonable risk. In order to compete, you need to make changes quickly and safely. Yet, even though you own your infrastructure (and not the other way around), legacy prevents you from doing so.
The challenge, of course, is to move away from your legacy to modern infrastructure in a manner that doesn’t disrupt your business. You want to be able to say “it’s not you, it’s me” when you feel the time is right.
The key aspect that makes it difficult to actuate change is that you may not even know what your current state is. How do you achieve “not disrupting your business” when you cannot say with certainty what business applications you are running and what their requirements are?
Your business applications are essentially a collection of compute endpoints. These endpoints have specific reachability requirements. They may have distinct security requirements. Some of them may be members of a load balancing group. They may have different HA and QoS requirements. They may differ in how mission-critical they are to your business. Typically all the aspects (reachability, security, load balancing, HA, QoS) are implemented using different, separate enforcement mechanisms.
Now, as you attempt to change or evolve your legacy infrastructure, or even move to a different one, you first need to understand the current enforcement mechanisms and their interactions. Then you need to understand and leverage the new enforcement mechanisms in your evolved infrastructure (as they have likely changed) and how they map to the ones implemented in the legacy infrastructure, all while ensuring your business applications’ requirements are still met.
The foundational automation architecture principle that helps with this situation is the separation between service/policy specification (the representation of your business applications and their requirements) and the enforcement mechanisms (how those requirements are implemented and enforced). It states that the specification of your business needs should be decoupled from the way they are implemented, satisfied, and enforced. Once you have that separation, portability of your workloads becomes feasible. The separation is also the prerequisite for the next step: mapping service intent to enforcement mechanisms.
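To make the separation concrete, here is a minimal Python sketch. It is not AOS code: `ReachabilityIntent`, `AclEnforcer`, and `SecurityGroupEnforcer` are hypothetical names, and the rendered strings are made-up syntax rather than any vendor's real configuration format. The point is only that one intent object, written once, can be handed to two different enforcement backends.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ReachabilityIntent:
    """Hypothetical service-level statement: src may reach dst on a TCP port."""
    src: str
    dst: str
    port: int


class Enforcer(Protocol):
    """Anything that can turn an intent into backend-specific configuration."""
    def render(self, intent: ReachabilityIntent) -> str: ...


class AclEnforcer:
    """Illustrative 'legacy' backend: renders intent as an ACL-style line."""
    def __init__(self, subnets: Dict[str, str]) -> None:
        # The group-name-to-subnet mapping lives in the backend, not the intent.
        self.subnets = subnets

    def render(self, intent: ReachabilityIntent) -> str:
        src = self.subnets[intent.src]
        dst = self.subnets[intent.dst]
        return f"permit tcp {src} {dst} eq {intent.port}"


class SecurityGroupEnforcer:
    """Illustrative 'new' backend: renders intent as a group-to-group rule."""
    def render(self, intent: ReachabilityIntent) -> str:
        return f"allow group:{intent.src} -> group:{intent.dst} tcp/{intent.port}"


# The intent is written once and mentions no enforcement detail.
intent = ReachabilityIntent(src="web", dst="db", port=5432)

legacy = AclEnforcer({"web": "10.0.1.0/24", "db": "10.0.2.0/24"})
modern = SecurityGroupEnforcer()

acl_line = legacy.render(intent)
sg_line = modern.render(intent)
print(acl_line)
print(sg_line)
```

Because `intent` carries no subnets, ACL syntax, or other backend detail, moving from the first backend to the second means swapping the enforcer, not rewriting the specification.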
So the first question is, what should that service specification look like? The answer to that question, in principle, has been around for quite a while. Variants of the concept of endpoint/group-based policy specification can be found in OpenStack, AWS, and Azure. What does it look like?
Business application intent is expressed as a composition of endpoints that are placed into groups to express the need for some common behavior: reachability, security, or load balancing requirements, to name a few examples. The definition of an endpoint can vary. It can be an application, a virtual machine, a container, an external (unmanaged) endpoint, a physical/logical port, etc. Policies are instantiated and related to groups or individual endpoints to define that behavior. Policies can relate to groups in a directional or non-directional manner. Policies are collections of rules that follow a “condition followed by an action” pattern. Groups can be composed of other groups, creating a hierarchy.
Endpoints, groups, policies, and rules can be thought of as building blocks for expressing intent. Dynamism is achieved by adding endpoints to groups or removing them, with each endpoint inheriting the policies applied to its groups. Behavior is changed by modifying the policies and rules that apply to groups and endpoints.
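The building blocks above can be sketched in a few lines of Python. This is an illustrative model only, not the AOS data model or any vendor's API: the class names, the string-typed `condition` and `action` fields, and the `effective_policies` helper are all assumptions made for the example, and directional policies are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Rule:
    """A 'condition followed by an action' pair (both plain strings here)."""
    condition: str
    action: str


@dataclass
class Policy:
    """A named collection of rules."""
    name: str
    rules: List[Rule]


@dataclass
class Group:
    """A set of endpoints, optionally composed of other groups."""
    name: str
    endpoints: Set[str] = field(default_factory=set)
    children: List["Group"] = field(default_factory=list)
    policies: List[Policy] = field(default_factory=list)

    def contains(self, endpoint: str) -> bool:
        # Membership is direct, or inherited through any nested group.
        return endpoint in self.endpoints or any(
            child.contains(endpoint) for child in self.children
        )


def effective_policies(root: Group, endpoint: str) -> List[str]:
    """Names of policies the endpoint inherits under root, outermost first."""
    if not root.contains(endpoint):
        return []
    names = [policy.name for policy in root.policies]
    for child in root.children:
        names += effective_policies(child, endpoint)
    return names


allow_https = Policy("allow-https", [Rule("tcp/443", "permit")])
log_all = Policy("log-all", [Rule("any", "log")])

web = Group("web", endpoints={"vm-1"}, policies=[allow_https])
prod = Group("prod", children=[web], policies=[log_all])

before = effective_policies(prod, "vm-2")  # vm-2 is not yet in any group
web.endpoints.add("vm-2")                  # the only change: group membership
after = effective_policies(prod, "vm-2")
print(before, after)
```

In this sketch, adding `vm-2` to `web` is the only operation performed, yet the endpoint picks up both the policy on `web` and the one on the enclosing `prod` group. Editing the rules inside `allow_https` would likewise change behavior for every member without touching any endpoint.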
So if this has been around for a while, what are we missing? When modeling, there are two places where things can go wrong. The first is at model definition time. To do it right, the model has to be expressive enough to cater to all use cases without being overly complicated. It needs to be complete, yet minimal. Endpoint/group-based policy specification is expressive enough. The second opportunity for error is at model application time. There are many instances where this model is applied directly to enforcement mechanisms rather than to the service/policy declaration. The result is the “right model, applied to the wrong domain”.
Then there is the task, mentioned earlier, of mapping service intent to enforcement mechanisms. If the system is not architected correctly, this can be very challenging. Intent-Based Networking systems at Level 2 of the IBN taxonomy have the necessary foundations to complete this task, namely the Single Source of Truth and real-time, event-based reaction to change.
At Apstra, we’ve built AOS to incorporate the foundations into the architecture of the solution. In the absence of these foundations, overcoming the challenge typically results in leaking of enforcement abstractions into service/policy abstractions. (See micro-segmentation blog for an example). This violates the separation of policy and enforcement principle and results in non-portable workloads, which was the primary motivation for the separation in the first place. Note that leaking abstractions are sometimes introduced on purpose as vendors building these APIs may have little or no interest in universal portability of the workloads. That spells “lock-in”, which spells inability to change, which spells legacy.
We discussed in earlier blogs that the inability to deal with Day 2 operations turns your greenfield into a brownfield overnight. Business requirements specified in an enforcement-agnostic, portable manner eliminate this danger.
This is one reason I’m excited about AOS 3.0, which we released today. For the first time, it introduces Group Based Policies, implemented using the architectural principles described above. With AOS 3.0, customers have the right starting point to eradicate legacy forever.