The Infrastructure as Code (IaC) paradigm relies on managing and provisioning infrastructure through code instead of manual processes.
The state of the infrastructure is stored in files containing a representation of its configuration. This representation does not need to be the literal configuration: it can be a data model from which the configuration, or the final state of the target equipment, is generated.
Using this type of representation allows, among other things:
- To easily track changes to the infrastructure. This type of tracking can be done with a version control system, such as Git.
- To easily deploy and replicate the configuration across the network using automation tools.
- To integrate infrastructure into CI/CD (Continuous Integration/Continuous Delivery) pipelines.
The concept of Infrastructure as Code emerged with the explosive growth of data centres. The number of physical and virtual machines that needed to be managed increased by orders of magnitude, and the traditional approach of manually configuring equipment was no longer viable. It became essential to automate mass deployments and to know the desired state of each service. The IaC paradigm makes it possible to store this state information so that automation tools can then deploy or update the services.
Thanks to the use of automation and IaC, the management of large data centres with hundreds or thousands of physical servers and hundreds of thousands of virtual servers became easier.
Efforts are now under way to transfer the same paradigm to the world of communication networks, in order to take advantage of the benefits it brings, such as:
- Reduced cost of operation, maintenance, and deployment: automated, deterministic and simplified provisioning has lower costs than manual provisioning.
- Increased flexibility and speed of deployment.
- Deterministic configurations: deployed configurations are generated from the same data every time, which removes the errors of manual typing. If there is an error in the configuration, it is deterministic, and therefore reproducible and traceable for correction.
- The use of CI/CD pipelines to test the new configuration before deployment.
- Easily traceable and editable network configuration information.
However, the implementation of this paradigm in communications networks is facing some difficult challenges:
- Heterogeneity of network equipment: each manufacturer has a different way of configuring its equipment, and even within the same manufacturer there may be models that are configured in different ways (for example, Cisco with IOS, IOS XR and NX-OS).
- Proprietary functionalities: although all manufacturers claim to advocate standards, almost all tend to make proprietary “enhancements”.
- Shortage of documentation: Many networks are not documented, and others have documentation that is completely outdated; others have fragmented and non-indexed documentation, so finding the right information is quite difficult.
- Templates: In many networks there are no official configuration templates, so the deployment of a new service often follows a “copy from somewhere else where it works” model, leading to a “drift” in the configurations that are applied in the network.
- Operating model: If network problems appear, the necessary configuration is applied to solve the problem and restore the service. Such configurations are not always reverted to a standard configuration and are rarely documented.
- Lack of knowledge: Although network automation has been around for years, there is a lack of knowledge of these techniques in both engineering and network operations departments.
Let’s skip the operational complications of implementing a Network Infrastructure as Code model and focus on the technical ones.
The first and most important stumbling block is how to represent our configuration as code. Here we have two clear models:
- Declarative: with this approach our configuration defines how we want the network equipment to behave, that is, the desired state (for example, “I need an IS-IS adjacency on the Ethernet0 interface”).
- Imperative: in this model, the exact commands to be applied to the target equipment, and the order in which to apply them, are stored.
The imperative model is quite simple and can solve our problem in the short term. But it is not useful in the medium or long term. It is not vendor-independent, as each vendor has its own command line.
The declarative model also has its problems in network equipment. In such equipment, the order in which commands are applied is often critical, and a different order of application can lead to loss of service or management.
Ideally, a declarative model should be used, with the automation that pushes the information into the network deciding whether some commands need to be executed before others. With today’s automation tools, this approach should not be a problem.
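As a rough illustration of how automation can derive a safe order from a declarative description, the sketch below sorts configuration objects so that each one is deployed after the objects it depends on. The object names and the dependency map are hypothetical and do not correspond to any vendor syntax; the sorting uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical desired-state objects, each mapped to the objects it
# depends on. The names are illustrative only.
DEPENDENCIES = {
    "bgp neighbor 192.0.2.1": {"route-policy IMPORT-CUSTOMERS"},
    "route-policy IMPORT-CUSTOMERS": {"prefix-list CUSTOMERS"},
    "prefix-list CUSTOMERS": set(),
}


def deployment_order(dependencies):
    """Return the objects ordered so that every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


order = deployment_order(DEPENDENCIES)
for step, item in enumerate(order, start=1):
    print(step, item)
```

The data stays declarative: it only states what must exist and what refers to what. The order of application is computed by the tooling, so a policy is never attached to a BGP neighbour before the policy itself exists.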
At this point it is necessary to have an abstraction that allows us to model the configuration of an equipment regardless of the manufacturer it belongs to.
The dominant modelling language, which is also used with network management protocols such as NETCONF, is YANG (Yet Another Next Generation).
YANG describes the structure of a device’s configuration and state; the data that conforms to a model is encoded as XML or JSON, and automation tools often handle it as YAML as well. There are standard YANG models that can be used with equipment from almost all manufacturers. In addition, each manufacturer defines its own models to cover its specific features.
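To make this concrete, the following sketch builds the data for a single interface according to the standard `ietf-interfaces` and `ietf-ip` YANG models and serialises it as JSON, following the JSON encoding rules for YANG data (RFC 7951). The interface name and IP address are illustrative values, not taken from any real network.

```python
import json

# Instance data for one interface, structured according to the standard
# ietf-interfaces and ietf-ip YANG models. Name and address are examples.
interface_config = {
    "ietf-interfaces:interfaces": {
        "interface": [
            {
                "name": "Ethernet0",
                "type": "iana-if-type:ethernetCsmacd",
                "enabled": True,
                "ietf-ip:ipv4": {
                    "address": [
                        {"ip": "192.0.2.1", "prefix-length": 24}
                    ]
                },
            }
        ]
    }
}

# Serialise to JSON, for example to store it in version control.
encoded = json.dumps(interface_config, indent=2)
print(encoded)
```

Nothing in this data says how the address is configured on a particular device. That translation is left to the device itself (when it supports the model) or to the automation layer.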
The problem of achieving complete abstraction is found not in simple configurations, such as applying an IP address to an interface, but in complex structures such as routing policies. The way these policies are implemented by each vendor is quite different, but all vendors typically use several configuration objects to define these policies:
- Objects to filter by prefix.
- Various objects to filter by BGP attributes.
It is not easy to model a routing policy in a declarative way. It is even more complex to abstract this policy from the manufacturer so that it can be deployed on equipment from any of them. However, this pitfall can be overcome with a good definition of the data model and configuration templates. These templates would allow the same input (data) to generate different configurations depending on the type of equipment.
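A minimal sketch of this idea, restricted to a prefix filter, could look as follows. The data structure, the function names and the platform keys are assumptions made for illustration, and the generated lines are simplified IOS-style and Junos-style commands rather than complete configurations. A real deployment would more likely use a template engine, but plain functions keep the example self-contained.

```python
# Vendor-neutral data: the prefix list is described once.
prefix_list = {
    "name": "CUSTOMERS",
    "prefixes": ["192.0.2.0/24", "198.51.100.0/24"],
}


def render_ios_style(data):
    """Render the prefix list as simplified IOS-style commands."""
    return [
        f"ip prefix-list {data['name']} seq {(index + 1) * 5} permit {prefix}"
        for index, prefix in enumerate(data["prefixes"])
    ]


def render_junos_style(data):
    """Render the prefix list as simplified Junos-style set commands."""
    return [
        f"set policy-options prefix-list {data['name']} {prefix}"
        for prefix in data["prefixes"]
    ]


# Hypothetical platform keys mapped to their renderer ("template").
RENDERERS = {"ios": render_ios_style, "junos": render_junos_style}


def render(platform, data):
    """Generate the configuration lines for the given platform."""
    return RENDERERS[platform](data)


for platform in RENDERERS:
    print(platform, render(platform, prefix_list))
```

Supporting a new manufacturer then means adding one renderer, while the data describing the policy stays untouched.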
Once we see that it is feasible, albeit with some effort, to model the network, we need to look at how to obtain data on the current state of the network. In a greenfield deployment this would not be a problem, as the modelling would precede the implementation of the network. But this is not the usual case. Some equipment can export its configuration as data structured according to YANG models, but not all. For the rest, the configuration must be processed in order to generate the data model. There are tools, such as NAPALM, that can parse the configurations of some manufacturers and model them. For manufacturers not covered by these tools, it will be necessary to implement a configuration parser in order to extract the data into the common model.
The information obtained is what will be stored in the version control system, and what will be used to deploy configuration to the network.
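For equipment that requires a custom parser, the extraction step might look like the sketch below. It handles only a tiny, IOS-style subset of interface configuration, and the field names of the resulting model (`description`, `ipv4`, `enabled`) are assumptions chosen for the example.

```python
import re

# Sample device configuration (IOS-style, heavily simplified).
RAW_CONFIG = """\
interface Ethernet0
 description Uplink to core
 ip address 192.0.2.1 255.255.255.0
!
interface Ethernet1
 shutdown
!
"""


def parse_interfaces(text):
    """Extract a small interface data model from IOS-style text."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^interface (\S+)$", line)
        if header:
            # Start a new interface with default values.
            current = {"description": None, "ipv4": None, "enabled": True}
            interfaces[header.group(1)] = current
            continue
        if current is None or not line.startswith(" "):
            # Any non-indented line closes the current interface block.
            current = None
            continue
        statement = line.strip()
        if statement.startswith("description "):
            current["description"] = statement[len("description "):]
        elif statement == "shutdown":
            current["enabled"] = False
        else:
            address = re.match(r"^ip address (\S+) (\S+)$", statement)
            if address:
                current["ipv4"] = {
                    "address": address.group(1),
                    "netmask": address.group(2),
                }
    return interfaces


parsed = parse_interfaces(RAW_CONFIG)
print(parsed)
```

Statements the parser does not recognise are silently ignored here. A production parser would need to report them, since unparsed lines are exactly where undocumented configuration tends to hide.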
Once stored in version control, the data model must be treated as the Single Source of Truth, which has implications for the way the network is operated:
- From now on all network changes must be made based on changes in the configuration files of our data model.
- These changes are deployed in the network through the relevant automations.
- No changes should be made to the network manually, as this could lead to configuration drift.
Maintaining the single source of truth is fundamental in the IaC paradigm. Infrastructure management becomes unfeasible if there are discrepancies between the data stored in the repository and the actual network configuration.
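Detecting such discrepancies can be automated by comparing the intended data with the data read back from the equipment. The sketch below is one possible approach, assuming both sides have already been converted to nested dictionaries; the keys and values are illustrative.

```python
def find_drift(intended, actual, path=""):
    """Return (path, intended_value, actual_value) for every difference."""
    drift = []
    for key in sorted(set(intended) | set(actual)):
        location = f"{path}/{key}"
        want = intended.get(key)  # None means "not present in the model"
        have = actual.get(key)    # None means "not present on the device"
        if isinstance(want, dict) and isinstance(have, dict):
            # Both sides have a subtree here: compare it recursively.
            drift.extend(find_drift(want, have, location))
        elif want != have:
            drift.append((location, want, have))
    return drift


# Data from the single source of truth (illustrative values).
intended = {
    "ntp": {"servers": ["192.0.2.10", "192.0.2.11"]},
    "syslog": {"host": "192.0.2.20"},
}
# Data obtained from the device (illustrative values).
actual = {
    "ntp": {"servers": ["192.0.2.10"]},
    "syslog": {"host": "192.0.2.20"},
    "snmp": {"community": "public"},
}

for location, want, have in find_drift(intended, actual):
    print(f"{location}: intended={want!r} actual={have!r}")
```

In this example the report flags a missing NTP server and an SNMP community that exists on the device but not in the model, which are the two typical forms of drift: configuration that was never deployed and configuration that was applied by hand.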
For the above reasons, fully establishing the IaC paradigm in a complex communications infrastructure is practically a pipe dream today. It is entirely possible in SDN-oriented infrastructures, such as Cisco’s ACI; Cisco even has an agreement with HashiCorp, a leading IaC company and the maker of Terraform. In this type of network, there are centralised elements (controllers) that know the complete state of the network and have APIs that allow easy interaction with them.
So, is it possible to use the IaC paradigm in complex communications networks? The answer is that completely changing the operating model is very complex, but moving towards it is feasible.
The strategy for implementing IaC in communications networks would be to select parts of the configuration to be included in the paradigm. The first parts that can be included, which represent a real quick win, are those related to equipment management:
- SNMP configuration.
- NTP configuration.
- Syslog configuration.
- Configuration of access to the equipment itself.
This type of configuration is usually common to all equipment. Centralising this information so that the network is always in the desired state is relatively easy to achieve.
From there, other parts of the configuration can be added to our IaC infrastructure, such as:
- Configuration of network backbone interfaces: the number of these interfaces is usually small and their configuration is more or less homogeneous.
- QoS policies: these are usually defined once and then applied per interface type.
- Common routing policies: when services are standardised, policies are usually defined so that they can be reused by all clients of the same service. These policies can be modified over time to accommodate new service features. If these policies are within our IaC infrastructure, the network-wide deployment of the modification becomes trivial.
The last part that could be added to the IaC infrastructure in an already deployed network would be the customers that are already connected. This is the part that contains the most variability and is often linked to other systems, such as an operator’s own CRM. Having this information in the IaC would simplify two of the most neglected processes of an operator:
- Service decommissioning: it is common for operator networks to contain remnants of decommissioned service configurations. Under a declarative IaC model, when the service is removed from the single source of truth, the automation deletes the associated configuration.
- Service modifications: it is not uncommon for modifications to an existing service to be processed as a decommissioning followed by a new provisioning, as this is usually easier than modifying the service in place. If the IaC model is declarative, the automation takes care of these modifications.
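Both processes reduce to comparing the services declared in the single source of truth with the services currently deployed. The sketch below shows one way to compute that plan; the service names and attributes (`vlan`, `bandwidth_mbps`) are hypothetical.

```python
def plan_service_changes(desired, deployed):
    """Compare desired and deployed services and return the change plan."""
    common = desired.keys() & deployed.keys()
    return {
        # Declared in the source of truth but not yet in the network.
        "create": sorted(desired.keys() - deployed.keys()),
        # Present on both sides but with different parameters.
        "update": sorted(n for n in common if desired[n] != deployed[n]),
        # Still in the network but no longer declared: decommission.
        "delete": sorted(deployed.keys() - desired.keys()),
    }


# Services declared in the single source of truth (illustrative).
desired = {
    "cust-a": {"vlan": 100, "bandwidth_mbps": 200},
    "cust-c": {"vlan": 300, "bandwidth_mbps": 50},
}
# Services found in the network (illustrative).
deployed = {
    "cust-a": {"vlan": 100, "bandwidth_mbps": 100},
    "cust-b": {"vlan": 200, "bandwidth_mbps": 100},
}

plan = plan_service_changes(desired, deployed)
print(plan)
```

Here `cust-b` is scheduled for deletion simply because it no longer appears in the declared data, and `cust-a` is updated in place instead of being removed and recreated.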
In conclusion, the full implementation of the IaC paradigm in a traditional communications network is very complex, but it is possible to take advantage of this paradigm to simplify many network operations. In fact, there is growing interest in studying the application of this model in carrier networks.