Introduction
I have been a network engineer for about ten years. Before that, I spent my early college years exploring Cisco certifications and briefly working with the college IT team, helping them with network diagrams, Wi-Fi hotspot placement, network monitoring, and more. Even though I had not yet sat for a certification exam, I had already started studying CCIE topics by my last year of college.
All of this hard work undoubtedly contributed to my initial professional success. However, as my work became monotonous, I realized that the more effort I put in and the more devices I set up manually, the less excitement I got from my job. Brand-new network projects built on cutting-edge technology would, once in production, turn into repetitive configurations that had to be applied daily and lose their intrigue over time.
I was desperately looking for something new, and I finally found it. Roughly six months into a position at a leading telecom device vendor and services company, one of our customers was a small African telecom startup that was very cautious about its capital expenditure. As a result, every time a new team member or customer came on board, their credentials had to be configured manually on each of the 200-odd routers my team managed. At first, we would divide the task among the team and complete it over anywhere from a week to a month, alongside other mundane chores such as collecting network usage data from the devices for a biweekly report.
My ability to identify problems and inefficiencies and address them, or at least a portion of them, with my Python skills made me a successful network engineer within my organization, and it ultimately led to my current position as an Anuta ATOM Evangelist.
From the work that I have done over the last seven or so years, I have developed several best practices. My objective is to use these as examples in a blog series where I will contrast them with Anuta ATOM’s capabilities. The goal will be to demonstrate how all or most of my mundane processes and tasks could have been accomplished much more quickly with ATOM.
In prior roles, I performed most of my tasks using open-source tools like Python, Jinja2, YAML, Netmiko, TextFSM, and more. In this blog, I plan to detail how instrumental ATOM could have been in getting those network automation tasks done smoothly. I will also talk about how a network engineer with far less investment in learning Python, Ansible, and other NetDevOps tools can do the same. When it comes to network automation, the following examples are likely the ones that spring to any network engineer’s mind:
- Data collection for network troubleshooting
- Dynamic configuration generation
- Taking multiple inputs for complex tasks like config generation or filtering network data
- Using modern network controllers and device APIs
- Getting through the Jumpbox server, which in most cases is anti-creative (in my view)
- Processing CLI Data
- Integrating automation with version control and CI/CD tools like Git
1. Network Data Collection:
- One of the first projects I took on at my previous firm, about four years ago, was an essential data collection task. We were a few weeks into managing a new enterprise customer that brought its own challenges: the company had gone on an acquisition spree and absorbed many new companies with disparate networks.
- Few people appreciate how much work it takes to make one company’s IT infrastructure operate in tandem with another’s; that topic could easily be an entire blog of its own.
- A few months into the project, we faced an unusual outage pattern. To summarize, whenever the primary Internet link at the hub site went down, a few spoke sites, or subnets at a spoke site, would lose Internet reachability.
- The root cause analysis landed on asymmetric public route advertisement at the hub locations’ Internet border routers: typically, only the primary router was advertising all of the subnets to the ISP. The previous managed services provider had notably one of the worst change management systems, with everything requiring manual intervention. As a team, we joked that the network looked like a lab where most of the previous team had trained for their CCIE exams; if there are three ways to accomplish something, you were likely to find all three techniques used in different sections of this network.
- It was in that role that my manager first asked me to create a report of all the Internet border routers and what they were advertising to their local ISPs. During my initial research, I reckoned it would take three interactions with every router to obtain all the information the report needed, plus processing of every CLI output to extract the relevant data. The task required extensive use of regular expressions, because each output had to be parsed before it could be used as input (and as a filter) for the following command.
A screen capture of the Python script might help you understand the challenges.
Fig: A small portion of the script I wrote to get all the data I needed from one device. The comments in green and orange will help you understand what I was trying to accomplish.
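Since the screenshot does not reproduce well in text, here is a minimal sketch of the same idea, assuming Netmiko, a placeholder device entry, and deliberately simplified regular expressions; the real script was considerably longer:

```python
import re
from netmiko import ConnectHandler

# Placeholder device details; in the real script these came from an inventory file.
device = {
    "device_type": "cisco_ios",
    "host": "hub-border-rtr-1",
    "username": "autouser",
    "password": "secret",
}

conn = ConnectHandler(**device)

# Interaction 1: pull the BGP summary and pick out the neighbour IPs.
summary = conn.send_command("show ip bgp summary")
neighbors = re.findall(r"^(\d{1,3}(?:\.\d{1,3}){3})\s", summary, re.MULTILINE)

# Interactions 2..n: feed each neighbour back in as a parameter of the next command
# and extract the advertised prefixes with another (simplified) regular expression.
for peer in neighbors:
    advertised = conn.send_command(f"show ip bgp neighbors {peer} advertised-routes")
    prefixes = re.findall(r"^\*>?\s+(\S+)", advertised, re.MULTILINE)
    print(device["host"], peer, prefixes)

conn.disconnect()
```

Every additional data point meant another command to run and another expression to maintain, multiplied across every border router in scope.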
- Had I had access to ATOM, the effort would have been cut to a third or a quarter. I could have run an SNMP GET operation against each device in this group, subscribed to the BGP SNMP OIDs to obtain the BGP information from the ISP-facing routers, and then exported the final report straight from the ATOM UI.
- After that, I could run the final command to fetch the routes advertised to each ISP peer IP and be done!
- A task that took around four full working days could have been finished in half a day.
- Let me show you how we can use ATOM to complete the tasks mentioned above.
Fig: Building an SNMP Collection Template.
Fig: Final Report generated after SNMP Data collection from the respective group.
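For comparison, the data ATOM gathers here could also be pulled by hand with the classic pysnmp hlapi; the sketch below walks the standard BGP4-MIB peer table on one router (host name and community string are placeholders):

```python
# Rough standalone equivalent of the SNMP collection step, using the classic
# pysnmp hlapi (host name and community string below are placeholders).
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, nextCmd)

# Numeric OIDs from the standard BGP4-MIB (RFC 4273).
BGP_PEER_STATE = "1.3.6.1.2.1.15.3.1.2"        # bgpPeerState
BGP_PEER_REMOTE_ADDR = "1.3.6.1.2.1.15.3.1.7"  # bgpPeerRemoteAddr

# Walk both columns of the BGP peer table on one border router.
for err_indication, err_status, err_index, var_binds in nextCmd(
        SnmpEngine(),
        CommunityData("public"),                        # placeholder community
        UdpTransportTarget(("hub-border-rtr-1", 161)),  # placeholder hostname
        ContextData(),
        ObjectType(ObjectIdentity(BGP_PEER_REMOTE_ADDR)),
        ObjectType(ObjectIdentity(BGP_PEER_STATE)),
        lexicographicMode=False):
    if err_indication or err_status:
        break
    print(" | ".join(" = ".join(x.prettyPrint() for x in vb) for vb in var_binds))
```

ATOM hides all of this behind the collection template and report shown in the figures above.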
- Now we need to download the individual CSV files for the tables, which look like these:
Fig: I have stored the data in a single CSV file for easy viewing. Now it is just a matter of combining this data into a single table and extending the script a little further to also fetch the advertised routes.
- Anyone who writes Python will confirm that merging table data with the csv module is far easier than pulling raw command output from network devices and building complex regular expressions to extract meaningful insights (a rough sketch of such a merge follows below).
- ATOM can also access the devices in parallel, which dramatically reduces the time needed.
- Over time, this undoubtedly contributes to improving the overall hygiene and resiliency of the network.
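Here is the merge mentioned above, as a rough sketch; the file and column names are hypothetical and would simply mirror whatever the exported CSV tables contain:

```python
import csv

# Hypothetical file and column names; they would mirror the CSV exports above.
with open("device_inventory.csv", newline="") as f:
    inventory = {row["device"]: row for row in csv.DictReader(f)}

with open("bgp_peers.csv", newline="") as f:
    peers = list(csv.DictReader(f))

# Join the two tables on the device name and write one combined report.
with open("combined_report.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["device", "mgmt_ip", "peer_ip", "peer_state"])
    writer.writeheader()
    for row in peers:
        dev = inventory.get(row["device"], {})
        writer.writerow({
            "device": row["device"],
            "mgmt_ip": dev.get("mgmt_ip", ""),
            "peer_ip": row["peer_ip"],
            "peer_state": row["peer_state"],
        })
```

A handful of dictionary lookups replace what would otherwise be screenfuls of regular expressions.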
2. Device Config Generation:
Another frequent activity my team was responsible for was generating device configurations from detailed LLDs (low-level designs), which were a collection of data points from the customer, internal design teams, and internal network inventory tools.
Before using this data in device configuration commands, we had to ensure its validity. Anyone who builds device configurations knows that the same details get reused in various formats (such as the neighbor device name in lower and upper case across multiple description commands) or with minor tweaks (such as extracting the first three IPs from a given subnet to use as the LAN interface IPs of the primary and secondary routers and the VRRP IP).
When performing these operations by hand, it is easy to let your guard down and make careless errors that come back to haunt a migration window, either immediately or later.
I developed a solution using Jinja2 templates, renowned in the industry for their flexibility. The input variables came either from CSV files or from YAML files. I would also add Python scripts to the projects to render the output config and to manipulate, extrapolate, enrich, and refactor the inputs provided by team members, as well as to check their integrity (a simplified sketch follows the figures below).
Fig: Snippet of a Jinja2 config template.
Fig: Snippet of a YAML file used to take input parameters to generate the config.
Fig: Snippet of a Python script that modifies/ enhances user inputs and renders config using Jinja Templates.
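To make the pipeline concrete without relying on the screenshots, here is a simplified sketch of the approach; the template, variable names, and subnet are made up for illustration:

```python
import ipaddress
import yaml
from jinja2 import Template

# A tiny stand-in for the real template; the variable names are made up.
CONFIG_TEMPLATE = """\
hostname {{ hostname }}
interface {{ lan_interface }}
 description LAN_TO_{{ neighbor | upper }}
 ip address {{ lan_ips[0] }} {{ netmask }}
 vrrp 10 ip {{ lan_ips[2] }}
"""

# Load the per-site inputs (hostname, neighbor, lan_interface, lan_subnet, ...).
with open("site1.yml") as f:
    params = yaml.safe_load(f)

# Extrapolate extra values from the inputs, e.g. the first three usable IPs of
# the LAN subnet for the primary router, secondary router, and VRRP virtual IP.
net = ipaddress.ip_network(params["lan_subnet"])     # e.g. "10.10.20.0/29"
params["lan_ips"] = [str(ip) for ip in list(net.hosts())[:3]]
params["netmask"] = str(net.netmask)

print(Template(CONFIG_TEMPLATE).render(**params))
```

The ipaddress module handles the "first three IPs of the subnet" extrapolation mentioned earlier, so nobody has to work it out by hand.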
And guess what? ATOM can handle this complex set of tasks as well. It lets you write Jinja templates directly within ATOM and populate all the variables through a form generated in the GUI.
Since ATOM would be placed inside the company’s management network to access the devices directly, it could push this generated configuration quickly onto the target devices.
Pairing this with ATOM workflows takes the process to a whole new level. Generating configurations for all the devices in a service, say two PE routers and two CE routers, by filling in individual forms and applying them in a single process ensures both accuracy and speed. That is what we call a senior network engineer’s dream!
Additionally, this efficiency democratizes the process, because the configurations developed and deployed by a new engineer on the team are as solid as those produced by experienced ones. And, as many say, a correct config a day frees the senior engineers to spend more time on creative work and less on firefighting.
3. Taking Inputs
- As indicated above, I collected inputs through YAML or CSV files. Naturally, there are different justifications for using each; here are some examples to illustrate.
- YAML files work best when there are only a few devices, perhaps just one or two, but many variables per device, typically because the resulting configuration is lengthy.
- CSV, on the other hand, is a much better alternative when the configuration is smaller and the number of inputs per device is lower, but the same configuration needs to be generated for a larger number of devices.
- Let’s examine the YAML case for now. YAML provides a flexible way to supply inputs: it supports hierarchy within the data, maps straightforwardly onto Python data structures such as lists and dictionaries, and can be read and modified with as little as four or five lines of code.
- But there are shortcomings. The biggest is that a plain YAML file cannot verify input integrity on its own. For instance, if a value must not contain spaces, if an IP address must be supplied with its mask in slash notation, or if a field should behave like a dropdown so users can only choose from a fixed set of values, YAML alone cannot enforce any of that.
- That said, most of these issues can be overcome by writing validation functions in Python that check whether a given input satisfies all the norms (a small sketch appears at the end of this section). However, this adds a whole new layer of complexity and work.
- These are all tasks that most web frameworks, such as Django or Flask, can facilitate with very few lines of code, since most of the legwork is already done in the framework and the rest is handled by the browser.
- My first thought was to build a Django-based application that would eventually house all of my projects, large and small, but a network engineer hired to handle routine implementation requests and trouble tickets can only do so much.
- However, if the last company I worked for had invested in a tool like ATOM, it would have witnessed the efficiency of its workflow feature.
- If your goal is to divide one big task into smaller sub-tasks and handle them one at a time, with the ability to run sub-tasks in order, loop over them, handle errors along the way, and much more, then ATOM is for you.
- As an example, the user-input task, which allows the admin or developer to design an input form and define the nature of those inputs, is demonstrated in ATOM below:
Fig: A workflow highlighting a user input task that can take user inputs via a form that the workflow builder can construct. The inputs are then converted to variables that other workflow tasks can access.
Fig: Snapshots of form construction and the rendered form a user has to fill while running the workflow (The variable values generated are shown in the previous Figure).
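And for contrast, here is roughly what the home-grown alternative described earlier looks like: a small sketch that loads the YAML inputs and applies a few of the checks an ATOM form would enforce automatically (the field names and rules are hypothetical):

```python
import ipaddress
import sys
import yaml

ALLOWED_ROLES = {"pe", "ce"}   # example of a dropdown-style fixed set of values

def validate(params):
    """Return a list of problems with the supplied inputs (hypothetical rules)."""
    errors = []
    if " " in params.get("hostname", ""):
        errors.append("hostname must not contain spaces")
    if params.get("role") not in ALLOWED_ROLES:
        errors.append(f"role must be one of {sorted(ALLOWED_ROLES)}")
    subnet = params.get("lan_subnet", "")
    if "/" not in subnet:
        errors.append("lan_subnet must use slash notation, e.g. 10.0.0.0/29")
    else:
        try:
            ipaddress.ip_network(subnet)
        except ValueError:
            errors.append(f"{subnet} is not a valid prefix")
    return errors

with open("site1.yml") as f:
    params = yaml.safe_load(f)

problems = validate(params)
if problems:
    sys.exit("Input errors:\n- " + "\n- ".join(problems))
print("Inputs look sane; safe to render the configuration.")
```

Every new rule means more code to write and maintain, whereas the ATOM form builder captures the same constraints declaratively.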
Summary
- I sincerely hope that this first blog in my planned multi-part series has given you a glimpse into ATOM’s inner workings and shown how its rich feature set can support any network operator’s NetDevOps journey, flatten the steep learning curve of the many open-source tools, large and small, and empower every network engineer to perform network automation with the same agility and flexibility.
- I’ll reiterate that there is much more of ATOM’s superpowers still to be revealed; we have barely scratched the surface of its capabilities and options.
- In closing, I would like to emphasize that Anuta Networks ATOM is a tool built by enthusiasts for enthusiasts.
Additional Contributors: Sukirti Bhawna