Today is Thanksgiving, and Jack loves this day. When he was growing up, Thanksgiving meant a day of fun and a sumptuous dinner with his entire family. It was the one day of the year when his brothers and cousins made time, without excuses, to be with each other. Jack has kept the tradition alive. His wife, Wendy, cooks a delicious turkey while he and his kids, Diana and Sam, take charge of the decorations.
But Thanksgiving hasn’t been as much fun as it used to be. Life has been anything but easy since Jack joined ACME networks as a network architect. He is always anxious. A single phone call on this day can ruin the entire celebration. A network issue could arise at any time, and Jack would have to be at his office within 30 minutes. In the three years since he joined ACME networks, Jack has missed many Christmases, birthdays, and New Year celebrations.
Crisis
Today had been a relatively quiet day. As Jack sat down to a lavish Thanksgiving dinner, he hoped to get through the rest of the day without any interruptions from work. But alas! Just as he was picturing a relaxed evening ahead, the phone rang. A few critical alarms had gone off. They could not wait until the next day, because alarms on a day as critical as Thanksgiving could have grave implications for uptime, customer experience, and deliveries. Jack and his colleague Frank had to troubleshoot the issues urgently. They also had to stabilize the overall network environment, from both strategic and operational angles, before things spiraled out of control.
As Jack sat in his car, he could see the disappointment in the eyes of Wendy, Diana, and Sam. They knew how tough his job was, but they wished he could spend more time with them. And so did Jack. But did he have any choice?
Panic and confusion reigned at ACME networks. ACME is a large corporation with numerous branches across the globe. It is also a toy company, so a holiday like this was probably the most crucial business period of the year. The company could not afford any delays in its processes, its data, or its IT hygiene factors. Its network had a lot packed under its bonnet, and one small slip could cost a fortune.
To monitor its vast multi-vendor IT network, the organization had procured numerous monitoring and automation tools: one to capture syslog information, one for SNMP, and yet another for streaming telemetry. All of this information was then sent to another utility that displayed the data of interest. ACME networks had defined policies on this utility which, when violated, triggered alerts and notifications through yet another alerting tool.
Jack saw dozens, even hundreds, of alarms on his dashboard. He knew instantly that he and Frank would spend many hours finding the relevant issue in this vast and tangled haystack of information. The alarms were a chain reaction: an issue in one device had possibly cascaded to others. Jack and Frank would have to go through every alarm to pinpoint the exact cause. Already weary at the prospect of the long hours ahead, they divided the work and got started.
After a few hours of sifting through alarms and logs, Jack and Frank finally narrowed down the problem: an improper route configuration on one of the routers. Not only was the route misconfigured, it was also non-compliant with the policies Jack had set up. The configuration-management tool showed that the mistake had been made during a recent software upgrade on the device. The method of procedure that Jack had laid out for the upgrade had not been followed, and that had caused the error.
Jack quickly corrected the configuration and brought the network back online.
Solution
As Jack made his way home to an evening that was already over and a house sleeping under a heavy silence, he realized this could not go on. He had to find a better solution. Monitoring and analyzing a dozen different tools to pinpoint a single issue was not sustainable. He could not keep spending his precious holidays sifting through logs to identify and troubleshoot human errors. That’s not how it should be. Technology should make our lives more comfortable, not messier. He decided that he needed one solution that does all of the following:
- Standardizes methods of procedure
- Checks for compliance violations
- Collects data from a variety of sources on one platform
- Defines policies and enforces them through automatic remediation
- Groups alerts and correlates events
- Performs in-depth network analytics
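To make the alert-grouping and event-correlation requirement concrete, here is a minimal sketch of one simple approach: roll each alarm up to the topmost alarming device it depends on. It is not how ATOM or any particular product works. The device names, the `UPSTREAM` map, and both functions are hypothetical, and the sketch assumes each device has at most one upstream dependency.

```python
from collections import defaultdict

# Hypothetical topology: each device maps to the upstream device it depends on.
UPSTREAM = {
    "core-rtr-1": None,
    "dist-sw-1": "core-rtr-1",
    "dist-sw-2": "core-rtr-1",
    "access-sw-7": "dist-sw-1",
    "branch-rtr-3": None,
}


def probable_root(device, alarming, upstream):
    """Walk upstream while the parent is also alarming; return the topmost one."""
    root = device
    seen = {root}  # guards against accidental cycles in the topology map
    parent = upstream.get(root)
    while parent is not None and parent in alarming and parent not in seen:
        root = parent
        seen.add(root)
        parent = upstream.get(root)
    return root


def group_alarms(alarms, upstream=UPSTREAM):
    """Group (device, message) alarms under their probable root device."""
    alarming = {device for device, _ in alarms}
    groups = defaultdict(list)
    for device, message in alarms:
        root = probable_root(device, alarming, upstream)
        groups[root].append((device, message))
    return dict(groups)


# Made-up alarms resembling a cascade from one router plus an unrelated event.
alarms = [
    ("access-sw-7", "uplink unreachable"),
    ("dist-sw-1", "BGP neighbor down"),
    ("core-rtr-1", "route flap detected"),
    ("dist-sw-2", "BGP neighbor down"),
    ("branch-rtr-3", "high CPU"),
]

groups = group_alarms(alarms)
for root, members in groups.items():
    print(f"{root}: {len(members)} alarm(s)")
```

With grouping like this, an engineer in Jack's position would start from two candidate root devices instead of five separate alarms.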
After evaluating many solutions, Jack found just what he was looking for. He concluded that Anuta ATOM was the only solution that could help him achieve his goals: a single pane of glass and a single source of truth for his entire network. The horizontally scalable platform would help him eliminate silos and standardize his network. The workflow capability would help him integrate with ITSM solutions effortlessly.
Jack decided to run a PoC with the Anuta ATOM platform. He was excited to try this next-gen platform.
But he soon ran into his first obstacle. His engineers were uncomfortable with the idea of automating provisioning and resource monitoring. They were more comfortable working in a terminal, and a simplified UI that abstracted resources made them uneasy. They were also accustomed to defining policies and procedures in Word documents. Network automation with ATOM was new to them, and defining policies meant using the ATOM UI, which they were not used to. The engineers were also used to the logs from their earlier tools. Now they needed to understand ATOM logs and what they showed. Nor did the engineers trust the auto-remediation feature in ATOM. They feared undesirable consequences if the automation misbehaved.
Jack hadn’t anticipated this backlash from the network engineers. He was surprised and dejected by their attitude, but he was determined to bring about the change. Working alongside Anuta engineers, he showed how easy it was to use the intuitive ATOM UI. The network engineers no longer had to access the devices directly. The detailed logs in the ATOM platform provided substantial information on every operation ATOM performed. Jack used the ATOM Workflow Automation feature to standardize device upgrade procedures for Cisco and Juniper devices. Engineers could now see for themselves how automated device upgrade procedures eliminated human errors and prevented the kind of situation that had spoilt a much-awaited and much-deserved Thanksgiving.
To ease the engineers’ fears, Jack enabled approval-based remediation, in which ATOM asks for approval before it remediates any detected issue.
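The idea behind approval-based remediation can be sketched generically as a gate that runs a fix only after an approver says yes, and records the outcome either way. This is an illustrative sketch, not ATOM's API. The `Remediation` and `ApprovalGate` classes, the approver callback, and the issue strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Remediation:
    issue: str                  # human-readable description of the detected issue
    action: Callable[[], str]   # the fix to run; returns a short result message


@dataclass
class ApprovalGate:
    approver: Callable[[str], bool]  # hypothetical hook, e.g. an ITSM ticket decision
    audit_log: List[str] = field(default_factory=list)

    def handle(self, remediation: Remediation) -> bool:
        """Run the remediation only if approved; log the outcome either way."""
        if not self.approver(remediation.issue):
            self.audit_log.append(f"SKIPPED {remediation.issue}: approval denied")
            return False
        result = remediation.action()
        self.audit_log.append(f"APPLIED {remediation.issue}: {result}")
        return True


# Example: an approver that only lets route-related fixes through.
gate = ApprovalGate(approver=lambda issue: "route" in issue)

applied = gate.handle(
    Remediation("non-compliant static route", lambda: "route restored to policy")
)
skipped = gate.handle(
    Remediation("unexpected firmware version", lambda: "firmware rolled back")
)

for entry in gate.audit_log:
    print(entry)
```

Keeping the approval decision in a separate callback means the same gate can sit in front of a person at a prompt, a ticketing workflow, or (once trust is established) an auto-approve rule.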
Epilogue
The PoC was a hit. The network engineers gradually saw the agility that was now possible, and they were soon convinced of the value the ATOM platform could bring. Jack became an instant star in his organization. His forward thinking not only helped the engineers but also added to the company’s profits by reducing operating expenditures. His team was more motivated than ever before. Jack was promoted and widely recognized as a thought leader in his organization.
It’s been a year since ATOM was deployed. Jack has established himself as an automation expert. He has not only automated his entire department but also introduced ATOM to other departments. Jack now has enough time not only to celebrate every family event but also to pursue the hobbies he had sacrificed to the constant treadmill of network firefighting. Now he’s always there for his family. His kids love him for being around when they need him. His employees admire him for his bold and strategic decision. A single, smart decision has brought about a massive turnaround in Jack’s life.
Jack never forgets to say his thanks – to network automation.