Enhanced Control, Reduced Complexity with ATOM’s Auto Rollback for Multi-Domain Controller Orchestration

Navigating the Network Horizon

Tools like Cisco’s vManage, Meraki, and DNA Center have become indispensable in the evolving enterprise networking landscape. These tools offer administrators visibility and control over their networks, from SD-WAN deployments to end-to-end network policies.

The Role of an Orchestrator

Navigating this diverse landscape of networking tools poses a significant challenge for administrators. Each tool has its distinct interface and intricacies, creating a complex environment where seamless coordination becomes demanding. Imagine switching between platforms, each requiring a specific skill set and understanding. This juggling act consumes time and effort and increases the likelihood of errors due to the inherent complexity of managing multiple tools.

Now, bring an orchestrator into the picture. Positioned above the individual controllers, an orchestrator takes the helm, providing a centralized layer that unifies and enhances their functionality.

  • Automated Workflows: Tasks such as deploying a new branch office’s network policies or updating device firmware can be templated and automated. This automation not only reduces manual effort but ensures consistency across the network.
  • Unified Management: Instead of switching between multiple interfaces, administrators get a consolidated view and control panel. This unification streamlines management and reduces the chances of configuration errors.
  • Intelligent Rollbacks: In the dynamic networking world, changes can sometimes lead to unforeseen issues. Orchestrators can automatically roll back configurations to a known good state, ensuring network uptime and reliability.
  • Enhanced Reporting and Insights: By pulling data from all these tools, orchestrators can provide holistic insights into network health, security, and performance. These insights can guide proactive measures, optimizing the network even before issues arise.

 

An orchestrator amplifies the capabilities of the individual controllers. It offers administrators a simpler yet more powerful means to ensure optimal, consistent, and reliable network operations.

However, automation often needs to handle rollbacks to accommodate device issues, changes of plan, and similar events. In this blog, we will discuss various approaches to achieving rollback.

Preparing for the Unforeseen:

Let’s consider a scenario where a network operator makes changes to the IPv6 configuration on a router to implement a new addressing scheme, but the changes inadvertently cause connectivity issues or disruptions. In such cases, the operator might need to perform an IPv6 rollback to revert to a known and stable configuration quickly.

Potential Failures to Consider:

  • A service can become unreachable in the middle of the workflow.
  • A service call can receive a delayed response or time out.
  • Service calls can emit runtime errors.
  • Business logic can raise exceptions.

Some failed operations can be retried before performing a rollback. But, in the absence of Auto Rollback capabilities, developers need to add explicit instructions to handle failures. A closely related concept is ‘Undo’ of a successful run. While Undo can be thought of as an advanced feature, rollback is certainly as important as the use case flow itself.
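To illustrate what "explicit instructions to handle failures" can look like without Auto Rollback, here is a minimal Python sketch. It is not ATOM code: the names call_with_retry and run_workflow are hypothetical, and the choice to retry only timeouts and connection errors (and to roll back immediately on anything else, such as a business logic exception) is an assumption made for the example.

```python
import time
from typing import Callable, List, Tuple


class StepFailed(Exception):
    """Raised when a step still fails after all retries (hypothetical name)."""


def call_with_retry(action: Callable[[], None], attempts: int = 3,
                    delay_s: float = 0.0) -> None:
    """Retry transient failures only; other errors propagate immediately."""
    last_error = None
    for _ in range(attempts):
        try:
            action()
            return
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            time.sleep(delay_s)
    raise StepFailed(str(last_error))


def run_workflow(steps: List[Tuple[Callable[[], None], Callable[[], None]]]) -> bool:
    """Run (action, undo) pairs in order.

    Returns True on success. On failure, undoes the completed steps
    in reverse order and returns False.
    """
    completed_undos = []
    for action, undo in steps:
        try:
            call_with_retry(action)
        except Exception:
            # Hand-written rollback: the developer must maintain this logic.
            for completed_undo in reversed(completed_undos):
                completed_undo()
            return False
        completed_undos.append(undo)
    return True
```

Every workflow written this way has to pair each action with its own undo and keep the two in sync, which is the developer burden the rest of this post is concerned with.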

There are two aspects to an Orchestrator’s value: 

  1. Runtime Capabilities, Stability, Scalability, Maintenance, etc.
  2. Application Developer productivity

 

In ATOM, cross-domain use cases are programmed as BPMN workflows using drag-and-drop tools and various developer aids. It is imperative that developers think through the failure scenarios of their use cases and arrange for appropriate remedial actions in their workflows.

In this post, I will explain how ATOM helps increase developer productivity when it comes to handling rollback and undo operations.

Auto Rollback with ATOM

ATOM offers Auto Rollback in case of workflow failures or Undo operations. In an IPv4 to IPv6 migration scenario, if the IPv6 configuration fails, the compensation event triggers an IPAM rollback of the IPv6 reservation. Simultaneously, the interface command rolls back to the previous IPv4 configuration.
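The migration scenario can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ATOM's implementation: FakeIpam, FakeRouter, and migrate_to_ipv6 are hypothetical stand-ins, the configuration strings are illustrative rather than real device syntax, and the addresses come from the documentation ranges.

```python
class FakeIpam:
    """Hypothetical in-memory stand-in for an IPAM system."""

    def __init__(self):
        self.reserved = set()

    def reserve(self, prefix):
        self.reserved.add(prefix)

    def release(self, prefix):
        self.reserved.discard(prefix)


class FakeRouter:
    """Hypothetical router holding a single interface configuration string."""

    def __init__(self, interface_config):
        self.interface_config = interface_config
        self.fail_ipv6 = False  # set to True to simulate a device-side failure

    def apply(self, config):
        if self.fail_ipv6 and config.startswith("ipv6"):
            raise RuntimeError("device rejected IPv6 configuration")
        self.interface_config = config


def migrate_to_ipv6(ipam, router, ipv6_prefix):
    """Reserve an IPv6 prefix and configure it; compensate both steps on failure."""
    previous_config = router.interface_config  # remember the known good state
    ipam.reserve(ipv6_prefix)
    try:
        router.apply(f"ipv6 address {ipv6_prefix}")
    except RuntimeError:
        ipam.release(ipv6_prefix)      # compensation: undo the IPAM reservation
        router.apply(previous_config)  # compensation: restore the IPv4 config
        return False
    return True


ipam = FakeIpam()
router = FakeRouter("ip address 192.0.2.1 255.255.255.0")
router.fail_ipv6 = True
succeeded = migrate_to_ipv6(ipam, router, "2001:db8::1/64")
```

In this run the simulated device rejects the IPv6 configuration, so the reservation is released and the interface keeps its original IPv4 configuration.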

ATOM Execution:

  1. Tasks, Transaction Logs, Call Logs & Rollback Plans
    - ATOM provides a Task construct to tie logical operations together.
    - ATOM Tasks are hierarchical objects. A workflow can use one or more tasks as needed.
    - Tasks are tracked to the transactions induced by workflow steps.
    - These transactions are converted to external service call chains.
    - Compensating steps for external calls are computed and stored as a ‘Rollback Plan.’
    - When the system or a user triggers a rollback/undo, the system verifies the rollback plan and executes it.

  2. Operators can review the transaction logs and call logs if so desired
  3. Operators can review the rollback plan before they trigger rollback/undo
  4. ATOM categorizes APIs into Provisioning and Logging/Reporting types to determine their nature and how to handle their rollback.
  5. Workflow Developers can guide the ATOM engine with hints and explicit steps (for complex scenarios) if the auto rollback plan is to be adjusted.
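One way to picture how a rollback plan might be derived from a call log is the Python sketch below. It is illustrative only and does not reflect ATOM's internal data model: the Call record, its fields, the API names, and build_rollback_plan are all hypothetical. It also assumes that Logging/Reporting calls change nothing and therefore need no compensating step, and that compensating steps run in the reverse order of the original calls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    """One entry in a hypothetical call log."""
    api: str
    kind: str                             # "provisioning" or "reporting"
    target: str
    compensate_api: Optional[str] = None  # inverse operation, if one is known


def build_rollback_plan(call_log: List[Call]) -> List[str]:
    """Compute compensating steps, newest call first."""
    plan = []
    for call in reversed(call_log):
        if call.kind != "provisioning":
            continue  # assumption: reporting calls have nothing to undo
        if call.compensate_api is None:
            # This is where a developer hint or explicit step would be needed.
            raise ValueError(f"no compensating step known for {call.api}")
        plan.append(f"{call.compensate_api} {call.target}")
    return plan


call_log = [
    Call("ipam.reserve", "provisioning", "2001:db8::/64", "ipam.release"),
    Call("device.get_status", "reporting", "router-1"),
    Call("device.set_interface", "provisioning", "router-1", "device.restore_interface"),
]
plan = build_rollback_plan(call_log)
# An operator could review `plan` here before triggering the rollback.
```

Keeping the plan as plain data separate from its execution is what makes the review step possible: the plan can be inspected, adjusted with hints, and only then run.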

 

The rollback feature applies to all types of activities except external scripts. For those, another version of the script may need to be invoked explicitly when a rollback is required.

In Conclusion

Auto rollback in multi-domain controller orchestration is a critical feature that ensures network stability and reliability. ATOM’s innovative solution addresses the challenges of handling failures and provides a seamless way to manage rollback and undo operations, ultimately increasing developer productivity. As the networking landscape evolves, having the right tools and solutions is essential for network administrators to stay ahead and maintain optimal, consistent, and reliable network operations.

Additional Contributors: Manisha Dhan
