Tuesday, March 23, 2010

Cisco Nexus command - Checkpoints and rollbacks

Using the Cisco Nexus 7010, 5010 and 2148 has changed some of the habits I built up with the Cisco IOS command set. Some of the new Nexus commands have become second nature and I now miss them on IOS. Being able to use grep is one I really wish was incorporated into IOS. I am used to having it on the ASA platform and now on the Nexus platform - going back to IOS 12.x and not having it there is annoying.

A new command that is really useful on the Nexus platform is checkpoint. There are several things that are unique about checkpoints and how you can use them. Checkpoints are primarily used for rollback situations: they let you make changes on the system and, if an error requires it, roll back to a known good configuration. There are three rollback types:
Atomic rollback is applied only if the whole rollback completes with NO errors.
Best Effort rollback skips any errors and pushes the rest of the configuration onto the system.
Stop At First Failure processes the rollback request until it hits an error and then stops.
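
To make the difference between the three types concrete, here is a small Python toy model. It is only a sketch of the behavior described above, not how NX-OS implements rollback: the function name simulate_rollback, the sample patch commands and the set of "failing" commands are all made up for illustration.

```python
def simulate_rollback(patch, fails, mode="atomic"):
    """Toy model: return the patch commands that end up applied.

    patch -- ordered list of rollback-patch commands (hypothetical examples)
    fails -- set of commands we pretend the switch rejects
    mode  -- "atomic", "best-effort" or "stop-at-first-failure"
    """
    if mode == "atomic":
        # All or nothing: a single failing command means nothing is applied.
        return [] if any(cmd in fails for cmd in patch) else list(patch)
    if mode == "best-effort":
        # Skip the commands that fail and keep going with the rest.
        return [cmd for cmd in patch if cmd not in fails]
    if mode == "stop-at-first-failure":
        # Keep what was applied before the first error, then stop.
        applied = []
        for cmd in patch:
            if cmd in fails:
                break
            applied.append(cmd)
        return applied
    raise ValueError(f"unknown rollback mode: {mode}")


if __name__ == "__main__":
    patch = ["no vlan 30", "interface e1/1", "no shutdown"]
    fails = {"interface e1/1"}
    for mode in ("atomic", "best-effort", "stop-at-first-failure"):
        print(mode, simulate_rollback(patch, fails, mode))
```

With one failing command in the middle of the patch, the atomic run applies nothing, best effort applies everything except the failing command, and stop at first failure applies only what came before it.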

The default rollback type is Atomic, and this is likely the method you would use most often in a production environment. I am not aware of many folks wanting a "Stop At First Failure" or "Best Effort" rollback unless true desperation has kicked in. One possible exception is when the order of rollback matters, for example if you are using VDCs and moving physical resources from one VDC to another, in which case Best Effort might be useful.

Also of note, the rollback feature works per Virtual Device Context (VDC); in other words, you have to run the command in each VDC. This is expected behavior, as each VDC is its own NX-OS instance and you have to run the same commands in each one to get the desired behavior out of the NX-OS platform.

The command itself is very simple:
checkpoint {checkpoint name} description {a description} | file {path and filename}
Example: checkpoint cp-running-config-known-good-2010-03-22 description checkpoint of running config

There are some restrictions on the checkpoint name (max length 80 characters) and on the filename (max length 75 characters, and it can't start with the word "system"), but otherwise it is a pretty straightforward process to get this going. I am using this on NX-OS version 4.3.1. Earlier versions had more restrictions on file names and such, so read the documentation if you are on an earlier release.
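
If you generate checkpoint names from a script, it can help to check these limits before sending anything to the switch. The following Python sketch only encodes the restrictions mentioned above. The function names are hypothetical, and two things are my assumptions rather than documented behavior: that the limits apply to the part after the last "/" or ":" in the path, and that the "system" check is case-sensitive.

```python
import re

MAX_NAME_LEN = 80      # checkpoint name limit mentioned in the post
MAX_FILENAME_LEN = 75  # checkpoint filename limit mentioned in the post


def check_checkpoint_name(name):
    """Return a list of problems with a checkpoint name (empty list = OK)."""
    problems = []
    if not name:
        problems.append("checkpoint name is empty")
    if len(name) > MAX_NAME_LEN:
        problems.append(f"checkpoint name is longer than {MAX_NAME_LEN} characters")
    return problems


def filename_part(path):
    # Assumption: the file name is whatever follows the last "/" or ":",
    # e.g. "known-good.cfg" in "bootflash:known-good.cfg".
    return re.split(r"[/:]", path)[-1]


def check_checkpoint_file(path):
    """Return a list of problems with a checkpoint file path (empty list = OK)."""
    problems = []
    name = filename_part(path)
    if not name:
        problems.append("filename is empty")
    if len(name) > MAX_FILENAME_LEN:
        problems.append(f"filename is longer than {MAX_FILENAME_LEN} characters")
    if name.startswith("system"):
        problems.append('filename starts with the word "system"')
    return problems


if __name__ == "__main__":
    print(check_checkpoint_name("cp-running-config-known-good-2010-03-22"))
    print(check_checkpoint_file("bootflash:system-backup.cfg"))
```

An empty list means the name passed the checks in this sketch; the switch itself still has the final say.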

To see what the checkpoint command does you can use the show commands. To see all the checkpoints that are in a given VDC:
show checkpoint all
show checkpoint summary

The checkpoint command basically keeps a small database of checkpoints so that you can roll back to a specific one. The system calculates the differences between a source (the current state or a checkpoint) and the checkpoint you want to move to, and generates a rollback script from them when you use the rollback command. If you want to see the differences that are being generated you can do that too:
show diff rollback-patch {checkpoint source-cp-name | running-config | startup-config | file filename} {checkpoint destination-cp-name | running-config | startup-config | file filename}
Example: show diff rollback-patch running-config checkpoint cp-running-config-known-good-2010-03-22

To actually do a rollback:
rollback running-config {checkpoint cp-name | running-config | startup-config | file filename} [atomic | best-effort | stop-at-first-failure]
Example: rollback running-config checkpoint cp-running-config-known-good-2010-03-22 atomic

To see the status of rollbacks:
show rollback log

You can also clear out the checkpoint history and files. Use this command with caution:
clear checkpoint database

This is a VERY useful command to build into your scripts prior to pushing out production changes on gear. It gives you a well-known state stored locally and lets you roll back to it quickly in case of problems in your scripts. Awesome!
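
As a rough idea of what "building it into your scripts" could look like, here is a minimal Python sketch that only generates the command strings shown in this post, in the order you would run them around a change. The function names and the dated naming scheme are my own choices for illustration, and how the commands actually get to the switch (SSH, console, whatever you use) is deliberately left out.

```python
from datetime import date

ROLLBACK_MODES = ("atomic", "best-effort", "stop-at-first-failure")


def checkpoint_name(day):
    # Hypothetical naming scheme, modeled on the example name used in this post.
    return f"cp-running-config-known-good-{day.isoformat()}"


def pre_change_commands(name, description):
    """Commands to run BEFORE pushing a change."""
    return [
        f"checkpoint {name} description {description}",
        "show checkpoint summary",
    ]


def recovery_commands(name, mode="atomic"):
    """Commands to run if the change goes wrong and you need to go back."""
    if mode not in ROLLBACK_MODES:
        raise ValueError(f"unknown rollback mode: {mode}")
    return [
        f"show diff rollback-patch running-config checkpoint {name}",
        f"rollback running-config checkpoint {name} {mode}",
        "show rollback log",
    ]


if __name__ == "__main__":
    name = checkpoint_name(date(2010, 3, 22))
    for cmd in pre_change_commands(name, "checkpoint of running config"):
        print(cmd)
    print("! ... push your production change here ...")
    for cmd in recovery_commands(name):
        print(cmd)
```

Remember that checkpoints are per VDC, so a script like this would need to run once in each VDC you are changing.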
- Ed

1 comment:

Pete89 said...

I ran into a gotcha on the checkpoint command. I got a loaner from Cisco. It's a 5596. I enabled FCoE and then tried the checkpoint command. Here is what I got:

switch# checkpoint 1stcheckpoint
ERROR: ascii-cfg: FCOE is enabled. Disabling rollback module (err_id 0x405F0053)

Why Cisco why???