Veriflow - The Formal Future #NFD16

Formal verification. Two words unfamiliar to me before Networking Field Day. To provide a brief summary, formal verification uses mathematical proofs to verify a system is working as designed – the same process used to create hacker-proof code. Veriflow has taken this method and applied it to networking to verify the intended design or operation of the network.

But what does this actually mean to us as network operators? Instead of monitoring isolated metrics like an interfaces’ utilization, we can monitor the intent of the entire network as a whole, including security policies. As a practical example, you can write policy that says:

The user network should never be able to access tcp port 3306 in the datacenter network.

and

Datacenter network links should always be redundant.

Veriflow will monitor the operational state of your network – mac address tables, route tables, access control lists, firewall rules, and alert you if your policy is violated. As more and more network devices are added to your network, this becomes an increasingly powerful tool.

veriflow-slide


Let’s get into the nitty gritty, how does this work? Veriflow’s solution is made up of two components – the “collector” and the “verification engine”. The collector is a VM that runs in your environment, using SSH credentials to log into your network devices and gather operational state using various 'show' commands.

Once the collector has aggregated the data, it forwards all network info to the verification engine, which can be another local VM or SaaS environment run by Veriflow. The nice thing about the SaaS option is that the collector VM never shares your SSH credentials with the verification system. Once the data is in the verification system, it is modeled to predict network-wide behavior and evaluated against any policy rules you have created. Each time the collector retrieves data (user configurable intervals), this is considered a “network snapshot”, and is kept as long as you have the space to store it. A typical network snapshot is 1-2 MB per device. By keeping these snapshots, you have a documented history of your network, and you can go back to any point in time to see what has changed. Besides security policies, this can help you verify network operation after an operating system upgrade, since Veriflow will inform you of any state change, for example if a routing table is incorrect because of a misapplied route-map.

For Networking Field Day, Veriflow also announced the 2.0 version of its software with two features I wanted to callout:

  • First, CloudPredict enables the collector to use API credentials to connect to your AWS environment and gather routing table info, security group rules, and S3 bucket policies. Since “the cloud” can sometimes be a wild west run by developers, CloudPredict can allow you to monitor network and security changes even when others have write access to your accounts – like when you need to make sure your dev VPC is completely isolated from your PCI VPC.

  • Second, Preflight will model your network changes to see potential impact before changes go live. At launch, Preflight only models ACL changes, which is a good start towards more intelligent network change control.

With these new features, and others I haven’t mentioned yet (like automatic network diagram generation), Veriflow was one of my favorite presentations at NFD16. I think Veriflow is showing us where the future of network monitoring and configuration verification is headed. Monitoring the network intent is much more elegant that writing individual alert rules for interface utilization, cpu and memory thresholds, and link up/down events. Testing network policy before applying to your live network – that’s a dream most of us can’t afford with completely redundant network hardware.

Veriflow is doing a lot right, and while listening to their presentation I couldn't help but think about how their platfrom could be expanded into additional areas. I'll be watching to see if any of the following will be rolled into future releases:

  1. Expand Preflight checks to include more components, with API access so we can build Preflight config verification into network automation workflows.
  2. Network utilization analysis. If a link is becoming saturated, use historical data of other existing links to know where traffic should be moved to. If a network device fails, will traffic flow overload the remaining links?
  3. Gather netflow data from the network to compare operational state to actual network flows - also use with Preflight checks to show what traffic flows would be affected.

Formal verification and monitoring the "intent" of the network are exciting advances, and I believe will have a real impact for the network operator.


Veriflow 2.0 Press Release

Disclosure: I attended this presentation as part of Networking Field Day, a Tech Field Day event. All expenses related to the event were sponsored, in addition to any free swag given out by vendors. All opinions expressed on The Network Stack are my own and not those of Tech Field Day, vendors or my employer.