Overview

This article describes how to deploy Direct Internet Access (DIA) with Active-Standby (Active-Passive) High Availability on Versa FlexVNF appliances. It covers the architecture, packet flow, NAT configuration, failover behavior, session synchronization, and Day-2 operational considerations.

The goal is to achieve graceful failover for DIA traffic — equivalent to a pair of traditional active-passive firewalls — covering:

  • Appliance failure (full device down)
  • Individual circuit/link failure
  • Planned maintenance (graceful power-down of one unit)

Architecture: Active-Passive with VRRP

Key Design Elements

Component | Details
HA Mode | Active-Passive (Active-Standby)
IP Failover | VRRP with shared VIP on WAN-facing /29 subnet
ISP Connectivity | Each appliance has its own ISP link; both share the same /29
Cross-Connect | Dedicated link between Active and Passive for sync + forwarding
LAN Routing | eBGP or static; can re-converge to Passive if Active LAN link fails
NAT Pool | Must use VRRP VIP for sessions that need fail-over

Packet Flow and Session Handling

Core Principle

The Active device always handles all traffic processing, policy decisions, and security checks. The Passive device acts only as a conduit — it forwards packets to Active without processing them. The forwarding/conduit layer on Passive operates before the session layer, so packets never reach the session lookup or policy engine on Passive while Active is alive.

Forward Path (Client to Internet)

  1. Client sends packet; VRRP directs it to the Active appliance.
  2. Active creates a flow record for the session.
  3. Flow record is synced to Passive via the cross-connect.
  4. Active applies NAT (using VRRP VIP), security policies, and forwards the packet to the ISP/internet.

Reverse Path (Internet to Client)

  1. Response arrives at Active (destination is the VRRP VIP).
  2. Active looks up the existing flow record, applies reverse NAT.
  3. Forwards response to the client via the LAN interface.
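
The forward and reverse paths above can be sketched as a small Python model. This is an illustration only, not VOS code: the class names, the VIP address, the starting port, and the port-only reverse lookup are all assumptions made to keep the example short.

```python
from dataclasses import dataclass, field

VIP = "203.0.113.2"  # assumed VRRP VIP used as the NAT pool address


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical 5-tuple identifying a client session."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str = "tcp"


@dataclass
class ActiveNode:
    flows: dict = field(default_factory=dict)    # FlowKey -> NAT source port
    reverse: dict = field(default_factory=dict)  # NAT source port -> FlowKey
    synced_to_passive: list = field(default_factory=list)
    next_port: int = 20000                       # assumed start of the NAT port range

    def forward(self, key: FlowKey) -> tuple:
        """Client to Internet: create the flow, sync it, apply source NAT."""
        if key not in self.flows:
            port = self.next_port
            self.next_port += 1
            self.flows[key] = port
            self.reverse[port] = key
            # Stand-in for the sync message sent over the cross-connect.
            self.synced_to_passive.append((key, port))
        return (VIP, self.flows[key], key.dst_ip, key.dst_port)

    def reverse_path(self, nat_port: int):
        """Internet to client: look up the flow and undo the NAT."""
        key = self.reverse.get(nat_port)
        if key is None:
            return None  # no flow record, so the packet would be dropped
        return (key.dst_ip, key.dst_port, key.src_ip, key.src_port)
```

A second packet on the same flow reuses the existing record, so only the first packet of a session produces a sync entry in this model.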

Asymmetric Forwarding (LAN Link Failure on Active)

If the Active LAN link goes down, but the device itself is still alive:

  1. LAN routing (eBGP) reconverges — client traffic lands on Passive.
  2. Passive forwards it (unprocessed) to Active via the cross-connect.
  3. Active processes it but sees it cannot reach the client directly.
  4. Active routes the response back through Passive to reach the client.
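
As a rough illustration of the steps above, the following Python sketch lists the hops a request and its response take while Active is alive. The function name and hop labels are hypothetical; the only point is that Active remains the processing node in both cases.

```python
def traffic_hops(active_lan_up: bool) -> dict:
    """Return the request/response hop lists while Active is alive."""
    if active_lan_up:
        request = ["client", "active", "internet"]
    else:
        # LAN routing has reconverged to Passive, which relays the packet
        # unprocessed; NAT and policy are still applied on Active.
        request = ["client", "passive", "active", "internet"]
    return {
        "request": request,
        "response": list(reversed(request)),
        "processed_by": "active",
    }
```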

NAT Behavior and VIP Configuration

Recommendation: Configure the NAT pool with the VRRP VIP (not the physical interface IP) for all sessions that need to survive failover. This ensures that when Passive takes over via VRRP + virtual MAC, it inherits the same public IP.

  • NAT pools are configurable — you can have multiple pools with different IPs for different use cases.
  • Sessions NATed to a non-VIP interface IP will not survive failover.
  • All NAT bindings (port mappings, session-to-IP associations) are synced from Active to Passive.
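
A pre-deployment check along these lines could flag NAT pools whose sessions will not survive failover. This is a minimal sketch: the pool names and addresses are made-up examples, and each pool is assumed to hold a single address.

```python
import ipaddress


def pools_surviving_failover(nat_pools: dict, vrrp_vip: str) -> dict:
    """Map each pool name to True if it NATs to the VRRP VIP."""
    vip = ipaddress.ip_address(vrrp_vip)
    return {
        name: ipaddress.ip_address(addr) == vip
        for name, addr in nat_pools.items()
    }


# Hypothetical pools: one bound to the VIP, one to a physical interface IP.
example_pools = {"dia-ha": "203.0.113.2", "local-breakout": "203.0.113.3"}
report = pools_surviving_failover(example_pools, "203.0.113.2")
```

Here `report` marks `dia-ha` as surviving and `local-breakout` as not, which matches the rule that only VIP-backed sessions fail over.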

Failover Scenarios

Full Device Failure

  1. Passive detects Active failure via HA heartbeat on the sync interface.
  2. VRRP failover occurs — Passive takes over the VIP with virtual MAC.
  3. New traffic is attracted to the (now Active) Passive.
  4. Existing sessions continue because Passive has synced flow records + NAT state.
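
The sequence above can be modelled as a small state machine. The sketch below is an assumption-laden illustration: the dead interval, class name, and field names are hypothetical and do not reflect VOS or VRRP defaults.

```python
from dataclasses import dataclass, field


@dataclass
class PassiveNode:
    dead_after_s: float = 3.0   # assumed dead interval, not a VOS default
    last_heartbeat: float = 0.0
    role: str = "passive"
    owns_vip: bool = False
    synced_flows: set = field(default_factory=set)

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received on the sync interface."""
        self.last_heartbeat = now

    def tick(self, now: float) -> str:
        """Promote to active once heartbeats have been missing too long."""
        if self.role == "passive" and now - self.last_heartbeat > self.dead_after_s:
            self.role = "active"
            self.owns_vip = True  # stands in for VRRP VIP + virtual MAC takeover
        return self.role

    def has_session(self, flow) -> bool:
        """After takeover, synced flows can continue without a new handshake."""
        return self.role == "active" and flow in self.synced_flows
```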

Embryonic Sessions (TCP 3-Way Handshake In Progress)

Important: Sessions that have not completed the TCP 3-way handshake at the time of failover may be lost. This is a TCP timing limitation common to all vendors.

Scenario | Outcome | Recovery
SYN sent, SYN-ACK arrives on Passive while Active is alive | Passive forwards SYN-ACK to Active (conduit behavior). Active processes it normally. | No issue; seamless
SYN sent, Active dies before SYN-ACK arrives | No session record on Passive. SYN-ACK dropped (SYN-check). Client retransmits SYN. | Client TCP retransmit creates new session on new Active
SYN + SYN-ACK completed, ACK in flight, Active dies before sync | Client thinks session is established. Server never got ACK. Session ages out. | Application-level retry needed. No vendor can recover this.
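
The post-failover outcomes in the table can be approximated with a simple verdict function for the newly Active node. It assumes SYN-check means "only a bare SYN may create a session"; the function name and return strings are hypothetical.

```python
def new_active_verdict(tcp_flags: set, flow_synced: bool) -> str:
    """Decide what the new Active does with a packet after takeover."""
    if flow_synced:
        return "forward"         # session state arrived before the failure
    if tcp_flags == {"SYN"}:
        return "create-session"  # e.g. the client's retransmitted SYN
    return "drop"                # SYN-ACK or ACK with no session record
```

Under this model, a SYN-ACK for an unsynced flow is dropped, while the client's retransmitted SYN builds a fresh session on the new Active.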

Session Sync Mechanism

What Gets Synced?

  • Flow records (5-tuple, session state)
  • NAT bindings (port mappings, IP associations)
  • App-specific metadata (App ID, URL category, matched rule)
  • Pinhole information (for Passive to take over sessions)
  • Firewall state

How It Works

  • Dedicated sync interface between Active and Passive (both control plane and data plane).
  • One-to-one direct exchange: no third appliance or controller is involved.
  • Bulk transmission: Multiple session records are batched into single messages (not one-at-a-time).
  • Incremental updates: Initial sync sends tuple + NAT bindings; additional metadata (App ID, URL category, matched rule) is sent as it is discovered during packet processing.
  • Active tracks what Passive has consumed — no duplicate data transmitted.

Bandwidth Impact: Sync traffic is metadata only — no payload data crosses the sync link. Combined with bulk batching, the overhead on the cross-connect is minimal.
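
The batching, incremental-update, and consumption-tracking ideas above can be sketched as follows. The sequence-number acknowledgement scheme, class name, and field names are assumptions made for illustration; the actual sync protocol is not described in this article.

```python
class SyncSender:
    """Toy model of Active batching session metadata for Passive."""

    def __init__(self, batch_size: int = 3):
        self.batch_size = batch_size
        self.pending = []   # updates Passive has not yet confirmed
        self.next_seq = 1
        self.acked_seq = 0  # highest sequence number Passive has consumed

    def record(self, flow_id: str, **fields) -> None:
        """Queue only the fields that are new for this flow (metadata, no payload)."""
        self.pending.append({"seq": self.next_seq, "flow": flow_id, **fields})
        self.next_seq += 1

    def next_batch(self) -> list:
        """Bundle several updates into one message."""
        return self.pending[: self.batch_size]

    def ack(self, seq: int) -> None:
        """Passive confirms everything up to seq; drop it from the queue."""
        self.acked_seq = max(self.acked_seq, seq)
        self.pending = [u for u in self.pending if u["seq"] > self.acked_seq]
```

A flow's first update would carry the tuple and NAT binding; a later update for the same flow carries only newly discovered metadata such as the App ID.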


Cross-Connect and Performance

Fix (VOS 22.1.4+): Earlier releases had a worker-thread distribution issue for traffic crossing the cross-connect (the double-NAT CPU issue). Inner flow information is now carried, so sessions are distributed across different worker threads on the receiving device, which eliminates the CPU bottleneck. A single cross-connect interface is sufficient; a port-channel is not required.
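
To see why carrying inner flow information helps, consider a toy hash-based worker selection. This sketch assumes the earlier behavior amounted to hashing on a single outer tuple for all cross-connect traffic; the hash function, addresses, and worker count are illustrative and not taken from VOS.

```python
import zlib


def worker_for(hash_input: tuple, workers: int) -> int:
    """Pick a worker thread index from a deterministic hash of the tuple."""
    return zlib.crc32(repr(hash_input).encode("utf-8")) % workers


# Assumed outer header shared by everything relayed over the cross-connect.
OUTER = ("192.0.2.1", "192.0.2.2")
# 32 hypothetical client flows carried inside that outer header.
INNER_FLOWS = [
    (f"10.0.0.{i}", 40000 + i, "198.51.100.9", 443, "tcp") for i in range(32)
]


def workers_used(use_inner: bool, workers: int = 8) -> set:
    """Return the set of worker indexes that receive traffic."""
    return {
        worker_for(flow if use_inner else OUTER, workers)
        for flow in INNER_FLOWS
    }
```

Hashing on the outer tuple sends every flow to the same worker, whereas hashing on the inner tuple spreads the flows across several workers.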


Day-2 Operations and Config Ordering (VOS 22.1.4)

Current behavior: In VOS 22.1.4, Active and Passive are managed as separate individual devices, not a single logical unit. Manual config ordering is required for certain operations.

Config Push Ordering Rules

Operation | Order | Reason
Adding a rule/policy | Push to Passive first, then Active | Passive needs to know about incoming states before Active starts sending them
Deleting a rule/policy | Push to Active first, then Passive | Active stops sending related state before Passive removes the config

Operational Notes

  • Before any template push, determine which device is currently Active vs. Passive.
  • The Director has monitoring to check device state, but there is no automated ordering logic in 22.1.4.
  • For automated deployment pipelines, scripts must query HA state before pushing.
  • NAT configuration changes require the most caution; firewall/routing changes are less sensitive.
  • Non-data-path changes (passwords, BGP peer additions) can be pushed to either device in any order.

Safe Default: When in doubt, always push additions to Passive first, then Active. For deletes, do the reverse.
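
For automated pipelines, the ordering rule can be captured in a small helper. This is a sketch only: the `ha_state` dictionary is assumed to come from whatever HA-state query the pipeline already performs (the query itself is not shown), and the device names are hypothetical.

```python
def push_order(operation: str, ha_state: dict) -> list:
    """Return device names in the order a config change should be pushed.

    ha_state maps device name -> "active" or "passive".
    """
    active = [name for name, role in ha_state.items() if role == "active"]
    passive = [name for name, role in ha_state.items() if role == "passive"]
    if len(active) != 1 or len(passive) != 1:
        # Refuse to guess if the HA state is ambiguous.
        raise ValueError("expected exactly one active and one passive device")
    if operation == "add":
        return passive + active  # Passive must understand the state first
    if operation == "delete":
        return active + passive  # Active must stop sending the state first
    raise ValueError(f"unsupported operation: {operation!r}")
```

For example, `push_order("add", {"fw-a": "active", "fw-b": "passive"})` yields `["fw-b", "fw-a"]`, and the same state with `"delete"` yields the reverse order.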


HA Workflow and ConfigSync (Coming in VOS 23.1.2)

A. HA Workflow (Director)

  • Active + Passive managed as a single logical HA unit (not two separate devices).
  • Single configuration push covers both nodes automatically.
  • Director handles ordering internally.

B. ConfigSync (On-Box Validation)

  • Both Active and Passive nodes exchange configuration state with each other.
  • Before activating a config change on the data path, nodes validate that both sides are in agreement.
  • If a mismatch is detected (typo, missing config on one node), the system blocks activation and notifies the user.
  • Eliminates the risk of config drift between HA peers.
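
One plausible way to picture the agreement check is a digest comparison between the two nodes. The article does not describe how ConfigSync is implemented, so everything below (the digest approach, function names, and config keys) is a hypothetical illustration of the idea rather than the actual mechanism.

```python
import hashlib
import json

_MISSING = object()  # marks a key that is absent on one node


def config_digest(config: dict) -> str:
    """Hash a canonical JSON rendering so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def can_activate(active_cfg: dict, passive_cfg: dict) -> tuple:
    """Return (ok, mismatched_keys); activation is blocked on any mismatch."""
    if config_digest(active_cfg) == config_digest(passive_cfg):
        return True, []
    keys = sorted(set(active_cfg) | set(passive_cfg))
    mismatched = [
        k for k in keys
        if active_cfg.get(k, _MISSING) != passive_cfg.get(k, _MISSING)
    ]
    return False, mismatched
```

The mismatched key list is what a user notification could report, for example a NAT pool name that was mistyped on one node.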

Timeline: VOS 23.1.2 expected in the 2nd half of 2026.


FAQ

Q: What is the source IP in a Wireshark capture between the FlexVNF and ISP?
A: It will be the VRRP VIP, not the physical interface IP, when the NAT pool is configured with the VIP (recommended for failover).

Q: How common is Active-Passive deployment?
A: Approximately 90% of deployments use Active-Active. However, several large enterprise customers have been running Active-Passive successfully for 4-6+ years.

Q: How long has Active-Passive been available in VOS?
A: The feature has been available for many years (6+ years in the field).

Q: What happened to the packet replicator concept?
A: That was a theoretical discussion during Active-Active architecture design. It was never implemented. In Active-Passive, the Passive simply forwards packets to Active — no replication needed.

Q: Do I need a port-channel for the cross-connect?
A: No. The fix in VOS 22.1.4 addresses the worker-thread distribution issue. A single cross-connect interface is sufficient.


Key Takeaways

  1. NAT Pool = VRRP VIP: configure the NAT pool with the virtual IP, not physical interface IPs.
  2. Cross-connect fix landed in 22.1.4: the double-NAT CPU issue is resolved; a single cross-connect is fine.
  3. Config push ordering matters today: adds go to Passive first, deletes go to Active first.
  4. VOS 23.1.2 eliminates manual ordering: plan an upgrade path to 23.1.2 for HA Workflow + ConfigSync.
  5. Embryonic session loss on failover is expected: this is a TCP timing limitation, not a bug.
  6. Passive never processes packets while Active is alive: the conduit layer forwards before the session layer, avoiding SYN-check drops.