Overview
This article describes how to deploy Direct Internet Access (DIA) with Active-Standby (Active-Passive) High Availability on Versa FlexVNF appliances. It covers the architecture, packet flow, NAT configuration, failover behavior, session synchronization, and Day-2 operational considerations.
The goal is to achieve graceful failover for DIA traffic — equivalent to a pair of traditional active-passive firewalls — covering:
- Appliance failure (full device down)
- Individual circuit/link failure
- Planned maintenance (graceful power-down of one unit)
Architecture: Active-Passive with VRRP
Key Design Elements
| Component | Details |
|---|---|
| HA Mode | Active-Passive (Active-Standby) |
| IP Failover | VRRP with shared VIP on WAN-facing /29 subnet |
| ISP Connectivity | Each appliance has its own ISP link; both share the same /29 |
| Cross-Connect | Dedicated link between Active and Passive for sync + forwarding |
| LAN Routing | eBGP or static; can re-converge to Passive if Active LAN link fails |
| NAT Pool | Must use the VRRP VIP for sessions that need to survive failover |
Packet Flow and Session Handling
Core Principle
The Active device always handles all traffic processing, policy decisions, and security checks. The Passive device acts only as a conduit — it forwards packets to Active without processing them. The forwarding/conduit layer on Passive operates before the session layer, so packets never reach the session lookup or policy engine on Passive while Active is alive.
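The ordering described above (conduit check first, session layer second) can be modeled in a few lines. This is an illustrative Python sketch, not VOS code: `Node`, `handle_packet`, the role strings, and the returned action names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one HA member; not a VOS data structure."""
    name: str
    role: str                      # "active" or "passive"
    peer_alive: bool = True
    sessions: set = field(default_factory=set)


def handle_packet(node: Node, flow: tuple) -> str:
    """Return the action the node takes for a packet belonging to `flow`."""
    # Conduit check runs first: a Passive node with a live Active peer
    # forwards the packet untouched and never reaches session lookup.
    if node.role == "passive" and node.peer_alive:
        return "forward-to-active"
    # Session layer: reached only when the conduit check did not apply.
    if flow not in node.sessions:
        node.sessions.add(flow)
        return "create-session"
    return "match-session"
```

Because the conduit branch returns before any session lookup, the Passive node in this model never creates or consults session state while its peer is alive.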
Forward Path (Client to Internet)
- Client sends packet; VRRP directs it to the Active appliance.
- Active creates a flow record for the session.
- Flow record is synced to Passive via the cross-connect.
- Active applies NAT (using VRRP VIP), security policies, and forwards the packet to the ISP/internet.
Reverse Path (Internet to Client)
- Response arrives at Active (destination is the VRRP VIP).
- Active looks up the existing flow record, applies reverse NAT.
- Forwards response to the client via the LAN interface.
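The forward and reverse paths can be sketched as a small model of the Active node. Everything here is an assumption made for illustration: the VIP value (taken from a documentation address range), the port allocation scheme, and the names `FlowRecord`, `ActiveNode`, and `sync_queue` do not come from VOS.

```python
from dataclasses import dataclass

VIP = "203.0.113.2"  # assumed VRRP VIP from a documentation address range


@dataclass(frozen=True)
class FlowRecord:
    client: tuple      # (client_ip, client_port)
    nat: tuple         # (nat_ip, nat_port)


class ActiveNode:
    def __init__(self, first_port: int = 20000):
        self.by_client = {}          # client tuple -> FlowRecord
        self.by_nat = {}             # NAT tuple -> FlowRecord
        self.sync_queue = []         # records to send over the cross-connect
        self.next_port = first_port

    def forward(self, client_ip: str, client_port: int) -> tuple:
        """Client to internet: create (or reuse) a flow and return the NAT source."""
        key = (client_ip, client_port)
        rec = self.by_client.get(key)
        if rec is None:
            rec = FlowRecord(key, (VIP, self.next_port))
            self.next_port += 1
            self.by_client[key] = rec
            self.by_nat[rec.nat] = rec
            self.sync_queue.append(rec)   # new flows are queued for Passive
        return rec.nat

    def reverse(self, dst_ip: str, dst_port: int):
        """Internet to client: look up the flow and undo the NAT, or return None."""
        rec = self.by_nat.get((dst_ip, dst_port))
        return rec.client if rec else None
```

For example, `ActiveNode().forward("10.0.0.5", 40000)` yields a source on the VIP, and passing that tuple to `reverse` on the same node returns the original client address and port.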
Asymmetric Forwarding (LAN Link Failure on Active)
If the Active LAN link goes down, but the device itself is still alive:
- LAN routing (eBGP) reconverges — client traffic lands on Passive.
- Passive forwards it (unprocessed) to Active via the cross-connect.
- Active processes the traffic but cannot reach the client over its own LAN interface.
- Active routes the response back through Passive to reach the client.
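The hop sequence in the normal and LAN-failure cases can be written out as a small lookup. This sketch assumes that Active still sends internet-bound traffic out of its own ISP link when only its LAN link is down; the function and hop names are hypothetical.

```python
def packet_path(active_lan_up: bool) -> dict:
    """Return the hop sequence for each direction (illustrative only)."""
    if active_lan_up:
        return {
            "forward": ["client", "active", "isp"],
            "reverse": ["isp", "active", "client"],
        }
    # Active LAN link down: Passive is a conduit in both directions,
    # while Active still does all session, policy, and NAT processing.
    return {
        "forward": ["client", "passive", "active", "isp"],
        "reverse": ["isp", "active", "passive", "client"],
    }
```

In both cases Active appears in every path, which is why sessions and NAT state do not move during a LAN-only failure.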
NAT Behavior and VIP Configuration
Recommendation: Configure the NAT pool with the VRRP VIP (not the physical interface IP) for all sessions that need to survive failover. This ensures that when Passive takes over via VRRP + virtual MAC, it inherits the same public IP.
- NAT pools are configurable — you can have multiple pools with different IPs for different use cases.
- Sessions NATed to a non-VIP interface IP will not survive failover.
- All NAT bindings (port mappings, session-to-IP associations) are synced from Active to Passive.
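A pre-deployment check for the recommendation above might look like the following sketch. The pool names and addresses are made-up examples, and `pools_not_on_vip` is a hypothetical helper, not a Versa tool.

```python
import ipaddress


def pools_not_on_vip(pools: dict, vrrp_vip: str) -> list:
    """Return the names of NAT pools whose address is not the VRRP VIP.

    Sessions NATed through these pools would not survive failover.
    """
    vip = ipaddress.ip_address(vrrp_vip)
    return sorted(
        name for name, ip in pools.items()
        if ipaddress.ip_address(ip) != vip
    )
```

Calling `pools_not_on_vip({"dia": "203.0.113.2", "legacy": "203.0.113.3"}, "203.0.113.2")` flags only `legacy`, the pool bound to a non-VIP address.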
Failover Scenarios
Full Device Failure
- Passive detects Active failure via HA heartbeat on the sync interface.
- VRRP failover occurs — Passive takes over the VIP with virtual MAC.
- New traffic is attracted to the former Passive, which is now Active.
- Existing sessions continue because Passive has synced flow records + NAT state.
Embryonic Sessions (TCP 3-Way Handshake In Progress)
Important: Sessions that have not completed the TCP 3-way handshake at the time of failover may be lost. This is a TCP timing limitation common to all vendors.
| Scenario | Outcome | Recovery |
|---|---|---|
| SYN sent, SYN-ACK arrives on Passive while Active is alive | Passive forwards SYN-ACK to Active (conduit behavior). Active processes it normally. | No issue — seamless |
| SYN sent, Active dies before SYN-ACK arrives | No session record on Passive. SYN-ACK dropped (SYN-check). Client retransmits SYN. | Client TCP retransmit creates new session on new Active |
| SYN + SYN-ACK completed, ACK in flight, Active dies before sync | Client thinks session is established. Server never got ACK. Session ages out. | Application-level retry needed. No vendor can recover this. |
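The three rows of the table can be restated as a small decision function, which may be useful when writing failover test cases. The stage labels and return strings are hypothetical names chosen for this sketch.

```python
def embryonic_outcome(stage: str, active_alive: bool) -> str:
    """Map an in-progress TCP handshake to the outcome listed in the table.

    stage: "syn-sent" (SYN-ACK not yet seen) or
           "ack-in-flight" (final ACK sent, session not yet synced).
    """
    if active_alive:
        # Passive relays the packet to Active, which processes it normally.
        return "seamless"
    if stage == "syn-sent":
        # No record on the new Active: SYN-ACK is dropped, client retries.
        return "client-syn-retransmit"
    if stage == "ack-in-flight":
        # Client believes the session is up; the server never saw the ACK.
        return "application-retry"
    raise ValueError(f"unknown handshake stage: {stage!r}")
```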
Session Sync Mechanism
What Gets Synced?
- Flow records (5-tuple, session state)
- NAT bindings (port mappings, IP associations)
- App-specific metadata (App ID, URL category, matched rule)
- Pinhole information (for Passive to take over sessions)
- Firewall state
How It Works
- Dedicated sync interface between Active and Passive (both control plane and data plane).
- One-to-one direct exchange; no third appliance or controller is involved.
- Bulk transmission: Multiple session records are batched into single messages (not one-at-a-time).
- Incremental updates: Initial sync sends tuple + NAT bindings; additional metadata (App ID, URL category, matched rule) is sent as it is discovered during packet processing.
- Active tracks what Passive has consumed — no duplicate data transmitted.
Bandwidth Impact: Sync traffic is metadata only — no payload data crosses the sync link. Combined with bulk batching, the overhead on the cross-connect is minimal.
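Batching plus incremental updates can be sketched as follows. This is not the VOS sync protocol: the class name, field names, and batch size are assumptions, and the sketch treats a record as consumed by Passive as soon as it is sent, which assumes reliable delivery over the sync link.

```python
class SyncSender:
    """Toy model of Active-side sync: batch records, send only new fields."""

    def __init__(self, batch_size: int = 3):
        self.batch_size = batch_size
        self.pending = {}   # flow_id -> fields not yet sent
        self.consumed = {}  # flow_id -> fields Passive already has

    def update(self, flow_id: str, **fields) -> None:
        """Queue only the fields that Passive does not already hold."""
        known = self.consumed.get(flow_id, {})
        delta = {k: v for k, v in fields.items() if known.get(k) != v}
        if delta:
            self.pending.setdefault(flow_id, {}).update(delta)

    def next_batch(self) -> dict:
        """Pop up to batch_size flows into one message and mark them consumed."""
        ids = list(self.pending)[: self.batch_size]
        batch = {i: self.pending.pop(i) for i in ids}
        for flow_id, fields in batch.items():
            self.consumed.setdefault(flow_id, {}).update(fields)
        return batch
```

When a flow that was already synced later learns its App ID, a second `update` call queues only the `app_id` field; the tuple and NAT binding are not sent again.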
Cross-Connect and Performance
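Issue (before VOS 22.1.4): Traffic forwarded over the cross-connect was not distributed across worker threads on the receiving device, which created a CPU bottleneck.
Fix (VOS 22.1.4+): Inner flow information is now carried so that sessions are distributed across different worker threads on the receiving device, eliminating the CPU bottleneck. A single cross-connect interface is sufficient; a port-channel is not required.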
Day-2 Operations and Config Ordering (VOS 22.1.4)
Current behavior: In VOS 22.1.4, Active and Passive are managed as separate individual devices, not a single logical unit. Manual config ordering is required for certain operations.
Config Push Ordering Rules
| Operation | Order | Reason |
|---|---|---|
| Adding a rule/policy | Push to Passive first, then Active | Passive needs to know about incoming states before Active starts sending them |
| Deleting a rule/policy | Push to Active first, then Passive | Active stops sending related state before Passive removes the config |
Operational Notes
- Before any template push, determine which device is currently Active vs. Passive.
- The Director has monitoring to check device state, but there is no automated ordering logic in 22.1.4.
- For automated deployment pipelines, scripts must query HA state before pushing.
- NAT configuration changes require the most caution; firewall/routing changes are less sensitive.
- Non-data-path changes (passwords, BGP peer additions) can be pushed to either device in any order.
Safe Default: When in doubt, always push additions to Passive first, then Active. For deletes, do the reverse.
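For automated pipelines, the ordering rules can be wrapped in a small helper. The sketch below deliberately does not call any Director API: `get_ha_role` and `push` are caller-supplied callables (hypothetical names) that you would implement against whatever monitoring and template-push mechanism your environment exposes.

```python
from typing import Callable, Iterable


def push_order(operation: str, active: str, passive: str) -> list:
    """Additions go to Passive first; deletions go to Active first."""
    if operation == "add":
        return [passive, active]
    if operation == "delete":
        return [active, passive]
    raise ValueError(f"unsupported operation: {operation!r}")


def ordered_push(
    operation: str,
    devices: Iterable[str],
    get_ha_role: Callable[[str], str],   # returns "active" or "passive"
    push: Callable[[str], None],         # pushes the change to one device
) -> list:
    """Query HA state first, then push to both devices in the safe order."""
    roles = {name: get_ha_role(name) for name in devices}
    active = [n for n, r in roles.items() if r == "active"]
    passive = [n for n, r in roles.items() if r == "passive"]
    if len(active) != 1 or len(passive) != 1:
        # Refuse to push when the pair is not in a clean Active/Passive state.
        raise RuntimeError(f"ambiguous HA state: {roles}")
    order = push_order(operation, active[0], passive[0])
    for name in order:
        push(name)
    return order
```

Refusing to push when the roles are ambiguous is a design choice of this sketch; it avoids guessing the order in the middle of a failover.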
ConfigSync (Coming in VOS 23.1.2)
A. HA Workflow (Director)
- Active + Passive managed as a single logical HA unit (not two separate devices).
- Single configuration push covers both nodes automatically.
- Director handles ordering internally.
B. ConfigSync (On-Box Validation)
- Both Active and Passive nodes exchange configuration state with each other.
- Before activating a config change on the data path, nodes validate that both sides are in agreement.
- If a mismatch is detected (typo, missing config on one node), the system blocks activation and notifies the user.
- Eliminates the risk of config drift between HA peers.
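The validation idea (block activation when the two nodes disagree) can be illustrated with a simple comparison. The actual ConfigSync mechanism is not described in this article, so the following is only a conceptual sketch; the flat key/value representation of a configuration and both function names are assumptions.

```python
_MISSING = object()  # sentinel so a missing key differs from any real value


def config_mismatches(active_cfg: dict, passive_cfg: dict) -> list:
    """Return the sorted keys whose values differ or exist on one node only."""
    keys = active_cfg.keys() | passive_cfg.keys()
    return sorted(
        k for k in keys
        if active_cfg.get(k, _MISSING) != passive_cfg.get(k, _MISSING)
    )


def can_activate(active_cfg: dict, passive_cfg: dict) -> bool:
    """Allow activation only when both nodes hold the same configuration."""
    return not config_mismatches(active_cfg, passive_cfg)
```

A typo in one node's NAT pool address or a rule present on only one node would both show up in `config_mismatches`, and `can_activate` would return `False`.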
Timeline: VOS 23.1.2 is expected in the second half of 2026.
FAQ
Q: What is the source IP in a Wireshark capture between the FlexVNF and ISP?
A: It will be the VRRP VIP, not the physical interface IP, when the NAT pool is configured with the VIP (recommended for failover).
Q: How common is Active-Passive deployment?
A: Approximately 90% of deployments use Active-Active. However, several large enterprise customers have been running Active-Passive successfully for 4-6+ years.
Q: How long has Active-Passive been available in VOS?
A: The feature has been available for many years (6+ years in the field).
Q: What happened to the packet replicator concept?
A: That was a theoretical discussion during Active-Active architecture design. It was never implemented. In Active-Passive, the Passive simply forwards packets to Active — no replication needed.
Q: Do I need a port-channel for the cross-connect?
A: No. The fix in VOS 22.1.4 addresses the worker-thread distribution issue. A single cross-connect interface is sufficient.
Key Takeaways
- NAT Pool = VRRP VIP: configure the NAT pool with the virtual IP, not physical interface IPs.
- Cross-connect fix landed in 22.1.4: the double-NAT CPU issue is resolved; a single cross-connect is fine.
- Config push ordering matters today: adds go to Passive first, deletes go to Active first.
- 23.1.2 eliminates manual ordering: plan the upgrade path to VOS 23.1.2 for HA Workflow + ConfigSync.
- Embryonic session loss on failover is expected: this is a TCP timing limitation, not a bug.
- Passive never processes packets while Active is alive: the conduit layer forwards before the session layer, avoiding SYN-check drops.