Problem statement:
"sdwan-datapath-down" alarm is not generated consistently.
Reason for this behavior:
When the WAN path between the local site and a remote site or controller(s) goes down somewhere in the underlay (that is, the VOS WAN interface itself is not physically down), the 'sdwan-datapath-down' alarm is generated consistently. Likewise, when the path comes back up, the 'sdwan-datapath-down' alarm is cleared consistently.
When a WAN interface physically goes down (admin/oper down) on a Versa device, VOS deletes the interface from the SD-WAN context and propagates this information to the Controller. The Controller receives this delete notification for the WAN interface and does not generate the 'sdwan-datapath-down' alarm.
If the local site has only one path (one WAN interface) and that WAN interface physically goes down, connectivity to the Controllers is already lost, so BGP may not have enough time to propagate the delete to the Controller; in that case the Controller will generate 'sdwan-datapath-down'. If the local site has more than one WAN interface and one or more of them go down physically while at least one stays up, the WAN delete information is propagated to the Controller via BGP over the WAN interface that is still up. The Controller then removes the remote site's WAN information from its database and does not generate the 'sdwan-datapath-down' alarm.
To make sure we start from a clean state, we always generate a clear alarm for 'sdwan-datapath-down' when the interface/path comes up, irrespective of whether 'sdwan-datapath-down' was generated or not.
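The cases described above can be summarized as a small decision function. This is only an illustrative sketch of the behavior described in this article, not Versa code: the names Failure and datapath_down_expected are hypothetical, and the single-interface case is modeled as always raising the alarm even though it depends on timing.

```python
from enum import Enum


class Failure(Enum):
    """Hypothetical labels for the two failure types discussed above."""
    UNDERLAY = "underlay"          # path broken in the underlay, interface still up
    INTERFACE_DOWN = "interface"   # local WAN interface admin/oper down


def datapath_down_expected(failure: Failure, wan_interfaces_still_up: int) -> bool:
    """Return True if the Controller is expected to raise 'sdwan-datapath-down'."""
    if failure is Failure.UNDERLAY:
        # Underlay failures are always reported.
        return True
    # Interface physically down: the delete reaches the Controller via BGP only
    # if at least one other WAN interface is still up. In that case the
    # Controller removes the WAN information and raises no alarm.
    return wan_interfaces_still_up == 0


print(datapath_down_expected(Failure.UNDERLAY, 1))        # True
print(datapath_down_expected(Failure.INTERFACE_DOWN, 1))  # False
print(datapath_down_expected(Failure.INTERFACE_DOWN, 0))  # True
```

Because the clear alarm is sent every time the interface/path comes up, a clear may arrive for a path on which no 'sdwan-datapath-down' was ever raised.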
Recommendation from Versa:
If you have a monitoring system outside of Versa that monitors path status, please treat 'interface-down' as a superset of the 'sdwan-datapath-down' alarm: when 'interface-down' is received, consider the path down because the interface itself is physically down.
If you receive 'sdwan-branch-disconnect' from the Controller for a remote site, the branch device has completely lost connectivity to the Controller and is unable to send any alarms, including 'interface-down'; consider the site completely isolated from the Controller. The 'sdwan-branch-disconnect' and 'sdwan-branch-connect' alarms are generated consistently.
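An external monitoring system could apply these recommendations roughly as follows. This is a minimal sketch under stated assumptions: the alarm names come from this article, but the SiteMonitor class, the handle(alarm, cleared) event shape, and the status strings are hypothetical, and a single path per site is tracked for simplicity.

```python
# Either alarm means the path is down; 'interface-down' is treated as a superset.
PATH_DOWN_ALARMS = {"interface-down", "sdwan-datapath-down"}


class SiteMonitor:
    """Hypothetical per-site alarm tracker for an external monitoring system."""

    def __init__(self) -> None:
        self.active = set()      # path-down alarms currently raised
        self.isolated = False    # True between branch-disconnect and branch-connect

    def handle(self, alarm: str, cleared: bool = False) -> None:
        if alarm == "sdwan-branch-disconnect":
            self.isolated = True
        elif alarm == "sdwan-branch-connect":
            self.isolated = False
        elif alarm in PATH_DOWN_ALARMS:
            if cleared:
                # discard() tolerates a clear that arrives without a prior raise.
                self.active.discard(alarm)
            else:
                self.active.add(alarm)

    def status(self) -> str:
        if self.isolated:
            return "site-isolated"
        if self.active:
            return "path-down"
        return "ok"


monitor = SiteMonitor()
monitor.handle("sdwan-datapath-down", cleared=True)  # clear with no prior raise
monitor.handle("interface-down")
print(monitor.status())  # path-down
```

Using a set with discard() keeps clear handling idempotent, which matters because the clear for 'sdwan-datapath-down' is sent whether or not the alarm was raised.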