Troubleshooting High Memory Issues On VOS Device : Versa Support

This article explains how to identify and troubleshoot high memory usage on the Versa Operating System (VOS).

Troubleshooting High Memory Usage

Investigating high memory utilization requires an in-depth analysis of multiple components within the Versa Operating System (VOS).

High memory usage typically falls into one of the following categories:

1. Intermittent memory spikes occurring at random intervals, potentially causing performance degradation.

2. Persistent memory growth, where memory utilization gradually increases over time since the last service restart.

Case 1: Investigating Intermittent Memory Spikes

Follow the steps below if the device experiences sudden memory spikes at random intervals:

1. Identify the timestamps of the last 2–4 memory spike events and collect the /var/log/versa/debug/mem-high/tshoot.log file from the VOS.

2. Collect device resource utilization details from both the CLI and Versa Analytics, including the total number of active sessions, interface bandwidth utilization, memory usage, CPU utilization, total NAT sessions, and any other relevant system statistics.

3. Verify whether any additional activities coincide with the reported memory spikes. These may include backend scripts, scheduled tasks, or a sudden increase in traffic resulting in a higher number of active sessions and increased memory consumption. Also, review the NAT session counters to determine whether a significant increase in NAT sessions is contributing to the issue.

4. Monitor poller and worker thread utilization to ensure the device is not oversubscribed. Verify that the platform hardware specifications are sufficient to support the observed traffic load and active session count.

5. Compare the top -H output captured in the tshoot.log across 2–4 memory spike events to identify the process responsible for the increased memory utilization during the reported time frame.

Note: For VOS releases 21.x, refer to /var/log/versa/debug/mem-high/tshoot.log. For older VOS releases, use /var/log/versa/tshoot.log.
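The comparison in step 5 can be sketched as follows. This is a minimal sketch, assuming the top -H snapshots have already been copied out of tshoot.log as plain text. The function names, the 50,000 KiB threshold, and the spike snapshot values are hypothetical; the baseline rows come from the top -H sample shown later in this article.

```python
def res_by_command(top_text):
    """Map COMMAND -> largest RES value (KiB) found in `top -H` rows."""
    res = {}
    for line in top_text.splitlines():
        fields = line.split()
        # Data rows have at least 12 columns and start with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        if not fields[5].isdigit():
            continue  # RES printed with a unit suffix (e.g. 1.2g) is skipped
        command = " ".join(fields[11:])
        res[command] = max(res.get(command, 0), int(fields[5]))
    return res


def res_growth(baseline, spike, min_growth_kib=50_000):
    """Return commands whose RES grew by at least min_growth_kib, largest first."""
    grown = {}
    for command, spike_res in spike.items():
        delta = spike_res - baseline.get(command, 0)
        if delta >= min_growth_kib:
            grown[command] = delta
    return dict(sorted(grown.items(), key=lambda item: item[1], reverse=True))


# Baseline rows taken from the top -H sample in this article.
BASELINE = """\
21781 root      20   0 3942172 900700 163920 S  1.7 15.1   1060:47 versa-vsmd
20348 root      20   0  877988 184456   6396 S  0.0  3.1 196:32.03 confd
"""
# Hypothetical snapshot captured during a spike (values invented for illustration).
SPIKE = """\
21781 root      20   0 4442172 1400700 163920 S  1.7 23.5   1061:02 versa-vsmd
20348 root      20   0  877988 185000   6396 S  0.0  3.1 196:32.10 confd
"""

growth = res_growth(res_by_command(BASELINE), res_by_command(SPIKE))
print(growth)
```

Repeating the comparison for each of the 2–4 spike events shows whether the same process grows every time.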

Case 2: Persistent Memory Growth

Follow the detailed steps below if you suspect that memory is being held and has been growing gradually for some time.

Step 1: Analyze Basic System Details

Check the system details by running the “vsh details” and “free -h” commands from the shell. These commands show the hardware type, VOS software details, memory consumption (free and used), and the SPack and OSS Pack details of the system.

The free -h output shows how much RAM is used and how much is available. It also shows the used and available swap space, and the buffers used by the Linux kernel. Swap space is used when the system does not have enough free RAM. Swap is additional memory space that resides on disk and is not part of the physical RAM.

Load stats indicate only the service load on the device, that is, the CPU and memory used by vsmd.

Step 2: Review System Alarms

Run the show alarms command to check for any high memory threshold breach alarms. Reviewing the alarm history helps determine whether the high memory utilization is caused by intermittent memory spikes at specific intervals or by memory that continues to grow over time without being released.

Step 3: Verify Whether the Memory Utilization Is Expected

Determine whether the observed memory utilization is within the expected operating range for the platform. Verify the hardware model, total installed memory, traffic volume, and active session count to ensure the device is not oversubscribed.

Use the following command to check the CPU utilization, memory utilization, and active session count:

admin@cpe2-cli> show device clients

CLIENT VSN CPU MEM MAX ACTIVE FAILED

ID ID LOAD LOAD SESSIONS SESSIONS SESSIONS

-----------------------------------------------------

15 0 38 79 500000 40 0

The show device clients command provides the maximum supported sessions, the current number of active sessions, and the number of failed sessions for the device.
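As an illustration, the output above can be turned into a session utilization figure. This is a minimal sketch that assumes the seven-column layout shown in the sample; the column key names and helper functions are hypothetical and are not part of the VOS CLI.

```python
# Hypothetical key names for the seven columns in the sample output.
COLUMNS = ("client_id", "vsn_id", "cpu_load", "mem_load",
           "max_sessions", "active_sessions", "failed_sessions")


def parse_device_clients(cli_text):
    """Return one dict per data row of `show device clients` output."""
    rows = []
    for line in cli_text.splitlines():
        fields = line.split()
        # Data rows consist of exactly seven integers; headers are skipped.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def session_utilization_pct(row):
    """Active sessions as a percentage of the maximum supported sessions."""
    return 100.0 * row["active_sessions"] / row["max_sessions"]


SAMPLE = """\
CLIENT VSN CPU MEM MAX ACTIVE FAILED
ID ID LOAD LOAD SESSIONS SESSIONS SESSIONS
-----------------------------------------------------
15 0 38 79 500000 40 0
"""

rows = parse_device_clients(SAMPLE)
for row in rows:
    print(f"client {row['client_id']}: mem load {row['mem_load']}, "
          f"sessions {row['active_sessions']}/{row['max_sessions']} "
          f"({session_utilization_pct(row):.3f}%)")
```

In the sample, memory load is 79 while only 40 of 500000 sessions are active, which is the kind of mismatch worth correlating with traffic statistics.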

Keep in mind that a lower active session count does not necessarily indicate lower traffic volume. A small number of long-lived or high-bandwidth ("fat") flows can consume significant network bandwidth. Therefore, correlate the active session count with the actual traffic statistics observed on the device. Versa Analytics can also be used to identify the top traffic consumers and further validate whether the observed memory utilization is expected.

#show interfaces port statistics brief

This command shows the traffic statistics in bps, pps, and as a percentage.

Step 4: Review the Memory Utilization Trend in Analytics

Log in to Versa Analytics and review the memory utilization graph for the last week or month. This indicates when memory utilization started increasing and whether it dropped at any point.

If memory utilization continues to increase steadily over time and is not released, a deeper investigation is required to identify the underlying cause, as this may indicate a memory leak. A memory leak occurs when the system fails to release memory that is no longer required by a process. As the process continues to execute, additional memory blocks are allocated without releasing previously used memory. Over time, this results in continuous memory growth, eventually leading to high memory utilization.
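The distinction between steady growth and intermittent spikes can be made concrete with a simple trend check. This is a minimal sketch, assuming memory utilization percentages have been exported at regular intervals (for example, one sample per day). The sample values, function names, and thresholds are hypothetical and should be tuned for the platform.

```python
def slope_per_sample(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def largest_drop(values):
    """Largest decrease between two consecutive samples (0 if none)."""
    drops = [prev - cur for prev, cur in zip(values, values[1:])]
    return max(drops + [0])


def looks_like_steady_growth(values, min_slope=0.5, max_drop=5.0):
    """True when memory trends upward and is never released by more than max_drop."""
    return slope_per_sample(values) >= min_slope and largest_drop(values) <= max_drop


# Hypothetical daily memory utilization samples, in percent.
steady = [50, 52, 54, 56, 58, 60, 62]
spiky = [50, 80, 51, 79, 50, 52]

print(looks_like_steady_growth(steady))
print(looks_like_steady_growth(spiky))
```

A positive slope with no significant drops is consistent with memory that is not being released; a series that returns to baseline after each peak fits Case 1 instead.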

Identify when the memory utilization began increasing and review any events or changes that occurred around that time. Possible triggers include:

Software upgrade or patch installation

Network configuration changes

Introduction of new services or security profiles (IPS/IDS, URL Filtering, SSL Inspection, etc.)

Network instability or topology changes

Memory leaks have also been observed in scenarios involving frequent P2MP neighbor churn, IKE flapping, CGNAT flapping, and similar recurring events. Once you confirm that memory utilization is consistently increasing beyond normal operating levels and is not being reclaimed, follow the steps below to identify the root cause.

Step 5: Collect and Analyze tshoot.log

Log in to the appliance shell and obtain /var/log/versa/tshoot.log on older releases, or /var/log/versa/debug/mem-high/tshoot.log on the latest VOS releases. Start the investigation from the tshoot.log file obtained from either location and check the “top -H” output.

Note: Do not use the htop output to analyze memory or CPU utilization, as the reported system usage can be misleading. Always use top -H for accurate interpretation of process and thread-level resource utilization.

top - 21:15:16 up 40 days, 23:33,  1 user,  load average: 4.70, 4.38, 4.35
Threads: 329 total,   5 running, 324 sleeping,   0 stopped,   0 zombie
%Cpu(s): 20.7 us,  1.9 sy,  0.0 ni, 60.0 id,  0.5 wa,  0.0 hi,  0.2 si, 16.7 st
KiB Mem:   5964680 total,  4853592 used,  1111088 free,   176892 buffers
KiB Swap:  8384508 total,      124 used,  8384384 free.   850332 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
21781 root      20   0 3942172 900700 163920 S  1.7 15.1   1060:47 versa-vsmd
22108 root      20   0 3942172 900700 163920 S  0.0 15.1   0:00.00 eal-intr-thread
22109 root      20   0 3942172 900700 163920 R 35.8 15.1  19534:58 worker-0
22110 root      20   0 3942172 900700 163920 R 30.9 15.1  16735:09 worker-1
22111 root      20   0 3942172 900700 163920 S 32.5 15.1  16682:34 worker-2
22112 root      20   0 3942172 900700 163920 S 30.9 15.1  16713:24 worker-3
22113 root      20   0 3942172 900700 163920 R 30.2 15.1  16839:41 worker-4
22114 root      20   0 3942172 900700 163920 R 31.9 15.1  16687:02 worker-5
22115 root      20   0 3942172 900700 163920 S 16.3 15.1   9306:44 poller-0
22260 root      20   0 3942172 900700 163920 S  3.3 15.1   1765:11 vunet-timer
22281 root      20   0 3942172 900700 163920 S  1.3 15.1 883:48.57 ctrl-data-0
22284 root      20   0 3942172 900700 163920 S  3.0 15.1   1706:20 ipsec-control
22285 root      20   0 3942172 900700 163920 S  0.0 15.1   2:37.62 macsec-control
22289 root      20   0 3942172 900700 163920 S  0.3 15.1 110:31.81 lcore-watchdog
22339 root      20   0 3942172 900700 163920 S  0.0 15.1   1:11.56 kni-handle-requ
20348 root      20   0  877988 184456   6396 S  0.0  3.1 196:32.03 confd
20349 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.07 confd
20350 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.07 confd
20351 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.14 confd
20352 root      20   0  877988 184456   6396 S  0.0  3.1   0:03.22 confd
20353 root      20   0  877988 184456   6396 S  0.0  3.1   0:10.56 confd
20354 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.04 confd
20355 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.08 confd
20356 root      20   0  877988 184456   6396 S  0.0  3.1   0:02.89 confd
20357 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.06 confd
20358 root      20   0  877988 184456   6396 S  0.0  3.1   0:07.55 confd

The top output is very useful because it provides the actual memory usage (resident memory, RES) of each process. It also gives statistics for available memory, used memory, and buffer and cache memory. If you run top -H from the shell, press Shift+M to sort the output by memory usage. By default, top sorts processes by CPU usage.

You can also dump /proc/meminfo, which shows detailed memory counters, including the Huge Page allocation. You can also collect the output of “sudo cat /proc/vmallocinfo”.
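Because top -H lists every thread, and all threads of a process report the same RES, the rows should be collapsed per process before ranking. The sketch below does this with a heuristic: rows that share USER, VIRT, RES, and SHR are treated as one process, and the lowest thread ID is taken as the main thread. That grouping rule and the function name are assumptions for illustration, not part of top itself.

```python
from collections import defaultdict


def processes_by_res(top_text):
    """Collapse `top -H` thread rows into processes, sorted by RES (KiB)."""
    groups = defaultdict(list)
    for line in top_text.splitlines():
        f = line.split()
        if len(f) < 12 or not f[0].isdigit():
            continue  # skip summary and header lines
        if not (f[4].isdigit() and f[5].isdigit() and f[6].isdigit()):
            continue  # skip rows where sizes carry a unit suffix
        key = (f[1], int(f[4]), int(f[5]), int(f[6]))  # USER, VIRT, RES, SHR
        groups[key].append((int(f[0]), " ".join(f[11:])))
    processes = []
    for (_user, _virt, res, _shr), threads in groups.items():
        leader_tid, leader_name = min(threads)  # lowest TID assumed to be main thread
        processes.append({"pid": leader_tid, "command": leader_name,
                          "res_kib": res, "threads": len(threads)})
    return sorted(processes, key=lambda p: p["res_kib"], reverse=True)


# A few rows from the sample output above.
SAMPLE = """\
21781 root      20   0 3942172 900700 163920 S  1.7 15.1   1060:47 versa-vsmd
22109 root      20   0 3942172 900700 163920 R 35.8 15.1  19534:58 worker-0
22115 root      20   0 3942172 900700 163920 S 16.3 15.1   9306:44 poller-0
20348 root      20   0  877988 184456   6396 S  0.0  3.1 196:32.03 confd
20349 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.07 confd
"""

ranked = processes_by_res(SAMPLE)
for proc in ranked:
    print(f"{proc['command']:<12} RES {proc['res_kib']} KiB, {proc['threads']} threads")
```

Summing %MEM or RES across the thread rows would overstate usage many times over, which is why the rows are grouped first.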

Review the Huge Page allocation and ensure it is allocated correctly. We have observed instances, particularly on virtual deployments, where Huge Page memory is not allocated correctly during system resizing.

Verify that the allocated Huge Pages match the following recommendations:

4 vCPU / 8 GB deployment: 2 GB of Huge Pages

8 vCPU / 16 GB deployment: 4 GB of Huge Pages

If you observe incorrect Huge Page allocation on virtual deployments (KVM, ESXi, Azure, AWS, etc.), execute the following command and reboot the instance:

vsh fixgrub

The vsh fixgrub command updates the boot configuration to allocate the appropriate Huge Page memory during the next boot.

$ cat /proc/meminfo
MemTotal:        5964680 kB
MemFree:         1113440 kB
MemAvailable:    1898280 kB
Buffers:          176892 kB
Cached:           850772 kB
SwapCached:          124 kB
Active:          2203948 kB
Inactive:         284024 kB
Active(anon):    1297588 kB
Inactive(anon):   235296 kB
Active(file):     906360 kB
Inactive(file):    48728 kB
Unevictable:       71428 kB
Mlocked:           71428 kB
SwapTotal:       8384508 kB
SwapFree:        8384384 kB
Dirty:               124 kB
Writeback:             0 kB
AnonPages:       1531684 kB
Mapped:           282636 kB
Shmem:              1492 kB
Slab:             103108 kB
SReclaimable:      67284 kB
SUnreclaim:        35824 kB
KernelStack:        5312 kB
PageTables:        16116 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    10318272 kB
Committed_AS:    1963340 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       2
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
DirectMap4k:      131060 kB
DirectMap2M:     3915776 kB
DirectMap1G:     2097152 kB
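The Huge Page check can be done arithmetically from the two meminfo fields HugePages_Total and Hugepagesize. This is a minimal sketch using values from the output above; the helper names are hypothetical, and the expected value of 2 GB is the recommendation for a 4 vCPU / 8 GB deployment stated earlier.

```python
def meminfo_fields(text):
    """Parse /proc/meminfo text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def hugepage_gib(fields):
    """Total Huge Page memory in GiB (Hugepagesize is reported in kB)."""
    return fields["HugePages_Total"] * fields["Hugepagesize"] / (1024 * 1024)


# Lines copied from the /proc/meminfo sample above.
SAMPLE = """\
MemTotal:        5964680 kB
HugePages_Total:       2
HugePages_Free:        0
Hugepagesize:    1048576 kB
"""

expected_gib = 2  # recommendation for a 4 vCPU / 8 GB deployment
allocated_gib = hugepage_gib(meminfo_fields(SAMPLE))
print(f"allocated {allocated_gib:.1f} GiB, expected {expected_gib} GiB, "
      f"match: {allocated_gib == expected_gib}")
```

In the sample, two pages of 1048576 kB each give 2 GiB of Huge Pages.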

You can also collect the output of the process status (ps) command, which lists the currently running processes and is read from the /proc file system.

[admin@cpe1: ~] $ ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head
  PID  PPID CMD                         %MEM %CPU
21781 21774 /opt/versa/bin/versa-vsmd - 15.0  228
20348     1 /opt/versa/confd/lib/confd/  3.0  0.3
21918     1 /opt/versa/bin/redis-server  1.8  1.1
21158 21154 /opt/versa/bin/versa-vmod -  1.7  0.0
21244     1 /opt/versa/bin/versa-rtd -N  1.6  0.0
20858     1 /usr/bin/python3 /opt/versa  1.0  0.0
20699 20697 /opt/versa/bin/versa-dnsd -  0.9  0.0
20862 20826 /opt/versa/bin/versa-acctmg  0.9  1.2
20810     1 /usr/bin/nodejs /opt/versa/  0.8  0.0

Note the name of the process that is consuming a high amount of memory.

Step 6: Check Memory Allocation for the Top Consumer

Once you identify the process that is consuming the most memory, for example versa-vsmd, log in to the vsmd VTY and check the memory allocation.

Execute the following vmalloc command:

vsm-vcsn0> show vsm statistics vmalloc

ID     ID String                         alloc    free   in_use  fail  nmismatch    used-bytes 
------ --------------------------------- ------   ------ ------- ----- ----------   ---------- 
1740   VMEM_ID_MCAST_FWD_STATS           1        0      1       0     0            1.2 KB
1743   VMEM_ID_MCAST_OIFLIST_HTABLE      7        0      7       0     0            784.0 KB
1745   VMEM_ID_VS_UTILS_PROFILER         6        0      6       0     0            30.0 KB
1755   VMEM_ID_MON_DATASTORE             3        0      3       0     0            416 B
1757   VMEM_ID_MON_DAC                   1        0      1       0     0            56.0 KB
1758   VMEM_ID_MON_CRC                   2        0      2       0     0            1.0 MB
1759   VMEM_ID_MON_ARC                   2        0      2       0     0            112.0 KB
1807   VMEM_ID_PRNG_CTXT                 6        0      6       0     0            192 B
1808   VMEM_ID_FDT                       7        0      7       0     0            770.0 KB
1812   VMEM_ID_MACSEC_CFG_PATH_GEN       8        0      8       0     0            1.5 KB
1821   VMEM_ID_MACSEC_STATS              7        0      7       0     0            1.9 KB
1823   VMEM_ID_DOT1X_LIB                 126      12     114     0     0            21.6 KB
1825   VMEM_ID_MACSEC_AUTH_DETAIL_TBL    4        0      4       0     0            640.2 KB
1848   VMEM_ID_TWAMP_MOD_HTABLE          7        0      7       0     0            12.2 KB
1873   VMEM_ID_VSMD_DLP_CFG_COOKIE       2        0      2       0     0            20.0 KB
1882   VMEM_ID_VSMD_DLP_TENANT_MAX       3        0      3       0     0            4.8 KB
1885   VMEM_ID_VSMD_DLP_RTE_RING         2        0      2       0     0            5.0 KB
1893   VMEM_ID_DYNAMIC_SCALE_GLOB_DATA   2        0      2       0     0            96 B
1897   VMEM_ID_VSMD_MDM_CFG_COOKIE       2        0      2       0     0            20.0 KB
1906   VMEM_ID_VSMD_MDM_TENANT_MAX       3        0      3       0     0            4.5 KB
1911   VMEM_ID_VSMD_MDM_STATS_THRAED     1        0      1       0     0            112 B
1914   VMEM_ID_HS_WRAPPER_CONTROL        1        0      1       0     0            320 B
----------------------------------------------------------------------------------------------
Total                               332479579 331953710 525869   0     0            595.5 MB

The vmalloc command is very useful for viewing the memory allocations made by vsmd. Compare the alloc and in_use columns to check whether there is a leak under a specific module or function. In cases where the vmalloc usage does not yield any clue, obtain the total memory allocated to versa-vsmd from the resident memory (RES) in the "top -H" output, compare it with the total used memory in the vmalloc statistics, and check the difference.
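To find the largest consumers in a long vmalloc listing, the rows can be parsed and ranked by used-bytes. This is a minimal sketch that assumes the nine-column row layout shown above and that KB and MB are 1024-based units; both are assumptions, and the function names are hypothetical.

```python
# Assumed unit multipliers; the article does not state whether units are 1000- or 1024-based.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_vmalloc(text):
    """Parse data rows of `show vsm statistics vmalloc` output."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        # Data rows: ID, name, alloc, free, in_use, fail, nmismatch, size, unit
        if len(f) != 9 or not f[0].isdigit() or f[8] not in UNIT_BYTES:
            continue
        rows.append({
            "name": f[1],
            "alloc": int(f[2]),
            "free": int(f[3]),
            "in_use": int(f[4]),
            "used_bytes": round(float(f[7]) * UNIT_BYTES[f[8]]),
        })
    return rows


def top_consumers(rows, count=3):
    """Return the rows with the highest used_bytes, largest first."""
    return sorted(rows, key=lambda r: r["used_bytes"], reverse=True)[:count]


# Rows copied from the sample output above.
SAMPLE = """\
1743   VMEM_ID_MCAST_OIFLIST_HTABLE      7        0      7       0     0            784.0 KB
1758   VMEM_ID_MON_CRC                   2        0      2       0     0            1.0 MB
1808   VMEM_ID_FDT                       7        0      7       0     0            770.0 KB
1823   VMEM_ID_DOT1X_LIB                 126      12     114     0     0            21.6 KB
"""

rows = parse_vmalloc(SAMPLE)
for row in top_consumers(rows):
    print(f"{row['name']:<32} in_use={row['in_use']:<6} used={row['used_bytes']} bytes")
```

Running the same parse on two outputs collected some hours apart and comparing in_use per ID shows which allocation keeps growing.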

If the memory accounted for by vmalloc is much lower than the resident memory reported by "top -H", memory blocks are being allocated through direct malloc calls that are not accounted for in the vsmd vmalloc statistics. In such cases, you may have to resort to jemalloc-caller-stats, which must be enabled explicitly. Engineering involvement is required.
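The comparison between resident memory and the vmalloc total is simple arithmetic. The sketch below uses the versa-vsmd RES value from the top -H sample and the Total line from the vmalloc sample purely as illustrative inputs; they may not come from the same device or moment, and treating MB as MiB is an assumption.

```python
def unaccounted_mib(res_kib, vmalloc_total_mib):
    """Resident memory not covered by vmalloc accounting, in MiB."""
    return res_kib / 1024 - vmalloc_total_mib


res_kib = 900700           # RES of versa-vsmd in the top -H sample (KiB)
vmalloc_total_mib = 595.5  # "Total" used-bytes in the vmalloc sample

res_mib = res_kib / 1024
gap = unaccounted_mib(res_kib, vmalloc_total_mib)
accounted_pct = 100 * vmalloc_total_mib / res_mib

print(f"RES {res_mib:.1f} MiB, vmalloc {vmalloc_total_mib:.1f} MiB, "
      f"unaccounted {gap:.1f} MiB ({accounted_pct:.1f}% accounted)")
```

Some gap is normal, because RES also includes memory that vmalloc does not track. What matters is whether the gap keeps growing between collections.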

vsm-vcsn0>  show vparse memory vdetect

ID ID String                               alloc free in_use  fail nmismatch  used-bytes 
-- --------------------------------------- ------------------ --------------- ---------- 
0  VDETECT_MEM_ID_LIB_GEN                  1     0    1       0    0          32 B
1  VDETECT_MEM_ID_SURICATA_APP_LAYER       27    5    22      0    0          7.9 KB
2  VDETECT_MEM_ID_MAIL_PARSER_SMTP_ANAMOLY 193   162  31      0    0          25.8 KB
3  VDETECT_MEM_ID_MAIL_PARSER_GEN          1     0    1       0    0          16 B
4  VDETECT_MEM_ID_DCERPC_PARSER_GEN        1     0    1       0    0          16 B
5  VDETECT_MEM_ID_FTP_PARSER_GEN           1     0    1       0    0          8 B
6  VDETECT_MEM_ID_JSNORM_ENGINE_GEN        18    0    18      0    0          163.9 KB
7  VDETECT_MEM_ID_LIBNORM_GEN              60    0    60      0    0          23.2 MB
8  VDETECT_MEM_ID_IPREP_GEN                3     0    3       0    0          1.4 KB
9  VDETECT_MEM_ID_URLF_VCATFEED_FEEDS      1     0    1       0    0          1.5 MB
10 VDETECT_MEM_ID_URLF_VCATFEED_MD5INFO    1     0    1       0    0          10.0 MB
11 VDETECT_MEM_ID_URLF_VCATFEED_INDEX      1     0    1       0    0          320.0 KB
12 VDETECT_MEM_ID_BC_URLF_CONFIG_VALUES    1     0    1       0    0          6.0 KB
13 VDETECT_MEM_ID_BC_URLF_SDK_STATUS       1     0    1       0    0          320 B
14 VDETECT_MEM_ID_BC_URLF_LOOKUP_SOURCE    1     0    1       0    0          32 B
15 VDETECT_MEM_ID_BC_URLF_TLD_HASH_TABLE   1     0    1       0    0          10.0 KB
16 VDETECT_MEM_ID_IPS_FILTERS              780   0    780     0    0          17.4 KB
17 VDETECT_MEM_ID_AV_GEN                   2     1    1       0    0          96 B
18 VDETECT_MEM_ID_FILEMAGIC_GEN            63    0    63      0    0          2.0 KB
19 VDETECT_MEM_ID_DEVID_GEN                49176 0    49176   0    0          3.0 MB
21 VDETECT_MEM_ID_HTP_CFG                  8     0    8       0    0          8.8 KB
22 VDETECT_MEM_ID_HTP_LIST                 24    0    24      0    0          1.5 KB
24 VDETECT_MEM_ID_HTP_HOOK                 25    0    25      0    0          400 B
25 VDETECT_MEM_ID_CPP_LIB_GEN              10    0    10      0    0          608 B
----------------------------------------------------------------------------- ----------
Total                                    50400  168   50232   0    0          38.3 MB

vsm-vcsn0> show idp mem non-zero <<<< Collect the IDP stats which are mostly for IDS/IPS.

Step 7: Check the thrm Event Statistics

Check the thrm event statistics to see whether any queue build-up is present. In some cases, one of the cores is hogged and memory increases because of the queue build-up.

vsm-vcsn0> show vsm statistics thrm evstats

---------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------------

Thread group: 1

------------------------------------------------------------------------------------------------------------

3943(1)| 3948(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3943(1)| 3949(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3943(1)| 3907(2)| 2349577952| 2016057686| 0| 0| 0| 0| <<<<<

------------------------------------------------------------------------------------------------------------

3943(1)| 4155(3)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3944(1)| 3948(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3944(1)| 3949(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3944(1)| 3907(2)| 76294088| 35576529| 0| 0| 0| 0| <<<<<

------------------------------------------------------------------------------------------------------------

3944(1)| 4155(3)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3945(1)| 3948(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3945(1)| 3949(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3945(1)| 3907(2)| 74415257| 35138969| 0| 0| 0| 0| <<<<<

------------------------------------------------------------------------------------------------------------

3945(1)| 4155(3)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3946(1)| 3948(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3946(1)| 3949(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3946(1)| 3907(2)| 77491717| 34925201| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3946(1)| 4155(3)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3947(1)| 3948(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3947(1)| 3949(0)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3947(1)| 3907(2)| 102054976| 34582805| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------

3947(1)| 4155(3)| 0| 0| 0| 0| 0| 0|

------------------------------------------------------------------------------------------------------------
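To pick out the rows worth a closer look in a long evstats listing, the pipe-separated rows can be filtered for non-zero counters. The column headers are not shown in the output above, so this minimal sketch does not interpret individual counters; it only reports rows where any counter is non-zero. The function names are hypothetical.

```python
def parse_evstats_rows(text):
    """Return (first id, second id, [counters]) for each pipe-separated row."""
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # separator lines and headings
        parts = [p.strip() for p in line.split("|")]
        counters = [int(p) for p in parts[2:] if p.isdigit()]
        if counters:
            rows.append((parts[0], parts[1], counters))
    return rows


def rows_with_activity(rows):
    """Keep only rows where at least one counter is non-zero."""
    return [row for row in rows if any(row[2])]


# Rows copied from the sample output above.
SAMPLE = """\
3943(1)| 3948(0)| 0| 0| 0| 0| 0| 0|
------------------------------------------------------------
3943(1)| 3907(2)| 2349577952| 2016057686| 0| 0| 0| 0| <<<<<
------------------------------------------------------------
3944(1)| 3907(2)| 76294088| 35576529| 0| 0| 0| 0| <<<<<
"""

active = rows_with_activity(parse_evstats_rows(SAMPLE))
for first_id, second_id, counters in active:
    print(first_id, second_id, counters)
```

Comparing the counters of these rows across two collections helps show whether a queue is still building up.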

Analyzing Tech-support:

In some cases, the vmalloc statistics do not directly indicate what is consuming the memory. In such instances, analyze the device logs in more depth to understand the trigger that is causing the high memory usage.

Analyze the logs to determine whether there is any network instability (interface flaps, protocol flaps, IKE/IPsec flaps, P2MP neighbor churn, CGNAT churn, and so on) on the device or in the network.

#Check whether you see a lot of NBR:CHANGE events from a specific site ID. The log entries also identify the branch.

cat /var/log/versa/versa-infmgr.log | grep -i "P2MP:NBR:CHANGE"

2020-01-10 11:40:08.155 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.383 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.704 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.721 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.768 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1
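Counting the NBR:CHANGE events per branch makes the noisy sites stand out. This is a minimal sketch that assumes the log line format shown above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the P2MP neighbor change lines shown in the sample above.
NBR_CHANGE = re.compile(r"P2MP:NBR:CHANGE remote network-id (\d+), branch-id (\d+)")


def nbr_changes_per_branch(log_lines):
    """Count P2MP:NBR:CHANGE events per branch-id."""
    counts = Counter()
    for line in log_lines:
        match = NBR_CHANGE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Lines copied from the versa-infmgr.log sample above.
SAMPLE = """\
2020-01-10 11:40:08.155 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1
2020-01-10 11:40:08.383 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1
2020-01-10 11:40:08.704 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1
2020-01-10 11:40:08.721 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1
2020-01-10 11:40:08.768 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1
"""

counts = nbr_changes_per_branch(SAMPLE.splitlines())
for branch_id, events in counts.most_common():
    print(f"branch-id {branch_id}: {events} changes")
```

On a device, the same function can be fed the lines of /var/log/versa/versa-infmgr.log instead of the embedded sample.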

#Verify whether there is a lot of NAT binding churn:

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:4661),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:29461),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:27748),
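Binding churn shows up as the same internal address mapping to many different external ports. The sketch below groups the distinct external bindings per internal address, assuming log lines in the format shown above; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches "bindings(<routing-instance>:<internal IP> -> <external IP>:<port>)".
BINDING = re.compile(r"bindings\(([\w-]+):([\d.]+) -> ([\d.]+):(\d+)\)")


def external_bindings(lines):
    """Map (routing instance, internal IP) -> set of (external IP, port)."""
    seen = defaultdict(set)
    for line in lines:
        match = BINDING.search(line)
        if match:
            vr, internal_ip, external_ip, external_port = match.groups()
            seen[(vr, internal_ip)].add((external_ip, int(external_port)))
    return seen


# Lines copied from the sample above.
SAMPLE = """\
link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:4661),
link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:29461),
link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:27748),
"""

bindings = external_bindings(SAMPLE.splitlines())
for (vr, internal_ip), externals in bindings.items():
    print(f"{vr} {internal_ip}: {len(externals)} distinct external bindings")
```

A count that keeps rising for one internal address over a short period points to churn on that link.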

#Collect the following from the branches/sites isolated in the step above, and verify whether the issue is caused by the paired-site ID, a NAT change, a bug (EIM changing every 1-5 minutes), or link flaps.

The paired-site location-id must be the same on both devices of an HA pair and globally unique. It must not overlap with the site ID of any other CPE or Controller; the only exception is that the site ID of one of the CPEs in the HA pair can be used as the location-id on both.

cat /var/log/versa/versa-infmgr.log

cat /var/log/versa/versa-service.log

vsh connect vsmd

show cgnat ei-mappings <local-tnt-id> >> Run at intervals of 5 to 10 minutes and check whether the mapping is changing (bug 42251)

show cgnat ei-filters <local-tnt-id>

Check the device state and configuration, and compare them against the date on which the device started showing the memory increase. Look for any new changes that could have triggered the usage.
Check whether any upgrade (VOS, SPack, or OSS Pack) was performed on this device.
Check whether any new security module configuration, such as SSL inspection or IDS/IPS, was added.

Collect the output of the following commands from the VOS CLI and shell.

From the VOS CLI:

show interfaces port statistics brief

show interfaces port statistics detail

show system details

show device clients

show orgs org Tenant-1 sessions summary <Change the tenant name>

show configuration system service-options | details

show system load-stats

show statistics internal memusage <Make use of unhide full>

From shell:

top -H (Shift +M)

free -h

lscpu

vsh details

cat /proc/net/dev

cat /proc/meminfo

ps -o pid,user,%mem,command ax | sort -b -k3 -r

ps axo %mem,rss,pid,euser,cmd | sort -nr | head -n 10

ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head

ps -o rss,sz,vsz `pidof versa-vsmd`

sudo cat /proc/<pid of the process>/status <Example: sudo cat /proc/21781/status>

sudo cat /proc/vmallocinfo

sudo cat /proc/<pid of the process>/maps <Example: sudo cat /proc/21781/maps>

sudo cat /proc/<pid of the process>/smaps <Example: sudo cat /proc/21781/smaps>

sudo pmap -XX `pidof versa-vsmd` <Example: sudo pmap -XX 21781>

From VSMD VTY: (collect these commands if versa-vsmd is the top consumer)

Use the “vsh connect vsmd” shell command to enter the vsmd VTY prompt.

show vsm statistics vmalloc sort

show vparse memory vdetect sort

show vsm statistics rtemalloc summary

show vsm statistics rtemalloc segments

show vsm statistics rtemalloc memzones

show vsm statistics mbuf

show vsm statistics jemalloc | between CPU Merged

show vsm statistics jemalloc

show vsm cpu info

show idp mem non-zero

show vsf nfp module stats brief

show vsm statistics dropped

show vsm statistics datapath

show vsm statistics thrm evstats

show vsm statistics thrm detail

From INF-MGR:

show stats vmalloc sort

show stats vsm

From Vmod:

show vmalloc statistics sort

From Analytics:

Collect the Memory-Usage/CPU-Usage/Access-Circuits-usage/Top-Consumer graphs from analytics for correlation.
