Solr buffers incoming logs in its transaction logs (tlogs) and periodically flushes the data to disk. If a replica is stuck in the "Recovery" state, it keeps writing to the tlogs without flushing the contents of the older tlogs. As a result, the tlog files can grow to terabytes in size and cause high disk space utilization.
This issue can occur when Solr goes into a recovery loop, in which a Solr replica enters the "Recovery" state and is unable to exit it because of the high logging rate.
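To confirm that the tlogs are what is consuming the disk, the size of the tlog directory can be measured directly on the affected node. The following is a minimal sketch, assuming the tlog path is the one used in the cleanup step of this article; the helper names tlog_count and tlog_size_kb are hypothetical, and reading the directory may require root privileges.

```shell
#!/bin/sh
# Path taken from the cleanup step in this article; other cores or shards
# will live under a different directory (assumption: adjust as needed).
TLOG_DIR="${TLOG_DIR:-/var/lib/solr/data/data/searchlogs_shard1_replica1/data/tlog}"

# tlog_count: print the number of regular files under the given directory.
tlog_count() {
    find "$1" -type f | wc -l | tr -d ' '
}

# tlog_size_kb: print the total size of the given directory in kilobytes.
tlog_size_kb() {
    du -sk "$1" | cut -f1
}

# Report only when the directory exists on this node.
if [ -d "$TLOG_DIR" ]; then
    echo "tlog files: $(tlog_count "$TLOG_DIR")"
    echo "tlog size (KB): $(tlog_size_kb "$TLOG_DIR")"
    # Free space on the filesystem that holds the tlogs.
    df -h "$TLOG_DIR"
fi
```

Running the check a few minutes apart shows whether the tlogs are still growing, which is consistent with a replica that has not left the "Recovery" state.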
Logs to be gathered in this case:
- Admin > System Status > Status
  Shows a visualization of the status of the Search nodes.
- Admin > System Status > Database (with the time range set to the last 7 days)
  Provides trends in the search logging rate and the total search logs currently residing in the Solr cluster.
- Admin > Configuration > Data Configurations > Search Data Configurations
  Provides information regarding the customer's search log preferences and retention.
- Admin > Configuration > Data Configurations > Search logs Daily limit configurations
  Provides information regarding the daily limit for search logs enforced at the Global/Tenant levels.
- Time on all nodes in the cluster should be in sync. Use the timedatectl command to verify this.
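The time check can be scripted so that it gives the same yes/no answer on every node. This is only a sketch: the function names clock_synced and check_clock are hypothetical, and the matched text assumes that timedatectl prints either "System clock synchronized: yes" (newer systemd releases) or "NTP synchronized: yes" (older releases). If the output on your nodes is worded differently, adjust the pattern.

```shell
#!/bin/sh
# clock_synced: read timedatectl output on stdin and succeed only if it
# reports a synchronized clock (wording assumed, see the note above).
clock_synced() {
    grep -Eiq '(System clock synchronized|NTP synchronized): *yes'
}

# check_clock: run timedatectl on this node and print a one-line verdict.
check_clock() {
    if timedatectl 2>/dev/null | clock_synced; then
        echo "clock synchronized"
    else
        echo "clock NOT synchronized (or timedatectl unavailable)"
        return 1
    fi
}
```

Call check_clock on each node of the cluster. It only reports whether that node considers its own clock synchronized, so comparing the output of date -u across nodes is still a useful second check.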
To recover the disk space, please perform the following steps:
- Stop Versa Analytics services on the node facing the issue: vsh stop
- Stop Solr on the same node: sudo service solr stop
- Clear all the tlogs: sudo rm /var/lib/solr/data/data/searchlogs_shard1_replica1/data/tlog/*
- Start Solr: sudo service solr start
- Start Versa Analytics services: vsh start
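The steps above can be collected into one script so that they always run in the same order. The sketch below is an illustration, not a supported tool: the names run, recover_tlog_space and DRY_RUN are hypothetical, the script defaults to a dry run that only prints each command, and it uses find with -delete in place of the rm glob so that the file list is built with root privileges. It stops at the first failing step, which can leave services stopped and in need of a manual restart.

```shell
#!/bin/sh
# Path taken from the cleanup step in this article (adjust for other cores).
TLOG_DIR="${TLOG_DIR:-/var/lib/solr/data/data/searchlogs_shard1_replica1/data/tlog}"
# Safe default: print the commands only. Set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# recover_tlog_space: stop services, delete the tlog files, restart services.
# Returns at the first failing step so that tlogs are never deleted while
# Solr is still running.
recover_tlog_space() {
    run vsh stop || return 1
    run sudo service solr stop || return 1
    # Swapped in for "sudo rm .../tlog/*" (an assumption of this sketch).
    run sudo find "$TLOG_DIR" -type f -delete || return 1
    run sudo service solr start || return 1
    run vsh start
}
```

Calling recover_tlog_space with the default settings prints the five commands for review. Once the printed commands look correct for the node, set DRY_RUN=0 before calling it again to run them.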