Having worked on many large-scale Splunk environments, I’ve noticed a common problem: Splunk forwarders phone home to the deployment server (DS) too frequently. A forwarder that phones home more often than necessary wastes resources on the DS and can prevent the DS from deploying apps to forwarders correctly.
By default, a Splunk Universal Forwarder or full Splunk Enterprise instance phones home to the deployment server every 60 seconds. Even in a Splunk environment of moderate size, this can easily overwhelm the resources of the DS. How quickly do you really need to deploy changes to your forwarders anyway? I normally recommend a phone home interval of at least 600 seconds (10 minutes).
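As a minimal sketch of that change, the interval can be raised on each deployment client in deploymentclient.conf, using the `phoneHomeIntervalInSecs` setting in the `[deployment-client]` stanza. How you distribute the file (for example, inside an app pushed from the DS itself) is an assumption left to your environment, and 600 is the value recommended above, not a Splunk default.

```
# deploymentclient.conf on the forwarder (or in an app deployed to it)
[deployment-client]
# Phone home every 10 minutes instead of the default 60 seconds
phoneHomeIntervalInSecs = 600
```

The forwarder typically needs a restart before it picks up the new interval.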
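Here’s a simple query you can use to find the forwarders that phone home most frequently. It counts phone home events per host over your search time range and divides the elapsed time by that count, so a forwarder on the default interval shows up at roughly one minute per phone home: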
index=_internal source=*splunkd.log "running phone"
| stats count min(_time) as min_time max(_time) as max_time by host
| eval span=max_time-min_time, minutes_per_phone=span/60/count
| fields host count minutes_per_phone
| sort - count
Happy Splunking!