DNS overload

asked 2026-02-11 13:17:51 +0000

net_tech

updated 2026-02-11 15:01:38 +0000

Chuckc

Hi,

The server monitoring team insists that the server is overloaded with DNS queries, but I can't find any evidence of that in the Wireshark trace.

My reading of this capture is that the server receives an average of 250 requests per second, with occasional spikes to 600-700 requests per second. That is nothing the server can't handle.

This is a Windows Server 2022 machine with 8 CPUs and 32 GB of RAM, which should be capable of resolving well over 50k requests per second.

[image: I/O Graph screenshot]

Posting here to validate my I/O Graph and make sure I didn't mess up the scale.

Thank you
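One way to cross-check the scale of an I/O Graph is to bin the query timestamps into 1-second buckets outside Wireshark and compare the average and peak with what the graph shows. The sketch below assumes the timestamps have already been exported from the capture as epoch seconds; the function name and the sample data are hypothetical. If the graph's Interval setting is not 1 second, the Y values will not be per-second counts, which is the usual way the scale gets misread.

```python
from collections import Counter


def per_second_rates(timestamps):
    """Count events in 1-second bins and return (average, peak) per second.

    `timestamps` are epoch seconds (floats), one per DNS query.
    Seconds with no queries inside the capture window count as zero.
    """
    if not timestamps:
        return 0.0, 0
    start = min(timestamps)
    # Bin index = whole seconds elapsed since the first packet.
    bins = Counter(int(t - start) for t in timestamps)
    duration = int(max(timestamps) - start) + 1  # number of 1-second bins
    average = len(timestamps) / duration
    peak = max(bins.values())
    return average, peak


# Hypothetical data: 3 queries in second 0, 1 in second 1,
# none in second 2, 2 in second 3.
sample = [100.0, 100.2, 100.9, 101.5, 103.1, 103.4]
avg, peak = per_second_rates(sample)
print(f"average {avg:.1f}/s, peak {peak}/s")
```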


Comments

At first glance it looks OK. So it's time to focus on WHY the server team thinks the DNS is overloaded.

One possibility is an issue with forwarded requests. So instead of looking at queries TO your server and replies FROM your server, have a look at queries FROM your server and replies TO your server. Most notably, check IF you get replies and, if so, with what latency.

hugo.vanderkooij ( 2026-02-11 13:31:30 +0000 )
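The check suggested above (do forwarded queries get replies, and how fast) can be sketched as a simple pairing of queries and responses. Inside Wireshark the `dns.time` field on response packets already gives the query-to-response time, and `dns.flags.response`, `dns.id` and `dns.qry.name` identify the pairs. The sketch below does the same thing on exported records; the record layout and function name are assumptions for illustration, and it deliberately ignores retransmitted queries (a repeat simply overwrites the pending timestamp).

```python
def match_dns(records):
    """Pair queries with responses and return (latencies, unanswered).

    Each record is a hypothetical tuple:
        (time_s, txid, qname, is_response)
    Queries and responses are matched on (txid, qname).
    """
    pending = {}    # key -> time the query was sent
    latencies = {}  # key -> seconds between query and response
    for time_s, txid, qname, is_response in sorted(records, key=lambda r: r[0]):
        key = (txid, qname)
        if not is_response:
            pending[key] = time_s
        elif key in pending:
            latencies[key] = time_s - pending.pop(key)
    # Whatever is still pending never received a reply in this capture.
    return latencies, sorted(pending)


# Hypothetical records: one answered query, one unanswered query.
records = [
    (0.000, 0x1A2B, "example.com", False),
    (0.030, 0x1A2B, "example.com", True),
    (0.100, 0x3C4D, "example.org", False),
]
latencies, unanswered = match_dns(records)
print(latencies)
print(unanswered)
```

Many unanswered keys or latencies in the hundreds of milliseconds would point at the upstream forwarders rather than at the query load on the server itself.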

They shared an alert from SCOM showing that the number of queries exceeded the configured limit of 70k, but I can't find any evidence of that. What's missing from the alert is the time span, which I think points to a misconfiguration of the monitoring tool. At 250 requests per second this server passes 70k in about 5 minutes, but not in one second; at least the packets don't show a request rate that high.

net_tech ( 2026-02-11 13:40:24 +0000 )

If SCOM polls once every 5 minutes, then you will indeed see a delta of 70k+.

So did the team tune SCOM and deliberately set the threshold at 70k per 5 minutes? If so, what was their motivation? If not, please advise them not to rely on default values ;-)

SYN-bit ( 2026-02-11 13:54:02 +0000 )

Yes, SCOM was polling every 5 minutes with the threshold set at 70k. The new value is 5,000,000 requests per 5-minute interval.

net_tech ( 2026-02-11 17:10:50 +0000 )
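The numbers in this thread can be checked with a few lines of arithmetic. This is only a sketch, and it assumes the SCOM counter reports the number of queries received since the previous poll (a delta over the 5-minute interval), as the comments above suggest; the function names are made up for illustration.

```python
POLL_INTERVAL_S = 5 * 60  # SCOM polling interval from the thread: 5 minutes


def queries_per_poll(rate_per_s, interval_s=POLL_INTERVAL_S):
    """Queries counted in one polling interval at a steady rate."""
    return rate_per_s * interval_s


def threshold_as_rate(threshold, interval_s=POLL_INTERVAL_S):
    """Sustained queries per second needed to reach a per-poll threshold."""
    return threshold / interval_s


# Average load seen in the capture: already above the old 70k threshold.
print(queries_per_poll(250))   # 75000
# A sustained spike of 700 requests per second.
print(queries_per_poll(700))   # 210000
# Old threshold expressed as a rate: about 233 requests per second.
print(round(threshold_as_rate(70_000), 1))
# New threshold expressed as a rate: about 16,667 requests per second.
print(round(threshold_as_rate(5_000_000)))
```

So the old threshold fired at normal load, while the new one only fires at a sustained rate of roughly 16.7k requests per second, still well below the 50k per second the server is expected to handle.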