Debugging faulty VOIP app - Best approach?
Hi there, we are trying to fix an issue that has been affecting our small school for months. We use a Panasonic NS500 PBX unit, wired to our LAN, to manage phone calls. The NS500 is a hybrid system that handles incoming analog calls as well as SIP/TLS calls between hardphones and softphones. The SIP server is local.
While there are no issues at all with hardphones, softphone-to-softphone calls randomly have no audio. I would say that about 30% of softphone-to-softphone calls are affected, and when this happens users have to redial until the audio works. Usually the second attempt works as expected. The softphones use the official Panasonic iOS/Android app.
Unfortunately Panasonic is not really eager to help solve the issue, although we suspect the system is prone to glitches, partly because of the complexity of its architecture:
- The softphone app connects to the local PBX either through the Internet or over the LAN, depending on the current location
- The first time the app is launched, it tries both the remote and the local connection at the same time. Whichever connection is established first is the one that gets used
- Any subsequent connection attempt by the softphone goes through the connection used last time (remote or local). If this fails, a new attempt is made using the other method (remote or local)
- Before it can be used, every softphone app must authenticate to Panasonic servers, which are hosted on Amazon AWS
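To make sure the connection behaviour described in the list above is unambiguous, here is how I understand it, written as a small Python sketch. This is only my reading of the app's behaviour, not Panasonic's code; the names `pick_route` and `probe` are made up for illustration, and `probe` is assumed to return the connect time in seconds, or `None` when the connection fails.

```python
from typing import Callable, Optional

ROUTES = ("local", "remote")


def pick_route(last_route: Optional[str],
               probe: Callable[[str], Optional[float]]) -> Optional[str]:
    """Return the route the app would end up using, or None if both fail.

    probe(route) is a hypothetical helper: it returns the connect time in
    seconds for that route, or None if the connection attempt failed.
    """
    if last_route is None:
        # First launch: both routes are tried, the quickest success wins.
        timings = {route: probe(route) for route in ROUTES}
        succeeded = {r: t for r, t in timings.items() if t is not None}
        return min(succeeded, key=succeeded.get) if succeeded else None

    # Later launches: the last-used route first, then the other one.
    other = "remote" if last_route == "local" else "local"
    for route in (last_route, other):
        if probe(route) is not None:
            return route
    return None
```

One consequence, if this reading is right: a softphone that last connected remotely would keep using the remote route even when it is back on the LAN, as long as that route still works.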
So Panasonic keeps asking us to double check settings on the router, which in their opinion is causing the issue. We wonder how this can be, as the issue appears randomly.
LAN Scenario:
- Peplink router model Balance One using 5 WANs
- Core LAN + 4 VLANs - Hardphones and PBX are connected to VLAN 400
- Softphones are connected to the SIP server either through the Staff VLAN or through the Internet
- SIP-ALG is disabled
- Softphones are configured to use TLS
At the moment I am using Wireshark to capture traffic on VLAN 400, where the PBX resides, but I could also capture traffic at the router.
To make things easier and to rule out routing issues as far as possible, I have temporarily connected the softphones to the same VLAN as the PBX.
When capturing traffic, you can clearly see TCP traffic between the softphone app and the authentication servers over the Internet. Once the callee answers, Wireshark shows UDP packets flowing between the two softphones. When the no-audio issue occurs, Wireshark shows UDP traffic from the caller only after the callee answers: there is no UDP traffic from the callee, which explains why there is no audio.
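To spot the one-way UDP pattern described above without scrolling through each capture by hand, something like the following Python sketch could be used. It assumes the UDP packets have first been exported as `source,destination` lines, for example with `tshark -r call.pcap -Y udp -T fields -e ip.src -e ip.dst -E separator=,` (the file name `call.pcap` and the IP addresses in the sample are placeholders). It matches on IP addresses only, so unrelated UDP traffic such as DNS should be filtered out beforehand.

```python
import csv
from collections import Counter
from io import StringIO


def parse_pairs(text):
    """Parse 'src_ip,dst_ip' lines into a list of (src, dst) tuples."""
    return [(row[0], row[1]) for row in csv.reader(StringIO(text))
            if len(row) == 2 and row[0] and row[1]]


def one_way_pairs(pairs):
    """Return sorted (src, dst, packet_count) entries with no reply traffic.

    A pair is reported when packets were seen from src to dst but none
    at all from dst back to src.
    """
    counts = Counter(pairs)
    return sorted((src, dst, n) for (src, dst), n in counts.items()
                  if counts[(dst, src)] == 0)


# Made-up sample: .21 talks to .22 and gets no answer, .23 and .24 talk both ways.
sample = """192.168.40.21,192.168.40.22
192.168.40.21,192.168.40.22
192.168.40.23,192.168.40.24
192.168.40.24,192.168.40.23
"""
print(one_way_pairs(parse_pairs(sample)))
```

Running this over one capture per call would at least give a count of how many calls show the caller-only pattern.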
There are no errors in the firewall log that would help spot any LAN request being blocked.
Without knowing what "handshake" schemes are in effect it is difficult to understand where the issue occurs.
How would ...