Why am I seeing "J%C3%A4%C3%A4kaapi" rather than "jääkaappi"?

asked 2020-09-24

Nordic

updated 2020-09-24

Guy Harris

Should I be able to see Nordic letters in a Wireshark capture?

I'm using Wireshark 2.6.3.

I'm sending an HTTP POST and I'm missing the Scandinavian letters (Ä and Ö).

The post should say "jääkaappi" but when I capture the post I see the following: "J%C3%A4%C3%A4kaapin"

When using a different http post sniffer I can see the scandics.

Your question is missing some context, but "%C3%A4" is the percent-encoded form of the UTF-8 representation of "ä" (the octets hex C3 and hex A4). If you post a link to a capture containing the packet at issue on a public share, e.g. Google Drive, Dropbox, etc., we can comment further.

grahamb ( 2020-09-24 )

answered 2020-09-24

Guy Harris

What you're seeing is called "percent encoding". Whatever program is sending the HTTP POST is using percent encoding: for example, it has taken the two-octet UTF-8 representation of the ä character (hex C3 followed by hex A4, as per @grahamb's comment) and percent-encoded each of those octets, giving "%C3%A4".
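To make that concrete, here is a minimal Python sketch of the encoding step. It assumes the sending program behaves roughly like Python's standard `urllib.parse.quote`, which may not match what your program actually does; the variable names are just for illustration.

```python
from urllib.parse import quote

text = "jääkaappi"

# Step 1: the text is encoded as UTF-8; each "ä" (U+00E4) becomes
# the two octets 0xC3 0xA4.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)   # b'j\xc3\xa4\xc3\xa4kaappi'

# Step 2: every non-ASCII octet is written as "%" plus two hex digits.
# quote() uses UTF-8 by default.
encoded = quote(text)
print(encoded)      # j%C3%A4%C3%A4kaappi
```

The second line of output has the same shape as the string you see in the capture.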

Wireshark isn't the one doing the percent-encoding; it's showing you what was in the packet, without percent-decoding it.

"When using a different http post sniffer I can see the scandics."

Perhaps that's not a packet sniffer, so it doesn't capture and display raw network packets, which is what Wireshark does. You refer to it as an "http post sniffer"; perhaps it's a proxy sniffer, and perhaps it undoes the percent-encoding to show you what the percent-encoded text represents. What sniffer is it?
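As a rough illustration of what "undoing the percent-encoding" means, here is a minimal Python sketch that decodes the string from your capture. It is only a guess at what such a tool might do internally, using the standard `urllib.parse.unquote`; the variable names are hypothetical.

```python
from urllib.parse import unquote

# The text exactly as it appears in the captured packet.
captured = "J%C3%A4%C3%A4kaapin"

# unquote() turns each %XX back into an octet and then decodes the
# octets as UTF-8 (its default encoding).
decoded = unquote(captured)

# The two "%C3%A4" sequences are now the letter ä again.
assert decoded == "Jääkaapin"
```

If the POST body is form data (application/x-www-form-urlencoded), spaces may have been sent as "+", in which case `urllib.parse.unquote_plus` would be the closer match.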

