This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

How to dissect UTF8 string and bits with a custom dissector

I am trying to write a custom dissector as a plugin on Windows.
I am using version 2.1.1-git, win32.

Q1
I have a string with a length of 320, encoded as Unicode UTF-16LE. I parse it as shown below, but only the first character appears in the front-end display.
For example, if I receive the string "Hello Lee", I see only "H" in the front end after parsing.
Here is the relevant code.

void
proto_register_foo(void)
{
    //...
    static hf_register_info hf[] = {
        { &hf_foo_message, { "Message", "foo.message", FT_STRING, STR_UNICODE, NULL, 0x0, NULL, HFILL } }
    };
    //...
}

static int
dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree _U_, void *data _U_)
{
    //...
    proto_tree_add_item(foo_tree, hf_foo_message, tvb, 0, 320, ENC_LITTLE_ENDIAN);
    //...
}

I also tried to get the string from tvb first like this:

GByteArray temp_str;
temp_str.data = malloc(320);
temp_str.len = 320;
tvb_get_string_bytes(tvb, *offset, 320, STR_UNICODE, &temp_str, endoff);
free(temp_str.data);
temp_str.len = 0;

But I am not sure how to set the variable endoff.

Q2
I have a byte that includes three variables:
bit 1-2 is value_a
bit 3-6 is value_b
bit 7-8 is value_c

I am trying to put them into static hf_register_info hf[], but I found nothing similar to FT_BITS; what should I do?

asked 07 Jul '16, 20:32

SulfredLee


One Answer:

Separate questions should be asked separately, but:

Q1

> Unicode UTF-16LE encoded
> ...
> proto_tree_add_item(foo_tree, hf_foo_message, tvb, 0, 320, ENC_LITTLE_ENDIAN);

For string values, you need to specify the full encoding, not just the byte order. In this case, it's ENC_UTF_16|ENC_LITTLE_ENDIAN, so

 proto_tree_add_item(foo_tree, hf_foo_message, tvb, 0, 320, ENC_UTF_16|ENC_LITTLE_ENDIAN);
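To illustrate why only "H" showed up, here is a rough sketch in plain C that does not use Wireshark at all. It assumes that, without ENC_UTF_16, the octets end up being read one byte per character, so the 0x00 high byte that follows 'H' in UTF-16LE ends the displayed text. The helpers single_byte_len() and utf16le_to_utf8() are hypothetical names made up for this example; inside a dissector the conversion is done for you once the full encoding is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: length seen by a one-byte-per-character reader,
 * i.e. the number of octets before the first 0x00 (or len if none). */
static size_t single_byte_len(const uint8_t *in, size_t len)
{
    const uint8_t *nul = memchr(in, 0, len);
    return nul ? (size_t)(nul - in) : len;
}

/* Hypothetical helper (not a Wireshark API): decode len octets of UTF-16LE
 * into NUL-terminated UTF-8 in out (capacity cap). Stops at a U+0000 code
 * unit, at the end of input, or when out is full. Unpaired surrogates
 * become U+FFFD. Returns the octets written, excluding the final NUL. */
static size_t utf16le_to_utf8(const uint8_t *in, size_t len, char *out, size_t cap)
{
    size_t i = 0, o = 0;
    assert(in != NULL && out != NULL && cap > 0);

    while (i + 1 < len) {
        /* Little-endian: low octet first, high octet second. */
        uint32_t cp = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
        i += 2;
        if (cp == 0)
            break;
        if (cp >= 0xD800 && cp <= 0xDBFF) {            /* high surrogate */
            uint32_t lo = 0;
            if (i + 1 < len)
                lo = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {     /* stray low surrogate */
            cp = 0xFFFD;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= cap)                           /* keep room for NUL */
            break;
        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

Feeding both helpers the UTF-16LE octets of "Hello Lee" ('H', 0x00, 'e', 0x00, and so on) shows the difference: the one-byte reading stops after a single character, while the UTF-16LE decoder recovers all nine.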

> I also tried to get the string from tvb first like this:

Try it like this:

string = tvb_get_string_enc(NULL, tvb, *offset, 320, ENC_UTF_16|ENC_LITTLE_ENDIAN);

That will return a UTF-8 string that must be freed with g_free(). If you want the raw bytes of the string, do

buffer = tvb_memdup(NULL, tvb, *offset, 320);

That will return an array of octets (not null-terminated) that must be freed with g_free().

Q2

> what should I do?

Define three fields in the hf[] array, and set the bitmask field of each definition to the mask covering that value's bits. (I don't know whether you're numbering the bits from top to bottom, or from bottom to top, so I don't know what the right masks would be.)
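As a rough sketch of what such masks look like, here is a plain C example outside Wireshark. It assumes "bit 1" means the most significant bit of the octet; if the numbering runs the other way, the masks for value_a and value_c swap, while value_b stays 0x3C either way. The FOO_MASK_* names and extract_field() are made up for this illustration. In the dissector, each mask would go into the bitmask slot of its own hf[] entry (for example with type FT_UINT8), and all three fields would be added at the same offset with a length of 1.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit 1 is the most significant bit of the octet. */
#define FOO_MASK_VALUE_A 0xC0  /* bits 1-2: 1100 0000 */
#define FOO_MASK_VALUE_B 0x3C  /* bits 3-6: 0011 1100 */
#define FOO_MASK_VALUE_C 0x03  /* bits 7-8: 0000 0011 */

/* Hypothetical helper: apply the mask, then shift the result right until
 * the lowest bit of the mask sits in bit position 0. */
static uint8_t extract_field(uint8_t octet, uint8_t mask)
{
    assert(mask != 0);
    uint8_t value = octet & mask;
    while ((mask & 1) == 0) {
        mask >>= 1;
        value >>= 1;
    }
    return value;
}
```

For the octet 0x9D (binary 1001 1101), this gives value_a = 2, value_b = 7 and value_c = 1 under the assumed numbering.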

answered 07 Jul '16, 21:41

Guy Harris

Thank you, I will try it.

(07 Jul '16, 22:25) SulfredLee