# Type for Dissecting n-bit Quantities

EDIT: Is it possible to parse a large portion of a header into a standard-size integer, and then mask individual fields out of it? For example, I could break a 40-bit header into two segments, one 32 bits wide and one 8 bits wide. If every field in the 32-bit segment is read from the same starting offset, can I then use a different mask for each field to recover its value?

To be more concrete, let me walk through an example of what I mean.

Assume I have a 40 bit header, foo, with the following fields:

bit<2> version
bit<1> bypass
bit<1> command_flag
.... (skipping until the end of 32 bits for concision)
bit<8> sequence_number


Assume that the fields have been defined in similar order in the dissector:

static int hf_foo_version = -1;
static int hf_foo_bypass = -1;
static int hf_foo_cmd_flag = -1;
...
static int hf_foo_sequence_number = -1;


Can I do something like the following?

/* inside the proto_register_foo function */
static hf_register_info hf[] = {
    { &hf_foo_version,
        { "Version", "foo.version",
          FT_UINT32, BASE_DEC,
          NULL, 0x03,
          NULL, HFILL }
    },
    { &hf_foo_bypass,
        { "Bypass", "foo.bypass",
          FT_UINT32, BASE_DEC,
          NULL, 0x07,
          NULL, HFILL }
    }
};


and so on, to parse each field out of the 32 bit selection, offset from the same pointer?

Original question below for more background - if you can answer the rephrasing of the question in this edit, though, then I can potentially remove the original question and use this rephrasing.

================================================================================

Hello. I am in the middle of writing a dissector for a custom protocol encapsulating Ethernet/IPv4. The protocol has two headers. The first header is 5 bytes long and the second is 1 byte long. However, the fields within those bytes do not fall on byte boundaries.

For example, let's call the first header Foo, to be consistent with the Developer's Guide. Foo has many fields of varying sizes: 1-, 2-, 6-, 8-, and 10-bit fields.

From reading README.developer, I gather that I can use something like gboolean for 1-bit fields and guint8 for 8-bit fields. I've been given a template that uses an FT_ prefix instead of the g prefix; for example, it uses FT_BOOLEAN and FT_UINT8 in place of those types.

I am trying to follow the example given in section 9.2 of the Wireshark Developer's Guide - specifically, I'm currently looking at the section containing the following code for dissecting specific fields:

static hf_register_info hf[] = {
    { &hf_foo_pdu_type,
        { "FOO PDU Type", "foo.type",
          FT_UINT8, BASE_DEC,
          NULL, 0x0,
          NULL, HFILL }
    }
};


One example field in my custom protocol is a 2-bit sequence flag field. In my code, this looks like:

/* defined earlier in the actual file, included here for reference */
static const value_string foo_sequence_flags[] = {
    {... /*omitted for concision as at this point, I cannot handle segmented data*/},
    {3, "Unsegmented data" },
    {0, NULL }
};

/* also defined earlier, included for reference */
static int hf_foo_seq_flags = -1;

/* inside the proto_register_foo function */
static hf_register_info hf[] = {
    { &hf_foo_seq_flags,
        { "Sequence Flags", "foo.seq_flags",
          <unknown type here>, BASE_DEC,
          VALS(foo_sequence_flags), 0b11 ...

(I'm not a developer so take with a grain of NaCl)
1. packet-ieee80211.c has 2 and 4 bit fields in proto_register_ieee80211() that are put in FT_UINT8.
2. See README.dissector for FT_ definitions.
3. HFILL is mentioned in README.dissector:

The HFILL macro at the end of the struct will set reasonable default values for internally used fields.


And defined in epan/proto.h:

/**
 * HFILL initializes all the "set by proto routines" fields in a
 * _header_field_info. If new fields are added or removed, it should
 * be changed as necessary.
 */
#define HFILL -1, 0, HF_REF_TYPE_NONE, -1, NULL


3b. Header fields defined in epan/proto.h:

/** information describing a header field */

( 2020-06-30 00:42:53 +0000 )


What is the correct type to use here for dissecting this quantity? It's a two bit field, so boolean is too small and guint8 and/or FT_UINT8 are too large. I don't know what type to use there, which is where the label for this particular question comes from.

The correct types are FT_BOOLEAN for Booleans and FT_UINTn for the non-Booleans; n should be the length of the octet-aligned field in which the bitfields are contained.

The lengths are specified elsewhere; see below.
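To make the "octet-aligned containing field plus mask" idea concrete, here is a minimal sketch in plain C, with no Wireshark headers. It reads the containing 32-bit value in network byte order, ANDs it with a field mask, and shifts right until the field's lowest bit sits at bit 0, which is my understanding of what the proto_tree code does with the bitmask you register. The helper names and the three masks are assumptions for illustration: they presume the fields are packed starting from the most significant bit, which the question does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian (network order) 32-bit value from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* AND with the mask, then shift right so the field's lowest bit is bit 0. */
static uint32_t extract_field(uint32_t container, uint32_t mask)
{
    assert(mask != 0);
    uint32_t value = container & mask;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        value >>= 1;
    }
    return value;
}

/* Hypothetical masks, assuming MSB-first packing in the 32-bit word. */
#define FOO_VERSION_MASK  0xC0000000u  /* bit<2> version      */
#define FOO_BYPASS_MASK   0x20000000u  /* bit<1> bypass       */
#define FOO_CMD_FLAG_MASK 0x10000000u  /* bit<1> command_flag */
```

With a first byte of 0x90 (binary 1001 0000), this yields version 2, bypass 0, and command_flag 1. Under the same assumption, the masks in the hf[] entries would be these values, all with FT_UINT32 (or FT_BOOLEAN for the 1-bit flags).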

I've been given a template that uses an FT_ prefix instead of the g prefix - for example, it uses FT_BOOLEAN and FT_UINT8 instead for the previous types.

...

What is the difference between FT_ and g versions of a type, if any? For example, what is the difference between "gboolean" and "FT_BOOLEAN" as a type?

They're not prefixes for the same kind of types.

The g prefix is for GLib types; those are typedefs in C, for use in C code.

The FT_ prefix is for Wireshark named field types. Those are data types for data within packets; some might happen to be similar to GLib (or C99) types, but that's more of a case of "those are concepts that go beyond either C or network packets" than a case of "the types are equivalents".

For example, a gboolean is 32 bits on all the platforms that GLib supports (unless it's used as the type of a C bitfield), but an FT_BOOLEAN field could be anywhere from 1 to 8 bytes.

I don't see FT mentioned at all in README.developer.

See section 1.2 "Explanation of needed substitutions in code skeleton." and section 1.5 "Constructing the protocol tree."; in the latter, look in the subsection "type (FIELDTYPE)".

BASE_DEC is how this value should be printed in Wireshark.

Yes, except for FT_BOOLEAN: there that slot is a "special case" for Boolean bitfields and gives the length, in bits, of the octet-aligned field in which the bitfield appears; see below.

The next four options are VALS(), ???, NULL, and HFILL. VALS() is allowing me to pass in the array that has a mapping from values read to semantic meanings for wireshark to display.

That slot can actually hold different things for different field types. VALS() is used for mappings from integral values to strings giving a semantic description (think of it as being for enumerated data types).
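As a rough picture of what such a mapping amounts to, the sketch below mimics a value/string table and its lookup in plain C. `demo_value_string` and `demo_lookup` are stand-ins invented for this example, not Wireshark's own types or functions, and the table holds only the one entry given in the question plus the NULL terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a value/string pair; not the real Wireshark type. */
typedef struct {
    uint32_t    value;
    const char *strptr;
} demo_value_string;

/* Only the entry shown in the question, plus the terminator. */
static const demo_value_string demo_sequence_flags[] = {
    { 3, "Unsegmented data" },
    { 0, NULL }
};

/* Return the string for 'value', or NULL if the table has no entry. */
static const char *demo_lookup(uint32_t value, const demo_value_string *table)
{
    assert(table != NULL);
    for (size_t i = 0; table[i].strptr != NULL; i++) {
        if (table[i].value == value)
            return table[i].strptr;
    }
    return NULL;
}
```

The table ends at the first entry whose string pointer is NULL, so a value of 0 with a real string is still a valid entry.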

For Boolean values (FT_BOOLEAN), TFS() is used, instead; it stands for "true/false string", and what's wrapped in TFS() is a pointer to a struct true_false_string, which is a structure containing two character string pointers, one for the string to be used if the field is "true" (non-zero), and one for the string to be used if the field is "false" (zero).
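The true/false pair can be pictured the same way. This is a plain-C sketch: `demo_true_false_string` mirrors the two-pointer layout described above but is a stand-in type, and the strings for the bypass flag are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in with the layout described above: one string per truth value. */
typedef struct {
    const char *true_string;
    const char *false_string;
} demo_true_false_string;

/* Hypothetical display strings for the 1-bit bypass field. */
static const demo_true_false_string demo_bypass_tfs = {
    "Bypass enabled",
    "Bypass disabled"
};

/* Pick the string that matches the (already masked) field value. */
static const char *demo_tfs_to_str(bool value, const demo_true_false_string *tfs)
{
    assert(tfs != NULL);
    return value ? tfs->true_string : tfs->false_string;
}
```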

There are some additional ones, such as VALS64() for 64-bit fields and 64-bit numerical values, and RVALS(), to specify the meanings of certain ranges of value.
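A range mapping differs only in that each entry covers a span of values. The sketch below is plain C with an invented stand-in type, `demo_range_string`, and invented ranges for a 10-bit frame length; none of these names or boundaries come from the protocol in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a range/string entry: inclusive bounds plus a label. */
typedef struct {
    uint32_t    value_min;
    uint32_t    value_max;
    const char *strptr;
} demo_range_string;

/* Hypothetical ranges for a 10-bit frame length. */
static const demo_range_string demo_frame_len_ranges[] = {
    {  0,   63, "Short frame"  },
    { 64, 1023, "Normal frame" },
    {  0,    0, NULL }
};

/* Return the label of the first range containing 'value', or NULL. */
static const char *demo_range_lookup(uint32_t value, const demo_range_string *table)
{
    assert(table != NULL);
    for (size_t i = 0; table[i].strptr != NULL; i++) {
        if (value >= table[i].value_min && value <= table[i].value_max)
            return table[i].strptr;
    }
    return NULL;
}
```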

?? seems to be...a mask? a size? a predefined value ...


Wow, that was a great answer. Thank you so much for taking the time to type it all out, I really appreciate it. I'm sorry I can't upvote it, I don't have the reputation to do so. I'm still reading through it, but I wanted to ask one thing while I'm thinking about it and reading through:

I don't see FT mentioned at all in README.developer.

See section 1.2 "Explanation of needed substitutions in code skeleton." and section 1.5 "Constructing the protocol tree."; in the latter, look in the subsection "type (FIELDTYPE)".

I was using https://github.com/boundary/wireshark... for that, which doesn't have a section 1.2 or section 1.5. I guess I was looking at the wrong file, in that case? Would you please show me the one that you were referring to?

( 2020-06-30 14:25:42 +0000 )

@dmanderson

If the answer has solved your issue, the accepted practice here is to accept it by clicking the checkmark to its left.

The boundary repo is a forked copy that appears to have last been updated in 2013. No wonder that was confusing you if that was your reference. The canonical source code reference (to view files) is at https://code.wireshark.org/review/git...

( 2020-06-30 14:56:34 +0000 )

@grahamb

I did accept it as soon as I actually finished reading it in detail. When the page refreshed, I saw your comment. Thank you for the correct link!

( 2020-06-30 15:14:08 +0000 )

@Guy Harris

Oh, another question just to make sure I understand. Near the end, right before the 8 bit sequence number, there's a 10 bit field for frame length.

Assuming it's defined as foo_frame_length, which option is correct?

Option A: There are 18 bits left in the header, so I can use an FT_UINT16 and mask for the upper 10 bits of it. So it'd be something like

{ &hf_foo_frame_length,
    { "Frame Length", "foo.frame_len",
      FT_UINT16, BASE_DEC,
      NULL, 0xFFC0,
      NULL, HFILL }
}


I'm assuming I shouldn't use an FT_UINTN where N is larger than the remainder of the data I want to read.

Option B: Assume that it's still interpreting as part of the original 32 bit value I specified, and I just keep shifting the mask over:

{ &hf_foo_frame_length,
    { "Frame Length", "foo.frame_len",
      FT_UINT32, BASE_DEC,
      NULL, 0x000003ff,
      NULL, HFILL }
}


( 2020-06-30 16:41:10 +0000 )

It sounds as if there's a 32-bit value with a bunch of subfields, so option B is probably best.
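To check the arithmetic behind the two options, here is a small plain-C sketch with no Wireshark headers. It assumes the 40-bit header is packed MSB-first, so the 10-bit frame length sits in the low 10 bits of the first 32-bit word, directly before the 8-bit sequence number; the thread does not confirm that layout, and the helper names are invented. Under that assumption Option B's mask 0x000003ff selects exactly the frame length. A 16-bit container has to start on an octet boundary, so it would begin at byte offset 2 and need mask 0x03FF; 0xFFC0 on that container picks up the wrong bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read 1 to 4 bytes as a big-endian unsigned value. */
static uint32_t read_be(const uint8_t *p, size_t nbytes)
{
    uint32_t v = 0;
    assert(nbytes >= 1 && nbytes <= 4);
    for (size_t i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* AND with the mask, then shift the field down to bit 0. */
static uint32_t masked(uint32_t v, uint32_t mask)
{
    assert(mask != 0);
    v &= mask;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        v >>= 1;
    }
    return v;
}

/* Option B: 32-bit container at byte offset 0, mask 0x000003FF. */
static uint32_t frame_len_option_b(const uint8_t *hdr)
{
    return masked(read_be(hdr, 4), 0x000003FFu);
}

/* 16-bit alternative: container at byte offset 2, mask 0x03FF. */
static uint32_t frame_len_16bit(const uint8_t *hdr)
{
    return masked(read_be(hdr + 2, 2), 0x03FFu);
}
```

For the header bytes 90 00 02 9A 07, both functions return 666 (0x29A), while applying 0xFFC0 to the 16-bit container at offset 2 gives 10.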

( 2020-07-01 01:08:52 +0000 )