Hi everyone!
First, I'll describe what I am trying to achieve, then what I've tried so far. I've scattered some questions throughout, labeled Q[n], because I'd already be happy if individual questions were answered (there's the core problem, and there are questions about tips and tricks). Thank you! :)
What I am trying to achieve
I am trying to analyze an application-specific custom protocol that is transported via USB CDC (virtual serial) between a USB host and a USB device. The communication can be recorded using tcpdump -i usbmon.... The USB bulk packets then contain multiple streams of the custom protocol. The following diagram illustrates what that can look like (three different streams labeled a, b and c):
┌────────────────────┐ ┌───────────────┐ ┌────────────────────┐
│┌───┐┌───┐┌───┐┌───┐│ │┌───┐┌───┐┌───┐│ │┌───┐┌───┐┌───┐┌───┐│
││a ││b ││a ││b ││ ││a ││a ││c ││ ││a ││b ││a ││c ││
│└───┘└───┘└───┘└───┘│ │└───┘└───┘└───┘│ │└───┘└───┘└───┘└───┘│
└────────────────────┘ └───────────────┘ └────────────────────┘
USB bulk packet #1 USB bulk packet #2 USB bulk packet #3
My final goal is to extract the data of every stream (illustrated as a, b and c above) from the USB packets and re-concatenate it per stream for further inspection.
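To make the goal concrete, here is a minimal pure-Lua sketch of the demultiplexing step, independent of Wireshark. It models each USB bulk packet as a list of frames. The field names stream and data and the payload values are made up for illustration; the real frame layout is whatever the custom dissector decodes.

```lua
-- Concatenate the payloads of each stream across all packets.
-- "stream" and "data" are hypothetical field names used for illustration only.
local function demux(packets)
  local parts = {}                       -- stream id -> list of payload chunks
  for _, packet in ipairs(packets) do
    for _, frame in ipairs(packet) do
      local chunks = parts[frame.stream]
      if not chunks then
        chunks = {}
        parts[frame.stream] = chunks
      end
      chunks[#chunks + 1] = frame.data
    end
  end
  local streams = {}
  for id, chunks in pairs(parts) do
    streams[id] = table.concat(chunks)   -- join the chunks in arrival order
  end
  return streams
end

-- The three bulk packets from the diagram, with invented payloads.
local streams = demux({
  { {stream = "a", data = "A1"}, {stream = "b", data = "B1"},
    {stream = "a", data = "A2"}, {stream = "b", data = "B2"} },
  { {stream = "a", data = "A3"}, {stream = "a", data = "A4"},
    {stream = "c", data = "C1"} },
  { {stream = "a", data = "A5"}, {stream = "b", data = "B3"},
    {stream = "a", data = "A6"}, {stream = "c", data = "C2"} },
})
print(streams.a, streams.b, streams.c)
```

Collecting chunks in a list and joining them once with table.concat avoids repeated string concatenation, which matters for long recordings.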
In reality, it's a little bit more complicated than in the illustration - but thankfully, I've got a Lua dissector of the encapsulated protocol. When I put the Lua script in my Wireshark plugins folder and open the .pcap file it gets dissected and I can inspect it manually - nice! That also means that
- the "Leftover Capture Data" column is empty,
- the custom protocol's name appears in the "Protocol" column and
- I can see one or more frames of the streams that were transported in the USB bulk packet.
As I understand it, Wireshark already has a feature called Following Protocol Streams. I've seen it demonstrated for other protocols such as HTTP, and I think that's exactly what I'd like to use. When I right-click on a decoded packet of the custom protocol, the Follow menu item offers no option to choose from. I understand why: Wireshark cannot know the semantics of the custom protocol's stream frames. Ideally, I could select "Custom protocol: Stream a", "Custom protocol: Stream b" and "Custom protocol: Stream c" there.
Q1: Is there a tutorial on what to do to extend the feature for custom protocols? Does this require extension/modification of the Lua dissector script?
What I've tried so far
I've stumbled across "Taps" and "Listeners".
Q2: Do these two terms mean the same thing, so that they can be used interchangeably?
I've seen code snippets where "Listener instances" have been called "tap". (FYI: I am not a native speaker of English.)
There are some links which give an idea of what's possible, along with code snippets showing how to write and use them. I'd like to know whether I'd be misusing them to "follow streams" or whether this is the proper way to do it. It may be doable this way but very cumbersome.
I've played around with these two examples and this example, which gave me some idea about the methods of the Listener class and how you can hook into the Wireshark GUI (adding a menu item, opening a window and adding text content to it). (As the class description is a subsection of a chapter called "Post-Dissection Packet Analysis" I think I might be heading in the right direction, but I do not know.)
I am now trying to add a Listener/Tap for the packets of the dissected custom protocol to the same Lua script.
Q3: Is it a good idea/common practice to have a (Lua) dissector and a Listener/Tap in the same Lua script?
I am extending the script and executing it from the command line as follows:
<wireshark executable> -X lua_script:<dissector_and_listener_script>.lua <usbmon recording>.pcap
Q4: Do you have any recommendations of how to (interactively?) debug the execution of the Lua script?
Currently, g̵o̵o̵d̵ ̵o̵l̵d̵ print debugging is my friend. This is okay-ish, but there might be a way to look at variable types and contents at runtime other than by converting things to strings and printing them out.
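For what it's worth, this is roughly the kind of print-debugging helper I mean: a small sketch that reports a value's Lua type alongside its string form and walks into plain tables. The name dump and the depth limit are my own choices. Wireshark objects such as Pinfo or Tvb are userdata, so this only shows whatever their tostring() gives.

```lua
-- Return a readable description of any Lua value, including its type.
-- Tables are expanded recursively up to max_depth to avoid endless
-- recursion on cyclic tables.
local function dump(value, indent, max_depth)
  indent = indent or ""
  max_depth = max_depth or 3
  local kind = type(value)
  if kind ~= "table" then
    return string.format("%s(%s) %s", indent, kind, tostring(value))
  end
  if max_depth == 0 then
    return indent .. "(table) ..."
  end
  local lines = { indent .. "(table)" }
  for key, item in pairs(value) do
    lines[#lines + 1] = string.format("%s  [%s] =", indent, tostring(key))
    lines[#lines + 1] = dump(item, indent .. "    ", max_depth - 1)
  end
  return table.concat(lines, "\n")
end

print(dump({ name = "example", nested = { 1, 2 } }))
```

Inside a Listener callback this could be used as print(dump(tapinfo)), which at least distinguishes nil from an empty table.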
The documentation of the parameter list of Listener.new([tap], [filter], [allfields]) confuses me:
tap (optional) The name of this tap. See Listener.list() for a way to print valid listener names.
"The name of this tap" makes me think I can choose an arbitrary name under which this Tap is registered.
However, calling tap = Listener.new("bananas") yields an error "Tap bananas not found".
I also get the idea that the following snippet lists existing taps (tap-able protocols?), but that doesn't really help. I see that the custom protocol is not listed there.
-- Print a list of tap listeners to stdout.
for _,tap_name in pairs(Listener.list()) do
    print(tap_name)
end
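Building on that snippet, a name can be checked against the list before it is passed to Listener.new(), which avoids the "Tap ... not found" error. This is only a sketch: is_known_tap is a helper name I made up, and the Listener part is guarded so that the chunk also loads outside Wireshark.

```lua
-- Return true if `wanted` appears in the given list of tap names.
local function is_known_tap(names, wanted)
  for _, name in pairs(names) do
    if name == wanted then
      return true
    end
  end
  return false
end

-- Listener only exists when the script runs inside Wireshark/tshark.
if Listener then
  for _, candidate in ipairs({ "usb", "bananas" }) do
    print(candidate, is_known_tap(Listener.list(), candidate))
  end
end
```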
Q5: How do I use the tap parameter of Listener::new() properly?
Using "usb" as the argument works, but it doesn't seem to do any harm or good. "<name of the custom protocol, as shown in the Protocol column>" does not work. nil also does the trick - I cannot observe any difference.
What about the filter argument? I've put "<name of the custom protocol, as shown in the Protocol column>" there, because then the Listener::packet() method is only called for relevant packets of the custom protocol.
Q6: How do I use the filter parameter of Listener::new() properly?
Then, there's the last parameter to new():
allfields (optional) Whether to generate all fields. The default is false. Note: This impacts performance.
As this is a boolean with default false, I've played around and currently it is set to true. However, when printing the last argument of packet() which often is called tapinfo, I simply get nil. No effect?
Q7: When does the allfields=true parameter/argument of Listener::new() take effect?
Okay, so much for my uncertainties about Listener::new(). In the end I can create a Listener object, and its callback function tap.packet(pinfo,tvb,tapinfo) gets called for the custom protocol packets, but doing it that way may not be optimal or sufficient. Let's head over there...
When later called by Wireshark, the packet function will be given:
- A Pinfo object
- A Tvb object
- A tapinfo table
The Pinfo object contains Packet Information.
I see pinfo.cols['info'] and pinfo.cols.protocol being written to by the dissector.
Some columns cannot be modified, and no error is raised if attempted. The columns that are known to allow modification are "info" and "protocol".
However, when I print pinfo.cols['info'] (seems to be the same as pinfo.cols.info) and pinfo.cols.protocol inside the Listener callback function tap.packet(pinfo,tvb,tapinfo) I only get:
(info)
(protocol)
This does not give any information about the packet contents. I clearly see the protocol and info columns being written to by the dissector part of the Lua script and printed in Wireshark's GUI.
Q8: How do I access the (dissected) packet contents properly from the pinfo?
Is the dissector called after the listener, so that there's no data available to the listener yet? This would explain the behaviour. How can I fix this? I've also tried placing the listener code below the dissector code, but without effect.
Q9: Can I access the (dissected) packet contents properly from the tvb or tapinfo?
As seen above, there's still the "Tvb object" and the "tapinfo table". While the latter one currently is nil, I can print(tostring(tvb)) and see some binary data. This seems to start with the USB URB data... - do I need to manually throw this at the dissector again?
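One alternative to re-dissecting the tvb, as far as I understand the API, would be field extractors: Field.new() is called once when the script loads, and calling the resulting extractor inside the callback returns the FieldInfo objects of the current packet. The sketch below rests on several assumptions: the field names myproto.stream_id and myproto.payload and the filter myproto are hypothetical placeholders for whatever the custom dissector registers, those fields must already exist when Field.new() runs, and every frame is assumed to carry exactly one of each so that they can be paired by index.

```lua
-- Hypothetical names; replace them with the abbreviations that the custom
-- dissector actually registers for its ProtoFields.
local STREAM_ID_FIELD = "myproto.stream_id"
local PAYLOAD_FIELD   = "myproto.payload"
local DISPLAY_FILTER  = "myproto"

local chunks_by_stream = {}   -- stream id (as string) -> list of raw payloads

local function append_chunk(id, bytes)
  local chunks = chunks_by_stream[id]
  if not chunks then
    chunks = {}
    chunks_by_stream[id] = chunks
  end
  chunks[#chunks + 1] = bytes
end

-- Listener and Field only exist inside Wireshark/tshark.
if Listener and Field then
  local stream_id = Field.new(STREAM_ID_FIELD)
  local payload   = Field.new(PAYLOAD_FIELD)
  local tap = Listener.new("frame", DISPLAY_FILTER)

  function tap.packet(pinfo, tvb, tapinfo)
    -- An extractor returns one FieldInfo per occurrence in this packet.
    local ids = { stream_id() }
    local payloads = { payload() }
    for i, id_info in ipairs(ids) do
      local payload_info = payloads[i]
      if payload_info then
        append_chunk(tostring(id_info.value),
                     payload_info.range:bytes():raw())
      end
    end
  end

  function tap.reset()
    chunks_by_stream = {}     -- forget everything before a new pass
  end

  function tap.draw()
    for id, chunks in pairs(chunks_by_stream) do
      print(string.format("stream %s: %d bytes", id, #table.concat(chunks)))
    end
  end
end
```

The stream id is converted with tostring() so that it can serve as a table key regardless of the field's type.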
Summing it up...
Q10: Am I heading in the right direction? Or should I take a different approach to reach my goal - which one?
Sorry, but I won't be able to share the Lua script itself or screenshots of the data. I've tried to describe my problem as generically as possible.