This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Help in reverse-engineering a protocol running atop TCP

Hi all,

I'm having trouble understanding how a program communicates with a device when sending table entries from a database. It can send thousands of entries, but it sends them in blocks of 16 table entries each; right now my data amounts to about 520 blocks before it's done sending.

The problem I'm having is understanding how it verifies that the data in each block is correct (a checksum or CRC?). After the TCP header of each block there's a sequence of bits that is the same in every block, and I assume this sequence tells the device that another block is coming. Each block also ends the same way: 2 octets that are always the same, followed by 2 octets that differ. The octets that differ are what have me stumped. If these final 16 bits are wrong, the device kills communication. I haven't been able to find anything on the internet about a checksum or CRC placed at the end of the data section of a packet; all I've found is information on the checksum in the TCP header. I've run brute-force tools against many captured blocks of data without any luck finding a CRC polynomial.

Would anyone be able to help me out with this problem? Perhaps I'm missing something or just don't understand how data is sent. I'm pretty sure this data is being sent using a synchronous socket since it waits for a reply from the device before sending the next block. But outside of that I'm not sure if there's something in the code of the application and the device that uses a proprietary way of checking the data in the block or if this is something standard with device communication.

I am providing a link to download the Wireshark capture file. Each packet (from source IP 192.168.1.115) ends in 10 03, followed by the 2 octets I'm trying to "crack". Again, I'm not sure if it's some sort of checksum or CRC.

http://www.mediafire.com/download/mt3ph1dazlv44gr/help.pcapng

Thank you for any help, it's greatly appreciated!

checksum packet crc

asked 08 Feb '14, 22:53

dirtyrobinson

edited 12 Feb '14, 20:22

Guy Harris ♦♦

I'll see if the company that made the software would share how to communicate with their device. Thanks, guys, for looking at it. I appreciate it!

Jason

(10 Feb '14, 14:44) dirtyrobinson

3 Answers:

TCP has a checksum in its header that is used to verify the integrity of the TCP header AND the TCP data section, so TCP itself takes care of the data block being correct (as far as a 16-bit checksum can do that, of course; strictly speaking it is a ones' complement sum, not a CRC). Usually, data transported by TCP does not have an additional checksum in the data itself, because TCP's segmentation process could cut it off the current packet and transmit it as the first part of the next packet. If an application wants to put a specific checksum in a TCP packet's data segment, it must control the creation of the whole segment, which is a lot more complex than just sending the data and trusting TCP's checksum.
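To make the distinction concrete, here is a minimal Python sketch of a 16-bit ones' complement checksum in the style of RFC 1071. It is an illustration only: the real TCP checksum is also computed over a pseudo-header (addresses, protocol, length), which this sketch leaves out, and the function name is made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over `data` (RFC 1071 style).

    Illustration only: real TCP also sums a pseudo-header, omitted here.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        # fold any carry out of the top back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

A receiver that runs the same sum over the data with the checksum appended gets zero, which is how this kind of checksum is verified.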

Now, if a communication using TCP as a transport protocol implements an additional checksum mechanism (like you assume to be the last 2 bytes of the payload) you need to have the specification of how the checksum is calculated. If you don't have that you can only guess or try to reverse engineer how it is calculated, but that may be a frustrating task. The only good thing is that you think that it is only 2 bytes long, which means that the calculation is probably a pretty simple algorithm. Keep in mind that you'll have to calculate the CRC on the data segment only, not the full packet, so you need to extract the data bytes after the TCP header and run your brute force tools on that.
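As a rough sketch of "extract the data bytes after the TCP header", the following Python function strips the Ethernet, IPv4 and TCP headers from one raw frame. It assumes an untagged Ethernet II frame carrying IPv4 (no VLAN tag, no IPv6), and the function name is hypothetical; Wireshark's own export features can produce the same bytes.

```python
import struct


def tcp_payload(frame: bytes) -> bytes:
    """Return the TCP payload of an untagged Ethernet II / IPv4 / TCP frame."""
    ether_type = struct.unpack("!H", frame[12:14])[0]
    if ether_type != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip = frame[14:]
    ip_header_len = (ip[0] & 0x0F) * 4          # IHL is in 32-bit words
    ip_total_len = struct.unpack("!H", ip[2:4])[0]
    if ip[9] != 6:
        raise ValueError("not a TCP segment")
    # Cut at the IP total length so Ethernet padding is not mistaken for data.
    tcp = ip[ip_header_len:ip_total_len]
    tcp_header_len = (tcp[12] >> 4) * 4         # data offset is in 32-bit words
    return tcp[tcp_header_len:]
```

Whatever this returns is the input for any checksum guessing; the headers themselves should not be fed to the brute-force tools.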

So you should try to find the protocol specifications for the stuff that is transported after the TCP header. Since you already know some details about it (when you say there are 16 table entries per packet) you seem to have access to that.

answered 09 Feb '14, 03:23

Jasper ♦♦

What I don't have access to is the source code that explains how it splits everything. I just know that it takes a file in hex, splits it after every 1264 bits, adds 16 bits of data before each block of the file is transmitted, and then adds 4 bytes at the end. Since I don't have the source, I don't know how it's determining the final 2 bytes. I can only decompile the app and watch the assembly code.

Are there any other options as to what those 2 bytes may be? I only know of 2 options: it's either a checksum or a CRC.

(09 Feb '14, 23:00) dirtyrobinson

Are there any other options as to what those 2 bytes may be?

Could be anything.

I only know of 2 options, its either a checksum or a crc.

A CRC is just one algorithm for implementing a checksum.

If you look thoroughly at the packets, you'll find a 'clear' structure. There is:

a record 'magic' number: 10 53 10 50 10 53 (which is '.S.P.S' in ASCII)
(probably) a record start delimiter: 10 02
(probably) a record end delimiter: 10 03
some bytes that are unclear at the beginning, after the start delimiter
those two bytes at the end, probably some form of a checksum
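The structure listed above can be sketched as a small Python parser. Note that 10 02 and 10 03 are the ASCII control pairs DLE STX and DLE ETX, a common framing convention. The field order (magic before the start delimiter) and the function name are assumptions based on the list, not on a specification.

```python
DLE_STX = b"\x10\x02"  # (probable) record start delimiter
DLE_ETX = b"\x10\x03"  # (probable) record end delimiter


def split_frame(payload: bytes):
    """Split one TCP payload into (preamble, body, trailer).

    Assumed layout: preamble, 10 02, body, 10 03, two trailing bytes.
    """
    start = payload.find(DLE_STX)
    if start < 0:
        raise ValueError("no 10 02 start delimiter found")
    if len(payload) < start + 6 or payload[-4:-2] != DLE_ETX:
        raise ValueError("payload does not end in 10 03 plus two bytes")
    preamble = payload[:start]        # e.g. the 10 53 10 50 10 53 'magic'
    body = payload[start + 2:-4]      # everything between the delimiters
    trailer = payload[-2:]            # the two unexplained bytes
    return preamble, body, trailer
```

Having the body and the trailer as separate values makes it easier to try a checksum over different spans (body only, body plus delimiters, and so on).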

To sum it up: It's probably better to ask the vendor of the software about the structure of the data, unless you have much more data and a lot of time to analyze the data structure. If the latter applies, you could try to 'brute force' the last two bytes by trying several known checksum algorithms that produce a 16-bit output. However, keep in mind that the developer of the software might have created their own checksum algorithm. If that is the case, you will have next to no chance of figuring out that algorithm just by looking at the packet bytes.
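A minimal Python sketch of that brute-force idea follows. It tries a handful of well-known 16-bit algorithms, in both byte orders, against (data, trailer) pairs taken from captured blocks. The candidate list is far from exhaustive, and which bytes the device actually feeds into its checksum is unknown, so the `data` span is something to vary as well; all names here are made up for the example.

```python
def reflect(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of `value`."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc16(data, poly, init, refin, refout, xorout):
    """Bitwise CRC-16 using the usual poly/init/refin/refout/xorout parameters."""
    crc = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    if refout:
        crc = reflect(crc, 16)
    return crc ^ xorout


# name -> (poly, init, refin, refout, xorout); a small, non-exhaustive selection
CANDIDATES = {
    "CRC-16/CCITT-FALSE": (0x1021, 0xFFFF, False, False, 0x0000),
    "CRC-16/XMODEM": (0x1021, 0x0000, False, False, 0x0000),
    "CRC-16/ARC": (0x8005, 0x0000, True, True, 0x0000),
    "CRC-16/MODBUS": (0x8005, 0xFFFF, True, True, 0x0000),
    "CRC-16/KERMIT": (0x1021, 0x0000, True, True, 0x0000),
}


def sum16(data):
    """Plain additive checksum, truncated to 16 bits."""
    return sum(data) & 0xFFFF


def matching_algorithms(samples):
    """Return (name, byte_order) for every candidate matching ALL samples."""
    algos = {n: (lambda d, p=p: crc16(d, *p)) for n, p in CANDIDATES.items()}
    algos["SUM-16"] = sum16
    hits = []
    for name, fn in algos.items():
        for order in ("big", "little"):
            if all(fn(d).to_bytes(2, order) == t for d, t in samples):
                hits.append((name, order))
    return hits
```

Requiring a match on every sample matters: with only 16 bits, a single block can match a wrong algorithm by chance.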

So, again: It's probably better to ask the vendor of the software or to reverse engineer the software with a debugger/disassembler.

Regards
Kurt

answered 10 Feb '14, 04:23

Kurt Knochner ♦

edited 10 Feb '14, 15:09

The company that made the software is of no help. I was kind of expecting the response I got, though: they would only share the function if there was a sales opportunity for them.

(12 Feb '14, 13:36) dirtyrobinson

O.K. then tell them you will decide a 10 million deal, but only after you get the information you need :-))

BTW: why do you need that information? What are you trying to do?

(12 Feb '14, 14:12) Kurt Knochner ♦

It's for my cash registers. The programs that can communicate with them lack "customability" when it comes to reports and inventory. The PC program creates files of the data from the register, and the program I created can edit and pull data from the files created by the PC program. I'm getting tired of having to use the vendor's software to send information back to the register.

I matched my data with the streams created by the cash register and everything is an exact match except for whatever those final 2 bytes are.

(12 Feb '14, 15:09) dirtyrobinson

Well, then I would take the debugger/disassembler approach.

What happens if you send data with the wrong 'checksum'?

(12 Feb '14, 15:16) Kurt Knochner ♦

Is the software Java or .net based?

(12 Feb '14, 15:25) Kurt Knochner ♦

Can you post the software on dropbox?

(12 Feb '14, 15:30) Kurt Knochner ♦

No PM here, but if you click my name, you'll find my e-mail address.

(12 Feb '14, 15:54) Kurt Knochner ♦

crc16tbl is odd. That could be a sign of a lookup table for a 'custom' CRC16 code. If that is the case, you need to analyze the code to reverse engineer their CRC16 algorithm and the table.
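One way to check whether a 256-entry table found in a binary is a standard CRC-16 table is sketched below in Python. In a standard table the polynomial shows up directly as entry 1 (MSB-first variant) or entry 128 (LSB-first variant, polynomial in bit-reversed form), so the table can be regenerated and compared. The function names are hypothetical, and nothing here is known about the actual crc16tbl contents.

```python
def make_table(poly: int, reflected: bool) -> list:
    """Build a 256-entry CRC-16 lookup table.

    For reflected (LSB-first) tables, `poly` is the bit-reversed polynomial,
    e.g. 0xA001 for 0x8005.
    """
    table = []
    for i in range(256):
        if reflected:
            crc = i
            for _ in range(8):
                crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        else:
            crc = i << 8
            for _ in range(8):
                if crc & 0x8000:
                    crc = ((crc << 1) ^ poly) & 0xFFFF
                else:
                    crc = (crc << 1) & 0xFFFF
        table.append(crc)
    return table


def identify_table(table):
    """Return (poly, reflected) if `table` is a standard CRC-16 table, else None."""
    if len(table) != 256:
        return None
    # Entry 1 (MSB-first) or entry 128 (LSB-first) equals the polynomial.
    for poly, reflected in ((table[1], False), (table[128], True)):
        if make_table(poly, reflected) == list(table):
            return poly, reflected
    return None
```

If `identify_table` returns None, the table is either not a plain CRC-16 table or has been modified, which would fit the 'custom' case described above.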

(12 Feb '14, 17:01) Kurt Knochner ♦

UPDATE: the OP has deleted some of his comments. As a result, my comments cannot be fully understood in the now void context :-(

(13 Feb '14, 00:17) Kurt Knochner ♦


If it's encrypted you do have a chance (in fact, you have a 100% chance if you handle it right): the key must reside somewhere on your computer. Just pop open your favorite debugger, watch for a bit (err, a hundred bytes or so I'd hope) of data to come in from a socket, set a watchpoint on that data, and look at the stack traces of things that access it. If you're really lucky, you might even see it get decrypted in place. If not, you'll probably pick up on the fact that they're using a standard encryption algorithm (they'd be fools not to from a theoretical security standpoint), either by looking at stack traces (if you're lucky) or by using one of the IV / S-box profilers out there (avoid the academic ones, most of them don't work without a lot of trouble). Many encryption algorithms use blocks of "standard data" that can be detected (these are the IVs / S-boxes); these are what you look for in the absence of other information. Whatever you find, google it, and try to override their encryption library to dump the data that's being encrypted/decrypted. From these dumps, it should be relatively easy to see what's going on. (PRODUCT SPAM REMOVED)
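As a small illustration of looking for "standard data", the Python sketch below scans a byte string for the first 16 bytes of the AES S-box. This is only one example constant; a real profiler checks many more, and a miss does not prove the absence of encryption. The helper name is made up for the example.

```python
# First 16 bytes of the AES S-box, a well-known constant table.
AES_SBOX_PREFIX = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")


def find_all(haystack: bytes, needle: bytes) -> list:
    """Return every offset at which `needle` occurs in `haystack`."""
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = haystack.find(needle, pos + 1)
    return offsets
```

To use it, read the program's executable or a memory dump into a bytes object and pass it as `haystack`; any reported offset is a place to start looking in the disassembler.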

REing an encrypted session can be a lot of fun, but it requires skill with your debugger and lots of reading. It can be frustrating but you won't be sorry if you spend the time to learn how to do it :)

answered 13 Feb '14, 00:56

cusabio1

edited 13 Feb '14, 01:29

Kurt Knochner ♦