Issues when reading a string from TCP socket in Node.js

I've implemented a client/server that communicate using a TCP socket. The data that I'm writing to the socket is stringified JSON. Initially everything works as expected, however, as I increase the rate of writes I eventually encounter JSON parse errors where the beginning on the client receives the beginning of the new write on the end of the old one.

Here is the server code:

var data = {};
data.type = 'req';
data.id = 1;
data.size = 2;
var string = JSON.stringify(data);
client.write(string, callback());

Here is how I am receiving this code on the client server:

client.on('data', function(req) {
    var data = req.toString();
    try {
        json = JSON.parse(data);
    } catch (err) {
         console.log("JSON parse error:" + err);
    } 
});

The error that I'm receiving as the rate increases is:

SyntaxError: Unexpected token {

Which appears to be the beginning of the next request being tagged onto the end of the current one.

I've tried using ; as a delimiter on the end of each JSON request and then using:

 var data = req.toString().substring(0,req.toString().indexOf(';'));

However this approach, instead of resulting in JSON parse errors seems to result in completely missing some requests on the client side as I increase the rate of writes over 300 per second.

Are there any best practices or more efficient ways to delimit incoming requests via TCP sockets?

Thanks!

Answers


Thanks everyone for the explanations, they helped me to better understand the way in which data is sent and received via TCP sockets. Below is a brief overview of the code that I used in the end:

var chunk = "";
client.on('data', function(data) {
    chunk += data.toString(); // Add string on the end of the variable 'chunk'
    d_index = chunk.indexOf(';'); // Find the delimiter

    // While loop to keep going until no delimiter can be found
    while (d_index > -1) {         
        try {
            string = chunk.substring(0,d_index); // Create string up until the delimiter
            json = JSON.parse(string); // Parse the current string
            process(json); // Function that does something with the current chunk of valid json.        
        }
        chunk = chunk.substring(d_index+1); // Cuts off the processed chunk
        d_index = chunk.indexOf(';'); // Find the new delimiter
    }      
});

Comments welcome...


You're on the right track with using a delimiter. However, you can't just extract the stuff before the delimiter, process it, and then discard what came after it. You have to buffer up whatever you got after the delimiter and then concatenate what comes next to it. This means that you could end up with any number (including 0) of JSON "chunks" after a given data event.

Basically you keep a buffer, which you initialize to "". On each data event you concatenate whatever you receive to the end of the buffer and then split it the buffer on the delimiter. The result will be one or more entries, but the last one might not be complete so you need to test the buffer to make sure it ends with your delimiter. If not, you pop the last result and set your buffer to it. You then process whatever results remain (which might not be any).


Be aware that TCP does not make any guarantees about where it divides the chunks of data you recieve. All it guarantees is that all the bytes you send will be received in order, unless the connection fails entirely.

I believe Node data events come in whenever the socket says it has data for you. Technically you could get separate data events for each byte in your JSON data and it would still be within the limits of what the OS is allowed to do. Nobody does that, but your code needs to be written as if it could suddenly start happening at any time to be robust. It's up to you to combine data events and then re-split the data stream along boundaries that make sense to you.

To do that, you need to buffer any data that isn't "complete", including data appended to the end of a chunk of "complete" data. If you're using a delimiter, never throw away any data after the delimiter -- always keep it around as a prefix until you see either more data and eventually either another delimiter or the end event.

Another common choice is to prefix all data with a length field. Say you use a fixed 64-bit binary value. Then you always wait for 8 bytes, plus however many more the value in those bytes indicate, to arrive. Say you had a chunk of ten bytes of data incoming. You might get 2 bytes in one event, then 5, then 4 -- at which point you can parse the length and know you need 7 more, since the last 3 bytes of the third chunk were payload. If the next event actually contains 25 bytes, you'd take the first 7 along with the 3 from before and parse that, and look for another length field in bytes 8-16.

That's a contrived example, but be aware that at low traffic rates, the network layer will generally send your data out in whatever chunks you give it, so this sort of thing only really starts to show up as you increase the load. Once the OS starts building packets from multiple writes at once, it will start splitting on a granularity that is convenient for the network and not for you, and you have to deal with that.


Try with end event and no data

var data = '';

client.on('data', function (chunk) {
  data += chunk.toString();
});

client.on('end', function () {
  data = JSON.parse(data); // use try catch, because if a man send you other for fun, you're server can crash.
});

Hope help you.


Need Your Help

How to find GCD, LCM on a set of numbers

java math greatest-common-divisor lcm

What would be the easiest way to calculate Greatest Common Divisor and Least Common Multiple on a set of numbers? What math functions can be used to find this information?

Unity singleton manager classes

c# design-patterns unity3d singleton

In Unity, whats a good way to create a singleton game manager that can be accessed everywhere as a global class with static variables that will spit the same constant values to every class that pulls