Last Updated: February 25, 2016 · bhaskerkode

Send many small packets or few big packets? #tcp

I wish more were said about how TCP transmission patterns correlate with CPU usage and load average.

Here is some code I quickly wrote that doesn't give me all the answers, but is a start.

Start a TCP server in one terminal:

git clone https://github.com/extend/ranch
cd ranch/examples/tcp_echo
# optionally comment out the echo part
make && ./_rel/bin/tcp_echo_example console

This starts the Ranch TCP server on port 5555.

I opened up another terminal for the client

Create a function that returns N bytes of "x"

1> Bin = fun(N)->
    lists:foldl(fun(X,Acc)->         
      <<Acc/binary,"x">> 
   end, <<>>, lists:seq(1,N)) end.
#Fun<erl_eval.6.80484245>
2> Bin(1).
<<"x">>
3> Bin(10).
<<"xxxxxxxxxx">>
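
As an aside, the standard library can build the same payload without a fold. A minimal sketch, assuming the payload is just N copies of one byte; Bin2 is a name introduced here for illustration and is not part of the original session:

```erlang
%% Bin2 is a hypothetical name; binary:copy/2 repeats a binary N times.
Bin2 = fun(N) -> binary:copy(<<"x">>, N) end.
Bin2(10).
%% returns <<"xxxxxxxxxx">>
```

This matters mostly for large N, where building the payload inside a timed fun would otherwise add to the measurement.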

How long does it take to send a byte? timer:tc reports microseconds, and the timed fun here also includes the connect.

4> timer:tc(fun()-> 
   {ok,C} = gen_tcp:connect("localhost",5555,[]),    
   gen_tcp:send(C,Bin(1)) 
 end).
{14606,ok}

Try it out with different sizes. Note that the timed fun also builds the payload with Bin, so that cost is part of the result:

9> timer:tc(fun()-> 
    {ok,C} = gen_tcp:connect("localhost",5555,[]), gen_tcp:send(C,Bin(150000)) end).
{4923405,ok}
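
Since the number above mixes payload construction, connect, and send, it can help to time each phase on its own. A rough sketch, not a rigorous benchmark; TimePhases is a hypothetical helper introduced here, and it assumes something is listening on the given host and port:

```erlang
%% TimePhases is a hypothetical helper: it times building the payload,
%% connecting, and sending separately. All three results are microseconds.
TimePhases = fun(Host, Port, Size) ->
    {BuildUs, Payload} = timer:tc(fun() -> binary:copy(<<"x">>, Size) end),
    {ConnectUs, {ok, Sock}} =
        timer:tc(fun() -> gen_tcp:connect(Host, Port, [binary]) end),
    {SendUs, ok} = timer:tc(fun() -> gen_tcp:send(Sock, Payload) end),
    ok = gen_tcp:close(Sock),
    {BuildUs, ConnectUs, SendUs}
end.
```

Against the echo server above this would be called as TimePhases("localhost", 5555, 150000). The socket stays in the default active mode, so any echoed data is delivered to the shell process as messages rather than piling up unread.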

Send 10 payloads one after another on the same socket:

11> timer:tc(fun()-> 
  {ok,C} = gen_tcp:connect("localhost",5555,[]),
  [gen_tcp:send(C,Bin(1400)) || _X <- lists:seq(1,10)], ok end).
{819,ok}
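
One thing to keep in mind when reading these numbers: TCP is a byte stream, so each gen_tcp:send call does not necessarily become one packet on the wire. The kernel may coalesce small writes (Nagle's algorithm), and the {nodelay, true} socket option turns that off. A sketch for comparing both settings, where SendMany is a hypothetical helper introduced here:

```erlang
%% SendMany is a hypothetical helper: open a socket with the given nodelay
%% setting, send Payload Count times, close the socket, and return the
%% elapsed time of the send loop in microseconds (connect is not timed).
SendMany = fun(Host, Port, NoDelay, Payload, Count) ->
    {ok, Sock} = gen_tcp:connect(Host, Port, [binary, {nodelay, NoDelay}]),
    {Us, ok} = timer:tc(fun() ->
        lists:foreach(fun(_) -> ok = gen_tcp:send(Sock, Payload) end,
                      lists:seq(1, Count))
    end),
    ok = gen_tcp:close(Sock),
    Us
end.
```

For example, SendMany("localhost", 5555, true, <<"x">>, 10000) versus the same call with false. Whether the two differ noticeably on a loopback interface is something to measure rather than assume.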

Make a test function that builds the payload and the iteration list up front, so they stay out of the timed section:

25> Test4 = fun(Size,Packets) ->
    FullBin = Bin(Size),
    Max = lists:seq(1,Packets),
    timer:tc(fun()->
        {ok,C} = gen_tcp:connect("localhost",5555,[binary]),
        [gen_tcp:send(C,FullBin) || _X <- Max], ok
    end) end.
#Fun<erl_eval.12.80484245>

Send 10,000 bytes in total, varying the packet size against the number of packets:

32> [{Size,Packets,Test4(Size,Packets)} || {Size,Packets} <- 
[{1,10000}, % send 10k 1 byte sized packets
{10,1000},
{100,100},
{1000,10},
{10000,1}]].  % send one 10k byte sized packet
[{1,10000,{589211,ok}},
 {10,1000,{71843,ok}},
 {100,100,{9680,ok}},
 {1000,10,{1575,ok}},
 {10000,1,{795,ok}}]
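
To make the rows easier to compare, the measured times can be turned into a cost per send call and an overall throughput. A small sketch using only the numbers above; Results and PerSend are names introduced here, and every time still includes one connect:

```erlang
%% Measured results from the run above: {Size, Packets, Microseconds}.
Results = [{1,10000,589211}, {10,1000,71843}, {100,100,9680},
           {1000,10,1575}, {10000,1,795}].
%% For each row: microseconds per send call, and bytes per microsecond
%% (one byte per microsecond is one MB/s).
PerSend = [{Size, Packets, Us / Packets, (Size * Packets) / Us}
           || {Size, Packets, Us} <- Results].
```

This works out to roughly 59 microseconds per call for the 1-byte sends (about 0.017 MB/s overall) against about 12.6 MB/s for the single 10,000-byte send.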

So sending one large packet of 10,000 bytes was fastest. I found this paper [1] interesting. It says:

1bps of network link requires 1Hz of CPU processing
[1] http://pdf.aminer.org/000/446/252/tcp_performance_re_visited.pdf

But I wish I could see the impact on CPU, TIME_WAIT sockets, and so on, which is harder to correlate.
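
A possible starting point, offered as a sketch rather than a full answer: erlang:statistics/1 can report the CPU time used by the client's Erlang VM next to the wall-clock time, and the output of netstat can be scanned for TIME_WAIT entries. WithCpu and TimeWaitCount are hypothetical helpers introduced here. The CPU figure covers only this VM (not the server or the whole machine) at millisecond resolution, and the second helper assumes a netstat command is installed.

```erlang
%% WithCpu is a hypothetical wrapper: run Fun and report the wall-clock time
%% next to the CPU time the Erlang VM consumed meanwhile (both in ms).
WithCpu = fun(Fun) ->
    {Cpu0, _} = erlang:statistics(runtime),
    {Wall0, _} = erlang:statistics(wall_clock),
    Result = Fun(),
    {Cpu1, _} = erlang:statistics(runtime),
    {Wall1, _} = erlang:statistics(wall_clock),
    {Result, [{cpu_ms, Cpu1 - Cpu0}, {wall_ms, Wall1 - Wall0}]}
end.

%% TimeWaitCount is a hypothetical helper: count occurrences of "TIME_WAIT"
%% in the output of `netstat -an` (assumes netstat exists on this machine).
TimeWaitCount = fun() ->
    Out = unicode:characters_to_binary(os:cmd("netstat -an")),
    length(binary:matches(Out, <<"TIME_WAIT">>))
end.
```

For example, WithCpu(fun() -> Test4(1, 10000) end) followed by TimeWaitCount(), repeated for each size and packet count combination.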