Send many small packets or few big packets? #tcp
I wish more were written about how TCP transmission patterns correlate with CPU usage and load average.
Here is some code I quickly wrote that doesn't give me all the answers, but is a start.
Start a TCP server in one terminal:
git clone https://github.com/extend/ranch
cd ranch/examples/tcp_echo
# optionally comment out the echo part
make && ./_rel/bin/tcp_echo_example console
This starts the Ranch TCP server on port 5555.
I opened another terminal for the client and started an Erlang shell.
Create a function that returns N bytes of "x":
1> Bin = fun(N)->
lists:foldl(fun(X,Acc)->
<<Acc/binary,"x">>
end, <<>>, lists:seq(1,N)) end.
#Fun<erl_eval.6.80484245>
2> Bin(1).
<<"x">>
3> Bin(10).
<<"xxxxxxxxxx">>
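For large N, building the binary one byte at a time in the shell can get slow. A possible alternative is the standard library's binary:copy/2, which repeats a binary N times. The name Bin2 is only an illustrative choice, not part of the original session.

```erlang
%% Bin2 is a hypothetical name; binary:copy/2 lives in the stdlib `binary` module.
Bin2 = fun(N) -> binary:copy(<<"x">>, N) end.
Bin2(10).                  % <<"xxxxxxxxxx">>
byte_size(Bin2(150000)).   % 150000
```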
How long does it take to connect and send one byte?
4> timer:tc(fun()->
{ok,C} = gen_tcp:connect("localhost",5555,[]),
gen_tcp:send(C,Bin(1))
end).
{14606,ok}
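Try it with a different size. Note that the timed fun also includes the connect and the time Bin takes to build the 150,000-byte binary, so this is not purely send time: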
9> timer:tc(fun()->
{ok,C} = gen_tcp:connect("localhost",5555,[]), gen_tcp:send(C,Bin(150000)) end).
{4923405,ok}
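To see where the time goes, the three phases can be timed separately. This is a rough sketch: Phases is a hypothetical helper name, the host and port are passed in (use "localhost" and 5555 for the Ranch server above), and MakeBin is whatever fun builds the payload, such as Bin.

```erlang
%% Phases is a hypothetical helper: it times building the payload,
%% connecting, and sending as three separate timer:tc/1 calls.
Phases = fun(Host, Port, MakeBin, Size) ->
    {BuildUs, Payload} = timer:tc(fun() -> MakeBin(Size) end),
    {ConnectUs, {ok, Sock}} =
        timer:tc(fun() -> gen_tcp:connect(Host, Port, [binary]) end),
    {SendUs, ok} = timer:tc(fun() -> gen_tcp:send(Sock, Payload) end),
    ok = gen_tcp:close(Sock),
    %% All values are in microseconds.
    [{build_us, BuildUs}, {connect_us, ConnectUs}, {send_us, SendUs}]
end.
```

For example, Phases("localhost", 5555, Bin, 150000) shows how much of the total is spent in Bin itself. The socket is in the default active mode, so if the server echoes, the echoed data lands in the shell's mailbox; flush(). clears it.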
Send 10 binaries of 1,400 bytes one after another over the same socket:
11> timer:tc(fun()->
{ok,C} = gen_tcp:connect("localhost",5555,[]),
[gen_tcp:send(C,Bin(1400)) || _X <- lists:seq(1,10)], ok end).
{819,ok}
Make a test function that precomputes the binary and the sequence outside the timed fun:
25> Test4 = fun(Size,Packets) ->
FullBin = Bin(Size), Max = lists:seq(1,Packets), timer:tc(fun()->
{ok,C} = gen_tcp:connect("localhost",5555,[binary]),
[gen_tcp:send(C,FullBin) || _X <- Max], ok
end) end.
#Fun<erl_eval.12.80484245>
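One caveat: TCP is a byte stream, so a gen_tcp:send/2 call does not necessarily become its own packet, and Nagle's algorithm may coalesce small writes. A possible variant, sketched below, takes extra socket options so that {nodelay, true} can be compared against the default. Test5 is a hypothetical name, and unlike the fun above it times only the sends (not the connect) and closes the socket afterwards.

```erlang
%% Test5 is a hypothetical variant of the test fun: it accepts extra socket
%% options, times only the send loop, and closes the socket when done.
Test5 = fun(Host, Port, Opts, Size, Sends) ->
    Payload = binary:copy(<<"x">>, Size),
    {ok, Sock} = gen_tcp:connect(Host, Port, [binary | Opts]),
    {Us, _} = timer:tc(fun() ->
        [ok = gen_tcp:send(Sock, Payload) || _ <- lists:seq(1, Sends)]
    end),
    ok = gen_tcp:close(Sock),
    Us   % microseconds spent in the send loop
end.
%% Nagle on (default) versus off, for many tiny writes:
%% Test5("localhost", 5555, [], 1, 10000).
%% Test5("localhost", 5555, [{nodelay, true}], 1, 10000).
```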
Send 10,000 bytes in total, trading off the size of each send against the number of sends:
32> [{Size,Packets,Test4(Size,Packets)} || {Size,Packets} <-
[{1,10000}, % send 10k 1 byte sized packets
{10,1000},
{100,100},
{1000,10},
{10000,1}]]. % send one 10k byte sized packet
[{1,10000,{589211,ok}},
{10,1000,{71843,ok}},
{100,100,{9680,ok}},
{1000,10,{1575,ok}},
{10000,1,{795,ok}}]
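To compare the rows more directly, the timings can be turned into a cost per send call and a throughput. The numbers below are copied from the run above; Results and Rates are illustrative names. Each timing still includes the connect, so the per-call figure for the rows with few sends is inflated by it.

```erlang
%% Results copied from the run above: {BytesPerSend, Sends, Microseconds}.
Results = [{1, 10000, 589211}, {10, 1000, 71843}, {100, 100, 9680},
           {1000, 10, 1575}, {10000, 1, 795}].
%% Per row: microseconds per gen_tcp:send/2 call, and bytes per microsecond
%% (numerically the same as MB/s, taking 1 MB = 10^6 bytes).
Rates = [{Size, Sends, Us / Sends, (Size * Sends) / Us}
         || {Size, Sends, Us} <- Results].
```

On these numbers the cost per call rises from about 59 microseconds (1-byte sends) to 795 microseconds (one 10,000-byte send), while throughput rises from about 0.017 MB/s to about 12.6 MB/s.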
So sending one large packet of 10,000 bytes was fastest. I found this paper [1] interesting; it says:
1bps of network link requires 1Hz of CPU processing
[1] http://pdf.aminer.org/000/446/252/tcp_performance_re_visited.pdf
But I wish I could also see the impact on CPU, TIME_WAIT sockets, etc., which is harder to correlate.
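As a partial step towards the CPU side, the client VM's own CPU time can be sampled around a test with erlang:statistics(runtime). This is only a sketch: WithCpu is a hypothetical helper name, the resolution is milliseconds, and it covers the Erlang VM running the client only, not the server or the kernel.

```erlang
%% WithCpu is a hypothetical helper. It runs Fun and reports wall-clock
%% microseconds next to the CPU milliseconds consumed by this Erlang VM.
WithCpu = fun(Fun) ->
    {CpuBefore, _} = erlang:statistics(runtime),
    {WallUs, Result} = timer:tc(Fun),
    {CpuAfter, _} = erlang:statistics(runtime),
    {Result, [{wall_us, WallUs}, {vm_cpu_ms, CpuAfter - CpuBefore}]}
end.
```

For example, WithCpu(fun() -> Test4(1, 10000) end) and WithCpu(fun() -> Test4(10000, 1) end) can be compared using the Test4 defined earlier.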