server: fix buffer release timing in processUnaryRPC by lqs · Pull Request #7998 · grpc/grpc-go

Here are the benchmark results, taken on a Ryzen 9 7945HX CPU. Each test group was run multiple times, and the fastest result was selected.

before

df := func(v any) error {
    defer d.Free()

$ ~/sdk/go1.23.4/bin/go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary -compression=off -maxConcurrentCalls=100 -trace=off -reqSizeBytes=1 -respSizeBytes=1 -networkMode=Local -resultFile=before
go1.23.4/grpc1.71.0-dev
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_100-reqSize_1B-respSize_1B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_nil-sharedWriteBuffer_false:
50_Latency: 530.3120µs	90_Latency: 747.7040µs	99_Latency: 1405.5230µs	Avg_Latency: 581.5540µs	Bytes/op: 10153.135931747744	Allocs/op: 178.97030961211047
Histogram (unit: µs)
Count: 10295689  Min: 110.7  Max: 41317.2  Avg: 581.55
------------------------------------------------------------
[     110.655000,      110.656000)         1    0.0%    0.0%
[     110.656000,      110.662016)         0    0.0%    0.0%
[     110.662016,      110.704228)         0    0.0%    0.0%
[     110.704228,      111.000400)         0    0.0%    0.0%
[     111.000400,      113.078425)         0    0.0%    0.0%
[     113.078425,      127.658446)         0    0.0%    0.0%
[     127.658446,      229.956072)       164    0.0%    0.0%
[     229.956072,      947.705647)   9740824   94.6%   94.6%  #########
[     947.705647,     5983.643190)    554125    5.4%  100.0%  #
[    5983.643190,    41317.230000)       574    0.0%  100.0%
[   41317.230000,   289227.841760)         1    0.0%  100.0%
Number of requests:  10295689	Request throughput:  1.3727585333333334e+06 bit/s
Number of responses: 10295689	Response throughput: 1.3727585333333334e+06 bit/s
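
The throughput figures above appear to follow directly from the request count, the 1-byte payload, and the 60-second run: 10295689 × 1 B × 8 / 60 s ≈ 1.3727585e+06 bit/s. A minimal check of that arithmetic, assuming this formula (the helper name `throughputBitsPerSec` is hypothetical and not taken from the benchmark tool):

```go
package main

import "fmt"

// throughputBitsPerSec assumes throughput = messages * bytes per message * 8 / seconds.
// This is an inferred formula that matches the printed numbers, not the tool's source.
func throughputBitsPerSec(count int64, sizeBytes int, seconds float64) float64 {
	return float64(count) * float64(sizeBytes) * 8 / seconds
}

func main() {
	// Values taken from the "before" run: 10295689 requests, 1-byte payload, 60s.
	fmt.Printf("%.2f bit/s\n", throughputBitsPerSec(10295689, 1, 60))
}
```

Because the payload is only one byte, throughput here is just another view of the operation count, which is why it moves by the same percentage as TotalOps.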

after

dataFree := grpcsync.OnceFunc(d.Free)
defer dataFree()
df := func(v any) error {
    defer dataFree()
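
The snippet above wraps `d.Free` in a once-guarded function and defers it both in the enclosing function and inside `df`, so the buffer is released exactly once whether or not `df` ever runs. A minimal sketch of that pattern, assuming the standard library's `sync.OnceFunc` (Go 1.21+) as a stand-in for the `grpcsync.OnceFunc` used in the PR; the `buffer` type and `process` function are hypothetical and only illustrate the control flow:

```go
package main

import (
	"fmt"
	"sync"
)

// buffer is a hypothetical stand-in for a pooled buffer; it counts Free calls.
type buffer struct{ frees int }

func (b *buffer) Free() { b.frees++ }

// process mirrors the shape of the "after" snippet: the release is wrapped in
// a once-guarded function, deferred at the outer level, and deferred again
// inside the decode callback.
func process(b *buffer, callDecode bool) {
	free := sync.OnceFunc(b.Free) // stdlib stand-in for grpcsync.OnceFunc
	defer free()                  // safety net: runs even if decode is never called

	decode := func() error {
		defer free() // early release as soon as decoding finishes
		return nil
	}
	if callDecode {
		_ = decode()
	}
}

func main() {
	used, skipped := &buffer{}, &buffer{}
	process(used, true)     // decode runs: freed inside decode, outer defer is a no-op
	process(skipped, false) // decode skipped: freed by the outer defer
	fmt.Println(used.frees, skipped.frees) // prints "1 1"
}
```

With the "before" shape (a plain `defer d.Free()` only inside the callback), the `skipped` case in this sketch would never call `Free` at all.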
$ ~/sdk/go1.23.4/bin/go run benchmark/benchmain/main.go -benchtime=60s -workloads=unary -compression=off -maxConcurrentCalls=100 -trace=off -reqSizeBytes=1 -respSizeBytes=1 -networkMode=Local -resultFile=after
go1.23.4/grpc1.71.0-dev
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_100-reqSize_1B-respSize_1B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_nil-sharedWriteBuffer_false:
50_Latency: 533.2280µs	90_Latency: 744.5700µs	99_Latency: 1415.7970µs	Avg_Latency: 584.1710µs	Bytes/op: 10225.500515690499	Allocs/op: 181.97781262126145
Histogram (unit: µs)
Count: 10246456  Min:  64.9  Max: 39638.3  Avg: 584.17
------------------------------------------------------------
[      64.876000,       64.877000)         1    0.0%    0.0%
[      64.877000,       64.882985)         0    0.0%    0.0%
[      64.882985,       64.924788)         0    0.0%    0.0%
[      64.924788,       65.216775)         0    0.0%    0.0%
[      65.216775,       67.256256)         0    0.0%    0.0%
[      67.256256,       81.501688)         3    0.0%    0.0%
[      81.501688,      181.003628)       142    0.0%    0.0%
[     181.003628,      876.007898)   9584488   93.5%   93.5%  #########
[     876.007898,     5730.495526)    661038    6.5%  100.0%  #
[    5730.495526,    39638.273000)       783    0.0%  100.0%
[   39638.273000,   276478.380825)         1    0.0%  100.0%
Number of requests:  10246456	Request throughput:  1.3661941333333333e+06 bit/s
Number of responses: 10246456	Response throughput: 1.3661941333333333e+06 bit/s

comparison

$ ~/sdk/go1.23.4/bin/go run benchmark/benchresult/main.go before after
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_100-reqSize_1B-respSize_1B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_nil-sharedWriteBuffer_false
               Title       Before        After Percentage
            TotalOps     10295689     10246456    -0.48%
             SendOps            0            0      NaN%
             RecvOps            0            0      NaN%
            Bytes/op     10153.14     10225.50     0.71%
           Allocs/op       178.97       181.98     1.68%
             ReqT/op   1372758.53   1366194.13    -0.48%
            RespT/op   1372758.53   1366194.13    -0.48%
            50th-Lat    530.312µs    533.228µs     0.55%
            90th-Lat    747.704µs     744.57µs    -0.42%
            99th-Lat   1.405523ms   1.415797ms     0.73%
             Avg-Lat    581.554µs    584.171µs     0.45%
           GoVersion     go1.23.4     go1.23.4
         GrpcVersion   1.71.0-dev   1.71.0-dev
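
The Percentage column is consistent with a plain relative change, (after − before) / before × 100. A short sketch that recomputes a few rows from the rounded values in the table, assuming that formula (the helper `percentChange` is hypothetical and not the comparison tool's own code):

```go
package main

import "fmt"

// percentChange returns the relative change from before to after, in percent.
func percentChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// Inputs are copied from the comparison table above.
	fmt.Printf("TotalOps:  %.2f%%\n", percentChange(10295689, 10246456))
	fmt.Printf("Bytes/op:  %.2f%%\n", percentChange(10153.14, 10225.50))
	fmt.Printf("Allocs/op: %.2f%%\n", percentChange(178.97, 181.98))
	fmt.Printf("50th-Lat:  %.2f%%\n", percentChange(530.312, 533.228))
}
```

The SendOps and RecvOps rows show NaN% because both values are zero, which makes the ratio 0/0.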