
Verifying circuit breaking with the bundled examples (optional circuit breaker and connection-timeout circuit breaker)

Open 52coder opened this issue 3 years ago • 11 comments

Describe the bug: I am trying to verify the optional (failure-rate based) circuit breaker using one of the bundled examples. Original code: https://github.com/52coder/incubator-brpc/tree/master/example/asynchronous_echo_c%2B%2B Changes to client.cpp: at https://github.com/52coder/incubator-brpc/blob/master/example/asynchronous_echo_c%2B%2B/client.cpp#L65 add: options.enable_circuit_breaker = true; and change line 30 to use rr: DEFINE_string(load_balancer, "rr", "The algorithm for load balancing");

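For server.cpp, I first built echo_server from the unmodified source. I then added the following line just before https://github.com/52coder/incubator-brpc/blob/master/example/asynchronous_echo_c%2B%2B/server.cpp#L63: cntl->SetFailed(brpc::EREQUEST, "Fail to parse request"); and built a second binary, echo_server_fail. The idea is to have a server that fails 100% of requests so that the circuit breaker trips.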

To Reproduce: run the following in three separate terminals: ./echo_server --port=8002 ./echo_server --port=8003 ./echo_server_fail --port=8001

On the client side, run: ./echo_client --server="list://192.168.49.1:8001,192.168.49.1:8002,192.168.49.1:8003"

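Expected behavior: Since the client has the circuit breaker enabled and one of the servers fails every request, the breaker should trip and that server should be removed from rotation. In practice, after running overnight, the failing server was never isolated: ` I0803 00:26:21.137101 1872894 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=603us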

I0803 00:26:22.137842 1872886 client.cpp:45] Received response from 192.168.49.1:8003: hello world (attached=bar) latency=734us

W0803 00:26:23.138861 1872894 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8001][E1003]Fail to parse request

I0803 00:26:24.139042 1872886 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=590us

I0803 00:26:25.139412 1872894 client.cpp:45] Received response from 192.168.49.1:8003: hello world (attached=bar) latency=639us

W0803 00:26:26.140028 1872886 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8001][E1003]Fail to parse request

I0803 00:26:27.140468 1872894 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=733us

I0803 00:26:28.141491 1872886 client.cpp:45] Received response from 192.168.49.1:8003: hello world (attached=bar) latency=739us

W0803 00:26:29.142124 1872894 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8001][E1003]Fail to parse request

I0803 00:26:30.142181 1872886 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=484us

I0803 00:26:31.142817 1872894 client.cpp:45] Received response from 192.168.49.1:8003: hello world (attached=bar) latency=737us

W0803 00:26:32.143875 1872886 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8001][E1003]Fail to parse request

I0803 00:26:33.144834 1872894 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=748us

I0803 00:26:34.145134 1872886 client.cpp:45] Received response from 192.168.49.1:8003: hello world (attached=bar) latency=676us

W0803 00:26:35.145437 1872894 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8001][E1003]Fail to parse request

I0803 00:26:36.145931 1872886 client.cpp:45] Received response from 192.168.49.1:8002: hello world (attached=bar) latency=739us

`

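Versions: OS: centos7 Compiler: brpc: latest protobuf: All circuit breaker parameters are left at their defaults: https://github.com/apache/incubator-brpc/blob/master/src/brpc/circuit_breaker.cpp#L32 I traced the code path Controller::Call::OnComplete -> FeedbackCircuitBreaker. Additional context/screenshots: I also could not reproduce the connection-timeout circuit breaker. With the connect timeout set to 1ms and the RPC timeout set to 5ms, the service logs timeout error 1008, but the circuit breaker is not actually triggered.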

52coder avatar Aug 02 '22 16:08 52coder

What QPS are you testing at? Try raising it. In theory this should trip the breaker; I'll see whether I can reproduce it in my environment.

cdjingit avatar Aug 05 '22 03:08 cdjingit

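@cdjingit I removed the sleep from the example to raise the QPS, and the circuit breaker still did not trip. I read through this issue but don't quite understand how circuit breaking relates to QPS. Is there a detailed description of the mechanism that triggers it?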

52coder avatar Aug 07 '22 10:08 52coder

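Check whether the log contains "Socket[xxx] isolated by circuit breaker". brpc's default reconnect check only requires that a TCP connection can be established, so the node may have been isolated and then recovered. You can also check the health of each connection on the connections monitoring page.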

To test the connection-timeout circuit breaker, you need to drop packets to the target port with iptables; simply sleeping won't work.

TousakaRin avatar Aug 08 '22 07:08 TousakaRin

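@TousakaRin @cdjingit Sorry for the delay, I got tied up and only had time to look at this today. One of the servers sets cntl->SetFailed(brpc::EREQUEST, "Fail to parse request"); when replying, yet there is no circuit breaker log and that server keeps receiving requests.

The timeout circuit breaker now verifies fine. The failure-rate (optional) circuit breaker still does not verify with the example, so we have not enabled it in production for now.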

52coder avatar Aug 19 '22 06:08 52coder

Could you post the complete code? After adding cntl->SetFailed(brpc::EREQUEST, "Fail to parse request"); on the server side, the log I see on the client is W0822 11:08:00.715427 748755 client.cpp:90] [E1003][10.227.87.49:8000][E1003]Failed to parse request, whereas the log you posted says Fail to send EchoRequest, which is a bit odd.

TousakaRin avatar Aug 22 '22 03:08 TousakaRin

@TousakaRin The code is just the example from the repository.

Original code: https://github.com/52coder/incubator-brpc/tree/master/example/asynchronous_echo_c%2B%2B Changes to client.cpp: at https://github.com/52coder/incubator-brpc/blob/master/example/asynchronous_echo_c%2B%2B/client.cpp#L65 add: options.enable_circuit_breaker = true; and change line 30 to use rr: DEFINE_string(load_balancer, "rr", "The algorithm for load balancing");

For server.cpp, I first built echo_server from the unmodified source. I then added the following line just before https://github.com/52coder/incubator-brpc/blob/master/example/asynchronous_echo_c%2B%2B/server.cpp#L63: cntl->SetFailed(brpc::EREQUEST, "Fail to parse request"); and built a second binary, echo_server_fail. The idea is to have a server that fails 100% of requests so that the circuit breaker trips.

52coder avatar Aug 26 '22 07:08 52coder

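[image] I was able to trip the circuit breaker with the code you describe. With the default configuration, if the server's error rate is 100%, the 151st request triggers the breaker. This is the code I used: https://github.com/TousakaRin/brpc/tree/circuit_breaker_test

One possibility is that your code is older and does not include this commit: https://github.com/apache/incubator-brpc/commit/c59aecdded0fbc01cda5b5462ef99c4d084b31ae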
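One way to arrive at "the 151st request" is a simple counting rule: while the breaker's sample window is still filling, a node is flagged once the number of errors already seen reaches window_size * max_error_percent / 100. The sketch below models only that rule. The class and function names are hypothetical, and the 1500-sample window and 10% threshold used in the usage note are assumptions chosen to be consistent with the number quoted above; check circuit_breaker.cpp in your own checkout for the real logic and defaults.

```cpp
#include <cstdint>

// Simplified model of an error-count check applied while a sample window
// is still filling. Hypothetical class; not brpc's actual CircuitBreaker.
class WarmupErrorCounter {
public:
    WarmupErrorCounter(int32_t window_size, int32_t max_error_percent)
        : window_size_(window_size), max_error_percent_(max_error_percent) {}

    // Returns true while the node is considered healthy, false once the
    // error count seen so far has reached the threshold.
    bool OnCallEnd(bool failed) {
        if (samples_ >= window_size_) {
            return true;  // warm-up finished; later behaviour is not modelled here
        }
        ++samples_;
        if (!failed) {
            return true;
        }
        const int32_t errors_before = errors_++;
        return errors_before < window_size_ * max_error_percent_ / 100;
    }

private:
    int32_t window_size_;
    int32_t max_error_percent_;
    int32_t samples_ = 0;
    int32_t errors_ = 0;
};

// 1-based index of the first call that trips the counter when every call
// fails, or -1 if it does not trip within max_calls.
int first_tripping_call(WarmupErrorCounter counter, int max_calls) {
    for (int i = 1; i <= max_calls; ++i) {
        if (!counter.OnCallEnd(true)) {
            return i;
        }
    }
    return -1;
}
```

With the assumed values, `first_tripping_call(WarmupErrorCounter(1500, 10), 1000)` returns 151: the threshold is 150 errors, and the call that finds 150 errors already recorded is the one that trips. Successful calls to other servers do not enter this count, since in this model each server would have its own counter.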

TousakaRin avatar Aug 26 '22 08:08 TousakaRin

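Could you post the complete code? After adding cntl->SetFailed(brpc::EREQUEST, "Fail to parse request"); on the server side, the log I see on the client is W0822 11:08:00.715427 748755 client.cpp:90] [E1003][10.227.87.49:8000][E1003]Failed to parse request, whereas the log you posted says Fail to send EchoRequest, which is a bit odd.

I copied the asynchronous_echo_c++ directory, and my changes are the same as in your https://github.com/TousakaRin/brpc/tree/circuit_breaker_test branch, except that I don't have the dummy_server.port file. The client printing client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8003][E1003]Failed to parse request should be normal, since the server marks its response as failed, so that message is expected.

I checked everything again: with one server failing 100% of the time, the circuit breaker still did not trip after half an hour.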

52coder avatar Aug 27 '22 15:08 52coder

@TousakaRin I'm not sure whether my test method is wrong. The code is the same as yours (minus dummy_server.port). I built two versions of the server, one with cntl->SetFailed(brpc::EREQUEST, "Failed to parse request"); and one without that line: ./echo_server_ok --port=8001 ./echo_server_fail --port=8002

In client.cpp: DEFINE_string(server, "list://0.0.0.0:8001,0.0.0.0:8002", "IP Address of server"); then run ./echo_client directly.

The server and client run on the same physical Ubuntu machine. The brpc version in use (which includes the commit you mentioned): root@52coder:~/incubator-brpc/example/enable_circuit_breaker_c++# git branch

  • (HEAD detached at 1.2.0) master

Could you please help review this? I really can't see what is preventing the circuit breaker from tripping.

The client terminal output has been the following the whole time (20+ minutes): W0827 23:17:06.067692 13282 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8002][E1003]Failed to parse request I0827 23:17:07.068715 13280 client.cpp:45] Received response from 0.0.0.0:8001: hello world (attached=bar) latency=1281us W0827 23:17:08.069341 13282 client.cpp:42] Fail to send EchoRequest, [E1003][127.0.1.1:8002][E1003]Failed to parse request I0827 23:17:09.070102 13280 client.cpp:45] Received response from 0.0.0.0:8001: hello world (attached=bar) latency=1048us

52coder avatar Aug 27 '22 15:08 52coder

Build information and file list. File list after make: root@52coder:~/incubator-brpc/example/enable_circuit_breaker_c++# ls -lt total 373920 -rwxr-xr-x 1 root root 95129344 8月 27 23:13 echo_server -rwxr-xr-x 1 root root 95059624 8月 27 23:13 echo_client -rw-r--r-- 1 root root 496992 8月 27 23:13 client.o -rw-r--r-- 1 root root 4803 8月 27 23:13 client.cpp -rwxr-xr-x 1 root root 95129344 8月 27 23:06 echo_server_ok -rw-r--r-- 1 root root 948400 8月 27 23:06 server.o -rw-r--r-- 1 root root 4203 8月 27 23:06 server.cpp -rwxr-xr-x 1 root root 95129432 8月 27 23:05 echo_server_fail -rw-r--r-- 1 root root 5 8月 27 23:05 dummy_server.port -rw-r--r-- 1 root root 926128 8月 27 22:54 echo.pb.o -rw-r--r-- 1 root root 22009 8月 27 22:54 echo.pb.h -rw-r--r-- 1 root root 4878 8月 27 22:52 CMakeLists.txt -rw-r--r-- 1 root root 1073 8月 27 22:52 echo.proto -rw-r--r-- 1 root root 815 8月 27 22:52 Makefile

root@52coder:~/incubator-brpc/example/enable_circuit_breaker_c++# make

Generating echo.pb.cc /usr/local/bin/protoc --cpp_out=. --proto_path=. echo.proto Compiling echo.pb.o g++ -c -I/usr/include/ -I../../output/include -DBRPC_WITH_GLOG=0 -DGFLAGS_NS=google -g -std=c++0x -DNDEBUG -O2 -D__const__= -pipe -W -Wall -Wno-unused-parameter -fPIC -fno-omit-frame-pointer echo.pb.cc -o echo.pb.o Compiling client.o g++ -c -I/usr/include/ -I../../output/include -DBRPC_WITH_GLOG=0 -DGFLAGS_NS=google -g -std=c++0x -DNDEBUG -O2 -D__const__= -pipe -W -Wall -Wno-unused-parameter -fPIC -fno-omit-frame-pointer client.cpp -o client.o Linking echo_client g++ -L/usr/lib/x86_64-linux-gnu -L../../output/lib -Xlinker "-(" echo.pb.o client.o -Wl,-Bstatic -lgflags -lprotobuf -lleveldb -lsnappy -lbrpc -Wl,-Bdynamic -Xlinker "-)" -lpthread -lssl -lcrypto -ldl -lz -lrt -o echo_client Compiling server.o g++ -c -I/usr/include/ -I../../output/include -DBRPC_WITH_GLOG=0 -DGFLAGS_NS=google -g -std=c++0x -DNDEBUG -O2 -D__const__= -pipe -W -Wall -Wno-unused-parameter -fPIC -fno-omit-frame-pointer server.cpp -o server.o In file included from ../../output/include/butil/resource_pool.h:89, from ../../output/include/brpc/socket.h:30, from ../../output/include/brpc/redis.h:33, from ../../output/include/brpc/server.h:43, from server.cpp:22: ../../output/include/butil/resource_pool_inl.h: In instantiation of ‘static butil::ResourcePool<T>* butil::ResourcePool<T>::singleton() [with T = brpc::Socket]’: ../../output/include/butil/resource_pool.h:118:38: required from ‘int butil::return_resource(butil::ResourceId<T>) [with T = brpc::Socket]’ ../../output/include/brpc/socket_inl.h:112:32: required from here ../../output/include/butil/resource_pool_inl.h:368:17: warning: ‘new’ of type ‘butil::ResourcePoolbrpc::Socket’ with extended alignment 64 [-Waligned-new=] 368 | p = new ResourcePool(); | ^~~~~~~~~~~~~~~~~~ ../../output/include/butil/resource_pool_inl.h:368:17: note: uses ‘void* operator new(std::size_t)’, which does not have an alignment parameter 
../../output/include/butil/resource_pool_inl.h:368:17: note: use ‘-faligned-new’ to enable C++17 over-aligned new support Linking echo_server g++ -L/usr/lib/x86_64-linux-gnu -L../../output/lib -Xlinker "-(" echo.pb.o server.o -Wl,-Bstatic -lgflags -lprotobuf -lleveldb -lsnappy -lbrpc -Wl,-Bdynamic -Xlinker "-)" -lpthread -lssl -lcrypto -ldl -lz -lrt -o echo_server rm echo.pb.cc root@52coder:~/incubator-brpc/example/enable_circuit_breaker_c++#

52coder avatar Aug 28 '22 01:08 52coder

You can check on the client's connections page whether the error count is being recorded correctly.

TousakaRin avatar Sep 08 '22 09:09 TousakaRin