大龙的博客

常用链接

统计

最新评论

Linux kernel scaling: Ports and port Cycling --- 转(http://blog.csdn.net/zgl_dm/article/details/6593661)

NOTE: The content of this article is subject to change as we are still investigating the issue While attempting to benchmark redis a coworker (Kal McFate) and I were hitting a 28k limit on concurrent connections from a client machine to our redis server. After investigating we found the following: The default setting for the ephemeral port range on linux (net.ipv4.ip_local_port_range) is not ideal for scale. Default: 32768-61000 Recommended for scale: 1025-65000 Additionally even after changing this setting we were limited by sockets staying open in the TIME_WAIT state. Most of the poor documentation on the internet suggests setting the following in order to address the issue: net.ipv4.tcp_tw_recycle = 1 and net.ipv4.tcp_tw_reuse = 1 This is in fact incorrect. First you should choose one setting or the other not both. tcp_tw_recycle should be considered unsafe for load balancers and other customer facing devices that communicate over a higher latency network and or utilize failover services. This is due to the fact that TIME_WAIT is required in order to deal with packets that arrive for a connection after the same packet has been previously accepted via a retransmit. Setting net.ipv4.tcp_tw_reuse = 1 appears to have resolved our issue. This has passed the limiting factor from the client to the redis server. This issue is difficult to debug due to the fact that while incoming port exhaustion (socket -> accept) will produce a kernel level logged error, ephemeral local port exhaustion creates an application level rather generic could not connect error. We are now investigating other areas this change might benefit! A better solution as far as client -> redis communication is concerned is probably pipelining requests via a single persistent connection. We are looking into this as well. UPDATE: Data is still applicable to concurrency issues, however the root cause here ended up being that the client code was throwing the socket away before properly hanging up on the server. So the socket was left in TIME_WAIT until the timeout period expired. LESSON: When it comes to sockets in TIME_WAIT the issue is most likely caused by crappy TCP socket handling Additionally enabling net.ipv4.tcp_tw_reuse on a development system may cover up poorly implemented protocol and TCP socket level handling :/ http://www.lakitu.us/2011/04/linux-kernel-scaling-ports-and-port-cycling/

posted on 2013-02-18 09:51 大龙 阅读(299) 评论(0)  编辑 收藏 引用


只有注册用户登录后才能发表评论。
网站导航: 博客园   IT新闻   BlogJava   知识库   博问   管理