﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-&amp;豪</title><link>http://www.cppblog.com/qywyh/</link><description>豪-&gt;blog</description><language>zh-cn</language><lastBuildDate>Sat, 04 Apr 2026 23:26:13 GMT</lastBuildDate><pubDate>Sat, 04 Apr 2026 23:26:13 GMT</pubDate><ttl>60</ttl><item><title>[Repost] A summary of common Linux troubleshooting commands</title><link>http://www.cppblog.com/qywyh/archive/2010/11/21/134208.html</link><dc:creator>豪</dc:creator><author>豪</author><pubDate>Sun, 21 Nov 2010 04:25:00 GMT</pubDate><guid>http://www.cppblog.com/qywyh/archive/2010/11/21/134208.html</guid><wfw:comment>http://www.cppblog.com/qywyh/comments/134208.html</wfw:comment><comments>http://www.cppblog.com/qywyh/archive/2010/11/21/134208.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/qywyh/comments/commentRss/134208.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/qywyh/services/trackbacks/134208.html</trackback:ping><description><![CDATA[<font  color="#3F3F3F" face="Arial, sans-serif" size="3"><span  style="border-collapse: collapse; font-size: 12px; line-height: 24px;"><div><br></div>
<div>1. Checking CPU load: mpstat</div>
<div>mpstat -P ALL [interval [count]]</div>
<div><br></div>
<div>The parameters are:</div>
<div>-P ALL: monitor all CPUs</div>
<div>interval: seconds between two consecutive samples</div>
<div>count: number of samples</div>
<div><br></div>
<div>mpstat reads its data from /proc/stat.</div>
<div>The output columns are (percentages are computed over the sampling interval):</div>
<div><br></div>
<div>CPU: processor ID</div>
<div>user: time spent in user mode, excluding processes with a negative nice value (%): Δuser/Δtotal*100</div>
<div>nice: time spent by processes with a negative nice value (%): Δnice/Δtotal*100</div>
<div>system: time spent in kernel mode (%): Δsystem/Δtotal*100</div>
<div>iowait: time spent waiting for disk I/O (%): Δiowait/Δtotal*100</div>
<div>irq: time spent servicing hardware interrupts (%): Δirq/Δtotal*100</div>
<div>soft: time spent servicing software interrupts (%): Δsoftirq/Δtotal*100</div>
<div>idle: time the CPU was idle for any reason other than waiting for disk I/O (%): Δidle/Δtotal*100</div>
<div><br></div>
<div>intr/s: number of interrupts received by the CPU per second (a rate, not a percentage): Δintr/interval</div>
<div>Total CPU working time: total_cur = user + system + nice + idle + iowait + irq + softirq</div>
<div><br></div>
<div>total_pre = pre_user + pre_system + pre_nice + pre_idle + pre_iowait + pre_irq + pre_softirq</div>
<div>user = user_cur - user_pre</div>
<div>total = total_cur - total_pre</div>
<div><br></div>
<div>Here _cur denotes the current value and _pre the value one interval earlier. All values above are reported to two decimal places.</div>
<div><br></div>
<div>2. Checking disk I/O and CPU load: vmstat</div>
<div>usage: vmstat [-V] [-n] [delay [count]]</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-V prints version.</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-n causes the headers not to be reprinted regularly.</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-a print inactive/active page stats.</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-d prints disk statistics</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-D prints disk table</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-p prints disk partition statistics</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-s prints vm table</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-m prints slabinfo</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;-S unit size</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;delay is the delay between updates in seconds.</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;unit size k:1000 K:1024 m:1000000 M:1048576 (default is K)</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;count is the number of updates.</div>
<div><br></div>
<div>vmstat obtains its data from the /proc filesystem.</div>
<div><br></div>
<div>The output fields are:</div>
<div>FIELD DESCRIPTION FOR VM MODE</div>
<div>&nbsp;&nbsp; Procs</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; r: The number of processes waiting for run time.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; b: The number of processes in uninterruptible sleep.</div>
<div><br></div>
<div>&nbsp;&nbsp; Memory</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; swpd: the amount of virtual memory used.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; free: the amount of idle memory.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; buff: the amount of memory used as buffers.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; cache: the amount of memory used as cache.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; inact: the amount of inactive memory. (-a option)</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; active: the amount of active memory. (-a option)</div>
<div><br></div>
<div>&nbsp;&nbsp; Swap</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; si: Amount of memory swapped in from disk (/s).</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; so: Amount of memory swapped to disk (/s).</div>
<div><br></div>
<div>&nbsp;&nbsp; IO</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; bi: Blocks received from a block device (blocks/s).</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; bo: Blocks sent to a block device (blocks/s).</div>
<div><br></div>
<div>&nbsp;&nbsp; System</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; in: The number of interrupts per second, including the clock.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; cs: The number of context switches per second.</div>
<div><br></div>
<div>&nbsp;&nbsp; CPU</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; These are percentages of total CPU time.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; us: Time spent running non-kernel code. (user time, including nice time)</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; sy: Time spent running kernel code. (system time)</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; st: Time spent in involuntary wait. Prior to Linux 2.6.11, shown as zero.</div>
<div><br></div>
<div>3. Checking memory usage: free</div>
<div>usage: free [-b|-k|-m|-g] [-l] [-o] [-t] [-s delay] [-c count] [-V]</div>
<div>&nbsp;&nbsp;-b,-k,-m,-g show output in bytes, KB, MB, or GB</div>
<div>&nbsp;&nbsp;-l show detailed low and high memory statistics</div>
<div>&nbsp;&nbsp;-o use old format (no -/+buffers/cache line)</div>
<div>&nbsp;&nbsp;-t display total for RAM + swap</div>
<div>&nbsp;&nbsp;-s update every [delay] seconds</div>
<div>&nbsp;&nbsp;-c update [count] times</div>
<div>&nbsp;&nbsp;-V display version information and exit</div>
<div><br></div>
<div>[root@Linux /tmp]# free</div>
<div><br></div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;total &nbsp; &nbsp; used &nbsp; &nbsp; &nbsp; &nbsp;free &nbsp; &nbsp; &nbsp; shared &nbsp; &nbsp;buffers &nbsp; cached</div>
<div>Mem: &nbsp; &nbsp; &nbsp; 255268 &nbsp; &nbsp;238332 &nbsp; &nbsp; &nbsp;16936 &nbsp; &nbsp; &nbsp; &nbsp; 0 &nbsp; &nbsp; &nbsp; &nbsp;85540 &nbsp; 126384</div>
<div>-/+ buffers/cache: &nbsp; 26408 &nbsp; &nbsp; &nbsp; 228860</div>
<div>Swap: &nbsp; &nbsp; &nbsp;265000 &nbsp; &nbsp; &nbsp;0 &nbsp; &nbsp; &nbsp; &nbsp; 265000</div>
<div><br></div>
<div>Mem: physical memory statistics.</div>
<div>-/+ buffers/cache: physical memory statistics adjusted for buffers and cache.</div>
<div>Swap: usage of the on-disk swap partition; we will not look at it here.</div>
<div>The machine has 255268 KB (256 MB) of physical memory in total, but the memory actually available is not the 16936 KB marked free in the first row: that figure only counts memory that has not been allocated at all.</div>
<div><br></div>
<div>Row 1, Mem:</div>
<div>total: total physical memory.</div>
<div>used: memory currently in use; this includes buffers and cache, parts of which are not actively needed and can be reclaimed.</div>
<div>free: memory not yet allocated.</div>
<div>shared: shared memory; generally not used by the system, and not discussed here.</div>
<div>buffers: memory used by kernel buffers.</div>
<div>cached: memory used by the page cache. The difference between buffer and cache is explained below.</div>
<div>total = used + free</div>
<div>Row 2, -/+ buffers/cache:</div>
<div>used: row 1's used - buffers - cached, i.e. the memory genuinely in use.</div>
<div>free: unallocated memory plus buffers and cache, i.e. the memory actually available right now.</div>
<div>free2 = buffers1 + cached1 + free1 (free2 is the row-2 value; buffers1, cached1 and free1 are row-1 values)</div>
<div><br></div>
<div>The difference between buffer and cache:</div>
<div>A buffer is something that has yet to be "written" to disk.</div>
<div>A cache is something that has been "read" from the disk and stored for later use.</div>
<div>The two viewpoints:</div>
<div>To the operating system (the Mem row), buffers/cached count as used, so it reports only 16936 KB as free.</div>
<div>To applications (the -/+ buffers/cache row), buffers/cached count as available: they exist to speed up file access, and are reclaimed quickly whenever an application needs the memory.</div>
<div>So from an application's point of view: available memory = free + buffers + cached.</div>
<div><br></div>
<div>swap</div>
<div>Swap is the Linux virtual-memory partition: once physical memory is exhausted, disk space (the swap partition) is used as if it were memory.</div>
<div><br></div>
<div>4. Checking the network interfaces: sar</div>
<div>See man sar for details.</div>
<div>4.1 Interface traffic: sar -n DEV delay count</div>
<div>The maximum traffic an interface can sustain is determined by the card itself: 10M, 10/100 autosensing, 100M and above, or gigabit. Ordinary servers usually have 100M cards; some use gigabit.</div>
<div><br></div>
<div>Output fields:</div>
<div>IFACE</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Name of the network interface for which statistics are reported.</div>
<div><br></div>
<div>rxpck/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of packets received per second.</div>
<div><br></div>
<div>txpck/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of packets transmitted per second.</div>
<div><br></div>
<div>rxbyt/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of bytes received per second.</div>
<div><br></div>
<div>txbyt/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of bytes transmitted per second.</div>
<div><br></div>
<div>rxcmp/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of compressed packets received per second (for cslip etc.).</div>
<div><br></div>
<div>txcmp/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of compressed packets transmitted per second.</div>
<div><br></div>
<div>rxmcst/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of multicast packets received per second.</div>
<div><br></div>
<div>4.2 Interface errors: sar -n EDEV delay count</div>
<div>Output fields:</div>
<div>IFACE</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Name of the network interface for which statistics are reported.</div>
<div><br></div>
<div>rxerr/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of bad packets received per second.</div>
<div><br></div>
<div>txerr/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Total number of errors that happened per second while transmitting packets.</div>
<div><br></div>
<div>coll/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of collisions that happened per second while transmitting packets.</div>
<div><br></div>
<div>rxdrop/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of received packets dropped per second because of a lack of space in linux buffers.</div>
<div><br></div>
<div>txdrop/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of transmitted packets dropped per second because of a lack of space in linux buffers.</div>
<div><br></div>
<div>txcarr/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of carrier-errors that happened per second while transmitting packets.</div>
<div><br></div>
<div>rxfram/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of frame alignment errors that happened per second on received packets.</div>
<div><br></div>
<div>rxfifo/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of FIFO overrun errors that happened per second on received packets.</div>
<div><br></div>
<div>txfifo/s</div>
<div>&nbsp;&nbsp; &nbsp; &nbsp; Number of FIFO overrun errors that happened per second on transmitted packets.</div>
<div><br></div>
<div><br></div>
<div>5. Locating a problem process: top, ps</div>
<div>top -d delay; see man for details.</div>
<div>ps aux shows detailed information for each process.</div>
<div>ps axf shows the process tree.</div>
<div><br></div>
<div>6. Checking which files a process is using: lsof</div>
<div>Root privileges are needed to see everything; otherwise you only see what the logged-in user is permitted to see.</div>
<div><br></div>
<div>lsof -p 77 // which files the process with PID 77 has open</div>
<div>lsof -d 4 // processes using file descriptor 4</div>
<div>lsof abc.txt // processes that have abc.txt open</div>
<div>lsof -i :22 // processes using port 22</div>
<div>lsof -i tcp // processes using the TCP protocol</div>
<div>lsof -i tcp:22 // processes using TCP port 22</div>
<div>lsof +d /tmp // files under /tmp opened by processes</div>
<div>lsof +D /tmp // same, but descends into subdirectories, which takes longer</div>
<div>lsof -u username // files opened by processes owned by user username</div>
<div><br></div>
<div>7. Tracing a program's system calls: strace</div>
<div>usage: strace [-dffhiqrtttTvVxx] [-a column] [-e expr] ... [-o file]</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;[-p pid] ... [-s strsize] [-u username] [-E var=val] ...</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;[command [arg ...]]</div>
<div>&nbsp;&nbsp;or: strace -c [-e expr] ... [-O overhead] [-S sortby] [-E var=val] ...</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;[command [arg ...]]</div>
<div><br></div>
<div>Common options:</div>
<div>-f: also trace child processes of the traced process.</div>
<div>-c: count the time, calls and errors for each system call.</div>
<div>-o file: write the trace output to file instead of stderr.</div>
<div>-p pid: attach to the running process with the given pid; commonly used to debug background processes.</div>
<div><br></div>
<div>8. Checking disk usage: df</div>
<div>test@wolf:~$ df</div>
<div>Filesystem &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 1K-blocks &nbsp; &nbsp; &nbsp;Used Available Use% Mounted on</div>
<div>/dev/sda1 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;3945128 &nbsp; 1810428 &nbsp; 1934292 &nbsp;49% /</div>
<div>udev &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;745568 &nbsp; &nbsp; &nbsp; &nbsp;80 &nbsp; &nbsp;745488 &nbsp; 1% /dev</div>
<div>/dev/sda3 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 12649960 &nbsp; 1169412 &nbsp;10837948 &nbsp;10% /usr/local</div>
<div>/dev/sda4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 63991676 &nbsp;23179912 &nbsp;37561180 &nbsp;39% /data</div>
<div><br></div>
<div>9. Checking network connections: netstat</div>
<div>Commonly used: netstat -lpn</div>
<div>Option descriptions:</div>
<div>&nbsp;-p, --programs &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; display PID/Program name for sockets</div>
<div>&nbsp;-l, --listening &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;display listening server sockets</div>
<div>&nbsp;-n, --numeric &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;don't resolve names</div>
<div>&nbsp;-a, --all, --listening &nbsp; display all sockets (default: connected)</div></span></font><img src ="http://www.cppblog.com/qywyh/aggbug/134208.html" width = 
"1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/qywyh/" target="_blank">豪</a> 2010-11-21 12:25 <a href="http://www.cppblog.com/qywyh/archive/2010/11/21/134208.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>A brief history of Consensus, 2PC and Transaction Commit.</title><link>http://www.cppblog.com/qywyh/archive/2010/08/12/123258.html</link><dc:creator>豪</dc:creator><author>豪</author><pubDate>Thu, 12 Aug 2010 15:37:00 GMT</pubDate><guid>http://www.cppblog.com/qywyh/archive/2010/08/12/123258.html</guid><wfw:comment>http://www.cppblog.com/qywyh/comments/123258.html</wfw:comment><comments>http://www.cppblog.com/qywyh/archive/2010/08/12/123258.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/qywyh/comments/commentRss/123258.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/qywyh/services/trackbacks/123258.html</trackback:ping><description><![CDATA[Notes:<br>*. <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><a href="http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf" style="color: #99aadd;">"Time, Clocks and the Ordering of Events in a Distributed System" (1978)</a></span></span><br>&nbsp; &nbsp; 1. The issue is that in a distributed system you cannot tell if event A happened before event B, unless A caused B in some way. 
Each observer can see events happen in a different order, except for events that cause each other, ie there is only a partial ordering of events in a distributed system.<br>&nbsp;&nbsp;&nbsp; 2. Lamport defines the "happens before" relationship and operator, and goes on to give an algorithm that provides a total ordering of events in a distributed system, so that each process sees events in the same order as every other process.<br>&nbsp;&nbsp;&nbsp; 3. Lamport also introduces the concept of a distributed state machine: start a set of deterministic state machines in the same state and then make sure they process the same messages in the same order. <br>&nbsp;&nbsp;&nbsp; 4. Each machine is now a replica of the others. The key problem is making each replica agree on the next message to process: a consensus problem. <br>&nbsp;&nbsp;&nbsp; 5. However, <span style="color: red;">the system is not fault tolerant;</span> if one process fails, the others have to wait for it to recover.<br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/%7EGray/papers/DBOS.pdf" style="color: #99aadd; text-decoration: none;">"Notes on Database Operating Systems" (1979)</a>.<span class="Apple-converted-space"> </span></span></span><br>&nbsp;&nbsp;&nbsp; 1. 2PC problem: Unfortunately 2PC would block if the TM (Transaction Manager) fails at the wrong time. 
<br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://www.cs.cornell.edu/courses/cs614/2004sp/papers/Ske81.pdf" style="color: #99aadd; text-decoration: none;">"NonBlocking Commit Protocols" (1981)</a><br></span></span>&nbsp;&nbsp;&nbsp; 1. 3PC problem: The problem was coming up with a workable 3PC algorithm; that would take nearly 25 years!<br><br>*. <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><a href="http://theory.lcs.mit.edu/tds/papers/Lynch/jacm85.pdf" style="color: #99aadd; text-decoration: underline;">"Impossibility of distributed consensus with one faulty process" (1985)</a><br></span></span>&nbsp;&nbsp;&nbsp; 1. This famous result is known as the "FLP" result<br>&nbsp;&nbsp;&nbsp; 2. By this time "consensus" was the name given to the problem of getting <span style="color: red;">a bunch of processors to agree on a value.</span><br>&nbsp;&nbsp;&nbsp; 3. 
The kernel of the problem is that you cannot tell the difference between a process that has stopped and one that is running very slowly, making dealing with faults in an asynchronous system almost impossible. <br>&nbsp;&nbsp;&nbsp; 4. a distributed algorithm has two properties: <span style="color: red;">safety and liveness</span>. 2PC is safe: no bad data is ever written to the databases, but its liveness properties aren't great: if the TM fails at the wrong point the system will block.<br>&nbsp;&nbsp;&nbsp; 5. The asynchronous case is more general than the synchronous case: an algorithm that works for an asynchronous system will also work for a synchronous system, but not vice versa. <br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/users/lamport/pubs/byz.pdf" style="color: #99aadd; text-decoration: none;">"The Byzantine Generals Problem" (1982)</a></span></span><br>&nbsp;&nbsp;&nbsp; 1. In this form of the consensus problem the processes can lie, and they can actively try to deceive other processes. 
<br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/%7EGray/papers/TandemTR88.6_ComparisonOfByzantineAgreementAndTwoPhaseCommit.pdf" style="color: #99aadd; text-decoration: none;">"A Comparison of the Byzantine Agreement Problem and the Transaction Commit Problem." (1987)<span class="Apple-converted-space">&nbsp;</span></a>.</span></span><br>&nbsp;&nbsp;&nbsp;  1. At the time the best consensus algorithm was the Byzantine Generals, but this was too expensive to use for transactions.<br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://infoscience.epfl.ch/getfile.py?recid=88273&amp;mode=best" style="color: #99aadd; text-decoration: none;">"Uniform consensus is harder than consensus" (2000)</a><br><span class="Apple-converted-space"></span></span></span>&nbsp;&nbsp;&nbsp;  1. 
With uniform consensus all processes must agree on a value, even the faulty ones - a transaction should only commit if all RMs are prepared to commit.<br>&nbsp;&nbsp;&nbsp; <br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/users/lamport/pubs/lamport-paxos.pdf" style="color: #99aadd; text-decoration: none;">"The Part-Time Parliament" (submitted in 1990, published 1998)</a><br></span></span>&nbsp;&nbsp;&nbsp;  1.<span style="color: red;"> Paxos consensus algorithm</span><br>&nbsp;&nbsp;&nbsp; <br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/lampson/58-Consensus/Acrobat.pdf" style="color: #99aadd; text-decoration: none;">"How to Build a Highly Available System Using Consensus" (1996)</a>.<span class="Apple-converted-space"> <br></span></span></span>&nbsp; &nbsp;  1.<span style="color: red;"></span> This paper provides a good introduction to building fault tolerant systems and Paxos. 
<br><br>*.&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/users/lamport/pubs/paxos-simple.pdf" style="color: #99aadd; text-decoration: none;">"Paxos Made Simple" (2001)</a></span></span><br>&nbsp;&nbsp;&nbsp; 1. The kernel of Paxos is that given a fixed number of processes, any majority of them must have at least one process in common. For example, given three processes A, B and C, the possible majorities are: AB, AC, or BC. If a decision is made when one majority is present, eg AB, then at any time in the future when another majority is available at least one of the processes can remember what the previous majority decided. If the majority is AB then both processes will remember, if AC is present then A will remember and if BC is present then B will remember.<br>&nbsp;&nbsp;&nbsp; 2. Paxos can tolerate lost messages, delayed messages, repeated messages, and messages delivered out of order.<br>&nbsp;&nbsp;&nbsp; 3. It will reach consensus if there is a single leader for long enough that the leader can talk to a majority of processes twice. Any process, including leaders, can fail and restart; in fact all processes can fail at the same time and the algorithm is still safe. There can be more than one leader at a time.<br>&nbsp;&nbsp;&nbsp; 4. Paxos is an asynchronous algorithm; there are no explicit timeouts. 
However, it only reaches consensus when the system is behaving in a synchronous way, ie messages are delivered in a bounded period of time; otherwise it remains safe but may not make progress. There is a pathological case where Paxos will not reach consensus, in accordance with FLP, but this scenario is relatively easy to avoid in practice.<br><br>*.&nbsp;&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://theory.lcs.mit.edu/tds/papers/Lynch/jacm88.pdf" style="color: #99aadd; text-decoration: none;">"Consensus in the presence of partial synchrony" (1988)<span class="Apple-converted-space"> </span></a><br></span></span>&nbsp;&nbsp;&nbsp; 1. There are two versions of a partially synchronous system: in one, processes run at speeds within a known range and messages are delivered in bounded time, but the actual bounds are not known a priori; in the other, the range of process speeds and the upper bound on message delivery are known a priori, but they only start to hold at some unknown time in the future. <br>&nbsp;&nbsp;&nbsp; 2. 
The partially synchronous model is a better model for the real world than either the synchronous or asynchronous model; networks function in a predictable way most of the time, but occasionally go crazy.<br>&nbsp;&nbsp;&nbsp; <br>*.&nbsp;&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://research.microsoft.com/research/pubs/view.aspx?tr_id=701" style="color: #99aadd; text-decoration: none;">"Consensus on Transaction Commit" (2005)</a>.<span class="Apple-converted-space"> <br></span></span></span>&nbsp;&nbsp;&nbsp; 1. A third phase is only required if there is a fault, in accordance with the Skeen result. Given 2n+1 TM replicas Paxos Commit will complete with up to n faulty replicas.<br>&nbsp;&nbsp;&nbsp; 2. Paxos Commit does not use Paxos to solve the transaction commit problem directly, ie it is not used to solve uniform consensus, rather it is used to make the system fault tolerant.<br>&nbsp;&nbsp;&nbsp; 3.&nbsp; Recently there has been some discussion of the<span style="color: red;"> CAP conjecture</span>: Consistency, Availability and Partition tolerance. The conjecture asserts that <span style="color: red;">you cannot have all three in a distributed system</span>: a system that is consistent, that can have faulty processes and that can handle a network partition.<br>&nbsp;&nbsp;&nbsp; 4. Now take a Paxos system with three nodes: A, B and C. We can reach consensus if two nodes are working, ie we can have consistency and availability. 
Now if C becomes partitioned and C is queried, it cannot respond because it cannot communicate with the other nodes; it doesn't know whether it has been partitioned, or if the other two nodes are down, or if the network is being very slow. The other two nodes can carry on, because they can talk to each other and they form a majority. So for the CAP conjecture, Paxos does not handle a partition because C cannot respond to queries. However, we could engineer our way around this. If we are inside a data center we can use two independent networks (Paxos doesn't mind if messages are repeated). If we are on the internet, then we could have our client query all nodes A, B and C, and if C is partitioned the client can query A or B unless it is partitioned in a similar way to C.<br>&nbsp;&nbsp;&nbsp; 5. In a synchronous network, if C is partitioned it can learn that it is partitioned if it does not receive messages in a fixed period of time, and thus can declare itself down to the client.<br><br>*.&nbsp;&nbsp; <span class="Apple-style-span" style="border-collapse: separate; color: #000000; font-family: Simsun; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"><span class="Apple-style-span" style="color: #cccccc; font-family: 'Trebuchet MS',Trebuchet,Verdana,sans-serif; line-height: 20px; text-align: left; font-size: small;"><span class="Apple-converted-space"></span><a href="http://www.allhands.org.uk/2006/proceedings/papers/624.pdf" style="color: #99aadd; text-decoration: none;">"Co-Allocation, Fault Tolerance and Grid Computing" (2006)</a>.<br><br><br></span></span><a href="http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html">[REF] http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html</a><br>                    <img src 
="http://www.cppblog.com/qywyh/aggbug/123258.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/qywyh/" target="_blank">豪</a> 2010-08-12 23:37 <a href="http://www.cppblog.com/qywyh/archive/2010/08/12/123258.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Lock-Free</title><link>http://www.cppblog.com/qywyh/archive/2010/07/20/120886.html</link><dc:creator>豪</dc:creator><author>豪</author><pubDate>Tue, 20 Jul 2010 08:58:00 GMT</pubDate><guid>http://www.cppblog.com/qywyh/archive/2010/07/20/120886.html</guid><wfw:comment>http://www.cppblog.com/qywyh/comments/120886.html</wfw:comment><comments>http://www.cppblog.com/qywyh/archive/2010/07/20/120886.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/qywyh/comments/commentRss/120886.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/qywyh/services/trackbacks/120886.html</trackback:ping><description><![CDATA[<br>
<div>A "wait-free" procedure is guaranteed to complete in a finite number of steps, regardless of the relative speeds of the other threads.<br><br>A "lock-free" procedure guarantees progress of at least one of the threads executing the procedure. That means some threads can be delayed arbitrarily, but at least one thread is guaranteed to make progress at each step.<br><br>CAS: if the map hasn't changed since I last looked at it, swap in my updated copy; otherwise, start all over again.</div>
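A minimal sketch of that CAS loop (hypothetical names, not the article's actual code), using std::atomic::compare_exchange_weak on a shared map pointer: copy the current map, update the copy, and install it only if nobody else replaced the map in the meantime.

```cpp
#include <atomic>
#include <map>
#include <string>

// Shared, lock-free-readable map: readers just load the pointer.
static std::atomic<std::map<int, std::string>*> g_map{new std::map<int, std::string>};

void lockfree_insert(int key, const std::string& value) {
    std::map<int, std::string>* old_map = g_map.load();
    std::map<int, std::string>* new_map = nullptr;
    do {
        delete new_map;                                      // discard a failed attempt's copy
        new_map = new std::map<int, std::string>(*old_map);  // copy the map...
        (*new_map)[key] = value;                             // ...and update the copy
        // CAS: install new_map only if g_map still equals old_map;
        // on failure old_map is reloaded and we start all over again.
    } while (!g_map.compare_exchange_weak(old_map, new_map));
    // old_map is leaked here on purpose: reclaiming it safely while
    // readers may still hold it is the hard part of lock-free design
    // (hazard pointers, deferred reclamation, ...).
}
```

The loop can retry an unbounded number of times if other writers keep winning the CAS, which is exactly why this is lock-free but not wait-free.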
<div><br></div>
<div><span style="font-family: Verdana,Arial,Helvetica,sans-serif; font-size: 13px;">Delayed update: In plain English, the loop says "I'll replace the old map with a new, updated one, and I'll be on the lookout for any other updates of the map, but I'll only do the replacement when the reference count of the existing map is one."</span><span style="font-family: Verdana,Arial,Helvetica,sans-serif; font-size: 13px;">&nbsp;</span></div>
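A rough sketch of that delayed-update idea (names are hypothetical, and this is a simplification: the Dr. Dobb's scheme packs the pointer and the count into a single double-width CAS, which closes the race window between the count check and the swap that this sketch still has):

```cpp
#include <atomic>

struct Node {
    explicit Node(int v) : value(v) {}
    int value;
    std::atomic<int> refs{1};   // 1 means only the global pointer holds it
};

std::atomic<Node*> g_current{nullptr};

void delayed_replace(Node* replacement) {
    for (;;) {
        Node* old_node = g_current.load();
        // Be on the lookout: only do the replacement when the reference
        // count of the existing node is one (no reader still holds it).
        if (old_node != nullptr && old_node->refs.load() != 1)
            continue;                                  // a reader is active: retry
        if (g_current.compare_exchange_weak(old_node, replacement)) {
            delete old_node;                           // count was 1: safe to free
            return;
        }
        // CAS failed: another writer got in first; start over.
    }
}
```

In the real scheme the check and the swap are one atomic step; here they are separate, so a reader could acquire the node between the `refs` load and the CAS.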
<div><span style="font-family: Verdana,Arial,Helvetica,sans-serif; font-size: 13px;"><br></span></div>
<div><br></div>
<div>[REF]<a href="http://www.drdobbs.com/cpp/184401865">http://www.drdobbs.com/cpp/184401865</a></div><img src ="http://www.cppblog.com/qywyh/aggbug/120886.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/qywyh/" target="_blank">豪</a> 2010-07-20 16:58 <a href="http://www.cppblog.com/qywyh/archive/2010/07/20/120886.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Lessons Learned from scaling Farmville</title><link>http://www.cppblog.com/qywyh/archive/2010/07/16/120552.html</link><dc:creator>豪</dc:creator><author>豪</author><pubDate>Fri, 16 Jul 2010 07:06:00 GMT</pubDate><guid>http://www.cppblog.com/qywyh/archive/2010/07/16/120552.html</guid><wfw:comment>http://www.cppblog.com/qywyh/comments/120552.html</wfw:comment><comments>http://www.cppblog.com/qywyh/archive/2010/07/16/120552.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/qywyh/comments/commentRss/120552.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/qywyh/services/trackbacks/120552.html</trackback:ping><description><![CDATA[
<span style="font-size: 12px; font-family: verdana, arial, helvetica, sans-serif; "><p class="MsoNormal" style="margin-top: 0px; "><br><o:p></o:p></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><strong>1.<span style="font-size: 7pt; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span>Interactive games are write-heavy</span></strong><span>. Typical web apps read more than they write so many common architectures may not be sufficient. Read heavy apps can often get by with a caching layer in front of a single database. Write heavy apps will need to partition so writes are spread out and/or use an in-memory architecture.</span><o:p></o:p></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><strong><span style="font-family: Verdana, sans-serif; font-size: 10pt; ">2.</span><span style="font-size: 7pt; ">&nbsp;&nbsp;&nbsp;&nbsp;</span><span>Design every component as a degradable service</span></strong><span>. Isolate components so increased latencies in one area won't ruin another. Throttle usage to help alleviate problems. Turn off features when necessary.</span><o:p></o:p></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><strong><span style="font-family: Verdana, sans-serif; font-size: 10pt; ">3.</span><span style="font-size: 7pt; ">&nbsp;&nbsp;&nbsp;&nbsp;</span><span>Cache Facebook data</span></strong><span>. 
When you are deeply dependent on an external component, consider caching that component's data to improve latency.</span><o:p></o:p></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><strong><span style="font-family: Verdana, sans-serif; font-size: 10pt; ">4.</span><span style="font-size: 7pt; ">&nbsp;&nbsp;&nbsp;&nbsp;</span><span>Plan ahead for release-related usage spikes</span></strong><span>.</span><o:p></o:p></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; ">5.<span style="font-size: 7pt; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><strong><span>Sample</span></strong><span>. When analyzing large streams of data, for example when looking for problems, not every piece of data needs to be processed. Sampling data can yield the same results for much less work.</span></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><br></p><p dir="ltr" class="MsoNormal" style="margin-top: 0px; text-indent: -0.25in; margin-left: 1in; margin-right: 0px; "><span><span  style="font-family: Georgia, 'Times New Roman', serif; font-size: 14px; color: rgb(38, 38, 38); font-style: italic; line-height: 25px; "><p style="margin-bottom: 1em; margin-top: 0em; ">The key ideas are to isolate troubled and highly latent services from causing latency and performance issues elsewhere through use of error and timeout throttling, and if needed, disable functionality in the application using on/off switches and functionality-based throttles.</p></span></span></p></span><img src ="http://www.cppblog.com/qywyh/aggbug/120552.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/qywyh/" target="_blank">豪</a> 2010-07-16 15:06 <a href="http://www.cppblog.com/qywyh/archive/2010/07/16/120552.html#Feedback" target="_blank" 
style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>php copy on write</title><link>http://www.cppblog.com/qywyh/archive/2010/05/18/115734.html</link><dc:creator>豪</dc:creator><author>豪</author><pubDate>Tue, 18 May 2010 14:45:00 GMT</pubDate><guid>http://www.cppblog.com/qywyh/archive/2010/05/18/115734.html</guid><wfw:comment>http://www.cppblog.com/qywyh/comments/115734.html</wfw:comment><comments>http://www.cppblog.com/qywyh/archive/2010/05/18/115734.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/qywyh/comments/commentRss/115734.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/qywyh/services/trackbacks/115734.html</trackback:ping><description><![CDATA[1. Non-reference assignment: if the zval pointed to by the source variable has is_ref=0, the new variable simply points to the same zval and refcount++; if is_ref=1, copy on write happens: the original zval's refcount is unchanged, and the new variable points to a fresh zval with is_ref=0, refcount=1.<br><br>2. Reference assignment: if the zval pointed to by the source variable has is_ref=0, copy on write happens: the original zval's refcount--, and the new variable and the source variable both point to a new zval with is_ref=1, refcount=2; if is_ref=1, the new variable points to the same zval directly and refcount++.<br><br><img src ="http://www.cppblog.com/qywyh/aggbug/115734.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/qywyh/" target="_blank">豪</a> 2010-05-18 22:45 <a href="http://www.cppblog.com/qywyh/archive/2010/05/18/115734.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item></channel></rss>