Lock-Free Thread Communication (0): Back to Square One, with a Glimmer of Hope

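FCC pointed out, and I now see, that the two earlier posts on lock-free communication with volatile really were defeated by optimization. The compiler and the CPU hardware will mercilessly wreck the execution order we take for granted.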

This post (0) retracts the previous two. In the upcoming posts (3) and (4), I will redesign this lock-free communication using primitives such as CAS.

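As for the compiler's optimization, my guess is that it is done for the sake of the CPU's U/V pipelines. It allows the write that modifies the state to take effect before the work it is supposed to follow, which overturns the whole approach.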

As for the question shbooom raised about "two atomic operations", a "not" was probably left out in typing.

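When I say "atomic operation", I mean it at the level of the final operation, or in terms of internal causal dependency. That is, the address computation that precedes an assignment to the state does not count as part of the assignment, and likewise the address computation done when checking the state does not count as part of the check, because the two do not affect each other. The only thing that ultimately affects the check is the single operation that writes the memory location. Any interleaving in between can at worst cause the current check to miss; the next check is certain to hit.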

[Barrier]

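I found the solution across several articles. A barrier completes all preceding reads and writes, so the write that sets the state cannot be moved into the middle of, or ahead of, the actual task's operations.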

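VC provides the intrinsics _ReadBarrier, _WriteBarrier and _ReadWriteBarrier. They keep the compiler from moving memory reads, writes, or both across the barrier, so everything before the barrier stays before it and later operations are not hoisted ahead of it.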

Other compilers should have equivalent facilities for guaranteeing execution order.

I have not yet measured what this costs in performance, though. Hopefully not too much.
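For compilers other than VC, one portable way to express the same idea is a standard C++11 fence. The following is only a minimal sketch under that assumption, not the VC intrinsics named above: `std::atomic_thread_fence` constrains both the compiler and the CPU, whereas the VC barrier intrinsics constrain only the compiler. The names `g_payload`, `g_ready`, `producer`, `consumer` and `run_handoff` are hypothetical and chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names: g_payload stands in for the data written by the
// "actual task", g_ready for the state flag discussed in the post.
int g_payload = 0;
std::atomic<bool> g_ready{false};

void producer(int value) {
    g_payload = value;                                    // the actual task's write
    std::atomic_thread_fence(std::memory_order_release);  // earlier writes stay before the flag write
    g_ready.store(true, std::memory_order_relaxed);       // set the state
}

int consumer() {
    while (!g_ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();                         // give up the time slice while waiting
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // later reads stay after the flag read
    return g_payload;
}

// Runs one producer/consumer round and returns the value the consumer saw.
int run_handoff(int value) {
    g_ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer, value);
    p.join();
    c.join();
    assert(seen == value);  // the consumer must observe the published value
    return seen;
}
```

Calling `run_handoff(42)` hands one value from the producer thread to the consumer thread. Without the two fences (or an equivalent release store and acquire load on the flag), the language gives no guarantee that the payload write is visible once the flag is seen.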

posted on 2010-05-06 16:59 by 饭中淹 | Views (2106) | Comments (9) | Category: Data and Algorithm Analysis

# re: Lock-Free Multithreaded Communication (0): Back to Square One 2010-05-06 18:01 shbooom

if(locked==false) { locked=true; doSomething();} Isn't this two atomic operations? How do you keep anyone from cutting in between the two steps?
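The gap between the check and the write in that snippet is what a compare-and-swap closes. The following is a rough sketch, assuming C++11 `std::atomic` (which the original posts did not use); `try_enter`, `leave` and `run_contended` are hypothetical helper names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<bool> locked{false};  // the flag from the snippet above, made atomic

// Returns true only for the single caller that flips locked from false to true.
bool try_enter() {
    bool expected = false;
    // The comparison and the write happen as one indivisible step, so no
    // other thread can slip in between "check" and "set".
    return locked.compare_exchange_strong(expected, true,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
}

void leave() {
    locked.store(false, std::memory_order_release);
}

// Several threads increment a plain int that is guarded only by the flag.
int run_contended(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                while (!try_enter()) {
                    std::this_thread::yield();  // someone else is inside, retry later
                }
                ++counter;                      // plays the role of doSomething()
                leave();
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter;
}
```

If the check and the write were two separate steps on a plain bool, two threads could both pass the check and some increments in `run_contended` could be lost. With the CAS, the count always comes out to `threads * iterations`.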

# re: Lock-Free Multithreaded Communication (0): Back to Square One 2010-05-06 18:36 饭中淹

@shbooom
It's not locked, it's doSomething.
Thread A
if(doSomething==true) { DoSomething(); doSomething=false;}

Thread B
if(doSomething==false) { DoOtherthing(); doSomething=true;}

While thread A is in DoSomething(), thread B cannot get past its check, so DoOtherthing() cannot interfere.

Or, going with your locked,
change it to this:
Thread A
if(locked==false) { doSomething(); locked=true;}
Thread B
if(locked) { doOtherthing(); locked = false;}

Inside A's braces, B's doOtherthing cannot affect A's doSomething.
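The turn-taking pattern in this reply can be sketched as follows. This is a minimal illustration assuming C++11 `std::atomic` with acquire/release ordering in place of a plain or volatile flag; `trace`, `threadA`, `threadB` and `run_ping_pong` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>

std::atomic<bool> doSomething{false};  // the state flag from the reply above
std::string trace;                     // hypothetical shared data touched by both threads

// Thread A: works only while the flag is true, then hands the turn to B.
void threadA(int rounds) {
    for (int done = 0; done < rounds; ) {
        if (doSomething.load(std::memory_order_acquire)) {
            trace += 'A';                                      // DoSomething()
            doSomething.store(false, std::memory_order_release);
            ++done;
        } else {
            std::this_thread::yield();
        }
    }
}

// Thread B: works only while the flag is false, then hands the turn to A.
void threadB(int rounds) {
    for (int done = 0; done < rounds; ) {
        if (!doSomething.load(std::memory_order_acquire)) {
            trace += 'B';                                      // DoOtherthing()
            doSomething.store(true, std::memory_order_release);
            ++done;
        } else {
            std::this_thread::yield();
        }
    }
}

std::string run_ping_pong(int rounds) {
    trace.clear();
    doSomething.store(false);
    std::thread a(threadA, rounds);
    std::thread b(threadB, rounds);
    a.join();
    b.join();
    assert(trace.size() == static_cast<std::size_t>(2 * rounds));
    return trace;
}
```

Because each thread only ever flips the flag in one direction, the two threads strictly alternate. The release store and acquire load are what make the other thread's write to `trace` visible before the turn changes hands.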


# re: 无锁多线程通信(0)：回到起点，又见曙光 2010-05-06 18:58 cpm

Intel Architectures Software Developer's Manual Volume 3A System Programming Guide Chapter 7 Section 2
7.2.2 Memory Ordering in P6 and More Recent Processor Families
The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, and P6 family processors also use a processor-ordered memory-ordering model that can be further defined as "write ordered with store-buffer forwarding." This model can be characterized as follows.
In a single-processor system for memory regions defined as write-back cacheable, the following ordering principles apply (Note the memory-ordering principles for single-processor and multiple-processor systems are written from the perspective of software executing on the processor, where the term "processor" refers to a logical processor. For example, a physical processor supporting multiple cores and/or HyperThreading Technology is treated as a multi-processor systems.):
1 Reads are not reordered with other reads.
2 Writes are not reordered with older reads.
3 Writes to memory are not reordered with other writes, with the exception of writes executed with the CLFLUSH instruction and streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD).
4 Reads may be reordered with older writes to different locations but not with older writes to the same location.
5 Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.
6 Reads cannot pass LFENCE and MFENCE instructions.
7 Writes cannot pass SFENCE and MFENCE instructions.
In a multiple-processor system, the following ordering principles apply:
1 Individual processors use the same ordering principles as in a single-processor system.
2 Writes by a single processor are observed in the same order by all processors.
3 Writes from an individual processor are NOT ordered with respect to the writes from other processors.
4 Memory ordering obeys causality (memory ordering respects transitive visibility).
5 Writes to the same location have a total order.
6 Locked instructions have a total order.

So in fact only a write followed by a later read (to a different address) can be executed out of order.
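The one reordering the manual allows, a read passing an older write to a different location, is usually illustrated with the "store buffering" test. The following is a sketch assuming C++11 `std::atomic`; `x`, `y` and `count_both_zero` are hypothetical names. With `std::memory_order_seq_cst` the outcome "both threads read 0" is forbidden by the language. With `std::memory_order_relaxed` it is permitted and may show up on x86 hardware, though how often depends on timing.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};

// Runs the store-buffering test `iterations` times and counts how often both
// threads read 0, i.e. how often each read appeared to pass the older write.
// Pass std::memory_order_relaxed or std::memory_order_seq_cst as `order`.
int count_both_zero(int iterations, std::memory_order order) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        x.store(0);
        y.store(0);
        int r1 = -1, r2 = -1;
        std::thread t1([&] { x.store(1, order); r1 = y.load(order); });  // write x, then read y
        std::thread t2([&] { y.store(1, order); r2 = x.load(order); });  // write y, then read x
        t1.join();
        t2.join();
        assert(r1 != -1 && r2 != -1);  // both threads ran their read
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

With `std::memory_order_seq_cst` the function always returns 0. Any nonzero count under `std::memory_order_relaxed` is the write-then-read reordering described above.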

# re: 无锁多线程通信(0)：回到起点，又见曙光 2010-05-06 19:07 cpm

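I looked at your earlier source code. If thread B keeps using the data while thread A needs to put data in, thread A will take 100% of a CPU. The system runs more than just threads A and B. With a system-provided critical section, thread A would trap into the kernel and give up the CPU, and thread B would wake A once it has finished, which improves overall system utilization.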
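The blocking behavior described here can be sketched with the C++11 standard library. This is an assumption-laden illustration that uses `std::mutex` and `std::condition_variable` rather than the Win32 critical section mentioned above; `Mailbox` and `sum_through_mailbox` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-slot mailbox: put() sleeps while the slot is full and
// take() sleeps while it is empty, so neither side burns CPU while waiting.
class Mailbox {
public:
    void put(int value) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return !full_; });  // blocks instead of spinning
        slot_ = value;
        full_ = true;
        not_empty_.notify_one();                           // wake the consumer
    }

    int take() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return full_; });
        full_ = false;
        not_full_.notify_one();                            // wake the producer
        return slot_;
    }

private:
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
    int slot_ = 0;
    bool full_ = false;
};

// Sends 1..n through the mailbox from a producer thread and sums what arrives.
int sum_through_mailbox(int n) {
    assert(n >= 0);
    Mailbox box;
    std::thread producer([&box, n] {
        for (int i = 1; i <= n; ++i) {
            box.put(i);
        }
    });
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += box.take();
    }
    producer.join();
    return sum;
}
```

The trade-off is the one raised in this comment: a waiting thread costs a trip into the kernel and a wake-up, but it leaves the CPU free for the other threads in the system.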

# re: 无锁多线程通信(0)：回到起点，又见曙光 2010-05-06 19:22 饭中淹

@cpm
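The assignments in the code sit next to each other and target different variables, so they may be executed out of order. Then, when the other thread sees the state that lets it proceed, the other writes may not have completed yet, and things go wrong.

So before modifying the state, a write barrier is needed to rule out reordering of the other writes, ensuring that all writes have completed by the time the state is modified.

Also, the sample code is only meant for analysis. In real use you still need some means of letting the system schedule the threads. That is the most basic common sense.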

# re: Lock-Free Thread Communication (0): Back to Square One, with a Glimmer of Hope 2010-05-07 15:24 cpm

@饭中淹
3 Writes to memory are not reordered with other writes, with the exception ...
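So writes are not reordered with other writes (the exceptions do not arise here).
Also:
2 Writes by a single processor are observed in the same order by all processors.
So one thread's writes are observed in order by all threads (for example, if a thread writes a and then b, every thread observes a being written first and b second).
So in this case no memory barrier is needed. As long as the variables are declared volatile so that the compiler does not optimize them, there is no problem.

Also, the reason your code can run without locks is that the threads that normally need a multithreading lock are peers, whereas your example is really a single-threaded problem dressed up in multithreaded clothing. Thread A only ever changes iState from false to true, and thread B only ever changes iState from true to false. In that case, why not have thread A do the processing itself right after the check?
For problems that truly need multithreaded synchronization (that is, problems that cannot be reduced to a single thread), locks are unavoidable. Even lock-free programming uses hardware locks.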

# re: Lock-Free Thread Communication (0): Back to Square One, with a Glimmer of Hope 2010-05-07 16:57 饭中淹

@cpm
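What I am discussing here is a multithreaded communication problem whose individual tasks are linear.
My view is:

Every multithreading problem can be broken down into individual linear tasks.

A problem that cannot be broken down into linear tasks does not exist.

Locks are unavoidable; even a state flag is a kind of lock.

"Lock-free" here means not using the locks provided by the system.

Whatever the multithreading problem, a lock is there to guarantee that linear tasks execute linearly.

So the problems you describe, ones that cannot be reduced to a single thread, do not exist.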

# re: Lock-Free Thread Communication (0): Back to Square One, with a Glimmer of Hope 2010-05-07 17:03 饭中淹

@cpm
Also, I plan to use barriers in place of volatile, so that all reads and writes can still be optimized while updates remain visible in time.

# re: Lock-Free Thread Communication (0): Back to Square One, with a Glimmer of Hope 2010-06-02 10:04 maybetrueness

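Why not use TBB, Intel's threading library? It is already used in multiple products and is not experimental code.
It has a template, atomic<T>, for declaring atomic variables in place of volatile, which ensures both correctness and portability. I trust Intel's engineers have already considered the issues you are worried about and made sure they are not a problem.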
