The background knowledge for this post is here.
There is a question on Stack Overflow whose answer contains the following paragraphs:
At the same time, x86 defines quite a strict memory model, which bans most possible reorderings, roughly summarized as follows:
Stores have a single global order of visibility, observed consistently by all CPUs, subject to one loosening of this rule below.
Local load operations are never reordered with respect to other local load operations.
Local store operations are never reordered with respect to other local store operations (i.e., a store that appears earlier in the instruction stream always appears earlier in the global order).
Local load operations may be reordered with respect to earlier local store operations, such that the load appears to execute earlier wrt the global store order than the local store, but the reverse (earlier load, older store) is not true.
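In short, x86 uses a fairly strong memory model: the only reordering that can be observed is a load moving ahead of an earlier store to a different location (store-load reordering).
Watching everyone grind through memorizing these rules for interviews looks painful, so this post offers a hands-on way to verify them directly.
When perfbook explains memory-barrier concepts, it uses a tool called litmus throughout. The tool now ships as part of herdtools, so once herdtools is installed you already have litmus, and every load/store reordering case mentioned above can be tested with it.
Read-write (load-store) reordering test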
X86 OOO
{ x=0; y=0; }
P0 | P1 ;
MOV EAX,[y] | MOV EAX,[x] ;
MOV [x],$1 | MOV [y],$1 ;
locations [x;y;]
exists (0:EAX=1 /\ 1:EAX=1)
%%%%%%%%%%%%%%%%%%%%%%%%%
% Results for sb.litmus %
%%%%%%%%%%%%%%%%%%%%%%%%%
X86 OOO
{x=0; y=0;}
P0 | P1 ;
MOV EAX,[y] | MOV EAX,[x] ;
MOV [x],$1 | MOV [y],$1 ;
locations [x; y;]
exists (0:EAX=1 /\ 1:EAX=1)
Generated assembler
##START _litmus_P0
movl -4(%rsi,%rcx,4), %eax
movl $1, -4(%rbx,%rcx,4)
##START _litmus_P1
movl -4(%rbx,%rcx,4), %eax
movl $1, -4(%rsi,%rcx,4)
Test OOO Allowed
Histogram (2 states)
500000:>0:EAX=1; 1:EAX=0; x=1; y=1;
500000:>0:EAX=0; 1:EAX=1; x=1; y=1;
No
Witnesses
Positive: 0, Negative: 1000000
Condition exists (0:EAX=1 /\ 1:EAX=1) is NOT validated
Hash=7cdd62e8647b817c1615cf8eb9d2117b
Observation OOO Never 0 1000000
Time OOO 0.14
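The Witnesses line is just the Histogram re-counted against the `exists` condition. As a minimal sketch of that bookkeeping, the snippet below parses histogram lines in the format shown above and recomputes the positive and negative counts. `parse_histogram` and `witnesses` are hypothetical helper names for illustration only; they are not part of herdtools.

```python
import re

# Histogram lines copied from the sb.litmus run above.
HISTOGRAM = """\
500000:>0:EAX=1; 1:EAX=0; x=1; y=1;
500000:>0:EAX=0; 1:EAX=1; x=1; y=1;
"""


def parse_histogram(text):
    """Return a list of (count, state) pairs; state maps names to ints."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Layout: "<count> [*]:> name=value; name=value; ..."
        head, _, body = line.partition(">")
        count = int(re.match(r"\d+", head).group())
        state = {}
        for item in body.split(";"):
            item = item.strip()
            if item:
                name, value = item.split("=")
                state[name.strip()] = int(value)
        rows.append((count, state))
    return rows


def witnesses(rows, condition):
    """Sum the counts of states that do / do not satisfy the condition."""
    positive = sum(count for count, state in rows if condition(state))
    negative = sum(count for count, state in rows if not condition(state))
    return positive, negative


rows = parse_histogram(HISTOGRAM)
# The test's condition: exists (0:EAX=1 /\ 1:EAX=1)
print(witnesses(rows, lambda s: s["0:EAX"] == 1 and s["1:EAX"] == 1))
# (0, 1000000)
```

Zero positive witnesses over one million runs is what "Never" in the Observation line reports: the read-then-write reordering was not observed.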
Write-read (store-load) reordering test
X86 OOO
{ x=0; y=0; }
P0 | P1 ;
MOV [x],$1 | MOV [y],$1 ;
MOV EAX,[y] | MOV EAX,[x] ;
locations [x;y;]
exists (0:EAX=0 /\ 1:EAX=0)
%%%%%%%%%%%%%%%%%%%%%%%%%%
% Results for sb2.litmus %
%%%%%%%%%%%%%%%%%%%%%%%%%%
X86 OOO
{x=0; y=0;}
P0 | P1 ;
MOV [x],$1 | MOV [y],$1 ;
MOV EAX,[y] | MOV EAX,[x] ;
locations [x; y;]
exists (0:EAX=0 /\ 1:EAX=0)
Generated assembler
##START _litmus_P0
movl $1, -4(%rbx,%rcx,4)
movl -4(%rsi,%rcx,4), %eax
##START _litmus_P1
movl $1, -4(%rsi,%rcx,4)
movl -4(%rbx,%rcx,4), %eax
Test OOO Allowed
Histogram (4 states)
2 *>0:EAX=0; 1:EAX=0; x=1; y=1;
499998:>0:EAX=1; 1:EAX=0; x=1; y=1;
499999:>0:EAX=0; 1:EAX=1; x=1; y=1;
1 :>0:EAX=1; 1:EAX=1; x=1; y=1;
Ok
Witnesses
Positive: 2, Negative: 999998
Condition exists (0:EAX=0 /\ 1:EAX=0) is validated
Hash=2d53e83cd627ba17ab11c875525e078b
Observation OOO Sometimes 2 999998
Time OOO 0.12
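Why can both loads return 0 here? One common way to picture it is a per-CPU store buffer: a store sits in the local buffer for a while before it reaches memory, so a later load of a different location can read memory first. The following is a rough sketch under that assumption. It is a simplified toy model, not a description of real hardware or of how litmus works, and `explore`, `set_var` and `with_item` are hypothetical names. It exhaustively enumerates every execution of the test above, once without store buffers and once with them.

```python
def set_var(mem, var, val):
    """Return a new immutable memory with var set to val."""
    updated = dict(mem)
    updated[var] = val
    return tuple(sorted(updated.items()))


def with_item(tup, i, item):
    """Return a copy of tuple tup with position i replaced by item."""
    return tup[:i] + (item,) + tup[i + 1:]


def explore(programs, use_store_buffers):
    """Enumerate all final register states of a small litmus-style test.

    programs holds one list per thread of ("store", var, value) or
    ("load", reg, var). Memory is assumed to start as x=0, y=0.
    """
    n = len(programs)
    outcomes, seen = set(), set()

    def step(pcs, mem, bufs, regs):
        key = (pcs, mem, bufs, regs)
        if key in seen:
            return
        seen.add(key)
        if all(pcs[t] == len(programs[t]) for t in range(n)) and not any(bufs):
            outcomes.add(regs)
            return
        for t in range(n):
            # Choice 1: thread t executes its next instruction in program order.
            if pcs[t] < len(programs[t]):
                op = programs[t][pcs[t]]
                next_pcs = with_item(pcs, t, pcs[t] + 1)
                if op[0] == "store":
                    _, var, val = op
                    if use_store_buffers:
                        buf = bufs[t] + ((var, val),)
                        step(next_pcs, mem, with_item(bufs, t, buf), regs)
                    else:
                        step(next_pcs, set_var(mem, var, val), bufs, regs)
                else:
                    _, reg, var = op
                    val = dict(mem)[var]
                    # A thread sees its own newest buffered store to var.
                    for bvar, bval in bufs[t]:
                        if bvar == var:
                            val = bval
                    step(next_pcs, mem, bufs, regs | {(f"{t}:{reg}", val)})
            # Choice 2: the oldest buffered store of thread t reaches memory.
            if bufs[t]:
                var, val = bufs[t][0]
                drained = with_item(bufs, t, bufs[t][1:])
                step(pcs, set_var(mem, var, val), drained, regs)

    init_mem = tuple(sorted({"x": 0, "y": 0}.items()))
    step((0,) * n, init_mem, ((),) * n, frozenset())
    return outcomes


def eax_pairs(outcomes):
    """Summarize outcomes as sorted (0:EAX, 1:EAX) pairs."""
    return sorted((dict(o)["0:EAX"], dict(o)["1:EAX"]) for o in outcomes)


SB = [
    [("store", "x", 1), ("load", "EAX", "y")],  # P0
    [("store", "y", 1), ("load", "EAX", "x")],  # P1
]
print(eax_pairs(explore(SB, use_store_buffers=False)))
# [(0, 1), (1, 0), (1, 1)]
print(eax_pairs(explore(SB, use_store_buffers=True)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

In this model the extra `(0, 0)` state appears only when store buffers are enabled, which lines up with the four states in the histogram above, where `0:EAX=0; 1:EAX=0` was seen 2 times in one million runs.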
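Read-read and write-write reordering test
I could not think of a particularly good way to test these separately, so read-read and write-write are tested together. If either WW reordering or RR reordering took place, P0 could observe EAX = 2 and EBX = 0.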
X86 OOO
{ x=0; y=0; }
P0 | P1 ;
MOV EAX,[x] | MOV [y],$1 ;
MOV EBX,[y] | MOV [x],$2 ;
locations [x;y;]
exists (0:EAX=2 /\ 0:EBX=0)
%%%%%%%%%%%%%%%%%%%%%%%%%%
% Results for sb3.litmus %
%%%%%%%%%%%%%%%%%%%%%%%%%%
X86 OOO
{x=0; y=0;}
P0 | P1 ;
MOV EAX,[x] | MOV [y],$1 ;
MOV EBX,[y] | MOV [x],$2 ;
locations [x; y;]
exists (0:EAX=2 /\ 0:EBX=0)
Generated assembler
##START _litmus_P0
movl -4(%rbx,%rcx,4), %eax
movl -4(%rdx,%rcx,4), %r11d
##START _litmus_P1
movl $1, -4(%rdi,%rax,4)
movl $2, -4(%rcx,%rax,4)
Test OOO Allowed
Histogram (3 states)
500000:>0:EAX=0; 0:EBX=0; x=2; y=1;
1 :>0:EAX=0; 0:EBX=1; x=2; y=1;
499999:>0:EAX=2; 0:EBX=1; x=2; y=1;
No
Witnesses
Positive: 0, Negative: 1000000
Condition exists (0:EAX=2 /\ 0:EBX=0) is NOT validated
Hash=74f6930f2a61d6cfec9fb5ea3132555e
Observation OOO Never 0 1000000
Time OOO 0.11
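The reasoning behind this combined test can be checked by brute force. The sketch below is only an illustration (the helper names `interleavings` and `outcomes` are made up here): it enumerates every interleaving of the two threads that keeps each thread's own instruction order, then repeats the enumeration with P0's loads swapped (a simulated RR reordering) and with P1's stores swapped (a simulated WW reordering).

```python
from itertools import combinations


def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that keeps each thread's own order."""
    total = len(p0) + len(p1)
    for slots in combinations(range(total), len(p0)):
        it0, it1 = iter(p0), iter(p1)
        yield [next(it0) if i in slots else next(it1) for i in range(total)]


def outcomes(p0, p1):
    """Return the set of (EAX, EBX) values P0 can end up with."""
    results = set()
    for schedule in interleavings(p0, p1):
        mem = {"x": 0, "y": 0}
        regs = {}
        for kind, a, b in schedule:
            if kind == "store":
                mem[a] = b        # ("store", var, value)
            else:
                regs[a] = mem[b]  # ("load", reg, var)
        results.add((regs["EAX"], regs["EBX"]))
    return results


LOADS = [("load", "EAX", "x"), ("load", "EBX", "y")]  # P0
STORES = [("store", "y", 1), ("store", "x", 2)]       # P1

in_order = outcomes(LOADS, STORES)
rr_swapped = outcomes(LOADS[::-1], STORES)  # pretend the loads reordered
ww_swapped = outcomes(LOADS, STORES[::-1])  # pretend the stores reordered

print(sorted(in_order))        # [(0, 0), (0, 1), (2, 1)]
print((2, 0) in rr_swapped)    # True
print((2, 0) in ww_swapped)    # True
```

The in-order enumeration gives exactly the three states in the histogram above, and `(2, 0)` only becomes reachable once one of the two pairs is swapped. So never observing `0:EAX=2; 0:EBX=0` is consistent with neither reordering happening on this machine, although a test run can only fail to observe an outcome, not prove it impossible.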