Question

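I wrote a simple program to test memory synchronization. It uses a global queue shared between two processes, and binds the two processes to different cores. My code is below.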

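 #define _GNU_SOURCE   // must come before the includes to expose cpu_set_t and sched_setaffinity
 #include <stdio.h>
 #include <sched.h>
 #include <unistd.h>   // fork, usleep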

 void bindcpu(int pid) {
     int cpuid;
     cpu_set_t mask;
     cpu_set_t get;
     CPU_ZERO(&mask);

     if (pid > 0) {
         cpuid = 1;
     } else {
         cpuid = 5;
     }

     CPU_SET(cpuid, &mask);

     if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
         printf("warning: could not set CPU affinity, continuing...\n");
     }
 }

 #define Q_LENGTH 512
 int g_queue[512];

 struct point {
     int volatile w;
     int volatile r;
 };  

 volatile struct point g_p;


 void iwrite(int x) {
     while (g_p.r == g_p.w);
     usleep(100000);   // 0.1 s; sleep(0.1) would truncate to sleep(0)
     g_queue[g_p.w] = x;
     g_p.w = (g_p.w + 1) % Q_LENGTH;
     printf("#%d!%d", g_p.w, g_p.r);
 }

 void iread(int *x) {
     while (((g_p.r + 1) % Q_LENGTH) == g_p.w);
     *x = g_queue[g_p.r];
     g_p.r = (g_p.r + 1) % Q_LENGTH;
     printf("-%d*%d", g_p.r, g_p.w);
 }

 int main(int argc, char * argv[]) {
     //int num = sysconf(_SC_NPROCESSORS_CONF);
     int pid;

     pid = fork();
     g_p.r = Q_LENGTH;
     bindcpu(pid);
     int i = 0, j = 0;

     if (pid > 0) {
         printf("call iread\n");
         while (1) {
             iread(&j);
         }
     } else {
         printf("call iwrite\n");
         while (1) {
             iwrite(i);
             i++;
         }
     }

 }

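On an Intel(R) Xeon(R) CPU E3-1230, the data is not synchronized between the two processes running on the two cores.

CPU: Intel(R) Xeon(R) CPU E3-1230. OS: 3.8.0-35-generic #50~precise1-Ubuntu SMP

I want to know: apart from IPC, how can I synchronize data between different cores in user space?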

Answer 1

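If you want your application to manipulate the CPU's shared cache in order to accomplish IPC, I don't believe you will be able to do that.

Chapter 9 of "Linux Kernel Development, Second Edition" has information on synchronizing multi-threaded applications (including atomic operations, semaphores, barriers, etc.): http://www.makelinux.net/books/lkd2/ch09

So you may get some idea of what you are looking for there.

Here is a decent write-up on Intel® Smart Cache, "Software Techniques for Shared-Cache Multi-Core Systems": http://archive.is/hm0y

Here are some Stack Overflow questions/answers that may help you find the information you are looking for: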

Storing C/C++ variables in processor cache instead of system memory

C++: Working with the CPU cache

Understanding how the CPU decides what gets loaded into cache memory

Sorry for bombarding you with links, but without a clearer understanding of what you are trying to achieve, this is the best I can do.

Answer 2

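I suggest reading "Volatile: Almost Useless for Multi-Threaded Programming" to see why volatile should be removed from the example code. Instead, use C11 or C++11 atomic operations. See also the Fenced Data Transfer example in the TBB Design Patterns manual.

Below I show the parts of the question's example that I changed to use C++11 atomics. I compiled it with g++ 4.7.2.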

 #include <atomic>

...

  // The type must be defined before the global that uses it.
  struct point {
      std::atomic<int> w;
      std::atomic<int> r;
  };

  struct point g_p;

 void iwrite(int x) {
     int w = g_p.w.load(std::memory_order_relaxed);
     int r;
     while ((r=g_p.r.load(std::memory_order_acquire)) == w);
     usleep(100000);   // 0.1 s; sleep(0.1) would truncate to sleep(0); needs <unistd.h>
     g_queue[w] = x;
     w = (w+1)%Q_LENGTH;
     g_p.w.store( w, std::memory_order_release);
     printf("#%d!%d", w, r);
 }

 void iread(int *x) {
     int r = g_p.r.load(std::memory_order_relaxed);
     int w;
     while (((r + 1) % Q_LENGTH) == (w=g_p.w.load(std::memory_order_acquire)));
     *x = g_queue[r];
     g_p.r.store( (r + 1) % Q_LENGTH, std::memory_order_release );
     printf("-%d*%d", r, w);
 }

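The key changes are:

I removed "volatile" everywhere.
The members of struct point are declared as std::atomic.
Some loads and stores of g_p.r and g_p.w are fenced. Others are hoisted.
When loading a variable that is modified by another thread, the code "snapshots" it into a local variable.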

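The code uses "relaxed loads" (no fence) where a thread loads a variable that no other thread can modify. I hoisted these loads out of the spin loops, since there is no point in repeating them.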

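The code uses an "acquire load" where a thread loads a "message is ready" indication that is set by another thread, and a "release store" where it stores a "message is ready" indication to be read by another thread. The release is necessary to ensure that the "message" (the queue data) is written before the "ready" indication (a member of g_p) is written. The acquire is likewise necessary to ensure that the "message" is read only after the "ready" indication has been seen.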
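The acquire/release pairing described above can be sketched in isolation. This is a minimal illustration, not the answer's code: it assumes two threads in one process, and the names `g_message`, `g_ready`, and `transfer_once` are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names for illustration; not from the original answer.
static int g_message = 0;                  // the "message": plain, non-atomic data
static std::atomic<bool> g_ready(false);   // the "message is ready" indication

// Publishes `value` from the calling thread and returns what a second thread received.
int transfer_once(int value) {
    g_ready.store(false, std::memory_order_relaxed);  // reset; the consumer is not running yet
    int received = -1;

    std::thread consumer([&received] {
        // Acquire load: once `true` is seen, the earlier write to g_message is visible too.
        while (!g_ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = g_message;
    });

    g_message = value;                               // write the message first
    g_ready.store(true, std::memory_order_release);  // then publish the indication

    consumer.join();  // join() synchronizes, so reading `received` afterwards is safe
    return received;
}
```

Calling `transfer_once(42)` from `main` should return 42. With g++ the sketch would typically be built with `-std=c++11 -pthread`.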

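The snapshots are used so that printf reports the value that the thread actually used, as opposed to some new value that appeared later. In general, I like to use the snapshot style for two reasons. First, touching shared memory can be expensive because it often requires cache-line transfers. Second, the style gives me a stable value to use locally without having to worry that a reread might return a different value.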
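Putting the pieces together, the following is a rough sketch of a single-producer/single-consumer ring that uses hoisted relaxed loads, acquire/release pairs, and local snapshots. It rests on several assumptions that are not in the original code: the two sides are threads sharing one address space (after `fork()`, each process gets its own private copy of ordinary globals, so processes would need the queue placed in shared memory), the ring uses the conventional "empty when r == w, full when (w + 1) % length == r" conditions with both indices starting at 0, and all names (`kLength`, `g_ring`, `ring_write`, `ring_read`, `run_ring`) are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Illustrative single-producer/single-consumer ring; all names are hypothetical.
static const int kLength = 8;
static int g_ring[kLength];
static std::atomic<int> g_w(0);   // modified only by the producer
static std::atomic<int> g_r(0);   // modified only by the consumer

void ring_write(int x) {
    int w = g_w.load(std::memory_order_relaxed);  // own index: relaxed, loaded once
    int next = (w + 1) % kLength;
    // Full while the next slot is the one the consumer has not freed yet.
    while (next == g_r.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    g_ring[w] = x;                                // write the message
    g_w.store(next, std::memory_order_release);   // then publish it
}

int ring_read() {
    int r = g_r.load(std::memory_order_relaxed);  // own index: relaxed, loaded once
    // Empty while the producer has not published past r.
    while (r == g_w.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int x = g_ring[r];                            // read the message
    g_r.store((r + 1) % kLength, std::memory_order_release);  // then free the slot
    return x;
}

// Sends 0..count-1 through the ring and returns what the consumer saw, in order.
std::vector<int> run_ring(int count) {
    g_w.store(0);
    g_r.store(0);
    std::vector<int> out;
    std::thread consumer([&out, count] {
        for (int i = 0; i < count; ++i) out.push_back(ring_read());
    });
    for (int i = 0; i < count; ++i) ring_write(i);
    consumer.join();
    return out;
}
```

The release store on `g_r` matters as much as the one on `g_w`: it keeps the consumer's read of a slot ordered before the producer is allowed to overwrite that slot. With g++ the sketch would typically be built with `-std=c++11 -pthread`.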

How to synchronize data between different cores on a Xeon (how does Linux use memory barriers)
