Cache Friendliness and Data Locality

Learning Objectives

Understand the CPU cache hierarchy
Know when to use AOS versus SOA layouts
Understand data locality

Core Concepts

Cache Lines

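use std::mem;

// A cache line is 64 bytes on most x86-64 and many ARM CPUs (some use 128).
const CACHE_LINE_SIZE: usize = 64;

// Align the struct to a cache-line boundary so it never straddles two lines.
// repr(align) takes a literal, so it cannot refer to CACHE_LINE_SIZE.
#[repr(align(64))]
struct CacheLine {
    data: [u8; CACHE_LINE_SIZE],
}

// Compile-time check that the alignment matches the constant.
const _: () = assert!(mem::align_of::<CacheLine>() == CACHE_LINE_SIZE);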
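A side effect of `#[repr(align(64))]` that is easy to overlook: the struct's size is rounded up to a multiple of its alignment, so even a tiny field ends up occupying a whole 64-byte line. The sketch below checks this with `std::mem::size_of` and `std::mem::align_of`. The names `PaddedCounter`, `layout_of_padded_counter`, and `stride_between_counters` are hypothetical and exist only for this illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical example type: one 8-byte counter given a whole cache line.
#[repr(align(64))]
pub struct PaddedCounter {
    pub value: u64,
}

/// Returns (size, alignment) of `PaddedCounter` in bytes.
pub fn layout_of_padded_counter() -> (usize, usize) {
    (size_of::<PaddedCounter>(), align_of::<PaddedCounter>())
}

/// Distance in bytes between two neighbouring array elements.
pub fn stride_between_counters() -> usize {
    let counters = [PaddedCounter { value: 0 }, PaddedCounter { value: 1 }];
    let first = &counters[0] as *const PaddedCounter as usize;
    let second = &counters[1] as *const PaddedCounter as usize;
    second - first
}
```

Calling `layout_of_padded_counter()` should report a size and alignment of 64 each, which means an array of such counters spends 56 of every 64 bytes on padding. That trade is usually only worth making when each element is written by a different thread.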

AOS vs SOA

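// AOS: Array of Structures (the object-oriented habit)
struct PointAOS {
    x: f64,
    y: f64,
    z: f64,
}
let points: Vec<PointAOS> = vec![/* ... */];
// Each point's x, y, z are adjacent, but consecutive x values are 24 bytes
// apart, so a loop that reads only x also drags y and z through the cache.

// SOA: Structure of Arrays (the data-oriented habit)
struct PointsSOA {
    x: Vec<f64>,
    y: Vec<f64>,
    z: Vec<f64>,
}
// All x values are contiguous, so iterating over x is cache-friendly.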
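To make the two layouts concrete, here is a minimal sketch that sums the x coordinate under each one. The types mirror the definitions above but use hypothetical names (`PointAos`, `PointsSoa`), and the helpers `from_aos`, `sum_x_aos`, and `sum_x_soa` are assumptions added for illustration. Both functions return the same value; only the memory access pattern differs.

```rust
/// AOS layout: one struct per point (24 bytes each).
pub struct PointAos {
    pub x: f64,
    pub y: f64,
    pub z: f64,
}

/// SOA layout: one contiguous vector per field.
pub struct PointsSoa {
    pub x: Vec<f64>,
    pub y: Vec<f64>,
    pub z: Vec<f64>,
}

impl PointsSoa {
    /// Converts an AOS slice into the SOA layout.
    pub fn from_aos(points: &[PointAos]) -> Self {
        PointsSoa {
            x: points.iter().map(|p| p.x).collect(),
            y: points.iter().map(|p| p.y).collect(),
            z: points.iter().map(|p| p.z).collect(),
        }
    }
}

/// Reads x with a 24-byte stride; y and z share the fetched cache lines.
pub fn sum_x_aos(points: &[PointAos]) -> f64 {
    points.iter().map(|p| p.x).sum()
}

/// Reads x from a dense vector; every fetched byte is an x value.
pub fn sum_x_soa(points: &PointsSoa) -> f64 {
    points.x.iter().sum()
}
```

Whether the SOA version is measurably faster depends on element count, cache sizes, and what else the loop does, so it is worth benchmarking on the target machine rather than assuming a fixed speedup.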

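A Practical Comparison

// AOS
struct Entity {
    position: [f32; 3],
    velocity: [f32; 3],
    health: f32,
}

// Each Entity is 28 bytes. A pass that needs only positions still pulls
// velocity and health into cache, wasting more than half of every line.

// SOA
struct Entities {
    positions: Vec<[f32; 3]>,
    velocities: Vec<[f32; 3]>,
    health: Vec<f32>,
}

// A pass over positions touches only position data, so each fetched
// cache line is fully used.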
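As a rough sketch of how the SOA layout is used in practice, the code below adds two passes to an `Entities` struct shaped like the one above. The method names `integrate` and `max_x` are hypothetical, chosen for this example. The first pass reads `velocities` and writes `positions`; the second reads `positions` only. Neither ever loads `health`.

```rust
pub struct Entities {
    pub positions: Vec<[f32; 3]>,
    pub velocities: Vec<[f32; 3]>,
    pub health: Vec<f32>,
}

impl Entities {
    /// Advances every position by velocity * dt. `health` is never read.
    pub fn integrate(&mut self, dt: f32) {
        for (pos, vel) in self.positions.iter_mut().zip(self.velocities.iter()) {
            for axis in 0..3 {
                pos[axis] += vel[axis] * dt;
            }
        }
    }

    /// Reads only `positions`: the largest x coordinate, or None if empty.
    pub fn max_x(&self) -> Option<f32> {
        self.positions.iter().map(|p| p[0]).reduce(f32::max)
    }
}
```

One cost of this design is that the three vectors must be kept the same length by hand; adding or removing an entity means touching all of them.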

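Iteration Order

// 2-D array: row-major vs column-major traversal
const N: usize = 1000;
// Each inner Vec is its own heap allocation; elements within a row are contiguous.
let mut matrix = vec![vec![0.0f64; N]; N];

// Row-major (cache-friendly): the inner loop walks one row front to back
for i in 0..N {
    for j in 0..N {
        matrix[i][j] += 1.0;
    }
}

// Column-major (cache-unfriendly): every inner-loop step jumps to a different row
for j in 0..N {
    for i in 0..N {
        matrix[i][j] += 1.0;
    }
}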
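The same two traversal orders can be written against a single flat buffer, which makes the stride explicit. The sketch below is an assumption-laden illustration, not part of the original example: the `Matrix` type and its methods are hypothetical. Element (i, j) is stored at offset `i * cols + j`, so the row-major loop steps 8 bytes at a time while the column-major loop steps `cols * 8` bytes.

```rust
/// Hypothetical row-major matrix stored in one contiguous Vec.
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    pub fn zeros(rows: usize, cols: usize) -> Self {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }

    /// Element (i, j) lives at offset i * cols + j.
    pub fn index(&self, i: usize, j: usize) -> usize {
        i * self.cols + j
    }

    /// Row-major order: consecutive iterations touch adjacent elements.
    pub fn add_one_row_major(&mut self) {
        for i in 0..self.rows {
            for j in 0..self.cols {
                let k = self.index(i, j);
                self.data[k] += 1.0;
            }
        }
    }

    /// Column-major order: consecutive iterations are `cols` elements apart.
    pub fn add_one_column_major(&mut self) {
        for j in 0..self.cols {
            for i in 0..self.rows {
                let k = self.index(i, j);
                self.data[k] += 1.0;
            }
        }
    }

    /// Sum of all elements, used here to check both orders do the same work.
    pub fn sum(&self) -> f64 {
        self.data.iter().sum()
    }
}
```

Both methods produce identical results; the difference only shows up in timing, and typically only once the matrix is too large to fit in cache.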

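Branch Prediction

// Sorted data makes branches easier to predict
let mut data: Vec<i32> = (0..1000).collect();
data.sort();  // a no-op here since the range is already ordered; shown for the general case

// filter still tests every element, so the number of conditionals is unchanged.
// On sorted input `x > 500` is false for one run and true for the rest,
// so the predictor rarely misses.
let sum: i32 = data.iter()
    .filter(|&&x| x > 500)
    .sum();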
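When sorting the input is not an option, another approach is to remove the data-dependent branch from the source altogether. The sketch below compares the filter version with a branchless variant that turns the comparison into 0 or 1 and multiplies it in. The function names are hypothetical, and this is only an illustration: an optimizing compiler may already emit branch-free or vectorized code for the filter version, so measure before preferring one form.

```rust
/// Sums the values greater than `threshold` using a per-element condition.
pub fn sum_above_filter(data: &[i32], threshold: i32) -> i32 {
    data.iter().filter(|&&x| x > threshold).sum()
}

/// Same result with no `if` in the source: the comparison is cast to
/// 0 or 1 and multiplied with the value, so rejected elements add 0.
pub fn sum_above_branchless(data: &[i32], threshold: i32) -> i32 {
    data.iter().map(|&x| x * (x > threshold) as i32).sum()
}
```

Both functions return the same sum for any input order, which makes them convenient to benchmark against sorted and shuffled copies of the same data.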

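Summary

Technique	Description
SOA	Convert an array of structures into a structure of arrays
Row-major traversal	Memory accesses stay contiguous during iteration
Cache-line alignment	`#[repr(align(64))]`
Sorted data	Improves branch-prediction accuracy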
