Salsa: LRU needs keep dependency info

August 23, 2022

reivew #371 加深了我对 salsa 做 recomputation 的理解。

我们先来看一个测试，思考这个测试能否 pass

#[salsa::tracked(jar = Jar, lru = 3)]
fn get_hot_potato(db: &dyn Db, input: MyInput) -> Arc<HotPotato> {
    db.push_log(format!("get_hot_potato({:?})", input.field(db)));
    Arc::new(HotPotato::new(input.field(db)))
}

#[salsa::tracked(jar = Jar)]
fn get_hot_potato2(db: &dyn Db, input: MyInput) -> u32 {
    db.push_log(format!("get_hot_potato2({:?})", input.field(db)));
    get_hot_potato(db, input).0
}

#[test]
fn lru_keeps_dependency_info() {
    let mut db = Database::default();
    let capacity = 3;

    // Invoke `get_hot_potato2` 33 times. This will (in turn) invoke
    // `get_hot_potato`, which will trigger LRU after 32 executions.
    let inputs: Vec<MyInput> = (0..(capacity + 1))
        .map(|i| MyInput::new(&mut db, i as u32))
        .collect();

    for (i, input) in inputs.iter().enumerate() {
        let x = get_hot_potato2(&db, *input);
        assert_eq!(x as usize, i);
    }

    db.salsa_runtime_mut()
        .synthetic_write(salsa::Durability::HIGH);

    // We want to test that calls to `get_hot_potato2` are still considered
    // clean. Check that no new executions occur as we go here.
    db.assert_logs_len((capacity + 1) * 2);

    // calling `get_hot_potato2(0)` has to check that `get_hot_potato(0)` is still valid;
    // even though we've evicted it (LRU), we find that it is still good
    let p = get_hot_potato2(&db, *inputs.first().unwrap());
    assert_eq!(p, 0);
    db.assert_logs_len(0);
}

这个测试是要做什么？

为了方便，将 get_hot_potato 记为 get，get_hot_potato2 记为 get2。

我们有两个 tracked function：get 和 get2，不用关心这两个 function 做了啥，只需要知道 get2 调用了 get，也就是说 get2 依赖 get 的结果。另一个需要注意的点是 get 设置了 lru = 3，而 get2 没有设置 lru。

现在调用 get2 四次，来看看 salsa 内部的状态，get 并没有存储 input 为 0 的结果，因为它设置了 lru = 3，input 为 0 的结果被 evict 了。

+-------+----------------+-----------------+
| input |       get      |       get2      |
+-------+----------------+-----------------+
|   0   |                |        0        |
+-------+----------------+-----------------+
|   1   |        1       |        1        |
+-------+----------------+-----------------+
|   2   |        2       |        2        |
+-------+----------------+-----------------+
|   3   |        3       |        3        |
+-------+----------------+-----------------+

我们现在调用 get2(0)（这里应该写 input0 会比较好，input0 = MyInput::new(&mut db, 0)，但为了简单就写了 0），能直接用 salsa 存储的结果吗？还需要重新计算吗？显然不用，不是存储有吗？那如果现在有别的输入改变了 salsa 的 current_revision 呢? synthetic_write 就是在做这件事。也就是说 salsa 的 current_revision 大于 get 和 get2 结果被 verify 的 revision。这意味着 shallow_verify_memo 不能确定 get2(0) 的结果是否能用，得让 deep_verify_memo 去进一步判断。deep_verify_memo 会检查 get2(0) 依赖的其他计算结果有没有改变，所以会去查看 get(0)，发现压根儿没有存储 get(0) 的结果，当然也没法判断它有没有改变，只能保守地认为发生了改变，所以 get2(0) 和 get(0) 都会重新计算。

显然，这不够好，实际上 get2(0) 的结果是可以重用的，因为 get(0) 的结果虽然被 evict 了，但并没有发生改变。 在我们这个场景下，get2(0) 并不关心 get(0) 的结果是多少，只关心从它上次被 verify 后有没有发生改变。

有没有方法去优化它呢？#371 提供了一种解决方案。我们 evict 的时候，不再直接删除整个 Memo，只是把 Memo.value 设置为 None，保留 verified_at 和 revisions。

/// Evicts the existing memo for the given key, replacing it
/// with an equivalent memo that has no value. If the memo is untracked, BaseInput,
/// or has values assigned as output of another query, this has no effect.
pub(super) fn evict(&self, key: K) {
    use dashmap::mapref::entry::Entry::*;
    use crate::runtime::local_state::QueryOrigin;

    if let Occupied(entry )=  self.map.entry(key) {
        let memo = entry.get().load();
        match memo.revisions.origin {
            QueryOrigin::Assigned(_)
            | QueryOrigin::DerivedUntracked(_)
            | QueryOrigin::BaseInput
            | QueryOrigin::Field => {
                // Careful: Cannot evict memos whose values were
                // assigned as output of another query
                // or those with untracked inputs
                // as their values cannot be reconstructed.
                return;
            },

            QueryOrigin::Derived(_) => {
                let memo_evicted = Arc::new(Memo::new(
                    None::<V>,
                    memo.verified_at.load(),
                    memo.revisions.clone(),
                ));

                entry.get().store(memo_evicted);
            }
        }
    }
}