v8 IR连连看

发表于 2018-08-21 | 分类于浏览器 | | 阅读次数:

v8 Graph

首先说一个结论，v8的graph都是由下往上遍历。

具体举一个例子,从json里随便提取一句
{"source":0,"target":13,"index":3,"type":"control"},

void PrintEdges(Node* node) {
  for (int i = 0; i < node->InputCount(); i++) {
    Node* input = node->InputAt(i);
    if (input == nullptr) continue;
    PrintEdge(node, i, input);
  }
}

void PrintEdge(Node* from, int index, Node* to) {
  if (first_edge_) {
    first_edge_ = false;
  } else {
    os_ << ",\n";
  }
  const char* edge_type = nullptr;
  if (index < NodeProperties::FirstValueIndex(from)) {
    edge_type = "unknown";
  } else if (index < NodeProperties::FirstContextIndex(from)) {
    edge_type = "value";
  } else if (index < NodeProperties::FirstFrameStateIndex(from)) {
    edge_type = "context";
  } else if (index < NodeProperties::FirstEffectIndex(from)) {
    edge_type = "frame-state";
  } else if (index < NodeProperties::FirstControlIndex(from)) {
    edge_type = "effect";
  } else {
    edge_type = "control";
  }
  os_ << "{\"source\":" << SafeId(to) << ",\"target\":" << SafeId(from)
      << ",\"index\":" << index << ",\"type\":\"" << edge_type << "\"}";
}

事实上反映到代码里，它是这样的。
也就是说,from对应node，input对应to。
画图的时候，source即to，target即from。
注意这里的from和to到底指谁，在看代码的时候才不会迷失。

以一个函数为例

void GraphReducer::ReplaceWithValue(Node* node, Node* value, Node* effect,
                                    Node* control) {
  if (effect == nullptr && node->op()->EffectInputCount() > 0) {
    effect = NodeProperties::GetEffectInput(node);
  }
  if (control == nullptr && node->op()->ControlInputCount() > 0) {
    control = NodeProperties::GetControlInput(node);
  }

  // Requires distinguishing between value, effect and control edges.
  for (Edge edge : node->use_edges()) {
    Node* const user = edge.from();
    DCHECK(!user->IsDead());
    if (NodeProperties::IsControlEdge(edge)) {
      if (user->opcode() == IrOpcode::kIfSuccess) {
        Replace(user, control);
      } else if (user->opcode() == IrOpcode::kIfException) {
        DCHECK_NOT_NULL(dead_);
        edge.UpdateTo(dead_);
        Revisit(user);
      } else {
        DCHECK_NOT_NULL(control);
        edge.UpdateTo(control);
        Revisit(user);
      }
    } else if (NodeProperties::IsEffectEdge(edge)) {
      DCHECK_NOT_NULL(effect);
      edge.UpdateTo(effect);
      Revisit(user);
    } else {
      DCHECK_NOT_NULL(value);
      edge.UpdateTo(value);
      Revisit(user);
    }
  }
}

Edge edge : node->use_edges()
use_edges即图中的红色边。绿色边代表的是input，红色边代表use
这其实很好理解，如果我们来写一个双向图搜索，在设计的时候，每个节点肯定有进行前后遍历的两条边可以选择，v8也是这么设计的。

class V8_EXPORT_PRIVATE Node final {
    ...
    class InputEdges;
    inline InputEdges input_edges();

    class Inputs;
    inline Inputs inputs() const;

    class UseEdges final {
        public:
        typedef Edge value_type;

        class iterator;
        inline iterator begin() const;
        inline iterator end() const;

        bool empty() const;

        explicit UseEdges(Node* node) : node_(node) {}

        private:
        Node* node_;
    };
    UseEdges use_edges() { return UseEdges(this); }
...
...

Node* const user = edge.from();

还记得每条边的from么？对，就是target，现在知道对应谁了么？上一个和下一个都是相对的概念，只是我们要知道究竟对应的是谁即可

接下来代码是一个选择分支，我选一个else if分析
1
2
3
4
else if (NodeProperties::IsEffectEdge(edge)) {
DCHECK_NOT_NULL(effect);
edge.UpdateTo(effect);
Revisit(user);
首先effect来源于
1
2
3
if (effect == nullptr && node->op()->EffectInputCount() > 0) {
effect = NodeProperties::GetEffectInput(node);
}
这里是取的原来的节点的输入边，假设它就是effect input。
那么假设我们现在就是0->5->13，0即effect，5即node，13即user
edge是5->13的这条边。
然后我们edge.UpdateTo(effect);

还记得每条边的from么？是的，这个to本来应该是5，这里就是把5->13变成了0->13，就移走了原本在这个路径上的这个节点，要注意的是这个移走并不是直接移除，因为它可能在其他路径上还被用到，只是不在这条路径上了，具体的看我贴的源码（fuck the source code…)

  void UpdateTo(Node* new_to) {
    Node* old_to = *input_ptr_;
    if (old_to != new_to) {
      if (old_to) old_to->RemoveUse(use_);
      *input_ptr_ = new_to;
      if (new_to) new_to->AppendUse(use_);
    }
  }
  ...
  void Node::RemoveUse(Use* use) {
  DCHECK(first_use_ == nullptr || first_use_->prev == nullptr);
  if (use->prev) {
    DCHECK_NE(first_use_, use);
    use->prev->next = use->next;
  } else {
    DCHECK_EQ(first_use_, use);
    first_use_ = use->next;
  }
  if (use->next) {
    use->next->prev = use->prev;
  }
}

如图有两条红线，通过一个use链链接起来。

v8 IR

Sea-of-Nodes

v8的IR是Sea-of-Nodes，可以说是一种Program Dependence Graph，其宗旨是“在统一的表达形式下，把分析进行彻底”
用这样的IR所表达的程序里完全不需要“变量”的概念了，一切都是经过透彻分析后的“值”。各种操作并不操作变量，而是从依赖获取输入值，运算后产生新的值。每个“值”自身会携带足够依赖信息来判明它在怎样的路径上有效，依赖的数据输入是哪些；
TurboFan的依赖信息有三种，Control, Data, Effect

Control Dependence（控制依赖）可以看作是常规控制流图反过来画。不是基本块持有一个线性列表里面装着节点，而是每个节点都记录着自己的控制依赖是谁——要哪个前驱控制节点执行到的时候我才可以执行。
Data Dependence很简单，就是use-def链，换句话说一个节点的数据输入都是谁。例如说 a + b，这个“+”的数据依赖就是a和b。
Effect Dependence则记录“副作用的顺序”——主要就是内存操作的顺序。它只记录顺序而不必维护SSA形式的其它特性。

基于此，v8放宽了operation的执行顺序，由effect edges来管理stateful operations的执行顺序，且保留了控制流图的“骨架”
举几个简单的例子

Iterative Reduction

需要提一下的是v8的reduce，在每次JIT phase结束的时候，都会根据当前的graph进行Reduce

整个reduce是从代码最后的End节点开始进行DFS,先遍历子节点，然后遍历自己，记录state改变的节点，将其放入revisit栈，在所有节点遍历完后，对这些需要重新遍历的节点revisit.

void GraphReducer::ReduceGraph() { ReduceNode(graph()->end()); }
...
...
...
void GraphReducer::ReduceNode(Node* node) {
  DCHECK(stack_.empty());
  DCHECK(revisit_.empty());
  Push(node);
  for (;;) {
    if (!stack_.empty()) {
      // Process the node on the top of the stack, potentially pushing more or
      // popping the node off the stack.
      ReduceTop();
    } else if (!revisit_.empty()) {
      // If the stack becomes empty, revisit any nodes in the revisit queue.
      Node* const node = revisit_.front();
      revisit_.pop();
      if (state_.Get(node) == State::kRevisit) {
        // state can change while in queue.
        Push(node);
      }
    } else {
      // Run all finalizers.
      for (Reducer* const reducer : reducers_) reducer->Finalize();

      // Check if we have new nodes to revisit.
      if (revisit_.empty()) break;
    }
  }
  DCHECK(revisit_.empty());
  DCHECK(stack_.empty());
}

....
...
...
Reduction reduction = (*i)->Reduce(node);

Lowering to Machine

事实上IR的node是随着JIT phase逐层下降接近至Machine的

一个JSxxx node可能会变成更多个Machine node