TensorFlow Data Flow Graph Optimization

TensorFlow represents a user program as a computation graph (a data flow graph), where each node represents a mathematical operation (e.g. add, subtract, matrix multiply, ReLU) and each edge carries the tensor data flowing between operations. A node has zero or more input edges and zero or more output edges. The data flow graph is an important design choice because it lets TensorFlow perform the following code optimizations using global knowledge of the computation graph.
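As a minimal sketch (assuming TensorFlow 2.x, where tf.function traces Python code into a tf.Graph), you can inspect the nodes and edges of the graph TensorFlow builds for a small program:

```python
import tensorflow as tf

@tf.function
def model(x, w):
    return tf.nn.relu(tf.matmul(x, w) + 1.0)

# Trace the Python function into a tf.Graph (the data flow graph).
graph = model.get_concrete_function(
    tf.TensorSpec([2, 3], tf.float32),
    tf.TensorSpec([3, 4], tf.float32),
).graph

# Each operation is a node; each input/output tensor is an edge.
for op in graph.get_operations():
    print(op.type,
          [t.name for t in op.inputs], "->",
          [t.name for t in op.outputs])
```

The printed operations (MatMul, AddV2, Relu, ...) are the nodes the optimizer works on; the tensor names show which edges connect them.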

  • Remove dead nodes, i.e. nodes whose outputs never reach the graph's sink node, a control-flow node, or a stateful node, so their results can never be observed.
  • Remove Identity nodes, bypassing them so that each consumer reads directly from the Identity node's input. (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/function.cc)
  • Perform constant folding: check whether a node can be evaluated as a constant ahead of time, replace it with a Constant node, and let the folded value propagate to fold further nodes downstream (see the before/after sketch following this list).
  • Perform function inlining: replace a function-call node with the body of the called function so the other passes can optimize across the call boundary (see the inlining sketch after this list).
  • Perform common subexpression elimination (CSE): find identical subexpressions within the graph and replace them with a single computation to avoid redundant work (also illustrated in the sketch after this list). The pass lives in optimizer_cse.cc, linked below.
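To make constant folding and CSE concrete, here is a hand-written before/after sketch expressed at the Python level; the actual passes rewrite the C++ Graph data structure (constant_folding.cc, optimizer_cse.cc), not Python source, so this is only an illustration of the rewrites:

```python
import tensorflow as tf

@tf.function
def before(x):
    scale = tf.constant(2.0) * tf.constant(3.0)  # all-constant subgraph
    y1 = tf.matmul(x, x) + scale                 # tf.matmul(x, x) appears twice:
    y2 = tf.matmul(x, x) - scale                 # a common subexpression
    return y1, y2

@tf.function
def after(x):
    scale = tf.constant(6.0)  # constant folding: 2.0 * 3.0 evaluated at optimization time
    m = tf.matmul(x, x)       # CSE: the duplicated MatMul is computed once and reused
    return m + scale, m - scale
```

Both functions return the same values for a square matrix x; the second corresponds to the graph after the two passes have run.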
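Function inlining can be pictured the same way: a call node is replaced by the callee's body, exposing its ops to constant folding, CSE, and dead-node removal in the caller's graph. A conceptual before/after sketch, not TensorFlow's actual rewrite:

```python
import tensorflow as tf

@tf.function
def normalize(x):
    # Called as a separate function: a function-call node in the caller's graph.
    return x / tf.reduce_sum(x)

@tf.function
def before(x):
    return normalize(x) * tf.constant(2.0)

@tf.function
def after(x):
    # After inlining, the callee's ops live directly in the caller's graph,
    # where the other optimization passes can see them.
    return (x / tf.reduce_sum(x)) * tf.constant(2.0)
```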

Check out the full implementation in the following files:

  • graph_optimizer.cc: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/graph_optimizer.cc
  • constant_folding.cc: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/constant_folding.cc
  • optimizer_cse.cc: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/graph/optimizer_cse.cc