TensorFlow Data Flow Graph Optimization

TensorFlow represents a user program as a computation graph (also called a data flow graph), where each node represents a mathematical computation, e.g. add, subtract, matrix multiply, or ReLU, and each edge represents input/output tensor data. A node has zero or more input edges and zero or more output edges. The data flow graph is an important design choice: because TensorFlow knows the whole computation graph, it can perform the following code optimizations.

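  • Remove dead nodes. These are nodes whose results are never needed: no sink node, control flow node, or stateful node depends on them, directly or indirectly. Source nodes, sink nodes, control flow nodes, and stateful nodes themselves are always kept.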
  • Remove identity nodes. (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/function.cc)
  • Perform constant folding. A node whose inputs are all constants can be evaluated once at optimization time and replaced with a constant, which in turn makes its consumers eligible for subsequent constant propagation.
  • Perform function inlining.
  • Perform common subexpression elimination. This technique finds common subexpressions within the graph and replaces them with a single computation, avoiding redundant work.
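
To make three of these passes concrete (constant folding, common subexpression elimination, and dead node removal), here is a minimal sketch on a toy graph in plain Python. It is not TensorFlow's implementation: the Node class, the OPS table, and every function name below are made up for illustration, and the toy graph has no control flow or stateful nodes, so only the fetched outputs act as roots when pruning.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical toy node, not a TensorFlow class."""
    op: str                          # "placeholder", "const", or a key of OPS
    inputs: Tuple[str, ...] = ()     # names of the nodes feeding this one
    value: Optional[float] = None    # set only for "const" nodes


Graph = Dict[str, Node]  # assumed to be listed in topological order

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold_constants(graph: Graph) -> Graph:
    """Replace every op whose inputs are all constants with a constant."""
    out: Graph = {}
    for name, node in graph.items():
        # Looking inputs up in `out` (already folded) propagates constants.
        if node.op in OPS and all(out[i].op == "const" for i in node.inputs):
            args = [out[i].value for i in node.inputs]
            out[name] = Node("const", value=OPS[node.op](*args))
        else:
            out[name] = node
    return out


def eliminate_common_subexpressions(graph: Graph, fetches: Set[str]) -> Graph:
    """Merge nodes that apply the same op to the same inputs."""
    out: Graph = {}
    seen: Dict[tuple, str] = {}   # (op, inputs, value) -> surviving node name
    alias: Dict[str, str] = {}    # merged-away name -> surviving node name
    for name, node in graph.items():
        inputs = tuple(alias.get(i, i) for i in node.inputs)
        key = (node.op, inputs, node.value)
        # Placeholders are distinct inputs; fetched names must survive.
        mergeable = node.op != "placeholder" and name not in fetches
        if mergeable and key in seen:
            alias[name] = seen[key]
            continue
        if node.op != "placeholder":
            seen.setdefault(key, name)
        out[name] = Node(node.op, inputs, node.value)
    return out


def prune_dead_nodes(graph: Graph, fetches: Set[str]) -> Graph:
    """Keep only nodes that some fetched output depends on."""
    live: Set[str] = set()
    stack = list(fetches)
    while stack:
        name = stack.pop()
        if name not in live:
            live.add(name)
            stack.extend(graph[name].inputs)
    return {name: node for name, node in graph.items() if name in live}


def evaluate(graph: Graph, target: str, feeds: Dict[str, float]) -> float:
    """Naive recursive evaluation, used only to compare results."""
    node = graph[target]
    if node.op == "placeholder":
        return feeds[target]
    if node.op == "const":
        return node.value
    return OPS[node.op](*(evaluate(graph, i, feeds) for i in node.inputs))


graph: Graph = {
    "x": Node("placeholder"),
    "two": Node("const", value=2.0),
    "three": Node("const", value=3.0),
    "k": Node("add", ("two", "three")),   # foldable: both inputs are constants
    "a": Node("mul", ("x", "k")),
    "b": Node("mul", ("x", "k")),         # same subexpression as "a"
    "out": Node("add", ("a", "b")),
    "unused": Node("mul", ("x", "x")),    # nothing fetched depends on this
}

fetches = {"out"}
optimized = prune_dead_nodes(
    eliminate_common_subexpressions(fold_constants(graph), fetches), fetches
)
print(list(optimized))
```

In this sketch the eight-node graph shrinks to four nodes (x, k, a, out): k becomes a single constant, b is merged into a, and two, three, and unused are pruned. Evaluating "out" gives the same value before and after. Running pruning last matters here, since folding and merging are what leave two, three, and b without consumers.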

Check out the full implementation in graph_optimizer.cc, along with the constant folding and common subexpression elimination passes it calls:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/graph_optimizer.cc

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/constant_folding.cc

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/graph/optimizer_cse.cc

TensorFlow Papers

Great papers on TensorFlow

A Tour of TensorFlow

TensorFlow: A System for Large-Scale Machine Learning

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

A Comparison of Distributed Machine Learning Platforms

TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

The TensorFlow Partitioning and Scheduling Problem: It’s the Critical Path!