Graphcore produced a series of striking images of computational graphs mapped to its “Intelligent Processing Unit.”
The graph compiler builds up an intermediate representation of the computational graph to be scheduled and deployed across one or more IPU devices. Because the compiler can render this intermediate representation, an application written at the level of a machine learning framework can be displayed as an image of the computational graph that actually runs on the IPU.
The image below shows the graph for the full forward and backward training loop of AlexNet, generated from a TensorFlow description.
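Such a description is ordinary framework-level code. As a rough sketch, and not the exact script behind the image, a single AlexNet-style forward and backward training step in TensorFlow might look like this:

```python
import tensorflow as tf

# An AlexNet-style stack; the layer sizes are illustrative, not the exact
# configuration behind the published graph image.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(227, 227, 3)),
    tf.keras.layers.Conv2D(96, 11, strides=4, activation="relu"),
    tf.keras.layers.MaxPool2D(3, strides=2),
    tf.keras.layers.Conv2D(256, 5, padding="same", activation="relu"),
    tf.keras.layers.MaxPool2D(3, strides=2),
    tf.keras.layers.Conv2D(384, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(384, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(256, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPool2D(3, strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dense(4096, activation="relu"),
    tf.keras.layers.Dense(1000),
])

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function  # traced into a dataflow graph that a backend compiler can consume
def train_step(images, labels):
    with tf.GradientTape() as tape:
        logits = model(images, training=True)                   # forward pass
        loss = loss_fn(labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)      # backward pass
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```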
Graphcore's Poplar graph compiler has converted a description of the network into a computational graph of 18.7 million vertices and 115.8 million edges. This graph represents AlexNet as a highly parallel execution plan for the IPU. The vertices of the graph represent computation processes and the edges represent communication between processes. The layers in the graph are labelled with the names of the corresponding layers from the high-level description of the network. The clearly visible clustering is the result of intensive communication between processes within each layer of the network, with lighter communication between layers.
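To make that structure concrete, here is a toy, framework-agnostic sketch, not Poplar's actual intermediate representation, of how a layered network could be lowered into many small compute vertices whose communication is dense within a layer and sparse between layers:

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Vertex:
    layer: str    # label carried over from the high-level network description
    name: str     # the small piece of computation this vertex performs

@dataclass
class ComputeGraph:
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)    # (producer, consumer) indices

    def add_vertex(self, layer, name):
        self.vertices.append(Vertex(layer, name))
        return len(self.vertices) - 1

    def add_edge(self, src, dst):
        self.edges.append((src, dst))

def lower_layer(graph, layer, inputs, n_parallel):
    # Each vertex reads one activation slice from the previous layer (light
    # inter-layer traffic) but exchanges partial results with every other
    # vertex in its own layer (heavy intra-layer traffic), which is the
    # pattern that shows up as tight clusters in the rendered graph.
    vs = [graph.add_vertex(layer, f"slice_{i}") for i in range(n_parallel)]
    for i, v in enumerate(vs):
        graph.add_edge(inputs[i % len(inputs)], v)
    for a in vs:
        for b in vs:
            if a != b:
                graph.add_edge(a, b)
    return vs

g = ComputeGraph()
acts = [g.add_vertex("input", "slice_0")]
for layer, width in [("conv1", 12), ("conv2", 12), ("fc1", 6)]:
    acts = lower_layer(g, layer, acts, width)

kinds = Counter(
    "intra" if g.vertices[s].layer == g.vertices[d].layer else "inter"
    for s, d in g.edges
)
print(f"{len(g.vertices)} vertices, {len(g.edges)} edges "
      f"({kinds['intra']} intra-layer, {kinds['inter']} inter-layer)")
```

Even on this tiny three-layer example, intra-layer edges outnumber inter-layer edges by roughly a factor of ten; the AlexNet image shows the same imbalance, only at a scale of millions of vertices.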