Sunday, March 11, 2018

Pre-declaration of tasks

Recently, I've noticed some old ideas being used at the core of more modern frameworks. One of them is declaring all the tasks up front, before any of them is executed. This is similar to entering the floor you want to reach before you step into an elevator. Because the system sees the full set of requests before it starts working, it has more room to optimize.

For example, in TensorFlow, the computation graph is built first and then handed to the framework. The framework optimizes the graph, assigns work to computing nodes, and only then starts the computation. Imagine your graph computes constant * input1 + constant * input2 + ... + constant * inputn. It could be rewritten as constant * (input1 + input2 + ... + inputn), which replaces n multiplications with one and cuts the total operation count from 2n - 1 to n, close to a 50% reduction for large n! This optimization would not be possible if the framework weren't given the whole expression to work with. Moreover, given a fuller picture, the framework could also arrange the inputs more intelligently so that they sit closer to each other, taking advantage of locality.
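To make the factoring idea concrete, here is a minimal sketch in plain Python. It is not TensorFlow code and does not claim to show how TensorFlow's optimizer works; the node classes (Const, Input, Mul, Add) and the helpers build_graph, count_ops, factor_common_constant and evaluate are all made up for illustration. The point is only that the rewrite becomes possible once the whole expression exists as data before anything runs.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


# Hypothetical node types for a toy expression graph (not a real framework API).
@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Input:
    name: str


@dataclass(frozen=True)
class Mul:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Add:
    terms: Tuple["Node", ...]


Node = Union[Const, Input, Mul, Add]


def build_graph(c: float, n: int) -> Node:
    """Declare c*input1 + c*input2 + ... + c*inputn without computing anything."""
    return Add(tuple(Mul(Const(c), Input(f"input{i}")) for i in range(1, n + 1)))


def count_ops(node: Node) -> int:
    """Count the multiplications and additions needed to evaluate the graph."""
    if isinstance(node, Mul):
        return 1 + count_ops(node.left) + count_ops(node.right)
    if isinstance(node, Add):
        return (len(node.terms) - 1) + sum(count_ops(t) for t in node.terms)
    return 0


def factor_common_constant(node: Node) -> Node:
    """Rewrite c*x1 + ... + c*xn as c*(x1 + ... + xn) when every term shares c."""
    if not isinstance(node, Add) or not node.terms:
        return node
    if not all(isinstance(t, Mul) and isinstance(t.left, Const) for t in node.terms):
        return node
    shared = node.terms[0].left
    if any(t.left != shared for t in node.terms):
        return node
    return Mul(shared, Add(tuple(t.right for t in node.terms)))


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Run the graph against concrete input values."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Input):
        return env[node.name]
    if isinstance(node, Mul):
        return evaluate(node.left, env) * evaluate(node.right, env)
    return sum(evaluate(t, env) for t in node.terms)


graph = build_graph(3.0, 4)
optimized = factor_common_constant(graph)
env = {"input1": 1.0, "input2": 2.0, "input3": 3.0, "input4": 4.0}

ops_before = count_ops(graph)      # 4 multiplications + 3 additions
ops_after = count_ops(optimized)   # 1 multiplication + 3 additions
result_before = evaluate(graph, env)
result_after = evaluate(optimized, env)
```

With n = 4 the operation count drops from 7 to 4 while the result stays the same. If each term were computed eagerly, one at a time, the optimizer would never see that the constant is shared.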

Another cute example is the Ray distributed framework. Ray cleverly uses an Object ID (similar to a Future) to stand in for the result of a computation. When another computation depends on an Object ID, that dependency is recorded, so the dependency graph is built up and known to the system. Any suitable node can then take on a computing task as long as it can obtain the required inputs.
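The dependency-tracking idea can be sketched in a few lines of plain Python. This is not Ray's API: ObjectRef, Scheduler, submit, run_all and get are hypothetical names invented for this sketch, and everything runs in a single process. It only illustrates how handing out a placeholder for each result lets a scheduler see which tasks depend on which, and run whatever is ready.

```python
from typing import Any, Callable, Dict, List, Tuple


class ObjectRef:
    """Hypothetical placeholder for a result that may not exist yet."""

    _next_id = 0

    def __init__(self) -> None:
        self.id = ObjectRef._next_id
        ObjectRef._next_id += 1


class Scheduler:
    """Toy single-process scheduler; a real system would spread tasks over nodes."""

    def __init__(self) -> None:
        self.tasks: Dict[int, Tuple[Callable[..., Any], tuple]] = {}
        self.results: Dict[int, Any] = {}
        self.run_order: List[int] = []

    def submit(self, fn: Callable[..., Any], *args: Any) -> ObjectRef:
        """Declare a task. Nothing runs yet; the caller only gets a placeholder."""
        ref = ObjectRef()
        self.tasks[ref.id] = (fn, args)
        return ref

    def _ready(self, args: tuple) -> bool:
        """A task is ready once every ObjectRef argument has a stored result."""
        return all(not isinstance(a, ObjectRef) or a.id in self.results for a in args)

    def run_all(self) -> None:
        """Repeatedly run every task whose inputs are available."""
        pending = {tid: t for tid, t in self.tasks.items() if tid not in self.results}
        while pending:
            runnable = [tid for tid, (_, args) in pending.items() if self._ready(args)]
            if not runnable:
                raise RuntimeError("some tasks have inputs that can never be produced")
            for tid in runnable:
                fn, args = pending.pop(tid)
                # Swap each placeholder for the value it stands for.
                resolved = [self.results[a.id] if isinstance(a, ObjectRef) else a for a in args]
                self.results[tid] = fn(*resolved)
                self.run_order.append(tid)

    def get(self, ref: ObjectRef) -> Any:
        return self.results[ref.id]


sched = Scheduler()
a = sched.submit(lambda: 2)
b = sched.submit(lambda: 3)
total = sched.submit(lambda x, y: x + y, a, b)   # depends on a and b
doubled = sched.submit(lambda t: t * 2, total)   # depends on total
sched.run_all()
final_value = sched.get(doubled)
```

Here a and b have no dependencies, so they could run anywhere and in any order; total has to wait for both, and doubled has to wait for total. The scheduler learns all of this from the placeholders passed as arguments, without the caller spelling out an execution order.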

This idea is exactly what a compiler does. After all, you tell the compiler about all the tasks that you are going to execute in a high-level language, and it translates them into the most optimized native form it can.

I am fascinated by simple, commonsensical ideas like this one. They reinforce the core theme of a spiral evolution of technologies, where old ideas come back around in new forms. Hope to see more of those smart elevators.
