Control flows #2738
Comments
Hi @mattiasmar! We support LSTMs by unrolling the network. For example, the char-rnn and English-to-German examples use LSTM structures. At some point we may extend Glow to support more control flow features, but this…
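Unrolling turns the recurrence into a fixed-length chain of cell ops, which is why it fits a static DAG compiler. A minimal Python sketch of the idea (toy scalar state and hypothetical weight names, not Glow's actual API):

```python
import math

def lstm_cell(x, h, c, w):
    """One LSTM step with scalar state (toy sizes; weight keys are made up)."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    i = sig(w["ix"] * x + w["ih"] * h)        # input gate
    f = sig(w["fx"] * x + w["fh"] * h)        # forget gate
    o = sig(w["ox"] * x + w["oh"] * h)        # output gate
    g = math.tanh(w["gx"] * x + w["gh"] * h)  # candidate cell state
    c = f * c + i * g
    return o * math.tanh(c), c

def unrolled_lstm(xs, w):
    """For a fixed len(xs) the loop vanishes at 'compile time': the result
    is a straight-line chain of cell ops, i.e. a static DAG with no
    control flow left in it."""
    h, c = 0.0, 0.0
    for x in xs:  # fixed trip count -> can be fully unrolled
        h, c = lstm_cell(x, h, c, w)
    return h
```

The cost of this approach, as discussed below, is that the graph size grows with the sequence length, and variable-length inputs must be padded to a maximum length.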
For the purpose of extending Glow to support LSTM structures (not unrolled), could you provide some pointers? Design thoughts? Links to work in progress? Core elements that would need to be implemented?
To support general control flow we would need to extend the internal Glow data structures: rename Function to BasicBlock, add a new top-level function, and allow BasicBlocks to be terminated with some "branch" instruction that jumps to a different basic block. This would be a major change, and I am not sure how to implement things like control-flow-aware automatic differentiation (which we have today for the DAG). I think the best path forward right now would be to rely on the high-level framework (for example, PyTorch) to implement the control flow and call into Glow for the DAG sections of the compute. Moving forward we can gradually implement full support for control flow, based on the semantics of PyTorch.
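The BasicBlock/branch idea above can be sketched as a toy interpreter (all names hypothetical; this is an illustration of the shape of the change, not a design for Glow's actual IR):

```python
class BasicBlock:
    """Straight-line ops plus a terminator. Today's Glow 'Function' is
    effectively a single such block with an implicit return."""
    def __init__(self, name):
        self.name = name
        self.ops = []          # callables that mutate the environment
        self.terminator = None # ("ret",) | ("br", target) | ("condbr", cond, then, else)

def run(blocks, env):
    """Execute a function made of basic blocks, starting at 'entry'."""
    bb = blocks["entry"]
    while True:
        for op in bb.ops:
            op(env)
        term = bb.terminator
        if term[0] == "ret":
            return env
        if term[0] == "br":
            bb = blocks[term[1]]
        else:  # conditional branch -- what a non-unrolled RNN loop needs
            _, cond, then_bb, else_bb = term
            bb = blocks[then_bb if cond(env) else else_bb]
```

A loop is then just a block whose conditional terminator targets itself, which is exactly the structure that makes DAG-based automatic differentiation hard: the gradient tape length is no longer known statically.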
So how would you approach a model like GNMT? Unrolling the network wouldn't be very efficient. How could we best take advantage of Glow without a major change to the code base (while keeping the option of some minor changes)?
We have some initial support for predication, which would allow us to skip unnecessary computation in an unrolled network. That would probably be the shortest path to a highly performant NMT. Note that this requires the backend to also support predication.
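A rough sketch of how predication recovers efficiency in an unrolled network: pad every sequence to a maximum length, and predicate each step on whether the position holds a real token. The host-side `if` below stands in for the backend's predicated execution (which can skip the work entirely rather than just masking its output); names are illustrative.

```python
def predicated(pred, compute, state):
    """Select semantics: new state if pred holds, otherwise pass the old
    state through. A predication-aware backend can skip 'compute'
    entirely when pred is False."""
    return compute(state) if pred else state

def unrolled_with_padding(xs, max_len, step):
    """Unroll to a fixed max_len; steps beyond len(xs) are padding and
    contribute no work when the backend honors the predicate."""
    h = 0.0
    for t in range(max_len):
        valid = t < len(xs)  # predicate: real token vs. padding
        h = predicated(valid, (lambda s, t=t: step(xs[t], s)), h)
    return h
```

The key property is that the result is identical to running only the valid steps, so the unrolled-to-max-length graph stays correct for any shorter input.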
Our backend supports LSTM as-is. So if I add code for loading the LSTM operator in Glow, can we then implement something similar to https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/cpu/rnn/deep_cpu_lstm.cc in BoundInterpreterFunction for profiling?
@ajayanto We already have basic LSTM support, just not as a single node: it's currently implemented directly via its component pieces instead of as a Node that's lowered. This could be useful for you if you are going to run an unrolled RNN that uses LSTMs, or some other network with an LSTM that does not use control flow. But just to be clear, we still do not have the control flow support needed for general RNNs, e.g. ONNX's Scan or C2's RecurrentNetwork, which we would need to unroll.
@jfix71 as a workaround, …
@ajayanto You don't need to provide an implementation for it. If your backend does not want to lower the LSTM then it does not need to -- it would return false for LSTMNode inside the backend's lowering hook. In order to do this you'd need to add an LSTMNode, somewhat following the instructions in the documentation.
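The lowering decision described above could look roughly like this (a mock sketch: `should_lower` is a hypothetical stand-in for the backend's lowering hook, and the primitive list is illustrative, not Glow's actual decomposition):

```python
# Illustrative component pieces an LSTM node might lower into.
PRIMITIVE_LSTM_OPS = ["MatMul", "Add", "Sigmoid", "Tanh", "Mul"]

class NativeLSTMBackend:
    """A backend that runs LSTM natively keeps the node whole by
    declining to lower it (hypothetical hook name)."""
    def should_lower(self, node_kind):
        return node_kind != "LSTM"

class DefaultBackend:
    """A backend with no native LSTM lets the compiler decompose it."""
    def should_lower(self, node_kind):
        return True

def lower(node_kind, backend):
    """Compiler side: either hand the node to the backend unchanged, or
    expand it into its component primitives."""
    if not backend.should_lower(node_kind):
        return [node_kind]
    if node_kind == "LSTM":
        return list(PRIMITIVE_LSTM_OPS)
    return [node_kind]
```

The point is that adding a first-class LSTMNode does not force every backend to implement it: backends that decline to keep it whole simply see the lowered primitives instead.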
@jfix71 the current implementation of LSTM in Function::createLSTM() doesn't have any option to provide weights as input. So is this function training-specific only? What do I have to do in order to have an inference node?
@hgarg5072 For historical reasons we didn't have an…
Hi @ksaurabh-cadence -- I don't think much has changed here, actually. I still think the easiest path forward for now would be to allow e.g. PyTorch (or whatever is driving runtime execution, e.g. a C wrapper with a main for executing an AOT bundle) to call into Glow for high-performance execution of each iteration of the model, assuming the control flow is mostly there to wrap around e.g. an LSTM until an end token is seen. Or alternatively, to improve support for predication and then unroll the model to some max length. We could still do a large refactor as Nadav mentioned as well, but I am not clear on the cost-benefit there, as it will require a decent amount of work, and I think we could likely get most of the benefit from the first approach.
Hi @jfix71 Is there an example that I can look up for the first approach you suggest using a C wrapper? If not, is there an example that can be added? What kind of improvement for predication do you have in mind?
Hi,
When can Glow be expected to support control flows? I would like to see LSTM implemented in Glow.
Thanks!