Change tokenizer name to bpe_tokenizer and extract a base class (#3009)
Summary:
Pull Request resolved: #3009
We want to support more than one tokenizer implementation. Currently `tokenizer.cpp` is adapted from `llama2.c`, but we also want to support `Tiktoken` (to be added in the next PR).
This PR extracts a base class `Tokenizer` and makes it extensible by different implementations.
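As a rough illustration of the pattern described above, the sketch below shows an abstract `Tokenizer` with one concrete subclass. It is not the code from this PR: the method names (`encode`, `decode`), their signatures, the `ByteTokenizer` class, and the `round_trip` helper are all hypothetical and chosen only to show how callers can depend on the base class alone.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical interface: names and signatures are illustrative,
// not the actual API introduced by this PR.
class Tokenizer {
 public:
  virtual ~Tokenizer() = default;
  virtual std::vector<int32_t> encode(const std::string& text) const = 0;
  virtual std::string decode(int32_t token) const = 0;
};

// Toy stand-in for a concrete implementation (e.g. a BPE or Tiktoken
// tokenizer): each byte of the input becomes its own token id.
class ByteTokenizer : public Tokenizer {
 public:
  std::vector<int32_t> encode(const std::string& text) const override {
    std::vector<int32_t> ids;
    ids.reserve(text.size());
    for (unsigned char c : text) {
      ids.push_back(static_cast<int32_t>(c));
    }
    return ids;
  }

  std::string decode(int32_t token) const override {
    // Only byte-valued token ids are valid for this toy tokenizer.
    assert(token >= 0 && token < 256);
    return std::string(1, static_cast<char>(token));
  }
};

// Caller code is written against the base class, so any implementation
// can be swapped in without changing it.
std::string round_trip(const Tokenizer& tok, const std::string& text) {
  std::string out;
  for (int32_t id : tok.encode(text)) {
    out += tok.decode(id);
  }
  return out;
}
```

With this shape, adding a second tokenizer would only require another subclass; code such as `round_trip` stays unchanged.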
Reviewed By: mergennachin
Differential Revision: D56052583
fbshipit-source-id: bd9143957165211b1f600f781233b9ceff440cc1