Bibliography (7):
https://research.google/blog/a-fast-wordpiece-tokenization-system/
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
https://github.com/huggingface/tokenizers
https://www.tensorflow.org/text/guide/subwords_tokenizer
Wikipedia Bibliography:
Lexical analysis § Tokenization:
https://en.wikipedia.org/wiki/Lexical_analysis#Tokenization
Aho–Corasick algorithm:
https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm
Trie