I tried using UTF-8 encoding to get around its various protections, but I can't seem to get it to do anything other than translate my UTF-8 back into English.
It still can't spell words backwards, same as GPT-3.
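My guess is the BPE tokenizer is the culprit: the model sees opaque subword IDs, not letters. A minimal sketch of what I mean, assuming the tiktoken package is installed ("cl100k_base" is just one of OpenAI's published BPE vocabularies, picked here for illustration, not GPT-3's exact one):

    # Show the subword pieces a BPE model actually sees.
    # Assumes `pip install tiktoken`.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    word = "backwards"
    ids = enc.encode(word)
    print(ids)                             # a few integer token IDs
    print([enc.decode([i]) for i in ids])  # the subword chunks, not letters

    # The model gets these IDs and never sees individual letters, so
    # "spell it backwards" asks for information it was never shown.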
We need better tokenization algorithms. People, for example, segment text as they read and can re-tokenize when necessary; something similar for NLP would be great.
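For what it's worth, here's a toy sketch of that "segment, then re-tokenize if necessary" idea; `segment` and `char_level` are names I made up, and no real model can do this today since its vocabulary is frozen at training time:

    # Toy sketch of on-demand re-tokenization, not any real system's API.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    def segment(text: str, char_level: bool = False) -> list[str]:
        # Normal path: coarse BPE subwords. Fallback path: one piece per
        # character, like a reader slowing down to spell a word out.
        if char_level:
            return list(text)
        return [enc.decode([i]) for i in enc.encode(text)]

    print(segment("backwards"))                   # coarse subword pieces
    print(segment("backwards", char_level=True))  # individual characters
    print("".join(reversed(segment("backwards", char_level=True))))  # "sdrawkcab"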