(Python package) Train your own tokenizer based on the BPE algorithm for LLMs (supports regex patterns and special tokens)
Automatic text metrics (BLEU, ROUGE, METEOR, and more)
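
To make the BPE tokenizer entry above more concrete, here is a minimal sketch of byte-level BPE training with a regex pre-split and special tokens. It is not the package's actual API: the names `SPLIT_PATTERN`, `train_bpe`, `encode`, and `decode` are hypothetical, and the split pattern is a simplified stand-in (an assumption) for the richer patterns that production tokenizers typically use.

```python
import re
from collections import Counter

# Simplified pre-split pattern (an assumption, not any library's real pattern):
# words, punctuation runs, and whitespace runs, each with an optional leading space.
SPLIT_PATTERN = re.compile(r"\s?\w+|\s?[^\w\s]+|\s+")


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, vocab_size, special_tokens=()):
    """Learn merges on UTF-8 bytes; returns (merges, special-token ids)."""
    assert vocab_size >= 256, "the 256 raw byte values form the base vocabulary"
    # Merges never cross chunk boundaries produced by the regex split.
    chunks = [list(c.encode("utf-8")) for c in SPLIT_PATTERN.findall(text)]
    merges = {}  # (left_id, right_id) -> new_id, in the order learned
    for new_id in range(256, vocab_size):
        counts = Counter()
        for ids in chunks:
            counts.update(zip(ids, ids[1:]))
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        merges[pair] = new_id
        chunks = [merge_pair(ids, pair, new_id) for ids in chunks]
    # Special tokens get ids after all learned merges and are never split.
    next_id = 256 + len(merges)
    specials = {tok: next_id + i for i, tok in enumerate(special_tokens)}
    return merges, specials


def encode_chunk(ids, merges):
    """Apply learned merges to one chunk, earliest-learned merge first."""
    while len(ids) >= 2:
        pairs = set(zip(ids, ids[1:]))
        pair = min(pairs, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break
        ids = merge_pair(ids, pair, merges[pair])
    return ids


def encode(text, merges, specials):
    """Encode text, mapping special tokens straight to their reserved ids."""
    if specials:
        # Longest first, so a token that prefixes another cannot shadow it.
        alts = sorted(specials, key=len, reverse=True)
        parts = re.split("(" + "|".join(re.escape(t) for t in alts) + ")", text)
    else:
        parts = [text]
    ids = []
    for part in parts:
        if part in specials:
            ids.append(specials[part])
        else:
            for chunk in SPLIT_PATTERN.findall(part):
                ids.extend(encode_chunk(list(chunk.encode("utf-8")), merges))
    return ids


def decode(ids, merges, specials):
    """Rebuild the byte string for each id and decode it as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # dict keeps merge order
        vocab[new_id] = vocab[left] + vocab[right]
    for tok, tok_id in specials.items():
        vocab[tok_id] = tok.encode("utf-8")
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

A typical round trip under these assumptions: call `train_bpe(corpus, vocab_size, ["<|endoftext|>"])`, pass the returned `merges` and `specials` to `encode`, and check that `decode` reproduces the input. Splitting on special tokens before the regex split keeps them atomic, and the regex split itself stops merges from crossing word boundaries.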