Hello everyone! I post-trained a small LLM (Qwen2.5-0.5B-Instruct) to speak like Gen Z using SFT + RL. Training was done in Google Colab on the cheapest GPU runtime.
Honestly, you could probably get better results by simply prompting a frontier model; I wasn't optimizing for quality, given the smallest model, the cheapest GPU, and synthetic data. But it was a fun learning exercise in how you can actually run RL to improve a model without a huge investment, thanks to recent advancements in AI tooling and infrastructure.
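To make the RL half concrete: one cheap way to drive it is a rule-based reward that scores completions by how much Gen Z slang they contain, which a trainer like TRL's GRPOTrainer can consume via its `reward_funcs` argument. This is a minimal sketch of my own, not the post's actual reward — the slang list, the cap, and the scaling are all illustrative assumptions.

```python
# Hypothetical rule-based reward for RL post-training (e.g. GRPO):
# score a completion by how many Gen Z slang terms it contains.
# The slang list and the weighting below are illustrative assumptions,
# not the reward actually used in the notebook.

SLANG = {"fr", "no cap", "bet", "lowkey", "highkey", "rizz",
         "bussin", "slay", "mid", "sus", "fam", "deadass"}

def genz_reward(completion: str) -> float:
    """Return a score in [0, 1] from the number of slang terms present.

    Uses crude substring matching, which can over-match (e.g. "fr"
    inside "from") -- fine for a sketch, not for a real reward.
    """
    text = completion.lower()
    hits = sum(1 for term in SLANG if term in text)
    # Cap at 3 hits so the model isn't rewarded for pure slang spam.
    return min(hits, 3) / 3.0
```

A reward like this is noisy on purpose: RL only needs a signal that slang-heavy completions score higher on average, and capping the count keeps the model from collapsing into keyword stuffing.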
Overall, training the model cost me less than $2 on Colab's pay-as-you-go plan, which was surprisingly cheap.
The example notebook is on GitHub — feel free to give it a try on your own free plan!