1 comment

  • aidarbek 1 hour ago
    Hello everyone! I post-trained small LLM (Qwen Instruct 2.5 0.5B) to speak like GenZ using SFT+RL. Training was done in Google Colab using the cheapest GPU runtime.

    Honestly, you could probably get better results by simply prompting a frontier model, since I wasn't optimizing for quality: I used the smallest model, the cheapest GPU, and synthetic data. But it was a good and fun learning exercise in how you can actually run RL to improve models without a huge investment, thanks to current advancements in AI tooling and infrastructure.
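    The comment doesn't show how the RL reward was defined, but style-transfer RL (e.g. PPO/GRPO in a library like TRL) needs some scalar reward per sampled completion. A purely hypothetical sketch of what a keyword-based "Gen Z-ness" reward could look like:

```python
import re

# Hypothetical reward function for RL-style fine-tuning. This is NOT the
# author's actual setup -- just an illustrative sketch of the kind of
# scalar signal a PPO/GRPO-style trainer consumes per completion.
SLANG = {"fr", "no cap", "bet", "lowkey", "highkey", "vibe", "slay", "bussin"}

def genz_reward(completion: str) -> float:
    """Score a completion in [0, 1] by distinct slang markers it contains."""
    text = completion.lower()
    hits = sum(
        1 for term in SLANG
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )
    return min(hits / 3.0, 1.0)  # saturate after 3 distinct markers
```

    An RL trainer would then call something like this on each batch of sampled completions and push the policy toward higher-scoring outputs; in practice a learned reward model or an LLM judge usually replaces such a brittle keyword heuristic.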

    Overall, training the model cost me <$2 on Colab's pay-as-you-go plan, which was less than I expected.

    The example notebook is on GitHub; feel free to give it a try on your own free plan!