
PaperCodex


Self-Supervised Fine-Tuning

SPIN: Boost Your LLM’s Performance Without New Human Annotations—Just Use Self-Play Fine-Tuning


Imagine you’ve fine-tuned a language model using a standard Supervised Fine-Tuning (SFT) dataset—like Zephyr-7B on UltraChat—but you don’t have access…
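The full post digs into how SPIN works; as a rough orientation, the self-play idea is to sample responses from the previous iteration of the model, pair them with the human-written SFT responses for the same prompts, and train the current model to prefer the latter. Below is a minimal sketch of such a pairwise objective, assuming a DPO-style logistic loss over summed response log-probabilities; the function and variable names are illustrative, not taken from the paper's code.

    import torch
    import torch.nn.functional as F

    def spin_pair_loss(logp_current_real, logp_current_synth,
                       logp_old_real, logp_old_synth, beta=0.1):
        """Illustrative SPIN-style pairwise loss.

        The current model ("main player") is trained to assign higher
        likelihood to the human SFT response (`real`) than to a response
        sampled from the previous-iteration model (`synth`), relative to
        that previous model ("opponent"). Each input is a tensor of summed
        token log-probabilities for the full response, shape [batch].
        """
        real_margin = logp_current_real - logp_old_real
        synth_margin = logp_current_synth - logp_old_synth
        # Logistic loss on the margin difference, as in DPO-style objectives.
        return -F.logsigmoid(beta * (real_margin - synth_margin)).mean()

    # Hypothetical usage with batched log-prob tensors:
    # loss = spin_pair_loss(lp_cur_real, lp_cur_synth, lp_old_real, lp_old_synth)
    # loss.backward()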

12/27/2025 · Language Model Alignment, Preference-Free Optimization, Self-Supervised Fine-Tuning
