Fine-Tuning of DeepSeek-Style Reasoning Models | RL + Quantization Implementation

Fine-tune DeepSeek-R1 8B on a Quantum Mechanics Dataset

  • Base Model : unsloth/DeepSeek-R1-Distill-Llama-8B
  • Training Dataset : 0xZee/dataset-CoT-Quantum-Mechanics-1224
  • HF Finetuned Model : 0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT
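
The identifiers listed above could be wired together roughly as follows. This is a minimal sketch, not the notebook's actual code: the column names (`question`, `reasoning`, `answer`), the prompt layout, the `train` split, `max_seq_length=2048`, and 4-bit loading are all assumptions made for illustration. Check the dataset card and the demo notebook for the real values.

```python
# Identifiers taken from the list above.
BASE_MODEL = "unsloth/DeepSeek-R1-Distill-Llama-8B"
DATASET = "0xZee/dataset-CoT-Quantum-Mechanics-1224"

# ASSUMPTION: hypothetical column names; the real dataset may use different ones.
QUESTION_KEY = "question"
REASONING_KEY = "reasoning"
ANSWER_KEY = "answer"


def format_example(record: dict, eos_token: str = "") -> str:
    """Turn one chain-of-thought record into a single training string.

    ASSUMPTION: the prompt layout is illustrative. The reasoning is wrapped
    in <think> tags, the convention DeepSeek-R1 models use for their
    chain of thought.
    """
    return (
        f"### Question:\n{record[QUESTION_KEY].strip()}\n\n"
        f"### Response:\n<think>\n{record[REASONING_KEY].strip()}\n</think>\n"
        f"{record[ANSWER_KEY].strip()}{eos_token}"
    )


def load_model_and_data(max_seq_length: int = 2048):
    """Load the base model and the dataset.

    Requires a CUDA GPU plus the `unsloth` and `datasets` packages, so the
    imports are kept inside the function.
    """
    from unsloth import FastLanguageModel
    from datasets import load_dataset

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL,
        max_seq_length=max_seq_length,  # ASSUMPTION: not stated in the source
        load_in_4bit=True,              # ASSUMPTION: 4-bit quantized loading
    )
    # ASSUMPTION: the dataset exposes a "train" split.
    dataset = load_dataset(DATASET, split="train")
    return model, tokenizer, dataset


if __name__ == "__main__":
    # Formatting works without a GPU, so it can be checked on its own.
    sample = {
        "question": "What is the ground-state energy of a harmonic oscillator?",
        "reasoning": "The levels are E_n = (n + 1/2) * hbar * omega; set n = 0.",
        "answer": "E_0 = hbar * omega / 2",
    }
    print(format_example(sample))
```

In a real run, `format_example` would typically be applied over the dataset with `dataset.map(...)`, passing `tokenizer.eos_token` so each training string ends with the end-of-sequence token.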

Demo Notebook