Model Instability

model           n_incidents  instability_score
qwen2.5:7b      112          0.6279761904761906
llama3.2:3b     112          0.625
phi4-mini:3.8b  112          0.6190476190476192
gemma3:4b       112          0.6130952380952381
qwen3:8b        112          0.6101190476190476
mistral:latest  104          0.592948717948718
gemma4:latest   104          0.5576923076923077