Model Instability

| model | n_incidents | instability_score |
|---|---:|---:|
| qwen2.5:7b | 112 | 0.6280 |
| llama3.2:3b | 112 | 0.6250 |
| phi4-mini:3.8b | 112 | 0.6190 |
| gemma3:4b | 112 | 0.6131 |
| qwen3:8b | 112 | 0.6101 |
| mistral:latest | 104 | 0.5929 |
| gemma4:latest | 104 | 0.5577 |