Details of the tests on Gemma4 models quantized by Unsloth.ai
Detailed GSM8k test results
gemma-4-26B-A4B-it-UD-IQ2_XXS.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8825|± |0.0089|
| | |strict-match | 8|exact_match|↑ |0.8764|± |0.0091|
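Assuming these tables are lm-evaluation-harness output, a run matching the header above could be launched roughly as follows. This is a sketch: the port, model paths and flags mirror the header, but exact CLI syntax may vary by harness version, and it presumes an OpenAI-compatible completions server (e.g. llama.cpp or vLLM serving the GGUF file) already listening on port 8050.

```shell
# Sketch of the lm-evaluation-harness invocation implied by the header above.
lm_eval \
  --model local-completions \
  --model_args base_url=http://localhost:8050/v1/completions,api_key=EMPTY,tokenizer=google/gemma-4-26b-a4b-it \
  --tasks gsm8k \
  --num_fewshot 8 \
  --batch_size 1
```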
gemma-4-26B-A4B-it-UD-IQ4_NL.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8870|± |0.0087|
| | |strict-match | 8|exact_match|↑ |0.8734|± |0.0092|
gemma-4-26B-A4B-it-UD-Q3_K_M.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8893|± |0.0086|
| | |strict-match | 8|exact_match|↑ |0.8779|± |0.0090|
gemma-4-26B-A4B-it-UD-Q3_K_S.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8893|± |0.0086|
| | |strict-match | 8|exact_match|↑ |0.8779|± |0.0090|
gemma-4-26B-A4B-it-UD-Q3_K_XL.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8863|± |0.0087|
| | |strict-match | 8|exact_match|↑ |0.8772|± |0.0090|
gemma-4-26B-A4B-it-UD-Q4_K_M.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.8999|± |0.0083|
| | |strict-match | 8|exact_match|↑ |0.8908|± |0.0086|
gemma-4-26B-A4B-it-UD-Q4_K_S.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.9007|± |0.0082|
| | |strict-match | 8|exact_match|↑ |0.8931|± |0.0085|
gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.9022|± |0.0082|
| | |strict-match | 8|exact_match|↑ |0.8946|± |0.0085|
gemma-4-26B-A4B-it-UD-Q5_K_S.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.9060|± |0.0080|
| | |strict-match | 8|exact_match|↑ |0.8939|± |0.0085|
gemma-4-26B-A4B-it-UD-Q5_K_M.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.9098|± |0.0079|
| | |strict-match | 8|exact_match|↑ |0.8984|± |0.0083|
gemma-4-26B-A4B-it-UD-Q5_K_XL.gguf-gsm8k
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B', 'tokenizer': 'google/gemma-4-26b-a4b-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: 1
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 8|exact_match|↑ |0.9045|± |0.0081|
| | |strict-match | 8|exact_match|↑ |0.8939|± |0.0085|
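The GSM8k spread across quants is small relative to the reported standard errors. As a quick illustrative check (an added sketch, not part of the original runs), a two-sample z-score on the (value, stderr) pairs shows that only the extreme quants differ by more than roughly two combined standard errors on flexible-extract:

```python
import math

# GSM8k flexible-extract scores taken from the tables above: (value, stderr)
scores = {
    "IQ2_XXS": (0.8825, 0.0089),
    "Q4_K_M":  (0.8999, 0.0083),
    "Q5_K_M":  (0.9098, 0.0079),
}

def z_diff(a, b):
    """z-score of the difference between two (value, stderr) results,
    treating the two runs as independent."""
    (va, sa), (vb, sb) = a, b
    return (vb - va) / math.sqrt(sa**2 + sb**2)

print(round(z_diff(scores["IQ2_XXS"], scores["Q5_K_M"]), 2))  # → 2.29 (beyond ~2 SE)
print(round(z_diff(scores["IQ2_XXS"], scores["Q4_K_M"]), 2))  # below the 1.96 threshold
```

By this criterion, most adjacent quant levels are statistically indistinguishable on GSM8k; only IQ2_XXS versus the Q5 family clears the ~1.96 bar.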
Detailed ARC test results
gemma-4-26B-A4B-it-UD-IQ2_XXS
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.8003|± |0.0117|
gemma-4-26B-A4B-it-UD-IQ4_NL
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9309|± |0.0074|
gemma-4-26B-A4B-it-UD-Q3_K_M
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9172|± |0.0081|
gemma-4-26B-A4B-it-UD-Q3_K_S
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9172|± |0.0081|
gemma-4-26B-A4B-it-UD-Q3_K_XL
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9206|± |0.0079|
gemma-4-26B-A4B-it-UD-Q4_K_M
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9275|± |0.0076|
gemma-4-26B-A4B-it-UD-Q4_K_S
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value| |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.936|± |0.0072|
gemma-4-26B-A4B-it-UD-Q4_K_XL
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9266|± |0.0076|
gemma-4-26B-A4B-it-UD-Q5_K_S
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9292|± |0.0075|
gemma-4-26B-A4B-it-UD-Q5_K_M
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9266|± |0.0076|
gemma-4-26B-A4B-it-UD-Q5_K_XL
local-completions ({'model': 'google/gemma-4-26B-A4B-it', 'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 1
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9317|± |0.0074|
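The per-model rows above all follow the harness's fixed pipe-delimited layout, which makes them easy to collect programmatically. A minimal parser sketch (the column order is taken from the tables above; continuation rows with an empty task cell are not handled):

```python
# Parse one lm-evaluation-harness markdown result row into a dict.
def parse_row(row: str) -> dict:
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    task, version, filt, n_shot, metric, _arrow, value, _pm, stderr = cells
    return {
        "task": task,
        "filter": filt,
        "n_shot": int(n_shot),
        "metric": metric,
        "value": float(value),
        "stderr": float(stderr),
    }

row = "|arc_challenge_chat| 1|remove_whitespace| 0|exact_match|↑ |0.9317|± |0.0074|"
print(parse_row(row)["value"])  # → 0.9317
```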
Detailed ifeval test results
gemma-4-26B-A4B-it-UD-IQ2_XXS.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9317|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9101|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8983|± |0.0130|
| | |none | 0|prompt_level_strict_acc|↑ |0.8706|± |0.0144|
gemma-4-26B-A4B-it-UD-IQ4_NL.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9341|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9185|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.9002|± |0.0129|
| | |none | 0|prompt_level_strict_acc|↑ |0.8799|± |0.0140|
gemma-4-26B-A4B-it-UD-Q3_K_M.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9281|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9185|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8928|± |0.0133|
| | |none | 0|prompt_level_strict_acc|↑ |0.8799|± |0.0140|
gemma-4-26B-A4B-it-UD-Q3_K_S.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9281|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9185|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8928|± |0.0133|
| | |none | 0|prompt_level_strict_acc|↑ |0.8799|± |0.0140|
gemma-4-26B-A4B-it-UD-Q3_K_XL.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9365|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9209|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.9057|± |0.0126|
| | |none | 0|prompt_level_strict_acc|↑ |0.8854|± |0.0137|
gemma-4-26B-A4B-it-UD-Q4_K_M.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9353|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9209|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.9020|± |0.0128|
| | |none | 0|prompt_level_strict_acc|↑ |0.8854|± |0.0137|
gemma-4-26B-A4B-it-UD-Q4_K_S.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9281|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9209|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8928|± |0.0133|
| | |none | 0|prompt_level_strict_acc|↑ |0.8835|± |0.0138|
gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9293|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9149|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8946|± |0.0132|
| | |none | 0|prompt_level_strict_acc|↑ |0.8762|± |0.0142|
gemma-4-26B-A4B-it-UD-Q5_K_S.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9293|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9173|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8946|± |0.0132|
| | |none | 0|prompt_level_strict_acc|↑ |0.8780|± |0.0141|
gemma-4-26B-A4B-it-UD-Q5_K_M.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9365|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9257|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.9057|± |0.0126|
| | |none | 0|prompt_level_strict_acc|↑ |0.8909|± |0.0134|
gemma-4-26B-A4B-it-UD-Q5_K_XL.gguf-ifeval
local-completions ({'base_url': 'http://localhost:8050/v1/completions', 'api_key': 'EMPTY', 'pretrained': 'google/gemma-4-26B-A4B-it', 'tokenizer': 'google/gemma-4-26B-A4B-it'}), gen_kwargs: ({}), limit: None, num_fewshot: 0, batch_size: 1
|Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|------|------:|------|-----:|-----------------------|---|-----:|---|------|
|ifeval| 4|none | 0|inst_level_loose_acc |↑ |0.9329|± | N/A|
| | |none | 0|inst_level_strict_acc |↑ |0.9221|± | N/A|
| | |none | 0|prompt_level_loose_acc |↑ |0.8983|± |0.0130|
| | |none | 0|prompt_level_strict_acc|↑ |0.8817|± |0.0139|
Detailed MMLU test results
MMLU testing is suspended for now because of how long each run takes.
gemma-4-26B-A4B-it-UD-IQ2_XXS.gguf
gguf ({'base_url': 'http://localhost:8050'}), gen_kwargs: ({}), limit: None, num_fewshot: 5, batch_size: 1
| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc | |0.7206|± |0.0036|
| - humanities | 2|none | 5|acc |↑ |0.6867|± |0.0066|
| - formal_logic | 1|none | 5|acc |↑ |0.6587|± |0.0424|
| - high_school_european_history | 1|none | 5|acc |↑ |0.8424|± |0.0285|
| - high_school_us_history | 1|none | 5|acc |↑ |0.8333|± |0.0262|
| - high_school_world_history | 1|none | 5|acc |↑ |0.8861|± |0.0207|
| - international_law | 1|none | 5|acc |↑ |0.8595|± |0.0317|
| - jurisprudence | 1|none | 5|acc |↑ |0.8056|± |0.0383|
| - logical_fallacies | 1|none | 5|acc |↑ |0.7607|± |0.0335|
| - moral_disputes | 1|none | 5|acc |↑ |0.7081|± |0.0245|
| - moral_scenarios | 1|none | 5|acc |↑ |0.6212|± |0.0162|
| - philosophy | 1|none | 5|acc |↑ |0.7170|± |0.0256|
| - prehistory | 1|none | 5|acc |↑ |0.8179|± |0.0215|
| - professional_law | 1|none | 5|acc |↑ |0.5737|± |0.0126|
| - world_religions | 1|none | 5|acc |↑ |0.8480|± |0.0275|
| - other | 2|none | 5|acc |↑ |0.7177|± |0.0078|
| - business_ethics | 1|none | 5|acc |↑ |0.8300|± |0.0378|
| - clinical_knowledge | 1|none | 5|acc |↑ |0.7283|± |0.0274|
| - college_medicine | 1|none | 5|acc |↑ |0.6474|± |0.0364|
| - global_facts | 1|none | 5|acc |↑ |0.3600|± |0.0482|
| - human_aging | 1|none | 5|acc |↑ |0.6099|± |0.0327|
| - management | 1|none | 5|acc |↑ |0.7379|± |0.0435|
| - marketing | 1|none | 5|acc |↑ |0.9060|± |0.0191|
| - medical_genetics | 1|none | 5|acc |↑ |0.6700|± |0.0473|
| - miscellaneous | 1|none | 5|acc |↑ |0.8084|± |0.0141|
| - nutrition | 1|none | 5|acc |↑ |0.7549|± |0.0246|
| - professional_accounting | 1|none | 5|acc |↑ |0.5709|± |0.0295|
| - professional_medicine | 1|none | 5|acc |↑ |0.7610|± |0.0259|
| - virology | 1|none | 5|acc |↑ |0.5000|± |0.0389|
| - social sciences | 2|none | 5|acc |↑ |0.8125|± |0.0069|
| - econometrics | 1|none | 5|acc |↑ |0.6930|± |0.0434|
| - high_school_geography | 1|none | 5|acc |↑ |0.8333|± |0.0266|
| - high_school_government_and_politics| 1|none | 5|acc |↑ |0.8964|± |0.0220|
| - high_school_macroeconomics | 1|none | 5|acc |↑ |0.7308|± |0.0225|
| - high_school_microeconomics | 1|none | 5|acc |↑ |0.8782|± |0.0212|
| - high_school_psychology | 1|none | 5|acc |↑ |0.8936|± |0.0132|
| - human_sexuality | 1|none | 5|acc |↑ |0.6870|± |0.0407|
| - professional_psychology | 1|none | 5|acc |↑ |0.7941|± |0.0164|
| - public_relations | 1|none | 5|acc |↑ |0.6455|± |0.0458|
| - security_studies | 1|none | 5|acc |↑ |0.8000|± |0.0256|
| - sociology | 1|none | 5|acc |↑ |0.8408|± |0.0259|
| - us_foreign_policy | 1|none | 5|acc |↑ |0.9000|± |0.0302|
| - stem | 2|none | 5|acc |↑ |0.6841|± |0.0079|
| - abstract_algebra | 1|none | 5|acc |↑ |0.4800|± |0.0502|
| - anatomy | 1|none | 5|acc |↑ |0.7556|± |0.0371|
| - astronomy | 1|none | 5|acc |↑ |0.8224|± |0.0311|
| - college_biology | 1|none | 5|acc |↑ |0.8681|± |0.0283|
| - college_chemistry | 1|none | 5|acc |↑ |0.5000|± |0.0503|
| - college_computer_science | 1|none | 5|acc |↑ |0.6800|± |0.0469|
| - college_mathematics | 1|none | 5|acc |↑ |0.4600|± |0.0501|
| - college_physics | 1|none | 5|acc |↑ |0.5980|± |0.0488|
| - computer_security | 1|none | 5|acc |↑ |0.7400|± |0.0441|
| - conceptual_physics | 1|none | 5|acc |↑ |0.7064|± |0.0298|
| - electrical_engineering | 1|none | 5|acc |↑ |0.7448|± |0.0363|
| - elementary_mathematics | 1|none | 5|acc |↑ |0.6614|± |0.0244|
| - high_school_biology | 1|none | 5|acc |↑ |0.8935|± |0.0175|
| - high_school_chemistry | 1|none | 5|acc |↑ |0.6749|± |0.0330|
| - high_school_computer_science | 1|none | 5|acc |↑ |0.8400|± |0.0368|
| - high_school_mathematics | 1|none | 5|acc |↑ |0.3963|± |0.0298|
| - high_school_physics | 1|none | 5|acc |↑ |0.7152|± |0.0368|
| - high_school_statistics | 1|none | 5|acc |↑ |0.6898|± |0.0315|
| - machine_learning | 1|none | 5|acc |↑ |0.6429|± |0.0455|
| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc | |0.7206|± |0.0036|
| - humanities | 2|none | 5|acc |↑ |0.6867|± |0.0066|
| - other | 2|none | 5|acc |↑ |0.7177|± |0.0078|
| - social sciences| 2|none | 5|acc |↑ |0.8125|± |0.0069|
| - stem | 2|none | 5|acc |↑ |0.6841|± |0.0079|
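Note that the top-line "mmlu" accuracy is not the plain mean of the four group scores: the harness aggregates over individual questions, so groups are (to the best of my understanding) weighted by their subtask sample counts. A quick sketch with the IQ2_XXS group numbers above shows the difference:

```python
# MMLU group accuracies for IQ2_XXS, taken from the table above.
groups = {
    "humanities": 0.6867,
    "other": 0.7177,
    "social sciences": 0.8125,
    "stem": 0.6841,
}

# Unweighted (macro) mean of the four groups.
macro = sum(groups.values()) / len(groups)
print(round(macro, 3))  # → 0.725, vs the reported sample-weighted overall of 0.7206
```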
gemma-4-26B-A4B-it-UD-IQ4_NL.gguf
gguf ({'base_url': 'http://localhost:8050'}), gen_kwargs: ({}), limit: None, num_fewshot: 5, batch_size: 1
| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc | |0.6565|± |0.0039|
| - humanities | 2|none | 5|acc |↑ |0.6315|± |0.0069|
| - formal_logic | 1|none | 5|acc |↑ |0.6270|± |0.0433|
| - high_school_european_history | 1|none | 5|acc |↑ |0.8182|± |0.0301|
| - high_school_us_history | 1|none | 5|acc |↑ |0.8480|± |0.0252|
| - high_school_world_history | 1|none | 5|acc |↑ |0.8523|± |0.0231|
| - international_law | 1|none | 5|acc |↑ |0.8099|± |0.0358|
| - jurisprudence | 1|none | 5|acc |↑ |0.7130|± |0.0437|
| - logical_fallacies | 1|none | 5|acc |↑ |0.6810|± |0.0366|
| - moral_disputes | 1|none | 5|acc |↑ |0.5723|± |0.0266|
| - moral_scenarios | 1|none | 5|acc |↑ |0.5140|± |0.0167|
| - philosophy | 1|none | 5|acc |↑ |0.5852|± |0.0280|
| - prehistory | 1|none | 5|acc |↑ |0.7654|± |0.0236|
| - professional_law | 1|none | 5|acc |↑ |0.5743|± |0.0126|
| - world_religions | 1|none | 5|acc |↑ |0.7427|± |0.0335|
| - other | 2|none | 5|acc |↑ |0.6244|± |0.0084|
| - business_ethics | 1|none | 5|acc |↑ |0.7200|± |0.0451|
| - clinical_knowledge | 1|none | 5|acc |↑ |0.6566|± |0.0292|
| - college_medicine | 1|none | 5|acc |↑ |0.6474|± |0.0364|
| - global_facts | 1|none | 5|acc |↑ |0.3300|± |0.0473|
| - human_aging | 1|none | 5|acc |↑ |0.5202|± |0.0335|
| - management | 1|none | 5|acc |↑ |0.7379|± |0.0435|
| - marketing | 1|none | 5|acc |↑ |0.4487|± |0.0326|
| - medical_genetics | 1|none | 5|acc |↑ |0.7000|± |0.0461|
| - miscellaneous | 1|none | 5|acc |↑ |0.7075|± |0.0163|
| - nutrition | 1|none | 5|acc |↑ |0.5915|± |0.0281|
| - professional_accounting | 1|none | 5|acc |↑ |0.5355|± |0.0298|
| - professional_medicine | 1|none | 5|acc |↑ |0.8125|± |0.0237|
| - virology | 1|none | 5|acc |↑ |0.4518|± |0.0387|
| - social sciences | 2|none | 5|acc |↑ |0.7345|± |0.0078|
| - econometrics | 1|none | 5|acc |↑ |0.5614|± |0.0467|
| - high_school_geography | 1|none | 5|acc |↑ |0.8081|± |0.0281|
| - high_school_government_and_politics| 1|none | 5|acc |↑ |0.7772|± |0.0300|
| - high_school_macroeconomics | 1|none | 5|acc |↑ |0.7000|± |0.0232|
| - high_school_microeconomics | 1|none | 5|acc |↑ |0.8782|± |0.0212|
| - high_school_psychology | 1|none | 5|acc |↑ |0.8239|± |0.0163|
| - human_sexuality | 1|none | 5|acc |↑ |0.6718|± |0.0412|
| - professional_psychology | 1|none | 5|acc |↑ |0.6863|± |0.0188|
| - public_relations | 1|none | 5|acc |↑ |0.6273|± |0.0463|
| - security_studies | 1|none | 5|acc |↑ |0.6653|± |0.0302|
| - sociology | 1|none | 5|acc |↑ |0.6816|± |0.0329|
| - us_foreign_policy | 1|none | 5|acc |↑ |0.7800|± |0.0416|
| - stem | 2|none | 5|acc |↑ |0.6492|± |0.0082|
| - abstract_algebra | 1|none | 5|acc |↑ |0.4400|± |0.0499|
| - anatomy | 1|none | 5|acc |↑ |0.6074|± |0.0422|
| - astronomy | 1|none | 5|acc |↑ |0.5921|± |0.0400|
| - college_biology | 1|none | 5|acc |↑ |0.7986|± |0.0335|
| - college_chemistry | 1|none | 5|acc |↑ |0.4700|± |0.0502|
| - college_computer_science | 1|none | 5|acc |↑ |0.6700|± |0.0473|
| - college_mathematics | 1|none | 5|acc |↑ |0.5100|± |0.0502|
| - college_physics | 1|none | 5|acc |↑ |0.5294|± |0.0497|
| - computer_security | 1|none | 5|acc |↑ |0.7300|± |0.0446|
| - conceptual_physics | 1|none | 5|acc |↑ |0.6340|± |0.0315|
| - electrical_engineering | 1|none | 5|acc |↑ |0.6759|± |0.0390|
| - elementary_mathematics | 1|none | 5|acc |↑ |0.6878|± |0.0239|
| - high_school_biology | 1|none | 5|acc |↑ |0.8742|± |0.0189|
| - high_school_chemistry | 1|none | 5|acc |↑ |0.6798|± |0.0328|
| - high_school_computer_science | 1|none | 5|acc |↑ |0.8100|± |0.0394|
| - high_school_mathematics | 1|none | 5|acc |↑ |0.4593|± |0.0304|
| - high_school_physics | 1|none | 5|acc |↑ |0.6093|± |0.0398|
| - high_school_statistics | 1|none | 5|acc |↑ |0.7407|± |0.0299|
| - machine_learning | 1|none | 5|acc |↑ |0.4554|± |0.0473|
| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc | |0.6565|± |0.0039|
| - humanities | 2|none | 5|acc |↑ |0.6315|± |0.0069|
| - other | 2|none | 5|acc |↑ |0.6244|± |0.0084|
| - social sciences| 2|none | 5|acc |↑ |0.7345|± |0.0078|
| - stem | 2|none | 5|acc |↑ |0.6492|± |0.0082|