Fix Ernie4.5 MoE without shared experts #14746

Merged · 1 commit into ggml-org:master · Jul 17, 2025

Conversation

pwilkin (Contributor) commented on Jul 17, 2025

Fix bug per discussion in #14658

github-actions bot added the python (python script changes) label on Jul 17, 2025
pwilkin (Contributor, Author) commented on Jul 17, 2025

@CISC I believe this is what you had in mind :)

pwilkin force-pushed the fix-big-ernie-moe branch from 88611c1 to f6e4931 on July 17, 2025 at 22:22
pwilkin force-pushed the fix-big-ernie-moe branch from f6e4931 to 19eb88c on July 17, 2025 at 22:24
CISC (Collaborator) commented on Jul 17, 2025

Yes, however you can also remove add_expert_shared_feed_forward_length and change the tensor loading in llama-model.cpp; see this similar code:

llama.cpp/src/llama-model.cpp, lines 4784 to 4786 at 760b448:

layer.ffn_gate_shexp = create_tensor(tn(LLM_TENSOR_FFN_GATE_SHEXP, "weight", i), {n_embd, n_ff_exp * n_expert_shared}, 0);
layer.ffn_down_shexp = create_tensor(tn(LLM_TENSOR_FFN_DOWN_SHEXP, "weight", i), { n_ff_exp * n_expert_shared, n_embd}, 0);
layer.ffn_up_shexp = create_tensor(tn(LLM_TENSOR_FFN_UP_SHEXP, "weight", i), {n_embd, n_ff_exp * n_expert_shared}, 0);
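
For illustration, a minimal sketch of what that change could look like (not the code merged here): the shared-expert tensors from the snippet above are only created when n_expert_shared is non-zero, so checkpoints without shared experts skip them entirely. All identifiers are taken from the snippet above; the surrounding layer loop and the graph-side handling are omitted.

// sketch: only create the shared-expert projections when the model actually has shared experts
if (n_expert_shared > 0) {
    layer.ffn_gate_shexp = create_tensor(tn(LLM_TENSOR_FFN_GATE_SHEXP, "weight", i), {n_embd, n_ff_exp * n_expert_shared}, 0);
    layer.ffn_down_shexp = create_tensor(tn(LLM_TENSOR_FFN_DOWN_SHEXP, "weight", i), {n_ff_exp * n_expert_shared, n_embd}, 0);
    layer.ffn_up_shexp   = create_tensor(tn(LLM_TENSOR_FFN_UP_SHEXP,   "weight", i), {n_embd, n_ff_exp * n_expert_shared}, 0);
}
// else: the shexp tensors stay null and the shared-expert branch is skipped when building the graph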

CISC (Collaborator) commented on Jul 17, 2025

Use n_expert_shared as the condition for loading them, and remember to init that value here:

llama.cpp/src/llama-model.cpp, lines 1657 to 1662 at 760b448:

if (arch == LLM_ARCH_ERNIE4_5_MOE) {
    ml.get_key(LLM_KV_EXPERT_FEED_FORWARD_LENGTH, hparams.n_ff_exp);
    ml.get_key(LLM_KV_EXPERT_SHARED_FEED_FORWARD_LENGTH, hparams.n_ff_shexp, false);
    ml.get_key(LLM_KV_INTERLEAVE_MOE_LAYER_STEP, hparams.n_moe_layer_step);
    ml.get_key(LLM_KV_LEADING_DENSE_BLOCK_COUNT, hparams.n_layer_dense_lead);
}
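
And a sketch of the init referred to here, again not the merged change: it assumes the shared-expert count is stored under the generic expert_shared_count key (LLM_KV_EXPERT_SHARED_COUNT, as used by other MoE architectures) and reads it as optional so hparams.n_expert_shared stays 0 for models converted without shared experts.

if (arch == LLM_ARCH_ERNIE4_5_MOE) {
    ml.get_key(LLM_KV_EXPERT_FEED_FORWARD_LENGTH,        hparams.n_ff_exp);
    ml.get_key(LLM_KV_EXPERT_SHARED_FEED_FORWARD_LENGTH, hparams.n_ff_shexp, false);
    // assumption: the shared-expert count lives under the generic expert_shared_count
    // key; reading it as optional leaves it at 0 when the key is absent
    ml.get_key(LLM_KV_EXPERT_SHARED_COUNT,               hparams.n_expert_shared, false);
    ml.get_key(LLM_KV_INTERLEAVE_MOE_LAYER_STEP,         hparams.n_moe_layer_step);
    ml.get_key(LLM_KV_LEADING_DENSE_BLOCK_COUNT,         hparams.n_layer_dense_lead);
}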

CISC (Collaborator) commented on Jul 17, 2025

Actually, let's not, as there are already GGUFs out there. The old calculation is fine as well.

CISC (Collaborator) commented on Jul 17, 2025

@nicoboss You will have to reconvert (or delete the ernie4_5-moe.expert_shared_feed_forward_length key).

CISC merged commit 670e136 into ggml-org:master on Jul 17, 2025
5 checks passed
Labels: python (python script changes)

2 participants