
Decoding the DeepSeek-V3 Inference Network: How Does the MoE Architecture Reshape Low-Latency, High-Throughput Requirements?

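The release of DeepSeek-V3 is driving an upgrade of distributed inference network architecture. MoE models introduce large-scale expert-parallel communication, inference traffic characteristics change markedly, and the Decode stage is sensitive to network latency. The network must guarantee low latency and high throughput, with performance optimized through end-network coordinated load balancing and congestion control. Efficient operations and maintenance enable rapid fault localization and high service availability, while single-rail dual-plane and Shuffle multi-plane networking schemes meet high-performance inference requirements at low cost, providing core network support for large-scale MoE model deployment.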

Published: 2025-10-27


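I. Inference Scenarios and MoE Models Bring New Network Demands

At the start of 2025, the release of DeepSeek-V3 quickly drew wide attention at home and abroad and set off a wave of deployments. As one of the core pieces of infrastructure, the distributed inference network faces entirely new requirements. Overall, several factors shape the direction and requirements of network construction: the traffic differences between inference and training, the introduction of the MoE model architecture, and DeepSeek's open-source technical solutions.

In the training and inference traffic of traditional dense models, more than 95% is Tensor Parallel (TP) communication, carried mainly by all-reduce inside the intra-server high-bandwidth domain. The inter-server low-bandwidth domain only carries low-volume data parallel (DP) and pipeline parallel (PP) communication between same-numbered cards. The MoE (Mixture of Experts) architecture adopted by DeepSeek changes these traffic characteristics significantly. Neither training nor inference uses TP communication; it is replaced by large-scale expert parallel (EP) communication, which accounts for more than 95% of traffic in training and 100% in inference. EP communication spans both high- and low-bandwidth domains and uses the all-to-all pattern, so the communication structure is complex and the traffic volume is huge, placing higher and more differentiated demands on network performance.

The DeepSeek model has 671 billion parameters. Its inference deployment introduces PD (Prefill/Decode) separation and large-scale EP parallelism, pushing full-size, high-performance inference toward distributed deployment. Compared with traditional single-server inference, distributed inference is significantly different: its traffic pattern is closer to that of distributed training, yet the two still differ markedly in traffic characteristics.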

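Communication traffic can be estimated with the following formula:

(minibatch size × context length × hidden-layer dimension) × number of nodes × (number of dispatch_alltoall communications × FP8 byte count + number of combine_alltoall communications × BF16 byte count) × number of layers handled by the GPU

The table below summarizes the main EP traffic for reference.

Scenario            Total traffic   Per-communication traffic
Training            315 GB          dispatch: 112 MB; combine: 224 MB
Inference Prefill   57.09 GB        dispatch: 168 MB; combine: 336 MB
Inference Decode    1218 MB         dispatch: 3.5 MB; combine: 7 MB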

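In the training scenario the traffic pattern is fixed and well defined: total traffic per iteration reaches 315 GB, and a single EP dispatch carries about 112 MB.

In the inference scenario, traffic depends on user input and fluctuates considerably. For the Prefill stage, traffic is estimated with a 4K context and a batch size of 4: total traffic per iteration is about 57.09 GB, and per-communication traffic is close to that of training. For the Decode stage, traffic is estimated with a concurrency of 128: traffic per iteration drops sharply to about 1.2 GB, and per-communication traffic is only a few MB. The traffic difference between Prefill and Decode is significant.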
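The estimation formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' calculation: the hidden dimension of 7168 is DeepSeek-V3's published hidden size, while `nodes=4` and `rounds=116` are hypothetical inputs chosen because they reproduce the Decode row of the table (one token per request per step, one dispatch and one combine per round). The table's MB and GB are treated as MiB and GiB.

```python
# Sketch of the EP traffic formula; all inputs below are assumptions.
FP8_BYTES = 1    # dispatch payload is sent in FP8
BF16_BYTES = 2   # combine payload is sent in BF16
MIB = 1024 * 1024


def per_call_bytes(tokens: int, hidden_dim: int, nodes: int, elem_bytes: int) -> int:
    """Bytes moved by one all-to-all call: tokens x hidden dim x nodes x element size."""
    return tokens * hidden_dim * nodes * elem_bytes


def ep_traffic(tokens: int, hidden_dim: int, nodes: int, rounds: int) -> dict:
    """Per-call and total EP traffic, assuming one dispatch and one combine per round."""
    dispatch = per_call_bytes(tokens, hidden_dim, nodes, FP8_BYTES)
    combine = per_call_bytes(tokens, hidden_dim, nodes, BF16_BYTES)
    return {
        "dispatch_mib": dispatch / MIB,
        "combine_mib": combine / MIB,
        "total_mib": (dispatch + combine) * rounds / MIB,
    }


# Assumed Decode inputs: 128 concurrent requests x 1 new token each,
# hidden_dim=7168, nodes=4, rounds=116 (hypothetical values, see the lead-in).
decode = ep_traffic(tokens=128, hidden_dim=7168, nodes=4, rounds=116)
print(decode)
```

Under these assumptions the sketch yields 3.5 MiB per dispatch, 7 MiB per combine, and 1218 MiB in total, matching the Decode row. The Prefill and training rows depend on parameters the article does not state, so they are not reproduced here.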

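Given these new and complex network requirements, identifying and analyzing the key technologies of the DeepSeek inference network in depth is essential to guaranteeing high performance, low cost, and high reliability for inference. The following sections introduce these key technologies from three angles: low network latency, efficient network operations and maintenance, and low-cost networking.

II. A Low-Latency Network Enables High Inference Throughput

According to the traffic analysis above, a single communication in the Decode stage carries only 3.5 MB (dispatch) or 7 MB (combine). Based on the performance of DeepEP, DeepSeek's official open-source communication library, a Decode-stage dispatch completes within 100 µs and a combine within 200 µs in this scenario. The Decode-stage SLO usually requires less than 50 ms, yet there are as many as 116 EP communications, and each one adds to the accumulated latency, so the demands on network latency are very high. In short, in the Decode stage the very small per-communication traffic, the very short communication time, and the strict SLO all call for low network latency.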
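To make the accumulation effect concrete, the following sketch multiplies a per-communication delay by the 116 EP communications quoted above and compares the result with the 50 ms SLO. It is a deliberately simple model: it assumes the communications are serialized and not overlapped with computation, and the delay values in the loop are illustrative, not measurements.

```python
EP_COMMS_PER_STEP = 116  # EP communications per decode step, from the text
DECODE_SLO_MS = 50.0     # typical Decode SLO, from the text


def added_latency_ms(extra_us_per_comm: float, comms: int = EP_COMMS_PER_STEP) -> float:
    """Extra time per decode step if every EP communication is delayed by extra_us_per_comm."""
    return extra_us_per_comm * comms / 1000.0


for extra_us in (10, 100, 1000):  # illustrative per-communication delays
    added = added_latency_ms(extra_us)
    share = added / DECODE_SLO_MS
    print(f"+{extra_us} us per comm -> +{added:.1f} ms per step ({share:.0%} of the SLO)")
```

In this model, 100 µs of extra delay per communication accumulates to 11.6 ms per step, and 1 ms per communication accumulates to 116 ms, more than twice the SLO on its own.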

[Figure: Impact of network latency on Decode throughput, H800]

[Figure: Impact of network latency on Decode throughput, H20]

The figures above show a simulation of how network latency affects Decode throughput, for Decode scenarios with 4K/1K context and 1K output on H800 and H20 devices, using a batch of 128. When the network adds 1 ms of latency, throughput is hit hard on both H800 and H20 and across the different context lengths: it drops by around 80%, which practically makes the Decode node unusable. When the network adds 100 µs of latency, throughput in the 4K-context scenario may drop by more than 20%. Decode nodes are therefore highly sensitive to network latency. Under DeepSeek's large-scale EP parallel all-to-all communication pattern, the main factors affecting network latency are load balancing and congestion control:

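[Figure: EP-domain traffic crossing multiple Leaf switches via the Spine]

As the figure above shows, in DeepSeek inference with large-scale EP, communication within an EP domain may span multiple Leaf switches, with traffic heading up to the Spine. This easily produces the classic problem of uneven ECMP hashing, which leads to high dynamic latency. DeepSeek's MoE inference is also prone to load imbalance between instances and to expert load imbalance within an instance, which shows up on the network as a mix of large and small flows. This makes the dynamic latency caused by ECMP imbalance worse: a poor load-balancing strategy can easily introduce 100 µs or more of dynamic latency on the network. As analyzed above, dynamic latency at this level may reduce throughput by more than 20%. In DeepSeek's official setup, the Adaptive Routing (AR) technology of IB switches and CX NICs effectively mitigates ECMP load imbalance. In a RoCE environment, an end-network coordinated load-balancing scheme is critical under such strict low-latency requirements.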
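The ECMP imbalance described above can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the IB Adaptive Routing algorithm or any vendor's load balancer: `static_hash_assign` stands in for ECMP by hashing a hypothetical flow identifier onto an uplink, and `least_loaded_assign` is a simple greedy placement used here as a proxy for load-aware routing. Flow names and sizes are made up.

```python
import zlib


def link_loads(flow_sizes, link_of_flow, n_links):
    """Sum the bytes placed on each uplink."""
    loads = [0] * n_links
    for size, link in zip(flow_sizes, link_of_flow):
        loads[link] += size
    return loads


def imbalance(loads):
    """Ratio of the busiest link to the average link (1.0 means perfectly even)."""
    return max(loads) / (sum(loads) / len(loads))


def static_hash_assign(flow_ids, n_links):
    """ECMP stand-in: the uplink is fixed by a hash of the flow id, ignoring flow size."""
    return [zlib.crc32(fid.encode()) % n_links for fid in flow_ids]


def least_loaded_assign(flow_sizes, n_links):
    """Load-aware stand-in: place each flow on the currently least-loaded uplink."""
    loads = [0] * n_links
    placement = []
    for size in flow_sizes:
        link = loads.index(min(loads))
        placement.append(link)
        loads[link] += size
    return placement


# Two elephant flows and two mice flows (hypothetical sizes) over two uplinks.
sizes = [100, 100, 1, 1]
collided = link_loads(sizes, [0, 0, 1, 1], 2)  # both elephants hash to uplink 0
balanced = link_loads(sizes, least_loaded_assign(sizes, 2), 2)
print(collided, imbalance(collided))
print(balanced, imbalance(balanced))

# Hash-based placement of hypothetical flow ids; the result depends on the hash.
flow_ids = ["gpu0->gpu9", "gpu1->gpu9", "gpu2->gpu8", "gpu3->gpu8"]
print(link_loads(sizes, static_hash_assign(flow_ids, 2), 2))
```

When both large flows collide on one uplink, that link carries almost twice the average load while the other sits nearly idle; a size-blind hash cannot rule this out, which is the situation load-aware schemes are meant to avoid.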

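In addition, the large-scale expert-parallel communication of MoE models is essentially an all-to-all pattern, so incast traffic is inherent in the network. A sound congestion-control strategy can avoid the high dynamic latency caused by rate reduction or by PFC (Priority Flow Control) being triggered, keeping network latency stable and protecting inference performance.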

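III. Efficient End-Network Operations Ensure Highly Available Inference Services

[Figure: Slow faults and hang anomalies]

[Figure: Link failure]

As DeepSeek inference introduces large-scale expert parallelism (EP), distributed inference clusters face fault challenges similar to those of training clusters. According to research data published by Meta, a 1024-card cluster experiences a failure every 7.9 hours on average. Based on their impact on inference, faults can be grouped into three types:

Slow-node anomaly: after the fault occurs the inference task is not interrupted, but the performance of some nodes or stages degrades, slowing down inference as a whole. This appears as the slow-node effect.

Hang anomaly: the fault causes inference to stall at a certain stage for a long time, so the task cannot make progress, although inference as a whole has not yet been interrupted.

Link failure: a broken link directly causes the entire inference instance to exit.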

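With slow-node anomalies and short hangs, the inference task keeps running but performance suffers badly: TTFT (Time To First Token) and TPOT (Time Per Output Token) deteriorate markedly, and throughput may fall by more than 50%. Real-time monitoring, rapid localization, and troubleshooting of slow faults and hang anomalies are therefore highly valuable for protecting inference performance.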
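As a rough illustration of what real-time monitoring for these two fault types might check, the sketch below flags slow ranks by comparing each rank's step latency with the cluster median, and flags a hang when no progress has been seen for a timeout. The function names, the 1.5x factor, and the 30-second timeout are all hypothetical choices for this example, not values from the article or from any monitoring product.

```python
from statistics import median


def find_slow_ranks(step_ms_by_rank: dict, factor: float = 1.5) -> list:
    """Return ranks whose step latency exceeds factor x the median latency."""
    threshold = factor * median(step_ms_by_rank.values())
    return sorted(rank for rank, ms in step_ms_by_rank.items() if ms > threshold)


def is_hung(last_progress_s: float, now_s: float, timeout_s: float = 30.0) -> bool:
    """Treat an instance as hung if it has made no progress for longer than timeout_s."""
    return now_s - last_progress_s > timeout_s


# Hypothetical per-rank decode step latencies in milliseconds.
latencies = {0: 10.0, 1: 11.0, 2: 10.0, 3: 30.0}
print(find_slow_ranks(latencies))
print(is_hung(last_progress_s=100.0, now_s=145.0))
```

A median-based threshold is used here because a single slow rank barely moves the median, whereas it would pull a mean-based threshold upward.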

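When a prolonged hang or a link failure causes the inference instance to exit outright, the business impact is more severe. In large-scale multi-instance deployments, requests can be switched quickly to other healthy instances; this may sacrifice some user experience but preserves service continuity. By contrast, when a fault hits a deployment with few instances (for example, a single Decode instance), it often interrupts the service directly, seriously affecting stability and user experience. In small-scale scenarios, therefore, fault localization, failover, and avoidance are the key means of ensuring service availability.

IV. Cost-Effective Inference Networking Squeezes the Cost per Million Tokens

1. Dual-plane networking with dual-port NICs:

[Figure: Single-rail dual-plane networking]

Given the requirements above for low network latency and high reliability, the single-rail dual-plane networking scheme shown in the figure can protect performance and reliability to the greatest extent. Compared with a traditional CLOS architecture, it also offers better cost-effectiveness. Its characteristics are as follows: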

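Advantages:

Simple network structure: traffic is concentrated on the Leaf switches, which reduces the complexity of cross-switch communication and cuts latency significantly.

High cost efficiency: copper-cable interconnects are supported and fewer switches are needed, so overall network investment is lower.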

Low latency: a data-plane path spans at most 2 links, with a maximum of 1 switch hop, ensuring low-latency transmission.

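Low flow-control demands: there is no load-balancing problem because traffic follows a single path, which simplifies flow-control design.

Easy to scale: adding nodes does not require adding a second network tier, so the cluster can scale out horizontally.

Well suited to bonding: bonded dual-plane networking improves network reliability, and because there is no second tier, bonding adds no extra switch cost.

Disadvantages:

Limited flexibility: a Prefill or Decode instance cannot be deployed across Leaf switches, so the maximum size of a single instance is limited to 256 cards.

Limited compatibility: the network is optimized for the characteristics of inference traffic and is hard to reuse for combined training-and-inference scenarios.

KV Cache transfer depends on a storage network: with PD-separated deployment, if PD instances span Leaf switches, a storage network must be built to support KV Cache transfer.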

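2. Shuffle multi-plane networking:

With the dual-plane scheme based on dual-port NICs, the maximum size of a single Pod is limited to 256 cards, which restricts flexibility. To break through this bottleneck, a Shuffle (optical cross-connect box) is introduced between the servers and the switches to split the optical links at the physical layer. Using 400 Gbps NICs and switches based on the TH5 chip, the scheme is upgraded to four planes and the maximum size of a single Pod grows to 512 cards, which satisfies the vast majority of inference deployments. This scheme supports larger-scale EP parallelism and a larger number of PD instances, and PD instances do not need to be scheduled across Pods. It greatly improves networking flexibility within a Pod and significantly reduces dependence on a KV Cache storage network.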

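In the future, with the adoption of 800 Gbps NICs and switches based on the TH6 chip, the Shuffle multi-rail scheme can be extended to 8 rails. While guaranteeing each GPU 800 Gbps of bandwidth, the maximum size of a single Pod can grow to 1024 cards, meeting the needs of ultra-large-scale inference services. Even without a second network tier, this scheme still offers high flexibility for PD-separated deployment: PD instances do not need to be scheduled across Pods, and no dedicated network for KV Cache transfer is required, delivering excellent cost-effectiveness and performance.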
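The Pod sizes quoted in this section (256, 512, and 1024 cards) are consistent with a simple port-count calculation, sketched below. The model is an assumption, not something the article spells out: each plane is served by one single-tier switch, each GPU's NIC bandwidth is split evenly across the planes, and every GPU takes one port on each plane's switch. The switching capacities used are the publicly stated 51.2 Tbps for TH5 and 102.4 Tbps for TH6; applying TH5 and 400 Gbps NICs to the dual-plane case, and treating the article's 8 rails as 8 planes, are further assumptions.

```python
TH5_GBPS = 51_200    # Tomahawk 5 switching capacity, 51.2 Tbps
TH6_GBPS = 102_400   # Tomahawk 6 switching capacity, 102.4 Tbps


def max_pod_gpus(switch_gbps: int, nic_gbps: int, planes: int) -> int:
    """GPUs per Pod when each GPU uses one (nic_gbps / planes) port on every plane's switch."""
    if nic_gbps % planes:
        raise ValueError("NIC bandwidth must split evenly across planes")
    port_gbps = nic_gbps // planes
    return switch_gbps // port_gbps


print(max_pod_gpus(TH5_GBPS, nic_gbps=400, planes=2))  # dual-plane
print(max_pod_gpus(TH5_GBPS, nic_gbps=400, planes=4))  # four-plane with Shuffle
print(max_pod_gpus(TH6_GBPS, nic_gbps=800, planes=8))  # eight-way split
```

Under this model, splitting the same NIC bandwidth across more planes lowers the per-port speed, so each switch exposes more ports and a single-tier Pod can hold more GPUs without reducing per-GPU bandwidth.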

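Summary

Distributed inference deployment of DeepSeek MoE models brings new challenges for inference network architecture and performance assurance. The communication patterns and traffic characteristics of inference differ markedly from those of traditional training. The Decode stage in particular is sensitive to network latency and requires a network with both low latency and high throughput. End-network coordinated load-balancing algorithms and congestion-control techniques are key to guaranteeing network performance. At the same time, high availability for inference services requires comprehensive fault monitoring, rapid localization, and failover strategies. To meet these needs, a simple, efficient, and highly reliable single-rail dual-plane networking scheme can reduce cost while protecting performance. As DeepSeek and similar large-scale MoE models are deployed more widely, the optimization and innovation of inference networks will become a core competitive strength.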
