DeepSeek releases Prover-V2 model with 671 billion parameters
DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the AI open-source community Hugging Face. The model has 671 billion parameters and is an upgraded version of the Prover-V1.5 mathematical model released last year. It reportedly uses the more efficient safetensors file format and supports multiple computation precisions, making it faster and more resource-efficient to train and deploy.

In terms of architecture, the model is built on DeepSeek-V3 and adopts a Mixture of Experts (MoE) design, with 61 Transformer layers and a 7,168-dimensional hidden layer. It also supports ultra-long contexts, with maximum position embeddings of 163,840, making it capable of handling complex mathematical proofs. In addition, it applies FP8 quantization to reduce model size and improve inference efficiency. (Jinse)
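Since these configuration values ship alongside the open weights, readers can verify them directly. Below is a minimal sketch that reads the published config from the Hugging Face Hub; it assumes the standard `transformers` library and the repository id `deepseek-ai/DeepSeek-Prover-V2-671B`, with field names following DeepSeek-V3 config conventions.

```python
# Minimal sketch: inspect the published config of DeepSeek-Prover-V2-671B
# from the Hugging Face Hub. Assumes the `transformers` package is installed;
# the repository id is taken from the release announcement.
from transformers import AutoConfig

# trust_remote_code is typically needed for DeepSeek-V3-style models,
# whose modeling code is shipped inside the repository itself.
config = AutoConfig.from_pretrained(
    "deepseek-ai/DeepSeek-Prover-V2-671B",
    trust_remote_code=True,
)

# Fields mentioned in the article (names per DeepSeek-V3 config conventions):
print(config.num_hidden_layers)        # expected: 61 Transformer layers
print(config.hidden_size)              # expected: 7168-dimensional hidden layer
print(config.max_position_embeddings)  # expected: 163840 (ultra-long context)
```

This reads only the small config.json, so it does not download the full 671B-parameter weights; loading the weights themselves would require a multi-GPU setup sized for the safetensors shards.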