overview for allende2001

from HB news mega (post body, "Mounting Protests Against Paz in Bolivia / Iran Deal Remains Out of Reach / Sahel Holds Firm Against Foreign Aggression")


Image depicts Bolivian trade unionists on strike in La Paz, Bolivia.

Long preamble/summary below of recent news events.

The Iran ceasefire is grinding on. After a brief period of heightened activity over the weekend, when it seemed that US strikes might be resuming, Trump announced a "Memorandum of Understanding" with Iran, which initially appeared to be an agreement along the lines of Iran's demands.

For those not following along with the diplomatic minutiae, Iran's position for several weeks has been that the nuclear issue must be discussed separately - because, well, last time they started discussing the nuclear issue with the US, they got fucking bombed - and so it has proposed a two-stage negotiation. In the first stage, the war is officially ended with certain preconditions (e.g. the US has to end sanctions, unfreeze assets and presumably withdraw at least some military assets). In the second stage, the nuclear issue is handled.

The reason a deal has still not been signed after all this time is that the US disagrees with doing it this way and wants the nuclear issue to be handled right away (and obviously also objects to things like Iran retaining control of the Strait). Trump's announcement therefore appeared to be him finally accepting reality, but it quickly became apparent that this was just another market manipulation. I'm definitely in the camp, along with several other analysts, that believes another round of war is going to happen barring some very sudden change in circumstances (e.g. Trump being forced out of power one way or another, or Iran obtaining a nuke), because the US still seems agreement-incapable. And in Lebanon, Zionist consternation over Hezbollah's attacks continues, as the FPV drone threat keeps growing despite their desperate search for countermeasures.

As I've been perhaps too focussed on Iran lately, here's a brief roundup of big news events from the last month or so.

Orban losing power: Pretty cool, though his replacement being Neoliberal #2980329891 means that big changes seem unlikely.
Strikes in Bolivia against that dipshit Paz: Very nice to see. Bolivia appears to have some of the strongest on-the-ground popular support in Latin America for worker-centric policies and politicians, enough to genuinely pressure those in power (the Labor Minister has already resigned).
Situation in the Sahel: "Mysterious" third parties sponsored a big offensive against the AES, which the AES largely repelled with help from Russia. As I understand it, the situation there is still a little tenuous, with anti-government forces focusing more on blockading cities to cause internal revolts. This tactic is currently broadly failing, as armed convoys are getting fuel and food into the cities, but figures like Traore are aware that more needs to be done.
Ukraine War: Aside from the usual grinding Russian advance on the front, there have been back-and-forth missile and drone strikes. Ukraine hit some targets on the outskirts of Moscow with drones, and Russia then fired a shitload of missiles, including the iconic Oreshnik, directly at Kiev. Simplicius and others have covered this in greater detail.

I could go on and on about the recent aggressions against Cuba, Modi's recent victories in India and the AI/chip tech war between China and the US, but this preamble has to end at some point due to the character limit.

Source: https://lemmygrad.ml/post/11718194 (post body)

No Donations for Days 😭 Please Don’t Leave Gaza Alone 💔 by allende2001 in c/mutual_aid@hexbear.net

[-] allende2001@lemmygrad.ml 2 points 6 days ago

bumping

43

Scientists sound alarm as dangerous amoebas spread globally (www.sciencedaily.com)

submitted 4 weeks ago* (last edited 4 weeks ago) by allende2001@lemmygrad.ml to c/science@hexbear.net

5 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11499483

Archive link: https://archive.ph/ZQGwr

Scientists are raising concerns about an under-the-radar threat hiding in everyday environments: free-living amoebae. Credit: Shutterstock

A team of environmental and public health scientists is raising concerns about a largely overlooked group of microscopic organisms that may pose a growing danger worldwide: free-living amoebae. In a recent perspective article published in Biocontaminant, researchers explain that these tiny life forms are becoming an emerging global health risk. Their spread is being driven by rising temperatures, aging water infrastructure, and limited systems for detecting and tracking them. Although most people have never heard of free-living amoebae, scientists say they deserve far more attention.

What Are Free-Living Amoebae

Amoebae are single-celled organisms that live naturally in soil, freshwater, and even some man-made water systems. They move and feed by extending parts of their cell body, a process that gives them their distinctive shape. Most amoebae are harmless and play a role in natural ecosystems. However, a small number of species can infect humans and cause severe illness. These infections are rare, but when they do occur, they can be extremely serious. One of the most well-known examples is Naegleria fowleri (often called the brain-eating amoeba). This organism can enter the body when contaminated water goes up the nose, such as during swimming in warm lakes or poorly treated water. Once inside, it can travel to the brain and cause a fast-moving infection that is almost always fatal.

Why These Microbes Are So Hard to Eliminate

Scientists say one of the most concerning features of these amoebae is their ability to survive harsh conditions that would normally kill other microorganisms. "What makes these organisms particularly dangerous is their ability to survive conditions that kill many other microbes," said corresponding author Longfei Shu of Sun Yat-sen University. "They can tolerate high temperatures, strong disinfectants like chlorine, and even live inside water distribution systems that people assume are safe." This resilience means that standard water treatment methods may not always be enough to eliminate them, especially in older or poorly maintained systems.

The Hidden Role of Amoebae in Spreading Other Pathogens

The risks go beyond the amoebae themselves. Researchers highlight that these organisms can act as protective hosts for other harmful microbes, including bacteria and viruses. Inside the amoeba, these pathogens can survive in a kind of safe shelter, shielded from disinfectants that would normally destroy them. This process is often described as a so-called Trojan horse effect. It allows dangerous microbes to persist in drinking water systems and potentially spread more easily. Scientists are also concerned that this protective environment could help promote antibiotic resistance, making infections harder to treat over time.

Climate Change Is Expanding Their Reach

Rising global temperatures are expected to make the problem worse. Many of these amoebae thrive in warm conditions, so as water temperatures increase, they are likely to expand into new regions where they were once uncommon. In recent years, several outbreaks linked to recreational water use have already heightened public concern in different parts of the world. These incidents suggest that the risk is no longer limited to a few isolated areas.

Calls for Better Monitoring and Safer Water Systems

To address the growing threat, researchers are calling for a broader, more coordinated response. They recommend a One Health approach, which brings together experts in human health, environmental science, and water management to tackle the issue from multiple angles. Improving surveillance systems is a key priority, along with developing faster and more accurate diagnostic tools. The team also emphasizes the need for advanced water treatment technologies that can better target these resilient organisms before they pose a risk to the public.

A Problem That Crosses Boundaries

"Amoebae are not just a medical issue or an environmental issue," Shu said. "They sit at the intersection of both, and addressing them requires integrated solutions that protect public health at its source." As scientists continue to learn more about these microscopic organisms, one message is becoming clear: something largely invisible to the naked eye could have a much bigger impact on global health than previously thought.

0

Scientists sound alarm as dangerous amoebas spread globally (www.sciencedaily.com)

submitted 4 weeks ago* (last edited 4 weeks ago) by allende2001@lemmygrad.ml to c/science@lemmygrad.ml

0 comments fedilink

Archive link: https://archive.ph/ZQGwr


33

China builds world’s first ‘coal battery’ with zero emission (www.scmp.com)

submitted 1 month ago* (last edited 1 month ago) by allende2001@lemmygrad.ml to c/technology@hexbear.net

0 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11456035

New technology developed by Chinese scientists achieves higher energy efficiency than burning while eliminating carbon dioxide emissions

Chinese scientists have developed a way to generate electricity and achieve higher energy efficiency than conventional burning methods, while producing zero carbon dioxide emissions, by placing coal inside a “battery”.

“Coal-fired power” conjures images of heavy pollution, steep carbon footprints and modest efficiency. But a novel, direct coal power technology challenges that stereotype by eliminating combustion entirely and sidestepping the carbon dioxide emissions that have long defined coal use.

A team led by Xie Heping, a member of the Chinese Academy of Sciences with Shenzhen University, has for the first time built what they call a zero-carbon-emission direct coal fuel cell, or ZC-DCFC.


In this system, coal is pulverised, dried, purified and subjected to surface pre-treatment before being fed into the anode chamber of the cell.

Oxygen is supplied to the cathode, and within the cell, the fine coal powder undergoes electrochemical oxidation across an oxide membrane, yielding electricity directly – without any intermediate steam cycle or mechanical turbine.

At the anode outlet, the high-purity carbon dioxide generated by the reaction is captured in situ and catalytically converted into valuable chemical feedstocks such as synthesis gas or mineralised into compounds like sodium bicarbonate. The entire process is silent and clean.

Conventional coal power relies on burning coal to produce heat, which then boils water into steam to spin a turbine generator – a chain of conversions that remains hostage to the Carnot efficiency limit of heat engines.

“This process is bound by the Carnot cycle, capping energy efficiency at around 40 per cent. In the ZC-DCFC, by avoiding the efficiency losses associated with combustion and thermal engines, it enables substantially higher theoretical efficiency,” Xie noted in his paper, which appeared in the peer-reviewed journal Energy Reviews.
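To put rough numbers on the efficiency argument in the quote above, here is a small Python sketch. It compares the Carnot ceiling of a heat engine with the thermodynamic ceiling of a fuel cell that oxidises carbon to CO2. The reservoir temperatures are illustrative assumptions, not figures from the article or the paper; the enthalpy and Gibbs energy are the standard textbook values for CO2 formation at 25 °C.

```python
# Illustrative only: the temperatures below are assumed, not taken from the article.

def carnot_limit(t_hot_k: float, t_cold_k: float) -> float:
    """Maximum efficiency of any heat engine between two reservoirs (kelvin)."""
    if t_cold_k <= 0 or t_hot_k <= t_cold_k:
        raise ValueError("need t_hot_k > t_cold_k > 0")
    return 1.0 - t_cold_k / t_hot_k


def fuel_cell_limit(delta_g_kj: float, delta_h_kj: float) -> float:
    """Ideal fuel cell efficiency: electrical work (dG) over heat of reaction (dH)."""
    return delta_g_kj / delta_h_kj


# Assumed steam-cycle reservoirs: roughly 600 °C steam and 30 °C cooling water.
steam = carnot_limit(873.15, 303.15)

# C(s) + O2 -> CO2 at 25 °C, standard values in kJ/mol.
carbon_cell = fuel_cell_limit(-394.4, -393.5)

print(f"Carnot ceiling (assumed temperatures): {steam:.1%}")
print(f"Direct carbon fuel cell ceiling:       {carbon_cell:.1%}")
```

Both numbers are ideal ceilings. Real steam plants fall well short of the Carnot bound, so the "around 40 per cent" in the quote reads as a practical figure, and real fuel cells lose efficiency to overpotentials and resistive losses.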

Since 2018, Xie’s team has pushed the technology forward step by step, solving problems in materials, cell durability, fuel treatment and continuous coal feeding along the way.

Earlier generations of direct carbon fuel cells were plagued by low power density and short operational lifetimes. The newly developed cell, however, incorporates improvements in stack scalability, long-term stability, carbon conversion efficiency and overall system integration – areas the team has targeted in their paper.

“This concept can also be extended to deep coal seams located 2km (1.2 miles) underground,” Xie said.

Traditional mining of coal from such depths is prohibitively expensive. This technology could convert the coal to electricity on site, with only the power needing transmission to the surface. This approach could help ease pressure as shallow coal reserves gradually dwindle.

Xie’s group is also spearheading a landmark project under the National Science and Technology Major Project for Deep Earth Probe and Mineral Resources Exploration, launched in 2025.

Adapting the ZC-DCFC to withstand high temperatures, pressures and corrosive environments would enable the fuel cell to serve the deep-earth exploration initiative directly.

The research aligns squarely with China’s goal of achieving carbon neutrality by 2060. Yet expecting this laboratory-scale innovation to displace the nation’s existing coal-fired power fleet any time soon would be unrealistic.

Wei Zhijiang, a senior engineer at HBIS Group Xuansteel, said that by the end of 2025, coal power made up about 45 per cent of China’s total installed capacity but still supplied nearly 60 per cent of the nation’s electricity.
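The gap between those two shares says something about how hard the coal fleet is run. A minimal Python sketch of the arithmetic follows; it rounds the quoted "about 45 per cent" and "nearly 60 per cent" to exact values, which is an assumption made purely for illustration.

```python
# Shares as quoted from Wei Zhijiang, rounded for illustration.
coal_capacity_share = 0.45
coal_generation_share = 0.60


def relative_utilisation(generation_share: float, capacity_share: float) -> float:
    """Output per unit of installed capacity, relative to the fleet-wide average (1.0)."""
    return generation_share / capacity_share


coal = relative_utilisation(coal_generation_share, coal_capacity_share)
other = relative_utilisation(1 - coal_generation_share, 1 - coal_capacity_share)

print(f"Coal runs at {coal:.2f}x the fleet-average utilisation")
print(f"Everything else runs at {other:.2f}x")
print(f"Coal capacity is used about {coal / other:.1f}x as intensively as the rest")
```

Under these rounded inputs, a unit of coal capacity produces a bit under twice as much electricity as a unit of non-coal capacity.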

Meanwhile, half of those coal plants had been running for just 15 years – still young in industrial terms.

Pointing out the practical hurdles, Wei said moving the direct coal fuel cell from the lab to wide commercial use would take time and careful cost planning. Therefore, he believed the technology would not be cost-competitive until after 2045.

If you're interested in learning how the technology works in more detail, here's the paper referenced in the article, from the peer-reviewed journal Energy Reviews:

Towards zero-carbon-emission direct coal fuel cells for power generation

Abstract
Carbon neutrality has become an international consensus under the requirements established by the Paris Agreement. Accordingly, countries worldwide, especially developing nations, have formulated their own carbon neutrality policies. Owing to differences in regional development histories and resource endowments, as well as the intermittency of new energy, developing countries will continue to rely on coal to meet their energy demands for sustainable economic and social development in the near future. However, conventional coal-fired power generation technologies can hardly achieve low-carbon or even negative-carbon emissions. It is therefore urgent to develop novel carbon-free coal power technologies. This perspective proposes the concept of Zero-carbon-emission direct coal fuel cells (ZC-DCFC) for power generation as a disruptive technological paradigm for efficient coal utilization. The technological architecture of ZC-DCFC is discussed, including fuel supply, key materials, and in-situ CO2 conversion. The technical challenges and future development directions are also identified. ZC-DCFC is expected to open up a new pathway for near-zero-emission coal utilization, transforming coal from a traditional fossil fuel into a feasible clean energy source in the global low-carbon transition.

13

What is the Atlantic Meridional Overturning Circulation, and why are scientists worried about it slowing down? (www.cbc.ca)

submitted 1 month ago by allende2001@lemmygrad.ml to c/science@hexbear.net

0 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11435420

Archive link: https://archive.ph/bYxBK

34

Palantir Employees Are Starting to Wonder if They're the Bad Guys (www.wired.com)

submitted 1 month ago by allende2001@lemmygrad.ml to c/worldnews@lemmygrad.ml

1 comments fedilink

cross-posted from: https://hexbear.net/post/8323358

Archive link: https://archive.is/veTal

https://ghostarchive.org/archive/XnQdX

Have you looked at our hats recently?

They've got skulls on them.

11

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence (huggingface.co)

submitted 1 month ago by allende2001@lemmygrad.ml to c/technology@hexbear.net

0 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11418648

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Technical Report 👁️

Introduction

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.
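As a quick sanity check on these figures, the share of parameters active per token follows directly from the stated counts. A minimal Python sketch (the dictionary layout and function name are illustrative; only the numbers come from the text above):

```python
# Parameter counts as stated in the introduction, in billions.
models = {
    "DeepSeek-V4-Pro": {"total_b": 1600, "active_b": 49},
    "DeepSeek-V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of all parameters that are activated for a single token."""
    return active_b / total_b


fractions = {name: active_fraction(**m) for name, m in models.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.2%} of parameters active per token")
```

Both models activate only a few percent of their weights per token, which is the usual trade made by Mixture-of-Experts designs: large total capacity, much smaller per-token compute.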

DeepSeek-V4 series incorporate several key upgrades in architecture and optimization:

Hybrid Attention Architecture: We design a hybrid attention mechanism combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to dramatically improve long-context efficiency. In the 1M-token context setting, DeepSeek-V4-Pro requires only 27% of single-token inference FLOPs and 10% of KV cache compared with DeepSeek-V3.2.

Manifold-Constrained Hyper-Connections (mHC): We incorporate mHC to strengthen conventional residual connections, enhancing stability of signal propagation across layers while preserving model expressivity.

Muon Optimizer: We employ the Muon optimizer for faster convergence and greater training stability.

We pre-train both models on more than 32T diverse and high-quality tokens, followed by a comprehensive post-training pipeline. The post-training features a two-stage paradigm: independent cultivation of domain-specific experts (through SFT and RL with GRPO), followed by unified model consolidation via on-policy distillation, integrating distinct proficiencies across diverse domains into a single model.

DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, significantly advances the knowledge capabilities of open-source models, firmly establishing itself as the best open-source model available today. It achieves top-tier performance in coding benchmarks and significantly bridges the gap with leading closed-source models on reasoning and agentic tasks. Meanwhile, DeepSeek-V4-Flash-Max achieves comparable reasoning performance to the Pro version when given a larger thinking budget, though its smaller parameter scale naturally places it slightly behind on pure knowledge tasks and the most complex agentic workflows.

Model Downloads

| Model | #Total Params | #Activated Params | Context Length | Precision | Download |
|---|---|---|---|---|---|
| DeepSeek-V4-Flash-Base | 284B | 13B | 1M | FP8 Mixed | HuggingFace \| ModelScope |
| DeepSeek-V4-Flash | 284B | 13B | 1M | FP4 + FP8 Mixed* | HuggingFace \| ModelScope |
| DeepSeek-V4-Pro-Base | 1.6T | 49B | 1M | FP8 Mixed | HuggingFace \| ModelScope |
| DeepSeek-V4-Pro | 1.6T | 49B | 1M | FP4 + FP8 Mixed* | HuggingFace \| ModelScope |

*FP4 + FP8 Mixed: MoE expert parameters use FP4 precision; most other parameters use FP8.
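The precision column gives a rough way to bracket the size of the weights. The release notes do not say what fraction of parameters sits in the MoE experts, so the Python sketch below only computes a lower bound (everything in FP4) and an upper bound (everything in FP8). It assumes 0.5 and 1 byte per parameter and ignores quantisation scale factors and file metadata.

```python
# Assumed storage cost per parameter; scale factors and metadata are ignored.
BYTES_PER_PARAM = {"FP4": 0.5, "FP8": 1.0}


def weight_bounds_gb(total_params: float) -> tuple[float, float]:
    """Return (all-FP4, all-FP8) weight size in decimal gigabytes."""
    low = total_params * BYTES_PER_PARAM["FP4"] / 1e9
    high = total_params * BYTES_PER_PARAM["FP8"] / 1e9
    return low, high


pro = weight_bounds_gb(1.6e12)   # DeepSeek-V4-Pro, 1.6T parameters
flash = weight_bounds_gb(284e9)  # DeepSeek-V4-Flash, 284B parameters

print(f"V4-Pro weights:   between {pro[0]:.0f} GB and {pro[1]:.0f} GB")
print(f"V4-Flash weights: between {flash[0]:.0f} GB and {flash[1]:.0f} GB")
```

Since expert parameters typically make up most of an MoE model, the real figure for the mixed FP4 + FP8 checkpoints is probably closer to the lower bound, but that is an inference rather than something stated in the release.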

Evaluation Results

Base Model

| Benchmark (Metric) | # Shots | DeepSeek-V3.2-Base | DeepSeek-V4-Flash-Base | DeepSeek-V4-Pro-Base |
|---|---|---|---|---|
| Architecture | - | MoE | MoE | MoE |
| # Activated Params | - | 37B | 13B | 49B |
| # Total Params | - | 671B | 284B | 1.6T |
| **World Knowledge** | | | | |
| AGIEval (EM) | 0-shot | 80.1 | 82.6 | 83.1 |
| MMLU (EM) | 5-shot | 87.8 | 88.7 | 90.1 |
| MMLU-Redux (EM) | 5-shot | 87.5 | 89.4 | 90.8 |
| MMLU-Pro (EM) | 5-shot | 65.5 | 68.3 | 73.5 |
| MMMLU (EM) | 5-shot | 87.9 | 88.8 | 90.3 |
| C-Eval (EM) | 5-shot | 90.4 | 92.1 | 93.1 |
| CMMLU (EM) | 5-shot | 88.9 | 90.4 | 90.8 |
| MultiLoKo (EM) | 5-shot | 38.7 | 42.2 | 51.1 |
| Simple-QA verified (EM) | 25-shot | 28.3 | 30.1 | 55.2 |
| SuperGPQA (EM) | 5-shot | 45.0 | 46.5 | 53.9 |
| FACTS Parametric (EM) | 25-shot | 27.1 | 33.9 | 62.6 |
| TriviaQA (EM) | 5-shot | 83.3 | 82.8 | 85.6 |
| **Language & Reasoning** | | | | |
| BBH (EM) | 3-shot | 87.6 | 86.9 | 87.5 |
| DROP (F1) | 1-shot | 88.2 | 88.6 | 88.7 |
| HellaSwag (EM) | 0-shot | 86.4 | 85.7 | 88.0 |
| WinoGrande (EM) | 0-shot | 78.9 | 79.5 | 81.5 |
| CLUEWSC (EM) | 5-shot | 83.5 | 82.2 | 85.2 |
| **Code & Math** | | | | |
| BigCodeBench (Pass@1) | 3-shot | 63.9 | 56.8 | 59.2 |
| HumanEval (Pass@1) | 0-shot | 62.8 | 69.5 | 76.8 |
| GSM8K (EM) | 8-shot | 91.1 | 90.8 | 92.6 |
| MATH (EM) | 4-shot | 60.5 | 57.4 | 64.5 |
| MGSM (EM) | 8-shot | 81.3 | 85.7 | 84.4 |
| CMath (EM) | 3-shot | 92.6 | 93.6 | 90.9 |
| **Long Context** | | | | |
| LongBench-V2 (EM) | 1-shot | 40.2 | 44.7 | 51.5 |

Instruct Model

DeepSeek-V4-Pro and DeepSeek-V4-Flash both support three reasoning effort modes:

| Reasoning Mode | Characteristics | Typical Use Cases | Response Format |
|---|---|---|---|
| Non-think | Fast, intuitive responses | Routine daily tasks, low-risk decisions | `</think>` summary |
| Think High | Conscious logical analysis, slower but more accurate | Complex problem-solving, planning | `<think>` thinking `</think>` summary |
| Think Max | Push reasoning to its fullest extent | Exploring the boundary of model reasoning capability | Special system prompt + `<think>` thinking `</think>` summary |
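The response formats above can be split into a reasoning part and a summary part with a few lines of Python. This is a minimal sketch based only on the formats listed in the table; it is not the parser shipped in the release's encoding folder, and the function name `split_response` is hypothetical.

```python
def split_response(text: str) -> tuple[str, str]:
    """Split raw completion text into (reasoning, summary).

    Handles both listed shapes:
      "</think> summary"                    (Non-think)
      "<think> thinking </think> summary"   (Think High / Think Max)
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        # No closing tag at all: treat the whole text as the summary.
        return "", text.strip()
    # Drop the opening tag if present; Non-think output has none.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


print(split_response("<think> 1+1 is 2 </think> The answer is 2."))
print(split_response("</think> Hello!"))
```

For real use, the `parse_message_from_completion_text` helper mentioned later in the README is the safer choice, since it presumably handles details of the actual format that this sketch does not.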

DeepSeek-V4-Pro-Max vs Frontier Models

| Benchmark (Metric) | Opus-4.6 Max | GPT-5.4 xHigh | Gemini-3.1-Pro High | K2.6 Thinking | GLM-5.1 Thinking | DS-V4-Pro Max |
|---|---|---|---|---|---|---|
| **Knowledge & Reasoning** | | | | | | |
| MMLU-Pro (EM) | 89.1 | 87.5 | 91.0 | 87.1 | 86.0 | 87.5 |
| SimpleQA-Verified (Pass@1) | 46.2 | 45.3 | 75.6 | 36.9 | 38.1 | 57.9 |
| Chinese-SimpleQA (Pass@1) | 76.4 | 76.8 | 85.9 | 75.9 | 75.0 | 84.4 |
| GPQA Diamond (Pass@1) | 91.3 | 93.0 | 94.3 | 90.5 | 86.2 | 90.1 |
| HLE (Pass@1) | 40.0 | 39.8 | 44.4 | 36.4 | 34.7 | 37.7 |
| LiveCodeBench (Pass@1) | 88.8 | - | 91.7 | 89.6 | - | 93.5 |
| Codeforces (Rating) | - | 3168 | 3052 | - | - | 3206 |
| HMMT 2026 Feb (Pass@1) | 96.2 | 97.7 | 94.7 | 92.7 | 89.4 | 95.2 |
| IMOAnswerBench (Pass@1) | 75.3 | 91.4 | 81.0 | 86.0 | 83.8 | 89.8 |
| Apex (Pass@1) | 34.5 | 54.1 | 60.9 | 24.0 | 11.5 | 38.3 |
| Apex Shortlist (Pass@1) | 85.9 | 78.1 | 89.1 | 75.5 | 72.4 | 90.2 |
| **Long Context** | | | | | | |
| MRCR 1M (MMR) | 92.9 | - | 76.3 | - | - | 83.5 |
| CorpusQA 1M (ACC) | 71.7 | - | 53.8 | - | - | 62.0 |
| **Agentic** | | | | | | |
| Terminal Bench 2.0 (Acc) | 65.4 | 75.1 | 68.5 | 66.7 | 63.5 | 67.9 |
| SWE Verified (Resolved) | 80.8 | - | 80.6 | 80.2 | - | 80.6 |
| SWE Pro (Resolved) | 57.3 | 57.7 | 54.2 | 58.6 | 58.4 | 55.4 |
| SWE Multilingual (Resolved) | 77.5 | - | - | 76.7 | 73.3 | 76.2 |
| BrowseComp (Pass@1) | 83.7 | 82.7 | 85.9 | 83.2 | 79.3 | 83.4 |
| HLE w/ tools (Pass@1) | 53.1 | 52.0 | 51.6 | 54.0 | 50.4 | 48.2 |
| GDPval-AA (Elo) | 1619 | 1674 | 1314 | 1482 | 1535 | 1554 |
| MCPAtlas Public (Pass@1) | 73.8 | 67.2 | 69.2 | 66.6 | 71.8 | 73.6 |
| Toolathlon (Pass@1) | 47.2 | 54.6 | 48.8 | 50.0 | 40.7 | 51.8 |

Comparison across Modes

| Benchmark (Metric) | V4-Flash Non-Think | V4-Flash High | V4-Flash Max | V4-Pro Non-Think | V4-Pro High | V4-Pro Max |
|---|---|---|---|---|---|---|
| **Knowledge & Reasoning** | | | | | | |
| MMLU-Pro (EM) | 83.0 | 86.4 | 86.2 | 82.9 | 87.1 | 87.5 |
| SimpleQA-Verified (Pass@1) | 23.1 | 28.9 | 34.1 | 45.0 | 46.2 | 57.9 |
| Chinese-SimpleQA (Pass@1) | 71.5 | 73.2 | 78.9 | 75.8 | 77.7 | 84.4 |
| GPQA Diamond (Pass@1) | 71.2 | 87.4 | 88.1 | 72.9 | 89.1 | 90.1 |
| HLE (Pass@1) | 8.1 | 29.4 | 34.8 | 7.7 | 34.5 | 37.7 |
| LiveCodeBench (Pass@1) | 55.2 | 88.4 | 91.6 | 56.8 | 89.8 | 93.5 |
| Codeforces (Rating) | - | 2816 | 3052 | - | 2919 | 3206 |
| HMMT 2026 Feb (Pass@1) | 40.8 | 91.9 | 94.8 | 31.7 | 94.0 | 95.2 |
| IMOAnswerBench (Pass@1) | 41.9 | 85.1 | 88.4 | 35.3 | 88.0 | 89.8 |
| Apex (Pass@1) | 1.0 | 19.1 | 33.0 | 0.4 | 27.4 | 38.3 |
| Apex Shortlist (Pass@1) | 9.3 | 72.1 | 85.7 | 9.2 | 85.5 | 90.2 |
| **Long Context** | | | | | | |
| MRCR 1M (MMR) | 37.5 | 76.9 | 78.7 | 44.7 | 83.3 | 83.5 |
| CorpusQA 1M (ACC) | 15.5 | 59.3 | 60.5 | 35.6 | 56.5 | 62.0 |
| **Agentic** | | | | | | |
| Terminal Bench 2.0 (Acc) | 49.1 | 56.6 | 56.9 | 59.1 | 63.3 | 67.9 |
| SWE Verified (Resolved) | 73.7 | 78.6 | 79.0 | 73.6 | 79.4 | 80.6 |
| SWE Pro (Resolved) | 49.1 | 52.3 | 52.6 | 52.1 | 54.4 | 55.4 |
| SWE Multilingual (Resolved) | 69.7 | 70.2 | 73.3 | 69.8 | 74.1 | 76.2 |
| BrowseComp (Pass@1) | - | 53.5 | 73.2 | - | 80.4 | 83.4 |
| HLE w/ tools (Pass@1) | - | 40.3 | 45.1 | - | 44.7 | 48.2 |
| MCPAtlas (Pass@1) | 64.0 | 67.4 | 69.0 | 69.4 | 74.2 | 73.6 |
| GDPval-AA (Elo) | - | - | 1395 | - | - | 1554 |
| Toolathlon (Pass@1) | 40.7 | 43.5 | 47.8 | 46.3 | 49.0 | 51.8 |

Chat Template

This release does not include a Jinja-format chat template. Instead, we provide a dedicated encoding folder with Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model, and how to parse the model's text output. Please refer to the encoding folder for full documentation.

A brief example:

import transformers
from encoding_dsv4 import encode_messages, parse_message_from_completion_text

messages = [
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": "Hello! I am DeepSeek.", "reasoning_content": "thinking..."},
    {"role": "user", "content": "1+1=?"}
]

# messages -> string
prompt = encode_messages(messages, thinking_mode="thinking")

# string -> tokens
tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V4-Pro")
tokens = tokenizer.encode(prompt)

How to Run Locally

Please refer to the inference folder for detailed instructions on running DeepSeek-V4 locally, including model weight conversion and interactive chat demos.

For local deployment, we recommend setting the sampling parameters to temperature = 1.0, top_p = 1.0. For the Think Max reasoning mode, we recommend setting the context window to at least 384K tokens.
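To make the recommended sampling settings concrete, here is a small self-contained Python sketch of temperature scaling followed by top-p (nucleus) filtering over a toy set of logits. The logits and function names are made up for illustration, and this is not DeepSeek's inference code. It shows that temperature = 1.0 with top_p = 1.0 leaves the model's softmax distribution unchanged, while a lower top_p cuts off the tail.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=1.0):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / mass
    return out


logits = [2.0, 1.0, 0.5, -1.0]  # toy values, not real model output
base = softmax(logits)
recommended = top_p_filter(softmax(logits, temperature=1.0), top_p=1.0)
truncated = top_p_filter(base, top_p=0.9)

print("softmax:           ", [round(p, 3) for p in base])
print("T=1.0, top_p=1.0:  ", [round(p, 3) for p in recommended])
print("T=1.0, top_p=0.9:  ", [round(p, 3) for p in truncated])
```

With the recommended values the second line matches the first, so sampling draws from the unmodified distribution. With top_p = 0.9 the least likely token is dropped and the remaining three are renormalised.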

License

This repository and the model weights are licensed under the MIT License.

Citation

@misc{deepseekai2026deepseekv4,
      title={DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence},
      author={DeepSeek-AI},
      year={2026},
}

Contact

If you have any questions, please raise an issue or contact us at service@deepseek.com.


17

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence (huggingface.co)

submitted 1 month ago* (last edited 1 month ago) by allende2001@lemmygrad.ml to c/technology@lemmygrad.ml

1 comments fedilink

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Technical Report 👁️

Introduction

We present a preview version of the DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting a context length of one million tokens.

The DeepSeek-V4 series incorporates several key upgrades in architecture and optimization:

Hybrid Attention Architecture: We design a hybrid attention mechanism combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to dramatically improve long-context efficiency. In the 1M-token context setting, DeepSeek-V4-Pro requires only 27% of the single-token inference FLOPs and 10% of the KV cache of DeepSeek-V3.2.

Manifold-Constrained Hyper-Connections (mHC): We incorporate mHC to strengthen conventional residual connections, enhancing stability of signal propagation across layers while preserving model expressivity.

Muon Optimizer: We employ the Muon optimizer for faster convergence and greater training stability.
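
The efficiency figures quoted for the hybrid attention design are easier to compare once they are turned into reduction factors. The snippet below is only illustrative arithmetic on the two percentages stated above; the helper name `reduction_factor` is made up for this example and is not part of the release.

```python
def reduction_factor(remaining_fraction: float) -> float:
    """Return how many times smaller a cost is, given the fraction that remains."""
    if not 0.0 < remaining_fraction <= 1.0:
        raise ValueError("remaining_fraction must be in (0, 1]")
    return 1.0 / remaining_fraction


# Fractions quoted in the report for DeepSeek-V4-Pro relative to
# DeepSeek-V3.2 in the 1M-token context setting.
flops_factor = reduction_factor(0.27)  # single-token inference FLOPs
kv_factor = reduction_factor(0.10)     # KV cache size

print(f"FLOPs per token: about {flops_factor:.1f}x lower")
print(f"KV cache: about {kv_factor:.1f}x smaller")
```

In other words, the stated numbers correspond to roughly a 3.7x cut in per-token compute and a 10x cut in KV cache at that context length.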

We pre-train both models on more than 32T diverse and high-quality tokens, followed by a comprehensive post-training pipeline. The post-training features a two-stage paradigm: independent cultivation of domain-specific experts (through SFT and RL with GRPO), followed by unified model consolidation via on-policy distillation, integrating distinct proficiencies across diverse domains into a single model.
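
For readers unfamiliar with GRPO, its central idea is that each sampled response is scored relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below shows that group-relative normalisation in its commonly published form. The report does not spell out the exact variant used for DeepSeek-V4, so the function name, the use of the population standard deviation, and the `eps` term are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    `eps` is an assumed safeguard against a zero standard deviation.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical group of four responses: two judged correct, two incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that beat the group average get a positive advantage and the rest a negative one; if every response in the group earns the same reward, all advantages are zero and the group contributes no learning signal.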

DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, significantly advances the knowledge capabilities of open-source models, firmly establishing itself as the best open-source model available today. It achieves top-tier performance in coding benchmarks and significantly bridges the gap with leading closed-source models on reasoning and agentic tasks. Meanwhile, DeepSeek-V4-Flash-Max achieves comparable reasoning performance to the Pro version when given a larger thinking budget, though its smaller parameter scale naturally places it slightly behind on pure knowledge tasks and the most complex agentic workflows.

Model Downloads

Model	#Total Params	#Activated Params	Context Length	Precision	Download
DeepSeek-V4-Flash-Base	284B	13B	1M	FP8 Mixed	HuggingFace | ModelScope
DeepSeek-V4-Flash	284B	13B	1M	FP4 + FP8 Mixed*	HuggingFace | ModelScope
DeepSeek-V4-Pro-Base	1.6T	49B	1M	FP8 Mixed	HuggingFace | ModelScope
DeepSeek-V4-Pro	1.6T	49B	1M	FP4 + FP8 Mixed*	HuggingFace | ModelScope

*FP4 + FP8 Mixed: MoE expert parameters use FP4 precision; most other parameters use FP8.

Evaluation Results

Base Model

Benchmark (Metric)	# Shots	DeepSeek-V3.2-Base	DeepSeek-V4-Flash-Base	DeepSeek-V4-Pro-Base
Architecture	-	MoE	MoE	MoE
# Activated Params	-	37B	13B	49B
# Total Params	-	671B	284B	1.6T
World Knowledge
AGIEval (EM)	0-shot	80.1	82.6	83.1
MMLU (EM)	5-shot	87.8	88.7	90.1
MMLU-Redux (EM)	5-shot	87.5	89.4	90.8
MMLU-Pro (EM)	5-shot	65.5	68.3	73.5
MMMLU (EM)	5-shot	87.9	88.8	90.3
C-Eval (EM)	5-shot	90.4	92.1	93.1
CMMLU (EM)	5-shot	88.9	90.4	90.8
MultiLoKo (EM)	5-shot	38.7	42.2	51.1
Simple-QA verified (EM)	25-shot	28.3	30.1	55.2
SuperGPQA (EM)	5-shot	45.0	46.5	53.9
FACTS Parametric (EM)	25-shot	27.1	33.9	62.6
TriviaQA (EM)	5-shot	83.3	82.8	85.6
Language & Reasoning
BBH (EM)	3-shot	87.6	86.9	87.5
DROP (F1)	1-shot	88.2	88.6	88.7
HellaSwag (EM)	0-shot	86.4	85.7	88.0
WinoGrande (EM)	0-shot	78.9	79.5	81.5
CLUEWSC (EM)	5-shot	83.5	82.2	85.2
Code & Math
BigCodeBench (Pass@1)	3-shot	63.9	56.8	59.2
HumanEval (Pass@1)	0-shot	62.8	69.5	76.8
GSM8K (EM)	8-shot	91.1	90.8	92.6
MATH (EM)	4-shot	60.5	57.4	64.5
MGSM (EM)	8-shot	81.3	85.7	84.4
CMath (EM)	3-shot	92.6	93.6	90.9
Long Context
LongBench-V2 (EM)	1-shot	40.2	44.7	51.5

Instruct Model

DeepSeek-V4-Pro and DeepSeek-V4-Flash both support three reasoning effort modes:

Reasoning Mode	Characteristics	Typical Use Cases	Response Format
Non-think	Fast, intuitive responses	Routine daily tasks, low-risk decisions	`</think>` summary
Think High	Conscious logical analysis, slower but more accurate	Complex problem-solving, planning	`<think>` thinking `</think>` summary
Think Max	Push reasoning to its fullest extent	Exploring the boundary of model reasoning capability	Special system prompt + `<think>` thinking `</think>` summary
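
The response formats listed for the three modes all end the reasoning portion with a closing `</think>` tag, so a client can separate the two parts by splitting on that tag. The function below is a simplified sketch of that idea, assuming the completion text looks exactly like the formats in the table. `split_reasoning` is a hypothetical helper written for this example; it is not the parser shipped in the release's encoding folder, which remains the authoritative reference.

```python
def split_reasoning(completion: str):
    """Split raw completion text into (thinking, summary).

    Assumes the formats from the table: an optional "<think>" prefix,
    the reasoning text, a closing "</think>", then the summary.
    """
    head, sep, tail = completion.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the summary.
        return "", completion.strip()
    thinking = head.strip()
    if thinking.startswith("<think>"):
        thinking = thinking[len("<think>"):].strip()
    return thinking, tail.strip()


# Non-think style: the reasoning section is empty.
print(split_reasoning("</think>Hello!"))
# Think style: reasoning first, then the summary.
print(split_reasoning("<think>1+1 is 2</think>The answer is 2."))
```

Returning an empty string for the reasoning in the Non-think case lets calling code treat all three modes uniformly.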

DeepSeek-V4-Pro-Max vs Frontier Models

Benchmark (Metric)	Opus-4.6 Max	GPT-5.4 xHigh	Gemini-3.1-Pro High	K2.6 Thinking	GLM-5.1 Thinking	DS-V4-Pro Max
Knowledge & Reasoning
MMLU-Pro (EM)	89.1	87.5	91.0	87.1	86.0	87.5
SimpleQA-Verified (Pass@1)	46.2	45.3	75.6	36.9	38.1	57.9
Chinese-SimpleQA (Pass@1)	76.4	76.8	85.9	75.9	75.0	84.4
GPQA Diamond (Pass@1)	91.3	93.0	94.3	90.5	86.2	90.1
HLE (Pass@1)	40.0	39.8	44.4	36.4	34.7	37.7
LiveCodeBench (Pass@1)	88.8	-	91.7	89.6	-	93.5
Codeforces (Rating)	-	3168	3052	-	-	3206
HMMT 2026 Feb (Pass@1)	96.2	97.7	94.7	92.7	89.4	95.2
IMOAnswerBench (Pass@1)	75.3	91.4	81.0	86.0	83.8	89.8
Apex (Pass@1)	34.5	54.1	60.9	24.0	11.5	38.3
Apex Shortlist (Pass@1)	85.9	78.1	89.1	75.5	72.4	90.2
Long Context
MRCR 1M (MMR)	92.9	-	76.3	-	-	83.5
CorpusQA 1M (ACC)	71.7	-	53.8	-	-	62.0
Agentic
Terminal Bench 2.0 (Acc)	65.4	75.1	68.5	66.7	63.5	67.9
SWE Verified (Resolved)	80.8	-	80.6	80.2	-	80.6
SWE Pro (Resolved)	57.3	57.7	54.2	58.6	58.4	55.4
SWE Multilingual (Resolved)	77.5	-	-	76.7	73.3	76.2
BrowseComp (Pass@1)	83.7	82.7	85.9	83.2	79.3	83.4
HLE w/ tools (Pass@1)	53.1	52.0	51.6	54.0	50.4	48.2
GDPval-AA (Elo)	1619	1674	1314	1482	1535	1554
MCPAtlas Public (Pass@1)	73.8	67.2	69.2	66.6	71.8	73.6
Toolathlon (Pass@1)	47.2	54.6	48.8	50.0	40.7	51.8

Comparison across Modes

Benchmark (Metric)	V4-Flash Non-Think	V4-Flash High	V4-Flash Max	V4-Pro Non-Think	V4-Pro High	V4-Pro Max
Knowledge & Reasoning
MMLU-Pro (EM)	83.0	86.4	86.2	82.9	87.1	87.5
SimpleQA-Verified (Pass@1)	23.1	28.9	34.1	45.0	46.2	57.9
Chinese-SimpleQA (Pass@1)	71.5	73.2	78.9	75.8	77.7	84.4
GPQA Diamond (Pass@1)	71.2	87.4	88.1	72.9	89.1	90.1
HLE (Pass@1)	8.1	29.4	34.8	7.7	34.5	37.7
LiveCodeBench (Pass@1)	55.2	88.4	91.6	56.8	89.8	93.5
Codeforces (Rating)	-	2816	3052	-	2919	3206
HMMT 2026 Feb (Pass@1)	40.8	91.9	94.8	31.7	94.0	95.2
IMOAnswerBench (Pass@1)	41.9	85.1	88.4	35.3	88.0	89.8
Apex (Pass@1)	1.0	19.1	33.0	0.4	27.4	38.3
Apex Shortlist (Pass@1)	9.3	72.1	85.7	9.2	85.5	90.2
Long Context
MRCR 1M (MMR)	37.5	76.9	78.7	44.7	83.3	83.5
CorpusQA 1M (ACC)	15.5	59.3	60.5	35.6	56.5	62.0
Agentic
Terminal Bench 2.0 (Acc)	49.1	56.6	56.9	59.1	63.3	67.9
SWE Verified (Resolved)	73.7	78.6	79.0	73.6	79.4	80.6
SWE Pro (Resolved)	49.1	52.3	52.6	52.1	54.4	55.4
SWE Multilingual (Resolved)	69.7	70.2	73.3	69.8	74.1	76.2
BrowseComp (Pass@1)	-	53.5	73.2	-	80.4	83.4
HLE w/ tools (Pass@1)	-	40.3	45.1	-	44.7	48.2
MCPAtlas (Pass@1)	64.0	67.4	69.0	69.4	74.2	73.6
GDPval-AA (Elo)	-	-	1395	-	-	1554
Toolathlon (Pass@1)	40.7	43.5	47.8	46.3	49.0	51.8

Chat Template

This release does not include a Jinja-format chat template. Instead, we provide a dedicated encoding folder with Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model, and how to parse the model's text output. Please refer to the encoding folder for full documentation.

A brief example:
from encoding_dsv4 import encode_messages, parse_message_from_completion_text

messages = [
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": "Hello! I am DeepSeek.", "reasoning_content": "thinking..."},
    {"role": "user", "content": "1+1=?"}
]

# messages -> string
prompt = encode_messages(messages, thinking_mode="thinking")

# string -> tokens
import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V4-Pro")
tokens = tokenizer.encode(prompt)

How to Run Locally

Please refer to the inference folder for detailed instructions on running DeepSeek-V4 locally, including model weight conversion and interactive chat demos.

For local deployment, we recommend setting the sampling parameters to temperature = 1.0, top_p = 1.0. For the Think Max reasoning mode, we recommend setting the context window to at least 384K tokens.
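
To make the recommended sampling parameters concrete, the sketch below implements temperature scaling and top-p (nucleus) filtering over a toy list of logits. It is a generic illustration of what those two knobs do, not code from the release; the function names are invented for this example, and real inference engines apply the same idea to full vocabulary tensors.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their cumulative mass reaches top_p,
    zero out the rest, and renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Draw one token index using the given sampling parameters."""
    probs = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# With temperature = 1.0 and top_p = 1.0 nothing is rescaled or truncated.
toy_logits = [2.0, 1.0, 0.0]
print(sample_token(toy_logits, temperature=1.0, top_p=1.0))
```

With both parameters at 1.0, as recommended above, the model's output distribution is sampled as-is: no sharpening, no flattening, and no tokens removed from the tail.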

License

This repository and the model weights are licensed under the MIT License.

Citation

@misc{deepseekai2026deepseekv4,
      title={DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence},
      author={DeepSeek-AI},
      year={2026},
}

Contact

If you have any questions, please raise an issue or contact us at service@deepseek.com.

12

International team led by Chinese scientists unveils largest-ever cosmological simulation to date, serving as a powerful digital tool to explore cosmic evolution (www.globaltimes.cn)

submitted 1 month ago by allende2001@lemmygrad.ml to c/sino@hexbear.net

0 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11417457

Photo: Deng Xiaoci/ GT

An international research team led by Chinese scientists unveiled the first batch of findings from the largest cosmological simulation to date, codenamed "HyperMillennium," on Thursday. The development has been applauded by leading international scientists as a breakthrough ushering in a new era in the study of the universe, Global Times reporters learned from a press conference held by the National Astronomical Observatories of the Chinese Academy of Sciences (NAOC) on Thursday.

Wang Qiao, a fellow researcher with the NAOC, presented the simulation at the press conference, explaining that after the Big Bang, the universe evolved from an extremely homogeneous state and gradually developed into a web-like structure. In the "HyperMillennium" simulation, the research team used 4.2 trillion virtual particles to simulate the formation and evolution of the entire cosmic structure across the 13.8-billion-year timescale of the universe.

Photo: courtesy of the National Astronomical Observatories of the Chinese Academy of Sciences (NAOC)

According to a press release provided by the NAOC, the simulation covers a cube with a side length of 12 billion light-years and uses 4.2 trillion virtual dark matter particles. By applying a technique called N-body numerical simulation, the team accurately recreated how large-scale structures in the universe evolved over 10 billion years. In simple terms, they built a virtual universe inside a supercomputer, starting from just after the Big Bang and following the force of gravity step by step, according to the release.
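
To give a feel for what "following the force of gravity step by step" means, here is a toy N-body integrator. It is only a sketch under strong simplifying assumptions: direct pairwise summation, arbitrary units with G = 1, a small softening length, and no cosmic expansion. It is not the photoNs code, and production cosmological simulations rely on far more scalable force solvers than this.

```python
import math

G = 1.0           # gravitational constant in arbitrary units (assumption)
SOFTENING = 1e-3  # assumed softening length; avoids division by zero


def accelerations(positions, masses):
    """Gravitational acceleration on every particle by direct pairwise sums."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + SOFTENING ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] * inv_r3
                acc[j][k] -= G * masses[i] * d[k] * inv_r3
    return acc


def leapfrog_step(positions, velocities, masses, dt):
    """Advance the system in place by one kick-drift-kick leapfrog step."""
    acc = accelerations(positions, masses)
    for v, a in zip(velocities, acc):
        for k in range(3):
            v[k] += 0.5 * dt * a[k]   # half kick
    for p, v in zip(positions, velocities):
        for k in range(3):
            p[k] += dt * v[k]         # drift
    acc = accelerations(positions, masses)
    for v, a in zip(velocities, acc):
        for k in range(3):
            v[k] += 0.5 * dt * a[k]   # second half kick


# Two equal masses on a circular orbit around their common centre.
masses = [1.0, 1.0]
positions = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
speed = math.sqrt(0.5)  # circular-orbit speed for this configuration
velocities = [[0.0, -speed, 0.0], [0.0, speed, 0.0]]

for _ in range(1000):
    leapfrog_step(positions, velocities, masses, dt=0.001)

separation = math.dist(positions[0], positions[1])
print(f"separation after 1000 steps: {separation:.4f}")
```

The pairwise loop costs on the order of N squared operations per step, which is fine for two particles but hopeless for trillions; that gap is why simulations at this scale need specialised algorithms and hardware.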

This provides theoretical support for research into dark matter and dark energy, and also offers strong support for new-generation galaxy survey programs, such as China Space Station Telescope (CSST) and the European Space Agency's Euclid mission, according to the NAOC.

Domestically developed Chinese supercomputers and the team's self-developed software, called photoNs, played an important role in running this large-scale simulation. After more than 10 years of work on algorithms and optimization, the team achieved efficient calculations using more than 10,000 accelerator cards. The project consumed more than 100 million CPU core-hours and 10 million accelerator-card hours, and produced approximately 13 petabytes of raw and processed data, per the NAOC.

"We are entering an era where surveys of enormous cosmological volumes have the potential to revolutionize our understanding of dark energy, cosmological inflation, and the properties of neutrinos," said Mike Boylan-Kolchin of the University of Texas at Austin in the US. The professor hailed the simulation as a "computational marvel."

"For this to happen, we need advanced theoretical tools, and the HyperMillennium Simulation is a computational marvel that will help unlock fundamental physics from observations of the cosmos. It has an unprecedented range of volume and mass resolution, enabling detailed predictions about how huge numbers of relatively common galaxies are distributed across the cosmic web and the properties of inherently rare and interesting objects that are inaccessible with smaller volumes. The HyperMillennium Simulation will be a touchstone for the galaxy formation and cosmology communities for years to come," the professor said.

"The HyperMillennium simulation redefines what is nowadays possible in numerical cosmology. I am extremely impressed that the team could realize this incredibly large and highly accurate simulation. Its enormous statistical power allows us to carry out new precision test of the LambdaCDM cosmological model, something that is very important for the field," said Volker Springel, the director of the Max Planck Institute for Astrophysics in Germany.

The first research paper from the project was recently published in the journal Monthly Notices of the Royal Astronomical Society. According to the NAOC, the first batch of simulation data has already been released to the global scientific community through the National Astronomical Data Center, a platform for astronomy research, education and data-driven applications.

25

International team led by Chinese scientists unveils largest-ever cosmological simulation to date, serving as a powerful digital tool to explore cosmic evolution (www.globaltimes.cn)

submitted 1 month ago by allende2001@lemmygrad.ml to c/china@lemmygrad.ml

1 comments fedilink

14

Chinese scientists unveil green extraction technology for recovering critical metals in new energy industries – Global Times [2026-04-20] (www.globaltimes.cn)

submitted 1 month ago by allende2001@lemmygrad.ml to c/sino@hexbear.net

0 comments fedilink

cross-posted from: https://lemmygrad.ml/post/11376760

Chinese scientists unveil green extraction technology for recovering critical metals in new energy industries

Gold is recovered from the ore after a process of crushing and screening. Photo: IC

Chinese scientists have developed a universal, green, high-efficiency membrane separation method to selectively extract a range of heavy metal resources critical to new energy technologies. This offers a potential solution to long-standing challenges in traditional extraction techniques, such as high pollution, low efficiency, and high energy consumption, while also supporting critical metal recovery and recycling, the Chinese Academy of Sciences (CAS) announced on Monday.

The accelerated advancement of China’s dual carbon goals has fueled rapid growth in clean energy technologies such as wind power, photovoltaics, electric vehicles, and nuclear energy. This growth has driven up demand for specific heavy metal elements, some of which face heavy import dependence and potential supply shortages, according to an article released by the CAS on its official WeChat account.

A joint research team made up of scientists from the State Key Laboratory of Photoelectric Conversion and Utilization of Solar Energy at Qingdao Institute of Bioenergy and Bioprocess Technology (QIBEBT), CAS, and the Technical Institute of Physics and Chemistry, CAS, has developed a method for heavy metal extraction inspired by biological calcium ion channels. They published the research results in the international academic journal Nature Nanotechnology, according to a statement released by the QIBEBT.

According to reports, solvent extraction and adsorption methods are predominantly used to extract heavy metal ions by binding them selectively. However, these methods require excessive chemical use and cause environmental problems.

While membranes offer a cleaner, chemical-free alternative, they have historically failed at this specific task because heavy metal ions are often so similar in size and charge that standard filtration cannot tell them apart. To solve this, the scientists looked to a master of microscopic sorting: the biological cell.

The scientists found that in nature, biological voltage-gated calcium channels act like an exclusive VIP club. They feature a narrow, one-dimensional hallway lined with highly specific binding sites. When the “VIP” calcium ions enter in a single file, they effectively block the door for all other competing ions – a phenomenon scientists call an “anomalous mole fraction effect,” Gao Jun, corresponding author of the study and researcher at QIBEBT, told the Global Times on Monday.

Inspired by this natural design, the research team engineered a new separation mechanism. They created microscopic channels just wide enough, at about 1.4 nanometers, to force target heavy metal ions to line up in a single file, Gao said.

When the researchers coated the insides of these artificial channels with a specific chemical designed to attract uranium, the system successfully mimicked the biological anomalous mole fraction effect. Once uranium entered the channel, it blocked out competing elements like vanadium. And because the trapped uranium ions repelled one another, they shuttled through the barrier smoothly and rapidly, bypassing the usual gridlock.

In a continuous test using natural seawater over 22 days, the process efficiently pulled out uranium while rejecting a sea of other background metals.

This technology can be extended to the extraction of various metals such as copper and gold by modifying functional groups, the small surface chemical units that modify material properties without changing the underlying structure. It is expected to offer a greener, more efficient approach to critical metal recovery and strengthen domestic mineral supply chain resilience, China Central Television reported.

Ultimately, this microscopic sorting mechanism could one day lead to a much greener, more sustainable global mining and recycling industry, Gao said.