Headlines Announcements

22.08.2025

16:42 What came before the Big Bang? Supercomputers may hold the answer

Scientists are rethinking the universe’s deepest mysteries using numerical relativity, complex computer simulations of Einstein’s equations in extreme conditions. This method could help explore what happened before the Big Bang, test theories of cosmic inflation, investigate multiverse collisions, and even model cyclic universes that endlessly bounce through creation and destruction.

20.08.2025

08:45 LAMMPS-KOKKOS: Performance Portable Molecular Dynamics Across Exascale Architectures

arXiv:2508.13523v1 Announce Type: cross Abstract: Since its inception in 1995, LAMMPS has grown to be a world-class molecular dynamics code, with thousands of users, over one million lines of code, and multi-scale simulation capabilities. We discuss how LAMMPS has adapted to the modern heterogeneous computing landscape by integrating the Kokkos performance portability library into the existing C++ code. We investigate performance portability of simple pairwise, many-body reactive, and machine-learned force-field interatomic potentials. We present results on GPUs across different vendors and generations, and analyze performance trends, probing FLOPS throughput, memory bandwidths, cache capabilities, and thread-atomic operation performance. Finally, we demonstrate strong scaling on all current US exascale machines -- OLCF Frontier, and ALCF Aurora, and NNSA El Capitan -- for the three potentials.

19.08.2025

08:59 Exascale simulations underpin quake-resistant infrastructure designs

Simulations still can't predict exactly when an earthquake will happen, but with the processing power of modern exascale supercomputers, they can now predict how an earthquake will unfold and how much damage it will likely cause.

12.08.2025

20:32 Scientists explore real-time tsunami warning system on world's fastest supercomputer

Scientists at Lawrence Livermore National Laboratory (LLNL) have helped develop an advanced, real-time tsunami forecasting system—powered by El Capitan, the world's fastest supercomputer—that could dramatically improve early warning capabilities for coastal communities near earthquake zones.

08:59 Exploiting repeated matrix block structures for more efficient CFD on modern supercomputers

arXiv:2508.06710v1 Announce Type: new Abstract: Computational Fluid Dynamics (CFD) simulations are often constrained by the memory-bound nature of sparse matrix-vector operations, which eventually limits performance on modern high-performance computing (HPC) systems. This work introduces a novel approach to increase arithmetic intensity in CFD by leveraging repeated matrix block structures. The method transforms the conventional sparse matrix-vector product (SpMV) into a sparse matrix-matrix product (SpMM), enabling simultaneous processing of multiple right-hand sides. This shifts the computation towards a more compute-bound regime by reusing matrix coefficients. Additionally, an inline mesh-refinement strategy is proposed: simulations initially run on a coarse mesh to establish a statistically steady flow, then refine to the target mesh. This reduces the wall-clock time to reach transition, leading to faster convergence with equivalent computational cost. The methodology is evaluated
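The SpMV-to-SpMM idea in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the CSR layout and the function names `spmv` and `spmm` are assumptions chosen for illustration. The point it shows is that in the SpMM form each stored matrix coefficient is loaded once and reused for every right-hand side, which is what raises arithmetic intensity.

```python
def spmv(indptr, indices, data, x):
    """CSR sparse matrix times one vector: each coefficient is used once."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for p in range(indptr[i], indptr[i + 1]):
            y[i] += data[p] * x[indices[p]]
    return y


def spmm(indptr, indices, data, X):
    """CSR sparse matrix times k vectors stored as the columns of X.

    X is a list of rows, each row holding k values (one per right-hand side).
    Each coefficient data[p] is read once and reused k times.
    """
    n_rows = len(indptr) - 1
    k = len(X[0])
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]            # loaded once ...
            x_row = X[indices[p]]
            for j in range(k):     # ... reused for all k right-hand sides
                Y[i][j] += a * x_row[j]
    return Y


# Hypothetical 3x3 example matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR form.
INDPTR = [0, 2, 3, 5]
INDICES = [0, 2, 1, 0, 2]
DATA = [2.0, 1.0, 3.0, 4.0, 5.0]
```

Calling `spmm(INDPTR, INDICES, DATA, X)` with two right-hand sides packed as the columns of `X` gives the same numbers as two separate `spmv` calls, while traversing the matrix only once.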

11.08.2025

12:46 Tesla Kills Dojo AI Supercomputer as Musk Shifts Strategy

Tesla has pulled the plug on Dojo, scrapping the AI supercomputer project that Elon Musk once called essential

08.08.2025

02:52 Tesla exec leading development of chip tech and Dojo supercomputer is leaving company

Pete Bannon, who joined Tesla from Apple in 2016, is leaving the electric vehicle maker.

06.08.2025

23:57 NASA supercomputers take on life near Greenland's most active glacier

As Greenland's ice retreats, it's fueling tiny ocean organisms. To test why, scientists turned to a computer model from JPL and MIT that's been called a laboratory in itself.

29.07.2025

11:50 Exascale Implicit Kinetic Plasma Simulations on El Capitan for Solving the Micro-Macro Coupling in Magnetospheric Physics

arXiv:2507.20719v1 Announce Type: new Abstract: Our fully kinetic, implicit Particle-in-Cell (PIC) simulations of global magnetospheres on up to 32,768 of El Capitan's AMD Instinct MI300A Accelerated Processing Units (APUs) represent an unprecedented computational capability that addresses a fundamental challenge in space physics: resolving the multi-scale coupling between microscopic (electron-scale) and macroscopic (global-scale) dynamics in planetary magnetospheres. The implicit scheme of iPIC3D supports time steps and grid spacing that are up to 10 times larger than those of explicit methods, without sacrificing physical accuracy. This enables the simulation of magnetospheres while preserving fine-scale electron physics, which is critical for key processes such as magnetic reconnection and plasma turbulence. Our algorithmic and technological innovations include GPU-optimized kernels, particle control, and physics-aware data compression using Gaussian Mixture Models. With

24.07.2025

20:10 Supercomputer simulation clarifies how turbulent boundary layers evolve at moderate Reynolds numbers

Scientists at the University of Stuttgart's Institute of Aerodynamics and Gas Dynamics (IAG) have produced a novel dataset that will improve the development of turbulence models. With the help of the Hawk supercomputer at the High-Performance Computing Center Stuttgart (HLRS), investigators in the laboratory of Dr. Christoph Wenzel conducted a large-scale direct numerical simulation of a spatially evolving turbulent boundary layer.

23.07.2025

14:01 Pixel-Resolved Long-Context Learning for Turbulence at Exascale: Resolving Small-scale Eddies Toward the Viscous Limit

arXiv:2507.16697v1 Announce Type: new Abstract: Turbulence plays a crucial role in multiphysics applications, including aerodynamics, fusion, and combustion. Accurately capturing turbulence's multiscale characteristics is essential for reliable predictions of multiphysics interactions, but remains a grand challenge even for exascale supercomputers and advanced deep learning models. The extreme-resolution data required to represent turbulence, ranging from billions to trillions of grid points, pose prohibitive computational costs for models based on architectures like vision transformers. To address this challenge, we introduce a multiscale hierarchical Turbulence Transformer that reduces sequence length from billions to a few millions and a novel RingX sequence parallelism approach that enables scalable long-context learning. We perform scaling and science runs on the Frontier supercomputer. Our approach demonstrates excellent performance up to 1.1 EFLOPS on 32,768 AMD GPUs, with a

17.07.2025

20:38 UK's most powerful supercomputer comes online

The Isambard-AI supercomputer is made fully operational as the government unveils fresh AI plans.

16.07.2025

12:12 Scaling the memory wall using mixed-precision -- HPG-MxP on an exascale machine

arXiv:2507.11512v1 Announce Type: cross Abstract: Mixed-precision algorithms have been proposed as a way for scientific computing to benefit from some of the gains seen for artificial intelligence (AI) on recent high performance computing (HPC) platforms. A few applications dominated by dense matrix operations have seen substantial speedups by utilizing low precision formats such as FP16. However, a majority of scientific simulation applications are memory bandwidth limited. Beyond preliminary studies, the practical gain from using mixed-precision algorithms on a given HPC system is largely unclear. The High Performance GMRES Mixed Precision (HPG-MxP) benchmark has been proposed to measure the useful performance of a HPC system on sparse matrix-based mixed-precision applications. In this work, we present a highly optimized implementation of the HPG-MxP benchmark for an exascale system and describe our algorithm enhancements. We show for the first time a speedup of 1.6x using a

15.07.2025

07:55 ORCHA -- A Performance Portability System for Post-Exascale Systems

arXiv:2507.09337v1 Announce Type: new Abstract: Heterogeneity is the prevalent trend in the rapidly evolving high-performance computing (HPC) landscape in both hardware and application software. The diversity in hardware platforms, currently comprising various accelerators and a future possibility of specializable chiplets, poses a significant challenge for scientific software developers aiming to harness optimal performance across different computing platforms while maintaining the quality of solutions when their applications are simultaneously growing more complex. Code synthesis and code generation can provide mechanisms to mitigate this challenge. We have developed a toolchain, ORCHA, which arises from the needs of a large multiphysics simulation software, Flash-X, which were not met by any of the existing solutions. ORCHA is composed of three stand-alone tools -- one to express high-level control flow and a map of what to execute where on the platform, a second one to express

14.07.2025

21:59 100 undiscovered galaxies may be orbiting the Milky Way, supercomputer simulations hint

Our Milky Way could have many more satellite galaxies than we've detected so far. They're just too faint to be seen.

19:54 New computing approach combines quantum and supercomputers to predict molecule stability

Kenneth Merz, Ph.D., of Cleveland Clinic's Center for Computational Life Sciences and a team are exploring how quantum computers can work with supercomputers to better simulate molecule behavior.

04.07.2025

16:54 Quantum-enhanced supercomputers are starting to do chemistry

Working in tandem, a quantum computer and a supercomputer modelled the behaviour of several molecules, paving the way for useful applications in chemistry and pharmaceutical research

00:26 Musk's xAI scores permit for gas-burning turbines to power Grok supercomputer in Memphis

Elon Musk's artificial intelligence startup attained an official permit to power its Memphis supercomputer facility using natural gas-burning turbines.

03.07.2025

18:57 Machine learning outpaces supercomputers for simulating galaxy evolution coupled with supernova explosion

Researchers have used machine learning to dramatically speed up the processing time when simulating galaxy evolution coupled with supernova explosion. This approach could help us understand the origins of our own galaxy, particularly the elements essential for life in the Milky Way.

27.06.2025

14:12 'Quantum AI' algorithms already outpace the fastest supercomputers, study says

Researchers have successfully demonstrated quantum speedup in kernel-based machine learning.

25.06.2025

14:10 'A first in applied physics': Breakthrough quantum computer could consume 2,000 times less power than a supercomputer and solve problems 200 times faster

Scientists have built a compact physical qubit with built-in error correction, and now say it could be scaled into a 1,000-qubit machine that is small enough to fit inside a data center. They plan to release this machine in 2031.

18.06.2025

19:02 Supercomputer simulations show how to speed up chemical reaction rates at air-water interface

Using the now-decommissioned Summit supercomputer, researchers at the Department of Energy's Oak Ridge National Laboratory ran the largest and most accurate molecular dynamics simulations yet of the interface between water and air during a chemical reaction. The simulations have uncovered how water controls such chemical reactions by dynamically coupling with the molecules involved in the process.

10.06.2025

18:54 IBM announces new quantum processor, plan for Starling supercomputer by 2029

Part of the company's plan involves the new IBM Quantum Nighthawk processor, which is set to release later this year.

15:34 IBM says it will build a practical quantum supercomputer by 2029

The company has unveiled new innovations in quantum hardware and software that researchers hope will make quantum computing both error-proof and useful before the end of the decade

05.06.2025

21:49 Sandia Fires Up a Brain-Like Supercomputer That Can Simulate 180 Million Neurons

Brain-inspired computers could boost AI efficiency, a tantalizing prospect as the industry's energy bills mount.

04.06.2025

19:31 How physicists used antimatter, supercomputers and giant magnets to solve a 20-year-old mystery

Physicists are always searching for new theories to improve our understanding of the universe and resolve big unanswered questions.

02.06.2025

22:07 Supercomputer simulation reveals how merging neutron stars form black holes and powerful jets

Merging neutron stars are excellent targets for multi-messenger astronomy. This modern and still very young method of astrophysics coordinates observations of the various signals from one and the same astrophysical source. When two neutron stars collide, they emit gravitational waves, neutrinos and radiation across the entire electromagnetic spectrum. To detect them, researchers need to add gravitational wave detectors and neutrino telescopes to ordinary telescopes that capture light.

29.05.2025

23:08 Energy Department Unveils New Supercomputer That Merges With A.I.

The new supercomputer shows the increasing desire of government labs to adopt more technologies from commercial artificial intelligence systems.

27.05.2025

09:48 Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers

arXiv:2411.16025v2 Announce Type: replace Abstract: Graph Convolutional Networks (GCNs), particularly for large-scale graphs, are crucial across numerous domains. However, training distributed full-batch GCNs on large-scale graphs suffers from inefficient memory access patterns and high communication overhead. To address these challenges, we introduce \method{}, an efficient and scalable distributed GCN training framework tailored for CPU-powered supercomputers. Our contributions are threefold: (1) we develop general and efficient aggregation operators designed for irregular memory access, (2) we propose a hierarchical aggregation scheme that reduces communication costs without altering the graph structure, and (3) we present a communication-aware quantization scheme to enhance performance. Experimental results demonstrate that \method{} achieves a speedup of up to 6x compared with the SoTA implementations, and scales to 1000s of HPC-grade CPUs on the largest publicly available

23.05.2025

17:34 China is building a constellation of AI supercomputers in space — and just launched the first pieces

China has launched the first cluster of satellites for a planned AI supercomputer array. The first-of-its-kind array will enable scientists to perform in-orbit data processing.

00:37 Trippy supercomputer simulation offers unprecedented view of the space between stars

A groundbreaking new supercomputer model shows how magnetic fields shape the turbulent flow of charged particles in space.

22.05.2025

09:16 Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs

arXiv:2505.14796v1 Announce Type: new Abstract: As supercomputers grow in size and complexity, power efficiency has become a critical challenge, particularly in understanding GPU power consumption within modern HPC workloads. This work addresses this challenge by presenting a data co-analysis approach using system data collected from the Polaris supercomputer at Argonne National Laboratory. We focus on GPU utilization and power demands, navigating the complexities of large-scale, heterogeneous datasets. Our approach, which incorporates data preprocessing, post-processing, and statistical methods, condenses the data volume by 94% while preserving essential insights. Through this analysis, we uncover key opportunities for power optimization, such as reducing high idle power costs, applying power strategies at the job-level, and aligning GPU power allocation with workload demands. Our findings provide actionable insights for energy-efficient computing and offer a practical, reproducible

21.05.2025

12:02 Building quantum supercomputers: Scientists connect two quantum processors using existing fiber optic cables for the first time

Scientists have connected two quantum computers, paving the way for distributed quantum computing, quantum supercomputers and a quantum internet.

20.05.2025

14:11 Exascale computing is here — what does this new era of computing mean and what are exascale supercomputers capable of?

Exascale computing can process over a quintillion operations every second — enabling supercomputers to perform complex simulations that were previously impossible. But how does it work?

19.05.2025

23:27 UK weather forecast more accurate with Met Office supercomputer

Detailed weather forecasts and better predictions about the rain will soon be enjoyed in the UK.

21:59 Novel data streaming software chases light speed from accelerator to supercomputer

Analyzing massive datasets from nuclear physics experiments can take hours or days to process, but researchers are working to radically reduce that time to mere seconds using special software being developed at the Department of Energy's Lawrence Berkeley and Oak Ridge national laboratories.

12.05.2025

08:00 Characterizing GPU Energy Usage in Exascale-Ready Portable Science Applications

arXiv:2505.05623v1 Announce Type: new Abstract: We characterize the GPU energy usage of two widely adopted exascale-ready applications representing two classes of particle and mesh solvers: (i) QMCPACK, a quantum Monte Carlo package, and (ii) AMReX-Castro, an adaptive mesh astrophysical code. We analyze power, temperature, utilization, and energy traces from double-/single (mixed)-precision benchmarks on NVIDIA's A100 and H100 and AMD's MI250X GPUs using queries in NVML and rocm_smi_lib, respectively. We explore application-specific metrics to provide insights on energy vs. performance trade-offs. Our results suggest that mixed-precision energy savings range between 6-25% on QMCPACK and 45% on AMReX-Castro. Also there are still gaps in the AMD tooling on Frontier GPUs that need to be understood, while query resolutions on NVML have little variability between 1 ms and 1 s. Overall, application level knowledge is crucial to define energy-cost/science-benefit opportunities for the

09.05.2025

10:15 ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling

arXiv:2505.04802v1 Announce Type: cross Abstract: Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-resolution climate downscaling. ORBIT-2 incorporates two key innovations: (1) Residual Slim ViT (Reslim), a lightweight architecture with residual learning and Bayesian regularization for efficient, robust prediction; and (2) TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear, enabling long-sequence processing and massive parallelism. ORBIT-2 scales to 10 billion parameters across 32,768 GPUs, achieving up to 1.8 ExaFLOPS sustained throughput and 92-98% strong scaling efficiency. It

30.04.2025

22:48 Quantum computer outperforms supercomputers in approximate optimization tasks

A quantum computer can solve optimization problems faster than classical supercomputers, a process known as "quantum advantage" and demonstrated by a USC researcher in a paper recently published in Physical Review Letters.

29.04.2025

08:01 The Big Send-off: High Performance Collectives on GPU-based Supercomputers

arXiv:2504.18658v1 Announce Type: new Abstract: We evaluate the current state of collective communication on GPU-based supercomputers for large language model (LLM) training at scale. Existing libraries such as RCCL and Cray-MPICH exhibit critical limitations on systems such as Frontier -- Cray-MPICH underutilizes network and compute resources, while RCCL suffers from severe scalability issues. To address these challenges, we introduce PCCL, a communication library with highly optimized implementations of all-gather and reduce-scatter operations tailored for distributed deep learning workloads. PCCL is designed to maximally utilize all available network and compute resources and to scale efficiently to thousands of GPUs. It achieves substantial performance improvements, delivering 6-33x speedups over RCCL and 28-70x over Cray-MPICH for all-gather on 2048 GCDs of Frontier. These gains translate directly to end-to-end performance: in large-scale GPT-3-style training, PCCL provides up to

23.04.2025

10:01 Trends in AI Supercomputers

arXiv:2504.16026v1 Announce Type: new Abstract: Frontier AI development relies on powerful AI supercomputers, yet analysis of these systems is limited. We create a dataset of 500 AI supercomputers from 2019 to 2025 and analyze key trends in performance, power needs, hardware cost, ownership, and global distribution. We find that the computational performance of AI supercomputers has doubled every nine months, while hardware acquisition cost and power needs both doubled every year. The leading system in March 2025, xAI's Colossus, used 200,000 AI chips, had a hardware cost of $7B, and required 300 MW of power, as much as 250,000 households. As AI supercomputers evolved from tools for science to industrial machines, companies rapidly expanded their share of total AI supercomputer performance, while the share of governments and academia diminished. Globally, the United States accounts for about 75% of total performance in our dataset, with China in second place at 15%. If the observed
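The doubling times quoted in this abstract can be converted into yearly growth factors with a little arithmetic. The sketch below is only a reading aid based on the figures stated above; the function name `annual_growth_factor` is hypothetical, and the result assumes steady exponential growth.

```python
def annual_growth_factor(doubling_time_months):
    """Yearly multiplier implied by a doubling time, assuming steady exponential growth."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (12.0 / doubling_time_months)


# Figures taken from the abstract: performance doubles every 9 months,
# hardware cost and power needs double every 12 months.
performance_per_year = annual_growth_factor(9)    # 2 ** (4/3), roughly 2.5x per year
cost_per_year = annual_growth_factor(12)          # exactly 2x per year

# Implied yearly improvement in performance per dollar (and per watt).
performance_per_cost = performance_per_year / cost_per_year

# Colossus figures from the abstract: 300 MW compared to 250,000 households.
kw_per_household = 300_000 / 250_000              # implied 1.2 kW per household
```

Under these assumptions, performance grows about 2.5x per year while cost and power grow 2x, so performance per dollar and per watt improve by roughly a quarter each year.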

14.04.2025

20:43 Trump throws wrench into NSF’s support for new Texas supercomputer

Spending fight with Congress could force NSF to pull plug on fastest university-based machine

17:51 Nvidia to mass produce AI supercomputers in Texas as part of $500 billion U.S. push

Its Blackwell AI chips have started production in Phoenix, Arizona, at Taiwan Semiconductor plants.

07.04.2025

10:13 Performance Analysis of HPC applications on the Aurora Supercomputer: Exploring the Impact of HBM-Enabled Intel Xeon Max CPUs

arXiv:2504.03632v1 Announce Type: new Abstract: The Aurora supercomputer is an exascale-class system designed to tackle some of the most demanding computational workloads. Equipped with both High Bandwidth Memory (HBM) and DDR memory, it provides unique trade-offs in performance, latency, and capacity. This paper presents a comprehensive analysis of the memory systems on the Aurora supercomputer, with a focus on evaluating the trade-offs between HBM and DDR memory systems. We explore how different memory configurations, including memory modes (Flat and Cache) and clustering modes (Quad and SNC4), influence key system performance metrics such as memory bandwidth, latency, CPU-GPU PCIe bandwidth, and MPI communication bandwidth. Additionally, we examine the performance of three representative HPC applications -- HACC, QMCPACK, and BFS -- each illustrating the impact of memory configurations on performance. By using microbenchmarks and application-level analysis, we provide insights into

06.04.2025

15:10 MIT invents new way for QPUs to communicate — paving the way for a scalable 'quantum supercomputer'

A new device enables remote entanglement, allowing distant quantum processors to communicate with one another with reduced error rates.

05.04.2025

16:06 Mini desktop supercomputer coming this year — powerful enough to run advanced AI models and small enough to fit in your bag

The new DGX machines are portable but powerful enough to drive complex AI modules and research, with processing capabilities previously only available in data centers.

03.04.2025

19:19 Supercomputer models microtubule dynamics, offering new insights into neurodegenerative diseases

Each day, a human adult loses on average 50 to 70 billion cells, which die from natural causes alone. New cells replace lost ones by the complex process of cell division, which relies on what scientists call molecular machines to transport chemical cargo to where it is needed for reactions that keep us alive.

01.04.2025

12:26 Globus Service Enhancements for Exascale Applications and Facilities

arXiv:2503.22981v1 Announce Type: new Abstract: Many extreme-scale applications require the movement of large quantities of data to, from, and among leadership computing facilities, as well as other scientific facilities and the home institutions of facility users. These applications, particularly when leadership computing facilities are involved, can touch upon edge cases (e.g., terabyte files) that had not been a focus of previous Globus optimization work, which had emphasized rather the movement of many smaller (megabyte to gigabyte) files. We report here on how automated client-driven chunking can be used to accelerate both the movement of large files and the integrity checking operations that have proven to be essential for large data transfers. We present detailed performance studies that provide insights into the benefits of these modifications in a range of file transfer scenarios.
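The chunking idea can be illustrated with a short sketch that splits one large file into byte ranges and checksums the ranges concurrently. This is not the Globus API or its algorithm: the function names, the SHA-256 choice, and the thread pool are assumptions made for illustration only.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(total_size, chunk_size):
    """Split [0, total_size) into (offset, length) pairs of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


def checksum_chunk(path, offset, length):
    """SHA-256 of one byte range, read in 1 MiB blocks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        handle.seek(offset)
        remaining = length
        while remaining > 0:
            block = handle.read(min(1 << 20, remaining))
            if not block:  # file shorter than expected
                break
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()


def checksum_file_in_chunks(path, chunk_size, workers=4):
    """Checksum every chunk of a file concurrently; returns [((offset, length), hex_digest)]."""
    ranges = chunk_ranges(os.path.getsize(path), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(lambda r: checksum_chunk(path, r[0], r[1]), ranges))
    return list(zip(ranges, digests))
```

Because each range is verified independently, a mismatch points at one chunk that can be re-sent on its own, rather than forcing a retransfer of the whole terabyte-scale file.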

28.03.2025

13:06 This UK-Built AI Supercomputer Could Change How We Fight Disease, Climate Change, and Energy Crises

Dawn, the AI supercomputer from Cambridge, accelerates breakthroughs in climate science, medical diagnostics, and clean energy with advanced computing power.

08:52 Workshop Scientific HPC in the pre-Exascale era (part of ITADATA 2024) Proceedings

arXiv:2503.21415v1 Announce Type: new Abstract: The proceedings of Workshop Scientific HPC in the pre-Exascale era (SHPC), held in Pisa, Italy, September 18, 2024, are part of 3rd Italian Conference on Big Data and Data Science (ITADATA2024) proceedings (arXiv: 2503.14937). The main objective of SHPC workshop was to discuss how the current most critical questions in HPC emerge in astrophysics, cosmology, and other scientific contexts and experiments. In particular, SHPC workshop focused on:
• Scientific (mainly in astrophysical and medical fields) applications toward (pre-)Exascale computing
• Performance portability
• Green computing
• Machine learning
• Big Data management
• Programming on heterogeneous architectures
• Programming on accelerators
• I/O techniques

21.03.2025

20:04 New AI is better at weather prediction than supercomputers — and it consumes 1000s of times less energy

The Aardvark Weather machine learning algorithm is much faster than traditional systems and can work on a desktop computer.

20.03.2025

21:40 AI can forecast the weather in seconds without needing supercomputers

While earlier weather-forecasting AIs have replaced some tasks done by traditional models, new research uses machine learning to replace the entire process, making it much faster

14.03.2025

10:15 Introducing MareNostrum5: A European pre-exascale energy-efficient system designed to serve a broad spectrum of scientific workloads

arXiv:2503.09917v1 Announce Type: new Abstract: MareNostrum5 is a pre-exascale supercomputer at the Barcelona Supercomputing Center (BSC), part of the EuroHPC Joint Undertaking. With a peak performance of 314 petaflops, MareNostrum5 features a hybrid architecture comprising Intel Sapphire Rapids CPUs, NVIDIA Hopper GPUs, and DDR5 and high-bandwidth memory (HBM), organized into four partitions optimized for diverse workloads. This document evaluates MareNostrum5 through micro-benchmarks (floating-point performance, memory bandwidth, interconnect throughput), HPC benchmarks (HPL and HPCG), and application studies using Alya, OpenFOAM, and IFS. It highlights MareNostrum5's scalability, efficiency, and energy performance, utilizing the EAR (Energy Aware Runtime) framework to assess power consumption and the effects of direct liquid cooling. Additionally, HBM and DDR5 configurations are compared to examine memory performance trade-offs. Designed to complement standard technical

13.03.2025

16:01 China achieves quantum supremacy claim with new chip 1 quadrillion times faster than the most powerful supercomputers

This new superconducting prototype quantum processor achieved benchmarking results to rival Google's new Willow QPU.

00:39 Supercomputer draws molecular blueprint for repairing damaged DNA

Sunburns and aging skin are obvious effects of exposure to harmful UV rays, tobacco smoke and other carcinogens. But the effects aren't just skin deep. Inside the body, DNA is literally being torn apart.

12.03.2025

08:04 MFC 5.0: An exascale many-physics flow solver

arXiv:2503.07953v1 Announce Type: new Abstract: Engineering, medicine, and the fundamental sciences broadly rely on flow simulations, making performant computational fluid dynamics solvers an open source software mainstay. A previous work made MFC 3.0 a published open source solver with many features. MFC 5.0 is a marked update to MFC 3.0, including a broad set of well-established and novel physical models and numerical methods and the introduction of GPU and APU (or superchip) acceleration. We exhibit state-of-the-art performance and ideal scaling on the first two exascale supercomputers, OLCF Frontier and LLNL El Capitan. Combined with MFC's single-GPU/APU performance, MFC achieves exascale computation in practice. With these capabilities, MFC has evolved into a tool for conducting simulations that many engineering challenge problems hinge upon. New physical features include the immersed boundary method, $N$-fluid phase change, Euler--Euler and Euler--Lagrange sub-grid bubble

07:42 MFC 5.0: An exascale many-physics flow solver

arXiv:2503.07953v1 Announce Type: cross Abstract: Engineering, medicine, and the fundamental sciences broadly rely on flow simulations, making performant computational fluid dynamics solvers an open source software mainstay. A previous work made MFC 3.0 a published open source solver with many features. MFC 5.0 is a marked update to MFC 3.0, including a broad set of well-established and novel physical models and numerical methods and the introduction of GPU and APU (or superchip) acceleration. We exhibit state-of-the-art performance and ideal scaling on the first two exascale supercomputers, OLCF Frontier and LLNL El Capitan. Combined with MFC's single-GPU/APU performance, MFC achieves exascale computation in practice. With these capabilities, MFC has evolved into a tool for conducting simulations that many engineering challenge problems hinge upon. New physical features include the immersed boundary method, $N$-fluid phase change, Euler--Euler and Euler--Lagrange sub-grid

07.03.2025

14:10 HERACLES++: a multi-dimensional Eulerian code for exascale computing

arXiv:2503.04428v1 Announce Type: cross Abstract: Numerical simulations of multidimensional astrophysical fluids present considerable challenges. However, the development of exascale computing has significantly enhanced computational capabilities, motivating the development of new codes that can take full advantage of these resources. In this article, we introduce HERACLES++, a new hydrodynamics code with high portability, optimized for exascale machines with different architectures and running efficiently both on CPUs and GPUs. The code is Eulerian and employs a Godunov finite-volume method to solve the hydrodynamics equations, which ensures accuracy in capturing shocks and discontinuities. It includes different Riemann solvers, equations of state, and gravity solvers. It works in Cartesian and spherical coordinates, either in 1-D, 2-D, or 3-D, and uses passive scalars to handle gases with several species. The code accepts a user-supplied heating or cooling term to treat a variety of

05.03.2025

17:08 Supercomputers reveal how small ocean processes influence storms

For decades, scientists assumed that only large ocean temperature patterns covering 200 kilometers (124 miles) or more could strongly influence storms. Now, by leveraging advances in computing power, a team of scientists from UC San Diego's Scripps Institution of Oceanography, NASA Jet Propulsion Laboratory and NASA Goddard Space Flight Center have discovered that small-scale ocean processes can have a much larger influence on storm development than previously thought.

03.03.2025

21:16 Superconducting quantum processor prototype operates 10¹⁵ times faster than fastest supercomputer

Zuchongzhi-3, a superconducting quantum computing prototype with 105 qubits and 182 couplers, has made significant advancements in random quantum circuit sampling. This prototype was successfully developed by a research team from the University of Science and Technology of China (USTC).

23.02.2025

19:09 NASA supercomputer reveals strange spiral structure at the edge of our solar system

The mysterious Oort cloud is the source of many of our solar system's comets, but astronomers still have no idea what it looks like. Now, new simulations may have given them a first glimpse.

18.02.2025

15:37 Quantum simulation breakthrough will lead to 'discoveries impossible in today's fastest supercomputers,' Google scientists claim

By combining digital and analog quantum simulation into a new hybrid approach, scientists have already started to make fresh scientific discoveries using quantum computers.

13.02.2025

15:00 Supercomputer runs largest and most complicated simulation of the universe ever

Frontier, the second fastest supercomputer in the world, used dark matter and the movement of gas and plasma rather than just gravity to model the observable universe.

10:28 Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers

arXiv:2502.08145v1 Announce Type: new Abstract: Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs, and a highly scalable software stack. In this work, we present a novel four-dimensional hybrid parallel algorithm implemented in a highly scalable, portable, open-source framework called AxoNN. We describe several performance optimizations in AxoNN to improve matrix multiply kernel performance, overlap non-blocking collectives with computation, and performance modeling to choose performance optimal configurations. These have resulted in unprecedented scaling and peak flop/s (bf16) for training of GPT-style transformer models on Perlmutter (620.1 Petaflop/s), Frontier (1.381 Exaflop/s) and Alps (1.423 Exaflop/s). While the abilities of LLMs improve with the number of trainable parameters, so do privacy and copyright risks caused by memorization of training data, which can cause disclosure of

12.02.2025

23:02 Wildfires intensifying more due to changes in vegetation and humidity than to lightning, supercomputer simulation finds

Extreme fire seasons in recent years highlight the urgent need to better understand wildfires within the broader context of climate change. Under climate change, many drivers of wildfires are expected to change, such as the amount of carbon stored in vegetation, rainfall, and lightning strikes.

11.02.2025

21:13 Supercomputer simulations of giant radio galaxy formation challenge current theoretical models

Enabled by supercomputing, University of Pretoria (UP) researchers have led an international team of astronomers that has provided deeper insight into the entire life cycle (birth, growth and death) of giant radio galaxies, which resemble "cosmic fountains"—jets of superheated gas that are ejected into near-empty space from their spinning supermassive black holes.

20:03 World's 1st hybrid quantum supercomputer goes online in Japan

Japan's Fugaku supercomputer has gained an edge following the installation of the Reimei quantum computer.

06.02.2025

18:22 Supercomputer simulation shows why beneficial mutations rarely lead to hypermutators in real organisms

In real life, mutants can arise when their DNA changes to give them an advantage over the rest of the population. A team from the University of Michigan has used simulations on the Pittsburgh Supercomputing Center's Neocortex system to find out why beneficial mutants rarely come to dominate real organisms.

02:02 https://www.physics.ox.ac.uk/news/paving-way-quantum-supercomputers

In a milestone that brings quantum computing tangibly closer to large-scale practical use, scientists have demonstrated the first instance of distributed quantum computing. Using a photonic network interface, they successfully linked two separate quantum processors to form a single, fully connected quantum computer, paving the way to tackling computational challenges previously out of reach.

05.02.2025

19:05 Quantum algorithm distributed across multiple processors for the first time—paving the way to quantum supercomputers

In a milestone that brings quantum computing tangibly closer to large-scale practical use, scientists at Oxford University Physics have demonstrated the first instance of distributed quantum computing.

10:35 How to Build a Quantum Supercomputer: Scaling from Hundreds to Millions of Qubits

arXiv:2411.10406v2 Announce Type: cross Abstract: In the span of four decades, quantum computation has evolved from an intellectual curiosity to a potentially realizable technology. Today, small-scale demonstrations have become possible for quantum algorithmic primitives on hundreds of physical qubits and proof-of-principle error-correction on a single logical qubit. Nevertheless, despite significant progress and excitement, the path toward a full-stack scalable technology is largely unknown. There are significant outstanding quantum hardware, fabrication, software architecture, and algorithmic challenges that are either unresolved or overlooked. These issues could seriously undermine the arrival of utility-scale quantum computers for the foreseeable future. Here, we provide a comprehensive review of these scaling challenges. We show how the road to scaling could be paved by adopting existing semiconductor technology to build much higher-quality qubits, employing system engineering

22.01.2025

13:09 Extreme scaling of the metadynamics of paths algorithm on the pre-exascale JUWELS Booster supercomputer

arXiv:2501.11962v1 Announce Type: new Abstract: Molecular dynamics (MD)-based path sampling algorithms are a very important class of methods used to study the energetics and kinetics of rare (bio)molecular events. They sample the highly informative but highly unlikely reactive trajectories connecting different metastable states of complex (bio)molecular systems. The metadynamics of paths (MoP) method proposed by Mandelli, Hirshberg, and Parrinello [Phys. Rev. Lett. 125 2, 026001 (2020)] is based on the Onsager-Machlup path integral formalism. This provides an analytical expression for the probability of sampling stochastic trajectories of given duration. In practice, the method samples reactive paths via metadynamics simulations performed directly in the phase space of all possible trajectories. Its parallel implementation is in principle infinitely scalable, allowing arbitrarily long trajectories to be simulated. Paving the way for future applications to study the thermodynamics and

21.01.2025

16:09 World's fastest supercomputer 'El Capitan' goes online — it will be used to secure the U.S. nuclear stockpile and in other classified research

The world's fastest supercomputer 'El Capitan' can reach a peak performance of 2.746 exaFLOPS, making it the planet's third exascale computer.

07.01.2025

07:20 Nvidia's mini 'desktop supercomputer' is 1,000 times more powerful than a laptop — and it can fit in your bag

New Project Digits mini PC offers a petaFLOP of power for local AI processing and data science.

06.01.2025

13:33 Data Parallel Visualization and Rendering on the RAMSES Supercomputer with ANARI

arXiv:2501.01628v1 Announce Type: new Abstract: 3D visualization and rendering in HPC are very heterogeneous applications, though fundamentally the tasks involved are well-defined and do not differ much from application to application. The Khronos Group's ANARI standard seeks to consolidate 3D rendering across sci-vis applications. This paper makes an effort to convey challenges of 3D rendering and visualization with ANARI in the context of HPC, where the data does not fit within a single node or GPU but must be distributed. It also provides a gentle introduction to parallel rendering concepts and challenges to practitioners from the field of HPC in general. Finally, we present a case study showcasing data parallel rendering on the new supercomputer RAMSES at the University of Cologne.

23.12.2024

08:12 Asynchronous-Many-Task Systems: Challenges and Opportunities -- Scaling an AMR Astrophysics Code on Exascale machines using Kokkos and HPX

arXiv:2412.15518v1 Announce Type: new Abstract: Dynamic and adaptive mesh refinement is pivotal in high-resolution, multi-physics, multi-model simulations, necessitating precise physics resolution in localized areas across expansive domains. Today's supercomputers' extreme heterogeneity presents a significant challenge for dynamically adaptive codes, highlighting the importance of achieving performance portability at scale. Our research focuses on astrophysical simulations, particularly stellar mergers, to elucidate early universe dynamics. We present Octo-Tiger, leveraging Kokkos, HPX, and SIMD for portable performance at scale in complex, massively parallel adaptive multi-physics simulations. Octo-Tiger supports diverse processors, accelerators, and network backends. Experiments demonstrate exceptional scalability across several heterogeneous supercomputers including Perlmutter, Frontier, and Fugaku, encompassing major GPU architectures and x86, ARM, and RISC-V CPUs. Parallel

10.12.2024

16:01 New AI cracks complex engineering problems faster than supercomputers

Modeling how cars deform in a crash, how spacecraft respond to extreme environments, or how bridges resist stress could be made thousands of times faster thanks to new artificial intelligence that enables personal computers to solve massive math problems that generally require supercomputers.

09.12.2024

19:03 Google's new quantum chip has solved a problem the best supercomputer would have taken a quadrillion times the age of the universe to crack

Google's new 105-qubit 'Willow' quantum processor has surpassed a key error-correction threshold first proposed in 1995 — with errors now reducing exponentially as you scale up quantum machines.

06.12.2024

15:07 World's 2nd fastest supercomputer runs largest-ever simulation of the universe

The simulations will be used by astronomers to test the standard model of cosmology.

29.11.2024

21:42 A Superfast Supercomputer Creates the Biggest Simulation of the Universe Yet

Scientists at the Department of Energy’s Argonne National Laboratory have created the largest astrophysical simulation of the Universe ever. They used what was until recently the world’s most powerful supercomputer to simulate the Universe at an unprecedented scale. The simulation’s size corresponds to the largest surveys conducted by powerful telescopes and observatories. The Frontier Supercomputer …

25.11.2024

21:57 Record-breaking run on Frontier sets new bar for simulating the universe in exascale era

The universe just got a whole lot bigger—or at least in the world of computer simulations, that is. In early November, researchers at the Department of Energy's Argonne National Laboratory used the fastest supercomputer on the planet to run the largest astrophysical simulation of the universe ever conducted.

22.11.2024

20:44 New Supercomputer Simulation Explains How Mars Got Its Moons

One mystery in planetary science is a satisfying origin story for Mars's moons, Phobos and Deimos. Were they chunks of Mars blasted into space by a meteor impact? Were they captured asteroids from the belt? A new supercomputer simulation found that a reasonable explanation could come from a massive asteroid passing just close enough to Mars that it was torn into pieces. Over time, chunks and debris would have settled into a disk around Mars and clumped into moons.

20.11.2024

23:00 Making Mars's moons: Supercomputers offer 'disruptive' new explanation

A NASA study using a series of supercomputer simulations reveals a potential new solution to a longstanding Martian mystery: How did Mars get its moons? The first step, the findings say, may have involved the destruction of an asteroid.

19.11.2024

21:39 Machine learning and supercomputer simulations help researchers to predict interactions between gold nanoparticles and blood proteins

Researchers have used machine learning and supercomputer simulations to investigate how tiny gold nanoparticles bind to blood proteins. The studies discovered that favorable nanoparticle-protein interactions can be predicted from machine learning models that are trained from atom-scale molecular dynamics simulations. The new methodology opens ways to simulate efficacy of gold nanoparticles as targeted drug delivery systems in precision nanomedicine.

20:39 World's new fastest supercomputer is built to simulate nuclear bombs

The vast computational power of the El Capitan supercomputer at Lawrence Livermore National Laboratory in California will be used to support the US nuclear deterrent

08:47 Exascale Workflow Applications and Middleware: An ExaWorks Retrospective

arXiv:2411.10637v1 Announce Type: new Abstract: Exascale computers offer transformative capabilities to combine data-driven and learning-based approaches with traditional simulation applications to accelerate scientific discovery and insight. However, these software combinations and integrations are difficult to achieve due to the challenges of coordinating and deploying heterogeneous software components on diverse and massive platforms. We present the ExaWorks project, which addresses many of these challenges. We developed a workflow Software Development Toolkit (SDK), a curated collection of workflow technologies that can be composed and interoperated through a common interface, engineered following current best practices, and specifically designed to work on HPC platforms. ExaWorks also developed PSI/J, a job management abstraction API, to simplify the construction of portable software components and applications that can be used over various HPC schedulers. The PSI/J API is a

18.11.2024

23:39 Machine learning and supercomputer simulations predict interactions between gold nanoparticles and blood proteins

Researchers in the Nanoscience Center at the University of Jyväskylä, Finland, have used machine learning and supercomputer simulations to investigate how tiny gold nanoparticles bind to blood proteins. The studies discovered that favorable nanoparticle-protein interactions can be predicted from machine learning models that are trained from atom-scale molecular dynamics simulations. The new methodology opens ways to simulate the efficacy of gold nanoparticles as targeted drug delivery systems in precision nanomedicine.

13.11.2024

05:05 Bulges calculated in the supercomputer: How cells digest their internal canal system

Inside cells, there exists an extensive system of canals known as the endoplasmic reticulum (ER), which consists of membrane-encased tubes that are partially broken down as needed -- for instance in case of a nutrient deficiency. As part of this process, bulges or protrusions form in the membrane, which then pinch off and are recycled by the cell. A study has examined this protrusion process using computer simulations. Its finding: certain structural motifs of proteins in the ER membrane play a central role in this process.

01.11.2024

19:47 Why a Memphis Community Is Fighting Elon Musk’s Supercomputer

Residents say Mr. Musk’s data center for artificial intelligence is compounding their pollution burden and adding stress on the local electrical grid.

31.10.2024

18:16 Why a Memphis Community Is Fighting Elon Musk’s Supercomputer

Residents say Mr. Musk’s data center for artificial intelligence is compounding their pollution burden and adding stress on the local electrical grid.

25.10.2024

12:50 Leveraging Hardware Performance Counters for Predicting Workload Interference in Vector Supercomputers

arXiv:2410.18126v1 Announce Type: new Abstract: In the rapidly evolving domain of high-performance computing (HPC), heterogeneous architectures such as the SX-Aurora TSUBASA (SX-AT) system architecture, which integrate diverse processor types, present both opportunities and challenges for optimizing resource utilization. This paper investigates workload interference within an SX-AT system, with a specific focus on resource contention between Vector Hosts (VHs) and Vector Engines (VEs). Through comprehensive empirical analysis, the study identifies key factors contributing to performance degradation, such as cache and memory bandwidth contention, when jobs with varying computational demands share resources. To address these issues, we develop a predictive model that leverages hardware performance counters (HCs) and machine learning (ML) algorithms to classify and predict workload interference. Our results demonstrate that the model accurately forecasts performance degradation, offering

23.10.2024

23:34 Tracking down nuclear fission's elusive scission neutron with a supercomputer

Nuclear fission—when the nucleus of an atom splits in two, releasing energy—may seem like a process that is fully understood. First discovered in 1939 and thoroughly studied ever since, fission is a constant factor in modern life, used in everything from nuclear medicine to power-generating nuclear reactors. However, it is a force of nature that still contains mysteries yet to be solved.