Designing & Building Data Center AI Infrastructure at Scale
89,000+
GPUs Deployed & Managed
3.3+ Billion
Hours of GPU Runtime
Penguin Solutions designed, built, deployed, and now manages one of Korea’s largest GPU clusters, consisting of over 1,000 NVIDIA Blackwell GPUs integrated into a single cluster.
Shell powers its sustainable high-performance data centers with Penguin’s high-performance computing (HPC) solutions, including immersion cooling.
Penguin Solutions designed, built, and deployed the infrastructure to support the Georgia Tech AI Makerspace.
Penguin Solutions deploys NextSilicon accelerator technology as part of the Vanguard program at Sandia National Labs.
Accelerate time to value by basing system architectures on a proven set of designs that have been validated at scale in numerous production deployments.
Achieve high rates of system stability with our in-factory experts who integrate and validate all components of the compute cluster including rack integration, network configuration, and burn-in testing.
Drive on-site installations with coordination of data center staff, data storage partners, and infrastructure cooling providers—and utilize ICE ClusterWare software to validate production readiness.
Assure production readiness and change management by working with a certified NVIDIA DGX Managed Services provider that offers a full set of end-to-end services.
“Penguin Solutions demonstrated a deep understanding of our technical requirements, translating them into a sophisticated infrastructure environment that meets and exceeds expectations.”
“It takes a village to do AI well, it takes an infrastructure, it takes a data center, and it takes experts. And, I think in that regard, having Georgia Tech, NVIDIA, and Penguin—that’s what it takes.”
“After a thorough RFP process, it was clear early on that Penguin was the right partner for us. Not only do they have the technical expertise and decades of experience, but they’re able to move very fast.”

OriginAI®
OriginAI® is an AI factory infrastructure solution built on proven, pre-defined AI architectures that can scale from clusters of hundreds of GPUs to clusters of over 16,000 GPUs.
OriginAI integrates these validated technologies with Penguin’s intelligent, intuitive cluster management software and expert services for designing, building, deploying, and managing AI infrastructure at scale.

ICE ClusterWare™
Simplify the deployment and management of AI clusters to realize greater productivity at speed.
With ICE ClusterWare™, bare-metal hardware, network, and software resources are transformed into high-performance cluster environments, reducing administration complexity and optimizing resource availability.

Delivering NVIDIA DGX-Ready Managed Services
Penguin Solutions has designed and deployed large NVIDIA DGX clusters with high-speed NVIDIA InfiniBand networking and optimized storage.
We have deep expertise and relationships with most storage vendors, which allows us to provide bespoke solutions for every customer.

Stratus ztC Endurance®
Stratus ztC Endurance® is an innovative family of computing platforms that enables intelligent, predictive fault tolerance and 99.99999% compute platform availability.
The platform combines built-in fault tolerance, proactive health monitoring, and serviceability by OT or IT, all while meeting your cybersecurity requirements.
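To put the 99.99999% availability figure in perspective, the short sketch below converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration only; they do not come from Stratus documentation, and the calculation says nothing about how availability is actually measured for the platform.

```python
# Illustrative arithmetic only: convert an availability percentage into
# the maximum downtime it permits over one year.
# Assumption: a 365-day (non-leap) year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def max_downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the seconds of downtime per year allowed at this availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.999, 99.99999):
        seconds = max_downtime_seconds_per_year(pct)
        print(f"{pct}% availability allows about {seconds:.2f} s of downtime per year")
```

Under these assumptions, seven nines works out to roughly three seconds of downtime per year, compared with several hours per year at 99.9%.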

Stratus ztC Edge®
Stratus ztC Edge® is a secure, rugged, highly automated computing platform that improves productivity, increases operational efficiency, and reduces downtime risk at the edge of corporate networks.
Its self-protecting and self-monitoring features drastically reduce unplanned downtime and ensure continuous availability of business-critical applications.

Stratus everRun®
Stratus everRun® is a software solution that pairs two servers via virtualization to create protected and replicated virtual machines (VMs) within a single operating environment, ensuring your applications run without interruption or data loss.
Stratus everRun accelerates time to revenue by transforming your applications into continuously available solutions with customized availability.

Introducing the New Family of CXL® Add-in-Cards (AICs)
Compute Express Link (CXL) enables data centers, cloud services, and HPC providers to expand memory for intensive computing easily and cost-effectively.

Ultra-High Reliability Zefr ZDIMM Memory Modules
Ideal for data centers, hyperscalers, and HPC platforms running large memory applications that require maximum compute availability.

Next-Generation Data Center SSDs
Designed to meet the stringent demands placed on storage systems in hyperscaler, hyper-converged, enterprise, and edge data centers.

Penguin Solutions Featured on USA Today’s Climate Leaders 2026

Penguin Solutions Powers Deepgram's Enterprise Voice AI Infrastructure

Penguin Solutions Introduces First Production-Ready CXL-Based KV Cache Server

Penguin Solutions Announces CEO Transition

Sandia Announces Spectra Supercomputer Installed by Penguin Solutions

Penguin Solutions Releases ICE ClusterWare Management Software 13.0

Our CEO, Mark Adams, Recently Spoke with Scott McGrew on NBC News

CEO & President Mark Adams Joins the Micro Journeys Podcast

SK Telecom Launches Sovereign AI Infrastructure, Powered by NVIDIA

Five Critical Design Considerations for AI Infrastructure

Penguin Solutions Signs Agreement with CDW Expanding Customer Reach

Stratus ztC Endurance Named “HPC Solution of the Year”

Penguin Solutions' OriginAI Honored as a Winner in the 2025 AI Excellence Awards

Penguin Solutions Supports Pure Storage Introduction of FlashBlade//EXA™

Rebellions Partners on Strategic Collaboration Initiative

Penguin Solutions Expands Its AI Infrastructure Management Software

Mark Seamans Discusses Simplifying AI Complexity with Data Management

Penguin Solutions Signs AI Data Center Collaboration Agreement with SK Telecom and SK hynix
Penguin Solutions Named in Top Five Vendors to Watch in 2024 HPCwire Readers’ and Editors’ Choice Awards

OriginAI Infrastructure Now Available with Additional GPUs and Enhanced Cluster Management Capabilities

Penguin Solutions Accelerates Time to Value for AI Factories

Penguin Solutions Selected as the Managed Services Partner for Voltage Park’s NVIDIA Clusters

Sandia Partners With NextSilicon and Penguin Solutions to Deliver ‘First of its Kind’ Runtime Reconfigurable Accelerator Technology

AI Makes Mark on Engineering Education

Georgia Tech Unveils New AI Makerspace in Collaboration with NVIDIA
The Infrastructure Behind the Outputs: Cloud and HPC Unlock the Power of AI

Shell Deploys Cooling Immersion Pods in Texas Data Center

Supercomputing Platform From Penguin Solutions Installed at DoD Site

Meta Is Building the World’s Fastest AI Supercomputer
Talk to Our Experts
Whether you’re struggling with AI solution design, build, deployment, or management—in your data center or in the cloud—Penguin Solutions can help.
Partner with Penguin Solutions and get on track to improve your AI advantage.