About NewMachine
NewMachine was founded in 2018 by Viktor Harlan, a data-center engineer who spent fifteen years designing HPC infrastructure for the world's most demanding scientific computing and AI research organizations. Viktor noticed a pattern: every ML team was solving the same low-level problems — GPU interconnect topology, NCCL optimization, distributed checkpoint storage, thermal management — yet each was doing it in isolation, often badly. He believed there was room for a dedicated infrastructure company that could do this work once, do it right, and offer it as a service.
The first NewMachine facility opened in a converted industrial warehouse in Secaucus, New Jersey, retrofitted from the ground up for GPU-dense workloads. It featured liquid cooling for 8-GPU nodes, a full-bisection InfiniBand fabric, and a distributed storage layer optimized for checkpoint I/O. The first tenants — two AI research labs — signed before construction was finished.
Seven years later, NewMachine operates six facilities across New Jersey, Chicago, London, Frankfurt, Tokyo, and Singapore. We serve over 90 organizations ranging from two-person ML startups to global technology companies training foundation models. Our managed GPU cloud lets clients provision bare-metal H100 and B200 clusters, high-speed storage, and private network circuits through an API, with the same reliability guarantees they expect from their own hardware — and none of the operational burden.
Our Mission
To provide the strongest possible foundation for AI and ML workloads by building infrastructure that is fast, resilient, and invisible.
Our Values
Durability
We design every system for a ten-year lifespan and test it against failure scenarios most teams never consider. Redundant power, redundant cooling, redundant network paths — because in GPU infrastructure, "good enough" is a liability.
Precision
Every FLOP matters and every bottleneck adds up. We measure everything — interconnect latency, GPU memory bandwidth, storage throughput, cooling efficiency — because at the infrastructure layer, small imprecisions cascade into training time and cost overruns.
Operational Excellence
Infrastructure is only as good as the team that operates it. Our NOC runs around the clock with engineers who understand ML training workflows, not just server hardware. GPU failures are detected in seconds, communicated in minutes, and resolved with the urgency that multi-day training runs demand.