
How neoclouds meet the demands of AI workloads

News | February 12, 2026 | Artifice Prime

Neoclouds are specialized clouds devoted to the wildly dynamic world of artificial intelligence, a market currently experiencing explosive 35.9% annual growth. Built from the ground up to meet AI’s significant computational demands, neoclouds first emerged several years ago. Dozens of providers have arrived since then, with CoreWeave, Crusoe, Lambda, Nebius, and Vultr among the neocloud leaders.

The “neo” in neoclouds distinguishes them from more established cloud providers such as AWS, Google Cloud, and Microsoft Azure, whose multitude of options for infrastructure, managed services, and applications suggests that a cloud provider must offer an endless aisle of choices. The hyperscalers were first to support AI workloads, too, but theirs was an option retrofitted onto an existing platform rather than a clean-slate implementation built for purpose.

Neoclouds have one job: provide an optimal home for AI. Most obviously, that means neoclouds feature GPU-first computing, typically available at a price per hour less than half that of the hyperscalers. Neoclouds also offer high-bandwidth networking, low-latency storage, advanced power management, and managed services for deploying, monitoring, maintaining, and securing AI workloads. These capabilities are offered through a more streamlined, easy-to-use surface, unencumbered by traditional non-AI features.

In contrast to the cookie-cutter options offered by the hyperscalers, neoclouds take a boutique approach, responding to the special requirements and evolving needs of customers—including customers that push the envelope of AI development. That flexibility is a key reason why an increasing number of AI startups, enterprises, researchers, and independent developers are making neoclouds their AI platform of choice.

Choosing the best configuration

The best neoclouds offer a wide range of hardware choices plus skilled guidance for customers about which GPU, memory, networking, and storage options best suit which AI tasks. That advice is based on deep AI engineering experience, but a few general principles apply. If you were planning to train your own large language model (LLM), for example, you’d need the highest-end configuration available—at this writing, probably NVIDIA GB200 Grace Blackwell GPUs with 186GB of VRAM each.
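The arithmetic behind that guidance is simple enough to sketch. Below is a back-of-the-envelope Python estimate of the memory needed just to hold a model’s weights; the 70B parameter count and the precision figures are illustrative assumptions, not provider recommendations:

```python
# Rough VRAM sizing for hosting an LLM -- a back-of-the-envelope
# sketch, not a substitute for a provider's sizing guidance.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in GB."""
    return params_billions * bytes_per_param  # 1e9 params x N bytes = N GB

# A hypothetical 70B-parameter model at different precisions:
for precision, nbytes in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{precision}: ~{weight_memory_gb(70, nbytes):.0f} GB for weights alone")

# FP16: ~140 GB -- multiple GPUs or a GB200-class part.
# INT8: ~70 GB; INT4: ~35 GB -- fits a single high-memory GPU.
# Real deployments also need headroom for the KV cache and activations,
# often another 20-50% depending on batch size and context length.
```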

But today, vanishingly few players beyond such monster AI providers as Anthropic, OpenAI, Google, or Meta train their own LLMs. Fine-tuning LLMs that have already been trained, which typically includes augmenting them with additional data, is far more prevalent and requires far less horsepower. The same goes for LLM post-training and reinforcement learning. And the processing required for inference alone—that is, running LLMs that have already been trained and tuned—is again far less demanding.

It’s worth noting that massive consumer adoption of LLM chatbots has obscured the fact that AI covers a very wide range—including video generation, computer vision, image classification, speech recognition, and much more. Plus, small language models for such applications as code completion, customer service automation, and financial document analysis are becoming increasingly popular. To choose configurations that match AI tasks, neocloud customers must either come in the door with bona fide AI engineering skills or rely on the options and guidance offered by neocloud providers.

Managed AI services

Most added-value neocloud services center on maximizing inference performance, with ultra-low latency and seamless scaling. A key performance metric is TTFT (time to first token), which measures how long it takes for an LLM to generate and return the first token of its response after receiving a prompt.
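TTFT is also easy to measure for yourself. Here is a minimal sketch that assumes a provider exposes an OpenAI-compatible streaming API, as many neoclouds do; the base URL, API key, and model name are placeholders, not real endpoints:

```python
# Measure TTFT against an OpenAI-compatible streaming endpoint.
# The base_url, api_key, and model name below are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.example-neocloud.com/v1",
                api_key="YOUR_API_KEY")

start = time.perf_counter()
stream = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # hypothetical hosted model ID
    messages=[{"role": "user", "content": "Define TTFT in one sentence."}],
    stream=True,
)
for chunk in stream:
    # The first non-empty content delta marks the first token.
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"TTFT: {(time.perf_counter() - start) * 1000:.0f} ms")
        break
```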

No surprise, then, that one of the most competitive areas is optimization of a neocloud’s inference engine to reduce TTFT and sustain overall throughput. AI agents cannot afford to return 429 (Too Many Requests) errors, the rate-limiting responses that frustrate users by signaling that the server’s request limit has been exceeded.
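Neoclouds fight that battle at the infrastructure level, but well-behaved clients still hedge against it. A common pattern, sketched here with Python’s requests library against a hypothetical endpoint, is to retry on 429 with exponential backoff, honoring the server’s Retry-After header when present:

```python
# Client-side handling of HTTP 429 (Too Many Requests): retry with
# exponential backoff. The endpoint URL and payload are illustrative.
import time
import requests

def post_with_backoff(url: str, payload: dict, max_retries: int = 5) -> requests.Response:
    delay = 1.0
    for _ in range(max_retries):
        resp = requests.post(url, json=payload, timeout=30)
        if resp.status_code != 429:
            return resp
        # Prefer the server's hint; this sketch assumes Retry-After
        # carries seconds rather than an HTTP date.
        wait = resp.headers.get("Retry-After")
        time.sleep(float(wait) if wait else delay)
        delay *= 2  # exponential backoff between attempts
    raise RuntimeError("still rate limited after retries")
```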

A number of infrastructure-level techniques can keep AI results flowing. Sophisticated caching schemes can queue up local and remote nodes to provide nearly instantaneous results. Continuous batching reduces request wait times and maximizes GPU utilization. And a technique known as quantization deliberately reduces the precision of model weights post-training to shrink their memory footprint with no discernible effect on the accuracy of results. As workload sizes increase, the best neoclouds scale up to meet demand automatically, offering flexible token-based pricing to keep costs manageable.
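The memory trade-off quantization makes is easy to demonstrate. The toy example below applies symmetric int8 quantization to a random FP16 weight matrix using NumPy; production systems use calibrated, per-channel schemes (GPTQ- and AWQ-style methods), so treat this strictly as an illustration:

```python
# Toy symmetric int8 post-training quantization of one weight matrix,
# showing the memory trade-off. Not a production quantization scheme.
import numpy as np

w = np.random.randn(4096, 4096).astype(np.float16)  # FP16 "weights"

scale = float(np.abs(w).max()) / 127.0   # map the weight range onto int8
w_q = np.round(w / scale).astype(np.int8)
w_deq = w_q.astype(np.float16) * scale   # dequantized at inference time

print(f"FP16: {w.nbytes / 1e6:.1f} MB, INT8: {w_q.nbytes / 1e6:.1f} MB")
print(f"max abs rounding error: {float(np.abs(w - w_deq).max()):.4f}")
# Memory halves; the rounding error stays small relative to the weights.
```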

Although still less expensive than the hyperscalers’ AI infrastructure offerings, on-demand pricing per hour of GPU time sits at the high end of neocloud rates. But some neoclouds now also offer so-called serverless pricing, where customers pay per token generated. The latter can decrease costs dramatically, as can spot pricing offered by neoclouds that temporarily have unused GPU capacity (ideal for fault-tolerant workloads that can ride out fluctuating performance).
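A little arithmetic shows why per-token pricing can win. All the prices in this sketch are hypothetical, but the break-even logic holds: a dedicated GPU is only cheap per token if you keep it busy:

```python
# When does serverless (per-token) pricing beat per-GPU-hour pricing?
# Every number below is a hypothetical assumption, for illustration only.
GPU_HOUR_USD = 2.50            # assumed on-demand rate per GPU-hour
TOKENS_PER_SEC = 1_000         # assumed sustained throughput on that GPU
SERVERLESS_PER_M_USD = 0.60    # assumed serverless price per 1M tokens

# Cost per million tokens if the dedicated GPU is fully utilized:
dedicated = GPU_HOUR_USD / (TOKENS_PER_SEC * 3600 / 1e6)  # ~$0.69
print(f"dedicated GPU, 100% utilized: ${dedicated:.2f} per 1M tokens")

# At lower utilization, the dedicated GPU gets expensive fast:
for util in (0.5, 0.1):
    print(f"at {util:.0%} utilization: ${dedicated / util:.2f} per 1M tokens")

print(f"serverless: ${SERVERLESS_PER_M_USD:.2f} per 1M tokens at any load")
```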

Increasingly, neocloud providers also offer predeployed open source LLMs such as Kimi-K2, Llama, Gemma, GPT-OSS, Qwen, and DeepSeek. This accelerates model discovery and experimentation, allowing users to generate API keys in minutes. More advanced neocloud providers tune their inference engines to each model to wring out maximum performance. A single pane of glass for inference performance metrics as well as model provisioning and management is highly desirable.
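Model discovery really can take minutes. Providers that expose an OpenAI-compatible API typically include a model-listing endpoint; this sketch assumes such an API, with a placeholder base URL and key:

```python
# List the predeployed models a provider hosts. Many neoclouds expose
# an OpenAI-compatible /v1/models endpoint; URL and key are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://api.example-neocloud.com/v1",
                api_key="YOUR_API_KEY")

for model in client.models.list():
    print(model.id)  # e.g., hosted Llama, Qwen, or DeepSeek variants
```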

Ultimately, the idea is to provide infrastructure as a service specifically for AI, without all the application-layer stuff the hyperscalers have larded onto their platforms. The extensive automation, self-service configuration, and array of options are all tailor-made for AI.

Solving the cost equation

Today, enterprises still tend to be in the experimental phase when it comes to running their own AI models. That’s why the majority of neocloud customers are AI natives—a mix of specialized AI providers offering everything from code generation tools to video generation to vertical solutions for health care, legal research, finance, and marketing.

Cost is critical for such providers, which is why neoclouds’ ability to offer AI infrastructure for far less than the hyperscalers is so attractive. Pricing models tailored to customer needs provide additional advantages.

But AI natives that need consistent performance at very low cost typically negotiate long-term contracts with neoclouds stretching months or years. These providers’ entire businesses depend on AI and on uninterrupted, high-quality inference. Agreements often include managed inference services as well as reliable, low-latency storage for massive data sets and high-throughput model training.

Reliability and security

As with any cloud, neoclouds must offer enterprise-grade reliability and security. One reason to opt for one of the neocloud leaders is that they’re more likely to have geographically distributed data centers that can provide redundancy when one location goes offline. Power redundancy is also critical, including uninterruptible power supplies and backup generators.

Neocloud security models are less complex than those of the hyperscalers. Because the bulk of what neoclouds offer is AI-specific infrastructure, business customers may be better served to think in terms of bringing neocloud deployments into their own security models. That being said, neoclouds must offer data encryption at rest as well as in transit, with the latter supporting ephemeral elliptic curve Diffie-Hellman (ECDHE) key exchange signed with RSA or ECDSA. Also look for the usual certifications: SOC 2 Type I, SOC 2 Type II, and ISO 27001.
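The in-transit half of that checklist is easy to verify yourself. This sketch, using Python’s standard ssl module against a placeholder hostname, prints the cipher suite your connection actually negotiates:

```python
# Check the TLS version and cipher suite a provider's endpoint negotiates.
# The hostname is a placeholder; point it at your provider's API host.
import socket
import ssl

host = "api.example-neocloud.com"
ctx = ssl.create_default_context()
with socket.create_connection((host, 443), timeout=10) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        name, version, bits = tls.cipher()
        print(f"{version}: {name} ({bits}-bit)")
# TLS 1.3 suites (e.g., TLS_AES_256_GCM_SHA384) always use ephemeral
# (EC)DHE key exchange; on TLS 1.2, look for ECDHE in the suite name.
```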

Neoclouds provide an environment that enables you to deploy distributed workloads, monitor them, and benefit from a highly reliable infrastructure where hardware failures are remediated transparently, without affecting performance. The result is better reliability, better observability, and better error recovery, all of which are essential to delivering a consistent AI customer experience.

The neocloud choice

We live in a multicloud world. When customers choose a hyperscale cloud, they’re typically attracted by certain features or implementations unavailable in the same form elsewhere. The same logic applies to opting for a neocloud: It’s a decision to use the highest-performing, most flexible, most cost-effective platform for running AI workloads.

The neocloud buildout can barely keep up with today’s AI boom, with new agentic workflows revolutionizing business processes and so-called “AI employees” on the verge of coming online. The potential of AI is immense, offering unprecedented opportunities for innovation.

A recent McKinsey report estimated that by 2030, roughly 70% of data center demand will be for data centers equipped to host advanced AI workloads. No doubt a big chunk of that business will continue to go to the hyperscalers. But for customers who need to run high-performance AI workloads cost-effectively at scale, or who have needs that can’t be met by the hyperscalers’ prefab options, neoclouds provide a truly purpose-built solution.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.

Original Link: https://www.infoworld.com/article/4123878/how-neoclouds-meet-the-demands-of-ai-workloads.html
Originally Posted: Thu, 12 Feb 2026 09:00:00 +0000


Artifice Prime

Artifice Prime is an AI enthusiast with over 25 years of experience as a Linux sysadmin. They are interested in artificial intelligence, its use as a tool to further humankind, and its impact on society.
