Cloud computing

Updated Jun 22, 2026

Cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, delivered over the internet without active management by the end user. Instead of owning and operating physical servers, customers rent capacity from a service provider and pay only for what they consume. The market reached 419 billion US dollars in full-year 2025, growing about 30 percent year over year, and generative AI has become the single largest driver of that growth.^[1]^[21]

The canonical technical definition comes from the United States National Institute of Standards and Technology, which in Special Publication 800-145 (Mell and Grance, 2011) describes cloud computing as a model that exhibits five essential characteristics, three service models, and four deployment models.^[2] The five characteristics are on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service. The three service models are Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The four deployment models are private, public, hybrid, and community cloud.

Cloud computing underpins most modern AI infrastructure. Large language models such as Anthropic's Claude and OpenAI's GPT family are trained and served on hyperscale cloud GPU clusters; managed services such as Amazon SageMaker, Amazon Bedrock, Azure OpenAI Service, and Vertex AI abstract away the complexity of provisioning GPUs and TPUs; and GPU-as-a-service "neoclouds" such as CoreWeave, Lambda Labs, and Crusoe have emerged specifically to host AI training and inference. John Dinsdale, chief analyst at Synergy Research Group, summarized the shift in early 2026: "GenAI has simply put the cloud market into overdrive."^[21]

What is cloud computing? (the NIST definition)

The NIST definition is the most widely cited reference text for cloud computing in industry, government, and academia.^[2] It is short enough to quote in full in procurement documents yet broad enough to cover the range of public, private, and hybrid deployments seen today.

Five essential characteristics

On-demand self-service. A consumer can unilaterally provision computing capabilities as needed, automatically and without human interaction with the service provider.
Broad network access. Capabilities are available over the network through standard mechanisms that promote use by heterogeneous thin or thick clients.
Resource pooling. Provider resources are pooled to serve multiple consumers using a multi-tenant model, with physical and virtual resources dynamically assigned according to demand.
Rapid elasticity. Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward.
Measured service. Cloud systems automatically control and optimize resource use through metering appropriate to the service (storage, processing, bandwidth, active user accounts), providing transparency for both provider and consumer.^[2]

If any one of these properties is missing, the system is usually classified as a hosted or managed service rather than true cloud computing.
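
Rapid elasticity can be illustrated with a short sketch. The proportional rule below (desired = ceil(current × observed / target)) is the one documented for the Kubernetes Horizontal Pod Autoscaler; the function name, the replica bounds, and the sample utilization figures are illustrative assumptions and are not part of the NIST text.

```python
import math


def desired_replicas(current: int, observed_pct: int, target_pct: int,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Proportional scaling: move capacity toward a target utilization."""
    if current < 1 or target_pct <= 0:
        raise ValueError("current must be >= 1 and target_pct must be > 0")
    raw = math.ceil(current * observed_pct / target_pct)
    # Clamp to configured bounds so a metric spike cannot scale without limit.
    return max(min_replicas, min(max_replicas, raw))


# Scale outward under load, then inward once demand falls.
print(desired_replicas(4, observed_pct=90, target_pct=60))   # 6
print(desired_replicas(6, observed_pct=20, target_pct=60))   # 2
print(desired_replicas(15, observed_pct=100, target_pct=50))  # 20 (capped)
```

Under measured service, the consumer is billed for the capacity actually provisioned at each step, so scaling inward reduces the bill as well as the load.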

Three service models

IaaS delivers virtualized compute, storage, and networking primitives. The customer controls operating systems, storage, and deployed applications; the provider manages the hardware and hypervisor. Examples: Amazon EC2, Google Compute Engine, Azure Virtual Machines.
PaaS delivers a managed runtime in which the customer deploys application code without managing servers, operating systems, or middleware. Examples: Heroku, Google App Engine, AWS Elastic Beanstalk, Azure App Service.
SaaS delivers complete applications consumed through a browser or thin client; the customer manages only data and configuration. Examples: Salesforce, Microsoft 365, Google Workspace, Slack, Zoom.

Industry practice extends these categories with Function as a Service (FaaS), Container as a Service (CaaS), Database as a Service (DBaaS), and Backend as a Service (BaaS), each of which fits within or between the three NIST categories.

Four deployment models

Public cloud: open use by the general public (AWS, Microsoft Azure, Google Cloud).
Private cloud: exclusive use by a single organization, on-premises or third-party hosted.
Hybrid cloud: two or more distinct cloud infrastructures (private, community, public) bound together by standardized or proprietary technology.
Community cloud: exclusive use by a specific community of consumers from organizations that share concerns (mission, security requirements, policy, compliance).

When did cloud computing start?

The conceptual roots of cloud computing reach back to the time-sharing era of the 1960s, when systems such as MIT's Compatible Time-Sharing System (CTSS, 1961) and the multi-organization MULTICS project demonstrated that a single mainframe could serve many remote users at once.^[3] In a 1961 talk at MIT's centennial, John McCarthy anticipated the commercial form of this idea, predicting that "computation may someday be organized as a public utility," sold by the cycle the way electricity was sold by the kilowatt-hour.^[4] In the following decades, IBM and DEC commercialized time-sharing, and remote application hosting matured into the application service provider model of the late 1990s.

The phrase "cloud computing" appears in internal Compaq business plans dated 1996 and was used publicly in a 1997 INFORMS keynote by University of Texas professor Ramnath Chellappa, who described it as "a new computing paradigm where the boundaries of computing will be determined by economic rationale rather than technical limits alone."^[5] The term entered mainstream business vocabulary in August 2006 when Google chief executive Eric Schmidt used it at a Search Engine Strategies conference to describe the data-services-and-architecture model that Google operated.^[6]

The modern public cloud era is generally dated to 2006. Salesforce.com, founded in 1999 by Marc Benioff and Parker Harris, had already popularized the multi-tenant SaaS delivery model with its slogan "No software," but Salesforce was a single application. The launch of Amazon Web Services reframed the cloud as a set of general-purpose primitives. Amazon S3 launched on March 14, 2006 as the first publicly available AWS service, followed by EC2 in beta on August 25, 2006.^[7]^[8] These two services, object storage and rentable virtual machines billed by the hour, became the template for almost every IaaS provider that followed. Google App Engine launched in preview on April 7, 2008 as a PaaS for Python web applications. Microsoft announced Windows Azure (later renamed Microsoft Azure) at its Professional Developers Conference on October 27, 2008, and the service became commercially available on February 1, 2010.^[9] Google's broader IaaS and developer offerings were unified under the Google Cloud Platform brand from 2011 to 2012.

A second wave of cloud platforms grew out of open-source infrastructure software. Docker was first released in March 2013, popularizing application containers. Kubernetes, originally developed at Google as Borg's successor, was open-sourced in June 2014; when version 1.0 shipped in July 2015, Google donated the project to the newly created Cloud Native Computing Foundation. AWS Lambda was announced on November 13, 2014 at AWS re:Invent, ushering in the serverless era.^[10] Demand accelerated again from 2022 onward as generative AI workloads consumed enormous amounts of GPU capacity. Synergy Research Group reported that the global cloud infrastructure market reached 330 billion US dollars in 2024 and grew further to 419 billion US dollars in 2025, with quarterly revenues hitting 119.1 billion US dollars in Q4 2025, a 30 percent year-over-year increase and the ninth consecutive quarter of accelerating growth.^[1]^[21]

Service models

The boundary between IaaS, PaaS, and SaaS is often blurry, since most providers sell products from all three categories.

Layer	Customer manages	Provider manages	AWS	Azure	Google Cloud
SaaS	Data, configuration	Application, runtime, OS, virtualization, hardware	WorkMail	Microsoft 365	Google Workspace
FaaS / Serverless	Function code, triggers	Runtime, OS, scaling, hardware	AWS Lambda	Azure Functions	Cloud Functions
PaaS	Application code, data	Runtime, OS, scaling, hardware	Elastic Beanstalk	App Service	App Engine
CaaS	Container images, manifests	Orchestrator, OS, hardware	ECS, EKS, Fargate	AKS	GKE
IaaS	OS, runtime, code, data	Virtualization, hardware, network	EC2, S3, VPC	VMs, Blob Storage	Compute Engine, Cloud Storage

A typical AI startup in 2025 might run inference on a managed model API (SaaS), package agent logic on serverless functions (FaaS), keep training data in object storage (IaaS), and use a managed vector database (DBaaS).

Deployment models

Most public-facing internet services run on public cloud. Banks, defense agencies, and large healthcare providers often operate private cloud environments, sometimes on hardware in their own data centers and sometimes on dedicated hardware inside a public cloud region. Hybrid cloud is common for enterprises with regulated data: workloads with low latency or strict residency requirements stay on premises while elastic workloads burst to the public cloud. Community cloud examples include AWS GovCloud (US) for federal, state, and local government agencies, and consortium clouds operated by European banks. Multi-cloud, where an organization deliberately uses two or more public clouds, is now a common procurement strategy used to avoid vendor lock-in and to access provider-specific services such as Google's TPUs or Azure's OpenAI integration.

Who are the major cloud providers?

Three hyperscalers dominate the public IaaS and PaaS market. According to Synergy Research Group, in Q4 2025 AWS held roughly 28 percent of global market share, Microsoft Azure 21 percent, and Google Cloud 14 percent, with Google and Microsoft posting faster growth than the leader.^[21] Alibaba Cloud leads China and ranks fourth globally; Oracle Cloud and IBM Cloud round out the top six.

Provider	Founded / launched	HQ	Primary services
Amazon Web Services (AWS)	2006 (S3, EC2)	Seattle, US	EC2, S3, Lambda, SageMaker, Bedrock
Microsoft Azure	2010	Redmond, US	VMs, Blob Storage, Azure ML, Azure OpenAI
Google Cloud Platform	2008 / 2011 rebrand	Mountain View, US	Compute Engine, BigQuery, Vertex AI, TPUs
Alibaba Cloud	2009	Hangzhou, China	ECS, OSS, MaxCompute, PAI
Oracle Cloud Infrastructure	2016 (Gen 2)	Austin, US	OCI Compute, Autonomous Database, GPU clusters
IBM Cloud	2011 (SmartCloud)	Armonk, US	Cloud VPC, watsonx, Power Systems
Tencent Cloud	2010	Shenzhen, China	CVM, COS, TI Platform
Salesforce	1999	San Francisco, US	Sales Cloud, Service Cloud, Einstein
CoreWeave	2017	Roseland, US	GPU IaaS for AI training and inference
Lambda Labs	2012	San Francisco, US	GPU cloud, H100 / H200 instances
Crusoe Energy	2018	Denver, US	Stranded-gas-powered AI data centers and GPU cloud

A newer category of providers, sometimes called "neoclouds," specializes in renting GPU capacity for AI workloads. CoreWeave, Lambda Labs, Crusoe, Together AI, and Nebius all operate clusters of NVIDIA accelerators sold by the hour or as reserved capacity. CoreWeave, the largest of these, went public in March 2025 at 40 US dollars per share in the biggest US tech IPO since 2021, reported 5.13 billion US dollars of revenue for full-year 2025, and ended 2025 with a revenue backlog of 66.8 billion US dollars in remaining performance obligations.^[22] By Q4 2025 Synergy Research Group reported that Oracle and these specialized AI clouds had continued to inch market share away from the big three, though their individual shares remained in low single digits.^[21]

AI and ML cloud services

Cloud platforms now ship comprehensive stacks for training and serving AI models, spanning four layers: raw GPU and accelerator infrastructure, managed training and tuning, model hosting and inference APIs, and higher-level agents and applications.

Amazon Web Services

AWS launched Amazon SageMaker at re:Invent 2017 as a unified platform for building, training, and deploying machine learning models, with managed Jupyter notebooks, distributed training, hyperparameter tuning, model registry, and autoscaling inference endpoints. For foundation-model workloads, AWS introduced Amazon Bedrock, announced April 13, 2023 and generally available on September 28, 2023.^[11] Bedrock provides a single API for invoking models from Anthropic (Claude), AI21 Labs, Cohere, Meta (Llama), Stability AI, Mistral, and Amazon's own Titan and Nova families, plus fine-tuning, knowledge bases for retrieval-augmented generation, agents, and guardrails. At the hardware tier, AWS offers EC2 P5 instances with eight NVIDIA H100 GPUs and newer P5e instances with NVIDIA H200 GPUs, alongside custom Trainium training chips and Inferentia inference chips made by Amazon's Annapurna Labs subsidiary. Anthropic is a major Bedrock partner: Amazon invested up to 4 billion US dollars in Anthropic in September 2023 and an additional 4 billion in November 2024, with AWS named as Anthropic's primary cloud and training partner. In November 2025 OpenAI also signed a 38 billion US dollar, seven-year agreement to run workloads on AWS, giving it access to hundreds of thousands of NVIDIA GB200 and GB300 GPUs on EC2.^[23]

Microsoft Azure

Microsoft's AI portfolio centers on the Azure OpenAI Service, which became generally available on January 16, 2023, giving enterprise customers access to OpenAI's GPT, DALL-E, and embedding models with Azure compliance and virtual-network isolation.^[12] Microsoft has invested an estimated 13 billion US dollars in OpenAI since 2019 and runs much of OpenAI's training and inference on dedicated Azure capacity. Azure Machine Learning offers a similar workflow to SageMaker, and Azure ND-series VMs host NVIDIA H100 and H200 GPUs in 8-GPU configurations connected by NVLink and InfiniBand. In late 2024 Microsoft consolidated its agent and model-customization tools under the Azure AI Foundry brand.

Google Cloud

Google's flagship managed AI service is Vertex AI, launched at Google I/O in May 2021. Vertex AI integrates AutoML, custom training, model registry, and a Model Garden with first-party models such as Gemini and Imagen, open models such as Llama and Mistral, and proprietary partner models such as Anthropic's Claude. Google Cloud is the only hyperscaler that offers Tensor Processing Units, Google's custom AI accelerators. Public TPU generations include TPU v5e (efficiency-optimized), TPU v5p (announced December 2023), and TPU v6 "Trillium" (announced May 2024). Google bundles TPUs, GPUs, fast networking, storage, and a unified scheduler under the AI Hypercomputer brand. Anthropic also runs Claude on Vertex AI, and Google made multiple multibillion-dollar investments in Anthropic between 2023 and 2024.

Specialized AI clouds

NVIDIA's DGX Cloud, announced in March 2023, is a multi-cloud AI training service that offers reserved DGX nodes hosted inside Oracle, Microsoft, Google, and AWS data centers under a single NVIDIA-managed software stack. Neoclouds occupy the middle ground between hyperscalers and on-premises hardware: CoreWeave, founded in 2017 as a cryptocurrency mining business, pivoted to GPU rental and by 2024 operated more than 250,000 NVIDIA GPUs across about 30 data centers, becoming a key training partner for Microsoft and OpenAI workloads. Lambda Labs sells on-demand and reserved H100 / H200 capacity to AI startups and researchers; Crusoe Energy operates AI-optimized data centers powered partly by stranded natural gas; Together AI and Nebius mix model APIs with bare-GPU rental. OpenAI's compute strategy expanded far beyond Azure during 2025: the company disclosed roughly 1.15 trillion US dollars of multi-year compute and infrastructure commitments, including an estimated 300 billion US dollar deal with Oracle Cloud as part of the Stargate data-center program, a CoreWeave commitment that grew to about 22 billion US dollars, and an arrangement under which NVIDIA agreed to invest up to 100 billion US dollars tied to OpenAI building at least 10 gigawatts of new data-center capacity.^[24]

Architecture

Virtualization and multi-tenancy

Virtualization lets multiple virtual machines share a single physical server. Public clouds use a mix of open-source and proprietary hypervisors: Xen powered the original EC2 fleet from 2006, and AWS later migrated most instance families to its Nitro System, which pairs a lightweight KVM-based hypervisor with dedicated hardware cards that offload networking, storage, and security. Azure runs primarily on Hyper-V; Google Cloud uses a modified KVM. Container runtimes such as containerd and gVisor add lighter-weight isolation above VMs. Multi-tenancy means a single instance of infrastructure or software serves many customers with logical isolation: SaaS applications often share a single database with tenant identifiers on each row, while IaaS providers separate tenants at the hypervisor or dedicated-host level.
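
The shared-database form of multi-tenancy described above can be sketched with Python's built-in sqlite3 module. The table, column names, and tenant names here are hypothetical; the point is only that every row carries a tenant identifier and every query filters on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "tenant_id TEXT NOT NULL, invoice_id INTEGER, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [("acme", 1, 12000), ("acme", 2, 4500), ("globex", 1, 99900)],
)


def invoices_for(tenant_id: str) -> list[tuple[int, int]]:
    """Return (invoice_id, amount_cents) rows visible to one tenant only."""
    rows = conn.execute(
        "SELECT invoice_id, amount_cents FROM invoices "
        "WHERE tenant_id = ? ORDER BY invoice_id",
        (tenant_id,),  # bound parameter: the tenant filter is never string-built
    )
    return rows.fetchall()


print(invoices_for("acme"))    # [(1, 12000), (2, 4500)]
print(invoices_for("globex"))  # [(1, 99900)]
```

Isolation in this design is purely logical: a query that omits the tenant filter would see every customer's rows, which is why production systems typically enforce the filter in a shared data-access layer or with database row-level security rather than in each query.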

Regions, storage, and networking

Most public clouds organize physical capacity into regions, each composed of multiple availability zones. AWS popularized this terminology: a region is a geographic area such as us-east-1 (N. Virginia) or eu-west-1 (Ireland), and an availability zone is one or more discrete data centers with independent power, cooling, and networking. As of 2025, AWS operates more than 33 regions and over 100 availability zones; Azure and Google Cloud each operate more than 30 regions. Cloud storage splits into three types: object storage (S3, Azure Blob, Google Cloud Storage) for opaque binary objects accessed via HTTP, block storage (EBS, Azure Disks, Persistent Disk) for raw volumes that look like local disks, and file storage (EFS, Azure Files, Filestore) for shared file systems via NFS or SMB. Customer networks are carved out as Virtual Private Clouds (VPCs) with customer-controlled IP ranges, route tables, and firewall rules, connected to corporate networks via dedicated links such as AWS Direct Connect or Azure ExpressRoute. CDNs such as AWS CloudFront, Google Cloud CDN, Azure Front Door, Cloudflare, and Fastly cache content at hundreds of edge locations.

Identity and access management

Identity, authorization, and audit are handled by AWS IAM, Azure Active Directory (renamed Microsoft Entra ID in 2023), and Google Cloud IAM. These services support fine-grained policy languages, role assumption, federation via SAML or OIDC, and short-lived credentials. Misconfigured IAM policies remain among the most common causes of cloud data leaks.
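
A toy evaluator gives a feel for how such policy languages behave. The sketch below assumes a simplified statement shape loosely modeled on the Effect/Action/Resource fields of AWS IAM policies and applies three rules: deny by default, a matching Deny always wins, and otherwise a matching Allow grants access. The policy contents and function name are hypothetical, and real evaluation engines add conditions, resource policies, permission boundaries, and organization-level controls that are omitted here.

```python
from fnmatch import fnmatchcase

# Hypothetical policy; bucket names and actions are examples only.
POLICY = [
    {"effect": "Allow",
     "actions": ["s3:GetObject", "s3:ListBucket"],
     "resources": ["arn:aws:s3:::reports*"]},
    {"effect": "Deny",
     "actions": ["s3:*"],
     "resources": ["arn:aws:s3:::reports/secret/*"]},
]


def is_allowed(policy: list[dict], action: str, resource: str) -> bool:
    """Default deny; any matching Deny wins; else a matching Allow grants."""
    allowed = False
    for stmt in policy:
        matches = (any(fnmatchcase(action, a) for a in stmt["actions"])
                   and any(fnmatchcase(resource, r) for r in stmt["resources"]))
        if not matches:
            continue
        if stmt["effect"] == "Deny":
            return False  # explicit deny overrides any allow
        allowed = True
    return allowed


print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::reports/2025/q4.csv"))    # True
print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::reports/secret/keys.txt"))  # False
print(is_allowed(POLICY, "s3:PutObject", "arn:aws:s3:::reports/2025/q4.csv"))    # False
```

The first policy statement shows how a leak can arise: the wildcard `reports*` matches every bucket or key whose name begins with "reports", which may be broader than the author intended.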

How much does cloud computing cost?

Cloud pricing is one of the most-debated topics in enterprise IT. Most providers offer a menu of pricing modes for the same underlying resources:

On-demand: pay by the hour or second with no long-term commitment. Most expensive per unit, most flexible.
Reserved or committed-use: discounts of roughly 30 to 70 percent in exchange for one- or three-year commitments (AWS Reserved Instances and Savings Plans, Google Committed Use Discounts, Azure Reserved VM Instances).
Spot or preemptible: deeply discounted access (often 60 to 90 percent off) to spare capacity the provider can reclaim with short notice; used heavily for batch processing, rendering, and fault-tolerant ML training.
Free tier: modest monthly allotments designed to attract new customers.
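
The trade-off between these modes comes down to simple arithmetic. The sketch below uses a hypothetical on-demand rate of 4 US dollars per hour and assumed discounts of 50 percent (committed) and 75 percent (spot), chosen from within the ranges quoted above; none of these figures is a real list price.

```python
HOURS_PER_MONTH = 730  # common billing approximation (8,760 hours / 12)
ON_DEMAND_RATE = 4.00  # hypothetical USD per hour, not a real list price


def monthly_cost(hourly_rate: float, hours: float, discount: float = 0.0) -> float:
    """Cost of one instance for `hours`, at a fractional discount off on-demand."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return round(hourly_rate * hours * (1.0 - discount), 2)


def breakeven_hours(discount: float, hours_in_period: float = HOURS_PER_MONTH) -> float:
    """Hours of real use above which an always-billed commitment beats on-demand."""
    return hours_in_period * (1.0 - discount)


on_demand = monthly_cost(ON_DEMAND_RATE, HOURS_PER_MONTH)
committed = monthly_cost(ON_DEMAND_RATE, HOURS_PER_MONTH, discount=0.50)
spot = monthly_cost(ON_DEMAND_RATE, HOURS_PER_MONTH, discount=0.75)

print(on_demand, committed, spot)  # 2920.0 1460.0 730.0
print(breakeven_hours(0.50))       # 365.0
```

Because a commitment is billed whether or not the instance runs, it pays off only when the workload actually runs for more than the break-even share of the period; in this example, more than half the month.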

Cloud bills also include less-visible cost categories. Egress fees, charged for data leaving the provider's network, have been criticized as a hidden lock-in mechanism: at list prices of several cents per gigabyte, pulling a single petabyte out of one cloud can cost tens of thousands of dollars. The European Union's Data Act, which entered into force on January 11, 2024, requires cloud providers to make customer switching easier and to phase out egress fees; Google Cloud, AWS, and Azure each announced free egress for customers leaving the platform during 2024 in response.^[13] The FinOps movement, formalized by the founding of the FinOps Foundation in 2019 and its merger into the Linux Foundation in 2020, defines a discipline for managing cloud cost and value across engineering, finance, and product teams. By 2024 over 90 percent of large enterprises with significant cloud spend reported having a dedicated FinOps function, according to the FinOps Foundation's annual State of FinOps survey.^[14]
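
The egress arithmetic is easy to reproduce. The sketch below assumes a flat, hypothetical rate of 5 US cents per gigabyte and decimal units (1 TB = 1,000 GB); real price sheets are tiered by volume and destination, so the result is an order-of-magnitude estimate only.

```python
def egress_cost_usd(terabytes: int, cents_per_gb: int) -> float:
    """Flat-rate egress estimate in US dollars, using decimal units."""
    gigabytes = terabytes * 1000
    return gigabytes * cents_per_gb / 100


ASSUMED_CENTS_PER_GB = 5  # hypothetical flat rate, not a quoted price

print(egress_cost_usd(1000, ASSUMED_CENTS_PER_GB))  # 1 PB -> 50000.0
print(egress_cost_usd(3000, ASSUMED_CENTS_PER_GB))  # 3 PB -> 150000.0
```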

The scale of AI-driven cloud investment has become macroeconomically significant. Microsoft, Alphabet, Amazon, and Meta together announced more than 200 billion US dollars of planned 2024 capital spending on data centers and AI infrastructure, and their combined 2026 capital-expenditure guidance rose to roughly 725 billion US dollars, up about 77 percent from 2025: Amazon near 200 billion, Microsoft near 190 billion, Alphabet 175 to 185 billion, and Meta 115 to 135 billion.^[15]^[25] Industry analysts increasingly note that electrical power, not chips, is now the binding constraint. As Microsoft chief executive Satya Nadella put it on the BG2 podcast in November 2025, "the biggest issue we are now having is not a compute glut, but it's power," adding that he had GPUs he could not deploy because "I don't have warm shells to plug into."^[26]

Security and compliance

Cloud security is governed by the shared responsibility model, popularized by AWS but adopted in some form by every major provider. The provider is responsible for the security "of" the cloud (physical data centers, hardware, hypervisor, network fabric, managed-service control planes), while the customer is responsible for security "in" the cloud (operating systems, application code, data classification, IAM and network configuration). Most major cloud breaches have stemmed from customer-side misconfiguration rather than provider compromise.

Providers pursue extensive compliance attestations: common frameworks include SOC 2 Type II, ISO/IEC 27001, ISO/IEC 27017, ISO/IEC 27018, PCI DSS, HIPAA, FedRAMP (Moderate and High), the UK's Cyber Essentials Plus, and Australia's IRAP. The European Union's GDPR imposes additional data-residency and data-subject-rights obligations. Encryption at rest is on by default for most cloud storage; encryption in transit is generally TLS 1.2 or 1.3. Customers often layer customer-managed keys on top using AWS KMS, Azure Key Vault, or Google Cloud KMS, sometimes with hardware security module backing.

A newer category of "sovereign cloud" offerings addresses concerns by European, Middle Eastern, and Asian governments that data stored in US-owned clouds might fall under US law, in particular the CLOUD Act of 2018, which authorizes US authorities to compel disclosure of data held by US companies even when stored abroad. Microsoft Cloud for Sovereignty, Google Sovereign Cloud, AWS European Sovereign Cloud (announced October 2023, scheduled to launch in 2025), and partnerships such as Bleu, a joint venture between Capgemini and Orange that runs Microsoft cloud technology for the French market, are direct responses. Major cloud-related security events include the 2019 Capital One breach (a server-side request forgery attack on AWS-hosted systems by a former AWS engineer, carried out in March and disclosed in July), the December 2020 SolarWinds compromise affecting Microsoft cloud customers via the Orion software update, and the July 19, 2024 CrowdStrike Falcon sensor incident, in which a faulty content update sent millions of Windows hosts into boot loops, disrupting airlines, banks, and hospitals worldwide.

Trends

Hybrid and multi-cloud. Tools such as AWS Outposts, Azure Stack, Azure Arc, and Google Anthos bring cloud control planes onto customer premises. Service meshes (Istio, Linkerd) and platform-engineering tools (Backstage, Crossplane) help organizations operate workloads across multiple clouds and on-premises environments without rewriting them for each platform.

Edge computing. Edge computing extends cloud capabilities closer to data sources and users to reduce latency. Cloudflare Workers (2017), AWS Wavelength (2020), AWS Local Zones, Azure Edge Zones, and Google Distributed Cloud Edge target real-time gaming, vehicle workloads, low-latency 5G applications, and on-premises industrial AI inference.

Sustainability. Hyperscale data centers consume large amounts of electricity, and providers have responded with climate commitments. Microsoft pledged in January 2020 to be carbon-negative by 2030 and remove its historical emissions by 2050. Google announced in 2020 a goal of running on 24/7 carbon-free energy in all data centers by 2030. Amazon's Climate Pledge, signed in 2019, aims for net-zero by 2040, supported by power purchase agreements that made Amazon the world's largest corporate buyer of renewable electricity from 2020 through 2024 according to BloombergNEF. Critics note that GenAI workloads are pushing electricity demand and water usage upward at rates that strain these commitments.

AI-driven demand surge. Synergy Research Group estimated that GenAI was the primary driver of cloud market growth through 2025, helping push quarterly revenues to 119.1 billion US dollars in Q4 2025.^[21] This has driven hyperscaler capex to record highs, contributed to long lead times for NVIDIA H100 and H200 GPUs, and made power and grid interconnection capacity the primary bottleneck to data-center expansion in many regions. Infrastructure-as-code tools such as HashiCorp Terraform (2014), Pulumi (2018), and AWS CloudFormation are standard practice, and cloud configurations are increasingly generated and reviewed by AI assistants.

Criticisms

Vendor lock-in and egress fees. Migrating from one cloud to another involves rewriting infrastructure-as-code templates, IAM policies, and managed-service integrations; the deeper a customer adopts proprietary services such as DynamoDB, BigQuery, or Cosmos DB, the higher the switching cost. Charging customers to extract their own data has long been criticized as anti-competitive, though the 2024 EU Data Act has begun to constrain this practice.

Outages and concentration risk. Because a few hyperscalers host a large fraction of internet services, single-region failures cause cascading effects. Notable examples include the AWS US-East-1 outages of February 28, 2017 (an S3 typo took down portions of the public web) and December 7, 2021, Azure outages in September 2018 and July 2024, and a June 2019 Google Cloud networking outage. UK and EU regulators opened formal market studies of hyperscaler concentration in 2023 and 2024.

Privacy and government access. The CLOUD Act and analogous laws have created concerns about cross-border data access. The Court of Justice of the European Union's 2020 Schrems II ruling invalidated the EU-US Privacy Shield, leading to years of legal uncertainty for transatlantic data transfers and contributing to the rise of sovereign cloud offerings.

Over-provisioning, waste, and workforce shifts. Industry surveys including Flexera's annual State of the Cloud have estimated that 25 to 35 percent of cloud spend is wasted on idle or oversized resources, one motivator for the FinOps movement. Critics also argue that the AI capex surge undermines hyperscalers' climate commitments, that water consumption for evaporative cooling at large data centers can be locally significant in arid regions, and that traditional system-administration roles have shrunk in favor of platform-engineering, site-reliability-engineering, and FinOps roles. Some enterprises characterize "lift and shift" cloud migrations as exercises that move costs from capital to operating expense without delivering proportionate productivity gains.

References

1. Synergy Research Group, "Cloud Market Jumped to $330 billion in 2024 - GenAI is Now Driving Half of the Growth," February 2025. https://www.srgresearch.com/articles/cloud-market-jumped-to-330-billion-in-2024-genai-is-now-driving-half-of-the-growth
2. Peter Mell and Timothy Grance, "The NIST Definition of Cloud Computing," NIST Special Publication 800-145, September 2011. https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf
3. Fernando J. Corbato et al., "An Experimental Time-Sharing System," Proceedings of the 1962 Spring Joint Computer Conference, MIT, 1962.
4. Simson L. Garfinkel, "Architects of the Information Society," MIT Press, 1999. (Records McCarthy's 1961 utility-computing prediction.)
5. Antonio Regalado, "Who Coined 'Cloud Computing'?," MIT Technology Review, October 31, 2011. https://www.technologyreview.com/2011/10/31/257406/who-coined-cloud-computing/
6. Eric Schmidt, conversation with Danny Sullivan, Search Engine Strategies Conference, San Jose, August 9, 2006. https://www.google.com/press/podium/ses2006.html
7. Amazon Web Services, "Amazon S3 Launches," press release, March 14, 2006. https://press.aboutamazon.com/2006/3/amazon-web-services-launches
8. Amazon Web Services, "Announcing Amazon EC2 beta," August 25, 2006. https://aws.amazon.com/about-aws/whats-new/2006/08/24/announcing-amazon-elastic-compute-cloud-amazon-ec2---beta/
9. Microsoft, "Windows Azure General Availability," press release, February 1, 2010. https://news.microsoft.com/2010/02/01/windows-azure-general-availability/
10. Amazon Web Services, "Amazon Web Services Announces AWS Lambda," re:Invent press release, November 13, 2014. https://press.aboutamazon.com/2014/11/amazon-web-services-announces-aws-lambda
11. Swami Sivasubramanian, "Amazon Bedrock Is Now Generally Available," AWS News Blog, September 28, 2023. https://aws.amazon.com/blogs/aws/amazon-bedrock-is-now-generally-available-build-and-scale-generative-ai-applications-with-foundation-models/
12. Eric Boyd, "General availability of Azure OpenAI Service," Microsoft Azure Blog, January 16, 2023. https://azure.microsoft.com/en-us/blog/general-availability-of-azure-openai-service-expands-access-to-large-advanced-ai-models-with-added-enterprise-benefits/
13. European Commission, "Data Act enters into force," press release, January 11, 2024. https://digital-strategy.ec.europa.eu/en/policies/data-act
14. FinOps Foundation, "State of FinOps 2024." https://data.finops.org/
15. Bloomberg, "Big Tech's $200 Billion AI Capex Bet Is Reshaping the Economy," 2024. https://www.bloomberg.com/news/articles/2024-08-05
16. Nicholas Carr, "The Big Switch: Rewiring the World, from Edison to Google," W. W. Norton, 2008.
17. Michael Armbrust et al., "A View of Cloud Computing," Communications of the ACM, vol. 53, no. 4, April 2010. https://dl.acm.org/doi/10.1145/1721654.1721672
18. Kubernetes project, "Kubernetes 1.0 release announcement," CNCF, July 21, 2015. https://kubernetes.io/blog/2015/07/kubernetes-10-launch-party-at-oscon/
19. Synergy Research Group, "Cloud Market Share Trends - Big Three Together Hold 63%," 2024. https://www.srgresearch.com/articles/cloud-market-share-trends-big-three-together-hold-63-while-oracle-and-the-neoclouds-inch-higher
20. NVIDIA, "NVIDIA DGX Cloud Launches with Microsoft Azure, Oracle Cloud and Google Cloud," GTC press release, March 21, 2023. https://nvidianews.nvidia.com/news/nvidia-dgx-cloud
21. Synergy Research Group, "GenAI Helps Drive Quarterly Cloud Revenues to $119 Billion as Growth Rate Jumped Yet Again in Q4," February 2026. https://www.srgresearch.com/articles/genai-helps-drive-quarterly-cloud-revenues-to-119-billion-as-growth-rate-jumped-yet-again-in-q4
22. CoreWeave, Inc., "Fourth Quarter and Full Year 2025 Results," Form 8-K, SEC, 2026. https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001769628&type=8-K
23. Amazon Web Services, "AWS and OpenAI announce multi-year strategic partnership," press release, November 3, 2025. https://press.aboutamazon.com/2025/11/aws-and-openai-announce-multi-year-strategic-partnership
24. CNBC, "A guide to the $1 trillion-worth of AI deals between OpenAI, Nvidia and others," October 15, 2025. https://www.cnbc.com/2025/10/15/a-guide-to-1-trillion-worth-of-ai-deals-between-openai-nvidia.html
25. CNBC, "Tech AI spending approaches $700 billion in 2026, cash taking big hit," February 6, 2026. https://www.cnbc.com/2026/02/06/google-microsoft-meta-amazon-ai-cash.html
26. TechCrunch, "Altman and Nadella need more power for AI, but they're not sure how much," November 3, 2025. https://techcrunch.com/2025/11/03/altman-and-nadella-need-more-power-for-ai-but-theyre-not-sure-how-much/
