| Virtual Reality | |
|---|---|
| Type | Computer-generated immersive environment |
| Common acronym | VR |
| Related fields | Augmented reality, Mixed reality, Extended reality, Computer vision |
| Display medium | Head-mounted display (HMD), CAVE projection |
| First HMD | The Sword of Damocles, Ivan Sutherland, 1968 |
| Term coined | Jaron Lanier, mid-1980s, VPL Research |
| Modern consumer launch | Oculus Rift Kickstarter, August 2012 |
| Major vendors (2026) | Meta, Apple, Sony, Valve, ByteDance/Pico, HTC Vive, Bigscreen |
| Market leader | Meta Reality Labs, ~75% global headset shipment share in 2025 |
Virtual reality (VR) is a computer-generated three-dimensional environment that a user can perceive, navigate and interact with in a way that simulates physical presence. In its modern consumer form, VR is delivered through a stereoscopic head-mounted display (HMD) with positional tracking, spatial audio, and hand controllers or hand tracking, producing the subjective experience of being inside a digital scene rather than viewing one through a screen. VR is related to, but distinct from, augmented reality (AR), mixed reality (MR), and the umbrella term extended reality (XR).[1]
VR has become one of the most active application surfaces for artificial intelligence. AI now drives the rendering pipeline (through neural upscaling, foveated variable rate shading, and Gaussian splatting for novel-view synthesis), the avatar layer (Meta's Codec Avatars and Apple's Personas use generative models to reconstruct faces in real time), and the content layer (large language models such as Meta's Llama drive NPCs and auto-generated worlds in Horizon and VRChat). VR is also a primary training environment for embodied AI agents in robotics, autonomous driving, and drones, where simulated worlds enable safe, massively parallel reinforcement learning before physical deployment.[2][3]
This article focuses on the intersection of VR and AI. Topics include modern hardware (Meta Quest 3 and 3S, Apple Vision Pro with M5, Sony PSVR2, Valve Steam Frame, ByteDance Pico, Bigscreen Beyond 2), AI techniques in VR rendering, the use of VR as a sim-to-real training pipeline, social VR with generative NPCs, and generative VR worlds produced by foundation models such as Google DeepMind's Genie 3 and Meta's WorldGen.
VR, AR, MR, and XR sit along the reality-virtuality continuum first formalised by Paul Milgram and Fumio Kishino in 1994.
| Term | Acronym | What the user sees | Example device |
|---|---|---|---|
| Virtual reality | VR | Fully synthetic environment, real world hidden | Meta Quest 3, Sony PSVR2, Valve Steam Frame |
| Augmented reality | AR | Real world with simple digital overlays | Smartphone AR filters, Meta Ray-Ban smart glasses |
| Mixed reality | MR | Real world with digital objects that interact with surfaces and lighting | HoloLens 2, Quest 3 passthrough, Magic Leap 2 |
| Extended reality | XR | Umbrella category covering all of the above | Industry term used by Qualcomm, IEEE VR, IDC reports |
The line between VR and MR has blurred since 2023. The Meta Quest 3 and Apple Vision Pro are immersive HMDs (so they are VR devices) but include colour passthrough cameras that show the surrounding room with virtual objects composited in. Apple prefers "spatial computing" for this style of experience. The remainder of this article uses VR in the broader, modern sense that includes passthrough mixed reality.[4][5]
The Sword of Damocles, built in 1968 by Ivan Sutherland and Bob Sproull at Harvard, was the first true head-mounted display. It used two small cathode-ray tubes inside a metal frame so heavy it had to be suspended from the ceiling on a mechanical arm (the source of its nickname). Its graphics were simple line drawings, but it implemented the core idea that still defines VR: head tracking that updates the view in real time.[6]
In 1985 Jaron Lanier founded VPL Research, the first company to commercialise VR hardware. VPL produced the DataGlove, the EyePhone HMD, and a body suit, and Lanier popularised the phrase "virtual reality" itself.[7]
The modern consumer era began on August 1, 2012, when Palmer Luckey launched a Kickstarter for the Oculus Rift development kit. The campaign ended September 1, 2012 with $2.4 million from 9,522 backers. On March 25, 2014, Facebook (now Meta) acquired Oculus VR for approximately $2 billion. Meta subsequently shipped the consumer Rift in 2016, the standalone Quest in 2019, the Quest 2 in 2020 (the best-selling VR headset of all time, with more than 20 million units sold), the Quest Pro in 2022, and the Quest 3 and 3S in 2023 and 2024.[8][9]
| Headset | Vendor | Launched | Display per eye | SoC / chip | Tracking | Approx. price (USD) |
|---|---|---|---|---|---|---|
| Meta Quest 2 | Meta | October 2020 | 1832x1920 LCD | Snapdragon XR2 Gen 1 | Inside-out, hand tracking | $199-$299 |
| Meta Quest Pro | Meta | October 2022 | 1800x1920 mini-LED | Snapdragon XR2+ Gen 1 | Eye and face tracking | $999 |
| Meta Quest 3 | Meta | October 2023 | 2064x2208 LCD | Snapdragon XR2 Gen 2 | Color passthrough, hand tracking | $499 |
| Meta Quest 3S | Meta | October 2024 | 1832x1920 LCD | Snapdragon XR2 Gen 2 | Color passthrough, hand tracking | $299 |
| Sony PlayStation VR2 | Sony | February 2023 | 2000x2040 OLED | PS5 host | Eye tracking, foveated rendering | $349-$549 |
| Apple Vision Pro | Apple | February 2024 | 3660x3200 micro-OLED | Apple M2 + R1 | Eye and hand tracking | $3,499 |
| Apple Vision Pro (M5) | Apple | October 2025 | 3660x3200 micro-OLED | Apple M5 + R1 | Eye and hand tracking | $3,499 |
| Pico 4 Ultra | ByteDance Pico | September 2024 | 2160x2160 LCD | Snapdragon XR2 Gen 2 | Inside-out, hand tracking | ~$549 |
| Bigscreen Beyond 2 | Bigscreen | April 2025 | 2560x2560 OLED | PC tethered | Lighthouse, optional eye tracking | $1,019-$1,219 |
| HTC Vive XR Elite | HTC | February 2023 | 1920x1920 LCD | Snapdragon XR2 Gen 1 | Color passthrough | $1,099 |
| Valve Steam Frame | Valve | Announced Nov 2025, ships 2026 | 2160x2160 LCD | Snapdragon 8 Gen 3 | Eye tracking, foveated streaming | TBA |
The Meta Quest 3, released October 10, 2023, was the first mass-market headset with high-quality colour passthrough mixed reality and the Snapdragon XR2 Gen 2, using 2064x2208 LCD displays behind pancake optics and a 110-degree horizontal field of view. The Meta Quest 3S, launched October 15, 2024 at $299, brought the same chip to a lower price by reusing Quest 2 displays and Fresnel lenses, replacing the Quest 2 as Meta's entry point.[10][11]
Meta Horizon OS has shifted aggressively toward AI. The v71 update (December 2024) added Audio to Expression, an on-device model that converts microphone audio into avatar facial expressions. The v81 release (October 2025) added Meta AI logic improvements and "window anchoring." Meta AI on Quest, rolled out in summer 2024, is multimodal: it answers questions about what the user sees through passthrough cameras, identifies objects, translates signs, and runs conversational chat in-headset.[12][13]
Apple Vision Pro, released February 2, 2024 at $3,499, is among the most technically ambitious consumer HMDs yet shipped. It uses two micro-OLED panels at 3660x3200 each (combined 23 megapixels), an Apple M2 processor and a custom R1 sensor-fusion chip, twelve cameras, five sensors and six microphones. Eye tracking is the primary input. Apple avoided the term VR entirely, calling the category "spatial computing."[14]
In October 2025, Apple released a refreshed Vision Pro built around the M5 chip. The M5 model keeps the $3,499 price but delivers up to 50 percent faster on-device AI for system features such as Persona generation and spatial photo upscaling, up to 2x faster third-party AI workloads, and a refresh rate that scales up to 120 Hz. Mark Gurman has reported that a redesigned Apple Vision Pro 2 is targeted for late 2026 with a new lens and cooling design.[15][16]
visionOS 26, released in 2025, reworked Apple's Personas using a generative AI algorithm built on Gaussian splatting to produce more natural facial motion. It also added support for 180- and 360-degree content from Insta360, GoPro and Canon, support for the PSVR2 Sense controller, and an enterprise API surface. The Apple Intelligence stack is integrated throughout.[17]
Sony's PlayStation VR2 launched February 22, 2023 at $549 and is one of the few recent consumer headsets to use OLED displays (2000x2040 per eye), which enables HDR output. It includes built-in eye tracking and supports foveated rendering through the PS5 GPU. Sony released a PC adapter for $59.99 on August 7, 2024, but PC mode loses HDR, eye tracking, adaptive triggers and most haptics. Sony has discounted the headset to $349 multiple times.[18][19]
Valve announced the Steam Frame on November 12, 2025, with launch slated for the first half of 2026. It uses a Qualcomm Snapdragon 8 Gen 3 with 16 GB LPDDR5X and LCD displays at 2160x2160 per eye, supports refresh rates from 72 to 144 Hz, and weighs 185 grams (base unit) or 440 grams with strap and battery. It runs an Arch Linux variant of SteamOS, using Proton for Windows games and FEX-Emu for x86 emulation. The headset includes eye tracking for foveated rendering and a foveated streaming mode that varies wireless bitrate at the encoder level so the centre of gaze receives more bandwidth. Wi-Fi 7 with multiple radios separates internet traffic from VR streaming.[20]
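A minimal sketch of the foveated-streaming idea, assuming a tiled encoder whose per-tile share of the wireless budget is weighted by distance from the gaze point (the weighting function is hypothetical, not Valve's actual encoder logic):

```python
import math

# Illustrative gaze-driven bitrate allocation: split the frame into
# tiles and weight each tile's share of the total budget by how close
# it sits to the gaze point. Tile coordinates are normalised (0..1).

def allocate_bitrate(total_kbps: float, tiles: list[tuple[float, float]],
                     gaze: tuple[float, float]) -> list[float]:
    weights = [1.0 / (1.0 + 8.0 * math.dist(t, gaze) ** 2) for t in tiles]
    scale = total_kbps / sum(weights)
    return [w * scale for w in weights]

# A 3x3 tile grid with gaze at the frame centre: the middle tile gets
# the largest slice of the budget, the corners the smallest.
grid = [(x / 2, y / 2) for y in range(3) for x in range(3)]
for tile, kbps in zip(grid, allocate_bitrate(100_000, grid, (0.5, 0.5))):
    print(tile, round(kbps))
```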
Pico, a subsidiary of ByteDance since 2021, is one of the largest VR vendors by global shipments and the clear market leader in China. The Pico 4 Ultra, released September 2024, uses the Snapdragon XR2 Gen 2 with 12 GB of RAM. Pico held about 4.1 percent of global shipments in 2024 (around 680,000 units against Meta's 5.3 million) but dominated China with 46 percent of consumer VR sales in H1 2025. ByteDance plans a new Pico generation built on its own custom XR chip in 2026.[21][22]
The Bigscreen Beyond 2, released April 2025 at $1,019, is the smallest tethered PC VR headset at 107 grams, with 2560x2560 OLED displays, pancake lenses, a 116-degree diagonal field of view, and Lighthouse tracking. The Beyond 2e variant adds AI-powered on-device eye tracking and dynamic foveated rendering. HTC's main offering is the Vive XR Elite, launched February 2023.[23][24]
Global VR/MR shipments in 2024 were approximately Meta 74.6 percent, Apple 5.2 percent, Sony 4.3 percent, ByteDance/Pico 4.1 percent and XREAL 3.3 percent (Counterpoint Research and IDC). The market shrank around 12 percent in 2024 and continued to decline in 2025 as legacy categories were displaced by AI smart glasses such as Meta's Ray-Ban line. IDC reported in late 2025 that the broader XR category grew 41.6 percent year over year to 14.5 million units, with Meta capturing 75.7 percent share in Q3 2025 across Quest and Ray-Ban combined.[25][26]
A modern VR headset must produce two high-resolution stereoscopic images at 90 to 120 frames per second with motion-to-photon latency below about 20 milliseconds; above that threshold many users experience nausea. On battery-powered standalone headsets, brute-force rendering at full resolution is infeasible, so modern VR relies heavily on AI techniques.
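The arithmetic below illustrates how tight the budget is (the latency ceiling and refresh rates come from the paragraph above; the pipeline-stage split in the closing comment is a hypothetical example, not a vendor specification):

```python
# Illustrative frame-budget arithmetic for a VR headset.

TARGET_MOTION_TO_PHOTON_MS = 20.0  # approximate comfort ceiling

for refresh_hz in (72, 90, 120):
    frame_budget_ms = 1000.0 / refresh_hz
    print(f"{refresh_hz:>3} Hz -> {frame_budget_ms:5.2f} ms per stereo frame")

# At 90 Hz the entire pipeline (tracking, simulation, two eye renders,
# lens-distortion correction, scan-out) must fit in ~11.1 ms, well under
# the ~20 ms motion-to-photon ceiling, which is why standalone headsets
# lean on foveation and upscaling rather than brute-force rendering.
```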
The human eye sees high detail only in the central few degrees around the fovea. Foveated rendering exploits this by drawing the foveal area at full resolution while the periphery is rendered at reduced quality. NVIDIA's Variable Rate Supersampling (VRSS) family pioneered driver-level foveated rendering on PC. VRSS 1 (2019) used a fixed central region; VRSS 2 (2020), built with Tobii, added eye-tracking-driven dynamic foveation following the user's gaze. NVIDIA's VRWorks Graphics SDK exposes VRS Wrapper, VRS Helper and Gaze Handler APIs. Foveated rendering reached the mainstream with Sony's PlayStation VR2 in 2023; the Bigscreen Beyond 2e added it for PC VR in 2025, and the upcoming Valve Steam Frame uses eye tracking for both render-time foveation and "foveated streaming."[20][27][28][29]
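The following sketch illustrates the core idea of a VRS-style tiered scheme: the shading rate coarsens as angular distance from the gaze point grows. The tier boundaries and rates here are illustrative placeholders, not the parameters of any shipping implementation:

```python
import math

# Minimal gaze-driven foveation: pick a coarser shading rate as the
# angular distance (eccentricity) from the gaze point increases.

def shading_rate(pixel_deg: tuple[float, float],
                 gaze_deg: tuple[float, float]) -> str:
    """Return a VRS-style rate ("1x1" = full resolution) for a pixel,
    given angular positions in degrees from the screen centre."""
    eccentricity = math.hypot(pixel_deg[0] - gaze_deg[0],
                              pixel_deg[1] - gaze_deg[1])
    if eccentricity < 5.0:     # foveal region: full resolution
        return "1x1"
    if eccentricity < 15.0:    # parafovea: half rate on each axis
        return "2x2"
    return "4x4"               # periphery: one shade per 16 pixels

# With gaze 10 degrees left of centre, the screen centre falls in the
# 2x2 band while a pixel near the gaze point stays at full resolution.
print(shading_rate((0.0, 0.0), (-10.0, 0.0)))   # -> 2x2
print(shading_rate((-9.0, 1.0), (-10.0, 0.0)))  # -> 1x1
```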
NVIDIA's Deep Learning Super Sampling (DLSS), originally developed for flat-screen gaming on RTX GPUs, has become a key VR tool. DLSS uses a neural network trained on high-resolution reference images to upscale lower-resolution frames in real time. DLSS 4 (2025) added multi-frame generation using a transformer-based model, essential when a VR application must hit 90 Hz. Similar techniques include AMD FidelityFX Super Resolution, Intel XeSS, and Sony's PSSR on the PS5 Pro for tethered PSVR2.
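Conceptually, a DLSS-style temporal upscaler takes the current low-resolution render plus the previous high-resolution output (reprojected using motion vectors) and predicts a new high-resolution frame. The toy PyTorch module below shows only this data flow; production networks are far larger and, as noted above, transformer-based in DLSS 4:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy sketch of the data flow in a temporal neural upscaler. This is an
# illustration of the technique, not NVIDIA's architecture.

class ToyUpscaler(nn.Module):
    def __init__(self, scale: int = 2):
        super().__init__()
        self.scale = scale
        # 3 channels for the upsampled low-res frame + 3 for history
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, low_res, history):
        up = F.interpolate(low_res, scale_factor=self.scale,
                           mode="bilinear", align_corners=False)
        return self.net(torch.cat([up, history], dim=1))

low = torch.rand(1, 3, 540, 600)       # e.g. a half-resolution eye buffer
hist = torch.rand(1, 3, 1080, 1200)    # reprojected previous output
print(ToyUpscaler()(low, hist).shape)  # torch.Size([1, 3, 1080, 1200])
```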
Gaussian splatting (3DGS), introduced at SIGGRAPH 2023, represents 3D scenes as millions of anisotropic Gaussian primitives that can be rasterised in real time. Alongside Neural Radiance Fields (NeRF), it has become a dominant technique for capturing real environments and replaying them as photorealistic free-viewpoint VR. Apple uses Gaussian splatting for the new Personas in visionOS 26, and Meta's Codec Avatars research has shifted toward 3DGS-based representations.[17][30]
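At the heart of 3DGS rasterisation, each depth-sorted splat contributes alpha = opacity · exp(−½ · dᵀ Σ⁻¹ d) at a pixel and is alpha-composited front to back. The sketch below shows that compositing step in isolation; a full renderer also projects 3D covariances to 2D and evaluates view-dependent colour from spherical harmonics, both omitted here:

```python
import numpy as np

# Minimal front-to-back compositing of projected 2D Gaussian splats.

def composite_pixel(pixel, splats):
    """splats: list of (mean_2d, cov_2d, opacity, rgb), sorted by depth."""
    color = np.zeros(3)
    transmittance = 1.0
    for mean, cov, opacity, rgb in splats:
        d = pixel - mean
        alpha = opacity * np.exp(-0.5 * d @ np.linalg.inv(cov) @ d)
        color += transmittance * alpha * np.asarray(rgb, dtype=float)
        transmittance *= 1.0 - alpha
        if transmittance < 1e-3:   # early exit once nearly opaque
            break
    return color

splats = [
    (np.array([10.0, 10.0]), np.array([[4.0, 1.0], [1.0, 2.0]]), 0.8, [1, 0, 0]),
    (np.array([11.0, 10.0]), np.array([[9.0, 0.0], [0.0, 9.0]]), 0.6, [0, 0, 1]),
]
print(composite_pixel(np.array([10.5, 10.0]), splats))
```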
VR-Splatting, published in May 2025 in the Proceedings of the ACM on Computer Graphics and Interactive Techniques, combined neural point rendering for the foveal region with smooth Gaussian splatting in the periphery to deliver foveated radiance-field rendering at full VR resolution (2016x2240 per eye) at 90 Hz. The A3FR system (2025) demonstrated incremental gaze-tracked foveated rendering of Gaussian-splatted scenes with sub-millisecond latency.[30][31]
| Technique | Originator | Function | Used in |
|---|---|---|---|
| Variable Rate Supersampling (VRSS / VRSS 2) | NVIDIA | Driver-level foveated rendering, dynamic with eye tracking | NVIDIA RTX GPUs in PC VR |
| Deep Learning Super Sampling (DLSS) | NVIDIA | Neural upscaling and frame generation | RTX-based PC VR titles |
| FidelityFX Super Resolution (FSR) | AMD | Upscaling, optional ML mode | AMD Radeon, Steam Deck |
| Xe Super Sampling (XeSS) | Intel | Neural upscaling | Intel Arc GPUs |
| PSSR (PlayStation Spectral Super Resolution) | Sony | Neural upscaling on PS5 Pro | PSVR2 via PS5 Pro |
| Gaussian splatting (3DGS) | Inria et al. | Real-time radiance-field rendering | Apple Personas, Meta Codec Avatars, VR-Splatting |
| Neural Radiance Fields (NeRF) | UC Berkeley et al. | Implicit volumetric scene representation | Capture pipelines, novel-view synthesis |
| Audio to Expression | Meta Reality Labs | Speech-driven facial animation | Meta Horizon avatars |
| Foveated streaming | Valve | Gaze-driven encoder bitrate allocation | Valve Steam Frame |
Meta Reality Labs has worked on Codec Avatars for nearly a decade. The system uses generative neural networks to reconstruct a person's face and body from input data (originally a multi-camera Mugsy capture rig with more than 100 cameras, more recently a smartphone scan) and drives the avatar in real time using face and eye tracking. Demonstrations between Mark Zuckerberg and Lex Fridman in 2023 produced what observers called the most photorealistic remote presence ever shown publicly, although rendering required a high-end workstation.[3][32]
SqueezeMe, presented at SIGGRAPH 2025, distilled a Gaussian-splatting-based full-body Codec Avatar to run three avatars at 72 frames per second on a Quest 3. Other research lines include Avat3r, Vid2Avatar-Pro and URAvatar. The current limitation is that the Quest 3 and 3S lack face and eye tracking, so the highest-quality Codec Avatars cannot yet be driven from those devices.[3][33]
Apple's Personas, introduced with the original Vision Pro in February 2024, were the first shipping consumer feature of this kind. The original implementation used a depth scan and produced a slightly uncanny half-body avatar. The visionOS 26 update reworked Personas around Gaussian splatting and a generative AI algorithm, producing more natural facial motion. Persona generation runs up to 50 percent faster on the M5 Vision Pro than on the M2 model.[15][17]
NVIDIA Avatar Cloud Engine (ACE) is a suite of microservices for AI-driven digital humans. ACE went generally available in June 2024 and combines NVIDIA Riva (speech recognition, TTS, translation), Nemotron LLMs, and Audio2Face (real-time facial animation from audio). It integrates with third-party LLMs from OpenAI, Anthropic, and others through NIM microservices. ACE powers Inworld NPCs, customer-service avatars, and digital health assistants. The on-device Nemotron-3 4.5B can drive an avatar entirely on a local RTX GPU.[34][35]
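The loop these services implement is straightforward to sketch. Every function below is a stub standing in for a microservice (a Riva-class ASR or TTS, a Nemotron or third-party LLM, an Audio2Face-class animation model); none of it is NVIDIA's actual ACE API:

```python
# Hedged sketch of a digital-human turn: ASR -> LLM -> TTS -> audio-driven
# facial animation. All functions are hypothetical stand-ins.

def speech_to_text(audio: bytes) -> str:          # stand-in for ASR
    return "What are your opening hours?"

def generate_reply(history: list[dict]) -> str:   # stand-in for the LLM
    return "We are open from nine to five, Monday through Friday."

def text_to_speech(text: str) -> bytes:           # stand-in for TTS
    return text.encode()                          # pretend these bytes are audio

def audio_to_face(audio: bytes) -> list[float]:   # stand-in for facial animation
    return [0.0] * 52                             # e.g. ARKit-style blendshapes

def avatar_turn(mic_audio: bytes, history: list[dict]) -> list[float]:
    history.append({"role": "user", "content": speech_to_text(mic_audio)})
    reply = generate_reply(history)
    history.append({"role": "assistant", "content": reply})
    return audio_to_face(text_to_speech(reply))   # weights that drive the rig

print(len(avatar_turn(b"...", [])))               # 52 blendshape weights
```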
VR is also one of the most important pipelines for training embodied AI agents. The approach, sim-to-real transfer, trains a policy in a simulated environment (often a physics-accurate 3D world delivered through the same engines that power VR rendering) and then deploys it on physical hardware. Simulation enables thousands of parallel training environments, randomised lighting, friction and sensor noise, and avoids the safety hazards of real-world training.
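A minimal sketch of the domain-randomisation loop, assuming a hypothetical SimEnv class standing in for a simulator such as Isaac Sim: each episode samples fresh physics, lighting and sensor-noise parameters so the policy cannot overfit to any single simulated world:

```python
import random

# Hedged illustration of domain randomisation for sim-to-real training.

class SimEnv:
    def __init__(self, friction, light_lux, sensor_noise):
        self.friction, self.light_lux, self.sensor_noise = (
            friction, light_lux, sensor_noise)

    def rollout(self, policy) -> float:
        # A real simulator would render observations under self.light_lux
        # and step physics with self.friction; here we fake a return.
        return policy(self.friction) - abs(self.sensor_noise)

def train(policy, episodes: int = 1000) -> float:
    total = 0.0
    for _ in range(episodes):
        env = SimEnv(
            friction=random.uniform(0.3, 1.2),     # randomised contact physics
            light_lux=random.uniform(50, 2000),    # randomised lighting
            sensor_noise=random.gauss(0.0, 0.05),  # randomised sensor noise
        )
        total += env.rollout(policy)
    return total / episodes

# Average return of a toy policy over 1000 randomised episodes.
print(train(lambda friction: 1.0 - abs(friction - 0.7)))
```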
NVIDIA Isaac Sim, built on NVIDIA Omniverse and OpenUSD, is the most widely used commercial robotics simulator. It supports GPU-accelerated PhysX physics, RTX ray-traced sensor simulation, and ROS integration. Isaac Lab unifies reinforcement learning, imitation learning and motion planning. Synthetic data is post-processed with NVIDIA Cosmos, a family of world foundation models pretrained on more than 20 million hours of video. Humanoid robot vendors including 1X, Figure, Apptronik and Boston Dynamics use combinations of Isaac Sim, Cosmos and proprietary simulators.[36][37]
Autonomous driving is one of the largest VR-style simulation use cases. CARLA, an open-source simulator, hosts datasets such as a 110-scene benchmark with 24,000 frames and 890,000 annotations across six weather conditions. Waymo and Tesla operate proprietary simulation platforms that re-run real driving logs and synthesise variations. NVIDIA's Drive Hyperion combines on-vehicle compute with Omniverse simulations and Cosmos world models. Novel-view synthesis using Gaussian splatting and NeRF is increasingly used to construct closed-loop simulators that re-render real drives from new viewpoints.[38][39]
Drones are well suited to sim-to-real reinforcement learning because the cost of a real crash is much higher than a simulated one. Microsoft AirSim and the open-source OmniDrones platform built on NVIDIA Isaac train policies for navigation and racing. A landmark 2023 Nature paper from the University of Zurich described Swift, a deep reinforcement learning system trained in simulation that defeated three world-champion human drone racers.[40][41]
Meta Horizon Worlds, available on Quest headsets and the web, gained AI-powered NPCs, announced at Meta Connect 2024 and 2025 and now driven by Llama 4. A Character Builder tool lets creators configure name, backstory, instructions, voice and guardrails. The Horizon Studio editor adds GenAI tools to generate 3D meshes, textures, skyboxes, sound effects, ambient audio, TypeScript scripts and full islands from text prompts; a project-level AI assistant is rolling out in 2026.[42][43]
VRChat, the largest user-generated social VR platform, has seen rapid adoption of LLM-driven NPCs. Celeste AI, an LLM-based virtual companion, was one of the first widely used examples. Academic work from CHI 2024 and 2025 has shown that VRChat NPCs with retrieval-augmented memory (storing prior conversations and mood, and retrieving the top-N relevant observations) can hold long-running, contextually consistent dialogues. The standard stack combines Whisper-style speech-to-text, an LLM such as GPT-4 or Claude for response generation, a small open-source emotion classifier, and the avatar's animation system.[44][45]
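A minimal sketch of such a retrieval-augmented memory, using a toy bag-of-words embedding and cosine similarity in place of a real embedding model:

```python
import math
import re
from collections import Counter

# Store past observations, embed them, retrieve the top-N most similar
# to the current query, and prepend those to the LLM prompt.

def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class NPCMemory:
    def __init__(self):
        self.observations: list[str] = []

    def store(self, observation: str) -> None:
        self.observations.append(observation)

    def retrieve(self, query: str, n: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.observations,
                        key=lambda o: cosine(embed(o), q), reverse=True)
        return ranked[:n]

memory = NPCMemory()
memory.store("The player said their name is Kai.")
memory.store("The player dislikes crowded worlds.")
memory.store("We talked about drone racing yesterday.")
print(memory.retrieve("What is the player's name?", n=1))
```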
The most futuristic application of AI in VR is the on-the-fly generation of entire interactive worlds from a text prompt, sometimes called text-to-VR or text-to-world. Google DeepMind's Genie 3, announced August 2025, generates dynamic 3D worlds at 720p and 24 frames per second, navigable in real time and consistent for a few minutes. Genie 3 supports object permanence and "promptable world events" that let users alter weather, characters or objects mid-experience. Observers describe it as the "GPT-3 moment" for interactive 3D environments.[46]
Meta's WorldGen, announced as a Reality Labs research project in 2025, generates fully textured navigable 3D scenes spanning 50 by 50 metres from a single text prompt. Smaller GenAI components inside Horizon Studio already produce textures, audio and 3D meshes on demand. Video generation models such as OpenAI's Sora, Runway Gen-3, Kling, and Veo are not strictly VR systems, but they generate stereoscopic-capable video that can be displayed in 360- or 180-degree formats on Vision Pro and Quest.[42][47]
| Integration | Vendor | Year | Function |
|---|---|---|---|
| Meta AI on Quest | Meta | 2024 | Multimodal LLM assistant with passthrough vision |
| Audio to Expression (Horizon OS v71) | Meta | December 2024 | Speech-driven avatar facial animation |
| Codec Avatars / SqueezeMe | Meta Reality Labs | 2014-present | Photorealistic generative avatars |
| Apple Personas (Gaussian splatting rebuild) | Apple | visionOS 26, 2025 | Generative full-face avatars for FaceTime |
| Apple Intelligence on Vision Pro | Apple | 2024-2025 | On-device LLM, Image Playground, photo search |
| NVIDIA ACE microservices | NVIDIA | June 2024 | Riva ASR/TTS, Audio2Face, Nemotron LLM for avatars |
| Horizon Worlds AI NPCs (Llama 4) | Meta | 2025 | Generative NPCs with Character Builder |
| Horizon Studio GenAI tools | Meta | 2025 | Text-to-3D mesh, texture, audio, code |
| WorldGen | Meta Reality Labs | 2025 | Text-to-world for navigable 3D scenes |
| Genie 3 | Google DeepMind | August 2025 | Text-to-interactive-world model |
| VR-Splatting | IEEE / ACM TOG | 2025 | Foveated radiance-field rendering at 90 Hz |
| Foveated streaming | Valve (Steam Frame) | 2026 | Gaze-driven wireless bitrate allocation |
| Isaac Sim + Cosmos | NVIDIA | 2024-2025 | Sim-to-real training for robots and AVs |
VR remains constrained by several factors. Form factor and weight are barriers (Vision Pro weighs over 600 grams without battery). Battery life on standalone headsets is typically two to three hours. Vergence-accommodation conflict causes eye strain; varifocal displays are an active research direction. Motion sickness affects a minority of users in artificial-locomotion games. Social adoption has been slowed by the isolation of wearing a headset and unresolved questions about content moderation. AI-generated NPCs add new safety concerns including grooming, harassment, and emotional manipulation by autonomous agents.