Spatial computing meets intelligence

VR / XR × AI

Exploring AI-native interfaces for immersive environments. Generative 3D, spatial understanding, voice-first interaction, and intelligent virtual worlds.

We're building AI that works in 3D space — generating environments, understanding physical context, and responding to gestures and voice. Our primary test platform is Meta Quest, with experiments spanning standalone VR, passthrough AR, and tethered displays.

The interesting problems: How do you ground an LLM in 3D coordinates? Can you generate usable 3D assets fast enough to feel interactive? What happens when you combine Quest's hand tracking and passthrough with on-device inference?

Applications we're exploring: generative design (describe a room, see it built around you), adaptive training simulations, spatial data visualization, and multi-user spaces where AI agents have presence alongside humans.

What we're exploring

01 Generative 3D from text and images
02 Spatial reasoning and scene understanding
03 Voice and gesture-first interfaces
04 AI-driven procedural environments
05 Real-time 3D asset generation
06 Immersive data visualization

Experiments

What we're building, testing, and learning.

Text-to-3D scene generation

Natural language descriptions → complete 3D scenes. Testing various approaches: diffusion-based, retrieval-augmented, and hybrid methods.

Insight: Quality vs. speed tradeoff is brutal. Fast methods produce generic results; quality methods take minutes per object. Hybrid approach showing promise.
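To make the hybrid idea concrete, here is a minimal TypeScript sketch of the retrieval half: match prompt words against a pre-tagged asset library, then hand the survivors to a layout step. The asset library, tag scheme, and function names are illustrative stand-ins, not our production pipeline.

```ts
// Minimal sketch of the retrieval step in a hybrid text-to-3D pipeline.
// Assumes a pre-tagged asset library; a generative model would only be
// invoked for objects the retrieval step cannot cover.

interface Asset {
  id: string;
  tags: string[];                       // e.g. ["chair", "wooden", "modern"]
  boundsMeters: [number, number, number];
}

interface Placement {
  assetId: string;
  position: [number, number, number];   // meters, scene origin at floor center
}

// Score assets by how many prompt words match their tags.
function retrieveAssets(prompt: string, library: Asset[], limit: number): Asset[] {
  const words = new Set(prompt.toLowerCase().split(/\W+/));
  return library
    .map(asset => ({ asset, score: asset.tags.filter(t => words.has(t)).length }))
    .filter(entry => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(entry => entry.asset);
}

// Naive layout: line retrieved assets along the x-axis with a small gap.
// In practice this step is where an LLM or layout model earns its keep.
function layoutScene(assets: Asset[]): Placement[] {
  let cursorX = 0;
  return assets.map(asset => {
    const placement: Placement = { assetId: asset.id, position: [cursorX, 0, 0] };
    cursorX += asset.boundsMeters[0] + 0.5;
    return placement;
  });
}

const library: Asset[] = [
  { id: "chair_01", tags: ["chair", "wooden"], boundsMeters: [0.5, 0.9, 0.5] },
  { id: "desk_02", tags: ["desk", "office"], boundsMeters: [1.4, 0.75, 0.7] },
  { id: "plant_03", tags: ["plant", "decor"], boundsMeters: [0.4, 1.2, 0.4] },
];

const scene = layoutScene(retrieveAssets("a small office with a desk and a chair", library, 5));
console.log(scene);
```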

Spatial voice commands on Quest

'Move that over there' — voice commands with spatial deixis on Meta Quest 3. Combining speech recognition with Quest's hand tracking and eye gaze to understand references.

Insight: Multimodal fusion is essential. Voice alone is ambiguous; combined with Quest's gaze direction, accuracy jumps significantly.
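A stripped-down sketch of that fusion, in TypeScript: resolve "that" to whichever tracked object sits closest to the gaze ray sampled at utterance time. The gaze and object types are placeholders for data the headset's tracking APIs would supply; this is plain vector math, not the Quest SDK.

```ts
// Minimal sketch of resolving a deictic reference ("that") by fusing the
// transcript with the gaze ray sampled at utterance time.

type Vec3 = [number, number, number];

interface TrackedObject {
  id: string;
  center: Vec3;
}

interface GazeSample {
  origin: Vec3;
  direction: Vec3;   // unit vector
}

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const norm = (a: Vec3): number => Math.sqrt(dot(a, a));

// Perpendicular distance from an object's center to the gaze ray.
function distanceToRay(point: Vec3, ray: GazeSample): number {
  const toPoint = sub(point, ray.origin);
  const along = dot(toPoint, ray.direction);
  if (along < 0) return Infinity;   // object is behind the user
  const closest: Vec3 = [
    ray.origin[0] + ray.direction[0] * along,
    ray.origin[1] + ray.direction[1] * along,
    ray.origin[2] + ray.direction[2] * along,
  ];
  return norm(sub(point, closest));
}

// Pick the tracked object nearest the gaze ray as the referent of "that".
function resolveDeixis(gaze: GazeSample, objects: TrackedObject[]): TrackedObject | null {
  let best: TrackedObject | null = null;
  let bestDistance = Infinity;
  for (const obj of objects) {
    const d = distanceToRay(obj.center, gaze);
    if (d < bestDistance) {
      best = obj;
      bestDistance = d;
    }
  }
  return best;
}

const gaze: GazeSample = { origin: [0, 1.6, 0], direction: [0, 0, -1] };
const target = resolveDeixis(gaze, [
  { id: "lamp", center: [0.1, 1.5, -2] },
  { id: "box", center: [2, 0, -1] },
]);
console.log(target?.id);   // "lamp"
```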

Passthrough AR scene understanding

Using Quest's passthrough cameras to let AI understand your real environment — identify objects, suggest placements, answer questions about the space.
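As a rough illustration of the grounding step, the TypeScript sketch below serializes a detected-object list (labels, positions, approximate sizes) into a text prompt a language model can reason over. The detections here are hand-written placeholders for what a perception pass over the passthrough feed would produce, and the prompt format is only one possible choice.

```ts
// Minimal sketch of grounding a language model in a passthrough scene:
// turn detected objects into a text description the model can reason over.

interface DetectedObject {
  label: string;
  center: [number, number, number];       // meters, headset-relative
  sizeMeters: [number, number, number];
}

function describeScene(objects: DetectedObject[]): string {
  return objects
    .map(o =>
      `${o.label} at (${o.center.map(v => v.toFixed(1)).join(", ")}) m, ` +
      `approx ${o.sizeMeters.map(v => v.toFixed(1)).join(" x ")} m`
    )
    .join("\n");
}

function buildPrompt(objects: DetectedObject[], question: string): string {
  return [
    "You are assisting a user wearing a passthrough AR headset.",
    "Coordinates are in meters relative to the headset; -z is forward.",
    "Objects currently detected in the room:",
    describeScene(objects),
    `User question: ${question}`,
  ].join("\n");
}

const scenePrompt = buildPrompt(
  [
    { label: "desk", center: [0.0, 0.7, -1.5], sizeMeters: [1.4, 0.05, 0.7] },
    { label: "monitor", center: [0.0, 1.1, -1.8], sizeMeters: [0.6, 0.4, 0.1] },
  ],
  "Where could I put a second monitor?"
);
// scenePrompt is then sent to the language model; the reply can be rendered
// as a spatial annotation anchored near the suggested location.
console.log(scenePrompt);
```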

Adaptive training on standalone XR

Training environments on Quest that adjust difficulty and scenarios based on user performance. Testing industrial safety scenarios that run entirely on-device.
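The core of the adaptation loop can be as small as the sketch below: a controller that nudges a single difficulty value toward a target success rate over a sliding window of attempts. The thresholds, window size, and step are illustrative defaults, not tuned values from our scenarios.

```ts
// Minimal sketch of an adaptive-difficulty loop for a training scenario.

interface AttemptResult {
  success: boolean;
  timeSeconds: number;
}

class DifficultyController {
  private history: AttemptResult[] = [];

  constructor(
    private difficulty = 0.5,              // 0 = easiest, 1 = hardest
    private readonly windowSize = 5,
    private readonly targetSuccessRate = 0.7,
    private readonly step = 0.1
  ) {}

  // Record one completed attempt and return the difficulty for the next one.
  record(result: AttemptResult): number {
    this.history.push(result);
    if (this.history.length > this.windowSize) this.history.shift();

    const successRate =
      this.history.filter(r => r.success).length / this.history.length;

    // Raise difficulty when the trainee is above target, lower it when below.
    if (successRate > this.targetSuccessRate) this.difficulty += this.step;
    else if (successRate < this.targetSuccessRate) this.difficulty -= this.step;
    this.difficulty = Math.min(1, Math.max(0, this.difficulty));

    return this.difficulty;
  }
}

// Usage: feed in each completed attempt, then scale hazards and time limits
// in the next scenario from the returned difficulty value.
const controller = new DifficultyController();
console.log(controller.record({ success: true, timeSeconds: 42 }));
```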

Tech we're using

Unity · Meta Quest 3 · Quest SDK · Unreal Engine · Gaussian Splatting · NeRF · Whisper · Claude Opus · Blender · WebXR

Open questions

Things we're still figuring out.

Can generative 3D ever be fast enough for real-time creation on standalone headsets?

How do you ground AI understanding in 3D space — not just 2D projections?

What interaction patterns work best when you have Quest's hand tracking, voice, and eye gaze?

Is immersive AI interaction genuinely better, or just more engaging?

Interested in this research? Have a related problem?

Let's talk → Reach out to us at info@deepklarity.com