ARKit 52-dim
1 Choose Engine
V1 Recommended
111-dim output
Body motion
Idle blink
Phoneme-based engine with full expression control, VRMA body animation, idle eye blinks, and micro-expressions. Best quality.
V2 Lightweight
52-dim output
Fast
Distilled student model with direct prediction. Lighter and faster, with crisp mouth animation. No body motion or idle expressions.
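The trade-off between the two engines can be sketched as a small lookup table. This is only an illustration: `EngineId`, `EngineProfile`, `ENGINES`, and `pickEngine` are hypothetical names, not part of the app; only the output sizes and feature flags are taken from the descriptions above.

```typescript
// Hypothetical types and names; the numbers and flags mirror the engine cards above.
type EngineId = "v1" | "v2";

interface EngineProfile {
  label: string;
  outputDim: number;        // length of the per-frame output vector
  bodyMotion: boolean;      // body animation available
  idleExpressions: boolean; // idle blink and micro-expressions available
}

const ENGINES: Record<EngineId, EngineProfile> = {
  v1: { label: "V1 Recommended", outputDim: 111, bodyMotion: true, idleExpressions: true },
  v2: { label: "V2 Lightweight", outputDim: 52, bodyMotion: false, idleExpressions: false },
};

// Pick V1 when body motion or idle expressions are needed, otherwise the lighter V2.
function pickEngine(needs: { bodyMotion?: boolean; idleExpressions?: boolean }): EngineId {
  return needs.bodyMotion || needs.idleExpressions ? "v1" : "v2";
}
```

For example, `ENGINES[pickEngine({ bodyMotion: true })].outputDim` evaluates to 111, while a mouth-only use case falls through to V2 and its 52-dim output.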
2 Load Avatar
or upload your own
Choose a pre-built avatar above or upload your own .vrm file.
3 Voice Input
Stream from your microphone in real time, or upload an .mp3, .wav, or .ogg audio file.
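A minimal sketch of how an upload could be checked against the three listed formats before decoding. The names `ACCEPTED_AUDIO_EXTENSIONS` and `isAcceptedAudioFile` are hypothetical, and matching extensions case-insensitively is an assumption rather than documented behavior of the app.

```typescript
// Extensions taken from the Voice Input description; everything else is illustrative.
const ACCEPTED_AUDIO_EXTENSIONS = [".mp3", ".wav", ".ogg"] as const;

// Returns true when the file name ends with one of the accepted extensions.
// Assumption: the comparison ignores case, so "clip.WAV" is accepted.
function isAcceptedAudioFile(fileName: string): boolean {
  const lower = fileName.toLowerCase();
  return ACCEPTED_AUDIO_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```

A check like this only filters by file name; the audio still has to decode successfully in the browser.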
How it works:
1. Choose V1 or V2 engine and initialize
2. Pick a pre-built avatar or upload your own VRM
3. Speak via mic or upload audio to animate
All processing runs locally in your browser.
No data is sent to any server.