Building multilingual XR training with JSON content packs
A simple content architecture for VR and AR training apps that need multiple languages, regional terminology, and client-specific lesson text.

Localization becomes painful when text is treated like decoration.
In XR training, text is not decoration. It is instruction, assessment, safety context, narration, labels, tool names, error recovery, and instructor guidance. If that text is hardcoded into Unity scenes, every new language becomes a rebuild project.
For enterprise training platforms, I prefer JSON content packs.
Why XR localization is different
Localizing a website is mostly a layout and copy problem. Localizing XR training has more moving parts:
- Floating labels must fit inside 3D space
- Voice prompts need timing
- Step names must match instructor language
- Safety terms may be region-specific
- Medical or industrial terminology must be exact
- UI text may be visible inside the headset and on instructor screens
- The same training flow may need client-specific wording
That is why I try to separate content from scene logic.
What goes into a content pack
A content pack is a structured set of text and references that the app loads at runtime. It can be bundled inside the APK for offline deployment.
Typical fields include:
- Language code and display name
- Module title and description
- Step labels
- Instruction text
- Hint text
- Error messages
- Completion messages
- Voiceover file references
- Assessment labels
- Instructor notes
- Glossary terms
The Unity scene still controls behavior. The JSON pack controls the words and content references attached to that behavior.
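One way to picture those fields is as a small typed schema. The sketch below uses Python dataclasses purely for illustration; in a Unity project the equivalent would be serializable C# classes. The field names (`voiceover`, `instructor_note`, `glossary`, and so on) are my own assumptions, not a fixed format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    id: str                           # stable key that scene logic asks for
    title: str
    instruction: str
    hint: str = ""
    success: str = ""
    error: str = ""
    voiceover: Optional[str] = None   # hypothetical: relative path to an audio clip
    instructor_note: str = ""


@dataclass
class ContentPack:
    language: str                     # language code, e.g. "en"
    display_name: str                 # shown in a language picker, e.g. "English"
    module_title: str
    module_description: str = ""
    steps: list[Step] = field(default_factory=list)
    glossary: dict[str, str] = field(default_factory=dict)


# A pack with a single step; optional fields fall back to empty defaults.
pack = ContentPack(
    language="en",
    display_name="English",
    module_title="Lockout Tagout Procedure",
    steps=[
        Step(
            id="inspect-panel",
            title="Inspect the control panel",
            instruction="Confirm the machine is idle before applying lockout.",
        )
    ],
)
```

Keeping optional fields defaulted means a pack can start small and grow as voiceover, glossary, or instructor notes are added.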
A small example
The structure can stay simple:
{
  "language": "en",
  "moduleTitle": "Lockout Tagout Procedure",
  "steps": [
    {
      "id": "inspect-panel",
      "title": "Inspect the control panel",
      "instruction": "Confirm the machine is idle before applying lockout.",
      "hint": "Look for the main disconnect switch.",
      "success": "Panel inspection complete."
    }
  ]
}
The important part is the stable id. Scene logic asks for content by id, never by the text itself, so translators can change the wording without breaking interaction code.
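To make the lookup-by-id idea concrete, here is a minimal sketch in Python. The helper name `index_steps` is hypothetical, and the JSON is inlined only so the example runs on its own; a real app would read the pack from a bundled file.

```python
import json

# Inlined copy of the example pack so the sketch is self-contained.
PACK_JSON = """
{
  "language": "en",
  "moduleTitle": "Lockout Tagout Procedure",
  "steps": [
    {
      "id": "inspect-panel",
      "title": "Inspect the control panel",
      "instruction": "Confirm the machine is idle before applying lockout.",
      "hint": "Look for the main disconnect switch.",
      "success": "Panel inspection complete."
    }
  ]
}
"""


def index_steps(pack):
    """Build an id -> step lookup so scene logic never depends on list order."""
    index = {}
    for step in pack["steps"]:
        if step["id"] in index:
            # Duplicate ids would make lookups ambiguous, so fail early.
            raise ValueError(f"duplicate step id: {step['id']}")
        index[step["id"]] = step
    return index


pack = json.loads(PACK_JSON)
steps_by_id = index_steps(pack)

# Scene logic only knows the id; the words come from the pack.
print(steps_by_id["inspect-panel"]["instruction"])
```

Swapping in a different language pack changes what gets printed, but the `"inspect-panel"` key used by the scene stays the same.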
Build for missing content
Every localization system needs graceful failure.
In production, I expect:
- Fallback to the default language if a key is missing
- Editor validation for missing ids
- Warnings for long labels
- A simple preview mode for translators or reviewers
- Version numbers for content packs
- A changelog for client-specific edits
This prevents small content mistakes from becoming broken builds.
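The first two items on that list can be sketched in a few lines. This is an illustrative Python version, assuming each pack has already been indexed as `{step_id: {field: text}}`; the function names and the German sample string are placeholders, not reviewed translations or a fixed API.

```python
def get_text(step_id, key, localized, default):
    """Return localized text, falling back to the default-language pack."""
    value = localized.get(step_id, {}).get(key)
    if value:  # missing keys and empty strings are treated the same way
        return value
    return default[step_id][key]


def missing_keys(localized, default):
    """List (step_id, key) pairs that exist in the default pack but not in the localized one."""
    missing = []
    for step_id, fields in default.items():
        for key in fields:
            if not localized.get(step_id, {}).get(key):
                missing.append((step_id, key))
    return missing


default_en = {
    "inspect-panel": {
        "title": "Inspect the control panel",
        "hint": "Look for the main disconnect switch.",
    }
}
# Illustrative partial translation: the hint has not been translated yet.
pack_de = {"inspect-panel": {"title": "Bedienfeld prüfen"}}

print(get_text("inspect-panel", "title", pack_de, default_en))  # localized text
print(get_text("inspect-panel", "hint", pack_de, default_en))   # falls back to English
print(missing_keys(pack_de, default_en))                        # report for reviewers
```

The runtime fallback keeps the app usable, while the `missing_keys` report is the kind of check that belongs in an editor or build step so gaps are caught before a headset ever sees them.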
Offline-first makes language packs easier
If the app already runs offline, language packs can ship with the build. That is useful for hospitals, factories, classrooms, and field sites where Wi-Fi is restricted or unavailable.
It also simplifies review. A client can approve a specific build knowing exactly which language content is included. There is no hidden server dependency changing copy after deployment.
For Anatomy XR, this kind of structure makes multilingual anatomy labels practical. The 3D experience stays the same, while the language layer can adapt for different classrooms or institutions.
Keep translators out of Unity
The best localization workflow does not require translators to open Unity.
A clean content pack lets a translator work in a familiar format, then lets the development team validate and package the result. For larger projects, the JSON can also be generated from a spreadsheet or translation management system.
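As a rough sketch of the spreadsheet route, the Python below converts a CSV export (one row per step) into the pack structure shown earlier. The column names and the `sheet_to_pack` helper are assumptions for illustration; a translation management system would have its own export format.

```python
import csv
import io
import json

# Stand-in for a spreadsheet export; a real project would read a file instead.
SHEET = """id,title,instruction,hint,success
inspect-panel,Inspect the control panel,Confirm the machine is idle before applying lockout.,Look for the main disconnect switch.,Panel inspection complete.
"""


def sheet_to_pack(csv_text, language, module_title):
    """Convert spreadsheet rows (one row per step) into a content-pack dict."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Strip stray whitespace that tends to creep into spreadsheet cells.
    steps = [{key: value.strip() for key, value in row.items()} for row in rows]
    return {"language": language, "moduleTitle": module_title, "steps": steps}


pack = sheet_to_pack(SHEET, "en", "Lockout Tagout Procedure")

# ensure_ascii=False keeps non-English characters readable in the output file.
print(json.dumps(pack, indent=2, ensure_ascii=False))
```

The translator only ever edits the spreadsheet; the id column stays locked, and the development team runs the conversion and validation.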
The important rule is that language review should not require scene editing.
Text length is a design constraint
Some languages need more space than English. Some labels become unreadable when translated directly. In VR, that matters because text often exists in constrained panels or anchored world-space callouts.
Good systems plan for:
- Wrapping
- Font scaling limits
- Short and long label variants
- Icon support
- Voiceover when text would be too dense
- Instructor notes for extra explanation outside the headset
Localization is not only translation. It is UX.
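Short and long label variants can be handled with a simple selection rule. The sketch below is a Python illustration with a hypothetical `pick_label` helper, and it uses character count as a rough proxy for fit. Real fit depends on the font, glyph widths, and panel size, so treat the budget as a warning threshold rather than a guarantee.

```python
def pick_label(variants, max_chars):
    """Return (label, needs_review): the longest variant that fits the budget.

    If nothing fits, return the shortest variant and flag it for review.
    """
    candidates = [text for text in variants.values() if text]
    fitting = [text for text in candidates if len(text) <= max_chars]
    if fitting:
        return max(fitting, key=len), False
    return min(candidates, key=len), True


variants = {
    "long": "Inspect the control panel",
    "short": "Inspect panel",
}

# A wide panel can take the long label; a narrow callout gets the short one.
wide_label, wide_flag = pick_label(variants, 30)
narrow_label, narrow_flag = pick_label(variants, 20)

# Nothing fits a 10-character budget, so the result is flagged for a reviewer.
tiny_label, tiny_flag = pick_label(variants, 10)
```

The flag is the useful part: it turns "this label looks cramped in German" from something a tester notices in the headset into something a validation step reports up front.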
The payoff
JSON content packs are not glamorous, but they make XR training easier to maintain.
They help teams:
- Add languages without rebuilding the interaction system
- Support regional terminology
- Review safety-critical text clearly
- Keep offline deployments self-contained
- Let instructors request wording changes without touching code
For training products, that flexibility matters. The first version teaches the process. The second, third, and fourth versions usually teach it in more places, to more people, with more specific language.
That is exactly where a content layer earns its keep.