
Tracking Overview

XRTracker uses model-based tracking — it knows the 3D shape of the object and matches it against the camera image in real time. This page explains the core concepts.

How It Works

Every frame, the tracker:

  1. Receives a camera image from the active camera source
  2. Compares the 3D model against what it sees in the image — matching edges, boundaries, or depth surfaces depending on the active modality
  3. Refines the pose to best align the model with the real object
  4. Updates the GameObject's transform with the new position and rotation

This loop runs at the camera's frame rate (typically 30-60 fps).
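The loop above can be sketched in Python. Everything here is illustrative, not the XRTracker API: the stub modality only demonstrates step 3, iterative refinement pulling the pose estimate toward alignment over successive frames.

```python
class StubModality:
    """Toy stand-in for a tracking modality: pulls the pose toward a
    fixed target, mimicking iterative alignment against the image."""
    def __init__(self, target):
        self.target = target

    def refine(self, pose):
        # Move halfway toward the target each frame (steps 2-3 collapsed).
        return tuple(p + 0.5 * (t - p) for p, t in zip(pose, self.target))

def track_frame(pose, modality):
    # Step 1 (grab the image) and step 4 (write the transform) are
    # omitted; this shows only the match-and-refine core of the loop.
    return modality.refine(pose)

pose = (0.0, 0.0, 0.0)                        # initial position estimate
modality = StubModality(target=(1.0, 0.0, 0.0))
for _ in range(3):                            # three camera frames
    pose = track_frame(pose, modality)
# pose has now converged most of the way toward (1, 0, 0)
```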

Tracking Status

Each TrackedBody reports a TrackingStatus:

| Status | Description |
|---|---|
| NotTracking | Not tracking: waiting for detection, or tracking was lost |
| Tracking | Actively tracking with good quality |
| Poor | Tracking, but quality is below the desired threshold |

Tracking starts when quality exceeds QualityToStart for several consecutive frames. Tracking stops when quality drops below QualityToStop.
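This start/stop behavior is a hysteresis: two thresholds plus a consecutive-frame counter, so the status doesn't flicker when quality hovers near a single cutoff. A minimal sketch of that logic — the field names mirror `QualityToStart`/`QualityToStop`, but the default values and frame count here are illustrative guesses, not documented settings:

```python
class TrackingState:
    """Hysteresis between NotTracking and Tracking (illustrative sketch)."""
    def __init__(self, quality_to_start=0.6, quality_to_stop=0.3, frames_to_start=3):
        self.quality_to_start = quality_to_start
        self.quality_to_stop = quality_to_stop
        self.frames_to_start = frames_to_start
        self.tracking = False
        self.good_frames = 0

    def update(self, quality):
        if not self.tracking:
            # Require several consecutive frames above QualityToStart.
            self.good_frames = self.good_frames + 1 if quality > self.quality_to_start else 0
            if self.good_frames >= self.frames_to_start:
                self.tracking = True
        elif quality < self.quality_to_stop:
            # Only stop once quality falls below the lower threshold.
            self.tracking = False
            self.good_frames = 0
        return self.tracking

state = TrackingState()
state.update(0.7)        # 1st good frame: still not tracking
state.update(0.7)        # 2nd good frame: still not tracking
state.update(0.7)        # 3rd good frame: tracking starts
state.update(0.4)        # between thresholds: keeps tracking
```

Because 0.4 sits between `quality_to_stop` and `quality_to_start`, tracking continues — that gap is what prevents rapid start/stop oscillation.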

Tracking Modalities

XRTracker uses a primary + complementary modality architecture. You always pick one primary contour modality (Silhouette or Edge), and optionally add complementary modalities (Depth, Texture) to improve robustness.

Primary Modalities (pick one)

Silhouette Tracking

Uses the object's foreground boundary — the contour where the object separates from the background. Silhouette tracking is the most general-purpose modality and the most resilient to fast camera or object movement, since the foreground/background separation remains visible even during motion blur.

Works best with objects that have a distinct outline against their surroundings. Requires some visual contrast between the object and background.

Learn more about Silhouette Tracking

Edge Tracking

Uses geometric edges on the object's surface — creases, silhouette contours, and depth discontinuities. Edge tracking works in lower contrast conditions than silhouette because it relies on local gradients rather than global foreground/background separation. Well suited for large machinery, engine blocks, industrial equipment — anything with lots of internal geometric detail visible from the camera.

Edge tracking also excels when you can only see a partial view of a large object, since internal edges remain visible even when the outline is not fully in frame. However, it is more sensitive to fast motion — edge correspondences are easier to lose during rapid movement.

Learn more about Edge Tracking

Complementary Modalities (optional)

These modalities enhance the primary contour modality but cannot be used alone.

Depth Tracking

Uses depth data from LiDAR sensors (iPhone Pro, iPad Pro) or depth cameras (Intel RealSense). Matches the 3D point cloud against the model surface. Provides strong translational constraints that complement the rotational precision of contour modalities.

Depth tracking is always used in addition to silhouette or edge — it cannot be used standalone.
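Conceptually, depth constrains translation because each depth point yields a signed distance to the model surface, which is most sensitive to motion along the surface normal. A toy point-to-plane residual illustrates the idea (pure illustration, not XRTracker's actual math):

```python
def point_to_plane_residual(point, surface_point, surface_normal):
    """Signed distance from a depth point to the model's local tangent
    plane: dot(point - surface_point, surface_normal)."""
    return sum((p - s) * n for p, s, n in zip(point, surface_point, surface_normal))

# A depth point 2 cm in front of a model surface patch facing +z:
r = point_to_plane_residual((0.0, 0.0, 0.52), (0.0, 0.0, 0.50), (0.0, 0.0, 1.0))
# r is +0.02 m; minimizing such residuals pulls the pose toward the depth data
```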

Learn more about Depth Tracking

Texture Tracking

Uses surface appearance — matching the object's texture or visual features against a reference. Provides additional constraints when the object has distinctive surface patterns. Complementary to contour tracking.

To enable, check Texture Tracking on the TrackedBody component. Texture tracking works best on objects with rich, non-repetitive surface patterns (labels, printed graphics, wood grain). It adds little value on uniform or metallic surfaces.

Silhouette vs. Edge

| Aspect | Silhouette | Edge |
|---|---|---|
| Setup | Requires pre-generated tracking model | Works directly from mesh; no model generation |
| Contrast requirement | Needs foreground/background separation | Works with lower contrast (local gradients) |
| Fast motion | Very resilient | More sensitive to rapid movement |
| Background clutter | Sensitive; similar backgrounds confuse it | More discriminative; uses local edge features |
| Object type | General-purpose, curved shapes | Large machinery, engines, objects with internal detail |
| Computational cost | Lower | Higher |

Combining Modalities

Silhouette and Edge are mutually exclusive — you pick one contour method. Depth is a complementary modality that enhances either method but cannot be used alone.

| Combination | Use Case |
|---|---|
| Silhouette only | General-purpose, no depth sensor |
| Edge only | Industrial parts, cluttered backgrounds, no depth sensor |
| Silhouette + Depth | Best robustness for AR with depth sensor (iPhone Pro, RealSense) |
| Edge + Depth | Industrial parts with depth sensor |

Choosing the Right Combination

| Scenario | Recommended |
|---|---|
| Object with distinct outline, varied background | Silhouette |
| Large machinery, engine, equipment with geometric detail | Edge |
| General-purpose with depth sensor | Silhouette + Depth |
| Low contrast scene with depth sensor | Edge + Depth |
| Object against cluttered background | Edge + Depth |
| Fast-moving objects or handheld camera | Silhouette (+ Depth if available) |

How Modalities Interact

When multiple modalities are active, each contributes its own measurements, and the tracker refines the pose against all of them jointly. Quality is reported as the best across active modalities. All modalities are processed together, so adding depth does not slow down contour tracking.
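The quality-reporting side of this is simple to picture: the reported value is the maximum over whatever modalities are currently active. A sketch of that aggregation (the dictionary keys and the function are hypothetical, not XRTracker API):

```python
def combined_quality(modality_qualities):
    """Report quality as the best across active modalities.
    None marks a modality that is enabled but has no measurement."""
    active = [q for q in modality_qualities.values() if q is not None]
    return max(active) if active else 0.0

# Silhouette is struggling, but depth is locked on, so overall quality stays high:
q = combined_quality({"silhouette": 0.55, "depth": 0.82, "texture": None})
```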

Enabling Combinations

On the TrackedBody component:

  1. Set Tracking Method to Silhouette or Edge
  2. Check Depth Tracking to add depth

Detection

Detection uses the initial pose — the position and rotation of the TrackedBody GameObject in the scene relative to the camera. When you click "Track" or call ForceStartTracking(), the tracker renders the model at this pose and searches for a match.

Detection tips

  • Place the 3D model at the approximate real-world position relative to the camera
  • The closer the initial pose, the faster and more reliable detection is
  • For AR Foundation, detection uses the phone's current camera view
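When detection fails, a quick back-of-the-envelope check of how far the placed model sits from the real object can help. The helper below is a plain Euclidean distance in camera-relative coordinates; the positions and any notion of an acceptable offset are illustrative guesses, not documented limits:

```python
import math

def pose_offset(initial_pos, actual_pos):
    """Distance in metres between the placed model's position and the
    real object's position, both relative to the camera."""
    return math.dist(initial_pos, actual_pos)

# Model placed 0.5 m in front of the camera; object actually at 0.6 m
# depth with a 10 cm lateral offset:
d = pose_offset((0.0, 0.0, 0.5), (0.1, 0.0, 0.6))
# d is about 0.14 m -- the smaller this is, the faster detection converges
```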