In the competitive landscape of fashion e-commerce, augmented reality (AR) has emerged as a game-changing technology that bridges the gap between online browsing and in-store try-ons. Virtual try-on features empower customers to see how garments, accessories, or makeup look on their own bodies in real time, using nothing more than their smartphone camera. For iOS developers, Apple’s ARKit framework provides a robust and well-documented set of tools to build these immersive experiences. This article explores the architecture, implementation steps, and best practices for creating a production-ready virtual try-on feature in a fashion app using ARKit.

Understanding ARKit and Its Role in Fashion Try-Ons

ARKit, introduced in iOS 11, has evolved significantly with each major release. It leverages the device’s camera, motion sensors, and advanced computer vision algorithms to understand the environment and track user movement. For virtual try-on, the most relevant capabilities are:

  • Face tracking – Using the TrueDepth camera (on iPhone X and later) to map 3D facial geometry and expressions. Ideal for eyewear, earrings, hats, or makeup.
  • Body tracking – Introduced in ARKit 3, this feature estimates a human body’s pose in 3D without needing a depth sensor. It enables full-body try-on of clothing items.
  • Scene understanding – ARKit’s ability to detect horizontal and vertical planes, estimate lighting, and perform real-time environment mapping helps virtual garments blend naturally into the user’s surroundings.
  • People occlusion – Available on devices with an A12 Bionic chip or later, this allows virtual objects to appear behind or in front of real people, creating realistic layering effects.

These features make ARKit an ideal foundation for fashion apps that aim to reduce return rates, increase conversion, and offer a novel shopping experience.

Core Requirements for a Virtual Try-On Feature

Before diving into the implementation, it’s important to understand the technical and design prerequisites. A successful try-on experience demands:

  • Hardware compatibility – Face tracking requires a device with a TrueDepth camera (iPhone X/XR/XS and later). Body tracking and people occlusion both require an A12 Bionic chip or newer, and body tracking runs on the rear camera.
  • 3D assets – Realistic, lightweight models of clothing, accessories, or makeup. Each model must be rigged to follow the tracked skeleton or facial blend shapes.
  • Alignment and calibration – Virtual items must accurately align to the user’s body dimensions. This often requires scaling based on detected facial width or shoulder distance.
  • Real-time rendering performance – The AR session should hold a steady 60 FPS; dropped frames make the overlay visibly lag behind the user’s movements. This means optimizing polygon counts, textures, and shaders.
  • User interface – Intuitive controls for selecting products, adjusting sizes, taking snapshots, or adding items to a cart.

Step-by-Step Implementation Guide

Setting Up the AR Session

Every ARKit experience runs on an ARSession. For virtual try-on, choose the configuration that matches the use case: ARFaceTrackingConfiguration for face-based try-ons (glasses, makeup), or ARBodyTrackingConfiguration for full-body clothing. Add an NSCameraUsageDescription entry to Info.plist so iOS can explain the camera request to the user. Run the session in an ARSCNView (SceneKit, for 3D content) or an ARSKView (SpriteKit, for 2D overlays), and implement ARSessionDelegate to receive frame updates and tracking data.

Important: Always check for AR support at runtime through each configuration’s isSupported class property. Not all iOS devices support the required configuration. Fall back gracefully by offering a non-AR experience or telling the user that the feature needs a newer device.
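The runtime check and fallback can be sketched as follows. This is a minimal illustration, assuming an iOS 13 or later deployment target; TryOnMode, chooseMode, and startTryOn are hypothetical names, not ARKit API. The decision logic is kept free of ARKit types so it can be unit-tested off-device.

```swift
import Foundation
#if os(iOS)
import ARKit
#endif

/// Experiences this hypothetical app can offer.
enum TryOnMode: Equatable {
    case face      // glasses, earrings, makeup
    case body      // full-body clothing
    case fallback  // non-AR product preview
}

/// Picks a mode from what the product needs and what the device reports.
func chooseMode(needsBody: Bool, faceSupported: Bool, bodySupported: Bool) -> TryOnMode {
    if needsBody { return bodySupported ? .body : .fallback }
    return faceSupported ? .face : .fallback
}

#if os(iOS)
/// Starts the matching session, or returns false so the caller can show the non-AR UI.
@discardableResult
func startTryOn(in view: ARSCNView, needsBody: Bool) -> Bool {
    let mode = chooseMode(needsBody: needsBody,
                          faceSupported: ARFaceTrackingConfiguration.isSupported,
                          bodySupported: ARBodyTrackingConfiguration.isSupported)
    let configuration: ARConfiguration
    switch mode {
    case .face: configuration = ARFaceTrackingConfiguration()
    case .body: configuration = ARBodyTrackingConfiguration()
    case .fallback: return false
    }
    view.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
    return true
}
#endif
```

A view controller would call startTryOn(in:needsBody:) when the try-on screen appears and present a static product preview whenever it returns false.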

Face Tracking for Accessories and Makeup

ARFaceTrackingConfiguration delivers an ARFaceAnchor whose geometry is a mesh of 1,220 vertices fitted to the detected face. The anchor also exposes blend shapes (e.g., jawOpen, browDownLeft) for animating dynamic content. To overlay glasses, create a 3D model of the frame and parent it to the face anchor’s node. Use the vertex positions to adjust the bridge and temple lengths automatically. For lipstick or eyeshadow, paint a texture in the face mesh’s UV layout and apply it to an ARSCNFaceGeometry, updating the geometry from the anchor on every frame so the makeup follows the user’s expressions.
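A rough sketch of both overlays in one ARSCNViewDelegate follows. FaceTryOnDelegate, faceWidth(of:), and the glassesModelWidth parameter are hypothetical; the makeup texture is assumed to be transparent everywhere except where makeup is drawn. Scaling the frame from the mesh's overall width is one simple stand-in for per-vertex bridge and temple fitting.

```swift
#if os(iOS)
import ARKit
import SceneKit
import UIKit
#endif

/// Width of the face mesh along its local x-axis, in meters.
func faceWidth(of vertices: [SIMD3<Float>]) -> Float? {
    guard let minX = vertices.map({ $0.x }).min(),
          let maxX = vertices.map({ $0.x }).max() else { return nil }
    return maxX - minX
}

#if os(iOS)
final class FaceTryOnDelegate: NSObject, ARSCNViewDelegate {
    private let glasses: SCNNode          // frame model, authored facing +z
    private let glassesModelWidth: Float  // width the model was authored at, in meters
    private let makeupTexture: UIImage    // painted in the face mesh's UV layout

    init(glasses: SCNNode, glassesModelWidth: Float, makeupTexture: UIImage) {
        self.glasses = glasses
        self.glassesModelWidth = glassesModelWidth
        self.makeupTexture = makeupTexture
        super.init()
    }

    // Supplies the node ARKit keeps aligned with the face anchor.
    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        guard anchor is ARFaceAnchor,
              let device = renderer.device,
              let geometry = ARSCNFaceGeometry(device: device) else { return nil }
        geometry.firstMaterial?.diffuse.contents = makeupTexture
        let faceNode = SCNNode(geometry: geometry)
        faceNode.addChildNode(glasses)
        return faceNode
    }

    // Refreshes the mesh each frame and rescales the frame to the measured face width.
    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor,
              let geometry = node.geometry as? ARSCNFaceGeometry else { return }
        geometry.update(from: faceAnchor.geometry)
        if let width = faceWidth(of: faceAnchor.geometry.vertices), glassesModelWidth > 0 {
            glasses.simdScale = SIMD3<Float>(repeating: width / glassesModelWidth)
        }
    }
}
#endif
```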

External resource: Apple’s ARFaceAnchor documentation includes a detailed blend shape reference.

Body Tracking for Clothes

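ARBodyTrackingConfiguration delivers an ARBodyAnchor whose 3D skeleton contains 91 joints. A handful of them, such as the root (hips), head, shoulders, hands, and feet, have named constants in ARSkeleton.JointName; the rest are looked up by name or index in the skeleton definition. Each joint provides a transform (position and orientation) in 3D space. To dress a virtual avatar, you can either: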

  • Render a 3D character – Skinned to the tracked skeleton, then overlay the character with garment meshes. This technique works well for shirts, pants, and dresses.
  • Attach clothing to joints directly – For simpler items like wristbands or hats, parent the model to the relevant joint node.

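Because ARKit fits a standard skeleton to every user, you must scale the character to match the user’s proportions. Measure the distance between the shoulder joints, divide it by the shoulder width of your rigged character, and apply the result as a uniform scale factor. ARBodyAnchor also reports an estimatedScaleFactor (when automatic skeleton scale estimation is enabled) that can feed the same calculation. Adjust the mesh’s position so it stays flush with the user’s body (use the torso center for shirts, hip center for pants).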
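The shoulder-width comparison might look like the sketch below. jointDistance, garmentScale, and shoulderWidth(of:) are hypothetical helpers, and the 0.7 to 1.4 clamp is an assumed guard against noisy frames rather than a value from ARKit. Whether the model-space joints already reflect the user's real size should be verified on a device before relying on the ratio.

```swift
#if os(iOS)
import ARKit
#endif

/// Euclidean distance between two joint positions, in meters.
func jointDistance(_ a: SIMD3<Float>, _ b: SIMD3<Float>) -> Float {
    let d = a - b
    return (d.x * d.x + d.y * d.y + d.z * d.z).squareRoot()
}

/// Uniform scale that maps the rig's shoulder width onto the user's, clamped to a sane range.
func garmentScale(userShoulderWidth: Float,
                  rigShoulderWidth: Float,
                  allowed: ClosedRange<Float> = 0.7...1.4) -> Float? {
    guard userShoulderWidth > 0, rigShoulderWidth > 0 else { return nil }
    let raw = userShoulderWidth / rigShoulderWidth
    return min(max(raw, allowed.lowerBound), allowed.upperBound)
}

#if os(iOS)
/// Shoulder-to-shoulder distance read from the tracked skeleton (model space).
func shoulderWidth(of bodyAnchor: ARBodyAnchor) -> Float? {
    let skeleton = bodyAnchor.skeleton
    guard let left = skeleton.modelTransform(for: .leftShoulder),
          let right = skeleton.modelTransform(for: .rightShoulder) else { return nil }
    let l = left.columns.3   // translation column of the joint transform
    let r = right.columns.3
    return jointDistance(SIMD3<Float>(l.x, l.y, l.z), SIMD3<Float>(r.x, r.y, r.z))
}
#endif
```

The resulting factor can be applied to the character's root node, for example through simdScale, each time a body anchor update arrives.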

Importing and Optimizing 3D Models

3D assets for AR must strike a balance between visual fidelity and performance. Use authoring tools like Blender or Autodesk Maya to create models with fewer than 20,000 triangles per garment. Export in USDZ format, Apple’s recommended format for AR Quick Look and SceneKit scenes. Textures should be at most 1024×1024 (or 2048×2048 for hero items) and compressed to reduce memory. Use physically based rendering (PBR) materials to ensure the cloth reacts naturally to light. For cloth simulation, bake vertex animations offline into morph targets (blend shapes) instead of running real-time physics on the device.
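These budgets are easy to enforce with a small check in an asset pipeline. The sketch below assumes a hypothetical GarmentAsset record filled in by whatever export tooling the team uses; the limits are the ones stated above.

```swift
/// Summary of one exported garment, as an asset pipeline might report it.
struct GarmentAsset {
    let name: String
    let triangleCount: Int
    let largestTextureSide: Int  // pixels
    let isHero: Bool             // hero items get the larger texture budget
}

/// Returns a human-readable message for every budget the asset exceeds.
func budgetViolations(for asset: GarmentAsset) -> [String] {
    var problems: [String] = []
    if asset.triangleCount >= 20_000 {
        problems.append("\(asset.name): \(asset.triangleCount) triangles, budget is under 20000")
    }
    let textureLimit = asset.isHero ? 2048 : 1024
    if asset.largestTextureSide > textureLimit {
        problems.append("\(asset.name): \(asset.largestTextureSide) px texture, limit is \(textureLimit)")
    }
    return problems
}
```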

Aligning Virtual Items with User Anatomy

Accurate alignment is the hardest part of virtual try-on. For face tracking, use the face anchor’s transform to place the item at the correct offset. For example, glasses should sit roughly 20 mm (0.02 in ARKit’s meter-based units) in front of the eyes along the face anchor’s local z-axis. Adjust based on the bridge vertex positions. For body tracking, anchor pants to the hips joint (the skeleton root) and tops to the neck or chest center. Because ARKit does not provide body dimensions (like waist circumference), you’ll need to approximate:

  • Use the detected face width (from left cheek to right cheek) as a reference for upper body scale.
  • Alternatively, ask the user to input their height and weight during onboarding, then map to a standard mannequin scale.

Dynamic resizing – Allow users to pinch-to-zoom or drag sliders to fine‑tune the fit. Persist these adjustments per product.
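The glasses offset and the height-based scale can be sketched as below. glassesPosition, mannequinScale, and placeGlasses are hypothetical helpers; the 1.80 m mannequin height is an assumed authoring size for illustration, not an ARKit constant. The glasses node is assumed to be a child of the face anchor's node, so positions are in face-anchor space.

```swift
#if os(iOS)
import ARKit
import SceneKit
#endif

/// Position for the frame: midway between the eyes, pushed forward along +z (meters).
func glassesPosition(leftEye: SIMD3<Float>,
                     rightEye: SIMD3<Float>,
                     forwardOffset: Float = 0.020) -> SIMD3<Float> {
    let midpoint = (leftEye + rightEye) * 0.5
    return midpoint + SIMD3<Float>(0, 0, forwardOffset)
}

/// Scale for a mannequin authored at `mannequinHeight`, given the height the user entered (meters).
func mannequinScale(userHeight: Float, mannequinHeight: Float = 1.80) -> Float? {
    guard userHeight > 0, mannequinHeight > 0 else { return nil }
    return userHeight / mannequinHeight
}

#if os(iOS)
/// Reads the eye positions from the anchor and moves the glasses node accordingly.
func placeGlasses(_ glasses: SCNNode, on faceAnchor: ARFaceAnchor) {
    let l = faceAnchor.leftEyeTransform.columns.3
    let r = faceAnchor.rightEyeTransform.columns.3
    glasses.simdPosition = glassesPosition(leftEye: SIMD3<Float>(l.x, l.y, l.z),
                                           rightEye: SIMD3<Float>(r.x, r.y, r.z))
}
#endif
```

A user-controlled slider can then multiply the computed scale or add to forwardOffset, with the chosen value stored per product.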

Handling Lighting and Shadows

ARKit estimates ambient light intensity and color temperature for each frame and exposes them through ARLightEstimate. Copy these values onto the intensity and temperature of an SCNLight (directional or ambient) so virtual garments are lit like the real scene. To cast realistic shadows, enable shadow casting on the light and set categoryBitMask so that shadows fall only on virtual objects (not on the real user, which would look unnatural). For occlusion (making the real user appear in front of the virtual garment), add .personSegmentationWithDepth to the configuration’s frameSemantics after confirming support with supportsFrameSemantics(_:). This produces a depth-based segmentation of the people in the camera feed, which allows correct z-ordering.
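A minimal sketch of these three pieces follows; it only runs on an iOS device. The function names and the virtualCategory parameter are hypothetical. If the light is updated by hand like this, ARSCNView's automaticallyUpdatesLighting should be turned off so the two mechanisms do not compete.

```swift
import ARKit
import SceneKit

/// Copies ARKit's per-frame light estimate onto a SceneKit light.
func applyLightEstimate(from frame: ARFrame, to light: SCNLight) {
    guard let estimate = frame.lightEstimate else { return }
    light.intensity = estimate.ambientIntensity           // lumens; about 1000 is neutral
    light.temperature = estimate.ambientColorTemperature  // kelvin; about 6500 is neutral white
}

/// Turns on depth-aware people occlusion when the device and configuration allow it.
@discardableResult
func enablePeopleOcclusion(on configuration: ARConfiguration) -> Bool {
    let semantics: ARConfiguration.FrameSemantics = .personSegmentationWithDepth
    guard type(of: configuration).supportsFrameSemantics(semantics) else { return false }
    configuration.frameSemantics.insert(semantics)
    return true
}

/// A shadow-casting directional light that affects only nodes in the given category.
func makeShadowLight(virtualCategory: Int) -> SCNLight {
    let light = SCNLight()
    light.type = .directional
    light.castsShadow = true
    light.categoryBitMask = virtualCategory  // lights only nodes whose mask overlaps this one
    return light
}
```

applyLightEstimate(from:to:) would typically be called from session(_:didUpdate:) with the current frame, and enablePeopleOcclusion(on:) just before the session is run.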

External resource: Apple’s sample project “Occluding Virtual Content with People” is an excellent reference.

User Interaction and UI

The try‑on interface should be minimal but responsive. Common UI elements include:

  • Product carousel – Swipeable thumbnails of available items.
  • Color and size pickers – Allow users to change the texture or adjust fit.
  • Snapshot button – Capture the current AR view and save to the photo library or share via social media.
  • Reset / recalibrate – If tracking drifts, provide a button to restart the session or recenter the avatar.

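Keep touch interactions intuitive. A single tap on a product selects it; a long press displays details. To detect taps on virtual items, use the SceneKit hit test that ARSCNView inherits, hitTest(_:options:). ARKit’s own raycasts (raycastQuery(from:allowing:alignment:)) find real-world surfaces, not virtual nodes.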
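Tap selection on a virtual item could be wired up roughly as follows; this sketch only runs on iOS. TryOnViewController and select(productID:) are hypothetical, and the sketch assumes each product's root node is named "product-<id>".

```swift
import ARKit
import SceneKit
import UIKit

final class TryOnViewController: UIViewController {
    let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.frame = view.bounds
        sceneView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(sceneView)
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        sceneView.addGestureRecognizer(tap)
    }

    @objc private func handleTap(_ gesture: UITapGestureRecognizer) {
        let point = gesture.location(in: sceneView)
        // SceneKit hit test: intersects virtual nodes, not real-world surfaces.
        guard let hit = sceneView.hitTest(point, options: nil).first else { return }
        // Walk up from the tapped mesh to the product's root node.
        var node: SCNNode? = hit.node
        while let current = node {
            if let name = current.name, name.hasPrefix("product-") {
                select(productID: String(name.dropFirst("product-".count)))
                return
            }
            node = current.parent
        }
    }

    private func select(productID: String) {
        // Placeholder: update the carousel, highlight the item, and so on.
        print("Selected product \(productID)")
    }
}
```

A UILongPressGestureRecognizer can reuse the same hit-test path to show product details instead of selecting.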

Best Practices for Performance and Realism

To maintain a smooth, lifelike experience:

  • Use Level of Detail (LOD) – Provide lower‑poly versions of models for when the user is far from the camera.
  • Batch render calls – Merge static meshes and use texture atlases.
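  • Optimize animations – Tracked joints have to be updated every frame, but secondary motion such as fabric sway can be baked into keyframe animations (for example, CAKeyframeAnimation) instead of being computed live.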
  • Test on a range of devices – Performance can vary dramatically between an iPhone SE and an iPhone 15 Pro Max. Set different quality budgets accordingly.
  • Monitor frame drops – Profile with Instruments and ARSCNView’s showsStatistics overlay. Call ARSession.setWorldOrigin(relativeTransform:) sparingly, since it redefines the coordinate space that every anchor is expressed in.
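For in-app monitoring, a small counter fed with render timestamps is often enough. The FrameDropMonitor type below is a hypothetical sketch; treating any gap longer than 1.5 frame intervals as a drop is an assumed threshold, not an ARKit rule.

```swift
import Foundation

/// Counts frames whose spacing exceeds the target interval by more than 50%.
struct FrameDropMonitor {
    let targetInterval: TimeInterval
    private(set) var frames = 0
    private(set) var dropped = 0
    private var last: TimeInterval?

    init(targetFPS: Double = 60) {
        targetInterval = 1.0 / targetFPS
    }

    /// Call once per rendered frame with a monotonically increasing timestamp (seconds).
    mutating func record(timestamp: TimeInterval) {
        defer { last = timestamp }
        guard let previous = last else { return }
        frames += 1
        if timestamp - previous > targetInterval * 1.5 { dropped += 1 }
    }

    /// Fraction of recorded intervals that counted as dropped.
    var dropRate: Double {
        frames == 0 ? 0 : Double(dropped) / Double(frames)
    }
}
```

One option is to call record(timestamp:) from renderer(_:updateAtTime:) and switch to a lower-detail model set when dropRate stays above a chosen threshold on slower devices.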

Overcoming Common Challenges

Common challenges and their solutions:

  • Clothing clipping through the user’s body – Use a depth mask to hide geometry behind the user. For face items, push the model slightly forward or use SCNDistanceConstraint.
  • Tracking loss when the user moves quickly – Show a prompt to move slowly while the tracking state is .limited, and reset the session if tracking does not recover.
  • Lighting mismatch between real and virtual – Set environmentTexturing to .automatic on a configuration that supports it so ARKit generates reflection textures from the real environment. Fall back to a generic environment map if none is available.
  • High memory usage from multiple garment models – Load only the selected product’s model, for example through SCNSceneSource, and release unused assets.
  • User privacy concerns – Always obtain explicit consent for camera use. Delete face mesh data after the session ends. Do not upload raw depth maps to servers.
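For the tracking-loss case, one possible policy is to warn immediately and reset only if tracking stays degraded for a while. The sketch below is an assumption-laden illustration: TrackingQuality, RecoveryAction, and TrackingRecovery are hypothetical types, and the 3-second grace period is an arbitrary starting point.

```swift
import Foundation

/// Simplified mirror of the camera's tracking state.
enum TrackingQuality { case normal, limited, notAvailable }

/// What the UI layer should do after each update.
enum RecoveryAction: Equatable { case none, showSlowDownHint, resetSession }

struct TrackingRecovery {
    let resetAfter: TimeInterval
    private var degradedSince: TimeInterval?

    init(resetAfter: TimeInterval = 3) {
        self.resetAfter = resetAfter
    }

    /// Feed the current quality and a timestamp (seconds) once per frame.
    mutating func update(_ quality: TrackingQuality, at time: TimeInterval) -> RecoveryAction {
        guard quality != .normal else {
            degradedSince = nil
            return .none
        }
        let start = degradedSince ?? time
        if time - start >= resetAfter {
            degradedSince = nil  // restart the timer after requesting a reset
            return .resetSession
        }
        degradedSince = start
        return .showSlowDownHint
    }
}
```

In an app, session(_:didUpdate:) could map frame.camera.trackingState to TrackingQuality, pass frame.timestamp as the time, and rerun the configuration with .resetTracking when the action is .resetSession.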

Conclusion and Future Directions

Building a virtual try-on feature with ARKit empowers fashion apps to deliver an immersive, personalized shopping journey that drives engagement and reduces returns. By understanding the framework’s tracking capabilities, investing in high-quality 3D assets, and implementing robust alignment and occlusion logic, developers can create experiences that feel both magical and practical. As Apple continues to advance AR capabilities, with LiDAR scanners on newer devices and improved scene understanding, the possibilities for realistic try-ons will only grow. For brands looking to differentiate in a crowded market, now is the time to invest in AR-driven fashion experiences.

Further reading: Apple ARKit overview and the official ARKit documentation.