VR teleop · trajectory logs · replay on hardware · vision-conditioned

VR Teleoperation for Robot Data Collection & Demonstrations

We collect real-world robot demonstrations/trajectories on a UR7e using Meta Quest 3 controllers, store demonstrations in readable logs, and replay motion with conditional execution based on sensory input.

Platform: UR7e
Interface: Meta Quest 3 (SteamVR/ALVR)
Stack: ROS 2 + MoveIt
Outputs: .txt logs (+ dataset placeholders)

Introduction

Goal, Motivation & Applications

Project Goal

Modern robot learning policies (including imitation learning and diffusion models) are constrained by a scarcity of diverse, high-quality training data, particularly for 7-DoF manipulation tasks. To address this, we developed a low-cost teleoperation interface that combines a Meta Quest 3 virtual reality headset with a UR7e robotic arm, allowing a human operator to perform manipulation tasks naturally in 3D space while synchronized joint and gripper data are recorded for training modern imitation learning policies.

Motivation & Technical Challenges

Modern robot learning (e.g., Diffusion Policies, Open X-Embodiment) is bottlenecked by the lack of high-quality demonstration data. Traditional collection methods such as kinesthetic teaching or "puppet" arms are either dangerous, unintuitive, expensive, or produce noisy and inconsistent results for complex tasks.

The Core Problem: This project bridges two distinct coordinate systems: the unconstrained Euclidean space of a VR controller and the kinematic constraints of an industrial robot. The key technical challenge was developing a "Logical Home Pairing" algorithm (Absolute Pose Mapping) to translate user hand poses into safe and accurate robot trajectories in real-time without risking singularities or sudden jumps.

Real-World Applications

Beyond collecting training data for AI, this teleoperation architecture has immediate utility in several high-impact domains:

  • Data Collection: Enabling scalable data generation for Open X-Embodiment initiatives.
  • Hazardous Environments: Allowing humans to handle toxic materials or nuclear waste from a safe distance using natural hand motions.
  • Remote Maintenance: Enabling technicians to repair machinery in cleanrooms or offshore rigs via VR digital twins.
  • Space Robotics: Facilitating control of orbital manipulators where traditional joystick interfaces are too abstract for complex dexterity tasks.

Design Strategy

Criteria, Choices & Trade-offs

Design Criteria

To act as a viable data collection platform, the system had to meet three critical performance benchmarks:

  • Latency: End-to-end delay must be <100ms to prevent user motion sickness and over-correction.
  • Safety: The system must prevent "jumps" when the VR controller is first activated.
  • Fidelity: Recorded trajectories must preserve high-frequency movements and gripper actions for successful replay.

Design Choices & Trade-offs

We prioritized spatial intuition and operator ergonomics to ensure high-quality data collection. With this in mind, we implemented an Absolute Pose Mapping strategy rather than Relative Velocity Control.

Why we chose it:

Absolute mapping "locks" the robot's end-effector to the user's hand, enabling intuitive spatial understanding. Velocity control, by contrast, often suffers from drift, which makes precise stacking tasks frustrating for the user.

The Trade-off:

Workspace vs. Precision: With a 1:1 mapping, the user is physically limited by their own arm reach. We gave up the ability to extend the workspace indefinitely by repeatedly resetting the controller's reference pose, in exchange for higher precision and safety within a fixed workspace.

Engineering Impact

Our architecture moves beyond academic theory to solve the "Data Bottleneck" in industrial robotics. We prioritized portability and fault-tolerance to create a deployable solution.

Scalable Data Pipeline (Quest 3):

We replaced capital-intensive motion capture labs (e.g., OptiTrack) with consumer inside-out tracking. This drastically reduces the "cost per datum" and allows for decentralized data collection without specialized facility infrastructure.

Solving the "Long Tail" (Human-in-the-Loop):

Hard-coded automation fails on edge cases. Our teleoperation stack ensures reliability by providing a remote human fallback interface, allowing operators to resolve failures in hazardous or unstructured environments without stopping the line.

Implementation

Full Pipeline

The complete pipeline showing real-time teleoperation (top) and the vision-based trajectory replay loop using Color Detection (bottom): Meta Quest 3 → ALVR (Wireless) → SteamVR / OpenVR → Linux Workstation → UR7e Manipulator.

Software Architecture

Meta Quest 3 TeleOp

The system implements a 72 Hz synchronous control loop. To ensure intuitive handling, we decouple the absolute coordinate systems using a "Logical Home" calibration. This allows the operator to map a comfortable hand position (P_vr) to the robot's safety home (P_robot) instantly.

Equation showing T_target calculation based on calibration offset and live VR pose
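Concretely, the mapping can be written as follows. This is a sketch in homogeneous-transform notation and the exact convention in our code may differ: T^vr_home and T^robot_home are the controller and robot poses captured at calibration, and T^vr_live is the current controller pose.

$$T_{\text{target}} = T^{\text{robot}}_{\text{home}} \left(T^{\text{vr}}_{\text{home}}\right)^{-1} T^{\text{vr}}_{\text{live}}$$

At 72 Hz, each new controller pose yields a new T_target, so the end-effector tracks the hand's displacement relative to the paired home poses rather than its absolute position in the room.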

Record Trajectory

We provide users with the ability to record arbitrary teleoperation trajectories through a custom ROS 2 recording pipeline. Trajectory recording is initiated by launching the teleo_recorder ROS 2 node that we implemented, followed by invoking ros2 service call /start_recording std_srvs/srv/Trigger "{}". Once triggered, the recording service begins capturing the robot command messages published by the teleoperation stack (i.e., the command topic used to drive the robot during live control). These messages reflect the real-time motion generated during teleoperation and are logged continuously for the duration of the recording session. The resulting trajectories can then be replayed or processed downstream, enabling repeatable execution and offline analysis of teleoperated demonstrations.
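A minimal sketch of such a recorder node is shown below (the /ur_command topic name and JointState message type are assumptions standing in for the actual command topic; the real teleo_recorder may log additional fields such as gripper state):

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import JointState
from std_srvs.srv import Trigger

class TeleopRecorder(Node):
    def __init__(self):
        super().__init__('teleo_recorder')
        self.recording = False
        self.logfile = None
        # Assumed command topic/message; substitute the topic that drives the robot.
        self.create_subscription(JointState, '/ur_command', self.on_command, 10)
        self.create_service(Trigger, '/start_recording', self.on_start)

    def on_start(self, request, response):
        # Open a fresh log and begin capturing commands.
        self.logfile = open('trajectory.txt', 'w')
        self.recording = True
        response.success = True
        response.message = 'Recording started'
        return response

    def on_command(self, msg):
        if not self.recording:
            return
        # One line per sample: timestamp followed by the commanded joint positions.
        t = self.get_clock().now().nanoseconds * 1e-9
        self.logfile.write(f"{t:.3f} " + " ".join(f"{p:.6f}" for p in msg.position) + "\n")

def main():
    rclpy.init()
    rclpy.spin(TeleopRecorder())

if __name__ == '__main__':
    main()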


Replay Trajectory

The user can replay any recorded trajectory by passing the path to its .txt file to a ROS 2 node. The node populates a job queue with the joint positions recorded at each 0.5-second sample, then executes a trajectory to each of those joint positions in turn, producing a smooth replay of the entire motion.
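A minimal sketch of the replay logic, assuming each line of the log holds a timestamp followed by joint positions (the send_joint_goal helper is hypothetical and stands in for the MoveIt/UR-driver call that moves the arm to a waypoint):

import sys
import time
from collections import deque

def load_waypoints(path):
    # Each log line: timestamp followed by the sampled joint positions.
    waypoints = deque()
    with open(path) as f:
        for line in f:
            values = [float(x) for x in line.split()]
            waypoints.append(values[1:])
    return waypoints

def replay(path, dt=0.5):
    waypoints = load_waypoints(path)
    while waypoints:
        target = waypoints.popleft()
        send_joint_goal(target)  # hypothetical helper: commands the UR7e toward this sample
        time.sleep(dt)           # samples were recorded every 0.5 s

if __name__ == '__main__':
    replay(sys.argv[1])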

Continuous Trajectory Replay With Color Sensing

Another option is to run the trajectory replay continuously with color sensing. First, a ROS 2 node is launched that reads from the Intel RealSense D435i camera and publishes the detected color of objects to a ROS 2 topic. For color detection, we filter in the HSV color space, with an adjustable depth range (0.75 m to 1.1 m in our demo) and adjustable area thresholds that control the size of objects that can be detected. A second ROS 2 node then subscribes to the camera's output. Once a color is detected, it runs a full execution: it loads the trajectory .txt file corresponding to the detected color, populates a job queue with the joint positions recorded at each 0.5-second sample, and executes trajectories to reach each of those joint positions one by one, smoothly replaying the entire trajectory. Once the trajectory has been replayed, the node returns to detecting colors and can replay another trajectory if another color is found.
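A minimal sketch of the color/depth/area filtering step, assuming an aligned depth image in meters and illustrative HSV bounds for a red object (the real camera_detection node also handles RealSense streaming and publishes the result to a ROS 2 topic):

import cv2
import numpy as np

# Illustrative HSV bounds for red; tuned per environment in practice.
HSV_LOW, HSV_HIGH = np.array([0, 120, 70]), np.array([10, 255, 255])
DEPTH_MIN, DEPTH_MAX = 0.75, 1.1   # meters, the range used in our demo
AREA_MIN, AREA_MAX = 500, 50000    # contour-area bounds in pixels (assumed values)

def color_detected(bgr_image, depth_m):
    """Return True if a target-colored object of valid size lies within the depth range."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    color_mask = cv2.inRange(hsv, HSV_LOW, HSV_HIGH)
    depth_mask = ((depth_m > DEPTH_MIN) & (depth_m < DEPTH_MAX)).astype(np.uint8) * 255
    mask = cv2.bitwise_and(color_mask, depth_mask)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return any(AREA_MIN < cv2.contourArea(c) < AREA_MAX for c in contours)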

Vision Pipeline
The complete vision pipeline involves running a camera_detection ROS 2 node that detects the color of a target object and publishes it to the detected_color topic. A second ROS 2 node, full_execute, subscribes to the detected_color topic and, based on the color detected, populates a job queue to replay the corresponding recorded trajectory before returning to listen for the next detected color.
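A minimal sketch of the dispatch logic in full_execute (the detected_color topic is assumed to carry a std_msgs String; replay_trajectory is a hypothetical helper standing in for the job-queue replay shown earlier, and the color-to-file mapping is illustrative):

import rclpy
from rclpy.node import Node
from std_msgs.msg import String

# Illustrative mapping from detected color to recorded trajectory log.
TRAJECTORIES = {'red': 'red_traj.txt', 'blue': 'blue_traj.txt'}

class FullExecute(Node):
    def __init__(self):
        super().__init__('full_execute')
        self.busy = False
        self.create_subscription(String, 'detected_color', self.on_color, 10)

    def on_color(self, msg):
        path = TRAJECTORIES.get(msg.data)
        if path is None or self.busy:
            return
        self.busy = True
        replay_trajectory(path)  # hypothetical helper: replays the recorded trajectory
        self.busy = False        # resume listening for the next detected color

def main():
    rclpy.init()
    rclpy.spin(FullExecute())

if __name__ == '__main__':
    main()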

Results & Demos

VR Teleoperation
Autonomous Replay

Conclusion

Results

Overall, our system successfully met our core design criteria. We demonstrated a complete end-to-end VR-based teleoperation pipeline from the Meta Quest 3 to the UR7e robot arm, including real-time control, trajectory recording, and reliable trajectory replay. This outcome closely aligns with our initial project goals and the scope refined after meeting with our TA.

In particular, the system achieved stable and intuitive teleoperation, allowing a human operator wearing the Quest 3 headset to control the UR7e’s end-effector motion in real time with low perceptual latency. The mapping between VR controller motion and robot motion was sufficiently smooth to enable precise manipulation behaviors, validating our design choice of using VR as a natural and expressive human–robot interface. Users were able to guide the robot through meaningful manipulation trajectories with minimal training.

Trajectory recording and replay functioned as intended and served as a strong indicator of system robustness. Recorded demonstrations could be replayed consistently, with the robot closely following the original motion paths. This confirms that our trajectory encoding, time synchronization, and playback solutions preserved the essential structure of the human demonstrations. These results support our broader motivation of using teleoperation as a high-quality data collection method.

Challenges/Difficulties

  • Hardware: Getting the Quest headset and controllers reliably connected to the same local network as the lab computer.
  • Software: Ensuring smooth trajectory replay through consistent timing between the recorded samples and the playback speed.
  • Sensing: Ensuring trajectories can run continuously based on sensory input in different environments (lighting, distance, shadows, etc.).

Flaws/Future Considerations

One flaw in our system is that it lacks a safety net to prevent collisions if the human operator makes a mistake while controlling it. For example, if the operator drops a controller, the arm will follow it downward and may crash into the table. To prevent this issue from happening, we aim to add robot workspace bounding and proximity checks that block unsafe motions. Furthermore, instead of a basic color-sensing pipeline, we aim to incorporate reinforcement learning to build upon the teleoperated demonstrations collected in this project. Rather than relying on task-specific object placement, learned policies would adapt the demonstrated trajectories to new object poses and environmental variations, improving robustness and generalization.

Project Team

Roles & Contributions

Samuel Mankoff

Master’s Student
Mechanical Engineering

Interested in robotic control, perception, and learning. Focused on designing robots that can replicate human manipulation skills.

Key Contribution: Led development of the VR-based teleoperation pipeline (Quest 3). Assisted with robot control logic and trajectory mapping.

Akshaj Gupta

Sophomore
EECS

Focused on control and learning algorithms for robotic arms and humanoids to perform difficult manipulation tasks.

Key Contribution: Developed the Computer Vision Pipeline (HSV/Depth) and implemented the trajectory saving/replay logic.

Kourosh Salahi

Junior
EECS & Business

Focused on machine learning applications for robotics, specifically control strategies for humanoid platforms.

Key Contribution: Contributed to the CV sensing pipeline for object detection and performed data collection trials via teleoperation.

Ziteng (Ender) Ji

Senior
Computer Science

Research interests lie in robot learning, specifically enabling robots to acquire sophisticated skills through data-driven approaches.

Key Contribution: Architected the Meta Quest connection with UR7e and assisted in the physical setup of the teleoperation workspace.

Aaron Zheng

Senior
EECS

Interested in humanoid robots and LLMs. Focusing on learning-based policies for motion execution in simulation.

Key Contribution: Integrated the Meta Quest 3 (ROS 2 → SteamVR → ALVR) for UR7e teleoperation. Made color detection robust for object detection and contributed to the teleop and recording scripts.

Additional Materials

Documentation & Resources

Project Presentation

Our comprehensive final presentation covering the complete development process, technical challenges, experimental results, and future work:

ME 206A Final Presentation - VR Teleoperation for Robot Learning

Complete technical overview including motivation, design decisions, implementation details, experimental results, and team contributions.



Team

B.S.: Aaron Zheng, Akshaj Gupta, Kourosh Salahi, Ziteng (Ender) Ji
M.S.: Samuel Mankoff

Acknowledgements

We would like to thank the EECS 106A Course Staff for helping us throughout this project.