Robot Arm Pick and Place

A testing platform for evaluating foundation models and camera configurations on robotic pick and place tasks. It supports multiple state-of-the-art models and reconfigurable camera setups for comparing training performance.
Project Overview
This project is a testing platform for comparing foundation models and robot learning approaches on real-world pick and place tasks. The models under test include Pi Zero, Octo, GR00T N1.5, ACT, Diffusion Policy, and SmolVLA. The platform can also be reconfigured between 2-camera and 5-camera setups to determine which camera angles work best for training. In addition, it supports experiments with different teleoperation (telepresence) approaches using MoveIt, cuRobo, and VR devices including Apple Vision Pro, as well as tests of augmented data for model training.
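As a rough illustration of switching between the 2-camera and 5-camera setups, the sketch below describes each setup as a named list of views. The view names (`wrist`, `front`, `left`, `right`, `top`) and the `observation.images.<view>` key scheme are assumptions made for this example; the actual mounting positions and key names used by the project are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraSetup:
    """A named set of camera views used for one data-collection run."""
    name: str
    views: tuple


# Hypothetical view names; the real camera positions are not specified.
TWO_CAMERA = CameraSetup("2-camera", ("wrist", "front"))
FIVE_CAMERA = CameraSetup("5-camera", ("wrist", "front", "left", "right", "top"))


def observation_keys(setup):
    """Build one image key per view (assumed naming scheme)."""
    return [f"observation.images.{view}" for view in setup.views]


for setup in (TWO_CAMERA, FIVE_CAMERA):
    print(setup.name, observation_keys(setup))
```

Keeping the setup in one immutable object makes it easy to record which configuration produced a given set of episodes.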
Key Features:
- Foundation model testing platform (Pi Zero, Octo, GR00T N1.5, ACT, Diffusion Policy, SmolVLA, etc.)
- Flexible camera configuration testing (2-camera and 5-camera setups)
- Gripper control for secure handling
- Teleoperation (telepresence) experimentation (MoveIt, cuRobo, VR with Apple Vision Pro)
- Augmented data testing and generation
- Dataset collection and model evaluation framework
- Comparative analysis of different model architectures
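One simple way to compare models is to log each evaluation trial as a (model, success) pair and compute a per-model success rate. The sketch below assumes that logging format. The function name `success_rates` and the trial entries are placeholders for illustration, not measured results from this project.

```python
from collections import defaultdict


def success_rates(trials):
    """Return {model: fraction of successful trials} from (model, success) pairs."""
    counts = defaultdict(lambda: [0, 0])  # model -> [successes, total trials]
    for model, success in trials:
        counts[model][0] += int(success)
        counts[model][1] += 1
    return {model: s / n for model, (s, n) in counts.items()}


# Placeholder trial log with made-up outcomes, only to show the data shape.
example_trials = [
    ("model_a", True), ("model_a", False),
    ("model_b", True), ("model_b", True),
]
print(success_rates(example_trials))  # {'model_a': 0.5, 'model_b': 1.0}
```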
Dataset: RealSense Black-Green Background LeRobot
This project uses a custom dataset hosted on Hugging Face:
https://huggingface.co/datasets/tshiamor/realsense-black-green-background-lerobot
Description:
The realsense-black-green-background-lerobot dataset contains episodes recorded with a robot arm and multiple cameras, intended for robotics and imitation learning tasks. The videos show the arm (tagged phosphobot, so100, phospho-dk) performing pick and place operations against a black and green background. The dataset is compatible with LeRobot and RLDS and can be used to train policies with imitation learning.
Dataset Specifications:
- Number of episodes: 300
- Modalities: Video
- Tags: phosphobot, so100, phospho-dk
- Size: < 1K (Hugging Face size category)
- Direct dataset page: https://huggingface.co/datasets/tshiamor/realsense-black-green-background-lerobot
- Browse files: "Files and versions" tab on the dataset page
This dataset is intended for robotics research, in particular for training and evaluating vision-based policies on pick and place tasks. For more details, visit the dataset page.
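To work with the files locally, one option is to download the dataset repository with the `huggingface_hub` package. This is a minimal sketch rather than the project's own loading code: the helper names `dataset_url` and `download_dataset` are made up for this example, and it assumes `huggingface_hub` is installed and the machine has network access.

```python
REPO_ID = "tshiamor/realsense-black-green-background-lerobot"


def dataset_url(repo_id):
    """Return the Hugging Face dataset page URL for a repository id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def download_dataset(repo_id=REPO_ID):
    """Download the dataset repository and return the local cache path.

    Requires `pip install huggingface_hub` and network access.
    """
    # Imported lazily so this module loads even without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, repo_type="dataset")


print(dataset_url(REPO_ID))
# To fetch the files, call: local_path = download_dataset()
```

The download is left as an explicit call so that importing or running the snippet does not start a large transfer by accident.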