VLA Manipulation Tools
RoboCrew allows your robot to perform complex physical tasks—like grabbing objects—by utilizing Vision-Language-Action (VLA) policies as tools. These tools bridge the gap between high-level LLM reasoning and low-level motor control.
1. Prerequisites
Before the agent can use an arm, you must have a VLA server running in a separate terminal. RoboCrew uses the LeRobot framework for this.
Three recommended ways to set up the server
A. Directly on the robot
This method hosts the VLA policy on the robot’s onboard computer, providing the lowest possible latency for real-time motor control. It is the most robust option, but realistically a Raspberry Pi 5 is too weak to run it.
When setting up the VLA tool, set server_address="0.0.0.0:8080".
B. Local server
By running the policy on a powerful workstation within the same local network, you can leverage high-end GPUs while keeping the robot lightweight.
When setting up the VLA tool, set server_address="your-server-ip".
C. External server (Tailscale recommended)
Using a VPN like Tailscale allows you to connect to a remote cloud server or a distant workstation securely over the internet.
When setting up the VLA tool, set server_address="your-tailscale-server-ip".
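Whichever option you choose, it can help to confirm that the address is reachable before handing it to the agent. The following is a minimal sketch using only the Python standard library; it is not part of RoboCrew. The helper names split_server_address and server_is_reachable are hypothetical, and the default port of 8080 is an assumption taken from the example in option A.

```python
import socket


def split_server_address(server_address: str, default_port: int = 8080) -> tuple[str, int]:
    """Split "host" or "host:port" into (host, port).

    Hypothetical helper: the default port is an assumption, and IPv6
    literals are not handled.
    """
    host, sep, port = server_address.rpartition(":")
    if not sep:  # no colon, so the whole string is the host
        return server_address, default_port
    return host, int(port)


def server_is_reachable(server_address: str, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to the address can be opened."""
    host, port = split_server_address(server_address)
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused, timed out, or host could not be resolved
        return False
```

A successful TCP connection only shows that something is listening at that address; it does not confirm that the listener is the VLA server or that a policy is loaded.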
2. Creating a VLA Tool
You define a manipulation tool using the create_vla_single_arm_manipulation factory function. This binds a specific pretrained policy to a tool the AI agent can call.
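To illustrate what "binding a policy to a tool" means, here is a toy sketch of the factory pattern in plain Python. It is not RoboCrew's implementation and does not reflect the real signature of create_vla_single_arm_manipulation; make_manipulation_tool, fake_run_policy, and the policy path are all hypothetical names chosen for the example.

```python
from typing import Callable


def make_manipulation_tool(
    tool_name: str,
    policy_path: str,
    server_address: str,
    run_policy: Callable[[str, str, str], str],
) -> Callable[[str], str]:
    """Return a tool function with the policy and server fixed in advance."""

    def tool(instruction: str) -> str:
        # The agent supplies only the instruction; everything else
        # was bound when the tool was created.
        return run_policy(server_address, policy_path, instruction)

    tool.__name__ = tool_name
    return tool


def fake_run_policy(server_address: str, policy_path: str, instruction: str) -> str:
    # Stand-in for a real request to a VLA server (assumption for this sketch).
    return f"[{policy_path} @ {server_address}] {instruction}"


grab_cup = make_manipulation_tool(
    "grab_cup", "example-user/grab-cup-policy", "0.0.0.0:8080", fake_run_policy
)
result = grab_cup("pick up the red cup")
```

The point of the pattern is that the agent only sees a named, single-purpose tool, while the policy checkpoint and server address stay fixed inside it.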