Frequently asked questions
What Reflex is, how fast it runs, what it supports, and how to ship your first model.
What is Reflex?
Reflex is a low-latency GPU cloud built for robotics. Train, fine-tune, and serve vision-language-action (VLA) models, then stream inference to your robots in real time, with no MLOps stack to run.
How fast is Reflex inference?
State-of-the-art for cloud robot inference. Reflex's fused kernels beat torch.compile on H100s, and a full round trip from robot observation to returned action takes under 30 ms, faster than running the model on the robot itself.
Which robot models does Reflex support?
pi0.5, pi0.7-flash, ACT, and your own custom VLA. Fine-tune any of them, or bring weights you already trained.
Do I need a GPU on the robot?
No. Your robot streams camera frames and state to a colocated Reflex GPU pool over a single WebSocket, and actions stream back inside the control loop's latency budget. You can drop the onboard GPU entirely.
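To make "inside the control loop's latency budget" concrete, here is a minimal Python sketch that checks whether a cloud round trip fits inside one control period. The function names and the control rates are illustrative assumptions, not part of the Reflex SDK; only the 30 ms figure comes from this page.

```python
def control_period_ms(control_hz: float) -> float:
    """Length of one control-loop tick in milliseconds."""
    if control_hz <= 0:
        raise ValueError("control_hz must be positive")
    return 1000.0 / control_hz


def fits_control_budget(round_trip_ms: float, control_hz: float) -> bool:
    """True if an observation-to-action round trip finishes within one tick.

    Hypothetical helper for illustration; not a Reflex SDK function.
    """
    return round_trip_ms <= control_period_ms(control_hz)


# Example rates are assumptions; substitute your robot's actual control rate.
for hz in (10, 30, 50):
    ok = fits_control_budget(30.0, hz)
    print(f"{hz} Hz loop, period {control_period_ms(hz):.1f} ms, 30 ms round trip fits: {ok}")
```

Under these assumptions, a 30 ms round trip fits a 30 Hz loop (about 33.3 ms per tick) but not a 50 Hz loop (20 ms per tick), so it is worth measuring your own round-trip time against your control rate before removing the onboard GPU.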
How does training and billing work?
Fine-tune on managed NVIDIA GPUs (GB200, B300, B200, H200, H100) in minutes. You pay per second of compute, not per node. Training is in beta.
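As a rough illustration of per-second billing, the Python sketch below computes the cost of a short fine-tuning run. The hourly rate is a made-up placeholder, not a Reflex price, and the comparison assumes the alternative bills in whole GPU-hours; both are assumptions for illustration only.

```python
import math

# Placeholder rate for illustration only. This is NOT a real Reflex price.
HYPOTHETICAL_RATE_PER_GPU_HOUR = 3.60


def per_second_cost(seconds: int, gpus: int, rate_per_gpu_hour: float) -> float:
    """Cost when every GPU is billed for exactly the seconds it ran."""
    return seconds * gpus * rate_per_gpu_hour / 3600.0


def whole_hour_cost(seconds: int, gpus: int, rate_per_gpu_hour: float) -> float:
    """Cost when usage is rounded up to whole hours (assumed comparison model)."""
    hours = math.ceil(seconds / 3600)
    return hours * gpus * rate_per_gpu_hour


# A hypothetical 10-minute fine-tune on 8 GPUs.
run_seconds, gpu_count = 600, 8
print(per_second_cost(run_seconds, gpu_count, HYPOTHETICAL_RATE_PER_GPU_HOUR))
print(whole_hour_cost(run_seconds, gpu_count, HYPOTHETICAL_RATE_PER_GPU_HOUR))
```

With the placeholder rate, the 10-minute run is charged for 600 seconds per GPU rather than a full hour per GPU, which is where per-second billing helps most for short fine-tuning jobs.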
How do I deploy a model to a whole fleet?
Push once. The model rolls out to every robot in your fleet over a single WebSocket: no SSH, no flashing, no version drift between robots.
How do I get started?
Install the SDK with pip install reflex-sdk and connect your robot, or reach out at team@tryreflex.ai for access.
Still stuck? Email team@tryreflex.ai or join the Discord.