FAQ

Frequently asked questions

What Reflex is, how fast it runs, what it supports, and how to ship your first model.

What is Reflex?

Reflex is a low-latency GPU cloud built for robotics. Train, fine-tune, and serve vision-language-action (VLA) models, then stream inference to your robots in real time, with no MLOps stack to run.

How fast is Reflex inference?

State of the art for cloud robot inference. Reflex's fused kernels beat torch.compile on H100s and round-trip a robot observation in under 30 ms, faster than running the model on the robot itself.
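
To see what a 30 ms round trip means for a particular robot, compare it with the length of one control period. The sketch below is a back-of-the-envelope check and not part of the Reflex SDK; the function name and the example control rates are illustrative assumptions.

```python
def fits_control_period(round_trip_ms: float, control_hz: float) -> bool:
    """Return True if one inference round trip completes within one control period."""
    period_ms = 1000.0 / control_hz
    return round_trip_ms <= period_ms


ROUND_TRIP_MS = 30.0  # the upper bound quoted in the answer above

# Example control rates are assumptions chosen for illustration.
for hz in (10, 30, 50):
    period_ms = 1000.0 / hz
    verdict = fits_control_period(ROUND_TRIP_MS, hz)
    print(f"{hz} Hz: period {period_ms:.1f} ms, round trip fits: {verdict}")
```

When the control period is shorter than the round trip, one common approach is a policy that returns a chunk of several future actions per call (as ACT does), so that a single inference covers multiple control steps.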

Which robot models does Reflex support?

pi0.5, pi0.7-flash, ACT, and your own custom VLA. Fine-tune any of them, or bring weights you already trained.

Do I need a GPU on the robot?

No. Your robot streams camera frames and state to a colocated Reflex GPU pool over a single WebSocket, and actions stream back inside the control loop's latency budget. You can drop the onboard GPU entirely.
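
One way to picture "inside the control loop's latency budget" is a control step that waits for the remote action only until a deadline and otherwise falls back to a safe default. The sketch below is a minimal illustration under stated assumptions: it does not use the Reflex SDK or a real WebSocket, `fake_remote_inference` is a stand-in for the network round trip, and the 6-element action and the hold-last-action fallback are hypothetical choices.

```python
import asyncio


async def fake_remote_inference(observation: dict, delay_s: float) -> list:
    """Stand-in for the network round trip; NOT a Reflex API."""
    await asyncio.sleep(delay_s)  # simulated network + inference time
    return [0.0] * 6              # hypothetical 6-DoF action


async def control_step(observation: dict, budget_s: float,
                       delay_s: float, last_action: list) -> list:
    """Return the remote action if it arrives within budget_s, else reuse last_action."""
    try:
        return await asyncio.wait_for(
            fake_remote_inference(observation, delay_s), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Deadline missed: hold the previous action rather than stall the loop.
        return last_action


if __name__ == "__main__":
    hold = [1.0] * 6
    on_time = asyncio.run(control_step({}, budget_s=0.05, delay_s=0.0, last_action=hold))
    too_late = asyncio.run(control_step({}, budget_s=0.01, delay_s=0.2, last_action=hold))
    print("on time:", on_time)
    print("too late:", too_late)
```

What the fallback should be (hold, stop, or a slower onboard policy) depends on the robot; holding the last action is only the simplest thing to show here.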

How does training and billing work?

Fine-tune on managed NVIDIA GPUs (GB200, B300, B200, H200, H100) in minutes. You pay per second of compute, not per node. Training is in beta.
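
As a rough illustration of per-second billing, the arithmetic is just seconds used times the rate per second. The sketch below uses a made-up hourly rate and a made-up 8-GPU node billed by the full hour for comparison; neither figure is a Reflex price, and the function names are hypothetical.

```python
import math


def per_second_cost(seconds: int, hourly_rate_usd: float, gpus: int = 1) -> float:
    """Cost when billed only for the seconds each GPU is actually used."""
    return seconds * gpus * hourly_rate_usd / 3600.0


def whole_node_hourly_cost(seconds: int, hourly_rate_usd: float, gpus_per_node: int) -> float:
    """Cost when a full node is rented and billed in whole hours (for comparison)."""
    hours_billed = math.ceil(seconds / 3600)
    return hours_billed * gpus_per_node * hourly_rate_usd


HYPOTHETICAL_RATE_USD = 3.60  # assumed per-GPU hourly rate, NOT a Reflex price
JOB_SECONDS = 12 * 60         # a 12-minute fine-tune on one GPU

print(f"per second: ${per_second_cost(JOB_SECONDS, HYPOTHETICAL_RATE_USD):.2f}")
print(f"whole node: ${whole_node_hourly_cost(JOB_SECONDS, HYPOTHETICAL_RATE_USD, 8):.2f}")
```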

How do I deploy a model to a whole fleet?

Push once. The model rolls out to every robot in your fleet over a single WebSocket: no SSH, no flashing, no version drift between robots.

How do I get started?

Install the SDK with pip install reflex-sdk and connect your robot, or reach out at team@tryreflex.ai for access.

Still stuck? Email team@tryreflex.ai or join the Discord.