Question 1

What is Reflex?

Accepted Answer

Reflex is a low-latency GPU cloud built for robotics. Train, fine-tune, and serve vision-language-action (VLA) models, then stream inference to your robots in real time, with no MLOps stack to run.

Question 2

How fast is Reflex inference?

Accepted Answer

State-of-the-art for cloud robot inference. Reflex's fused kernels outperform torch.compile on H100s, and a full round trip from robot observation to returned action takes under 30 ms, which is faster than running the model on the robot itself.
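To put the 30 ms figure in control-loop terms, here is a small back-of-the-envelope sketch. It assumes the simplest case, where every control step waits for one full round trip; the function name and the optional onboard overhead parameter are illustrative and not part of Reflex.

```python
def max_sync_control_rate_hz(round_trip_ms: float, onboard_overhead_ms: float = 0.0) -> float:
    """Upper bound on control rate when each step blocks on one round trip.

    Assumption for illustration: one request per control step, no pipelining.
    """
    step_ms = round_trip_ms + onboard_overhead_ms
    if step_ms <= 0:
        raise ValueError("step time must be positive")
    return 1000.0 / step_ms


# A 30 ms round trip with no extra onboard work allows roughly 33 steps per second.
print(round(max_sync_control_rate_hz(30.0), 1))  # 33.3
```

Models that return a chunk of several actions per request (ACT is one example) can drive the low-level loop faster than this bound, since one round trip covers multiple control steps.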

Question 3

Which robot models does Reflex support?

Accepted Answer

pi0.5, pi0.7-flash, ACT, and your own custom VLA models. Fine-tune any of them, or bring weights you have already trained.

Question 4

Do I need a GPU on the robot?

Accepted Answer

No. Your robot streams camera frames and state to a colocated Reflex GPU pool over a single WebSocket, and actions stream back inside the control loop's latency budget. You can drop the onboard GPU entirely.
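As a rough sketch of what streaming frames and state over one connection can look like, the example below packs a camera frame and joint state into a single binary message and checks a round trip against a latency budget. The wire format, field names, and function names are all hypothetical; this is not Reflex's actual protocol or SDK.

```python
import json
import struct
import time

# Hypothetical framing, for illustration only:
# 4-byte big-endian header length, then a JSON header, then raw JPEG bytes.


def pack_observation(frame_jpeg: bytes, joint_positions: list, seq: int) -> bytes:
    """Bundle one camera frame and the robot state into one binary message."""
    header = json.dumps({
        "seq": seq,                     # sequence number to match actions to observations
        "sent_at": time.monotonic(),    # local send time, used for latency checks
        "joints": joint_positions,
        "frame_len": len(frame_jpeg),
    }).encode("utf-8")
    return struct.pack(">I", len(header)) + header + frame_jpeg


def unpack_observation(message: bytes):
    """Inverse of pack_observation; returns (header dict, frame bytes)."""
    (header_len,) = struct.unpack_from(">I", message, 0)
    header = json.loads(message[4:4 + header_len].decode("utf-8"))
    frame = message[4 + header_len:]
    if len(frame) != header["frame_len"]:
        raise ValueError("truncated frame")
    return header, frame


def within_budget(sent_at: float, received_at: float, budget_ms: float = 30.0) -> bool:
    """True if the action came back inside the control loop's latency budget."""
    return (received_at - sent_at) * 1000.0 <= budget_ms
```

Each packed message could be sent as one WebSocket binary frame with whichever client library you prefer. A robot-side loop would typically fall back to a safe action, such as holding position, whenever `within_budget` returns False.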

Question 5

How does training and billing work?

Accepted Answer

Fine-tune on managed NVIDIA GPUs (GB200, B300, B200, H200, H100) in minutes. You pay per second of compute, not per node. Training is in beta.
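To make the per-second versus per-node distinction concrete, here is a small worked calculation. The hourly rate, the 8-GPU node size, and the whole-hour rounding in the per-node model are made-up assumptions for illustration, not Reflex pricing.

```python
import math


def per_second_cost(gpu_seconds: float, rate_per_gpu_hour: float) -> float:
    """Cost when billed only for the GPU-seconds actually used."""
    return gpu_seconds * rate_per_gpu_hour / 3600.0


def per_node_hour_cost(wall_seconds: float, gpus_used: int,
                       gpus_per_node: int, rate_per_gpu_hour: float) -> float:
    """Cost under a hypothetical model that bills whole nodes for whole hours."""
    nodes = math.ceil(gpus_used / gpus_per_node)
    hours = math.ceil(wall_seconds / 3600)
    return nodes * gpus_per_node * hours * rate_per_gpu_hour


# Example: a 12-minute fine-tune on 2 GPUs at a hypothetical $3.00 per GPU-hour.
wall_seconds = 12 * 60
print(round(per_second_cost(2 * wall_seconds, 3.00), 2))       # 1.2
print(round(per_node_hour_cost(wall_seconds, 2, 8, 3.00), 2))  # 24.0
```

Under these assumed numbers, a short job that uses a fraction of a node costs far less when billed per second, which is the case short fine-tuning runs tend to fall into.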

Question 6

How do I deploy a model to a whole fleet?

Accepted Answer

Push once. The model rolls out to every robot in your fleet over a single WebSocket: no SSH, no flashing, no version drift between robots.

Question 7

How do I get started?

Accepted Answer

Install the SDK with pip install reflex-sdk and connect your robot, or reach out at team@tryreflex.ai for access.