The 180

vLLM on Google Cloud TPU: A Model Size vs Chip Cheat Sheet (With Interactive Tool)

Grace Gong
May 26, 2026
Image source: Google Cloud

Which TPU configuration fits your model, what tensor_parallel_size to set, and what it costs per hour
