Use vLLM on GKE to run inference with DeepSeek-V3.2-Speciale