Use vLLM on GKE to run inference with DeepSeek-V3.2-Speciale
