vLLM and High-Performance Inference: Memory Optimization, Parallel Execution, Token Streaming, Scalable Model Serving (Large Language Refinement Inference Series)
Pages: 183, Paperback, Independently published