LLM Inference Engineering: Quantization, KV-Cache Optimization, and High-Throughput Serving: A Production Engineer's Guide to INT4/INT8 ... (Production AI Engineering Series)
Pages: 82, Paperback, Independently published