I have an RTX 5090 Laptop GPU from work with 24GB of VRAM.
I have been testing every model that comes out, and I can confidently say I’ll be cancelling my cloud subscriptions.
Every tool-call and data-science benchmark I use to verify a model is reliably good for my use case passed.
It might not hold for other professions, but for PySpark/Python work and data-transformation debugging it's basically perfect.
Using llama.cpp with a Q4_K_M quant and the KV cache at q4_0; still looking at options for optimising.
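For anyone wanting to try a similar setup, here's a minimal sketch of a `llama-server` launch. The model path, context size, and layer count are placeholders; `--cache-type-k/--cache-type-v q4_0` is one way to get the KV cache at q4_0 (my assumption of what the setup above refers to), and tune everything to your own VRAM.

```shell
# Sketch of a llama-server launch on a 24GB GPU; model path and sizes are placeholders.
# -ngl 99 offloads all layers to the GPU; -fa enables flash attention,
# which llama.cpp requires for quantised KV cache.
llama-server \
  -m ./models/model-Q4_K_M.gguf \
  -ngl 99 \
  -c 16384 \
  -fa \
  --cache-type-k q4_0 \
  --cache-type-v q4_0
```

This exposes an OpenAI-compatible endpoint on localhost; dropping the KV cache from f16 to q4_0 roughly quarters its VRAM footprint, which is what frees up room for longer contexts on a 24GB card.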