Hey guys, is anyone here using a Tesla P40 with newer models like Qwen / Mixtral / Llama?
RTX 3090 prices are still very high, while a P40 goes for around $250, so I’m considering it as a budget option.
Trying to understand real-world usability:
- how many tokens/sec are you getting on 30B models?
- is it usable for chat + light coding?
- how bad does it get with longer context?
Thank you!