I like the idea of the 395+ with 128 GB of VRAM, but inference speed on bigger models just makes it seem like it's not worth it. I feel like if you ever need the capabilities of a bigger model, you can just use a cloud LLM instead.
Whereas with dual 3090s, you get a decent-sized model with lots of speed, which is far better for use cases such as agentic workflows.
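For what it's worth, the tradeoff roughly follows from memory bandwidth: at decode time each generated token streams the active weights through memory once, so tokens/sec is roughly bandwidth divided by model size. Here's a back-of-envelope sketch; the peak-bandwidth figures (~256 GB/s LPDDR5X for the 395, ~936 GB/s GDDR6X per 3090) are published specs, but the 60% utilization factor and model sizes are just my assumptions:

```python
# Back-of-envelope decode speed for memory-bandwidth-bound token generation:
# tokens/sec ~= effective memory bandwidth / bytes of weights read per token.

def est_tokens_per_sec(bandwidth_gbs: float, model_gb: float,
                       efficiency: float = 0.6) -> float:
    """Rough decode tokens/sec; efficiency is an assumed utilization factor."""
    return bandwidth_gbs * efficiency / model_gb

# Assumed model footprints at 4-bit quantization (GB of weights):
model_70b_q4 = 40.0   # ~70B params, fits in the 395's 128 GB pool
model_30b_q4 = 18.0   # ~30B params, fits across dual 3090s (48 GB total)

print(f"395 (256 GB/s), 70B Q4:        {est_tokens_per_sec(256, model_70b_q4):.1f} tok/s")
print(f"dual 3090 (936 GB/s), 30B Q4:  {est_tokens_per_sec(936, model_30b_q4):.1f} tok/s")
```

So the 395's selling point is *fitting* the big model at all, while the 3090s win heavily on speed for anything that fits in 48 GB, which is exactly the agentic-workflow case where you burn through lots of tokens.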
What do you guys think?