Just been playing around with PrismML's 1-bit 8B LLM and it's legit. Now the question is: can TurboQuant be used with it? Seemingly yes?
(If so, then I'm not seeing any real hurdles to running agentic tasks on-device on today's smartphones.)