Hosting Assistant_Pepe_70B on Horde!

Reddit r/LocalLLaMA / 3/28/2026


Key Points

  • The post announces that “Assistant_Pepe_70B” is being hosted on Hugging Face and made available through Horde with very high availability across 2xA6000 GPUs.
  • It specifies FP8 precision running at 16k context length, claiming roughly 99.99% accuracy for the FP8 mode.
  • The author points readers to KoboldAI Lite (free, no login required) as the interface for trying the model.
  • Feedback from users is explicitly encouraged, to help improve hosting quality and usability.
  • The update highlights practical deployment options for large local/hosted LLMs via distributed inference infrastructure like Horde.
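Beyond the browser UI, Horde-hosted models can also be reached programmatically. The sketch below builds a request body for the AI Horde v2 text-generation API; the endpoint layout and field names are assumed from the public AI Horde documentation rather than stated in the post, and the model name is taken from the announcement.

```python
# Sketch of a text-generation request to the AI Horde v2 API.
# Endpoint paths and field names are assumptions based on public AI Horde
# docs, not details from the post itself.
API_BASE = "https://aihorde.net/api/v2"
ANON_KEY = "0000000000"  # anonymous API key; registered keys get higher queue priority

def build_request(prompt: str, model: str = "Assistant_Pepe_70B") -> dict:
    """Build the JSON body for POST {API_BASE}/generate/text/async."""
    return {
        "prompt": prompt,
        "models": [model],            # restrict the job to the hosted model
        "params": {
            "max_context_length": 16384,  # matches the 16k context in the post
            "max_length": 256,            # tokens to generate (illustrative)
        },
    }
```

An actual submission would POST this body (with an `apikey` header) to `{API_BASE}/generate/text/async`, then poll the returned job ID on the status endpoint until the generation completes; consult the AI Horde API reference for the exact response shapes.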

Hi all,

Hosting https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_70B on Horde at very high availability on 2xA6000.

FP8 precision at 16k context (FP8 is about 99.99% accuracy).
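The accuracy figure here is the author's claim, but the intuition behind FP8's small quality loss can be sketched numerically. The toy function below simulates a round trip through the FP8 E4M3 format (4 exponent bits, 3 mantissa bits, max finite value 448); it is an illustrative simplification that ignores subnormals, not the kernel any inference stack actually uses.

```python
import math

def quantize_e4m3(x: float) -> float:
    """Simulate rounding a float to FP8 E4M3 and back (toy model, no subnormals)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = abs(x)
    e = math.floor(math.log2(a))
    e = max(min(e, 8), -6)       # clamp exponent to the normal E4M3 range
    step = 2.0 ** (e - 3)        # value spacing with 3 mantissa bits
    q = round(a / step) * step   # round to the nearest representable value
    q = min(q, 448.0)            # saturate at the E4M3 max finite value
    return sign * q

# Exactly representable values survive unchanged; others pick up a small
# relative error on the order of the mantissa step.
assert quantize_e4m3(1.0) == 1.0
```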

( https://lite.koboldai.net/ FREE, no login required)

So give it a try!
(Feedback always welcomed)

submitted by /u/Sicarius_The_First