Hey everyone!
I wanted to share my latest project: Apex-1, a lightweight 350M-parameter model designed for speed and efficiency on edge devices.
The Goal: I wanted to see how much "world knowledge" and instruction-following I could cram into a tiny model using consumer hardware and high-quality data.
Key Info:
- Architecture: Based on nanoGPT / Transformer.
- Dataset: Pre-trained on a subset of FineWeb-Edu (10BT) for reasoning and knowledge.
- Finetuning: Alpaca-Cleaned for better instruction following.
- Format: Weights available as ONNX (perfect for mobile/web) and standard PyTorch.
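Since the post doesn't spell out the prompt format, here's a small sketch of the standard Alpaca instruction template, which models finetuned on Alpaca-Cleaned typically expect — treat the exact template as an assumption until confirmed by the model card:

```python
# Hedged sketch: Apex-1's exact prompt format isn't stated here, but the
# Alpaca-Cleaned finetune suggests the standard Alpaca template below.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Summarize the following paragraph in one sentence."))
```

The resulting string can be fed to the PyTorch weights via the `transformers` text-generation pipeline, or tokenized and run through the ONNX export with ONNX Runtime on mobile/web.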
It’s great for basic summarization, simple Q&A, and running on hardware that usually can't handle LLMs.
Check it out here: https://huggingface.co/LH-Tech-AI/Apex-1-Instruct-350M
This is just the beginning – Apex 1.5 and a dedicated Code version are already in the pipeline. I'd love to get some feedback or see your benchmarks!