I spent a lot of time building an inference engine like ollama, pure vibe coding in go. I kept trying to push it to optimize it and it was fun but after sometime I really wanted to know what was going on to be able to really know what those optimizations were about and why some were'nt working as I expected. This is a part 1 of those articles that go deep and is beginner friendly to get up to speed with inference.
[link] [comments]




![[P] I trained an AI to play Resident Evil 4 Remake using Behavioral Cloning + LSTM](/_next/image?url=https%3A%2F%2Fexternal-preview.redd.it%2FzgmJOxETuqgqlsgMxeBl7S4gZNDHf_K3U9w883ioT4M.jpeg%3Fwidth%3D320%26crop%3Dsmart%26auto%3Dwebp%26s%3Da63f97b9d03c40b846cd3eaac472e78050020a43&w=3840&q=75)