LocalFTW
Tagged "llm-architecture"
Kimi Introduces Attention Residuals: 1.25x Compute Performance at <2% Overhead
17 March 2026
Student Releases Dhi-5B: Multimodal Model Trained for Just $1,200
13 February 2026