argbe.tech - news

Local LLM agents take a crack at faster matrix multiplication

Published today, a Towards Data Science write-up details a local, MacBook-based agent loop that generates and benchmarks Rust matrix-multiplication variants using open-source models.


  • Author / source: Stefano Bosisio (Towards Data Science)
  • Goal: speed up matmul for GPT fine-tuning workloads; reduce reliance on BLAS and on Rust unsafe code
  • Hardware: MacBook Pro (M3, 36GB RAM)
  • Local model: Mixtral 8x7B GGUF (Q4_K_M) quantized by mradermacher
  • Orchestration: Microsoft AutoGen with roles Proposer, Coder, Tester, Manager (the Verifier role is present but currently disabled)
  • Retrieval: Chroma vector DB built from 50 matmul-optimization papers (2020–2025)
  • Embeddings / chunking: semantic chunking with BAAI/bge-base-en-v1.5
  • Code: public repo agents_matmul
  • Positioning: a laptop-scale way to explore Strassen-like variants, not a direct path to BLAS-level performance
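
The Strassen-like direction the agents explore can be sketched in plain Rust: one level of Strassen's algorithm replaces the eight half-size multiplications of blocked matmul with seven, at the cost of extra additions. The code below is an illustrative sketch, not taken from the agents_matmul repo; all function names are hypothetical.

```rust
// Illustrative sketch: naive matmul vs. one level of Strassen's algorithm
// on flat row-major f64 matrices. Hypothetical names, not the repo's code.

fn naive_matmul(a: &[f64], b: &[f64], n: usize) -> Vec<f64> {
    let mut c = vec![0.0; n * n];
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    c
}

fn add(x: &[f64], y: &[f64]) -> Vec<f64> {
    x.iter().zip(y).map(|(p, q)| p + q).collect()
}

fn sub(x: &[f64], y: &[f64]) -> Vec<f64> {
    x.iter().zip(y).map(|(p, q)| p - q).collect()
}

// Copy quadrant (qi, qj) of an n x n matrix out as an (n/2) x (n/2) block.
fn quadrant(m: &[f64], n: usize, qi: usize, qj: usize) -> Vec<f64> {
    let h = n / 2;
    let mut out = vec![0.0; h * h];
    for i in 0..h {
        for j in 0..h {
            out[i * h + j] = m[(qi * h + i) * n + (qj * h + j)];
        }
    }
    out
}

// One level of Strassen: 7 half-size multiplications instead of 8.
fn strassen_one_level(a: &[f64], b: &[f64], n: usize) -> Vec<f64> {
    assert!(n % 2 == 0, "even dimension required for one split");
    let h = n / 2;
    let (a11, a12) = (quadrant(a, n, 0, 0), quadrant(a, n, 0, 1));
    let (a21, a22) = (quadrant(a, n, 1, 0), quadrant(a, n, 1, 1));
    let (b11, b12) = (quadrant(b, n, 0, 0), quadrant(b, n, 0, 1));
    let (b21, b22) = (quadrant(b, n, 1, 0), quadrant(b, n, 1, 1));
    // Strassen's seven products.
    let m1 = naive_matmul(&add(&a11, &a22), &add(&b11, &b22), h);
    let m2 = naive_matmul(&add(&a21, &a22), &b11, h);
    let m3 = naive_matmul(&a11, &sub(&b12, &b22), h);
    let m4 = naive_matmul(&a22, &sub(&b21, &b11), h);
    let m5 = naive_matmul(&add(&a11, &a12), &b22, h);
    let m6 = naive_matmul(&sub(&a21, &a11), &add(&b11, &b12), h);
    let m7 = naive_matmul(&sub(&a12, &a22), &add(&b21, &b22), h);
    // Recombine into the four result quadrants.
    let c11 = add(&sub(&add(&m1, &m4), &m5), &m7);
    let c12 = add(&m3, &m5);
    let c21 = add(&m2, &m4);
    let c22 = add(&sub(&add(&m1, &m3), &m2), &m6);
    let mut c = vec![0.0; n * n];
    for i in 0..h {
        for j in 0..h {
            c[i * n + j] = c11[i * h + j];
            c[i * n + (j + h)] = c12[i * h + j];
            c[(i + h) * n + j] = c21[i * h + j];
            c[(i + h) * n + (j + h)] = c22[i * h + j];
        }
    }
    c
}

fn main() {
    let n = 4;
    let a: Vec<f64> = (0..n * n).map(|x| x as f64).collect();
    let b: Vec<f64> = (0..n * n).map(|x| (x as f64) * 0.5 + 1.0).collect();
    let c_naive = naive_matmul(&a, &b, n);
    let c_strassen = strassen_one_level(&a, &b, n);
    for (x, y) in c_naive.iter().zip(&c_strassen) {
        assert!((x - y).abs() < 1e-9);
    }
    println!("naive and Strassen agree on {}x{} matrices", n, n);
}
```

The benchmarking half of the loop would then time variants like these against each other; the article's point is that the agent generates and tests such candidates automatically rather than a human writing each one.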