[D] How does distributed proof of work computing handle the coordination needs of neural network training?

Reddit r/MachineLearning / 3/31/2026


Key Points

  • The post argues that typical distributed neural-network training suffers from coordination overhead because frequent gradient synchronization is delay-sensitive, especially between separate machines communicating over the internet.
  • It questions whether Qubic’s claimed approach (evolutionary selection rather than backpropagation) could eliminate gradient-sharing needs by evolving independent models and applying selection pressure over time.
  • The author asks whether evolutionary model search is a validated research direction or generally underperforms gradient descent, and whether it better matches a “distributed proof of work” style compute model.
  • The post further requests evidence from published research comparing evolutionary methods to standard gradient-based training at large scale.
  • The stated goal is technical assessment of whether the architecture is coherent, not an endorsement or critique of the project’s intent.

[D] I've been trying to understand the technical setup of a project called Qubic. It claims to use distributed proof-of-work computing for neural network training, and I want to know whether the idea holds together technically.

The main issue with distributed training is coordination. Training large neural networks requires frequent sharing of gradient updates across nodes, and this process is sensitive to delays: it works far better over fast interconnects inside a data center than over the internet on separate machines. My question for people who actually do distributed machine learning work is this: is there a training method that avoids the need for gradient synchronization altogether?

Qubic describes its Aigarth AI system as using evolutionary selection instead of backpropagation. That means there are no gradients to share: each node evolves its own model independently, and selection pressure then acts across the full set of models over time rather than through synchronized weight updates.

If that account is correct, it removes the usual coordination problem. The process would work more like a genetic-algorithm search than standard deep learning training. My questions are these:

  1. Is evolutionary model search a real direction in machine learning research, or has it been shown to perform worse than gradient descent?
  2. If it is a real direction, does the distributed proof-of-work model fit this approach better than it fits standard backpropagation-based training?
  3. Is there published research that compares evolutionary methods to standard gradient-based training at large scale?
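For concreteness, here is a minimal sketch of what gradient-free, coordination-light training could look like. This is not Qubic's actual protocol (which the post doesn't document); it just illustrates the claimed structure: each node hill-climbs its own candidate via random mutation with no communication, and a selection step across nodes happens only occasionally. The `fitness` function is a toy stand-in for real model evaluation.

```python
import random

def fitness(weights):
    # Toy stand-in for model evaluation: negative squared error against a
    # fixed target vector (a real system would evaluate a neural network).
    target = [0.5, -1.0, 2.0]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

def mutate(weights, sigma=0.1, rng=random):
    # Gaussian perturbation of every parameter -- no gradients involved.
    return [w + rng.gauss(0, sigma) for w in weights]

def evolve_node(weights, steps, rng):
    """One node's local hill climbing: keep a mutation only if it improves
    fitness. No communication with other nodes during this phase."""
    best, best_f = weights, fitness(weights)
    for _ in range(steps):
        cand = mutate(best, rng=rng)
        f = fitness(cand)
        if f > best_f:
            best, best_f = cand, f
    return best

def run(num_nodes=4, rounds=5, local_steps=200, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)]
                  for _ in range(num_nodes)]
    for _ in range(rounds):
        # Independent local evolution: embarrassingly parallel, latency-tolerant.
        population = [evolve_node(w, local_steps, rng) for w in population]
        # Infrequent selection step: copy the best model over the worst.
        population.sort(key=fitness, reverse=True)
        population[-1] = list(population[0])
    return max(fitness(w) for w in population)
```

The key structural point is that synchronization happens once per round (a selection pass), not once per optimization step, so slow links between nodes matter far less than in synchronous gradient descent.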

I am only trying to understand whether the architecture makes sense technically. Not here to judge the project.
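On question 3, one relevant published line of work is evolution strategies (e.g. Salimans et al., 2017, "Evolution Strategies as a Scalable Alternative to Reinforcement Learning"), which is gradient-free and communication-light but not communication-free: workers still exchange (seed, scalar return) pairs every step, and each worker reconstructs everyone's perturbations from the shared seeds. A sketch of that seed-sharing trick, with a toy objective standing in for real rollouts (illustrative only, not any project's actual implementation):

```python
import numpy as np

def es_step(theta, num_workers=8, sigma=0.1, lr=0.01, master_seed=0):
    """One step of evolution-strategies-style training where the only
    per-step traffic is (seed, scalar_return) pairs -- a few bytes per
    worker -- rather than full gradient or parameter vectors."""
    def reward(w):
        # Toy objective: maximize -||w||^2 (a real system would run rollouts).
        return -float(np.sum(w ** 2))

    seeds = [master_seed + i for i in range(num_workers)]
    # Each "worker" evaluates one seeded perturbation and reports one scalar.
    returns = np.array([
        reward(theta + sigma * np.random.RandomState(s).randn(*theta.shape))
        for s in seeds
    ])
    # Every node can regenerate all perturbations from the shared seeds
    # and apply the identical update deterministically.
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)
    grad_est = np.zeros_like(theta)
    for s, a in zip(seeds, adv):
        grad_est += a * np.random.RandomState(s).randn(*theta.shape)
    return theta + lr / (num_workers * sigma) * grad_est
```

Note that this still requires a synchronization barrier each step, so it lowers bandwidth rather than eliminating coordination; fully independent evolution with occasional selection (as the post describes) trades away that per-step coupling at the cost of a noisier search.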

submitted by /u/srodland01