Models & Research

Composer 2: What Is New and How It Compares with Claude Opus 4.6 & GPT-5.4

Dev.to / 3/24/2026

[D] Cathie Wood claims AI productivity wave is starting, data shows 43% of CEOs save 8+ hours weekly

Reddit r/MachineLearning / 3/24/2026

Microsoft hires top AI researchers from Allen Institute for AI for Suleyman's Superintelligence team

THE DECODER / 3/24/2026

MolmoWeb 4B/8B

Reddit r/LocalLLaMA / 3/24/2026

Ai2 releases MolmoWeb, an open-weight visual web agent with 30K human task trajectories and a full training stack

VentureBeat / 3/24/2026

I mapped how Reddit actually talks about AI safety: 6,374 posts, 23 clusters, some surprising patterns

Reddit r/artificial / 3/24/2026

Sarvam 105B Uncensored via Abliteration

Reddit r/artificial / 3/24/2026

Anthropic’s Claude Code and Cowork can control your computer

The Verge / 3/24/2026

Production-Ready LLM Agents: A Comprehensive Framework for Offline Evaluation

Towards Data Science / 3/24/2026

Mirage raises $75M to continue building models for its AI video editing app Captions

TechCrunch / 3/24/2026

[R] Evaluating MLLMs with Child-Inspired Cognitive Tasks

Reddit r/MachineLearning / 3/24/2026

Rethinking positional encoding as a geometric constraint rather than a signal injection

Reddit r/LocalLLaMA / 3/24/2026

Devstral-Small-2-24B fine-tuned on Claude 4.6 Opus reasoning traces [GGUF Q4+Q5]

Reddit r/LocalLLaMA / 3/24/2026

Agile Robots becomes the latest robotics company to partner with Google DeepMind

TechCrunch / 3/24/2026

Mistral-Small-4-119B-2603-heretic

Reddit r/LocalLLaMA / 3/24/2026

Request: Training a pretrained, MoE version of Mistral Nemo

Reddit r/LocalLLaMA / 3/24/2026

SWE-bench results for different KV cache quantization levels

Reddit r/LocalLLaMA / 3/24/2026

[D] Matryoshka Representation Learning

Reddit r/MachineLearning / 3/24/2026

Two new Qwen3.5 “Neo” fine‑tunes focused on fast, efficient reasoning

Reddit r/LocalLLaMA / 3/24/2026

HKIC, Gobi Partners and HKU team up for fund backing university research start-ups

SCMP Tech / 3/24/2026

Yann LeCun’s New LeWorldModel (LeWM) Research Targets JEPA Collapse in Pixel-Based Predictive World Modeling

MarkTechPost / 3/24/2026

Streaming experts

Simon Willison's Blog / 3/24/2026

Interactive Web Visualization of GPT-2

Reddit r/artificial / 3/24/2026

[R] Causal self-attention as a probabilistic model over embeddings

Reddit r/MachineLearning / 3/24/2026

The 5 software development trends that actually matter in 2026 (and what they mean for your startup)

Dev.to / 3/24/2026

iPhone 17 Pro Running a 400B LLM: What It Really Means

Dev.to / 3/24/2026

[R] V-JEPA 2 has no pixel decoder, so how do you inspect what it learned? We attached a VQ probe to the frozen encoder and found statistically significant physical structure

Reddit r/artificial / 3/24/2026

Programming Manufacturing Robots with Imperfect AI: LLMs as Tuning Experts for FDM Print Configuration Selection

arXiv cs.RO / 3/24/2026

Enhancing Safety of Large Language Models via Embedding Space Separation

arXiv cs.AI / 3/24/2026

Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models

arXiv cs.AI / 3/24/2026

The Reasoning Error About Reasoning: Why Different Types of Reasoning Require Different Representational Structures

arXiv cs.AI / 3/24/2026

Deterministic Hallucination Detection in Medical VQA via Confidence-Evidence Bayesian Gain

arXiv cs.AI / 3/24/2026

Putnam 2025 Problems in Rocq using Opus 4.6 and Rocq-MCP

arXiv cs.LG / 3/24/2026

MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning

arXiv cs.LG / 3/24/2026

Future-Interactions-Aware Trajectory Prediction via Braid Theory

arXiv cs.AI / 3/24/2026

The Multiverse of Time Series Machine Learning: an Archive for Multivariate Time Series Classification

arXiv cs.LG / 3/24/2026

Interpretable Multiple Myeloma Prognosis with Observational Medical Outcomes Partnership Data

arXiv cs.LG / 3/24/2026

Rolling-Origin Validation Reverses Model Rankings in Multi-Step PM10 Forecasting: XGBoost, SARIMA, and Persistence

arXiv cs.LG / 3/24/2026

Transformer-Based Predictive Maintenance for Risk-Aware Instrument Calibration

arXiv cs.LG / 3/24/2026

REMI: Reconstructing Episodic Memory During Internally Driven Path Planning

arXiv cs.AI / 3/24/2026

Unified-MAS: Universally Generating Domain-Specific Nodes for Empowering Automatic Multi-Agent Systems

arXiv cs.AI / 3/24/2026

Understanding Behavior Cloning with Action Quantization

arXiv cs.LG / 3/24/2026

GSEM: Graph-based Self-Evolving Memory for Experience Augmented Clinical Reasoning

arXiv cs.AI / 3/24/2026

Beyond Detection: Governing GenAI in Academic Peer Review as a Sociotechnical Challenge

arXiv cs.AI / 3/24/2026

MARCUS: An agentic, multimodal vision-language model for cardiac diagnosis and management

arXiv cs.AI / 3/24/2026

A Context Engineering Framework for Improving Enterprise AI Agents based on Digital-Twin MDP

arXiv cs.AI / 3/24/2026

SDE-Driven Spatio-Temporal Hypergraph Neural Networks for Irregular Longitudinal fMRI Connectome Modeling in Alzheimer's Disease

arXiv cs.LG / 3/24/2026

INTRYGUE: Induction-Aware Entropy Gating for Reliable RAG Uncertainty Estimation

arXiv cs.AI / 3/24/2026

CAMA: Exploring Collusive Adversarial Attacks in c-MARL

arXiv cs.LG / 3/24/2026

MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery

arXiv cs.LG / 3/24/2026