Retrieval-Warmed Energy-Based Reasoning: A Five-Arm Ablation Methodology for Diffusion-as-Inference on Structured Reasoning Tasks
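Analysis results
- Category
- Energy
- Importance
- 68
- Trend score
- 27
- Summary
- Warm-started diffusion samplers accelerate iterative inference, but it is rarely clear which part of the pipeline carries the gain. This study uses retrieval-warmed energy-based reasoning (RW-EBR) to analyze the effect of the diffusion process on structured reasoning tasks. A five-arm ablation evaluates the contribution of each component and offers insight for improving inference efficiency.
- Keywords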
arXiv:2606.26476v1 Announce Type: cross
Abstract: Warm-started diffusion samplers accelerate iterative inference, but it is rarely clear which part of the pipeline carries the gain. We study retrieval-warmed energy-based reasoning (RW-EBR), an IRED energy-based diffusion model [du2024ired] augmented with a Modern Hopfield trajectory memory, and contribute a five-arm ablation methodology (oracle, best-constant, per-query-random, shuffled, aligned) that separates three confounded effects: class-prior bias shift, stochastic warm-starting, and graph-aligned value reuse. The diagnostic decomposition is adapted from LLM-RAG evaluation [ru2024ragchecker]. On connectivity-2 (Erdős–Rényi all-pairs reachability), the aligned-vs-shuffled-oracle swing reaches +35 pp balanced accuracy on a fixed 1,000-graph validation-set diagnostic, with value distribution and retrieval mechanics fixed and only per-graph alignment destroyed, while per-query random initialisation falls below cold. Per-graph alignment, not bias shift or stochasticity, dominates. Yet the deployable cold-prediction pipeline misses the acceptance gate at stored-value quality. The same diagnostic logic, stopped at the key-quality screen and applied to Sudoku with a task-specific key encoder, produces a clean negative at a different component: key quality, under the current setup. The decomposition names the first blocking component on each task. The setting (graph reachability refined by an iterative diffusion sampler, with explainability of failure modes as the lens) places the work within structured and spatio-temporal reasoning.
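The five arms named in the abstract can be illustrated with a small, self-contained Python sketch. It is only one reading of the arm names, not the paper's implementation: the function names (`build_arms`, `shuffle_alignment`, `swing_pp`), the data layout (a dict from graph id to a list of 0/1 reachability labels), the `constant` parameter, and the toy data are all hypothetical. The sketch also scores the warm-start values directly, whereas in the paper's setting each arm would seed a diffusion sampler that then refines the values. Balanced accuracy is computed in the standard way, as the mean of the true-positive and true-negative rates.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate (binary 0/1 labels)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("balanced accuracy needs both classes present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def shuffle_alignment(values, rng):
    """Reassign stored values across graphs by a nonzero rotation.

    The multiset of values is unchanged; only the graph-to-value pairing is
    destroyed (no graph keeps its own entry). Requires at least two graphs.
    """
    ids = sorted(values)
    k = rng.randrange(1, len(ids))
    return {g: values[ids[(i + k) % len(ids)]] for i, g in enumerate(ids)}


def build_arms(truth, stored, constant=0, seed=0):
    """Return {arm_name: {graph_id: warm-start labels}} (assumed layout).

    `constant` stands in for a value chosen on validation data (assumption).
    """
    rng = random.Random(seed)
    return {
        "oracle": dict(truth),  # ground-truth values as warm start
        "best_constant": {g: [constant] * len(v) for g, v in truth.items()},
        "per_query_random": {
            g: [rng.randint(0, 1) for _ in v] for g, v in truth.items()
        },
        "shuffled": shuffle_alignment(stored, rng),  # alignment destroyed
        "aligned": dict(stored),  # stored values for the same graph
    }


def swing_pp(truth, arm_a, arm_b):
    """Balanced-accuracy difference (a minus b) in percentage points."""
    order = sorted(truth)
    y = [t for g in order for t in truth[g]]
    a = [p for g in order for p in arm_a[g]]
    b = [p for g in order for p in arm_b[g]]
    return 100.0 * (balanced_accuracy(y, a) - balanced_accuracy(y, b))


# Toy data, made up for illustration: three "graphs", four node pairs each.
truth = {"g0": [1, 1, 0, 0], "g1": [1, 0, 1, 0], "g2": [0, 0, 1, 1]}
stored = {"g0": [1, 1, 0, 0], "g1": [1, 0, 1, 1], "g2": [0, 0, 1, 1]}
arms = build_arms(truth, stored)
print("aligned vs shuffled swing (pp):",
      swing_pp(truth, arms["aligned"], arms["shuffled"]))
```

Because the shuffled arm reuses exactly the same stored values as the aligned arm, any gap between the two cannot come from the value distribution or from a class-prior shift. That is the sense in which the abstract attributes the swing to per-graph alignment.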