Awesome X (Prof. Ahn’s Reading List)

AGI

General

•

2408 - A Theory of Understanding for Artificial Intelligence: Composability, Catalysts, and Learning

•

2406 - Open-Endedness is Essential for Artificial Superhuman Intelligence

•

2311 - Levels of AGI: Operationalizing Progress on the Path to AGI

•

2208 - The Alberta Plan for AI Research

AI Mathematician / AI Scientist / Autoformalization

General

•

2506 - The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas

•

2504 - Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets

•

2504 - Brains vs. Bytes - Evaluating LLM Proficiency in Olympiad Mathematics 

•

2502 - Auto-Bench - An Automated Benchmark for Scientific Discovery in LLMs

•

2411 - Large language models surpass human experts in predicting neuroscience results

•

2411 - Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

•

2410 - Herald - A Natural Language Annotated Lean 4 Dataset

•

2407 - LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover

•

2306 - From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought

•

2406 - AI-Assisted Generation of Difficult Math Questions

•

2405 - Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

•

2405 - DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

•

2404 - A Survey on Deep Learning for Theorem Proving

•

2403 - Don’t Trust: Verify – Grounding LLM Quantitative Reasoning with Autoformalization

•

2403 - Machine Learning and Information Theory Concepts Towards an AI Mathematician [Y. Bengio]

•

2402 - REFACTOR: Learning to Extract Theorems from Proofs

•

2312 - NeurIPS Tutorial on Machine Learning for Theorem Proving [Video]

•

2312 - Speculative Exploration on the Concept of Artificial Agents Conducting Autonomous Research 

•

2310 - A New Approach Towards Autoformalization

•

2212 - Solving Quantitative Reasoning Problems with Language Models

•

2302 - Peano - Learning Formal Mathematical Reasoning

•

2301 - Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs

•

2210 - Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

•

2205 - Autoformalization with Large Language Models

•

20xx - A Promising Path Towards Autoformalization and General Artificial Intelligence

•

2009 - Generative Language Modeling for Automated Theorem Proving

•

2006 - Mathematical Reasoning via Self-supervised Skip-tree Training

Hypothesis Generation

•

2404 - Hypothesis Generation with Large Language Models

AI4Science

Protein Design

•

2402 - Generative AI for Controllable Protein Sequence Design: A Survey

ARC & VAR

General

•

2411 - LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation

•

2405 - DAT - Disentangling and Integrating Relational and Sensory Information in Transformer Architectures

•

2403 - Human-Level Few-Shot Concept Induction Through Minimax Entropy Learning

•

2304 - Abstractors and Relational Cross-Attention: An Inductive Bias for Explicit Relational Reasoning in Transformers

ARC

•

2412 - ARC Prize - OpenAI o3 Breakthrough High Score on ARC-AGI-Pub

•

2412 - ConceptSearch: Towards Efficient Program Search Using LLMs for Abstraction and Reasoning Corpus (ARC)

•

2412 - ARC Prize 2024: Technical Report

•

2411 - Mini-ARC: Solving Abstraction and Reasoning Puzzles with Small Transformer Models

•

2411 - Searching Latent Program Spaces 

•

2411 - The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

•

2411 - Combining Induction and Transduction for Abstract Reasoning

•

2410 - ARC Prize - How to Beat ARC-AGI by Combining Deep Learning and Program Synthesis

•

2410 - Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects

•

2409 - H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark

•

2404 - Re-ARC - Addressing the Abstraction and Reasoning Corpus via Procedural Example Generation

•

2403 - Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus

•

2402 - Neural Networks for Abstraction and Reasoning: Towards Broad Generalization in Machines

•

22xx - A Program-Synthesis Challenge for ARC-Like Tasks

Architectures (Backbone)

General

•

2501 - What’s Next for Mamba? Towards More Expressive Recurrent Update Rules

•

2410 - FACTS: A Factored State-Space Framework For World Modelling

•

2406 - Vision-LSTM: xLSTM as Generic Vision Backbone

•

2405 - Aaren - Attention as an RNN

•

2310 - Sparse Universal Transformer

•

2212 - DiT - Scalable Diffusion Models with Transformers (Backbone Architecture of Sora)

•

2207 - Neural Networks And The Chomsky Hierarchy

•

2202 - GroupViT: Semantic Segmentation Emerges from Text Supervision

•

2202 - MaskGIT: Masked Generative Image Transformer

•

20xx - Theoretical Limitations of Self-Attention in Neural Sequence Models

Artificial Hippocampus & Spatial Intelligence

General

•

2503 - A hippocampal population code for rapid generalization

•

2501 - Computational Models of Hippocampal Cognitive Function

•

2501 - Key-Value Memory in the Brain

•

2412 - Learning Hierarchical Abstractions of Complex Dynamics using Active Predictive Coding [Spatial]

•

2412 - Models of Human Hippocampal Specialization: a Look at the Electrophysiological Evidence

•

2411 - A Tale of Two Algorithms - Structured Slots Explain Prefrontal Sequence Memory and Are Unified with Hippocampal Cognitive Maps

•

2408 - Space as A Scaffold for Rotational Generalisation of Abstract Concepts

•

2408 - Human hippocampal and entorhinal neurons encode the temporal structure of experience

•

2408 - How the Human Brain Creates Cognitive Maps of Related Concepts

•

2408 - Why Concepts Are (probably) Vectors

•

2408 - Cognitive maps from predictive vision

•

2408 - Abstract representations emerge in human hippocampal neurons during inference

•

2407 - Space is a latent sequence: A theory of the hippocampus

•

2407 - Automated construction of cognitive maps with visual predictive coding

•

2407 - The Computational Foundations of Dynamic Coding in Working Memory

•

2406 - A recurrent network model of planning explains hippocampal replay and human behavior

•

2405 - Remapping revisited: how the hippocampus represents different spaces

•

2401 - Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments

•

2311 - The generative grammar of the brain: a critique of internally generated representations

•

2308 - Cognitive graphs: Representational substrates for planning

•

2302 - Replay and Compositional Computation

•

2209 - How to build a cognitive map

•

2112 - Relating Transformers to Models and Neural Representations of The Hippocampal Formation

•

2106 - Geometry of abstract learned knowledge in the hippocampus

•

2009 - Emergence of abstract rules in the primate brain

•

1805 - Vector-Based Navigation Using Grid-Like Representations in Artificial Agents

Spatial AI

•

2412 - Thinking in Space: How Multimodal Large Language Models See, Remember and Recall Spaces

Causality

Position & Survey Papers

•

2409 - Theory Is All You Need: AI, Human Cognition, and Causal Reasoning

•

2405 - From Identifiable Causal Representations to Controllable Counterfactual Generation: A Survey on Causal Generative Modeling

•

2403 - Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

•

2402 - Essential Role of Causality in Foundation World Models for Embodied AI

•

2307 - Causal Reinforcement Learning: A Survey

•

2302 - A Survey on Causal Reinforcement Learning 

•

2206 - Causal Machine Learning: A Survey

•

2105 - Toward Causal Representation Learning

RL & World Model

•

2501 - Language Agents Meet Causality -- Bridging LLMs and Causal World Models

•

2408 - Rethinking State Disentanglement in Causal Reinforcement Learning

•

2406 - Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

•

2404 - Robust Agents Learn Causal World Models

•

2404 - Compete and Compose: Learning Independent Mechanisms for Modular World Models

•

2404 - Empowerment as Causal Learning, Causal Learning as Empowerment: A bridge between Bayesian causal hypothesis testing and reinforcement learning

•

2403 - Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization

•

2306 - Granger Causal Interaction Skill Chains

•

2206 - Causal Dynamics Learning for Task-Independent State Abstraction

•

22xx - CausalDyna: Improving Generalization of Dyna-Style Reinforcement Learning via Counterfactual-Based Data Augmentation

•

2102 - Agent Incentives - A Causal Perspective

•

21xx - Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning

Causal-LLM

•

2502 - Auto-Bench - An Automated Benchmark for Scientific Discovery in LLMs

•

2412 - Prompting Strategies for Enabling Large Language Models to Infer Causation from Correlation

•

2412 - Do LLMs Act as Repositories of Causal Knowledge?

•

2402 - Efficient Causal Graph Discovery Using Large Language Models

Representation

•

2409 - Unifying Causal Representation Learning with the Invariance Principle

•

2407 - Disentangled Representations for Causal Cognition

•

2406 - The Odyssey of Commonsense Causality: From Foundational Benchmarks to Cutting-Edge Reasoning

•

2405 - From Identifiable Causal Representations to Controllable Counterfactual Generation: A Survey on Causal Generative Modeling

•

24xx - Multi-View Causal Representation Learning with Partial Observability

•

2403 - Towards the Reusability and Compositionality of Causal Representations

•

2305 - Interventional Causal Representation Learning

•

2310 - Towards Causal Foundation Model: on Duality between Causal Inference and Attention

•

23xx - Nonparametric Identifiability of Causal Representations from Unknown Interventions

•

2209 - Interventional Causal Representation Learning

•

2203 - Weakly Supervised Causal Representation Learning

•

2202 - CITRIS: Causal Identifiability from Temporal Intervened Sequences

•

21xx - CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models

•

20xx - iVAE: Variational Autoencoders and Nonlinear ICA: A Unifying Framework

•

13xx - Programs as Causal Models - Speculations on Mental Programs and Mental Representation

Discovery

•

2405 - Demystifying Amortized Causal Discovery with Transformers

•

2402 - Efficient Causal Graph Discovery Using Large Language Models

•

2206 - Causal Dynamics Learning for Task-Independent State Abstraction

•

2204 - Learning to Induce Causal Structure

•

2202 - DECI - Deep End-to-end Causal Inference

•

2011 - Causal Autoregressive Flows

•

20xx - Causal Discovery with Reinforcement Learning

•

20xx - Differentiable Causal Discovery from Interventional Data

Benchmarks

•

2405 - Smoke and Mirrors in Causal Downstream Tasks

Causality in NeuroCog

•

2409 - Theory Is All You Need: AI, Human Cognition, and Causal Reasoning

•

2406 - Counterfactual Simulation in Causal Cognition (by Tobias Gerstenberg)

•

2405 - The Development of Human Causal Learning and Reasoning

•

2001 - What is Causal Cognition

•

1707 - Changes in cognitive flexibility and hypothesis search across human life history from childhood to adolescence to adulthood

Compositional Generalization

Position Papers

•

2402 - Compositional Generative Modeling - A Single Model is Not All You Need

•

2302 - Modular Deep Learning

General Representation

•

2501 - CELEBI - A Compressive-Expressive Communication Framework for Compositional Representations

•

2501 - Compositional Generalization Requires More Than Disentangled Representations

•

2411 - Interaction Asymmetry - A General Principle For Learning Composable Abstractions

•

2407 - Deciphering the Role of Representation Disentanglement - Investigating Compositional Generalization in CLIP Models

•

2406 - Discrete Dictionary-based Decomposition Layer for Structured Representation Learning [TPR] [Vision]

•

2310 - Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

•

1908 - The Compositionality of Neural Networks: Integrating Symbolism and Connectionism [Hupkes]

Algorithmic

•

2303 - The New XOR Problem

Composable World Models

•

2404 - Compete and Compose: Learning Independent Mechanisms for Modular World Models

•

20xx - Compositional Visual Generation with Energy Based Models [Igor Mordatch]

Neural Programmer

•

2306 - Learning Transformer Programs

Backbone Architectures (Compositional Transformer)

•

2405 - Transformers Can Do Arithmetic with the Right Embeddings

•

2405 - Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

•

2405 - Improving Transformers with Dynamically Composable Multi-Head Attention

•

2405 - Sparo: Selective Attention for Robust and Compositional Transformer Encodings for Vision

•

2404 - When Can Transformers Reason with Abstract Symbols?

•

2311 - Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

•

2311 - What Algorithms can Transformers Learn? A Study in Length Generalization

•

22xx - Making Transformers Solve Compositional Tasks

•

2110 - Compositional Attention: Disentangling Search and Retrieval

Emergent Communication

•

2408 - Emergent Language in Open-Ended Environments

•

2406 - Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication

NeuroCog

•

2407 - Modularity in Biologically Inspired Representations Depends on Task Variable Range Independence

•

2406 - Binding in Hippocampal-Entorhinal Circuits Enables Compositionality in Cognitive Maps

•

2405 - The relational bottleneck as an inductive bias for efficient abstraction

•

2304 - Constructing Future Behaviour in The Hippocampal Formation Through Composition and Replay

•

2305 - Neural Knowledge Assembly in Humans and Neural Networks [Christopher Summerfield]

•

2304 - Hippocampal Spatio-Predictive Cognitive Maps Adaptively Guide Reward Generalization

•

 2209 - Symbols and Mental Programs: A Hypothesis About Human Singularity [Stanislas Dehaene]

•

2211 - Fast rule switching and slow rule updating in a perceptual categorization task [N. Daw]

•

2010 - Meta-Learning of Compositional Task Distributions in Humans and Machines [Thomas L. Griffiths]

•

20xx - Concepts and Compositionality - In Search of the Brain’s Language of Thought

Consciousness

NeuroCog

•

2407 - Conscious Artificial Intelligence and Biological Naturalism

•

2407 - What Does Decoding from The PFC Reveal About Consciousness? [Ned Block]

Diffusion Models

Tutorial

•

2406 - Step-by-Step Diffusion - An Elementary Tutorial

•

2405 - Building Diffusion Model's theory from ground up

•

2403 - Tutorial on Diffusion Models for Imaging and Vision

•

2401 - Demystifying Variational Diffusion Models

•

2208 - Understanding Diffusion Models - A Unified Perspective

Foundation

•

2411 - Towards a Mechanistic Explanation of Diffusion Model Generalization

•

2410 - One Step Diffusion via Shortcut Models [Efficiency]

•

2406 - Variational Flow Matching for Graph Generation [Flow Matching]

•

2312 - Boosting Latent Diffusion with Flow Matching [Flow Matching]

•

2307 - Flow Matching in Latent Space [Flow Matching]

•

2211 - Efficient Video Prediction via Sparsely Conditioned Flow Matching [Flow Matching]

•

2210 - Flow Matching for Generative Modeling [Flow Matching]

•

2209 - Diffusion Posterior Sampling for General Noisy Inverse Problems [Inverse Problem]

•

2207 - Classifier-Free Diffusion Guidance [Guidance]

•

2202 - Progressive Distillation for Fast Sampling of Diffusion Models [Efficiency]

•

2105 - Diffusion Models Beat GANs on Image Synthesis [→ Classifier-Guidance] [Guidance]

•

2010 - DDIM - Denoising Diffusion Implicit Models

•

2006 - DDPM - Denoising Diffusion Probabilistic Models

Discrete Diffusion

•

2503 - Block Diffusion - Interpolating Between Autoregressive and Diffusion Language Models

•

2503 - Generalized Interpolating Discrete Diffusion 

•

2502 - Spatial Reasoning with Denoising Models

•

2412 - Simple Guidance Mechanisms For Discrete Diffusion Models [Guidance] [Discrete]

•

2410 - G2D2: Gradient-guided Discrete Diffusion for image inverse problem solving [Discrete]

•

2408 - Discrete Flow Matching [Discrete]

•

2406 - Simple and Effective Masked Diffusion Language Models [Discrete]

•

2406 - Simplified and Generalized Masked Diffusion for Discrete Data [Discrete]

•

2310 - SEDD - Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [Discrete]

•

2107 - D3PM - Structured Denoising Diffusion Models in Discrete State-Spaces [Discrete]

•

Christopher Beckham, PhD - My notes on discrete denoising diffusion models (D3PMs) [Discrete]

•

2102 - Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions [Discrete]

Diffusion Posterior Sampling (Diffusion + SMC)

•

2412 - On Diffusion Posterior Sampling via Sequential Monte Carlo for Zero-Shot Scaffolding of Protein Motifs

•

2405 - Diffusion Posterior Sampling for Linear Inverse Problem Solving — a Filtering Perspective

•

2312 - Practical and Asymptotically Exact Conditional Sampling in Diffusion Models

Discrete Representation (VQ-VAE)

General

•

2408 - Discrete Flow Matching

•

2407 - Balance of Number of Embedding and their Dimensions in Vector Quantization

•

23xx - Resizing Codebook of Vector Quantization without Retraining

•

23xx - Straightening out The Straight through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

•

2310 - EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders

•

2309 - Finite Scalar Quantization: VQ-VAE Made Simple

•

2303 - Regularized Vector Quantization for Tokenized Image Synthesis

•

22xx - SQ-VAE: Variational Bayes on Discrete Representation with Self-Annealed Stochastic Quantization

•

22xx - Discrete Representations Strengthen Vision Transformer Robustness

•

2102 - dVAE - Zero-Shot Text-to-Image Generation 

•

20xx - Hierarchical Quantized Autoencoders

•

1711 - VQ-VAE - Neural Discrete Representation Learning

Exploration

General

•

2206 - BYOL-Explore: Exploration by Bootstrapped Prediction

GFlowNets

General

•

2412 - Amortizing Intractable Inference in Diffusion Models for Bayesian Inverse Problems

•

2410 - Adaptive Teachers For Amortized Samplers

•

2405 - Amortizing Intractable Inference in Diffusion Models for Vision, Language, and Control

•

24xx - Maximum Entropy GFlowNets with Soft Q-Learning

•

24xx - Generative Flow Networks as Entropy-Regularized RL

•

2310 - Local Search GFlowNets

•

2302 - DynGFN - Bayesian Dynamic Causal Discovery using Generative Flow Networks

•

2209 - SubTB - Learning GFlowNets from Partial Episodes for Improved Convergence and Stability

Interactive Embodied Agents

General

•

2310 - SmartPlay - A Benchmark for LLMs as Intelligent Agents

Robotics

•

2406 - DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning 

•

2406 - Language Guided Skill Discovery

•

2405 - Vision-based Manipulation from Single Human Video with Open-World Object Graphs

•

2405 - From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control

•

2312 - SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

•

2307 - Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

•

2210 - Language-Table: Interactive Language - Talking to Robots in Real Time

Minecraft

•

2407 - Odyssey: Empowering Agents with Open-World Skills

•

2403 - MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control

•

2312 - MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception

Crafter

•

2404 - AgentKit: Structured LLM Reasoning with Dynamic Graphs

LLMs

Position Papers & Survey

•

2406 - From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

•

2402 - Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI

LLM Agent

•

DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

LLM Reasoning

•

2504 - Brains vs. Bytes - Evaluating LLM Proficiency in Olympiad Mathematics

•

2501 - DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

•

2501 - rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking [Math]

•

2412 - Scaling of Search and Learning - A Roadmap to Reproduce o1 from Reinforcement Learning Perspective [o1] [Reasoning]

•

2412 - Mastering Board Games by External and Internal Planning with Language Models

•

2412 - Training Large Language Models to Reason in a Continuous Latent Space

•

24xx - SELF-EXPLORE - Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards

•

2411 - Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

•

2409 - RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation

•

2409 - Rule Extrapolation in Language Models: A Study of Compositional...

•

2409 - Looped Transformers for Length Generalization

•

2408 - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

•

2407 - NuminaMath - The largest public dataset in AI4Maths with 860k pairs of competition math problems and solutions [Benchmark]

•

2407 - Recursive Introspection - Teaching Language Model Agents How to Self-Improve

•

2407 - Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning

•

2406 - Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

•

2405 - Learning to Reason via Program Generation, Emulation, and Search

•

2405 - From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step

•

2405 - AlphaMath Almost Zero - Process Supervision Without Process

•

2405 - Understanding Transformer Reasoning Capabilities via Graph Algorithms

•

2404 - Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo

•

2403 - Machine Learning and Information Theory Concepts Towards an AI Mathematician

•

2402 - Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

•

2312 - Math-Shepherd - Verify and Reinforce LLMs Step-by-step without Human Annotations 

•

23xx - Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design

•

23xx - Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models

•

2309 - AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training

•

2308 - Reinforced Self-Training (ReST) for Language Modeling

•

2308 - Graph of Thoughts - Solving Elaborate Problems with Large Language Models

•

2305 - Let’s Verify Step by Step

•

2305 - Reasoning with Language Model is Planning with World Model

•

2302 - On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark) 

•

2212 - Prompting Is Programming: A Query Language for Large Language Models

•

2210 - Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-Of-Thought

•

2110 - Training Verifiers to Solve Math Word Problems

GPT-o1 Analysis

•

2411 - o1-Coder: an o1 Replication for Coding

•

2411 - Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

•

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

•

BeyondBacktesting on Twitter / X

•

BeyondBacktesting on Twitter / X

•

Rafael Rafailov on Twitter / X

•

Subbarao Kambhampati (కంభంపాటి సుబ్బారావు) on Twitter / X

•

2410 - When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1

•

Speculations on Test-Time Scaling (o1)

•

O1 Replication Journey – Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Causal

•

2402 - Efficient Causal Graph Discovery Using Large Language Models

•

Awesome-LLM-causal-reasoning

Understanding

•

2406 - Transformers need glasses! Information over-squashing in language tasks

•

2406 - The Geometry of Categorical and Hierarchical Concepts in Large Language Models

In Brain

•

1709 - Language, Mind and Brain

Memory

General

•

2405 - The Memory Systems of The Human Brain and Generative Artificial Intelligence

•

2405 - In Search of Dispersed Memories: Generative Diffusion Models Are Associative Memory Networks

•

2212 - Retrieval-Augmented Diffusion Models

Associative Memory

•

2501 - Titans - Learning to Memorize at Test Time

•

2501 - Episodic and Associative Memory From Spatial Scaffolds In The Hippocampus [neurocog]

•

2501 - Test-time regression - a unifying framework for designing sequence models with associative memory

•

2412 - Memorization to Generalization: The Emergence of Diffusion Models from Associative Memory

•

2412 - Memory Layers at Scale

•

2405 - Memory Mosaics

•

2406 - Parallelizing Linear Transformers with the Delta Rule over Sequence Length

NeuroCog

•

2306 - Organizing memories for generalization in complementary learning systems

NeuroAI & NeuroCog General

General

•

2412 - Models of human hippocampal specialization: a look at the electrophysiological evidence

•

2411 - A Tale of Two Algorithms - Structured Slots Explain Prefrontal Sequence Memory and Are Unified with Hippocampal Cognitive Maps

•

2409 - A Neural Mechanism for Compositional Generalization of Structure in Humans 

•

2405 - Curiosity and The Dynamics of Optimal Exploration

•

2308 - Cognitive graphs: Representational substrates for planning

•

23xx - Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks

•

2209 - Symbols and Mental Programs: A Hypothesis About Human Singularity

Traveling Waves

•

2507 - State Space Models Naturally Produce Traveling Waves, Time Cells, and Scale to Abstract Cognitive Functions

•

2502 - Traveling Waves Integrate Spatial Information Through Time

•

2309 - Wave-RNN - Traveling Waves Encode the Recent Past and Enhance Sequence Learning

•

2304 - Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks

•

1912 - Wave physics as an analog recurrent neural network

Neurosymbolic

General

•

2501 - Developing a Foundation of Vector Symbolic Architectures Using Category Theory

•

2411 - Towards Learning to Reason: Comparing LLMs with Neurosymbolic on Arithmetic Relations in Abstract Reasoning

•

2411 - LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation

•

2409 - Rule Extrapolation in Language Models: A Study of Compositional...

•

2407 - Symbolic metaprogram search improves learning efficiency and explains rule learning in humans

•

2405 - Investigating Symbolic Capabilities of Large Language Models

•

2404 - AgentKit - Structured LLM Reasoning with Dynamic Graphs

•

2401 - A model of conceptual bootstrapping in human cognition

•

23xx - Neurosymbolic AI: the 3rd wave

•

2306 - Bayesian Program Learning by Decompiling Amortized Knowledge

•

2306 - From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought

•

2402 - WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment

•

2107 - Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning

•

2106 - DreamCoder: Bootstrapping Inductive Program Synthesis with Wake-Sleep Library Learning

Neural Program Synthesis / Induction

•

2405 - Learning to Reason via Program Generation, Emulation, and Search

•

2405 - Diffusion On Syntax Trees For Program Synthesis 

•

2310 - Local Search and the Evolution of World Models

•

18xx - Neural Program Synthesis from Diverse Demonstration Videos

•

17xx - Program Synthesis

•

1512 - Human-Level Concept Learning Through Probabilistic Program Induction

•

13xx - Programs as Causal Models - Speculations on Mental Programs and Mental Representation

Object-Centric

General

•

2503 - Unifying Causal and Object-Centric Representation Learning

•

2411 - Interaction Asymmetry - A General Principle For Learning Composable Abstractions

•

2408 - Zero-Shot Object-Centric Representation Learning

•

2406 - Identifiable Object-Centric Representation Learning via Probabilistic Slot Attention

•

2405 - Neural Language of Thought Models

•

2403 - Slot Abstractors: Toward Scalable Abstract Visual Reasoning

•

2312 - Reusable Slotwise Mechanisms

•

2310 - COSMOS - Neurosymbolic Grounding for Compositional World Models

•

2310 - Object-Centric Architectures Enable Efficient Causal Representation Learning

•

2307 - Compositional Generalization from First Principles

•

2205 - HOWM - Toward Compositional Generalization in Object-Oriented World Modeling (ICML22)

•

21xx - The Role of Disentanglement in Generalisation

NeuroCog

•

2304 - Solving the Binding Problem - Assemblies Form when Neurons Enhance Their Firing Rate—they Don’t Need to Oscillate or Synchronize 

•

2109 - Capturing the Objects of Vision with Neural Networks

•

1805 - A New Approach to Solving the Feature Binding Problem in Primate Vision

Rotating Features & Complex-Value Synchrony

•

2410 - Artificial Kuramoto Oscillatory Neurons

•

2410 - Tracking Objects that Change in Appearance with Phase Synchrony

•

2405 - Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery

•

23xx - Representing Part-Whole Hierarchy with Coordinated Synchrony in Neural Networks

•

2306 - Rotating Features for Object Discovery

•

2305 - Contrastive Training of Complex-Valued Autoencoders for Object Discovery

•

2204 - Complex-Valued Autoencoders for Object Discovery

•

1312 - Neuronal Synchrony in Complex-Valued Deep Networks

Planning & World Models & RL

Inbox

•

2310 - Skipper - Combining Spatial and Temporal Abstraction in Planning for Better Generalization [YB]

Position Papers

•

2405 - What is Lacking in Sora and V-JEPA's World Models? - A Philosophical Analysis of Video AIs Through the Theory of Productive Imagination

•

2403 - Are Video Generation Models World Simulators?

RL General

•

2401 - Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View

Offline RL

•

2110 - Offline Reinforcement Learning with Implicit Q-Learning

World Representations

•

2404 - V-JEPA - Revisiting Feature Prediction for Learning Visual Representations from Video

•

2307 - IVCL - Does Visual Pretraining Help End-to-End Reasoning?

•

2206 - BYOL-Explore: Exploration by Bootstrapped Prediction

Diffuser Planning

•

2408 - Diffusion Model for Planning - A Systematic Literature Review

•

2408 - Diffusion Models Are Real-Time Game Engines

•

2407 - Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

•

2405 - Diffusion for World Modeling: Visual Details Matter in Atari

•

2405 - Hierarchical World Models as Visual Whole-Body Humanoid Controllers

•

2405 - PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer

•

2404 - [Tutorial Blog] Diffusion Models for Video Generation (Lil’Log)

•

2403 - Subgoal Diffuser: Coarse-to-fine Subgoal Generation to Guide Model Predictive Control for Robot Manipulation

•

2402 - Stitching Sub-Trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL

•

2402 - DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

•

2401 - DiffuserLite - Towards Real-time Diffusion Planning

•

2401 - Simple Hierarchical Planning with Diffusion

•

2401 - Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View

•

2312 - SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

•

2309 - Compositional Foundation Models for Hierarchical Planning

•

2309 - Reasoning with Latent Diffusion in Offline Reinforcement Learning

•

2306 - Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models

•

2303 - Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

•

2302 - Learning Universal Policies via Text-Guided Video Generation

•

2209 - QDT - Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL

Dreamers & Dyna

•

2502 - M³: A Modular World Model over Streams of Tokens

•

2502 - Improving Transformer World Models for Data-Efficient RL

•

2407 - Δ-IRIS: Efficient World Models with Context-Aware Tokenization

•

2406 - DART - Learning to Play Atari in a World of Tokens

•

2406 - A New View on Planning in Online Reinforcement Learning

•

2405 - CarDreamer: Open-source Learning Platform for World Model Based Autonomous Driving

•

2403 - Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization

MCTS

•

2406 - UniZero: Generalized and Efficient Planning with Scalable Latent World Models

Memory

•

2309 - Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents

Applications

•

2405 - CarDreamer: Open-source Learning Platform for World Model Based Autonomous Driving

Benchmarks

•

2405 - AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

•

2402 - Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

Misc

PML Foundation

General

•

2402 - iDEM - Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

•

0312 - The IM Algorithm: A Variational Approach to Information Maximization

RL General

Unsupervised RL

•

2408 - Unsupervised-to-Online Reinforcement Learning

Safety & Alignment

Position

•

2502 - Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

•

2404 - Regulating Advanced Artificial Agents [Y. Bengio, Science]

•

24xx - Safeguarded AI: Constructing Guaranteed Safety v1.1

•

2310 - AI Alignment: A Comprehensive Survey [Peking University]

•

2309 - Anthropic's Responsible Scaling Policy 

•

2309 - Provably Safe Systems: The Only Path to Controllable AGI

•

2306 - An Overview of Catastrophic AI Risks

•

2207 - Toward Verified Artificial Intelligence

•

2209 - The Alignment Problem from a Deep Learning Perspective

•

2112 - Eliciting latent knowledge: How to tell if your eyes deceive you

•

Alignment Forum https://www.alignmentforum.org/

LLM

•

2404 - BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models

•

2404 - Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo

•

2404 - Your Language Model is Secretly a Q-Function

•

2312 - Training Chain-Of-Thought via Latent-Variable Inference

•

2310 - Amortizing Intractable Inference in Large Language Models [github]

•

2309 - Making Large Language Models Better Reasoners with Alignment

•

2304 - Bayesian Low-Rank Adaptation for Large Language Models [ICLR24]

•

2304 - Fundamental Limitations of Alignment in Large Language Models

•

2309 - Don’t Throw Away Your Value Model! Generating More Preferable Text with Value-Guided Monte-Carlo Tree Search Decoding

•

2205 - RL with KL Penalties is Better Viewed as Bayesian Inference

•

2202 - Red Teaming Language Models with Language Models

•

2111 - An Explanation of In-Context Learning as Implicit Bayesian Inference

Embodied Agent

Diffuser

•

2312 - Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model [NeurIPS’23]

•

2306 - Safe Planning with Diffusion Probabilistic Models

Safe Reinforcement Learning & Planning

•

2405 - Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning

•

2402 - Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments

•

2310 - Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

•

2307 - SafeDreamer: Safe Reinforcement Learning with World Models

•

2306 - Safe Planning with Diffusion Probabilistic Models

•

2306 - Trajectory Generation, Control, and Safety with Denoising Diffusion Probabilistic Models

•

2304 - Approximate Shielding of Atari Agents for Safe Exploration

•

2101 - Shielding Atari Games with Bounded Prescience

Scientist AI

Truthness

•

2503 - Reasoning to Learn from Latent Thoughts 

•

2503 - From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering

•

2402 - Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic 

•

2307 - Measuring Faithfulness in Chain-of-Thought Reasoning

•

2305 - Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

System 2 AI

ARC

LLM Reasoning

Causality

Compositional Generalization

Test-Time Compute & Training (TTC & TTT)

•

2501 - Optimizing LLM Test-Time Compute Involves Solving a Meta-RL Problem [ML@CMU Blog]

•

2501 - Test-time Computing: from System-1 Thinking to System-2 Thinking

•

2412 - NeurIPS 2024 Tutorial: Meta-decoding algorithms

◦

2406 - From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

•

2411 - The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

•

2408 - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

•

2408 - Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

•

2404 - Stream of Search (SoS): Learning to Search in Language

•

2403 - Common 7B Language Models Already Possess Strong Math Capabilities

•

2312 - Training Chain-of-Thought via Latent-Variable Inference
