- llm
- inference
- on-device-ai
- reinforcement-learning
- robotics
- ml
- thinking
-
Gated Delta Networks: Improving Mamba2 with the Delta Rule
A reading note on Yang et al., 2024: how combining Mamba2's decay gate with DeltaNet's selective memory update yields a flexible recurrent model, and why its parallel training algorithm requires a matrix inverse.
-
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
A reading note on Chi et al., 2024 — how a handheld gripper with a fisheye camera and mirrors enables hardware-agnostic demonstration collection anywhere, and what retargeting steps bridge the gap to a real robot.
-
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
A reading note on Chi et al., 2023 — how DDPM-style denoising is applied to robot action generation, why it handles multimodal demonstrations better than explicit policies, and what the inference cost looks like in practice.
-
Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
A reading note on ALOHA (Zhao et al., 2023) — how ACT uses action chunking and a Conditional VAE to solve fine manipulation tasks that cause standard imitation learning to fail completely.
-
Reinforcement Learning for Large Language Models
From the fundamentals of RL — value functions, policy gradients, Q-learning, PPO — to how these ideas translate directly into RLHF and DPO for LLM alignment.