
dtthinky!
Hello, I usually go by the "dt.thinky!" alias, though my real name is Doğukan. I'm inspired by automated AI research, multi-agent systems, AI infrastructure for accelerating scientific discovery, and reinforcement-learning research built on experience replay, and I do research across these areas. But who can wear just one hat? I enjoy exploring how AI can develop new superlearning capabilities through experience, and I'm incredibly optimistic about where this new kind of knowledge generation can lead. Right now, I'm focused on scaling AI compute and on superlearning architectures. You may enjoy reading some of the articles I've written.
My notes, experiments and things I find worth sharing:
The work that matters most to me lives where automated AI research, neuroscience, and RL research overlap. I like learning in public, writing down experiments, and keeping enough structure around the process when the problem calls for it.
Neural Networks & New Kinds!
I like compression as a lens on learning systems, representation learning, and generalization.
Kolmogorov complexity gives a clean way to think about how much structure a model can capture and how we might reason about concise descriptions of data.
Kolmogorov complexity as the ultimate compressor:
K(X) = length of the shortest program that outputs X
If C is a computable compressor, then for all X:
K(X) ≤ ℓ(C(X)) + K(C) + O(1)
Proof sketch (the simulation argument): a short program can run C's decompressor on C(X) to reproduce X, so the compressed length plus a constant for describing C upper-bounds K(X).
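To make the bound concrete, here's a tiny illustration of my own (zlib as the compressor is my choice, not from any formal source): any off-the-shelf computable compressor gives an upper bound on K(X), paying only a constant for describing its decompressor.

```python
# Any computable compressor upper-bounds K(X): a short program can say
# "run the zlib decompressor on these bytes", so K(X) is at most the
# compressed length plus a constant for describing that decompressor.
import zlib

X = b"abab" * 1000                    # a highly regular string
compressed = zlib.compress(X, 9)
print(len(X), "->", len(compressed))  # 4000 -> a few dozen bytes
# K(X) <= len(compressed) + K(zlib decompressor) + O(1)
```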
Computing K(X):
* Undecidable: no algorithm computes K(X) in general
* A deep NN/transformer is a parallel computer with finite resources
MAGICAL!
* NNs can simulate and run programs!
↓
They are little computers
↓
They're circuits, and circuits are computing machines
↓
SGD is searching over program space!
micro-Kolmogorov complexity
Fitting a NN with SGD computes a miniature Kolmogorov compressor:
micro-K(f) ≈ bit-length of the weights within a fixed architecture
min_{f ∈ F} [ loss(f) + λ · micro-K(f) ]
Lower description length ⇒ better generalization.
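A minimal sketch of that objective in code, assuming a toy linear model and a smooth bit-length proxy of my own choosing (the quantization step `delta` and penalty weight `lam` are illustrative knobs, nothing canonical):

```python
# Hedged sketch: SGD on loss(f) + λ · micro-K(f), with a differentiable
# surrogate for the bit-length of the weights. A toy setup, not a recipe.
import jax
import jax.numpy as jnp

def desc_len_bits(params, delta=0.01):
    # Smooth proxy for the bits needed to encode each weight at resolution delta.
    return jnp.sum(jnp.log2(1.0 + jnp.abs(params) / delta))

def objective(params, x, y, lam=1e-3):
    pred = x @ params                      # a linear model stands in for f
    task_loss = jnp.mean((pred - y) ** 2)  # loss(f)
    return task_loss + lam * desc_len_bits(params)  # + λ · micro-K(f)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (64, 8))
true_w = jnp.zeros(8).at[0].set(3.0)       # sparse target: short description
y = x @ true_w

params = jnp.zeros(8)
grad_fn = jax.jit(jax.grad(objective))
for _ in range(500):
    params = params - 0.05 * grad_fn(params, x, y)
# The description-length penalty pushes irrelevant weights toward zero,
# which is the MDL intuition behind "lower description length ⇒ better generalization".
```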
I'm extremely bullish on methods that make learning systems more efficient and more useful.
Research!
A running log of posts, experiments, and longer notes from the work.
Introducing Verified Replay Distillation (VRD), a recipe for continual learning in verifiable domains
A short tour of VRD, an on-policy continual learning recipe that uses nothing more than a verifier, a replay buffer, and a failure-driven curriculum to teach a language model new task families without forgetting old ones.
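For a feel of the loop's shape, here is a toy of my own, not the actual VRD implementation: the "model" is just a lookup table standing in for an LLM, and the verifier is a trivial check on a synthetic task.

```python
# Toy shape of a verifier + replay buffer + failure-driven curriculum loop.
# Everything here is a placeholder; only the control flow is the point.
import random

model = {}                                   # stands in for the policy
replay_buffer, curriculum = [], []

def verify(task, answer):
    return answer == task * 2                # verifiable domain: y = 2x

tasks = list(range(10))
for step in range(200):
    task = random.choice(curriculum or tasks)        # failures get resampled
    answer = model.get(task, random.randint(0, 20))  # attempt the task
    if verify(task, answer):
        replay_buffer.append((task, answer))         # keep verified traces
        if task in curriculum:
            curriculum.remove(task)
    else:
        curriculum.append(task)              # failures drive the curriculum
    for t, a in replay_buffer[-4:]:
        model[t] = a                         # "distill" verified replays back in
```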
autoresearch-mamba: Karpathy-Style Autoresearch for Mamba-2, Mamba-3, and Hybrid Mamba-Transformer MoE
Karpathy-style autoresearch for Mamba-2, Mamba-3, and Nemotron-H style hybrid Mamba-Transformer MoE language models on MLX and GPU.
A self-improving skill catalog for AI agents
An open-source skill catalog that agents use, extend, and improve themselves. 19 skills covering the full LLM lifecycle, autonomous research, GPU/TPU/QPU programming, and scientific computing — built by agents, for agents.
Mem-RLM — Memory-Augmented Inference for Recursive Language Models
An open-source memory layer for Recursive Language Models that records execution trajectories, extracts reusable strategies, and injects them into future runs. Models stop starting cold and actually learn which approaches work for which problem types — 26% accuracy improvement on weaker models, fully stateful inference.
Claude Code-Time Skill Acquisition with Agent Teams
A team of agents researched, synthesized, and integrated a production-grade React Native skill into a shared knowledge base in under 15 minutes — just through coordination at Claude Code-time.
On Compression, Computation and the Space Between
Kolmogorov complexity, neural networks as program search and Wolfram's ruliology seem to be looking at the same thing from different rooms.
Defeating Nondeterminism in LLM Inference: Reproducing Batch-Invariant Ops (RMSNorm & Tiled Matrix Multiplication) in JAX
A learning log reproducing the implementation of batch-invariant NN operations in JAX, drawing from Thinking Machines Lab's seminal collaborative work, "Defeating Nondeterminism in LLM Inference."
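As a taste of the property at stake (my own minimal sketch, not the post's kernel-level reproduction): an RMSNorm whose reduction touches only its own row is invariant to how rows are batched.

```python
# Batch-invariant RMSNorm sketch in JAX: every statistic is reduced over
# the feature axis of a single row, in a fixed order, so the result is
# identical whether rows are processed together or one at a time.
import jax.numpy as jnp

def rmsnorm(x, gamma, eps=1e-6):
    rms = jnp.sqrt(jnp.mean(x * x, axis=-1, keepdims=True) + eps)
    return gamma * x / rms

x = jnp.arange(12.0).reshape(3, 4)
gamma = jnp.ones(4)
batched = rmsnorm(x, gamma)
rowwise = jnp.stack([rmsnorm(x[i], gamma) for i in range(3)])
assert jnp.allclose(batched, rowwise)  # same output regardless of batching
```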
Streaming deepagents and task delegation with real-time output
This post demonstrates how to implement streaming on top of the DeepAgents package in a multi-agent setup, with practical code examples and architectural patterns you can apply to your own projects.
Energetics of Allosteric Communication in Ubiquitin Revealed by Hybrid MCTS-Langevin Simulations
Exploring protein conformational landscapes and identifying potential allosteric communication pathways remain significant challenges in computational biophysics. This study presents a hybrid computational approach combining Monte Carlo Tree Search (MCTS) with Langevin Dynamics (LD) simulations using the OpenMM toolkit to enhance conformational sampling.
Build!
Experimental efforts aimed at systems capable of superlearning.
I'm learning the field in the open, sharing study notes, experiments, and trials as I go.
You can start with the latest article, then branch out from there.