
dtthinky!

Hello, I usually go by the "dt.thinky!" alias, though my name is Doğukan Tuna. Currently, I'm working on high-compute RL (reinforcement learning) research focused on superlearners that rely on experience streams growing at runtime. My focus areas are continual model-based RL, experience replay and streaming, and general digital/physical environments. You may enjoy reading some of the articles I've written.

Check out The Connectionism Codex Substack

Reach out:

My notes, experiments and things I find worth sharing:

On this website:

Build!

* ContinuaLM

An experiential AI system that grows a superlearner from experience at runtime.

Research!

A running log of posts, experiments, and longer notes from my work.

View all

Neural Networks & New Kinds!

I like compression as a lens on learning systems, representation learning, and generalization. Kolmogorov complexity gives a clean way to think about how much structure a model can capture and how we might reason about concise descriptions of data.

Kolmogorov complexity as the ultimate compressor:

K(X) = length of the shortest program that outputs X

If C is a computable compressor, then for all X:

K(X) ≤ ℓ(C(X)) + K(C) + O(1)

Proof sketch (the simulation argument): the decompressor for C, run on the compressed string C(X), is itself a program that outputs X, so its length upper-bounds K(X).
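
To make the bound concrete, here is a minimal sketch in Python, assuming zlib as the computable compressor C (an illustrative choice; any lossless compressor plays the same role):

```python
import zlib

# A computable compressor C witnesses an upper bound on K(X):
# the decompressor for C, given the string C(X), is a program
# that outputs X, so K(X) ≤ ℓ(C(X)) + K(C) + O(1).
X = b"abab" * 1000  # highly structured data compresses well

cx = zlib.compress(X, 9)
print(f"len(X)   = {len(X)} bytes")
print(f"l(C(X))  = {len(cx)} bytes")

# K(C) + O(1) is the fixed cost of the decompressor itself; it does
# not grow with X, so for large structured X the bound is informative.
assert zlib.decompress(cx) == X  # the "program" really reproduces X
```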

Computing K(X)

* Undecidable / not computable: no algorithm returns K(X) for all X (it would solve the halting problem)

* A deep NN/transformer is a parallel computer with finite resources

MAGICAL!

* NNs can simulate and run programs!

They are little computers: networks are circuits, and circuits are computing machines.

SGD searching over program space!

micro-Kolmogorov complexity

Fitting a NN with SGD lets us compute a miniature Kolmogorov compressor:

micro-K(f) ≈ bit-length of the weights within a fixed architecture

min_{f ∈ F} [ loss(f) + λ · micro-K(f) ]

Lower description length ⇒ better generalization.
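
Here's a minimal sketch of that objective in Python, assuming PyTorch, a toy regression task, and the Gaussian-prior (MDL) reading of weight decay as the micro-K(f) proxy; the architecture, σ, and λ below are illustrative choices, not values from this page:

```python
import torch
import torch.nn as nn

# Toy regression: fit f ∈ F (a small MLP) with SGD on
#   loss(f) + λ · micro-K(f),
# approximating micro-K(f) by the description length of the weights
# under a Gaussian prior N(0, σ²), i.e. the MDL view of weight decay.

torch.manual_seed(0)
X = torch.linspace(-1, 1, 128).unsqueeze(1)
y = torch.sin(3 * X) + 0.1 * torch.randn_like(X)

f = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.SGD(f.parameters(), lr=0.05)

LAMBDA, SIGMA = 1e-3, 1.0
LN2 = torch.log(torch.tensor(2.0))

def micro_k_bits(model):
    # ≈ bits to encode the weights: -log2 p(w) under N(0, σ²),
    # dropping the constant term shared by all weight settings.
    return sum((p ** 2).sum() for p in model.parameters()) / (2 * SIGMA ** 2 * LN2)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(f(X), y) + LAMBDA * micro_k_bits(f)
    loss.backward()
    opt.step()

print(f"final objective: {loss.item():.4f}")
print(f"micro-K(f) ≈ {micro_k_bits(f).item():.1f} bits")
```

The penalty here falls out of the choice of prior; swapping in a quantized-weight bit count or a sparsity-based code would change the micro-K proxy but not the shape of the objective.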

I'm extremely bullish on methods that make learning systems more efficient and more useful.