I Built a Hebbian RNN: next steps
#hebbian
2024-02-09 08:06
Four things to work on next:
- Try out some new rules found in the survey, listed here
- Use a more optimized dataset, perhaps simpler. Or make mine more efficient.
- Run it on university compute.
- Try some completely different rules from scratch, based on some other papers’ code.
# new rules
Does Oja’s Rule work? I might even be able to generalize, though probably not in this next step. I may want to combine the general equation with a meta-learning approach, like in meta-learning-through-hebbian-plasticity. See: List of Hebbian Weight Update Rules
That’s not a simple approach. It would be simpler for me to just swap out rules like:
- Oja’s Rule
- competitive
- covariance rule
- maybe HPCA

I can just straight-shot test them in the current build, maybe excluding my existing weight normalization in hopes that these rules eliminate the need for it.
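A minimal sketch of what swapping rules in and out could look like for three of these (competitive learning would need a winner-take-all step, so I’ve left it out for now). This assumes the update step can see the pre- and post-synaptic activations and the weight matrix; the tensor names and batch convention are placeholders, not my actual code:

```python
import torch

def oja_update(W, pre, post, lr=0.01):
    """Oja's rule: Hebbian term plus an implicit decay that keeps weights bounded.
    W: (out, in), pre: (batch, in), post: (batch, out)."""
    hebb = post.t() @ pre / pre.shape[0]
    decay = (post ** 2).mean(0).unsqueeze(1) * W
    return W + lr * (hebb - decay)

def covariance_update(W, pre, post, lr=0.01):
    """Covariance rule: correlate deviations from mean activity instead of raw activity."""
    pre_c = pre - pre.mean(0, keepdim=True)
    post_c = post - post.mean(0, keepdim=True)
    return W + lr * (post_c.t() @ pre_c) / pre.shape[0]

def hpca_update(W, pre, post, lr=0.01):
    """HPCA / Sanger's rule: each unit learns what is left of the input after the
    units before it are accounted for, so explicit normalization may be unnecessary."""
    hebb = post.t() @ pre / pre.shape[0]
    explained = torch.tril(post.t() @ post) @ W / pre.shape[0]  # includes Oja-style diagonal decay
    return W + lr * (hebb - explained)
```

Each function returns a new weight tensor, so switching rules would just mean pointing the existing update step at a different one of these.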
I also had an interesting idea: replace my next-character prediction reward signal with a layer-wise one. It’s based on HPCA, but time-shifted: HPCA
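I haven’t pinned this down yet, but one possible reading (everything here is a placeholder, not a worked-out design): instead of broadcasting one next-character reward to every layer, each layer scores itself by how well its weights reconstruct its own input from the previous timestep’s output, in the spirit of HPCA’s reconstruction term.

```python
import torch

def layerwise_signal(W, pre_prev, post_prev):
    """One possible layer-local signal: negative reconstruction error of the
    previous timestep's input, shifted back one step in time.
    W: (out, in), pre_prev: (batch, in), post_prev: (batch, out)."""
    recon = post_prev @ W                      # reconstruct last step's input from last step's output
    return -((pre_prev - recon) ** 2).mean()   # higher is better
```

That per-layer number could then gate or scale each layer’s Hebbian update instead of the single global reward.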
# better dataset
I’m currently using Project Gutenberg, with 100-character strings chosen randomly. It’s one-hot encoded and fed in one character at a time. I’m not sure whether it’s a bottleneck in the learning process, for a few reasons:
- it may be pulling from disk each time
- it could potentially fit on gpu, which would be faster
- it might be too complex for my current fast-iteration needs

I should look into:
- optimized, common datasets/dataloaders that are probably already built into pytorch
- data that more explicitly suits my needs: it should incentivize taking long-term dependencies into account

I’m not sure what sort of dataset I’d be looking for in the latter case. I just want to see the thing train faster, especially on the backprop-RNN baseline. Perhaps a baby name dataset? But that doesn’t have long-term dependencies. Something low-vocab, like Simple English Wikipedia, may be it. Maybe some common computer vision dataset would do the trick. I don’t really want to do reinforcement learning environments.
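Separate from which dataset I pick, the first two concerns above are cheap to rule out: a rough sketch (the filename and batch shapes are placeholders, not my actual code) that encodes the whole corpus once, keeps it on the GPU as indices, and slices random 100-character windows from it:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

# read the corpus once instead of pulling from disk per sample
with open("gutenberg_corpus.txt", encoding="utf-8") as f:  # placeholder filename
    text = f.read()

chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

# the whole corpus lives on the GPU as an index tensor
corpus = torch.tensor([char_to_idx[c] for c in text], dtype=torch.long, device=device)

def sample_batch(batch_size=32, seq_len=100):
    """Slice random windows and one-hot them entirely on the GPU."""
    starts = torch.randint(0, corpus.numel() - seq_len - 1, (batch_size,), device=device)
    idx = starts.unsqueeze(1) + torch.arange(seq_len + 1, device=device)
    seqs = corpus[idx]                                    # (batch, seq_len + 1)
    inputs = F.one_hot(seqs[:, :-1], num_classes=len(chars)).float()
    targets = seqs[:, 1:]                                 # next-character targets
    return inputs, targets
```

If 100 characters really is too long a horizon for fast iteration, the same sampler can just shrink seq_len.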
pytorch has built-in text datasets: torchtext.datasets — Torchtext 0.17.0 documentation. Notably, the unsupervised ones look interesting:
- CC100 (Common Crawl)
- EnWik9 (Wikipedia)

The computer vision ones are interesting, too: Datasets — Torchvision 0.17 documentation. Especially Omniglot. This would have long-range dependencies, right? Like if I made it put out one pixel at a time, left to right and down?
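If Omniglot turns out to be the right call, turning each image into a left-to-right, top-to-bottom pixel sequence looks straightforward (the binarization threshold here is just a guess):

```python
import torch
from torchvision import datasets, transforms

# each Omniglot image is a 105x105 grayscale character drawing on a white background
to_pixel_seq = transforms.Compose([
    transforms.ToTensor(),                                    # (1, 105, 105), values in [0, 1]
    transforms.Lambda(lambda x: (x < 0.5).float().view(-1)),  # strokes -> 1, background -> 0, row-major
])

omniglot = datasets.Omniglot(root="data", background=True, download=True,
                             transform=to_pixel_seq)

pixel_seq, char_class = omniglot[0]
print(pixel_seq.shape)  # torch.Size([11025]) -- one timestep per pixel, left to right, then down
```

105 × 105 means an 11,025-step sequence per character, which is a much longer horizon than my 100-character strings, so it might need downsampling to stay in fast-iteration territory.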
# university compute
I have to find my BYU ID, contact Dr. Fulda, contact the Office of Research Computing, use a VPN; it’s a whole hassle. I’ll want my code to be pretty organized, portable, and optimized, so I guess I’m biased toward doing all the other things first. However, it’s possible that the better understanding of the learning dynamics I want is just a 2-day supercomputer run away. I could literally be wasting time right now if my existing code already just works.
# wipe and restart
Other papers already have Hebbian learning running; I may just need to slap on my reward signal and RNN loop and get it going. It could also be good to see how a new structure might prove superior.