🪴 jaden lorenc

Augment WLASL

Last updated Aug 22, 2023

#stablediffusion

# previous attempt

My little attempt at training an entire Stable Diffusion model directly on just the WLASL dataset was fairly ineffectual. Months of training on university compute resources culminated in a bit of a blurry mess:

This is supposed to be the sign for ‘book’.

I spent probably about 100 hours, total, trying to force this to work, fiddling with hyperparameters and diving into the code to learn what might be improved. I tried my own set of augmentations: cropping the windows randomly, skipping frames, cropping the video at different times. The odyssey, with loss graphs and sample outputs, is logged here.
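For a sense of what those three augmentations look like together, here is a minimal sketch. It is not the code I actually trained with: the function name `augment_clip`, its parameters, and the default sizes are all made up for illustration, and it works on plain nested lists (frames, then rows, then pixels) just to stay dependency-free.

```python
import random


def augment_clip(video, rng, crop_hw=(96, 96), max_stride=3, clip_len=8):
    """Return a randomly cropped, frame-skipped, time-shifted clip.

    video: list of frames; each frame is a list of rows; each row is a list of pixels.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    t = len(video)
    h, w = len(video[0]), len(video[0][0])
    ch, cw = crop_hw
    if ch > h or cw > w:
        raise ValueError("crop window is larger than the frame")

    # Frame skipping: keep every `stride`-th frame.
    stride = rng.randint(1, max_stride)

    # Temporal crop: pick a random start so clip_len strided frames still fit.
    span = (clip_len - 1) * stride + 1
    if span > t:
        raise ValueError("video too short for this clip_len and stride")
    start = rng.randint(0, t - span)
    frames = video[start:start + span:stride]

    # Spatial crop: one random window, shared by every frame in the clip.
    top = rng.randint(0, h - ch)
    left = rng.randint(0, w - cw)
    return [[row[left:left + cw] for row in frame[top:top + ch]] for frame in frames]
```

Calling it with `rng = random.Random(0)` and a 30-frame video gives back a list of `clip_len` frames, each cut to `crop_hw`. Using the same window for every frame is a deliberate choice in this sketch: the signer stays put relative to the crop, so the only motion left is the sign itself.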

Truly, I needed an existing video diffusion model, but there wasn’t one available at the time.

# current go at it