Alt Transformer Project Description
#project I’m a driven Data Scientist Intern with a deep interest in machine learning and natural language processing. I recently finished a course in Deep Learning and have worked on impactful projects, including drug discovery research with a professor. My skills extend beyond Python and Git to include Flutter, a rapid cross-platform development tool. I follow the latest developments in machine learning closely, so I stay current with the field. Throughout my academic journey, I’ve worked on several projects, such as building parsers, investigating the RoseTTAFold algorithm, and leading a startup for aspiring authors. My most recent and personally most exciting projects are:
- A chatbot for my personal site that allows interviewers to interview “me” on demand.
- A fine-tuned video diffusion model for word-level sign language generation.
- A small variant of a transformer that attempts to abstract the attention operation away into a single matrix multiplication between the activations of two layers. It gives up the ability to vary the context length, but I suspect the added flexibility may create some advantages. It seems to train competitively with GPT-2, though it may fall behind because it lacks attention’s implicit bias toward preserving long-range dependencies. A sketch of the core idea follows this list.
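
One plausible reading of the mixing step is a learned context_len × context_len matrix applied across the sequence dimension between consecutive layers, similar in spirit to MLP-Mixer-style token mixing. The sketch below is my own illustration under that assumption; the class and parameter names (`LearnedTokenMixer`, `mix`, `proj`) and the causal mask are hypothetical, not the project’s actual code.

```python
import torch
import torch.nn as nn

class LearnedTokenMixer(nn.Module):
    """Replaces attention with one learned matrix multiplication that
    mixes token activations between two layers (hypothetical sketch)."""

    def __init__(self, context_len: int, d_model: int):
        super().__init__()
        # Learned (T, T) mixing matrix; parameterizing it at a fixed
        # context_len is what removes variable-length contexts.
        self.mix = nn.Parameter(
            torch.randn(context_len, context_len) / context_len ** 0.5
        )
        # Assumed causal mask so the block stays autoregressive like GPT-2.
        self.register_buffer(
            "causal_mask", torch.tril(torch.ones(context_len, context_len))
        )
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, context_len, d_model); context_len must match init.
        w = self.mix * self.causal_mask  # zero out future positions
        mixed = w @ x                    # the single matmul replacing attention
        return self.proj(mixed)

# Usage: a fixed 128-token context at GPT-2-small width.
block = LearnedTokenMixer(context_len=128, d_model=768)
out = block(torch.randn(2, 128, 768))  # -> (2, 128, 768)
```

The sketch makes the trade-off explicit: the mixing weights are position-indexed parameters rather than content-dependent scores, so nothing biases the model toward preserving long-range dependencies.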