AI Engineer @pipikwan working on @wasikan-kisewatisiwin || Multimodal Researcher
[Long paper, SVU Workshop @ ICCV 2025] The code repository for "Watch, Listen, Understand, Mislead: Tri-modal Adversarial Attacks on Short Videos for Content Appropriateness Evaluation"
The code repository of "An audio video-based multi-modal fusion approach for speech emotion recognition"
The code repository of "Multimodal Islamophobic Meme Identification and Classification" - 3rd MusIML Workshop NeurIPS 2024
[Champion - Bhashamul National NLP Datathon] A seq2seq model for transcribing Bengali regional text to IPA
An extracurricular activity management system that is easily customizable and adaptable for any university.
[Winner - Harvard HSIL Hackathon 2025 Dhaka Hub] Haven: A BCI-powered mental health speech-to-speech conversational agent that understands and responds with emotion