Artificial Intelligence and Machine Learning PhD Student and Engineer specialised in state‑of‑the‑art Natural Language Processing techniques. I have worked on several NLP projects, including AraBERT, AraELECTRA, AraGPT2, and CamemBERTa.
Mar 2022 - Present
Paris, France
Mar 2023 - Present
Mar 2022 - Feb 2023
Jan 2021 - Feb 2022
Beirut, Lebanon
Jan 2021 - Feb 2022
Feb 2018 - Jan 2021
Beirut, Lebanon
Sep 2018 - Jan 2021
Feb 2018 - Jan 2020
May 2020 - May 2020
Beirut, Lebanon
May 2020 - May 2020
Feb 2020 - Apr 2020
Beirut, Lebanon
Feb 2020 - Apr 2020
June 2016 - Aug 2016
Beirut, Lebanon
June 2016 - Aug 2016
Feb. 2018 ‑ Sep. 2020
Master of Engineering in Electrical and Computer Engineering
Major Area: Artificial Intelligence and Machine Learning Systems
Minor Area: Software, Networking and Security
Scholarships: Awarded Graduate Fellowship with full tuition coverage
Thesis: Transformers for Arabic Natural Language Understanding and Generation
Supervisor: Prof. Hazem Hajj
Awards: Abdul Hadi Debs Endowment Award for Academic Excellence Nominee
Sep. 2013 ‑ Jul. 2017
Bachelor of Engineering in Computer and Communication Engineering
Focus: Communications and Networking, Antennas and Propagation, and Digital Signal Processing
Awards: Dean’s List for 6 semesters
Pre-trained Transformers for Arabic Language Understanding and Generation (Arabic BERT, Arabic GPT2, Arabic ELECTRA)
Code for training DeBERTa V3 from scratch; used to train CamemBERTa.
Self-Hosted Large Language Models for Overleaf
Terminal UI for monitoring SLURM jobs
TUI for browsing, canceling, and inspecting OAR jobs on a cluster using only the terminal.
The competition required building machine learning models that determine the sentiment (positive, negative, or neutral) of Arabic tweets, with prizes of 10,000 USD, 5,000 USD, and 2,000 USD for the first, second, and third place winners, respectively. The competition ran for 3 months, with 74 teams participating and submitting their predictions. Wissam ranked 3rd on the public leaderboard when the submission window closed. Top-ranked participants were then invited to share their code with the organizers for a final evaluation on a separate private dataset of 20,000 tweets. Wissam’s submission scored the highest on the private dataset.
The rise of offensive speech, including vulgar or targeted insults, reflects increasing polarization in society, amplified by social media. This shared task aims to detect such speech in Arabic social media using the SemEval 2020 dataset, which includes tweets manually annotated for offensiveness (OFF or NOT_OFF) and hate speech (HS or NOT_HS). The dataset is split into train, dev, and test sets, with separate subtasks for offensive language detection and hate speech detection. Subtask B, identifying hate speech, is more challenging because hate speech is less prevalent in the data. Our team won second place in both subtasks, advancing research on offensive content and hate speech identification in Arabic tweets.
The Abdul Hadi Debs Endowment Award for Academic Excellence is a $1,000 endowment to a student at the graduate level who has an outstanding academic record and has demonstrated research capabilities through a paper, project, or thesis deemed by the faculty to be worthy of publication. I was nominated for this award by the Department of Electrical and Computer Engineering.
In Track 1-A, we developed a machine learning model to detect fake news and identify the news domain, using an annotated training corpus provided via email. In Track 1-B, we built a model to distinguish between bot and human Twitter accounts, also using an annotated dataset provided by Marc Jones. I received recognition for creating the best domain detection system in Track 1-A.