My name is M(u|o)hammad Khalifa. I am a computer science master’s student at Cairo University. My master’s work is on low-resource multi-dialectal Arabic NLU and abstractive summarization. My main research interests are Generation, Representation Learning, and Few-Shot Learning. I am currently an Applied Scientist Intern at Amazon Web Services, supervised by Miguel Ballesteros and Prof. Kathleen McKeown, working on multiple projects including Dialogue Summarization and Document Image Understanding. Previously, I was an intern at NAVER Labs Europe, where I worked on Controlled Text Generation and Energy-Based Models with Hady Elsahar and Marc Dymetman. Before that, I worked as a machine learning research engineer at Sypron Solutions under the supervision of Dr. Alaa Khamis, where I investigated big-data time-series analysis, anomaly detection, and stream processing for preventive maintenance.

( Twitter / LinkedIn / Scholar / Github / CV )


March 1st, 2021: My internship at AWS was extended. Now I am working on Document Image Understanding!

Jan 12th, 2021: My paper “A Distributional Approach to Controlled Text Generation” was accepted to ICLR 2021 as an oral presentation (top 2.2% of submissions). [Paper] [Code] [Blog]

Jan 10th, 2021: My paper on Zero-shot multi-dialectal Arabic sequence labeling was accepted to EACL 2021! [Paper] [Code] [Bibtex]

October 12th, 2020: Started an applied scientist internship at Amazon Web Services, working with Miguel Ballesteros and Kathleen McKeown.

July 21st, 2020: My paper (with Professor Aminul Islam) on book success prediction via pre-trained embeddings and readability scores is now live on arXiv.

May 5th, 2020: Started an internship at NAVER Labs Europe, working on Controlled Text Generation with distributional constraints!


Muhammad Khalifa, Muhammad Abdul-Mageed, Khaled Shaalan. “Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling”. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021 (pp. 769–782). Association for Computational Linguistics. [Paper] [Code] [Bibtex]

Muhammad Khalifa, Hady Elsahar, Marc Dymetman. “A Distributional Approach to Controlled Text Generation”. In International Conference on Learning Representations 2021. [Paper] [Code] [Blog]

Mustafa Jarrar, Eman Karajah, Muhammad Khalifa, Khaled Shaalan. “Extracting Synonyms from Bilingual Dictionaries”. In Proceedings of the 11th International Global Wordnet Conference (GWC2021). Global Wordnet Association (2021). [Paper]

Muhammad Khalifa, Khaled Shaalan. “Character Convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks”. In Computer Speech & Language, Volume 58, 2019, pp. 335–346, ISSN 0885-2308. [Paper]

Muhammad Khalifa, Noura Hussein. “Ensemble Learning for Irony Detection in Arabic Tweets”. In FIRE (Working Notes), 2019, pp. 433–438. [Paper]