Kirolos Ataallah

Research Engineer & AI Specialist

AI and Computer Vision Engineer with 4+ years of experience specializing in Vision-Language Models and Video LLMs. First-author contributor to publications at top-tier venues (ECCV 2024, EMNLP 2025).

About Me

Passionate about advancing AI through innovative research and practical applications

Professional Background

I am a Senior Computer Vision Engineer at CreativAI, specializing in large-scale video analytics and AI research. With a Master's degree in AI and Data Science from the University of Ottawa (GPA: A+), I combine deep research expertise with practical engineering skills.

Research Focus

My research centers on Vision-Language Models (VLMs) and Video Large Language Models, with published work at prestigious conferences including ECCV 2024 and EMNLP 2025. I've contributed to breakthrough projects in video understanding and multimodal AI.

Current Work

Currently working on large-scale video projects analyzing 1000+ hours of content using efficient algorithms, parallel computing, and video knowledge graphs. Previously collaborated with the VISION-CAIR lab at KAUST under the supervision of Prof. Mohammed El Hossieny.

4+

Years Experience

4

Published Papers

15+

Major Projects

154

Total Citations

Professional Experience

Journey through innovative roles in AI and computer vision

Jun 2024 - Present

Senior Computer Vision Engineer

CreativAI - San Francisco, CA (Remote)

Working on large-scale video projects analyzing 1000+ hours of content with efficient algorithms, parallel computations, and video knowledge graphs. Developing cutting-edge solutions for video analytics and insights aggregation.

Computer Vision, Video Analytics, Video Understanding, Large Scale, Parallel Processing

Sep 2023 - Jun 2024

Computer Vision Researcher (Visiting Student)

VISION-CAIR Lab, KAUST

Conducted research under the supervision of Prof. Mohammed El Hossieny, focusing on vision-language models, long video understanding, and video benchmarking. Published 3 papers at top-tier conferences.

MiniGPT4-Video (CVPR W 2024)
Goldfish (ECCV 2024)
InfiniBench (EMNLP 2025)
Research, Vision-Language Models, Video Understanding, Long Video Understanding Benchmark

Jan 2022 - May 2024

Computer Vision Engineer

Plaibook - AI Company

Analyzed football matches using computer vision techniques including object detection, bird's-eye-view generation, player tracking, jersey classification, and event detection.

Sports Analytics, Object Detection, Player Tracking

Jul 2022 - Jan 2023

AWS Master's Project Collaboration

University of Ottawa & Amazon Web Services

Developed a virtual fitting room mobile application using GANs. Deployed it on AWS using the EC2, S3, ECS, and ECR services. Published a research paper on the work in the IJACSA journal.

GANs, AWS, Mobile Development

Education

Academic foundation in AI and computer science

2023
Master's

M.Sc. Artificial Intelligence & Data Science

University of Ottawa, Canada

GPA: A+
Thesis: Virtual Fitting Room using GANs

2020
Bachelor's

B.Sc. Computer Engineering

Benha University, Egypt

GPA: 88.3% (3rd Place)
Project: Interactive VR with NLP

Research & Publications

Contributing to the advancement of AI and computer vision

CVPR W 2024

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Developed a multimodal Large Language Model specifically designed for video understanding, capable of processing temporal visual and textual data with state-of-the-art results.

IJACSA Journal

A Cost-Efficient Approach for Creating Virtual Fitting Room using Generative Adversarial Networks (GANs)

Master's thesis work on developing a virtual fitting room application using GANs, deployed on AWS cloud infrastructure for scalable fashion technology solutions.

Featured Projects

Showcasing innovative solutions in AI and computer vision

Football Match Analysis System

Complete computer vision pipeline for football analysis including player detection, tracking, jersey classification, and event detection using YOLOv8 and custom algorithms.

YOLOv8, Object Tracking, Sports Analytics

YOLOv8 Contributions

Contributed to the official YOLOv8 library with improvements in object tracking algorithms, enhancing performance for real-time applications.

PyTorch, Object Detection, Open Source

Arabic Sentiment Analysis

Advanced sentiment analysis for Arabic Twitter data using state-of-the-art models including AraBERTv2, MARBERT, and custom preprocessing techniques.

BERT, NLP, Arabic Text

Network Intrusion Detection

Deep learning-based network intrusion detection system using advanced neural network architectures for cybersecurity applications.

Deep Learning, Cybersecurity, TensorFlow

Humanly Interactive VR

Bachelor's graduation project: Virtual reality environment with NPCs that respond to user speech using NLP models, creating immersive interactive experiences.

VR, NLP, Unity

Stock Price Prediction

Netflix stock price prediction model incorporating Twitter sentiment analysis, combining financial data with social media sentiment for enhanced predictions.

Time Series, Sentiment Analysis, Finance

Skills & Expertise

Technical proficiencies and professional capabilities

AI & Machine Learning

Computer Vision
Deep Learning
Vision-Language Models
Video Understanding
Large Language Models

Programming & Frameworks

Python
PyTorch
TensorFlow
OpenCV

Cloud & DevOps

AWS
Google Cloud Platform
Docker
CI/CD
MLOps

Certifications

AWS Machine Learning Specialty

May 2022

AWS Cloud Practitioner

Mar 2022

Deep Learning Specialization

Stanford/Coursera

Data Analysis Professional

Udacity Nanodegree

Get In Touch

Let's collaborate on innovative AI and computer vision projects

Email

kirolosatef1997@gmail.com

Phone

(+20) 1280636202

Location

El Obour, Cairo, Egypt