CV
Education
Ph.D. in Computer Vision & Deep Learning - Taiwan Tech (NTUST), ECE Department Oct. 2020 - Feb. 2026
Research Topic: Efficient Spatio-Temporal Deep Learning for Real-Time Video-Based Accident Anticipation and Understanding on Edge Devices
Advisors: Prof. Yie-Tarng Chen and Prof. Wen-Hsien Fang
M.Sc. in Software Engineering - University of West Bohemia (UWB), CSE Department Sept. 2017 - Jun. 2020
Thesis: Information Extraction from Heterogeneous Image-based Documents using Templates
Master’s Exchange Programme - Taiwan Tech (NTUST), CSIE Department Sept. 2018 - Feb. 2019
Courses: Intelligent Video Surveillance Systems, Intelligent Control System, Basic Oral Chinese
B.Sc. in Software Engineering - University of West Bohemia (UWB), CSE Department Sept. 2014 - Jun. 2017
Work Experience
Research Assistant - Taiwan Tech (NTUST) Oct. 2020 - Jan. 2026
Research Areas: Multi-modal (text-visual) models, spatio-temporal modelling, image-to-video adaptation, parameter-efficient training, domain adaptation, real-time deployment (NVIDIA Jetson Nano), and self-supervised learning.
Technologies: PyTorch, Python, C/C++, CUDA, TensorRT, VLMs, ViTs, LLMs, GNNs.
Teaching Assistant - Taiwan Tech (NTUST) Feb. 2021 - Jan. 2026
Courses: Intelligent Video Surveillance Systems, Database, Large Language Models and Applications.
Mentoring: Master’s student at our lab (research and code review).
Junior Researcher - NTIS Research Centre Sept. 2019 - Sept. 2020
Patented Innovation: Co-invented a multi-modal biometric approach (US Patent 2025/0184148) for person identification based on short audio-visual recordings.
Full-Stack Architecture: Integrated a C++ backend (OpenCV/DLib with CNNs) with a Node.js server that passes AI-generated hashes, encoded as a simple QR code, to a web UI for real-time person authorization.
Research & Software Engineer - Palaxo International Ltd. Sept. 2017 - Sept. 2020
Researched and developed an image analysis system for paperless documents that performs text recognition and automatically anonymizes sensitive data (using Tesseract-OCR / OpenCV / DLib / C++).
Software Engineer - Kerio Technologies, Inc. Jun. 2016 - Jun. 2017
Implemented a manageable router with security (firewall) and optimization features, built on a custom kernel, in C/C++.
Skills
- Technical skills: PyTorch, C/C++, CUDA, Python, TensorRT, ONNX, NVIDIA Jetson, Vision Transformers, Vision-Language Models, Large Language Models, Graph Neural Networks, Linux, OpenCV, Git, LaTeX.
- Soft skills: Problem solving, time management, critical thinking, independence, communication, responsibility.
- Development skills: Software design and architecture, agile development, teamwork, leadership.
- Communication skills: Czech (native), English (fluent), Slovak (fluent), Traditional Chinese (basic).
Publications
Lightweight Spatio-Temporal Modeling via Temporally Shifted Distillation for Real-Time Accident Anticipation
P. Patera, Y.-T. Chen and W.-H. Fang, "Lightweight Spatio-Temporal Modeling via Temporally Shifted Distillation for Real-Time Accident Anticipation," The Fourteenth International Conference on Learning Representations (ICLR), Rio de Janeiro, Brazil, 2026, pp. 1-20, url: https://openreview.net/forum?id=8zzfTSVds2
System and method for identification, authentication, and verification of a person based upon a short audio-visual recording of the person
K. Ekštein, M. Konopík, F. Pártl and P. Patera, "System and method for identification, authentication, and verification of a person based upon a short audio-visual recording of the person," US Patent 2025/0184148.
A Multi-modal Architecture with Spatio-Temporal-Text Adaptation for Video-based Traffic Accident Anticipation
P. Patera, Y.-T. Chen and W.-H. Fang, "A Multi-Modal Architecture With Spatio-Temporal-Text Adaptation for Video-Based Traffic Accident Anticipation," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 8989-9002, Sept. 2025, doi: 10.1109/TCSVT.2025.3552895.
Spatio-Temporal Adaptation with Dilated Neighbourhood Attention for Accident Anticipation
P. Patera, Y.-T. Chen and W.-H. Fang, "Spatio-Temporal Adaptation With Dilated Neighbourhood Attention For Accident Anticipation," 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 2024, pp. 2452-2458, doi: 10.1109/ICIP51287.2024.10647316.
Honors & Awards
- IT SPY 2020 – Recognized as one of the best Master’s theses in Central Europe submitted in 2020.
- M.Sc. Diploma with Honors.
- Recipient of the Taiwan Ministry of Education Scholarship (2020–2024).
- EMI (English as a Medium of Instruction) Teaching Certificate.
