
Md Fokhrul Islam

Machine Learning Engineer
at ACI Limited

Research Assistant
at University of Dhaka

(Looking for a PhD position)

My research interests lie at the intersection of multi-modal learning under limited data and interpretable machine learning, with a particular focus on learning from the natural geometric structure of a problem. I am passionate about building automated systems for applications in healthcare, robotics, and social good, with a strong emphasis on geometric (e.g., graph) representation learning, reinforcement learning, and machine perception.

I completed my studies in Robotics and Mechatronics Engineering at the University of Dhaka, where I currently work as a Research Assistant under the guidance of Dr. Sejuti Rahman. Recently, I joined ACI Limited as a Machine Learning Engineer, where I work on several computer vision projects, namely PrescriptionOCR, Swapno POS/shoplifting theft detection, and BengaliOCR.

Outside of my research pursuits, I find great enjoyment in reading non-fiction books, particularly those related to moral philosophy and psychology. I am also an avid follower of soccer and chess, with Lionel Messi (G.O.A.T.!) and Magnus Carlsen being among my favorite players.

Interests

  • AI in Healthcare, Robotics, and Social Good
  • Machine Vision
  • Geometric Representation Learning
  • Multimodal Commonsense Reasoning

Education

  • Master's in Robotics and Mechatronics Engineering (2022 - 2023)

    University of Dhaka

  • Bachelor's in Robotics and Mechatronics Engineering (2017 - 2021)

    University of Dhaka

Recent News


Publications

Graph Convolutional Networks for Assessment of Physical Rehabilitation Exercises. IEEE Transactions on Neural Systems and Rehabilitation Engineering (TNSRE), 2022. (* Equal contribution)

Paper Code Demo

AI-Driven Stroke Rehabilitation Systems and Assessment: A Systematic Review. IEEE Transactions on Neural Systems and Rehabilitation Engineering (TNSRE), 2022.

Paper Code

Data-augmentation for Bangla-English Code-Mixed Sentiment Analysis: Enhancing Cross Linguistic Contextual Understanding. IEEE Access, 2023.

Paper Code

VAE-GAN3D: Leveraging image-based semantics for 3D zero-shot recognition. Image and Vision Computing, 2024.

Paper





Research

Relation and Knowledge Aware Zero-Shot Learning in 3D Object Recognition (Master’s thesis)

Zero-shot learning (ZSL) has emerged as a promising approach to categorizing unseen objects without labeled instances by leveraging external knowledge sources. However, in the domain of 3D object recognition, these sources often contain irrelevant information that hinders accurate object representation. Moreover, generalized zero-shot learning (GZSL) faces persistent challenges with a bias towards seen classes. To address these issues, we propose two main contributions: First, we introduce a novel multimodal embedding framework that enhances the quality of semantic information used in 3D object recognition. This framework aligns visual data from 2D images of 3D objects with graph-based semantic information from knowledge graphs and jointly learns embeddings from different modalities via an attention-based fusion module. These improved alignments and refined knowledge enable us to generate more accurate and discriminative attribute descriptions for each 3D object class. Second, to overcome the bias towards seen classes in GZSL, we propose a semi-supervised contrastive module combined with a generative adversarial network (GAN). This module integrates instance-level contrastive embedding and supervision, enhancing feature-level representations of unseen 3D object classes. Our approach improves generalization in GZSL scenarios, addressing the long-standing challenge of bias towards seen classes. We conduct extensive experiments on four state-of-the-art 3D object recognition datasets and evaluate performance in both inductive ZSL and GZSL tasks. The experimental results demonstrate the effectiveness of our method, achieving a 3% increase in the harmonic mean metric on the ModelNet40 dataset compared to the strongest baseline, 3DGenZ, and significant improvements over other state-of-the-art models.

3D Zero-shot Learning, Graph Convolutional Networks, Contrastive Learning, Knowledge Graph
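As an illustrative sketch only (not the thesis implementation), the attention-based fusion module described above can be thought of as a softmax-weighted combination of the visual and graph-based semantic embeddings; the function name `attention_fuse` and the scoring vector `W` are hypothetical stand-ins for the learned components:

```python
import numpy as np

def attention_fuse(visual, semantic, W):
    """Fuse a visual embedding (from 2D renders of a 3D object) with a
    graph-based semantic embedding. W scores each modality; a softmax
    over the scores gives the fusion weights (a toy attention mechanism)."""
    stack = np.stack([visual, semantic])   # (2, d): one row per modality
    scores = stack @ W                     # (2,): one score per modality
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax over the two modalities
    return weights @ stack                 # convex combination -> (d,)

rng = np.random.default_rng(0)
d = 8
visual = rng.normal(size=d)       # stand-in for a 2D-image embedding
semantic = rng.normal(size=d)     # stand-in for a knowledge-graph embedding
W = rng.normal(size=d)            # stand-in for learned attention weights
fused = attention_fuse(visual, semantic, W)
print(fused.shape)                # (8,)
```

Because the softmax weights are non-negative and sum to one, the fused embedding stays inside the span of the two modality embeddings, which is the intuition behind weighting rather than concatenating.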

IHABOT: Intelligent Hospital Assistance Robot to Fight Contagion by Reducing Doctor-Patient Interaction

During the COVID-19 pandemic, we realized the importance and necessity of automation in hospitals and healthcare facilities. Robotics has already established itself as a necessity in the medical field, ranging from automatic diagnostic systems to assisting nurses in healthcare facilities, thus reducing the tedious strain on physicians and nurses and increasing diagnostic accuracy. Our project utilizes state-of-the-art advancements in vision-based action recognition, human-robot interaction, artificial intelligence, and deep learning to build an autonomous, feature-rich hospital and clinic aid robot. The proposed Intelligent Hospital Assistance Robot (IHABOT) is equipped with a number of autonomous features. First, it can map its surroundings, determine the best route to its destination, and hence navigate by itself in a real-world environment. Second, it has a variety of sensors to collect physiological data from the patient, including temperature, systolic and diastolic blood pressure, oxygen saturation, and pulse rate. IHABOT uses artificial intelligence (AI) to automatically evaluate these physiological measures and detect deteriorating patients early, allowing for prompt treatment and a reduction in significant adverse events. Third, in the absence of a doctor, this medical robot can keep track of and assess patients’ performance on exercises throughout post-stroke therapy. The robot delivers a performance score that aids both patient self-evaluation and medical professionals’ evaluation of patient progress and prescription of required actions. Last but not least, IHABOT diagnoses COVID-19 from radiography images and CT scans using a novel few-shot learning-based method. Conventional diagnostic techniques with high accuracy often have the setback of being expensive and sophisticated, requiring skilled individuals for specimen collection and screening, resulting in lower outreach. Therefore, medical robots like IHABOT, which do not require direct human intervention, can be used in hospitals for automated diagnosis and to lessen the likelihood of infection spreading through reduced human-to-human contact.

Robotics, Healthcare, Deep Learning, Computer Vision

Artificial Intelligence in Business Decision Making: A Study on Code-Mixed and Transliterated Bangla Customer Reviews

In today’s digital world, automated sentiment analysis from online reviews can contribute to a wide variety of decision-making processes. One example is examining typical perceptions of a product based on customer feedback to better understand consumer expectations, which can help enhance everything from customer service to product offerings. Online review comments, on the other hand, frequently mix different languages, use non-native scripts, and do not adhere to strict grammar norms. For a low-resource language like Bangla, the lack of annotated code-mixed data makes automated sentiment analysis more challenging. To address this, we collect online reviews of different products and construct an annotated Bangla-English code-mixed (BE-CM) dataset. On our sentiment corpus, we also compare several alternative models from the existing literature. We present a simple but effective data augmentation method that can be utilized with existing word embedding algorithms, without the need for a parallel corpus, to improve cross-lingual contextual understanding. Our experimental results suggest that training word embedding models (e.g., Word2vec, FastText) with our data augmentation strategy can help the model capture the cross-lingual relationships in code-mixed sentences, thereby improving the overall performance of existing classifiers in both supervised learning and zero-shot cross-lingual adaptability. With extensive experimentation, we found that XGBoost with FastText embeddings trained on our proposed data augmentation method outperforms alternative models in automated sentiment analysis on the code-mixed Bangla-English dataset, with a weighted F1 score of 87%. This project is a collaboration with the Centre for Advanced Research in Strategic Human Resource Management (CARSHRM), University of Dhaka.

Code-Mixed Sentiment Analysis, LLMs, Text Augmentation
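To illustrate the general idea of lexicon-driven augmentation for code-mixed text (the paper's actual method may differ in detail), here is a minimal sketch: swapping words for cross-lingual counterparts so that an embedding model sees both forms in shared contexts, with no parallel corpus required. The mini-lexicon and the `augment` function are purely hypothetical:

```python
import random

# Hypothetical mini-lexicon mapping transliterated Bangla words to English
# counterparts; a real lexicon would be far larger.
LEXICON = {"bhalo": "good", "kharap": "bad", "dam": "price"}

def augment(sentence, p=0.5, seed=None):
    """Create a code-mixed variant by probabilistically swapping words for
    their cross-lingual counterparts. Training embeddings on both the
    original and augmented sentences lets co-occurrence statistics tie the
    two languages together."""
    rng = random.Random(seed)
    out = []
    for tok in sentence.split():
        if tok in LEXICON and rng.random() < p:
            out.append(LEXICON[tok])   # swap to the other language
        else:
            out.append(tok)            # keep the original token
    return " ".join(out)

print(augment("product bhalo but dam beshi", p=1.0))
# -> "product good but price beshi"
```

With `p=1.0` every known word is swapped; intermediate values of `p` yield a mixture of original and swapped sentences for the embedding corpus.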

An Intelligent Agent for Evaluating and Guiding Post-Stroke Rehabilitation Exercises (Undergraduate thesis)

Health professionals often prescribe specific exercises for the rehabilitation of several conditions (e.g., stroke, Parkinson’s disease, back pain). When patients perform those exercises in the absence of an expert (e.g., a physician or therapist), they cannot assess the correctness of their performance. Automatic assessment of physical rehabilitation exercises aims to assign a quality score given an RGB-D or RGB video of the body movement as input. Recent deep learning approaches address this problem by extracting CNN features from coordinate grids of skeleton data (body joints) obtained from videos. However, they could not extract rich spatio-temporal features from variable-length inputs. To address this issue, we investigate Graph Convolutional Networks (GCNs) for this task. We adapt a spatio-temporal GCN to predict continuous scores (assessments) instead of discrete class labels. Our model can process variable-length inputs, so users can perform any number of repetitions of the prescribed exercise. Moreover, our novel design also provides self-attention over body joints, indicating their role in predicting assessment scores. It guides the user toward a better score in future trials by matching the attention weights of expert users. Our model successfully outperforms existing exercise assessment methods on the KIMORE and UI-PRMD datasets.

Graph Neural Networks, Self-Attention, Video Analysis, Rehabilitation Engineering
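As a toy illustration of the core idea (the thesis model is a full spatio-temporal GCN; this NumPy sketch with invented names covers only one spatial graph convolution plus pooling to a continuous score), note how averaging over the time axis lets the same parameters handle any number of frames:

```python
import numpy as np

def gcn_layer(X, A, W):
    """One spatial graph convolution over body joints: A is the normalized
    skeleton adjacency (J x J), X holds joint features over T frames (T, J, C)."""
    return np.maximum(0.0, A @ X @ W)             # ReLU(A X W), per frame

def assess(X, A, W, w_out):
    """Map a variable-length skeleton sequence to a quality score in (0, 1)."""
    H = gcn_layer(X, A, W)                        # (T, J, H)
    pooled = H.mean(axis=(0, 1))                  # pool over frames and joints
    return 1.0 / (1.0 + np.exp(-pooled @ w_out))  # sigmoid -> continuous score

# toy skeleton: 5 joints in a chain, random features, T = 10 frames
rng = np.random.default_rng(0)
A = np.eye(5)
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
A /= A.sum(axis=1, keepdims=True)                 # row-normalize adjacency
W, w_out = rng.normal(size=(3, 4)), rng.normal(size=4)
score = assess(rng.normal(size=(10, 5, 3)), A, W, w_out)
print(0.0 < score < 1.0)                          # True
```

The mean-pooling over the time axis is what makes the score well-defined for any number of exercise repetitions.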

Learning to Trade with Deep Q Learning

This project focuses on the development and evaluation of an Artificial Intelligence (AI) agent for optimized stock trading. Utilizing the Deep Deterministic Policy Gradient (DDPG) algorithm and other relevant techniques, the agent's primary objective is to formulate an optimal policy that maximizes profits from its actions and corresponding positions in the stock market. To validate the agent's effectiveness and versatility, comprehensive testing has been conducted using datasets from two distinct stock markets: the S&P 500 and the Dhaka Stock Exchange (DSE). The results of this research promise to offer valuable insights into AI-driven stock trading strategies applicable across diverse financial markets. This project is a collaboration with the Centre for Advanced Research in Strategic Human Resource Management (CARSHRM), University of Dhaka.

Deep Reinforcement Learning, Stock Trading, Finance, DDPG, DQN
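For intuition only (the project itself uses DDPG for continuous positions; this sketch shows the simpler discrete Q-learning update that DQN builds on, with made-up states and rewards), a single temporal-difference step for a buy/hold/sell agent looks like:

```python
import numpy as np

ACTIONS = ["sell", "hold", "buy"]

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning step: nudge Q[s, a] toward the bootstrapped target
    r + gamma * max_a' Q[s', a']."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# toy setup: 3 discretized price states x 3 actions, all values start at 0
Q = np.zeros((3, 3))
# hypothetical transition: bought (action 2) in state 0, price rose, reward 1
Q = q_update(Q, s=0, a=2, r=1.0, s_next=1)
print(Q[0, 2])  # -> 0.1
```

DDPG replaces the discrete `max` over actions with an actor network that outputs a continuous position size, but the bootstrapped-target structure of the update is the same.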




Projects


Swarm Robot Aggregation (Triangular Pattern Formation) using Particle Swarm Aggregation

Class Project

Jan. 2021

Six swarm robots operate in two-dimensional space. We drive them, following swarm aggregation equations, into a specific triangular pattern.
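A minimal sketch of the aggregation idea, assuming the simplest possible dynamics (each robot descends a quadratic potential toward its assigned slot on the triangle; the actual class project's equations may be more elaborate, and the target layout here is illustrative):

```python
import numpy as np

def aggregate(positions, targets, k=0.2, steps=200):
    """Discrete-time aggregation: each robot moves a fraction k of the way
    toward its target each step, x <- x + k * (target - x), so the
    formation error shrinks geometrically."""
    for _ in range(steps):
        positions = positions + k * (targets - positions)
    return positions

# six slots on a triangle: three vertices and three edge midpoints
targets = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.7],
                    [1.0, 0.0], [0.5, 0.85], [1.5, 0.85]])
rng = np.random.default_rng(1)
positions = rng.uniform(-3, 3, size=(6, 2))   # random initial positions
final = aggregate(positions, targets)
print(np.allclose(final, targets, atol=1e-3))  # -> True
```

With gain `k = 0.2`, the residual error decays by a factor of 0.8 per step, so 200 steps are far more than enough for millimetre-scale convergence on this toy scale.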


Camera Calibration using AprilTag

Class Project

Jan. 2021

The work is based on the paper "A flexible new technique for camera calibration" by Zhengyou Zhang, which we implemented in MATLAB R2021b. It estimates the parameters of a camera's lens and image sensor, using an AprilTag placed in the real world, to determine a 3D point's projection onto the image plane.
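The project itself was done in MATLAB; as a language-neutral illustration, here is a NumPy sketch of the pinhole projection x ~ K[R|t]X whose parameters (intrinsics K, extrinsics R and t) the calibration estimates. The numeric values of K, R, and t below are made up for the example:

```python
import numpy as np

def project(K, R, t, X):
    """Project a 3D world point X onto the image plane using the pinhole
    model x ~ K [R | t] X; dividing by the third homogeneous coordinate
    gives pixel coordinates."""
    Xc = R @ X + t          # world -> camera coordinates
    u, v, w = K @ Xc        # homogeneous image coordinates
    return np.array([u / w, v / w])

# assumed intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 4.0])   # camera 4 m from the tag plane
print(project(K, R, t, np.array([0.5, -0.25, 0.0])))  # -> [420. 190.]
```

Calibration runs this model in reverse: given known AprilTag corner positions and their detected pixel locations, it solves for the K, R, and t that best explain the observed projections.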


Self Solving Eight Puzzle using A* Algorithm

Class Project

June 2020

The Self Solving Eight Puzzle is an artificial-intelligence-based eight-puzzle solver program using the A* search algorithm with Manhattan distance and misplaced tiles as heuristics. It takes the current state of the puzzle and generates the solution, showing intermediate steps. A model of the eight-puzzle board was created using PVC, and the tiles were moved according to the solution steps with the help of a CNC machine and an electromagnet.
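A compact sketch of the A* core with the Manhattan-distance heuristic (the project's actual solver also tracks the intermediate board states for the CNC rig; this version returns only the optimal move count):

```python
import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 is the blank tile

def manhattan(state):
    """Sum of each tile's grid distance from its goal position."""
    return sum(abs(i // 3 - (t - 1) // 3) + abs(i % 3 - (t - 1) % 3)
               for i, t in enumerate(state) if t)

def neighbors(state):
    """All states reachable by sliding a tile into the blank."""
    b = state.index(0)
    r, c = divmod(b, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            s = list(state)
            j = nr * 3 + nc
            s[b], s[j] = s[j], s[b]
            yield tuple(s)

def solve(start):
    """A* search: expand by f = g + h, return the minimal number of moves."""
    frontier = [(manhattan(start), 0, start)]
    best_g = {start: 0}
    while frontier:
        f, g, state = heapq.heappop(frontier)
        if state == GOAL:
            return g
        for nxt in neighbors(state):
            if nxt not in best_g or g + 1 < best_g[nxt]:
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + manhattan(nxt), g + 1, nxt))
    return -1  # unsolvable configuration

print(solve((1, 2, 3, 4, 5, 6, 7, 0, 8)))  # -> 1
```

Because Manhattan distance never overestimates the remaining moves, A* with this heuristic is admissible and the first goal pop is guaranteed optimal.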


Classifying ASD/TD on Eye-Tracking Data Using Saliency Maps and Deep Learning

Software Development Project (Collaboration with IPNA)

Feb 2019 – Dec 2020

We design a visual-saliency-based deep learning model for automatic and quantitative ASD/TD classification. Instead of directly extracting features from the fixation data, our method employs several well-known computational models of visual attention to predict eye-fixation locations as saliency maps, which yields better results. Because of limited data availability, we also incorporate a few-shot-based method for this task.





Scholarships & Awards

  • Top 7 in Robi Datathon 3.0 (the biggest AI/ML competition in Bangladesh), 2024
  • National Science & Technology (NST) Fellowship for Excellent Master's Thesis, 2022-2023
  • IFIC Bank Trust Fund Research Grants (highest amount, awarded three consecutive times: 2021, 2022, 2023)
  • Winner in the Research Project Category, Seminar on "Robotics in Bangladesh: Academia and Industry Initiatives", 2022
  • 1st Runner-up Research Project Presentation in Dhaka University Research and Publication Fair, 2022




Contact