ABSTRACT
This systematic literature review examines the integration of Wasserstein Generative Adversarial Networks (WGANs) and Convolutional Neural Networks (CNNs) for Urdu Handwritten Character Recognition (UHCR), a domain underrepresented in mainstream OCR research. Using a PRISMA-guided methodology, 25 peer-reviewed studies published between 2019 and 2024 were synthesized from seven major academic databases. The review evaluates how WGANs enhance data augmentation and reduce annotation effort, while CNNs contribute to robust classification in low-resource linguistic contexts. Six research questions guided the analysis, focusing on model effectiveness, dataset limitations, transfer learning, and evaluation metrics. Findings indicate that hybrid WGAN-CNN architectures significantly improve recognition accuracy, particularly for cursive ligatures and underrepresented Urdu characters, while mitigating the challenges of manual annotation and dataset imbalance. Transfer learning from Arabic and Persian datasets further strengthens model robustness. Keyword co-occurrence analysis using VOSviewer revealed two dominant thematic clusters: one centered on biometric technologies and feature extraction, and another on machine learning and character recognition. These insights position UHCR within a broader computational and biometric research ecosystem. The review concludes by outlining future directions, including cross-lingual OCR systems, real-time handwriting recognition applications, and synthetic dataset generators for low-resource scripts.
INTRODUCTION
Handwritten Character Recognition (HCR) is a pivotal area of research in computer vision and natural language processing, enabling the digitization of handwritten documents and facilitating efficient information retrieval. This technology plays a crucial role in preserving cultural heritage, automating administrative workflows, and enhancing accessibility in multilingual societies. While significant progress has been made in recognizing Latin-based scripts, languages with complex orthographies, such as Urdu, remain underrepresented in mainstream OCR systems.
Urdu, written in the Nastaliq script, presents unique challenges for character recognition due to its cursive nature, contextual ligatures, overlapping strokes, and extensive use of diacritical marks. These features make segmentation and feature extraction particularly difficult, especially when dealing with low-quality scans or diverse handwriting styles. Traditional OCR systems, which rely heavily on large, labeled datasets and rule-based algorithms, often fail to deliver robust performance for Urdu handwritten text.
Recent advancements in deep learning have opened new avenues for tackling these challenges. Convolutional Neural Networks (CNNs) have demonstrated exceptional capabilities in extracting hierarchical features from image data, making them well-suited for character recognition tasks. However, CNNs require substantial amounts of labeled data to achieve high accuracy, which is a significant limitation in the context of Urdu handwriting due to the scarcity of annotated datasets. To address this issue, Generative Adversarial Networks (GANs), particularly Wasserstein GANs (WGANs), have emerged as powerful tools for data augmentation. WGANs can generate realistic synthetic samples that mimic the variability and complexity of handwritten Urdu characters, thereby enriching training datasets and improving model generalization. The integration of CNNs and WGANs offers a promising hybrid architecture that leverages the strengths of both models: CNNs for classification and WGANs for sample generation.
This Systematic Literature Review (SLR) aims to explore the effectiveness of CNN-WGAN architectures in Urdu Handwritten Character Recognition (UHCR). It synthesizes findings from 25 peer-reviewed studies published between 2019 and 2024, selected using PRISMA guidelines across seven major academic databases. The review is guided by six research questions that examine model performance, dataset limitations, transfer learning strategies, and evaluation metrics. By analyzing current methodologies and identifying gaps in existing research, this study contributes to the development of more accurate, scalable, and resource-efficient UHCR systems. It also highlights the potential of cross-lingual transfer learning from Arabic and Persian datasets, given their structural similarities with Urdu. Furthermore, keyword co-occurrence analysis using VOSviewer reveals thematic clusters that position UHCR within broader domains of biometric technologies and machine learning.
Ultimately, this review provides a comprehensive foundation for future research in multilingual OCR systems, real-time handwriting recognition, and synthetic dataset generation for low-resource scripts.
Contribution of This Review
This systematic literature review builds upon our previously published study on CNN-WGAN integration for Urdu Handwritten Character Recognition (Faiq and Noor, 2025). While the original article introduced the hybrid architecture and proposed six guiding research questions, the current review expands the scope by applying those questions across a broader corpus of 25 peer-reviewed studies. Using PRISMA methodology, structured quality assessment, and bibliometric mapping, this review synthesizes trends in model performance, dataset limitations, and thematic evolution. It contributes formal study mapping, keyword co-occurrence analysis, and strategic recommendations for future research, elements not included in the original work. This positions the review as a comprehensive resource for advancing UHCR in low-resource linguistic contexts.
The remainder of this paper is organized as follows: The Theoretical Framework section outlines the foundations of UHCR, focusing on CNN and WGAN architectures. The Materials and Methods section presents the PRISMA-based review process, including the inclusion criteria and quality assessment. The Results section details the study mapping, keyword co-occurrence analysis, and performance trends. Finally, the Contributions and Future Directions section reviews our contributions and outlines pathways for advancing UHCR, emphasizing benchmark development, transfer learning, and multilingual OCR integration.
Theoretical Framework
Urdu Handwritten Character Recognition (UHCR) presents a unique challenge due to the script’s cursive flow, contextual character forms, and frequent use of ligatures. Traditional OCR systems often struggle with these complexities, particularly in low-resource environments where annotated datasets are scarce. To address these limitations, this review focuses on two foundational deep learning architectures: Convolutional Neural Networks (CNNs) and Wasserstein Generative Adversarial Networks (WGANs).
These models were selected based on their prominence across the reviewed studies (see Table 3) and their relevance to the research questions outlined in the Research Questions section, specifically RQ2 (effectiveness of deep learning techniques), RQ4 (role of feature extraction), and RQ5 (limitations of current systems).
| RQ# | Research Question |
|---|---|
| RQ1 | What are the primary challenges in Urdu handwritten character recognition? |
| RQ2 | Which deep learning techniques are most effective for UHCR? |
| RQ3 | How does transfer learning improve recognition in low-resource settings? |
| RQ4 | What role does feature extraction play in model performance? |
| RQ5 | What limitations exist in current datasets and models? |
| RQ6 | How do hybrid models (e.g., CNN-WGAN) enhance recognition accuracy? |
Convolutional Neural Networks (CNNs) in Handwritten Character Recognition
CNNs are widely used in computer vision due to their ability to extract hierarchical features from pixel-level data. In the context of UHCR, CNNs are particularly effective at identifying spatial patterns, stroke directions, and character boundaries. Even in noisy or distorted samples, CNNs demonstrate robust classification performance. Their layered architecture enables progressive abstraction, allowing accurate recognition of complex Urdu characters across diverse handwriting styles. This aligns with findings from studies such as (Hamid et al., 2019; Zia et al., 2020; Kumar and Gupta, 2021), which reported high accuracy using CNN-based models (see Table 3).
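As a rough illustration of how a single convolutional filter responds to stroke direction, the following pure-Python sketch slides a 3×3 vertical-edge kernel over two tiny binary images. The images, the kernel values, and the function name `conv2d_valid` are illustrative assumptions and are not taken from any reviewed study; a real CNN learns its kernels from data and stacks many such layers.

```python
def conv2d_valid(image, kernel):
    """Cross-correlate a 2D list `image` with a 2D list `kernel` (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 5x5 image containing a vertical stroke in the middle column.
vertical_stroke = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hypothetical 5x5 image containing a horizontal stroke in the middle row.
horizontal_stroke = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]

# Hand-written vertical-edge kernel (a learned kernel would replace this).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

vertical_response = conv2d_valid(vertical_stroke, vertical_edge)
horizontal_response = conv2d_valid(horizontal_stroke, vertical_edge)

print(vertical_response)    # strong positive/negative values at the stroke edges
print(horizontal_response)  # all zeros: this kernel ignores horizontal strokes
```

The same kernel reacts strongly to the vertical stroke and not at all to the horizontal one, which is the sense in which convolutional features encode stroke direction.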
Generative Adversarial Networks (GANs) and Wasserstein GANs
GANs are designed to generate synthetic data that closely resembles real samples. However, traditional GANs often suffer from instability and mode collapse during training. The Wasserstein GAN (WGAN), introduced by Arjovsky et al., addresses these issues by replacing the Jensen-Shannon divergence with the Wasserstein distance, a more stable and meaningful metric for comparing distributions. This improvement enables WGANs to produce high-quality, diverse synthetic handwriting samples, which are crucial for augmenting Urdu datasets, reducing annotation effort, and improving model generalization. Studies such as (Xiang et al., 2021; Memon et al., 2018) demonstrate the effectiveness of WGANs in generating realistic Urdu handwriting, directly addressing RQ1 (challenges in recognition) and RQ5 (limitations of current systems).
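For reference, the two quantities contrasted above can be written in the standard notation of the GAN literature. Here $P_r$ denotes the real-data distribution, $P_g$ the generator distribution, and $f$ a 1-Lipschitz critic; these symbols are notational assumptions introduced for this sketch, and the reviewed studies may use different conventions.

```latex
\begin{align}
\mathrm{JS}(P_r \,\|\, P_g)
  &= \tfrac{1}{2}\,\mathrm{KL}\!\left(P_r \,\middle\|\, M\right)
   + \tfrac{1}{2}\,\mathrm{KL}\!\left(P_g \,\middle\|\, M\right),
  \qquad M = \tfrac{1}{2}\,(P_r + P_g), \\
W(P_r, P_g)
  &= \sup_{\|f\|_{L} \le 1}\;
     \mathbb{E}_{x \sim P_r}\!\left[f(x)\right]
   - \mathbb{E}_{x \sim P_g}\!\left[f(x)\right].
\end{align}
```

When the two distributions barely overlap, the Jensen-Shannon divergence saturates at $\log 2$ and gives the generator almost no gradient, whereas the Wasserstein distance still varies with how far apart the distributions are.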
Justification for CNN-WGAN Integration in Urdu HCR
The integration of CNNs and WGANs offers a synergistic solution to the dual challenges of feature extraction and data scarcity. CNNs handle the recognition task with precision, while WGANs enrich the training dataset by generating realistic variations of underrepresented characters. This hybrid approach improves model generalization and enhances performance on rare ligatures and complex character forms. In low-resource linguistic contexts like Urdu, such integration is not merely beneficial; it is transformative. This justification is supported by high-performing studies such as (Rashid et al., 2023; Javed et al., 2019), which combine CNNs with advanced augmentation or transfer learning techniques (see Table 3). These studies directly contribute to answering RQ2, RQ3, and RQ6.
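One way to picture the augmentation step is as a per-class quota: how many synthetic samples a generator would need to supply so that underrepresented characters match the most frequent one. The sketch below computes only that quota. The function name `synthetic_quota` and the class counts are hypothetical, and the balancing rule is an assumption made for illustration rather than a procedure reported by the reviewed studies; in a real pipeline a trained WGAN generator would produce the samples.

```python
from collections import Counter


def synthetic_quota(labels, target=None):
    """Return how many synthetic samples each class needs to reach `target`.

    `target` defaults to the size of the largest class, so the result
    describes the augmentation needed for a balanced training set.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    return {cls: max(0, target - n) for cls, n in counts.items()}


# Hypothetical label list: the counts are invented for illustration only.
labels = ["ا"] * 50 + ["ب"] * 20 + ["ژ"] * 5

quota = synthetic_quota(labels)
for character, needed in quota.items():
    print(character, needed)
```

Rare classes receive the largest quotas, which mirrors the point that synthetic generation matters most for underrepresented characters and ligatures.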
Research questions
This systematic literature review builds upon the research framework established in our prior study, Integration of Wasserstein GANs and Convolutional Neural Networks for Urdu Handwritten Character Recognition (Faiq and Noor, 2025). That work introduced six guiding research questions to explore the challenges, techniques, and innovations in Urdu Handwritten Character Recognition (UHCR). In this extended review, we reaffirm those questions and apply them across a broader corpus of literature using PRISMA methodology, quality assessment, and bibliometric mapping.
These questions serve as the analytical backbone of the review. They guide study selection and mapping (see Study Selection and Mapping section), structure the quality assessment framework (see Quality Assessment section), organize the discussion of findings (see Discussion of Findings section), and identify gaps and future directions (see Future Directions section). By reaffirming and expanding the scope of these questions, this review aims to synthesize current trends, evaluate methodological rigor, and propose strategic pathways for advancing UHCR research.
MATERIALS AND METHODS
This systematic literature review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines to ensure transparency, reproducibility, and methodological rigor. The review focused on deep learning approaches to Urdu Handwritten Character Recognition (UHCR), with particular emphasis on CNNs, WGANs, and hybrid architectures.
Review Procedures
The review process involved four key stages: identification, screening, eligibility, and inclusion. Studies were selected based on relevance to the research questions outlined in the Research Questions section and evaluated using a structured quality assessment framework.
The PRISMA diagram in Figure 1 summarizes the study selection process, showing the number of records identified, screened, excluded, and finally included. Records identified: 570 (IEEE Xplore, Elsevier, Springer, RDPJ, ResearchGate, Wiley, ACM); screened after duplicates removed: 560; title screening: 210 retained; full-text screening: 54 eligible; final studies included: 25.

Figure 1: PRISMA Flow for UHCR Study Selection.
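The counts reported for the selection process can be checked with a few lines of arithmetic. The sketch below uses only the stage totals stated in the text (570, 560, 210, 54, 25) and derives how many records were removed at each transition; the stage labels and the helper name `exclusions` are assumptions introduced for this example.

```python
# Stage totals as reported in the review procedure.
stages = [
    ("identified", 570),
    ("after duplicate removal", 560),
    ("retained after title screening", 210),
    ("eligible after full-text screening", 54),
    ("included", 25),
]


def exclusions(stage_list):
    """Return (stage_name, removed_count) for each transition between stages."""
    return [
        (name, previous - current)
        for (_, previous), (name, current) in zip(stage_list, stage_list[1:])
    ]


removed = exclusions(stages)
for name, count in removed:
    print(f"{name}: {count} records removed")

# The removals must account for the gap between identified and included.
total_removed = sum(count for _, count in removed)
print(total_removed)
```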
Search Strategy
A comprehensive search was conducted across seven academic databases: IEEE Xplore, Elsevier, Springer, RDPJ, ResearchGate, Wiley, and ACM. The following keywords were applied:
- (“Urdu handwriting recognition” OR “low-resource OCR”) AND (“CNN” OR “WGAN” OR “synthetic data” OR “transfer learning”).
The search was limited to publications between January 2019 and August 2024, written in English, and available in full-text format. Boolean operators and advanced filters were applied to refine results and ensure precision.
Inclusion and Exclusion Criteria
To ensure relevance, quality, and consistency across the selected studies, a set of predefined inclusion and exclusion criteria was applied during the screening and eligibility phases of the review.
Inclusion Criteria
Studies were included if they
- Focused explicitly on Urdu Handwritten Character Recognition (UHCR) using deep learning techniques,
- Employed models such as CNNs, GANs, WGANs, RNNs, or Transformer-based architectures,
- Were published between January 2019 and August 2024,
- Were written in English and available in full-text format,
- Appeared in peer-reviewed journals or reputable conference proceedings,
- Provided quantitative performance metrics (e.g., accuracy, F1-score, SSIM).
Exclusion Criteria
Studies were excluded if they
- Focused on printed or typed Urdu text rather than handwritten samples,
- Did not involve Urdu script or used datasets unrelated to Urdu handwriting,
- Were non-primary research (e.g., editorials, abstracts, reviews),
- Were published before 2019 or lacked sufficient methodological detail,
- Did not report model performance or lacked reproducible results,
- Were written in languages other than English.
These criteria were applied consistently across all records during title screening, abstract review, and full-text evaluation. The final selection of 25 studies reflects adherence to these standards and alignment with the research questions outlined in the Research Questions section.
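The inclusion and exclusion criteria above can be read as a single predicate applied to each candidate record. The following is a minimal sketch under stated assumptions: the `Record` fields, the example records, and the year-only date check (which simplifies the January 2019 to August 2024 window) are all hypothetical and do not describe the tooling actually used in this review.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical screening record; every field name is an assumption."""
    title: str
    year: int
    language: str
    handwritten: bool        # handwritten rather than printed or typed text
    urdu_script: bool        # uses Urdu script or an Urdu handwriting dataset
    deep_learning: bool      # CNN, GAN, WGAN, RNN, or Transformer-based model
    peer_reviewed: bool      # journal or reputable conference proceedings
    primary_research: bool   # not an editorial, abstract, or review
    reports_metrics: bool    # quantitative metrics such as accuracy or F1


def is_eligible(record: Record) -> bool:
    """Apply the inclusion criteria; failing any one of them excludes the record."""
    return (
        2019 <= record.year <= 2024
        and record.language == "English"
        and record.handwritten
        and record.urdu_script
        and record.deep_learning
        and record.peer_reviewed
        and record.primary_research
        and record.reports_metrics
    )


# Two invented records used only to exercise the predicate.
candidate_a = Record("Example handwritten Urdu CNN study", 2021, "English",
                     True, True, True, True, True, True)
candidate_b = Record("Example printed Urdu OCR study", 2017, "English",
                     False, True, True, True, True, True)

print(is_eligible(candidate_a))
print(is_eligible(candidate_b))
```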
Study Selection and Screening
Initial screening was performed based on titles and abstracts. Full-text screening followed, with each study evaluated for relevance to the six research questions. Duplicates were removed manually and via reference management software.
Quality Assessment Criteria
Each of the 25 selected studies was assessed using a 10-point quality checklist. The criteria (QC1-QC10) are outlined in Table 2, which defines the framework used to evaluate methodological rigor, reproducibility, and relevance to UHCR, while the scoring results are presented in Table 4 (Quality Assessment Results).
| QC Code | Criterion |
|---|---|
| QC1 | Recognizes Urdu characters |
| QC2 | Diverse handwriting styles |
| QC3 | Open dataset access |
| QC4 | Handles noise/distortion |
| QC5 | Generalizes to unseen data |
| QC6 | Applies to full-text levels |
| QC7 | Compares to current methods |
| QC8 | Explains outcomes clearly |
| QC9 | Useful for UHCR field |
| QC10 | Source code availability |
Reference Strategy
To ensure rigor and transparency, references in this review were managed using a three-tiered strategy. First, the core corpus consisted of 25 peer-reviewed studies on Urdu Handwritten Character Recognition (UHCR), identified and screened through the PRISMA process. These studies form the evidence base of the systematic review and are consistently cited in the study mapping, quality assessment, results, and discussion sections. Second, a set of supporting references was included to provide theoretical grounding and methodological context. These encompass foundational works on Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Wasserstein GANs (WGANs), surveys on handwriting recognition and OCR, and methodological standards such as the PRISMA statement. Finally, peripheral references from related domains (e.g., medical imaging, cross-lingual OCR) were selectively cited in the discussion and future work sections to highlight the broader applicability of generative and discriminative models. This structured approach ensured that the review remained focused on UHCR while situating its findings within the wider landscape of deep learning and OCR research.
Study mapping and evaluation
Study Mapping Table
The included studies, their models, datasets, addressed research questions, accuracy, and contributions are summarized in Table 3 (Study Mapping Overview). This mapping highlights the diversity of approaches, ranging from CNN baselines to hybrid CNN-WGAN architectures, and situates each study within the guiding research questions.
| Study | Model Used | Dataset | RQs Addressed | Accuracy | Contribution |
|---|---|---|---|---|---|
| Nabi et al. (2021) | VGG-16 (CNN) | Custom Urdu | RQ2, RQ6 | 96% | Gender classification via handwritten Urdu characters |
| Zia et al. (2020) | Convolutional Recursive Deep Architecture | Unconstrained Urdu | RQ2, RQ4 | 94% | Spatial feature extraction using pixel coordinates |
| Hamid et al. (2019) | CNN | Dataset from 100 writers | RQ2, RQ3 | 91% | Baseline CNN performance for Urdu HCR |
| Ganai and Khursheed (2020) | CNN + LSTM | Custom Urdu Text | RQ2, RQ6 | 95% | Context-aware recognition of full Urdu texts |
| Rashid et al. (2023) | BERT + Vision Transformer | UrduDeepNet | RQ2, RQ5 | 97% | Comparative analysis of ML techniques |
| Ahmed et al. (2019) | CNN | Pioneer Dataset | RQ2, RQ3 | 83% | Early CNN-based Urdu character recognition |
| Javed et al. (2019) | CNN + Transfer Learning | Nastaliq Script | RQ3, RQ6 | 94% | Transfer learning from Arabic to Urdu |
| Kumar and Gupta (2021) | CNN + Data Augmentation | Mixed Urdu datasets | RQ2, RQ5 | 96% | Robustness improvement via augmentation |
| Xiang et al. (2021) | WGAN + Fine-Grained Attributes | Synthetic Urdu | RQ1, RQ5 | 98% | Auto-annotation and style diversity |
| Memon et al. (2018) | Transfer Learning | Urdu Text Generator | RQ1, RQ4 | 92% | Synthetic Urdu generation using GANs |

| Study | QC1 | QC2 | QC3 | QC4 | QC5 | QC6 | QC7 | QC8 | QC9 | QC10 | Score | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nabi et al. (2021) | ✔ | ✔ | ❌ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ❌ | 8/10 | 80% |
| Zia et al. (2020) | ✔ | ✔ | ❌ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ❌ | 8/10 | 80% |
| Hamid et al. (2019) | ✔ | ✔ | ❌ | ❌ | ❌ | ✔ | ✔ | ✔ | ✔ | ❌ | 6/10 | 60% |
| Ganai and Khursheed (2020) | ✔ | ✔ | ❌ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ❌ | 8/10 | 80% |
| Rashid et al. (2023) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | 10/10 | 100% |
| Ahmed et al. (2019) | ✔ | ❌ | ❌ | ❌ | ❌ | ✔ | ✔ | ✔ | ✔ | ❌ | 5/10 | 50% |
| Javed et al. (2019) | ✔ | ✔ | ❌ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | 9/10 | 90% |
| Kumar and Gupta (2021) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ❌ | 9/10 | 90% |
| Xiang et al. (2021) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | 10/10 | 100% |
| Memon et al. (2018) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ❌ | 9/10 | 90% |
Evaluation Metrics Used
To assess model performance across studies, the following metrics were commonly applied: accuracy, F1-score, precision/recall, Structural Similarity Index (SSIM), and confusion matrix analysis. Together, these metrics provide a balanced view of recognition quality, generalization, and robustness across diverse handwriting styles and datasets.
Quality Assessment Table
Each study was evaluated against ten quality criteria (QC1-QC10), including dataset diversity, model transparency, generalizability, and code availability. The detailed criteria are outlined in Table 2, and the scoring results are presented in Table 4 (Quality Assessment Results). The scoring system uses ✔ for compliance and ❌ for absence, with a total score out of 10.
Quality Assessment Summary
High-scoring studies such as (Rashid et al., 2023; Xiang et al., 2021) achieved perfect scores, reflecting strong methodological rigor and reproducibility. Common gaps included limited open dataset access (QC3) and lack of source code availability (QC10), which hinder transparency and collaboration. Strengths across most studies included strong generalizability (QC5) and clear outcome reporting (QC8), demonstrating the field’s commitment to methodological clarity despite resource constraints.
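The scoring scheme can be reproduced directly from the quality assessment table. In the sketch below, each study's row is transcribed as a string of ten marks (`1` for ✔, `0` for ❌) for the ten studies listed in that table; the encoding and the helper names `study_score` and `criterion_rate` are assumptions made for this example, and the rates cover only those listed rows.

```python
CRITERIA = [f"QC{i}" for i in range(1, 11)]

# Marks transcribed from the quality assessment table, QC1 through QC10.
ROWS = {
    "Nabi et al.":         "1101111110",
    "Zia et al.":          "1101111110",
    "Hamid et al.":        "1100011110",
    "Ganai and Khursheed": "1101111110",
    "Rashid et al.":       "1111111111",
    "Ahmed et al.":        "1000011110",
    "Javed et al.":        "1101111111",
    "Kumar and Gupta":     "1111111110",
    "Xiang et al.":        "1111111111",
    "Memon et al.":        "1111111110",
}


def study_score(marks):
    """Number of criteria a study satisfies (out of 10)."""
    return marks.count("1")


def criterion_rate(rows, index):
    """Fraction of the listed studies that satisfy the criterion at `index`."""
    return sum(marks[index] == "1" for marks in rows.values()) / len(rows)


scores = {study: study_score(marks) for study, marks in ROWS.items()}
rates = {qc: criterion_rate(ROWS, i) for i, qc in enumerate(CRITERIA)}

for qc, rate in rates.items():
    print(f"{qc}: {rate:.0%} of listed studies comply")
```

Computed this way, QC3 (open dataset access) and QC10 (source code availability) have the lowest compliance among the listed studies, which matches the gaps noted in the summary.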
Evaluation metrics
To assess the performance of Urdu Handwritten Character Recognition (UHCR) models, this review examines the evaluation metrics reported across the selected studies. These metrics provide insight into model accuracy, robustness, and generalization, especially in low-resource and linguistically complex contexts.
Accuracy and F1-Score
Accuracy was the most frequently reported metric, representing the proportion of correctly classified characters over the total number of predictions. While useful, accuracy can be misleading in imbalanced datasets where certain characters dominate. The F1-score, as the harmonic mean of precision and recall, offers a more balanced view, especially in contexts where rare ligatures and diacritical variations are common. Studies such as (Hamid et al., 2019; Rashid et al., 2023) reported F1-scores alongside accuracy to highlight improvements in minority class recognition.
Precision and Recall
Precision measures the proportion of true positives among all positive predictions, reflecting the model’s ability to avoid false alarms, which is critical in biometric applications such as writer identification or gender classification. Recall quantifies the proportion of actual positives correctly identified, ensuring that models capture all relevant character instances, even those with stylistic or cursive distortions. Studies including (Javed et al., 2019; Zia et al., 2020) reported precision-recall trade-offs to evaluate sensitivity and specificity across diverse handwriting styles.
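A small worked example shows why accuracy alone can mislead on imbalanced data. The sketch below computes accuracy, precision, recall, and F1 for one rare class from its confusion counts; the counts are invented for illustration and do not come from any reviewed study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, tn, fp, fn):
    """Proportion of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for one rare ligature class in a 1000-sample test set:
# only 20 samples belong to the class, and the model finds 5 of them.
tp, fp, fn, tn = 5, 5, 15, 975

acc = accuracy(tp, tn, fp, fn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={acc:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

With these counts the accuracy is high even though three quarters of the rare-class samples are missed, and the low recall and F1 make that failure visible.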
SSIM and Confusion Matrix
The Structural Similarity Index Measure (SSIM) was employed in studies involving synthetic data generation, particularly WGANs, to assess the structural fidelity of generated samples compared to real handwriting. SSIM values closer to 1 indicate high similarity, validating the realism of GAN-based augmentation. Confusion matrices provided granular insights into classification performance, revealing frequent misclassifications between visually similar Urdu characters such as “ر” and “ز” or “م” and “ن.” Studies combining GANs and CNNs (e.g., Memon et al., 2018) often used SSIM and confusion matrices together to validate both recognition and generation quality.
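To make the SSIM comparison concrete, the sketch below evaluates the SSIM formula once over a whole image, using the commonly cited constants $C_1 = (0.01L)^2$ and $C_2 = (0.03L)^2$ for dynamic range $L$. This is a deliberate simplification: standard SSIM implementations average the index over local sliding windows, so values from this single-window version are not comparable to those reported in the reviewed studies. The function name `global_ssim` and the pixel values are assumptions for illustration.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length flat lists of pixel values."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical flattened 2x4 grayscale patches (values 0-255).
real = [0, 255, 0, 255, 255, 0, 255, 0]
close_copy = [10, 245, 5, 250, 250, 10, 240, 0]   # slightly perturbed copy
inverted = [255 - p for p in real]                # structurally opposite

print(global_ssim(real, real))
print(global_ssim(real, close_copy))
print(global_ssim(real, inverted))
```

An identical pair scores 1, a slightly perturbed copy scores just below 1, and the inverted patch scores far lower, which is the ordering one would hope to see between real samples and good versus poor synthetic ones.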
Comparative Metric Trends
Across the 25 selected studies, several trends emerged. Reported accuracy ranged from 78% to 98%, with CNN-WGAN hybrids achieving the highest scores. F1-scores were consistently higher in models employing transfer learning or synthetic data augmentation. Precision and recall varied significantly depending on dataset balance and character complexity. SSIM values above 0.85 were reported in studies using WGANs for realistic handwriting generation. Confusion matrices consistently revealed misclassification challenges among visually similar characters, underscoring the complexity of Urdu script. These findings highlight the importance of employing multiple metrics to evaluate UHCR systems, particularly in low-resource contexts.
Keyword co-occurrence map
Keyword co-occurrence analysis was conducted using VOSviewer to identify thematic clusters within the Urdu Handwritten Character Recognition (UHCR) literature published between 2019 and 2024. This visualization highlights dominant research domains, methodological trends, and emerging thematic linkages across the selected studies (Figure 2).

Figure 2: Keyword Co-occurrence Map of UHCR Literature (2019-2024).
As shown in Figure 2, the keyword co-occurrence map reveals two dominant clusters: a red cluster representing biometric technologies and a green cluster representing machine learning. The central node biometrics (access control) bridges both domains, underscoring UHCR’s dual role as a technical challenge and a practical solution.
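At its core, a co-occurrence map starts from counts of how often two keywords appear on the same paper. The sketch below computes those pair counts in plain Python; the keyword lists are invented for illustration, and a bibliometric tool applies its own normalization, layout, and clustering on top of such counts, none of which is reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count how many papers list each unordered pair of keywords together."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Normalize case and drop duplicates so each pair counts once per paper.
        unique = sorted({keyword.lower() for keyword in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical author-keyword lists for three papers.
papers = [
    ["Machine learning", "Character recognition", "Deep learning"],
    ["Biometrics", "Access control", "Machine learning"],
    ["Character recognition", "Machine learning"],
]

counts = cooccurrence(papers)
for (first, second), n in counts.most_common(3):
    print(f"{first} + {second}: {n}")
```

Pairs with high counts become the strong links in the map, and keywords that pair with terms from both themes end up in bridging positions.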
Visualization and Interpretation
The keyword co-occurrence map provides more than a technical snapshot; it offers a window into how the field of Urdu Handwritten Character Recognition (UHCR) has been shaped in recent years. Each node represents a keyword drawn from the literature, and the way these nodes cluster together reveals the themes that researchers have been most invested in. Larger nodes, such as machine learning and biometrics, immediately stand out, signaling their central role in the conversation.
Two dominant clusters emerge from the visualization. The red cluster, focused on biometric technologies, reflects the applied side of UHCR. Here, terms like fingerprint recognition, access control, and authorization point to the practical systems where recognition technologies are deployed: identity verification, secure document handling, and authentication processes. The green cluster, by contrast, highlights the methodological backbone of the field. Keywords such as deep learning, character recognition, and detection show how researchers are building increasingly sophisticated models to tackle the challenges of Urdu script.
At the center of the map lies biometrics (access control), a bridging node that connects both clusters. Its position illustrates the dual identity of UHCR: it is simultaneously a technical challenge requiring advanced machine learning solutions and a practical tool with direct implications for biometric systems. Surrounding these clusters are smaller, peripheral nodes (synthetic data, multilingual OCR, and Nastaliq script), which hint at emerging directions and the beginnings of new conversations in the field.
Taken together, the visualization paints a picture of a research community that is both technically ambitious and practically motivated. It shows how innovation in algorithms is closely tied to real-world applications, and how even the less-connected nodes point toward opportunities for growth and exploration.
Insights from the Map
The keyword co-occurrence map does more than simply display clusters of terms; it tells the story of how UHCR research has evolved over the past five years. The strong presence of machine learning-related keywords, particularly CNNs and GANs, shows how the field has shifted away from traditional feature-based methods toward deep learning approaches that can better handle the complexity of Urdu script. This cluster reflects the technical heartbeat of the research community, where innovation in model design and data augmentation continues to drive progress.
On the other hand, the biometric technologies cluster highlights the practical motivation behind UHCR. Terms such as fingerprint recognition, access control, and authorization remind us that this work is not only about algorithms but also about real-world applications: systems that safeguard identities, manage secure access, and digitize critical documents. The central node, biometrics (access control), sits at the intersection of these two worlds, symbolizing how technical advances in machine learning are directly tied to practical deployment in biometric systems.
The map also points to areas that are beginning to gain traction. Keywords like synthetic data, multilingual OCR, and Nastaliq script suggest that researchers are starting to tackle challenges of limited datasets and cross-lingual recognition. At the same time, the weaker connections around real-time recognition and mobile OCR reveal opportunities that remain underexplored. These gaps highlight the need for UHCR systems that can operate efficiently on devices and in resource-constrained environments, making them more accessible and scalable.
Taken together, the visualization underscores both the strengths and the unfinished work in the field. It shows a community that has made significant strides in accuracy and model sophistication, but one that still needs to push further into deployment, inclusivity, and standardization. Future research will benefit from bridging these gaps: developing real-time solutions, expanding to low-resource scripts, and creating evaluation frameworks that allow results to be compared consistently across studies. In this way, UHCR can continue to grow as both a technical discipline and a practical solution with wide-reaching impact.
The keyword co-occurrence analysis not only highlights the dominant clusters of machine learning and biometric technologies but also reveals the bridging role of biometrics (access control) as a unifying theme. Together, the visualization and insights underscore how UHCR research has evolved into a field that balances technical innovation with practical application. These findings provide a foundation for the subsequent discussion, where the implications of these thematic trends are examined in greater depth, and future directions are articulated in relation to broader challenges of deployment, inclusivity, and standardization.
DISCUSSION
This review reveals a dynamic and evolving landscape in Urdu Handwritten Character Recognition (UHCR), shaped by rapid advances in deep learning. Convolutional Neural Networks (CNNs) remain foundational for spatial feature extraction, providing robust performance in handling the complex ligatures and stroke variations of Urdu script. At the same time, generative approaches such as Wasserstein GANs (WGANs) have contributed significantly to dataset expansion, enabling the creation of synthetic handwriting samples that help mitigate the scarcity of annotated corpora. More recently, transformer-based architectures, including BERT and Vision Transformers, have emerged as powerful alternatives, offering improved generalization and contextual understanding across diverse handwriting styles.
The keyword co-occurrence map reinforces these observations by highlighting two dominant research clusters: biometric authentication and machine learning. This duality confirms the interdisciplinary nature of UHCR, where character recognition intersects with identity verification, gender classification, and access control. The centrality of biometrics (access control) within the map underscores UHCR’s growing relevance to both educational and security domains, bridging technical innovation with practical deployment.
Despite these advancements, several challenges persist. Many models continue to struggle with cursive ligatures, writer-dependent variations, and inconsistent stroke patterns, which remain defining features of Urdu handwriting. Dataset diversity is limited, with few studies providing open access to annotated corpora, thereby hindering reproducibility and slowing progress toward standardized benchmarks. The lack of large-scale, publicly available datasets also restricts comparative evaluation, making it difficult to establish consensus on performance metrics across different approaches.
Taken together, these findings suggest that while UHCR research has achieved notable progress in accuracy and methodological sophistication, future work must address issues of inclusivity, reproducibility, and deployment. Expanding dataset availability, strengthening real-time recognition systems, and developing standardized evaluation frameworks will be critical steps toward advancing UHCR as both a technical discipline and a practical solution with wide-reaching impact.
While CNNs and WGANs dominate current UHCR research, other generative approaches such as Variational Autoencoders (VAEs) and diffusion models remain largely unexplored. Their absence in the reviewed studies highlights a methodological gap but also points to promising opportunities for future work in synthetic dataset generation and model generalization.
LIMITATIONS
While this review followed PRISMA guidelines and applied a structured quality assessment, several limitations must be acknowledged. First, the scope of included studies was restricted to publications between 2019 and 2024, which may exclude earlier foundational work or very recent advances. Second, dataset imbalance remains a recurring issue across the reviewed literature, with many studies relying on small or skewed samples that limit generalizability across handwriting styles. Third, annotation processes are often manual, labor-intensive, and prone to inconsistency, reducing reproducibility. In addition, generative approaches such as GANs occasionally suffer from instability, while other promising models like VAEs and diffusion networks remain underexplored in UHCR. Finally, the absence of universally accepted benchmarks and limited code availability hinder fair comparison and collaborative progress.
FUTURE WORK
To advance UHCR research, future studies should pursue several strategic directions. Transfer learning across structurally related scripts such as Arabic, Persian, and Hindi can enhance recognition accuracy while reducing training time. Synthetic dataset generators employing GANs, VAEs, and diffusion models should be developed to create diverse, annotated handwriting samples that capture real-world variation. Multilingual OCR systems capable of recognizing multiple South Asian scripts within a unified framework would broaden applicability and foster inclusivity. Real-time applications, including mobile and web-based OCR tools, are particularly needed for education, archiving, and accessibility in low-resource settings. Finally, benchmark standardization through open, annotated datasets and agreed-upon evaluation protocols will be essential to enable fair comparison and reproducibility across studies.
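As an illustration of what an agreed-upon evaluation protocol might report, the sketch below computes overall accuracy alongside macro-averaged F1 from predicted and true labels. This is an assumption about one reasonable protocol, not a standard proposed by the reviewed studies. The function names and toy labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return {class: F1} computed from per-class TP, FP, and FN counts."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy case: a classifier that always predicts the majority class.
y_true = ["alif"] * 8 + ["tay"] * 2
y_pred = ["alif"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy: {acc:.3f}, macro-F1: {mf1:.3f}")
```

In this toy case accuracy looks acceptable while macro-F1 exposes the complete failure on the rare class. This is why reporting both on a shared, open test split would make results across studies easier to compare.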
CONCLUSION
This systematic literature review provides a comprehensive synthesis of deep learning approaches to Urdu Handwritten Character Recognition (UHCR), drawing on 25 high-quality studies published between 2019 and 2024. The analysis highlights key methodological trends, including the continued reliance on Convolutional Neural Networks (CNNs) for spatial feature extraction, the growing use of Wasserstein GANs (WGANs) for synthetic dataset generation, and the emergence of transformer-based architectures such as BERT and Vision Transformers as promising alternatives for improved generalization and contextual understanding. The keyword co-occurrence map further revealed two dominant research clusters, biometric authentication and machine learning, underscoring UHCR’s interdisciplinary nature and its dual role as both a technical challenge and a practical solution for identity verification and secure access systems.
Despite these advances, persistent challenges remain. Limited dataset diversity, annotation inconsistencies, and the absence of standardized benchmarks continue to hinder reproducibility and comparability across studies. The instability of some generative models and the lack of open-source code availability further constrain collaborative progress. Addressing these limitations will be essential for the field to mature.
Looking ahead, future research should prioritize transfer learning across related scripts, the development of robust synthetic dataset generators, and the design of multilingual OCR systems capable of handling South Asian scripts within unified frameworks. Real-time applications, particularly mobile and web-based tools, hold significant potential for educational and accessibility contexts, while benchmark standardization remains critical for establishing fair and reproducible evaluation protocols.
In conclusion, UHCR research stands at a pivotal moment. With advances in deep learning, synthetic data generation, and multilingual modeling, the field is well positioned to expand beyond experimental accuracy toward scalable, inclusive, and practical deployment. This review serves as both a roadmap and a call to action for researchers, educators, and technologists committed to enhancing the accessibility and accuracy of handwritten text recognition in Urdu and related low-resource languages.
Cite this article:
Faiq A, Noor MNMM, Abdullah M. Systematic Literature Review on the Integration of Wasserstein GANs and Convolutional Neural Networks for Urdu Handwritten Character Recognition. Info Res Com. 2025;2(3):297-306.
ACKNOWLEDGEMENT
None.
ABBREVIATIONS
| Abbreviation | Definition |
|---|---|
| GAN | Generative Adversarial Network |
| CNN | Convolutional Neural Network |
| AI | Artificial Intelligence |
| OCR | Optical Character Recognition |
| URDU-HWR | Urdu Handwriting Recognition |
REFERENCES
- Ahmed G., Rizvi S. R., Khan M. A. Recognition of Urdu handwritten alphabets using convolutional neural networks. IEEE Transactions on Neural Networks and Learning Systems. 2021;32(9):3789-3801.
- Ahmed S., Ahmed J., Iqbal J. Handwritten Urdu character recognition using convolutional neural network. International Journal of Advanced Computer Science and Applications. 2019;10(5):231-236.
- Arjovsky M., Chintala S., Bottou L. Wasserstein GAN. In Proceedings of the 34th International Conference on Machine Learning. 2017:214-223.
- Bai X., Du Y. A survey on OCR for handwritten text recognition. Pattern Recognition. 2018;78:85-103.
- Bhatti A., Khan S. A., Khan M. A., Khan S. A. Custom convolutional neural network for Urdu numeral recognition. IEEE Transactions on Image Processing. 2021;30:1234-1247.
- Bhatti A., Khan S. A., Khan M. A., Khan S. A. Recognition and classification of handwritten Urdu numerals using deep learning techniques. Applied Sciences. 2023;13(3):1234-1245.
- Chen X., Jin L. A comprehensive survey of handwriting recognition. Pattern Recognition. 2020;107:107206.
- Ganai A. F., Khurshid F. Transformer-based handwritten Urdu recognition using BERT architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2021;43(3):1001-1014.
- Ganai A. F., Khursheed F. A novel holistic unconstrained handwritten Urdu recognition system using convolutional neural networks. International Journal on Document Analysis and Recognition. 2022;25:351-371.
- Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y. Generative adversarial networks. In Advances in Neural Information Processing Systems. 2014;27:2672-2680.
- Hamid I., Raja R., Anand M., Karnatak V., Ali A. Convolutional neural network model for handwritten Urdu character identification. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022;44(3):789-802.
- Hashmi U., Rehman S. Handwritten Arabic text recognition: A comprehensive review. International Journal on Document Analysis and Recognition. 2019;22(2):123-145.
- Husnain M., Missen M. M. S., Mumtaz S., Coustaty M., Luqman M., Ogier J. M., et al. Urdu handwritten text recognition: A survey. IET Image Processing. 2020;14(11):2291-2300.
- Husnain M. Recognition of Urdu handwritten characters using convolutional neural network. Applied Sciences. 2019;9(13):1234-1245.
- ICDAR 2023, Lecture Notes in Computer Science, 14190. 2023:428-444.
- Mushtaq F., Misgar M. M., Kumar M., Khurana S. S. UrduDeepNet: Offline handwritten Urdu character recognition using deep neural networks. Neural Computing and Applications. 2021.
- Nabi S. T., Kumar M., Singh P. An innovative system for gender classification based on handwritten Urdu characters. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020;42(7):1691-1704.
- Nabi S. T., Singh P., Kumar M. Writer identification from offline handwriting images in Urdu script with DenseNet. In Proceedings of the 14th International Conference on Computing Communication and Networking Technologies (ICCCNT). 2023.
- Nogueira R., Costa M. Deep learning approaches to handwriting recognition: A review. Pattern Recognition. 2021;118:107648.
- Ozkaya A., Aydin G. Survey on deep learning-based OCR systems for handwritten text. Journal of Artificial Intelligence Research. 2023;54:123-145.
- PRISMA Statement. Preferred reporting items for systematic reviews and meta-analyses. 2020.
- Rashid D., Gondhi N. K. Comparative analysis of machine learning techniques for recognizing handwritten Urdu text. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020;42(5):1200-1213.
- Rizvi S. R., Khan M. A., Abbas S., Asadullah M., Anwer N., Fatima A., et al. Advanced optical character recognition system for Nastaliq. IEEE Journal of Selected Topics in Signal Processing. 2019;8(3):456-468.
- Sagheer M. W., He C. L., Nobile N., Suen C. Y. In Lecture Notes in Computer Science. 2009:538-546.
- Sharif M., Ul-Hasan A., Shafait F. In Lecture Notes in Computer Science. 2022:29-40.
- Siddiqui S. S. A. W., Kanke R. G., Gaikwad R. M., Baheti M. R. Review on isolated Urdu character recognition: Offline handwritten approach. International Journal for Research in Applied Science and Engineering Technology. 2023;11(8):1234-1245.
- Zia N. U. S., Naeem M. F., Raza S. M. K., Khan M. M., Ul-Hasan A., Shafait F., et al. Convolutional recursive deep architecture for unconstrained Urdu handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2021;43(5):1500-1513.
