Enterprise AI Analysis: Evaluation of Monolingual and Multilingual Transformer Models for Nepali Headline Generation

This paper evaluates four Transformer-based models for Nepali headline generation, a task made difficult by the language's low-resource status and complex morphology. Four models were fine-tuned on a custom dataset (NEPHEAD): mBART-large-cc25 and mT5-small (multilingual encoder-decoders), NepBERTa (a monolingual encoder adapted to seq2seq), and LLaMA-3.2-1B (a multilingual decoder-only model). Evaluation combined automatic metrics (ROUGE, BLEU, METEOR, BERTScore, SBERT) with human ratings (relevance, fluency, conciseness, accuracy, engagement). The multilingual encoder-decoder models, especially mBART-large-cc25, showed the best overall performance. LLaMA-3.2-1B adapted well despite having no explicit Nepali pretraining, while NepBERTa's effectiveness was limited by architectural and tokenization constraints. The study concludes that pretraining objectives, model architecture, and tokenizer design are critical for effective NLG in low-resource languages.

Deep Analysis & Enterprise Applications

Developing Natural Language Generation (NLG) systems for low-resource languages like Nepali faces significant challenges due to scarce annotated data, lack of specialized NLP tools, and limited availability of pre-trained models. This study directly addresses these gaps by evaluating transformer models for Nepali headline generation.

The research compares four distinct Transformer-based architectures: multilingual encoder-decoder (mBART, mT5), a language-specific encoder adapted for seq2seq (NepBERTa), and a multilingual decoder-only model (LLaMA). This provides insights into their suitability for low-resource text generation.

A dual evaluation strategy combines automatic lexical and semantic metrics (ROUGE, BLEU, METEOR, BERTScore, SBERT) with human assessment (relevance, fluency, conciseness, accuracy, engagement). Inter-annotator agreement is used to check the reliability of the human judgments.
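To make the lexical side of this evaluation concrete, the sketch below computes simplified ROUGE-1 and ROUGE-L F1 scores in plain Python. It assumes whitespace tokenization and no stemming, which the source does not specify, and the function names are illustrative rather than taken from the paper or any library.

```python
from collections import Counter


def rouge_1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings."""
    ref_tokens, cand_tokens = reference.split(), candidate.split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """Longest-common-subsequence F1 between two whitespace-tokenized strings."""
    ref_tokens, cand_tokens = reference.split(), candidate.split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example headlines, not drawn from NEPHEAD.
reference_headline = "नेपालमा नयाँ बजेट सार्वजनिक"
generated_headline = "नयाँ बजेट सार्वजनिक भयो"
print(rouge_1_f1(reference_headline, generated_headline))
print(rouge_l_f1(reference_headline, generated_headline))
```

Whitespace tokenization is a coarse choice for a morphologically rich language such as Nepali, so scores from this sketch would not be directly comparable to those from a standard ROUGE implementation.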

mBART-large-cc25 BERTScore-F1: 90.58%
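For readers unfamiliar with how a BERTScore-style F1 is formed, the following sketch shows the greedy-matching step on toy vectors. Real BERTScore obtains token vectors from a pretrained contextual encoder and can apply idf weighting and baseline rescaling. All of that is omitted here, and the settings used in the study are not stated in this summary.

```python
import math

Vector = list[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_match_f1(ref_vecs: list[Vector], cand_vecs: list[Vector]) -> float:
    """F1 from greedy token matching, the core idea behind BERTScore.

    Recall: each reference token is matched to its most similar candidate token.
    Precision: each candidate token is matched to its most similar reference token.
    """
    if not ref_vecs or not cand_vecs:
        return 0.0
    recall = sum(max(cosine(r, c) for c in cand_vecs) for r in ref_vecs) / len(ref_vecs)
    precision = sum(max(cosine(c, r) for r in ref_vecs) for c in cand_vecs) / len(cand_vecs)
    if precision + recall <= 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy 2-dimensional "embeddings" standing in for contextual token vectors.
reference_tokens = [[1.0, 0.0], [0.0, 1.0]]
candidate_tokens = [[1.0, 0.0]]
print(greedy_match_f1(reference_tokens, candidate_tokens))
```

In this toy case the single candidate token matches one reference token exactly, so precision is perfect while recall is halved, which is the kind of trade-off the F1 value summarizes.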

Enterprise Process Flow

Custom Dataset Curation (NEPHEAD)
Fine-tuning Transformer Models
Lexical & Semantic Auto-Evaluation
Multi-Criteria Human Evaluation
Identify Optimal Architectures

Feature      | mBART-large-cc25                                        | NepBERTa
Architecture | Multilingual Encoder-Decoder                            | Monolingual Encoder (adapted Seq2Seq)
Pretraining  | Cross-lingual, denoising autoencoder (includes Nepali)  | Monolingual Nepali corpus, masked language model
Tokenization | SentencePiece-based (coherent subwords)                 | WordPiece-based (fragmented outputs)
Performance  | Highest overall scores (automatic & human)              | Limited effectiveness, low fluency/coherence
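One rough way to quantify the "coherent subwords" versus "fragmented outputs" contrast is tokenizer fertility, the average number of subword pieces per whitespace-delimited word. The sketch below is an assumption of this write-up, not a measurement reported in the source, and the two toy tokenizers only stand in for real SentencePiece or WordPiece tokenizers.

```python
from typing import Callable, Iterable

Tokenizer = Callable[[str], list[str]]


def fertility(tokenize: Tokenizer, texts: Iterable[str]) -> float:
    """Average number of tokenizer pieces per whitespace-delimited word."""
    total_words = 0
    total_pieces = 0
    for text in texts:
        total_words += len(text.split())
        total_pieces += len(tokenize(text))
    return total_pieces / total_words if total_words else 0.0


def whole_word_tokenizer(text: str) -> list[str]:
    """Toy tokenizer: one piece per word (the lowest possible fragmentation)."""
    return text.split()


def character_tokenizer(text: str) -> list[str]:
    """Toy tokenizer: one piece per code point (heavy fragmentation)."""
    return [ch for word in text.split() for ch in word]


sample_texts = ["ab cd", "efg"]  # placeholder strings, not Nepali data
print(fertility(whole_word_tokenizer, sample_texts))
print(fertility(character_tokenizer, sample_texts))
```

With a Hugging Face tokenizer, the `tokenize` method could be passed in place of the toy functions. A higher fertility on Nepali text would suggest heavier fragmentation, though it is only a proxy for generation quality.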

Case Study: LLaMA-3.2-1B's Adaptability

Despite lacking explicit Nepali pretraining, LLaMA-3.2-1B, a multilingual decoder-only model, demonstrated competitive performance in Nepali headline generation after task-specific fine-tuning. This highlights the strong generalization capabilities of large-scale decoder-only architectures when guided by prompt-based learning.

Outcome: Achieved a competitive BERTScore-F1 of 90.04% and an SBERT score of 69.64%, indicating strong cross-lingual transfer through fine-tuning.
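The prompt-based setup mentioned in this case study can be pictured as turning each article and headline pair into a single text sequence for a decoder-only model. The sketch below is hypothetical: the template wording, the `<eos>` placeholder, and the character limit are assumptions, since the source does not give the actual prompt format or end-of-sequence token.

```python
# Hypothetical template; the study's actual prompt wording is not given here.
PROMPT_TEMPLATE = "Article:\n{article}\n\nHeadline:\n"
EOS_PLACEHOLDER = "<eos>"  # stand-in for the tokenizer's real end-of-sequence token


def build_inference_prompt(article: str, max_article_chars: int = 2000) -> str:
    """Prompt shown to the model at generation time (no headline attached)."""
    return PROMPT_TEMPLATE.format(article=article.strip()[:max_article_chars])


def build_training_text(
    article: str,
    headline: str,
    eos_token: str = EOS_PLACEHOLDER,
    max_article_chars: int = 2000,
) -> str:
    """Prompt plus target headline, terminated so the model learns where to stop."""
    prompt = build_inference_prompt(article, max_article_chars)
    return prompt + headline.strip() + eos_token


def extract_headline(
    generated_text: str, prompt: str, eos_token: str = EOS_PLACEHOLDER
) -> str:
    """Strip the echoed prompt and keep the first line before the end token."""
    completion = generated_text
    if completion.startswith(prompt):
        completion = completion[len(prompt):]
    completion = completion.split(eos_token, 1)[0].strip()
    return completion.splitlines()[0].strip() if completion else ""


example_prompt = build_inference_prompt("  article body  ")
example_output = example_prompt + " sample headline" + EOS_PLACEHOLDER + "extra"
print(extract_headline(example_output, example_prompt))
```

Keeping the training and inference prompts generated by the same function avoids a mismatch between what the model saw during fine-tuning and what it is asked at generation time.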

Your AI Implementation Roadmap

A structured approach to integrating advanced NLG for low-resource languages (LRLs) into your enterprise operations for maximum impact.

Phase 1: Data Curation & Preprocessing
(4-6 Weeks)

Gathering, cleaning, and standardizing diverse text corpora for low-resource languages, ensuring quality and consistency for model training.
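A minimal sketch of what the cleaning and standardizing step might involve for article and headline pairs is shown below. The NFC normalization, the minimum-length threshold, and the exact-match deduplication are illustrative assumptions, not steps documented for NEPHEAD.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def normalize_text(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace.

    Zero-width joiners are left untouched because they can be meaningful
    in Devanagari text.
    """
    text = unicodedata.normalize("NFC", text)
    return _WHITESPACE.sub(" ", text).strip()


def clean_pairs(
    pairs: list[tuple[str, str]], min_article_words: int = 20
) -> list[tuple[str, str]]:
    """Normalize pairs, drop empty or too-short items, and remove exact duplicates."""
    seen: set[tuple[str, str]] = set()
    cleaned: list[tuple[str, str]] = []
    for article, headline in pairs:
        article, headline = normalize_text(article), normalize_text(headline)
        if not headline or len(article.split()) < min_article_words:
            continue
        key = (article, headline)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned


raw_pairs = [
    ("first  article\ttext here", " Headline one "),
    ("first article text here", "Headline one"),  # duplicate after normalization
    ("short", "Headline two"),                    # article too short
    ("another article text here", "   "),         # empty headline
]
print(clean_pairs(raw_pairs, min_article_words=3))
```

Near-duplicate detection and language filtering are common additional steps, but they would need extra tooling and are left out of this sketch.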

Phase 2: Model Adaptation & Fine-tuning
(8-12 Weeks)

Selecting and fine-tuning Transformer architectures (e.g., mBART, LLaMA) using task-specific datasets and optimizing hyperparameters.

Phase 3: Comprehensive Evaluation & Refinement
(6-8 Weeks)

Conducting both automatic and human evaluations, analyzing model outputs for quality, fluency, and accuracy, and iteratively refining models based on feedback.
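For the human-evaluation part of this phase, agreement between two annotators can be checked with a statistic such as Cohen's kappa. The sketch below is one possible choice: the source does not say which agreement measure was used, and for ordinal rating scales a weighted kappa or Krippendorff's alpha may be more appropriate.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(ratings_a: Sequence[Hashable], ratings_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators rating the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both annotators must rate the same non-empty set of items.")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: probability both annotators pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up fluency ratings on a 1-5 scale for six headlines.
annotator_1 = [5, 4, 4, 3, 5, 2]
annotator_2 = [5, 4, 3, 3, 5, 2]
print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

A kappa near 1 indicates agreement well above chance, while a value near 0 suggests the ratings agree no more often than random labeling would.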

Phase 4: Deployment & Integration
(4-8 Weeks)

Integrating the optimized NLG system into existing enterprise workflows and platforms, ensuring scalability and robust performance.

Ready to Transform Your Content Generation?

Leverage cutting-edge AI to overcome low-resource language challenges and unlock new possibilities for global content at scale.
