Afro-XLM-R Fine-Tuned for Setswana Offensive Language Detection
1. Model Summary
This repository contains a fine-tuned version of Afro-XLM-R, a multilingual transformer model optimised for African languages.
The model has been fine-tuned to classify Setswana text into:
- 0: Non-offensive
- 1: Offensive
Afro-XLM-R provides a multilingual baseline to benchmark performance against monolingual Setswana models such as PuoBERTa.
Its cross-lingual capabilities make it particularly useful when dealing with:
- Code-switching
- Multilingual social media content
- Borrowed words from English/Setswana
2. Intended Use
Primary Use Cases
- Detection of offensive, abusive, or harmful expressions in Setswana text.
- Digital forensic analysis of Facebook, WhatsApp, and other social media content.
- Research in low-resource NLP for African languages.
- Benchmarking multilingual vs monolingual transformer performance.
Not Intended For
- Fully automated decision systems without human oversight.
- Legal conclusions or disciplinary outcomes without expert forensic interpretation.
- Non-Setswana text unless validated.
3. Dataset Description
A curated dataset of 977 Setswana social media text samples was used.
Class Distribution
- Offensive: 477
- Non-offensive: 500
Annotation Notes
- Offensive content includes insults, cyberbullying, hate speech, threats, and abusive slang.
- Semantic triggers were used during training for improved sensitivity to Setswana insult constructions.
- The test split is tag-free to reflect real-world forensic environments.
Ethical Handling
- All posts were sourced from publicly available content.
- Identifiable information was removed.
- This dataset is not automatically redistributed as part of the model.
4. Training Procedure
Model Architecture
- Base model: Afro-XLM-R
- Backbone: XLM-RoBERTa
- Multilingual African-centric pretraining dataset
- ~270M parameters (depending on variant)
Training Hyperparameters
- Epochs: 10
- Batch size: 16 (training), 64 (evaluation)
- Optimizer: AdamW
- Learning rate: 1e-5
- Weight decay: 0.01
- Loss function: class-weighted cross entropy
- Class weights: [1.0, 2.0] (non-offensive, offensive)
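To illustrate what the class-weighted loss above does, here is a minimal pure-Python sketch, not the training code used for this model. It assumes the weighted-mean form, which is what `torch.nn.CrossEntropyLoss(weight=...)` computes with its default `mean` reduction. The logits and labels are hypothetical values chosen only for the demonstration.

```python
import math

# Class weights from the hyperparameter list: index 0 = non-offensive, 1 = offensive.
CLASS_WEIGHTS = [1.0, 2.0]


def log_softmax(logits):
    """Numerically stable log-softmax over one example's logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_norm for z in logits]


def weighted_cross_entropy(batch_logits, labels, weights=CLASS_WEIGHTS):
    """Weighted mean of per-example negative log-likelihoods.

    Each example's loss is scaled by the weight of its true class, and the
    total is divided by the sum of those weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, label in zip(batch_logits, labels):
        nll = -log_softmax(logits)[label]
        weighted_sum += weights[label] * nll
        weight_total += weights[label]
    return weighted_sum / weight_total


# Hypothetical batch: both examples are scored as class 0 (non-offensive),
# but the second one is actually offensive, i.e. a false negative.
batch_logits = [[2.0, 0.0], [2.0, 0.0]]
labels = [0, 1]

loss_weighted = weighted_cross_entropy(batch_logits, labels)
loss_unweighted = weighted_cross_entropy(batch_logits, labels, weights=[1.0, 1.0])
print(f"weighted: {loss_weighted:.4f}  unweighted: {loss_unweighted:.4f}")
```

With the offensive class weighted at 2.0, the missed offensive example contributes more to the loss than it would unweighted, which pushes training toward higher recall on class 1.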
Hardware
- Trained on a Google Colab GPU (T4 or A100, depending on the session).
5. Evaluation Methodology
The dataset split follows:
- 80% training
- 20% held-out test set
- 5-fold stratified cross-validation used during model selection.
- No semantic triggers or augmentations present in the test set.
Evaluation uses the following metrics:
- Accuracy
- Macro F1
- Recall for offensive class
- Matthews Correlation Coefficient (MCC)
- ROC-AUC
- Runtime speed
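Accuracy, macro F1, recall for the offensive class, and MCC can all be derived from the four confusion-matrix counts. The sketch below shows the standard formulas in pure Python. The function name `binary_metrics` and the counts passed to it are hypothetical, and the output is not this model's results. ROC-AUC is left out because it needs the predicted scores rather than hard labels.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, macro F1, positive-class recall, and MCC from counts.

    tp/fn count offensive (class 1) examples that were caught/missed;
    tn/fp count non-offensive (class 0) examples that were passed/flagged.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # Recall for the offensive class: share of true offensive posts detected.
    recall_offensive = tp / (tp + fn) if (tp + fn) else 0.0

    # Per-class F1 = 2*TP / (2*TP + FP + FN); for class 0 the roles of
    # tp and tn swap. Macro F1 is the unweighted mean of the two.
    f1_offensive = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    f1_non_offensive = 2 * tn / (2 * tn + fn + fp) if (2 * tn + fn + fp) else 0.0
    macro_f1 = (f1_offensive + f1_non_offensive) / 2

    # Matthews Correlation Coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "macro_f1": macro_f1,
        "recall_offensive": recall_offensive,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```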
6. Test Set Results (Final Model)
| Metric | Value |
|---|---|
| Accuracy | 0.8622 |
| Macro F1-score | 0.8603 |
| Recall (Offensive = 1) | 0.8111 |
| MCC | 0.7229 |
| ROC-AUC | 0.9015 |
| Loss | 0.3895 |
| Runtime (seconds) | 1.1634 |
| Samples per second | 168.468 |
| Steps per second | 3.438 |
Interpretation
- The ROC-AUC of 0.90 demonstrates strong separation between offensive and non-offensive classes.
- MCC = 0.7229 indicates strong classification reliability in mildly imbalanced data.
- Recall(1) = 0.8111 means the model captures most harmful/offensive cases, which is useful for forensic workflows where false negatives are costly.
- Inference is slightly slower than with PuoBERTa, owing to the larger model size and multilingual embedding space.
Overall, Afro-XLM-R performs strongly as a multilingual baseline for Setswana offensive-language detection.
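Because false negatives are costly in forensic workflows, one option is to flag a post as offensive at a probability below the default 0.5. This model card does not describe any threshold tuning, so the sketch below is only an illustration of the trade-off. The probabilities, labels, and the helper `recall_and_false_positives` are hypothetical.

```python
def recall_and_false_positives(probs_offensive, labels, threshold):
    """Return (recall for class 1, false-positive count) at a given threshold.

    A post is flagged as offensive when its predicted probability of
    class 1 is greater than or equal to the threshold.
    """
    preds = [1 if p >= threshold else 0 for p in probs_offensive]
    tp = sum(1 for pred, y in zip(preds, labels) if pred == 1 and y == 1)
    fn = sum(1 for pred, y in zip(preds, labels) if pred == 0 and y == 1)
    fp = sum(1 for pred, y in zip(preds, labels) if pred == 1 and y == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return recall, fp


# Hypothetical offensive-class probabilities and true labels.
probs_offensive = [0.92, 0.61, 0.43, 0.35, 0.20, 0.08]
labels = [1, 1, 1, 0, 0, 0]

at_default = recall_and_false_positives(probs_offensive, labels, threshold=0.5)
at_lowered = recall_and_false_positives(probs_offensive, labels, threshold=0.3)
print(f"threshold 0.5: recall={at_default[0]:.2f}, false positives={at_default[1]}")
print(f"threshold 0.3: recall={at_lowered[0]:.2f}, false positives={at_lowered[1]}")
```

Lowering the threshold catches the borderline offensive post but also flags one non-offensive post, so any such change should be validated on held-out data and reviewed by a human analyst.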
7. How to Use the Model
Python Inference Example
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "mopatik/Afro-XLM-R-offensive-detection-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Ensure model is in evaluation mode (disables dropout)
model.eval()

# Sample text (replace with your actual text)
# sample_text = "o seso tota"  # (you are insanely stupid) Example Setswana text
sample_text = "modimo a le segofatse"  # (God bless you all) Example Setswana text

# Tokenize and prepare input
inputs = tokenizer(
    sample_text,
    padding="max_length",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)

# Make prediction without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = torch.argmax(probs, dim=1).item()

# Get class label and confidence
class_names = ["Non-offensive", "Offensive"]
confidence = probs[0][predicted_class].item()

print(f"Text: {sample_text}")
print(f"Predicted class: {class_names[predicted_class]} (confidence: {confidence:.2%})")
print(f"Class probabilities: {dict(zip(class_names, [f'{p:.2%}' for p in probs[0].tolist()]))}")
Model tree for mopatik/Afro-XLM-R-offensive-detection-v1
Base model
Davlan/afro-xlmr-base