Canadian Street View Classifier
Deep learning models for classifying street-view images of Canadian cities. This repository contains multiple models fine-tuned on the Canadian Street View Cities dataset.
Models Included
- SwinV2
- ConvNeXt
The repository contains one CNN-based model (ConvNeXt) and one Transformer-based model (SwinV2), both trained to predict the city shown in a street-view image.
Cities Included
Calgary, Charlottetown, Edmonton, Halifax, Hamilton, Kitchener-Waterloo, Montreal, Ottawa-Gatineau, Québec City, Saskatoon, St Johns, Toronto, Vancouver, Victoria, Winnipeg
Model Performance
| Model | Accuracy | Macro Precision | Macro Recall | Macro F1-Score |
|---|---|---|---|---|
| ConvNeXt-tiny | 0.98980 | 0.98983 | 0.98980 | 0.98980 |
| Swin Transformer V2 | 0.99440 | 0.99439 | 0.99440 | 0.99439 |
Performance was evaluated on the test split of the Canadian Street View Cities dataset. Both models score above 0.989 on every reported metric, with Swin Transformer V2 (0.99440 accuracy) slightly outperforming ConvNeXt-tiny (0.98980 accuracy).
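For readers who want to reproduce these numbers on their own predictions, the following is a minimal sketch of how accuracy and the macro-averaged metrics can be computed from integer class labels. It assumes macro F1 is the unweighted mean of per-class F1 scores (the scikit-learn convention); the card does not state which convention was used, and the function name `macro_metrics` is illustrative rather than part of this repository.

```python
from collections import Counter


def macro_metrics(y_true, y_pred, num_classes):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: macro F1 is the unweighted mean of per-class F1 scores.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    accuracy = sum(tp.values()) / len(y_true)
    return (
        accuracy,
        sum(precisions) / num_classes,
        sum(recalls) / num_classes,
        sum(f1s) / num_classes,
    )


# Toy example with 3 classes (not data from this repository).
acc, macro_p, macro_r, macro_f1 = macro_metrics(
    y_true=[0, 0, 1, 1, 2, 2],
    y_pred=[0, 1, 1, 1, 2, 0],
    num_classes=3,
)
```

Macro recall equals accuracy whenever every class has the same number of test images, which would be consistent with the identical accuracy and macro recall values in the table above.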
Known Limitations
These models were trained on images sourced from Mapillary. As a result, their performance may be lower when applied to street-view images from other datasets or sources, due to differences in image style, quality, or perspective.
Demo
Try the model live in a Space: Canadian StreetView Classifier
Usage Example
Installation
pip install torch torchvision timm huggingface_hub
Download Model Weights
from huggingface_hub import hf_hub_download
vit_path = hf_hub_download(
    repo_id="canada-guesser/canadian_streetview_cities_models",
    filename="vit_model/swinv2_base_window12_192_0_finetuned_canadian_streetview.bin",
)
Initialize model
import torch
import timm
model = timm.create_model("swinv2_base_window12_192", pretrained=False, num_classes=15)
model.load_state_dict(torch.load(vit_path, map_location="cpu"))
model.eval()
Transform and predict
from PIL import Image
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((192, 192)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
])
class_names = [
    "Calgary", "Charlottetown", "Edmonton", "Halifax", "Hamilton",
    "Kitchener-Waterloo", "Montreal", "Ottawa-Gatineau", "Quebec City", "Saskatoon",
    "St Johns", "Toronto", "Vancouver", "Victoria", "Winnipeg",
]
img = Image.open("img.jpg").convert("RGB")
x = transform(img).unsqueeze(0)
# Run inference without tracking gradients
with torch.no_grad():
    pred = model(x)
# Map the highest-scoring class index to its city name
print(class_names[pred.argmax().item()])
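The example above prints only the single most likely city. As a hedged sketch, the raw outputs can also be turned into probabilities with a softmax to list the top few candidates. This assumes the model returns one unnormalized score (logit) per class, which is the usual behavior of a timm classification head; the helper `top_k_cities` is a hypothetical name, not part of this repository, and it is written in plain Python so it runs without any extra dependencies.

```python
import math


def top_k_cities(logits, class_names, k=3):
    """Return the k most likely (city, probability) pairs from raw logits."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits for illustration only (not real model output).
top = top_k_cities([0.0, 2.0, 1.0], ["Calgary", "Halifax", "Toronto"], k=2)
for city, prob in top:
    print(f"{city}: {prob:.3f}")
```

With the variables from the usage example, this could be called as `top_k_cities(pred.squeeze(0).tolist(), class_names)`.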
Citation
If you use the dataset or these models, please cite:
- Stephen Rebel, Danial McIntyre, Sharav Bali. Canadian Street View Classifier. Hugging Face, 2025.