Canadian Street View Classifier

Deep learning models for classifying street-view images of Canadian cities. This repository contains multiple models fine-tuned on the Canadian Street View Cities dataset.


Models Included

  • SwinV2
  • ConvNeXt

The repository contains a CNN-based model (ConvNeXt) and a Transformer-based model (SwinV2), both trained to predict the city from a street-view image.


Cities Included

Calgary, Charlottetown, Edmonton, Halifax, Hamilton, Kitchener-Waterloo, Montreal, Ottawa-Gatineau, Québec City, Saskatoon, St. John's, Toronto, Vancouver, Victoria, Winnipeg


Model Performance

Model                 Accuracy   Macro Precision   Macro Recall   Macro F1-Score
ConvNeXt-tiny         0.98980    0.98983           0.98980        0.98980
Swin Transformer V2   0.99440    0.99439           0.99440        0.99439

Performance was evaluated on the test split of the Canadian Street View Cities dataset. Both models score close to 0.99 on every macro-averaged metric, and Swin Transformer V2 leads ConvNeXt-tiny by about 0.5 percentage points of accuracy (0.99440 vs. 0.98980).
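For readers unfamiliar with macro averaging, here is a minimal sketch of how these four metrics can be derived from a confusion matrix. It is not the evaluation script used for this repository; the function name `macro_metrics` and the toy matrix are illustrative assumptions. Macro averaging gives each city equal weight regardless of how many test images it has.

```python
def macro_metrics(confusion):
    """Return (accuracy, macro precision, macro recall, macro F1).

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    # Macro average: unweighted mean of the per-class scores
    return (correct / total, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n)


# Toy 3-class confusion matrix (made-up numbers, NOT results from this repository)
toy = [
    [9, 1, 0],
    [2, 8, 0],
    [0, 1, 9],
]
accuracy, macro_p, macro_r, macro_f1 = macro_metrics(toy)
```

When every class has the same number of test samples, as in the toy matrix, macro recall equals accuracy. The table shows the same equality for both models, which is consistent with (though not proof of) a class-balanced test split.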

Known Limitations

These models were trained on images sourced from Mapillary. As a result, their performance may be lower when applied to street-view images from other datasets or sources, due to differences in image style, quality, or perspective.


Demo

Try the model live in a Space: Canadian StreetView Classifier


Usage Example

Installation

pip install torch torchvision timm huggingface_hub

Download Model Weights

from huggingface_hub import hf_hub_download

vit_path = hf_hub_download(
    repo_id="canada-guesser/canadian_streetview_cities_models",
    filename="vit_model/swinv2_base_window12_192_0_finetuned_canadian_streetview.bin"
)
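This card only gives the filename of the SwinV2 checkpoint. To locate other weight files in the repository, such as the ConvNeXt checkpoint, one option is to list the repository contents and filter by extension. The sketch below assumes the weights use one of a few common suffixes; the helper names `find_weight_files` and `list_repo_weights` are hypothetical, while `list_repo_files` is the `huggingface_hub` function.

```python
# Suffixes assumed for weight files; adjust if the repository uses something else
WEIGHT_SUFFIXES = (".bin", ".safetensors", ".pt", ".pth")


def find_weight_files(filenames, keyword=""):
    """Return weight-like files whose path contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        name
        for name in filenames
        if name.lower().endswith(WEIGHT_SUFFIXES) and keyword in name.lower()
    )


def list_repo_weights(repo_id, keyword=""):
    """List weight files in a Hub repository. Requires network access."""
    # Imported lazily so find_weight_files can be used without huggingface_hub
    from huggingface_hub import list_repo_files

    return find_weight_files(list_repo_files(repo_id), keyword)
```

For example, `list_repo_weights("canada-guesser/canadian_streetview_cities_models", keyword="convnext")` may return the ConvNeXt checkpoint path, assuming its filename contains "convnext". The result can then be passed to `hf_hub_download` as `filename`.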

Initialize Model

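import torch
import timm

# Build the SwinV2 architecture without downloading pretrained weights;
# num_classes=15 matches the 15 cities listed above
model = timm.create_model("swinv2_base_window12_192", pretrained=False, num_classes=15)
# Load the fine-tuned weights downloaded in the previous step onto the CPU
model.load_state_dict(torch.load(vit_path, map_location="cpu"))
model.eval()  # switch to inference mode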

Transform and Predict

from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((192, 192)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5,0.5,0.5), std=(0.5,0.5,0.5))
])

class_names = [
    "Calgary", "Charlottetown", "Edmonton", "Halifax", "Hamilton",
    "Kitchener-Waterloo", "Montreal", "Ottawa-Gatineau", "Quebec City", "Saskatoon",
    "St Johns", "Toronto", "Vancouver", "Victoria", "Winnipeg",
]

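# Load the image as RGB and add a batch dimension: shape (1, 3, 192, 192)
img = Image.open("img.jpg").convert("RGB")
x = transform(img).unsqueeze(0)

# Run inference without tracking gradients; pred holds one logit per city
with torch.no_grad():
    pred = model(x)

# Print the city with the highest logit
print(class_names[pred.argmax().item()])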
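The snippet above prints only the top prediction. To see how confident the model is, the raw logits can be converted to probabilities with a softmax and ranked. This is a minimal sketch in plain Python; the helper name `top_k_cities` is hypothetical and not part of this repository.

```python
import math


def top_k_cities(logits, class_names, k=3):
    """Convert raw logits to softmax probabilities and return the k most likely classes."""
    if len(logits) != len(class_names):
        raise ValueError("expected one logit per class name")
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort (name, probability) pairs from most to least likely
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

With the variables from the example above, `top_k_cities(pred[0].tolist(), class_names)` returns the three most likely cities with their probabilities. A low top probability can hint that an image differs from the Mapillary training data, although softmax scores are not calibrated confidence estimates.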

Citation

If you use this dataset or these models, please cite:

  1. Stephen Rebel, Danial McIntyre, Sharav Bali. Canadian Street View Classifier. Hugging Face, 2025.