SentenceTransformer based on albert/albert-base-v2

This is a sentence-transformers model finetuned from albert/albert-base-v2 on the en-pl dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: albert/albert-base-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: en-pl
  • Languages: en, multilingual, ar, bg, ca, cs, da, de, el, es, et, fa, fi, fr, gl, gu, he, hi, hr, hu, hy, id, it, ja, ka, ko, ku, lt, lv, mk, mn, mr, ms, my, nb, nl, pl, pt, ro, ru, sk, sl, sq, sr, sv, th, tr, uk, ur, vi, zh

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'AlbertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
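
For reference, the same Transformer + mean-pooling stack can be assembled by hand with the sentence_transformers.models API. This is only a minimal sketch of the architecture; it does not load the fine-tuned weights (use the Usage section below for that):

from sentence_transformers import SentenceTransformer, models

# Rebuild the module stack shown above: ALBERT encoder + mean pooling
word_embedding_model = models.Transformer("albert/albert-base-v2", max_seq_length=256)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768 for albert-base-v2
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])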

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jansowa/albert-base-v2-multilingual-en-pl")
# Run inference
sentences = [
    'This is a diagram of the U.S. counterinsurgency strategy in Afghanistan.',
    'To jest diagram przeciwpartyzanckiej strategii w Afganistanie.',
    'Biedni ludzie, ludzie, których prawa człowieka zostały naruszone, brzemię tego to strata godności. Brak godności.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9598, 0.5281],
#         [0.9598, 1.0000, 0.5946],
#         [0.5281, 0.5946, 1.0000]])
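
Because English and Polish sentences are embedded into the same vector space, the model can also be used for cross-lingual semantic search. The sketch below uses sentence_transformers.util.semantic_search; the query and corpus sentences are made-up examples, not taken from the training data:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jansowa/albert-base-v2-multilingual-en-pl")

# English query against a small Polish corpus (illustrative sentences only)
query = "How do I reset my password?"
corpus = [
    "Jak mogę zresetować hasło?",       # "How can I reset my password?"
    "Sklep jest otwarty od 9 do 17.",   # "The shop is open from 9 to 5."
    "Zgubiłem bilet na pociąg.",        # "I lost my train ticket."
]

query_emb = model.encode(query, convert_to_tensor=True)
corpus_emb = model.encode(corpus, convert_to_tensor=True)

# Returns the top_k most similar corpus entries for the query
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])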

Evaluation

Metrics

Knowledge Distillation

  • negative_mse: -37.0147

Translation

  • src2trg_accuracy: 0.7046
  • trg2src_accuracy: 0.6613
  • mean_accuracy: 0.683
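
These metric names correspond to Sentence Transformers' MSEEvaluator (negative mean squared error between the student's and a teacher's embeddings on parallel sentences) and TranslationEvaluator (how often each sentence retrieves its own translation first, in both directions). Below is a hedged sketch of running these evaluators; the teacher checkpoint and the sentence lists are placeholders, since the card does not state which teacher was used:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import MSEEvaluator, TranslationEvaluator

student = SentenceTransformer("jansowa/albert-base-v2-multilingual-en-pl")
# Placeholder teacher; the actual teacher used for distillation is not stated in this card
teacher = SentenceTransformer("sentence-transformers/paraphrase-distilroberta-base-v2")

english = ["Thank you so much, Chris."]  # source sentences (English)
polish = ["Bardzo dziękuję, Chris."]     # target sentences (Polish)

# negative_mse: negative MSE between student(Polish) and teacher(English) embeddings
mse_evaluator = MSEEvaluator(source_sentences=english, target_sentences=polish,
                             teacher_model=teacher, name="en-pl")
# src2trg_accuracy / trg2src_accuracy / mean_accuracy: translation retrieval accuracy
translation_evaluator = TranslationEvaluator(source_sentences=english,
                                             target_sentences=polish, name="en-pl")

print(mse_evaluator(student))
print(translation_evaluator(student))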

Training Details

Training Dataset

en-pl

  • Dataset: en-pl at 0c70bc6
  • Size: 292,290 training samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:
    • english (string): min 4 tokens, mean 24.47 tokens, max 256 tokens
    • non_english (string): min 5 tokens, mean 40.48 tokens, max 256 tokens
    • label (list): size 768 elements
  • Samples:
    • english: And then there are certain conceptual things that can also benefit from hand calculating, but I think they're relatively small in number.
      non_english: Są też pewne zadania koncepcyjne, w których ręczne kalkulacje mogą być przydatne, ale jest ich stosunkowo niewiele.
      label: [0.1842687577009201, -0.27380749583244324, 1.380724310874939, 0.5485912561416626, -0.5771370530128479, ...]
    • english: One thing I often ask about is ancient Greek and how this relates.
      non_english: Często zadaję pytania o starożytną Grekę i co do tego ma.
      label: [-0.22509485483169556, -0.8029794096946716, -0.26132631301879883, -0.1386972814798355, -0.4966896176338196, ...]
    • english: See, the thing we're doing right now is we're forcing people to learn mathematics.
      non_english: Zwróćcie uwagę, to co teraz robimy jest zmuszaniem ludzi do nauki matematyki.
      label: [0.701093316078186, -0.31419914960861206, -0.8785677552223206, -0.3886241614818573, 0.7142088413238525, ...]
  • Loss: MSELoss
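
The label column holds a 768-dimensional teacher embedding for each pair (in the standard multilingual distillation recipe, the teacher's embedding of the English sentence), which MSELoss regresses the student's English and Polish embeddings onto. A minimal sketch of how such labels are typically produced; the teacher checkpoint is a placeholder, as this card does not name the teacher:

from datasets import Dataset
from sentence_transformers import SentenceTransformer

# Placeholder teacher; any 768-dimensional sentence encoder would fit the label size here
teacher = SentenceTransformer("sentence-transformers/paraphrase-distilroberta-base-v2")

english = ["Thank you so much, Chris."]
non_english = ["Bardzo dziękuję, Chris."]

# Each label is the teacher's embedding of the English sentence
labels = teacher.encode(english).tolist()
train_dataset = Dataset.from_dict(
    {"english": english, "non_english": non_english, "label": labels}
)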

Evaluation Dataset

en-pl

  • Dataset: en-pl at 0c70bc6
  • Size: 992 evaluation samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 992 samples:
    • english (string): min 4 tokens, mean 25.03 tokens, max 253 tokens
    • non_english (string): min 5 tokens, mean 45.15 tokens, max 256 tokens
    • label (list): size 768 elements
  • Samples:
    • english: Thank you so much, Chris.
      non_english: Bardzo dziękuję, Chris.
      label: [0.26128441095352173, -0.8462327122688293, -0.4199201762676239, 0.5228638648986816, 1.1514642238616943, ...]
    • english: And it's truly a great honor to have the opportunity to come to this stage twice; I'm extremely grateful.
      non_english: To prawdziwy zaszczyt mieć możliwość drugi raz stanąć w tym miejscu. Jestem niezwykle wdzięczny.
      label: [0.22651851177215576, 0.283356636762619, -1.0012000799179077, -0.013265828602015972, 0.08300188928842545, ...]
    • english: I have been blown away by this conference, and I want to thank all of you for the many nice comments about what I had to say the other night.
      non_english: Jestem zachwycony tą konferencją. Chce Wam wszystkim podziękować za miłe komentarze dotyczące mojej wypowiedzi poprzedniego wieczoru.
      label: [0.17416058480739594, -0.40953880548477173, -0.62795090675354, 0.35556134581565857, 0.40010693669319153, ...]
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
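
With the current Sentence Transformers trainer API, the non-default values above map onto SentenceTransformerTrainingArguments roughly as sketched below. The output path and the tiny in-memory dataset are placeholders (the real run used the 292,290-sample en-pl split and a separate 992-sample evaluation split), and fp16 requires a CUDA GPU:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MSELoss

# Student starts from the base checkpoint; mean pooling is added automatically
student = SentenceTransformer("albert/albert-base-v2")
student.max_seq_length = 256

# Placeholder teacher and toy data; see the training-dataset sketch above
teacher = SentenceTransformer("sentence-transformers/paraphrase-distilroberta-base-v2")
english = ["Thank you so much, Chris."]
non_english = ["Bardzo dziękuję, Chris."]
train_dataset = Dataset.from_dict(
    {"english": english, "non_english": non_english,
     "label": teacher.encode(english).tolist()}
)

args = SentenceTransformerTrainingArguments(
    output_dir="albert-base-v2-multilingual-en-pl",  # placeholder output path
    num_train_epochs=1,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,             # requires a CUDA GPU
    eval_strategy="steps",
)

trainer = SentenceTransformerTrainer(
    model=student,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; the card uses a separate 992-sample split
    loss=MSELoss(model=student),
)
trainer.train()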

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss en-pl loss en-pl_negative_mse en-pl_mean_accuracy
0.0027 100 1.1564 - - -
0.0055 200 0.8704 - - -
0.0082 300 0.6851 - - -
0.0109 400 0.5806 - - -
0.0137 500 0.5265 - - -
0.0164 600 0.4866 - - -
0.0192 700 0.4744 - - -
0.0219 800 0.4665 - - -
0.0246 900 0.4637 - - -
0.0274 1000 0.4616 - - -
0.0301 1100 0.461 - - -
0.0328 1200 0.4624 - - -
0.0356 1300 0.4605 - - -
0.0383 1400 0.4596 - - -
0.0411 1500 0.459 - - -
0.0438 1600 0.455 - - -
0.0465 1700 0.4567 - - -
0.0493 1800 0.4567 - - -
0.0520 1900 0.4541 - - -
0.0547 2000 0.4559 - - -
0.0575 2100 0.456 - - -
0.0602 2200 0.4535 - - -
0.0629 2300 0.4495 - - -
0.0657 2400 0.4515 - - -
0.0684 2500 0.4472 - - -
0.0712 2600 0.4486 - - -
0.0739 2700 0.4468 - - -
0.0766 2800 0.4432 - - -
0.0794 2900 0.4443 - - -
0.0821 3000 0.4437 - - -
0.0848 3100 0.439 - - -
0.0876 3200 0.4379 - - -
0.0903 3300 0.4345 - - -
0.0931 3400 0.4335 - - -
0.0958 3500 0.4322 - - -
0.0985 3600 0.4326 - - -
0.1013 3700 0.4319 - - -
0.1040 3800 0.4298 - - -
0.1067 3900 0.4243 - - -
0.1095 4000 0.4243 - - -
0.1122 4100 0.4226 - - -
0.1150 4200 0.4216 - - -
0.1177 4300 0.4208 - - -
0.1204 4400 0.4245 - - -
0.1232 4500 0.4236 - - -
0.1259 4600 0.4176 - - -
0.1286 4700 0.4175 - - -
0.1314 4800 0.4148 - - -
0.1341 4900 0.4163 - - -
0.1368 5000 0.4127 0.4204 -44.0072 0.1240
0.1396 5100 0.4089 - - -
0.1423 5200 0.4145 - - -
0.1451 5300 0.4084 - - -
0.1478 5400 0.4082 - - -
0.1505 5500 0.4062 - - -
0.1533 5600 0.4056 - - -
0.1560 5700 0.4039 - - -
0.1587 5800 0.4059 - - -
0.1615 5900 0.405 - - -
0.1642 6000 0.4003 - - -
0.1670 6100 0.4003 - - -
0.1697 6200 0.3994 - - -
0.1724 6300 0.4 - - -
0.1752 6400 0.3965 - - -
0.1779 6500 0.3964 - - -
0.1806 6600 0.3948 - - -
0.1834 6700 0.3955 - - -
0.1861 6800 0.3959 - - -
0.1888 6900 0.3942 - - -
0.1916 7000 0.3922 - - -
0.1943 7100 0.3933 - - -
0.1971 7200 0.395 - - -
0.1998 7300 0.3932 - - -
0.2025 7400 0.3881 - - -
0.2053 7500 0.3884 - - -
0.2080 7600 0.3862 - - -
0.2107 7700 0.3864 - - -
0.2135 7800 0.3879 - - -
0.2162 7900 0.3895 - - -
0.2190 8000 0.3847 - - -
0.2217 8100 0.3856 - - -
0.2244 8200 0.3863 - - -
0.2272 8300 0.3859 - - -
0.2299 8400 0.3821 - - -
0.2326 8500 0.3825 - - -
0.2354 8600 0.3799 - - -
0.2381 8700 0.381 - - -
0.2409 8800 0.3824 - - -
0.2436 8900 0.3811 - - -
0.2463 9000 0.3774 - - -
0.2491 9100 0.3781 - - -
0.2518 9200 0.3792 - - -
0.2545 9300 0.3769 - - -
0.2573 9400 0.376 - - -
0.2600 9500 0.3799 - - -
0.2627 9600 0.3737 - - -
0.2655 9700 0.3757 - - -
0.2682 9800 0.3753 - - -
0.2710 9900 0.3761 - - -
0.2737 10000 0.3701 0.3808 -41.4255 0.3191
0.2764 10100 0.3718 - - -
0.2792 10200 0.3701 - - -
0.2819 10300 0.3704 - - -
0.2846 10400 0.3724 - - -
0.2874 10500 0.3725 - - -
0.2901 10600 0.3726 - - -
0.2929 10700 0.3679 - - -
0.2956 10800 0.3671 - - -
0.2983 10900 0.368 - - -
0.3011 11000 0.3711 - - -
0.3038 11100 0.3696 - - -
0.3065 11200 0.3677 - - -
0.3093 11300 0.3651 - - -
0.3120 11400 0.365 - - -
0.3147 11500 0.3635 - - -
0.3175 11600 0.3595 - - -
0.3202 11700 0.363 - - -
0.3230 11800 0.3644 - - -
0.3257 11900 0.3649 - - -
0.3284 12000 0.3623 - - -
0.3312 12100 0.3634 - - -
0.3339 12200 0.3616 - - -
0.3366 12300 0.3644 - - -
0.3394 12400 0.3608 - - -
0.3421 12500 0.3601 - - -
0.3449 12600 0.3623 - - -
0.3476 12700 0.3606 - - -
0.3503 12800 0.3585 - - -
0.3531 12900 0.3622 - - -
0.3558 13000 0.361 - - -
0.3585 13100 0.3595 - - -
0.3613 13200 0.3569 - - -
0.3640 13300 0.3597 - - -
0.3668 13400 0.3586 - - -
0.3695 13500 0.3577 - - -
0.3722 13600 0.3569 - - -
0.3750 13700 0.3546 - - -
0.3777 13800 0.3546 - - -
0.3804 13900 0.3552 - - -
0.3832 14000 0.3535 - - -
0.3859 14100 0.3566 - - -
0.3886 14200 0.3556 - - -
0.3914 14300 0.3548 - - -
0.3941 14400 0.3529 - - -
0.3969 14500 0.3549 - - -
0.3996 14600 0.3539 - - -
0.4023 14700 0.3508 - - -
0.4051 14800 0.3536 - - -
0.4078 14900 0.3528 - - -
0.4105 15000 0.3548 0.3599 -40.0086 0.4451
0.4133 15100 0.3523 - - -
0.4160 15200 0.3483 - - -
0.4188 15300 0.3507 - - -
0.4215 15400 0.3507 - - -
0.4242 15500 0.3516 - - -
0.4270 15600 0.3503 - - -
0.4297 15700 0.3476 - - -
0.4324 15800 0.3484 - - -
0.4352 15900 0.3487 - - -
0.4379 16000 0.3473 - - -
0.4406 16100 0.3501 - - -
0.4434 16200 0.3481 - - -
0.4461 16300 0.3462 - - -
0.4489 16400 0.347 - - -
0.4516 16500 0.3458 - - -
0.4543 16600 0.3485 - - -
0.4571 16700 0.3461 - - -
0.4598 16800 0.3483 - - -
0.4625 16900 0.3456 - - -
0.4653 17000 0.3454 - - -
0.4680 17100 0.344 - - -
0.4708 17200 0.344 - - -
0.4735 17300 0.3417 - - -
0.4762 17400 0.3469 - - -
0.4790 17500 0.3465 - - -
0.4817 17600 0.3438 - - -
0.4844 17700 0.3437 - - -
0.4872 17800 0.3413 - - -
0.4899 17900 0.3425 - - -
0.4927 18000 0.3429 - - -
0.4954 18100 0.3449 - - -
0.4981 18200 0.3425 - - -
0.5009 18300 0.3431 - - -
0.5036 18400 0.3431 - - -
0.5063 18500 0.3429 - - -
0.5091 18600 0.343 - - -
0.5118 18700 0.3413 - - -
0.5145 18800 0.3425 - - -
0.5173 18900 0.3386 - - -
0.5200 19000 0.3415 - - -
0.5228 19100 0.341 - - -
0.5255 19200 0.3395 - - -
0.5282 19300 0.3413 - - -
0.5310 19400 0.3412 - - -
0.5337 19500 0.3387 - - -
0.5364 19600 0.3413 - - -
0.5392 19700 0.3383 - - -
0.5419 19800 0.3414 - - -
0.5447 19900 0.3377 - - -
0.5474 20000 0.341 0.3462 -38.8721 0.5459
0.5501 20100 0.3364 - - -
0.5529 20200 0.3377 - - -
0.5556 20300 0.3362 - - -
0.5583 20400 0.338 - - -
0.5611 20500 0.3326 - - -
0.5638 20600 0.3362 - - -
0.5665 20700 0.3368 - - -
0.5693 20800 0.3379 - - -
0.5720 20900 0.3362 - - -
0.5748 21000 0.334 - - -
0.5775 21100 0.3389 - - -
0.5802 21200 0.3361 - - -
0.5830 21300 0.3358 - - -
0.5857 21400 0.3333 - - -
0.5884 21500 0.3349 - - -
0.5912 21600 0.3332 - - -
0.5939 21700 0.3354 - - -
0.5967 21800 0.3334 - - -
0.5994 21900 0.3324 - - -
0.6021 22000 0.3317 - - -
0.6049 22100 0.3312 - - -
0.6076 22200 0.3352 - - -
0.6103 22300 0.333 - - -
0.6131 22400 0.3358 - - -
0.6158 22500 0.332 - - -
0.6186 22600 0.3321 - - -
0.6213 22700 0.3327 - - -
0.6240 22800 0.3312 - - -
0.6268 22900 0.3317 - - -
0.6295 23000 0.3277 - - -
0.6322 23100 0.3334 - - -
0.6350 23200 0.3313 - - -
0.6377 23300 0.331 - - -
0.6404 23400 0.3326 - - -
0.6432 23500 0.3325 - - -
0.6459 23600 0.3288 - - -
0.6487 23700 0.331 - - -
0.6514 23800 0.3315 - - -
0.6541 23900 0.3312 - - -
0.6569 24000 0.329 - - -
0.6596 24100 0.3263 - - -
0.6623 24200 0.3326 - - -
0.6651 24300 0.3297 - - -
0.6678 24400 0.3251 - - -
0.6706 24500 0.3309 - - -
0.6733 24600 0.3302 - - -
0.6760 24700 0.3274 - - -
0.6788 24800 0.3278 - - -
0.6815 24900 0.3268 - - -
0.6842 25000 0.3283 0.3376 -38.1403 0.6129
0.6870 25100 0.3278 - - -
0.6897 25200 0.3285 - - -
0.6924 25300 0.3288 - - -
0.6952 25400 0.3275 - - -
0.6979 25500 0.327 - - -
0.7007 25600 0.328 - - -
0.7034 25700 0.3292 - - -
0.7061 25800 0.3255 - - -
0.7089 25900 0.3279 - - -
0.7116 26000 0.3276 - - -
0.7143 26100 0.3254 - - -
0.7171 26200 0.3254 - - -
0.7198 26300 0.3237 - - -
0.7226 26400 0.3261 - - -
0.7253 26500 0.3247 - - -
0.7280 26600 0.3277 - - -
0.7308 26700 0.324 - - -
0.7335 26800 0.3262 - - -
0.7362 26900 0.3223 - - -
0.7390 27000 0.3205 - - -
0.7417 27100 0.3265 - - -
0.7445 27200 0.3234 - - -
0.7472 27300 0.3228 - - -
0.7499 27400 0.3202 - - -
0.7527 27500 0.3234 - - -
0.7554 27600 0.3239 - - -
0.7581 27700 0.323 - - -
0.7609 27800 0.3232 - - -
0.7636 27900 0.324 - - -
0.7663 28000 0.3239 - - -
0.7691 28100 0.3224 - - -
0.7718 28200 0.3258 - - -
0.7746 28300 0.3259 - - -
0.7773 28400 0.3229 - - -
0.7800 28500 0.3266 - - -
0.7828 28600 0.3212 - - -
0.7855 28700 0.3243 - - -
0.7882 28800 0.3237 - - -
0.7910 28900 0.3225 - - -
0.7937 29000 0.3233 - - -
0.7965 29100 0.3249 - - -
0.7992 29200 0.3246 - - -
0.8019 29300 0.321 - - -
0.8047 29400 0.3263 - - -
0.8074 29500 0.3244 - - -
0.8101 29600 0.3232 - - -
0.8129 29700 0.3212 - - -
0.8156 29800 0.3235 - - -
0.8183 29900 0.3197 - - -
0.8211 30000 0.3219 0.3297 -37.3467 0.6714
0.8238 30100 0.3238 - - -
0.8266 30200 0.3243 - - -
0.8293 30300 0.3238 - - -
0.8320 30400 0.3194 - - -
0.8348 30500 0.3198 - - -
0.8375 30600 0.3227 - - -
0.8402 30700 0.3199 - - -
0.8430 30800 0.3209 - - -
0.8457 30900 0.3212 - - -
0.8485 31000 0.3182 - - -
0.8512 31100 0.3214 - - -
0.8539 31200 0.3203 - - -
0.8567 31300 0.3246 - - -
0.8594 31400 0.3171 - - -
0.8621 31500 0.3208 - - -
0.8649 31600 0.3203 - - -
0.8676 31700 0.319 - - -
0.8704 31800 0.3179 - - -
0.8731 31900 0.3187 - - -
0.8758 32000 0.3197 - - -
0.8786 32100 0.319 - - -
0.8813 32200 0.3214 - - -
0.8840 32300 0.3205 - - -
0.8868 32400 0.3179 - - -
0.8895 32500 0.3197 - - -
0.8922 32600 0.3197 - - -
0.8950 32700 0.3187 - - -
0.8977 32800 0.3195 - - -
0.9005 32900 0.3193 - - -
0.9032 33000 0.3188 - - -
0.9059 33100 0.3166 - - -
0.9087 33200 0.3186 - - -
0.9114 33300 0.3182 - - -
0.9141 33400 0.3167 - - -
0.9169 33500 0.3203 - - -
0.9196 33600 0.3189 - - -
0.9224 33700 0.3177 - - -
0.9251 33800 0.3174 - - -
0.9278 33900 0.3194 - - -
0.9306 34000 0.318 - - -
0.9333 34100 0.3171 - - -
0.9360 34200 0.3185 - - -
0.9388 34300 0.3175 - - -
0.9415 34400 0.3181 - - -
0.9442 34500 0.3219 - - -
0.9470 34600 0.3137 - - -
0.9497 34700 0.3164 - - -
0.9525 34800 0.3161 - - -
0.9552 34900 0.3177 - - -
0.9579 35000 0.3165 0.3260 -37.0147 0.6830
0.9607 35100 0.3181 - - -
0.9634 35200 0.3161 - - -
0.9661 35300 0.3156 - - -
0.9689 35400 0.3152 - - -
0.9716 35500 0.3186 - - -
0.9744 35600 0.3197 - - -
0.9771 35700 0.3191 - - -
0.9798 35800 0.3161 - - -
0.9826 35900 0.3184 - - -
0.9853 36000 0.3166 - - -
0.9880 36100 0.316 - - -
0.9908 36200 0.3194 - - -
0.9935 36300 0.3158 - - -
0.9963 36400 0.3187 - - -
0.9990 36500 0.317 - - -

Framework Versions

  • Python: 3.13.2
  • Sentence Transformers: 5.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu126
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}