Dataset

Bert Base Multilingual Cased

by Google Bert huggingface/google-bert/bert-base-multilingual-cased
Free2AITools Nexus Index
40.4
S: Semantic 50

Query-time baseline · scored live at search

A: Authority 57
P: Popularity 74
R: Recency 100
Q: Quality 70
Tech Context
Vital Performance
Data Integrity 40.4 FNI Score
- Size
- Rows
- Tokens
Dataset Information Summary
Entity Passport
Registry ID huggingface/google-bert/bert-base-multilingual-cased
License Apache-2.0
Provider huggingface
Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset_huggingface_google_bert_bert_base_multilingual_cased,
  author = {Google Bert},
  title = {Bert Base Multilingual Cased Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/google-bert/bert-base-multilingual-cased}},
  note = {Accessed via Free2AITools.}
}
APA Style
Google Bert. (2026). Bert Base Multilingual Cased [Dataset]. Free2AITools. https://huggingface.co/google-bert/bert-base-multilingual-cased

Technical Deep Dive

Free2AITools Nexus Index V2.0

Semantic (S) 50

Query-time baseline · scored live at search

Authority (A) 57
Popularity (P) 74
Recency (R) 100
Quality (Q) 70

Index Insight

FNI V2.0 for Bert Base Multilingual Cased: Authority (A:57), Popularity (P:74), Recency (R:100), Quality (Q:70). Semantic (S) is a query-time baseline scored live at search.

Data Sources / Provenance

Open data Updated: Live data
Downloads
4.8M

🎯 Task Categories

translation

Data Preview

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🧬 Field Logic

🧬

Schema not yet indexed for this dataset.

Dataset Specification

BERT multilingual base model (cased)

A model pretrained on the 104 languages with the largest Wikipedias, using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. The model is case-sensitive: it distinguishes between english and English.

Disclaimer: The team releasing BERT did not write a model card for this model, so this model card was written by the Hugging Face team.

Model description

BERT is a transformer model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means it was pretrained on raw text only, with no human labelling of any kind (which is why it can use large amounts of publicly available data); an automatic process generates the inputs and labels from that text. More precisely, it was pretrained with two objectives:

  • Masked language modeling (MLM): given a sentence, the model randomly masks 15% of the words in the input, then runs the entire masked sentence through the model and has to predict the masked words. This differs from traditional recurrent neural networks (RNNs), which usually see the words one after the other, and from autoregressive models like GPT, which internally mask the future tokens. It allows the model to learn a bidirectional representation of the sentence.
  • Next sentence p
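
The masking step of the MLM objective described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the actual pretraining code: whitespace splitting stands in for the model's real tokenizer, every selected position is replaced by the mask token, and the function name `mask_tokens` and its parameters are hypothetical.

```python
import random

MASK_TOKEN = "[MASK]"  # mask token used by BERT-style models


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Mask roughly `mask_prob` of the tokens.

    Returns (masked_tokens, labels). `labels` holds the original token at
    each masked position and None elsewhere, so a model would only be
    scored on the positions it has to predict.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    # Mask at least one token so that short inputs still yield a target.
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_TOKEN if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


# Illustrative usage: whitespace splitting is an assumption of this sketch.
example = "BERT learns a bidirectional representation of each sentence".split()
masked_example, label_example = mask_tokens(example, seed=0)
print(masked_example)
print(label_example)
```

Because the model sees the whole masked sentence at once, each prediction can draw on context to both the left and the right of the masked position.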

Social Proof

HuggingFace Hub
4.8M Downloads
Updated daily

Source summary: Based on Hugging Face metadata. Not a recommendation.

Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id
hf-dataset--huggingface--google-bert--bert-base-multilingual-cased
slug
huggingface--google-bert--bert-base-multilingual-cased
source
huggingface
author
Google Bert
license
Apache-2.0
tags
transformers, pytorch, tf, jax, safetensors, bert, fill-mask, multilingual, af, sq, ar, an, hy, ast, az, ba, eu, bar, be, bn, inc, bs, br, bg, my, ca, ceb, ce, zh, cv, hr, cs, da, nl, en, et, fi, fr, gl, ka, de, el, gu, ht, he, hi, hu, is, io, id, ga, it, ja, jv, kn, kk, ky, ko, la, lv, lt, roa, nds, lm, mk, mg, ms, ml, mr, mn, min, ne, new, nb, nn, oc, fa, pms, pl, pt, pa, ro, ru, sco, sr, scn, sk, sl, aze, es, su, sw, sv, tl, tg, th, ta, tt, te, tr, uk, ud, uz, vi, vo, war, cy, fry, pnb, yo, d

Technical Specs

architecture
null
params billions
null
context length
null
pipeline tag
fill-mask
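
The fill-mask pipeline tag means the model predicts a token for a masked position. A hedged sketch of how it might be queried through the Hugging Face `transformers` library follows. The helper names `make_prompt` and `top_predictions` and the `{mask}` placeholder are hypothetical conveniences introduced here; the sketch assumes `transformers` and a backend such as PyTorch are installed, and that the model weights can be downloaded on first use.

```python
MODEL_ID = "google-bert/bert-base-multilingual-cased"  # registry ID listed on this page
MASK_TOKEN = "[MASK]"  # mask token expected by BERT models


def make_prompt(template):
    """Swap the hypothetical '{mask}' placeholder for the model's mask token."""
    if template.count("{mask}") != 1:
        raise ValueError("template must contain exactly one '{mask}' placeholder")
    return template.replace("{mask}", MASK_TOKEN)


def top_predictions(template, k=5):
    """Return up to k (token, score) pairs for the masked position."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    results = unmasker(make_prompt(template), top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `top_predictions("Hello, I'm a {mask} model.")` would return candidate tokens with their scores. Since the model is cased, changing the capitalization of the prompt may change the predictions.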

Engagement & Metrics

downloads
4,781,227
stars
0
forks
0

Data indexed from public sources. Updated daily.