“Does it work for Portuguese?” is the question we get most often. Here are twelve benchmarks worth your time.

General language

  • ASSIN 2 — semantic textual similarity and entailment.
  • PTBR Hate — hate-speech classification for Brazilian Portuguese.
  • BR-Wikicorpora — long-text QA.

Legal

  • LEX-PT — four tasks over Portuguese legislation and case law.
  • STJ Citation — Supreme Court citation prediction.

Clinical

  • Port-Disease — disease-mention extraction from Portuguese electronic health records (EHRs).
  • ClinPT-Bench — medication, dosage, and timeline extraction.

Public administration

  • DRE-QA — question answering over the Diário da República.
  • Public-Services-NER — entity extraction over municipal documents.

Multimodal

  • PortVQA — Portuguese visual QA.
  • OCR-PT — handwritten historical Portuguese documents.

Reasoning

  • Exame-PT — Brazilian national exam reasoning benchmark.

Corrections and additions are welcome; this list is a living document.