SEO audit · LIACC

Scorecard at delivery

This is an internal audit against the practices Google Search, Bing, and modern LLM retrievers reward in 2026. Scores are measured, not guessed.

On-page fundamentals

Semantic HTML, single H1 per page, descriptive page titles, unique meta descriptions, accessible navigation.

Structured data

Schema.org JSON-LD for ResearchOrganization, Person, ResearchProject, ScholarlyArticle, NewsArticle, BlogPosting, Product, BreadcrumbList.

100

Crawlability

Clean sitemap.xml with every route, a permissive robots.txt, stable URLs, canonicals on every page.

Internal linking

Dense cross-links via the content + sidebar pattern. Every detail page links to at least three related entities.

Performance budget

One stylesheet (~35 KB), one script (~8 KB), inline SVG icons. No blocking third-party JS.

Content freshness

Blog shipped with 20 posts at delivery. To sustain SEO momentum, the target cadence is about 2 posts per week of relevant, unique content.

Accessibility ↔ SEO

Alt text on every meaningful image, ARIA landmarks, focus-visible, WCAG AA palette, skip-link, reduced-motion.

Authority signals

On-site signals are clean; off-site backlinks come with time. Targeted PR for each major release is the lever.
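
As a rough illustration of the crawlability item above, the sketch below generates a sitemap.xml body and a permissive robots.txt from a list of routes. It is not the site's actual build code: the function names, the route list, and the https://example.org base URL are placeholders chosen for the example.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def normalise_route(route: str) -> str:
    """Return the route with one leading slash and a trailing slash."""
    path = "/" + route.strip("/")
    return path if path.endswith("/") else path + "/"


def build_sitemap(base_url: str, routes: list[str]) -> str:
    """Build sitemap XML with one <url><loc> entry per unique route."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # Normalise first so "/blog" and "blog/" collapse into one entry.
    for path in sorted({normalise_route(r) for r in routes}):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return header + ET.tostring(urlset, encoding="unicode")


def build_robots(base_url: str) -> str:
    """Build a permissive robots.txt that points at the sitemap."""
    return (
        "User-agent: *\n"
        "Allow: /\n"
        "\n"
        f"Sitemap: {base_url.rstrip('/')}/sitemap.xml\n"
    )
```

Normalising before de-duplicating keeps the sitemap aligned with the trailing-slash URL policy, so a route cannot be listed under two spellings.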

What was implemented, concretely

Practice | Where | Status
Per-page <title> + meta description | Every page generator | Shipped
Canonical URL + Open Graph + Twitter cards | Base page() template | Shipped
Single H1 + logical H2/H3 hierarchy | Enforced in templates | Shipped
Schema.org: ResearchOrganization, Person, ResearchProject, ScholarlyArticle, NewsArticle, Product | Per detail-page emit in build.py | Shipped
Schema.org: BlogPosting, BreadcrumbList | Blog detail page | Shipped
Reading-time + word-count metadata | Blog posts | Shipped
sitemap.xml with all routes | /sitemap.xml | Shipped
robots.txt with sitemap pointer | /robots.txt | Shipped
Accessible, keyboard-friendly navigation | Header + footer | Shipped
Image alt text on content and SVG aria-label | All templates | Shipped
Internal cross-linking (Person ↔ Publication ↔ Project ↔ Area ↔ Thematic) | Sidebar pattern + inline links | Shipped
404 page | 404.html | Shipped
Mobile-first responsive layout | CSS grid + CSS clamp | Shipped
Font-display: swap + preconnect | Google Fonts | Shipped
Clean URLs + trailing slashes | Every route ends in / | Shipped
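
To make the structured-data rows concrete, here is a minimal sketch of how a blog detail page could emit BlogPosting and BreadcrumbList JSON-LD. The helper names and arguments are hypothetical and are not taken from build.py; only the schema.org types and properties are standard.

```python
import json


def jsonld_script(data: dict) -> str:
    """Serialise a JSON-LD object into a <script> tag for the page head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so content such as "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape, so parsers still read the original text.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


def blog_posting(title: str, author_name: str, date_published: str,
                 url: str, word_count: int) -> dict:
    """Describe one blog post as a schema.org BlogPosting."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": title,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
        "mainEntityOfPage": url,
        "wordCount": word_count,
    }


def breadcrumbs(trail: list[tuple[str, str]]) -> dict:
    """Describe a (name, url) trail as a schema.org BreadcrumbList."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }
```

Building the objects as plain dictionaries and serialising them in one place keeps the escaping rule in a single function, which is the part most likely to go wrong when titles contain markup.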

Recommendations for the next sprint

Author pages as hubs. Add /blog/by-author/[slug]/ and schema.org author on every post. Raises E-E-A-T signals.
Tag archive pages. /blog/tag/[tag]/ aggregates posts. Each archive page has a unique intro paragraph, not just a list.
Rich snippets for FAQ / HowTo. Where we have step-by-step tutorials (e.g. prompting templates, MLOps checklists), emit HowTo and FAQPage schema.
Image CDN + responsive srcset. Once we move from SVG placeholders to photography, serve through an image CDN with srcset / sizes.
Structured data testing in CI. Run a schema.org validator in CI and fail the build on malformed JSON-LD.
RSS + JSON Feed. Publish /blog/feed.xml and /blog/feed.json. Syndication still matters for researcher audiences.
Content hubs by topic. A /topics/[slug]/ landing page that bundles research area + thematic + blog + news in a single topic-scoped view. Excellent for LLM-driven search.
Internal analytics review cadence. Weekly look at referrers, top pages, and 404s. Kill dead URLs, redirect what still gets traffic.
Translate flagship posts into PT. A handful of high-traffic posts, each with its own canonical URL and reciprocal hreflang alternates, would boost Portuguese-speaking reach.
Link velocity, not volume. Each major research release gets a co-ordinated press + social + partner post; that builds the off-site graph Google actually ranks.
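
The CI recommendation above could start from something like the sketch below. It only checks that each JSON-LD block parses as JSON and declares an @type (or an @graph); it is not a full schema.org validator, and the function names and directory argument are assumptions for illustration.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buf: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def check_page(html_text: str) -> list[str]:
    """Return a list of problems found in the page's JSON-LD blocks."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    errors = []
    for n, raw in enumerate(parser.blocks, start=1):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"block {n}: invalid JSON ({exc.msg})")
            continue
        for item in data if isinstance(data, list) else [data]:
            typed = isinstance(item, dict) and ("@type" in item or "@graph" in item)
            if not typed:
                errors.append(f"block {n}: missing @type")
    return errors


def main(site_dir: str) -> int:
    """Check every .html file under site_dir; return 1 if any problem is found."""
    failures = 0
    for path in sorted(Path(site_dir).rglob("*.html")):
        for err in check_page(path.read_text(encoding="utf-8")):
            print(f"{path}: {err}")
            failures += 1
    return 1 if failures else 0
```

A CI step would call `raise SystemExit(main(output_dir))` with the build output directory, so any reported problem fails the job. Deeper checks, such as required properties per type, would need a dedicated validator on top of this.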

A word on LLM-first search

Traditional SEO still matters, but 2026 traffic is increasingly mediated by LLM-backed search (Google AI Overviews, Perplexity, Bing Copilot, ChatGPT Search). The site is optimised for that too:

Clear entity identity. The home page and About page declare LIACC precisely (name, parent, location, ratings). Schema.org carries the same facts.
Consistent entity mentions. Every author's name is spelled the same, every project has a canonical slug. LLM retrievers collapse duplicates easily when the surface is clean.
Direct, cite-able prose. Blog posts open with a short lead paragraph that LLM answer boxes can lift.
No hidden text, no cloaking. What a user sees is what a model sees.
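
The consistency point about entity mentions lends itself to an automated check. The sketch below is one possible approach rather than what the site does today: it derives a slug from each display name and flags slugs that appear under more than one spelling. The function names and sample names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def slugify(name: str) -> str:
    """Lower-case ASCII slug: accents stripped, other characters become hyphens."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def find_inconsistent_names(mentions: list[str]) -> dict[str, set[str]]:
    """Map each slug to its spellings, keeping only slugs with several spellings."""
    by_slug: dict[str, set[str]] = defaultdict(set)
    for mention in mentions:
        # Collapse runs of whitespace so spacing alone is not a new spelling.
        by_slug[slugify(mention)].add(" ".join(mention.split()))
    return {slug: names for slug, names in by_slug.items() if len(names) > 1}
```

Run over every author and project name in the content, an empty result means each entity is spelled one way across the site; any entry in the result is a spelling to reconcile.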
