Scorecard at delivery
This is an internal audit against the practices Google Search, Bing, and modern LLM retrievers reward in 2026. Scores are measured, not guessed.
Semantic HTML, single H1 per page, descriptive page titles, unique meta descriptions, accessible navigation.
Schema.org JSON-LD for Organization, Person, ResearchProject, ScholarlyArticle, NewsArticle, BlogPosting, Product, BreadcrumbList.
Clean sitemap.xml with every route, a permissive robots.txt, stable URLs, canonicals on every page.
Dense cross-links via the content + sidebar pattern. Every detail page links to at least three related entities.
One stylesheet (~35 KB), one script (~8 KB), inline SVG icons. No blocking third-party JS.
The blog shipped with 20 posts at delivery. To sustain SEO momentum, the cadence is ~2 posts per week of relevant, unique content.
Alt text on every meaningful image, ARIA landmarks, focus-visible, WCAG AA palette, skip-link, reduced-motion.
On-site signals are clean; off-site backlinks come with time. Targeted PR for each major release is the lever.
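The sitemap and robots.txt item in the scorecard can be illustrated with a short sketch. This is not the code in build.py; the function names `build_sitemap` and `build_robots`, and the example.org base URL in the usage note, are assumptions made for illustration only.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url: str, routes: list[str]) -> str:
    """Return a sitemap.xml document listing every route under base_url."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # De-duplicate and sort so the output is stable between builds.
    for route in sorted(set(routes)):
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        loc.text = base_url.rstrip("/") + "/" + route.lstrip("/")
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_robots(base_url: str) -> str:
    """Return a permissive robots.txt that points crawlers at the sitemap."""
    return (
        "User-agent: *\n"
        "Allow: /\n"
        f"Sitemap: {base_url.rstrip('/')}/sitemap.xml\n"
    )
```

A build step could call `build_sitemap("https://example.org", routes)` with the same route list it uses to render pages, so a route cannot exist without also appearing in the sitemap.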
What was implemented, concretely
page() template, build.py, 404.html, /
Recommendations for the next sprint
- Author pages as hubs. Add /blog/by-author/[slug]/ and schema.org author on every post. Raises E-E-A-T signals.
- Tag archive pages. /blog/tag/[tag]/ aggregates posts. Each archive page has a unique intro paragraph, not just a list.
- Rich snippets for FAQ / HowTo. Where we have step-by-step tutorials (e.g. prompting templates, MLOps checklists), emit HowTo and FAQPage schema.
- Image CDN + responsive srcset. Once we move from SVG placeholders to photography, serve through an image CDN with srcset/sizes.
- Structured data testing in CI. A schema.org validator runs in CI, fails the build on malformed JSON-LD.
- RSS + JSON Feed. Publish /blog/feed.xml and /blog/feed.json. Syndication still matters for researcher audiences.
- Content hubs by topic. A /topics/[slug]/ landing page that bundles research area + thematic + blog + news in a single topic-scoped view. Excellent for LLM-driven search.
- Internal analytics review cadence. Weekly look at referrers, top pages, and 404s. Kill dead URLs, redirect what still gets traffic.
- Translate flagship posts into PT. A handful of high-traffic posts with canonical hreflang alternates boost Portuguese-speaking reach.
- Link velocity, not volume. Each major research release gets a co-ordinated press + social + partner post; that builds the off-site graph Google actually ranks.
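The structured data testing recommendation could start from something like the sketch below. It is a minimal check, not a full schema.org validator: it only confirms that each JSON-LD block parses as JSON and carries @context and @type (or @graph). The names `JsonLdExtractor`, `check_html`, and `check_site` are hypothetical, as is the assumption that the build writes .html files into a single output directory.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def check_html(html: str) -> list[str]:
    """Return human-readable problems found in one HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    problems = []
    for index, raw in enumerate(parser.blocks):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"block {index}: invalid JSON ({exc.msg})")
            continue
        # A block may hold one object or a list of objects.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                problems.append(f"block {index}: expected a JSON object")
            elif "@context" not in item:
                problems.append(f"block {index}: missing @context")
            elif "@type" not in item and "@graph" not in item:
                problems.append(f"block {index}: missing @type")
    return problems


def check_site(site_dir: str) -> int:
    """Check every .html file under site_dir; return a process exit code."""
    failed = False
    for path in sorted(Path(site_dir).rglob("*.html")):
        for problem in check_html(path.read_text(encoding="utf-8")):
            print(f"{path}: {problem}")
            failed = True
    return 1 if failed else 0
```

A CI job would pass the exit code from `check_site` to `sys.exit`, so any malformed block fails the build. Checking property names against the schema.org vocabulary would need a dedicated validator on top of this.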
A word on LLM-first search
Traditional SEO still matters, but 2026 traffic is increasingly mediated by LLM-backed search (Google AI Overviews, Perplexity, Bing Copilot, ChatGPT Search). The site is optimised for that too:
- Clear entity identity. The home page and About page declare LIACC precisely (name, parent, location, ratings). Schema.org carries the same facts.
- Consistent entity mentions. Every author's name is spelled the same, every project has a canonical slug. LLM retrievers collapse duplicates easily when the surface is clean.
- Direct, cite-able prose. Blog posts open with a short lead paragraph. LLM answer boxes lift those.
- No hidden text, no cloaking. What a user sees is what a model sees.
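The point about consistent entity mentions and canonical slugs can be made checkable. The sketch below is one possible approach, not the site's actual slug logic: `canonical_slug` and `find_slug_collisions` are hypothetical helpers, and the ASCII-folding rule is an assumption about how slugs are derived.

```python
import re
import unicodedata


def canonical_slug(name: str) -> str:
    """Map a display name to a lowercase ASCII, hyphen-separated slug."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumerics into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def find_slug_collisions(names: list[str]) -> dict[str, list[str]]:
    """Group distinct spellings that collapse to the same slug."""
    groups: dict[str, list[str]] = {}
    for name in names:
        variants = groups.setdefault(canonical_slug(name), [])
        if name not in variants:
            variants.append(name)
    # Only slugs reached by more than one spelling are inconsistencies.
    return {slug: v for slug, v in groups.items() if len(v) > 1}
```

Running `find_slug_collisions` over every author name that appears in post metadata would surface spellings that differ only by accents, case, or spacing, which are the duplicates a retriever would otherwise have to reconcile.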