A possible EPP future

This plan is an experiment to explore the solution space for introducing OXA into EPP.

The following is motivated by three goals: developer experience and habitability, adoption of emerging standards, and elimination of runtime complexity.

The ordering is not about dependencies; most phases can be explored in parallel. It runs from the first of those goals to the last. A static site ensures the content rendered by EPP is easily hostable in perpetuity.

  1. Ensure test coverage and fast, local feedback — visual regression in Docker, E2E coverage for critical flows, unit tests for data-fetching logic.
  2. Habitability and clearer responsibility boundaries — typed SiteConfig; no API routes or proxies; framework-agnostic component library.
  3. Use emerging standards — OXA and DocMaps as the data source; parity-tested against the EPP API before cut-over.
  4. Remove runtime complexity — Next.js static export; Node.js process eliminated; Caddy serves static files.
  5. Align tooling with project intent — Astro replaces Next.js; incremental builds driven by DocMaps digests.
  6. Use CDN directly for hosting — static files move to Fastly Object Storage; the container is eliminated.

Ensure test coverage and fast, local feedback

Before changing anything, establish a feedback loop that catches regressions quickly and locally. Without this, every subsequent change is made against a backdrop of uncertainty — you cannot know whether a refactor has broken a visual or a critical user flow until it reaches a deployed environment. The four steps here are the safety net that makes every phase that follows lower-risk.

1a — Local visual regression via Playwright in Docker

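Chromatic runs cloud-only; there is no local visual regression. @storybook/test-runner is already installed and drives a built Storybook with Playwright. The runner executes under Jest, however, so the Playwright Test matcher toHaveScreenshot() is not available inside its hooks. The configuration here assumes jest-image-snapshot is added as a dev dependency and compares page.screenshot() output with toMatchImageSnapshot() instead. Running inside a pinned Docker image pins the browser and font stack, making screenshots reproducible across macOS, Linux, and CI. @playwright/test is already at 1.57.0; use the matching image.

Add .storybook/test-runner.ts to take screenshots per story, reusing the Chromatic viewport modes:

// .storybook/test-runner.ts
import { getStoryContext, waitForPageReady, type TestRunnerConfig } from '@storybook/test-runner';
import { toMatchImageSnapshot } from 'jest-image-snapshot';

type Viewport = { width: number; height: number };

// Fail when more than 2% of pixels differ.
const threshold = { failureThreshold: 0.02, failureThresholdType: 'percent' as const };

const config: TestRunnerConfig = {
  setup() {
    // Register the image matcher on the runner's Jest expect.
    expect.extend({ toMatchImageSnapshot });
  },
  async postVisit(page, context) {
    const storyContext = await getStoryContext(page, context);
    // Fall back to a single unnamed pass when a story defines no Chromatic modes.
    const modes: Record<string, { viewport?: unknown }> =
      storyContext.parameters?.chromatic?.modes ?? { '': {} };

    for (const [name, mode] of Object.entries(modes)) {
      // Only explicit { width, height } viewports are applied; named viewports
      // would need resolving against the viewport addon configuration.
      const viewport = mode.viewport as Partial<Viewport> | undefined;
      if (typeof viewport?.width === 'number' && typeof viewport?.height === 'number') {
        await page.setViewportSize({ width: viewport.width, height: viewport.height });
      }
      await waitForPageReady(page);
      expect(await page.screenshot()).toMatchImageSnapshot({
        ...threshold,
        customSnapshotIdentifier: name ? `${context.id}-${name}` : context.id,
      });
    }
  },
};

export default config;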

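Add two scripts to package.json; each builds Storybook, serves it, and runs the test runner in Docker. The serve and docker commands are grouped in parentheses so that neither starts until the build has finished; without the grouping the shell backgrounds the build together with the server and starts Docker immediately. The update script passes --updateSnapshot, which is the test runner's own flag. The scripts are wrapped here for readability; JSON strings cannot span lines, so join each onto a single line in package.json. On macOS Docker Desktop replace localhost with host.docker.internal.

"test-storybook:visual": "yarn build-storybook && \
  (npx serve storybook-static -p 6006 -n & \
  docker run --rm --network host \
    -v \"$(pwd):/work\" -w /work \
    mcr.microsoft.com/playwright:v1.57.0-noble \
    yarn test-storybook --url http://localhost:6006)",

"test-storybook:visual:update": "yarn build-storybook && \
  (npx serve storybook-static -p 6006 -n & \
  docker run --rm --network host \
    -v \"$(pwd):/work\" -w /work \
    mcr.microsoft.com/playwright:v1.57.0-noble \
    yarn test-storybook --url http://localhost:6006 --updateSnapshot)"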

Run test-storybook:visual:update once to generate baselines and commit them. Chromatic continues on PRs for reviewer-facing diffs; the Docker run is the local and CI gate.

1b — Fix Playwright local setup

playwright.config.ts has both baseURL and webServer commented out. The three existing browser test specs hardcode http://localhost:3001 and require a manually started server. Playwright supports an array of webServer entries; configure it to start both Wiremock and Next.js before the suite runs:

// playwright.config.ts
webServer: [
  {
    command: 'yarn wiremock',
    url: 'http://localhost:8080/__admin/health',
    reuseExistingServer: !process.env.CI,
  },
  {
    command: 'API_SERVER=http://localhost:8080 yarn start',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
],
use: {
  baseURL: 'http://localhost:3000',
},

Replace all hardcoded http://localhost:3001 URLs in existing specs with relative paths. Tests then run identically against local dev, CI, and staging.

1c — Add E2E tests for critical flows

Article tabs

Tests asserting correct content per URL survive both the current router-based tabs and the Phase 5 static HTML files:

import { test, expect } from '@playwright/test';

test('fulltext tab is active by default', async ({ page }) => {
  await page.goto('/reviewed-preprints/85111');
  await expect(page.locator('[aria-current="page"]')).toHaveText('Full text');
});

// Asserts the active tab marker, which holds for router-based tabs and static files alike.
test('figures tab is active on the figures URL', async ({ page }) => {
  await page.goto('/reviewed-preprints/85111/figures');
  await expect(page.locator('[aria-current="page"]')).toHaveText('Figures');
});

URL redirects

The same assertions verify the Caddy configuration in step 5c:

test('articles path redirects to reviewed-preprints', async ({ request }) => {
  const response = await request.get('/articles/85111', { maxRedirects: 0 });
  expect(response.status()).toBe(301);
  expect(response.headers()['location']).toContain('/reviewed-preprints/85111');
});

VOR redirect

Add a VOR Wiremock fixture, then assert that a VOR article at /reviewed-preprints/ redirects to /articles/. The test documents the requirement so whichever step 5a approach is chosen can be verified against it.

Both EPP flavours

The Wiremock fixture biophysics-colab-111111 exists but no browser test uses it. Add an E2E test asserting flavour-specific strings — "Curators" not "Editors", "Curated Preprint" not "Reviewed Preprint" — before step 2a refactors i18n.

1d — Unit test getServerSideProps logic

getServerSideProps in [...path].page.tsx (lines 218–317) has four branches: VOR redirect, preprint redirect, unknown msid → 404, and PDF variant pre-fetching imgInfo. No unit tests exist. Mock fetchVersion via fetch-mock (already installed); assert the returned shape ({ redirect }, { notFound: true }, { props }) for each branch.
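
Each branch test then reduces to a single assertion on the result shape. A minimal sketch, assuming a local PageResult type that stands in for Next's GetServerSidePropsResult and a hypothetical resultKind helper (neither name comes from the codebase):

```typescript
// Local stand-in for the three shapes getServerSideProps can return.
type Redirect = { destination: string; permanent: boolean };
type PageResult<P> =
  | { redirect: Redirect }
  | { notFound: true }
  | { props: P };

// Hypothetical helper: name the shape so a test can assert on one string.
function resultKind<P>(result: PageResult<P>): 'redirect' | 'notFound' | 'props' {
  if ('redirect' in result) return 'redirect';
  if ('notFound' in result) return 'notFound';
  return 'props';
}
```

Under this sketch the VOR and preprint branches would both assert 'redirect' (plus the destination), the unknown-msid branch 'notFound', and the PDF branch 'props' with imgInfo present.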

Habitability and clearer responsibility boundaries

With the test safety net in place, this phase addresses accumulated design problems and misaligned responsibilities. Step 2a eliminates the i18n namespace misuse that hides TypeScript gaps and complicates isolation testing. Steps 3a–3d enforce a clear boundary: the client fetches data and renders HTML — format adaptation, proxying, and operational endpoints belong to the server. Step 4a removes next/* imports from the component library so it is framework-agnostic before the heavier infrastructure changes ahead.

2a — Replace i18n namespace abuse with typed SiteConfig

i18n has three namespaces (default, elife, biophysics_colab) holding site config — publisher name, editor labels, timeline titles, URLs — selected via NEXT_PUBLIC_SITE_NAME. There is one language. Problems: nested $t() references are invisible to TypeScript; all 17 components calling useTranslation() require I18nextProvider to render, making isolation testing awkward and Phase 7’s Astro island integration unnecessarily complex.

Replace with a typed SiteConfig object and a single React context:

// src/site-config.ts
export type SiteConfig = {
  publisherShort: string;
  publisherLong: string;
  processUrl: string;
  aboutAssessmentsUrl: string;
  aboutAssessmentsDescription: string;
  editorsAndReviewersTitle: string;
  timelineVersionTitle: string;
  timelineVersionTitleWithEvaluation: string;
  reviewProcessReviewed: string;
  reviewProcessRevised: string;
  canonicalUrl: (msid: string) => string;
  // ... one field per flavour-varying value
};

export const elifeConfig: SiteConfig = {
  publisherShort: 'eLife',
  editorsAndReviewersTitle: 'Editors',
  canonicalUrl: (msid) => `https://elifesciences.org/reviewed-preprints/${msid}`,
  // ...
};

export const biophysicsColabConfig: SiteConfig = {
  publisherShort: 'Biophysics Colab',
  editorsAndReviewersTitle: 'Curators',
  canonicalUrl: (msid) => `/reviewed-preprints/${msid}`,
  // ...
};

useSiteConfig() replaces every useTranslation() call. Set the context once in _app.page.tsx from config.siteName. Nested $t() references become plain string composition; interpolated values (e.g. URL templates) become functions, as shown with canonicalUrl above. TypeScript surfaces any missed call sites at compile time.
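
A minimal sketch of how the config could be selected from config.siteName before it is handed to the context provider. The SiteConfig type is trimmed to three fields, and the resolveSiteConfig name and the 'elife' / 'biophysics-colab' keys are assumptions for illustration; the real keys should match whatever NEXT_PUBLIC_SITE_NAME carries today.

```typescript
// Trimmed for illustration; the real type has one field per flavour-varying value.
type SiteConfig = {
  publisherShort: string;
  editorsAndReviewersTitle: string;
  canonicalUrl: (msid: string) => string;
};

const elifeConfig: SiteConfig = {
  publisherShort: 'eLife',
  editorsAndReviewersTitle: 'Editors',
  canonicalUrl: (msid) => `https://elifesciences.org/reviewed-preprints/${msid}`,
};

const biophysicsColabConfig: SiteConfig = {
  publisherShort: 'Biophysics Colab',
  editorsAndReviewersTitle: 'Curators',
  canonicalUrl: (msid) => `/reviewed-preprints/${msid}`,
};

// Hypothetical site-name keys; a Map avoids accidental hits on Object.prototype.
const siteConfigs = new Map<string, SiteConfig>([
  ['elife', elifeConfig],
  ['biophysics-colab', biophysicsColabConfig],
]);

// Hypothetical helper: fail loudly on an unknown site name instead of silently falling back.
function resolveSiteConfig(siteName: string): SiteConfig {
  const config = siteConfigs.get(siteName);
  if (!config) {
    throw new Error(`Unknown site name: ${siteName}`);
  }
  return config;
}
```

Throwing on an unknown name is a design choice of the sketch: a misconfigured deployment then fails at startup rather than rendering the wrong flavour.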

Update Storybook stories to pass both configs explicitly — this gives Chromatic and the Docker visual tests flavour coverage on every PR without needing the i18n provider or an environment variable.

2b — Flatten EnhancedArticle and make MetaData explicit

EnhancedArticle.article is typed as Omit<ProcessedArticle, 'doi' | 'date'>, a nested sub-object left over from an unfinished DB migration: a comment in the type acknowledges that the old schema was never fully dropped. This produces the articleWithVersions.article.article.* access pattern throughout the codebase, and a double spread in getServerSideProps to flatten it:

const metaData = {
  ...articleWithVersions.article,         // EnhancedArticle
  ...articleWithVersions.article.article, // ProcessedArticle (wins on conflict)
  // ...
};

Complete the DB migration: merge ProcessedArticle fields directly into EnhancedArticle and remove the Omit<…> wrapper. article.article.* access becomes article.* throughout.

At the same time, make MetaData honest. Fields like sentForReview, preprintPosted, preprintUrl, and preprintDoi are spread into metaData at runtime but absent from the MetaData type — invisible to TypeScript. Either add them to the type explicitly or stop spreading them and pass them as a separate prop. Either way, the implicit fields disappear.

Also fix a name collision: VersionSummary is defined twice with different meanings — once in enhanced-article.ts as the discriminated union VORVersionSummary | PreprintVersionSummary | ExternalVersionSummary, and once in reviewed-preprint-snippet.ts as Omit<EnhancedArticle, 'article' | 'peerReview'>. Rename the latter to something unambiguous (EnhancedArticleMetadata or similar); it disappears entirely once step 3a moves the reviewed-preprints API to the server.

2c — Replace hollow Zod schemas with real validation

In src/utils/data-fetch/fetch-data.ts, the schemas for the fields that matter most are all placeholders:

const ToDoSchema = z.any();
const ProcessedArticleSchema = ToDoSchema;
const PeerReviewSchema = ToDoSchema;
const RelatedContentSchema = ToDoSchema;
const VersionSummarySchema = ToDoSchema;

The outer EnhancedArticleWithVersionsSchema validates only that the response is an object with the right top-level keys. Everything nested — article content, peer review, version history — passes through without structural validation. A schema change in the EPP server produces a silent rendering error rather than a caught parse failure.

Replace each placeholder with a Zod schema that mirrors the corresponding TypeScript type. The types already exist in src/types/; the schemas follow directly from them. After step 2b flattens EnhancedArticle, the schema structure matches the type structure without the Omit awkwardness. safeParse failures in fetchVersion become explicit null returns with a logged schema error, making API contract violations immediately visible.

3a — Move reviewed-preprints API to enhanced-preprint-server

/api/reviewed-preprints and /api/reviewed-preprints/[msid] are a format adapter: they take the EPP internal schema and emit application/vnd.elife.reviewed-preprint+json; version=1. The server already owns the data; it should own the format too.

Move the ~400-line transformation in reviewed-preprints.page.ts to enhanced-preprint-server. Logic to port:

  • Author-line generation and date formatting
  • eLife Assessment extraction from evaluation HTML (findTerms regex for significance/strength terms)
  • x-total-count header, pagination, and date-range query parameters
  • Full-text content serialisation for indexContent

Move api-reviewed-preprints.spec.ts with the code. Co-ordinate with the external consumer team before retiring the client-side route.

3b — Move citation proxy to server

/api/citations/[msid]/bibtex and /ris exist because the browser knows the msid but the upstream citation endpoint requires a DOI. The proxy resolves this by looking up the article, extracting the DOI, then fetching the citation and returning it with a Content-Disposition: attachment header. The server already resolves msid → DOI; move the proxy there and update article pages to link directly to the server citation endpoints.

3c — Replace download proxies with direct links

/api/downloads/[msid]/pdf and /xml proxy files from upstream to attach a Content-Disposition: attachment; filename=… header.

  • PDF: article.pdfUrl is present in article data at render time. Render <a href="{pdfUrl}" download="{msid}-v{version}.pdf">. The download attribute forces a save dialog with the chosen filename.
  • XML: generateArticleXmlUri(msid, versionIdentifier) deterministically constructs the URL. Same approach.

The download attribute is restricted by browser security to same-origin requests. If files are served from a different origin, the fix is a Content-Disposition: attachment header on the CDN distribution — a configuration change, not a proxy.
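
The same-origin rule can be applied when the link is rendered, so the download attribute is only emitted where the browser will honour it. A minimal sketch; pdfDownloadLink and the DownloadLink type are hypothetical names, and the example origins are placeholders:

```typescript
type DownloadLink = { href: string; download?: string };

// Hypothetical helper: decide at render time whether the download attribute applies.
function pdfDownloadLink(
  pdfUrl: string,
  pageOrigin: string,
  msid: string,
  version: string,
): DownloadLink {
  // Relative URLs resolve against the page origin; absolute URLs are kept as-is.
  const target = new URL(pdfUrl, pageOrigin);
  const sameOrigin = target.origin === new URL(pageOrigin).origin;
  return sameOrigin
    ? { href: target.href, download: `${msid}-v${version}.pdf` }
    : { href: target.href }; // cross-origin: rely on Content-Disposition from the file host
}
```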

3d — Collapse operational endpoints

robots.txt becomes a static file in public/robots.txt. Maintain separate variants per environment and select the correct one at deploy time. Delete robots.page.ts.

/ping is a health check. A static public/ping file returning 200 satisfies any check that verifies only HTTP status. Delete ping.page.ts.

/status checks whether the upstream API is reachable — a concern that belongs to the server, which owns that dependency. Move it there and delete status.page.ts.

4a — Decouple components from next/*

Replace next/image in components

Seven components use next/image for logos and icons: site-header, site-header-biophysics-colab, biophysics-colab-site-footer, footer-main, investors, error-messages, page-not-found. Replace each with a plain <img>. Don’t introduce a wrapper — the Phase 7 replacement is astro:assets <Image>, a different API.

Replace useRouter() with a prop

[...path].page.tsx reads the active tab from useRouter(). Derive it in getServerSideProps from the request path and pass it down as an activeTab prop instead.
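
A minimal sketch of that derivation as a pure function, which also makes it unit-testable without a router. The tab names and the activeTabFromPath name are assumptions; the catch-all param is taken to be an array such as ['85111'] or ['85111', 'figures'].

```typescript
// Assumed tab names; adjust to the tabs the page actually renders.
const TABS = ['fulltext', 'figures', 'reviews', 'pdf'] as const;
type Tab = (typeof TABS)[number];

// Hypothetical helper: the last path segment selects the tab, defaulting to fulltext.
function activeTabFromPath(path: string[]): Tab {
  const candidate: string = path[path.length - 1] ?? '';
  return (TABS as readonly string[]).includes(candidate) ? (candidate as Tab) : 'fulltext';
}
```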

Swap @storybook/nextjs → @storybook/react

Once no component imports from next/*, update .storybook/main.js and confirm baselines are unchanged. By this point step 2a has already removed the i18n provider from stories — no temporary decorator is needed for the framework swap.

Use emerging standards

The EPP server is a proprietary API that aggregates and transforms content from bioRxiv and eLife. OXA and DocMaps replace that proprietary surface with open standards: OXA for structured article content, DocMaps for publishing history and peer review. Introducing the adapters here — while the framework is still Next.js — keeps the data-source change and the framework migration as separate, independently verifiable steps. Phase 7’s Astro content collections consume the adapters from this phase directly.

6a — Audit EnhancedArticleWithVersions against OXA + DocMaps

Before writing code, classify every field in EnhancedArticleWithVersions as one of: OXA, DocMaps, residual (neither standard covers it yet), or derived (computable from available data at render time). A first pass has been done by reading the code:

OXA fields

article.title, article.abstract, article.content, article.headings, article.references — carried directly in the OXA node tree. article.authors — OXA author blocks include affiliations and ORCID identifiers. article.licenses — OXA license nodes. article.meta.authorNotes — footnote-style author notes; expected as typed nodes in the OXA tree. subjects — article classifications; likely in OXA metadata but not yet confirmed.

DocMaps fields

msid, doi, versionIdentifier, versionDoi, umbrellaDoi, preprintDoi, preprintUrl, preprintPosted, sentForReview, published — all come from DocMap step outputs. The full versions record — version history for tab navigation, the “previous version” warning, and the timeline — is reconstructed from the DocMap step sequence. The withEvaluationSummary flag on each version summary is derived by checking whether the corresponding DocMap step has an evaluation-summary output. ExternalVersionSummary.corrections (correction event links shown in the timeline) may or may not have a DocMaps representation; treat as residual until confirmed.

peerReview (evaluation summary, individual reviews, author response) maps to DocMap step outputs. Each Evaluation.text field is currently a raw HTML string — the same format DocMaps carries for annotation-platform content. The significance and strength term extraction in reviewed-preprints.page.ts runs a regex over that HTML to populate the eLife assessment badge; this post-processing step remains on the client side after the switch.

Residual fields

metrics (views, downloads, citations) is fetched from a separate metrics API; neither standard covers it. Either keep as a direct client call or drop in favour of a lazy client-side fetch. relatedContent (retraction notices, related articles) is editorial data with no OXA or DocMaps equivalent; it will remain an EPP server field or be migrated to a separate editorial API. pdfUrl is a URL pointer to a hosted file — not article content — and has no natural OXA home; treat as residual and source from a CDN URL convention or a minimal EPP server field. eLocationId, volume, and publishedYear are journal-assignment fields derivable once the VOR relationship is known from DocMaps but not carried by either standard directly. siteName and license (the plain string on EnhancedArticle, distinct from ProcessedArticle.licenses) are configuration.

Third data source: IIIF

Image dimension data (contentToImgInfo) is fetched from config.iiifUrl/info.json at render time for the PDF tab only. This is separate from both the EPP server and OXA/DocMaps and is unaffected by the switch.

The output of this step is a written field-mapping document (a comment block in src/adapters/README.md or similar) and an explicit definition of the residual surface — the minimum EPP server or external API calls still needed after the switch.
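
The mapping could also be kept in a machine-readable form so the residual surface is computed rather than maintained by hand. A minimal sketch with a partial set of fields taken from the first-pass audit above; the fieldSources and fieldsFrom names are illustrative, and the map is not complete:

```typescript
type FieldSource = 'oxa' | 'docmaps' | 'residual' | 'derived';

// Partial first-pass classification; extend it field by field during the audit.
const fieldSources: Record<string, FieldSource> = {
  'article.title': 'oxa',
  'article.abstract': 'oxa',
  'article.content': 'oxa',
  'article.authors': 'oxa',
  msid: 'docmaps',
  doi: 'docmaps',
  versionIdentifier: 'docmaps',
  preprintDoi: 'docmaps',
  peerReview: 'docmaps',
  metrics: 'residual',
  relatedContent: 'residual',
  pdfUrl: 'residual',
};

// List every field classified under one source, sorted for stable output.
function fieldsFrom(source: FieldSource): string[] {
  return Object.entries(fieldSources)
    .filter(([, fieldSource]) => fieldSource === source)
    .map(([field]) => field)
    .sort();
}
```

fieldsFrom('residual') then gives the fields that still need an EPP server or external API call after the switch.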

6b — Write OXA content adapter

Implement src/adapters/oxa.ts. The OXA node tree structure ({ type, id, classes, data, children }) maps naturally to the existing Content type in src/types/content.ts:

// src/adapters/oxa.ts
export async function fetchOxaDocument(doi: string): Promise<ProcessedArticle> {
  const response = await fetch(`${OXA_ENDPOINT}/articles/${encodeDoi(doi)}`);
  // Surface HTTP failures instead of parsing an error body as an article.
  if (!response.ok) throw new Error(`OXA fetch failed for ${doi}: ${response.status}`);
  return oxaDocToProcessedArticle(await response.json());
}

function mapOxaNode(node: OxaNode): Content {
  switch (node.type) {
    case 'Paragraph': return { type: 'Paragraph', content: node.children.map(mapOxaNode) };
    case 'Heading':   return { type: 'Heading', depth: node.data.depth, content: node.children.map(mapOxaNode) };
    case 'Figure':    return { type: 'Figure', ...extractFigureData(node) };
    case 'Math':      return { type: 'MathFragment', value: node.data.math };
    // ... remaining node types
    default:
      // Fail loudly on unmapped node types rather than returning undefined.
      throw new Error(`Unmapped OXA node type: ${node.type}`);
  }
}

Cover all node types present in eLife articles: Paragraph, Heading, Figure (with IIIF URL extraction), Table, MathFragment, MathBlock, CodeBlock, List, ListItem, Link, Strong, Emphasis. Add OXA fixture files in src/adapters/oxa/__tests__/fixtures/ from at least two real articles (one eLife, one biophysics-colab). Unit tests assert the mapped output matches the equivalent EPP API response fields for those fixtures.

6c — Write DocMaps publishing-history adapter

Implement src/adapters/docmaps.ts. A DocMap is a JSON-LD graph of “steps”, each step containing inputs, actions, outputs, and participants. The relevant step types map to EPP fields as follows:

DocMaps step type    EPP field
preprint-posted      preprintPosted, preprintDoi, preprintUrl
under-review         sentForReview
peer-reviewed        peerReview.reviews[], peerReview.evaluationSummary
author-response      peerReview.authorResponse
revised              new PreprintVersionSummary entry in versions
version-of-record    VORVersionSummary; triggers VOR redirect

// src/adapters/docmaps.ts
export async function fetchPublishingHistory(msid: string): Promise<PublishingHistory> {
  const docmap = await fetch(
    `${DOCMAPS_ENDPOINT}/docmaps/v2/articles/${msid}`
  ).then(r => r.json());
  return parseDocMap(docmap);
}

Use the @knowledgefutures/docmaps-sdk package for JSON-LD framing rather than parsing raw JSON-LD manually. Extend the Wiremock stubs from step 1b with DocMaps fixtures for the existing test articles (85111, 15102, biophysics-colab-111111), including a VOR article fixture to cover the version-of-record step type.
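
Whatever does the framing, parseDocMap ends up walking the steps in order. A minimal sketch of that walk, assuming the first-step / next-step linking used by DocMaps and assuming the "step type" in the table corresponds to an assertion status on each step. The DocMap and DocMapStep types are simplified stand-ins, not the SDK's types, and stepStatuses is a hypothetical name:

```typescript
// Simplified stand-ins; real steps also carry inputs, actions, outputs, and participants.
type DocMapStep = {
  assertions: { status: string }[];
  'next-step'?: string;
};

type DocMap = {
  'first-step': string;
  steps: Record<string, DocMapStep>;
};

// Follow the step chain from first-step, collecting assertion statuses in order.
// The seen set guards against a malformed DocMap whose steps form a cycle.
function stepStatuses(docmap: DocMap): string[] {
  const statuses: string[] = [];
  const seen = new Set<string>();
  let id: string | undefined = docmap['first-step'];
  while (id !== undefined && !seen.has(id)) {
    seen.add(id);
    const step: DocMapStep | undefined = docmap.steps[id];
    if (!step) break;
    statuses.push(...step.assertions.map((assertion) => assertion.status));
    id = step['next-step'];
  }
  return statuses;
}
```

The ordered statuses are what the version history and timeline reconstruction would be built on.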

6d — Validate parity against EPP API output

Write integration tests in src/adapters/__tests__/parity.test.ts that fetch the same articles via both paths and assert field-level equivalence:

  1. Fetch articles via EPP API using the existing Wiremock fixtures from step 1b.
  2. Fetch the same articles via OXA + DocMaps adapters using the new Wiremock stubs added in steps 6b and 6c.
  3. Assert field-level equivalence for all mapped fields identified in step 6a.
  4. Document intentional divergences — cases where OXA or DocMaps is richer or differently structured than the EPP API — in a comment block at the top of the test file and in the field-mapping document from step 6a.

The biophysics-colab article must be included — it exercises a different publisher DocMaps endpoint and a different OXA namespace.
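
Field-level equivalence can be expressed as a small diff over the mapped field paths, so a failing test reports exactly which fields diverge. A minimal sketch; getPath, diffMappedFields, and the Mismatch type are hypothetical helpers, and the JSON comparison is a crude structural check that is sensitive to key order:

```typescript
type Mismatch = { field: string; epp: unknown; adapted: unknown };

// Read a dotted path such as 'article.title' from a nested object.
function getPath(source: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((value, key) => {
    if (value !== null && typeof value === 'object') {
      return (value as Record<string, unknown>)[key];
    }
    return undefined;
  }, source);
}

// Compare the two article objects on the mapped fields only and return the differences.
function diffMappedFields(epp: unknown, adapted: unknown, fields: string[]): Mismatch[] {
  return fields
    .map((field) => ({ field, epp: getPath(epp, field), adapted: getPath(adapted, field) }))
    .filter((pair) => JSON.stringify(pair.epp) !== JSON.stringify(pair.adapted));
}
```

A parity test would then assert that the returned list is empty, with intentional divergences removed from the field list and documented instead.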

6e — Switch to OXA + DocMaps

Add a feature flag in config.ts:

// src/config.ts
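// Compare against 'true': under !! any non-empty string, including "false", enables the flag.
useOxaDocmaps: process.env.NEXT_PUBLIC_USE_OXA_DOCMAPS === 'true',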

In [...path].page.tsx getServerSideProps, branch on the flag:

const article = config.useOxaDocmaps
  ? await buildArticleFromAdapters(msid)  // OXA + DocMaps
  : await fetchVersion(msid);             // EPP API (existing path)

Deploy to staging with NEXT_PUBLIC_USE_OXA_DOCMAPS=true. Run the Phase 1 E2E suite against the staging deployment. Run the Docker visual regression suite against Storybook stories seeded from OXA/DocMaps data. Once both pass, remove the feature flag branch and delete the fetchVersion code path.

Remove runtime complexity

Once the habitability phase has removed all API routes and step 5a has converted the remaining pages to static generation, the Node.js runtime has nothing left to do at request time: every response is either a static file or a redirect. Setting output: 'export' makes that explicit. Caddy replaces the Node.js container as a simple, configuration-driven static file server. The Phase 1 E2E tests verify the Caddy configuration: run the same Playwright suite against the Caddy-served static output and it should pass unchanged.

5a — Convert SSR pages to static generation

Both getServerSideProps functions switch to static generation. index.page.tsx is a straightforward conversion — call fetchVersions() in getStaticProps instead of at request time.

[...path].page.tsx requires getStaticPaths to enumerate all articles at build time. Generate only the base /msid path; Caddy's try_files serves the same file for sub-paths like /msid/figures, and the active tab is determined client-side from the URL (as a prop after step 4a):

export const getStaticPaths: GetStaticPaths = async () => {
  const { items } = await fetchVersions();
  return {
    paths: items.map(({ msid }) => ({ params: { path: [msid] } })),
    fallback: false,
  };
};

The VOR redirect. The redirect logic at lines 270–291 of [...path].page.tsx cannot live in a static HTML file. It must be resolved before this step ships. Three options in order of preference:

  1. Unify the URL spaces. Serve everything under /reviewed-preprints/, add <link rel="canonical"> for SEO, and retire the /articles/ path for preprints. No server round-trip required.
  2. Have Caddy proxy the decision to a lightweight endpoint on enhanced-preprint-server that returns the canonical URL for a given msid. Caddy performs the redirect.
  3. Defer to Phase 7, where Astro middleware can resolve it at build time by writing redirect HTML stubs during the static build.

5b — Static export config and image handling

// next.config.js
module.exports = {
  output: 'export',
  images: { unoptimized: true },
  // existing config...
};

images: { unoptimized: true } disables the /_next/image server-side resize endpoint. It has no visible effect here: the seven components that previously used next/image now use plain <img> tags (after step 4a); article figures use IIIF URL parameters for sizing independently of Next.js. Phase 7 replaces logos with astro:assets <Image> for build-time WebP generation.

5c — URL rewrites to Caddy

The rewrite rules in next.config.js move to the Caddyfile. The Phase 1 redirect tests run against the Caddy-served output and verify correctness:

epp.elifesciences.org {
  root * /srv/out

  @articles path_regexp art ^/articles/(.+)$
  redir @articles /reviewed-preprints/{re.art.1} 301

  @previews path_regexp prev ^/previews/(.+)$
  redir @previews /reviewed-preprints/{re.prev.1} 301

  @numeric path_regexp num ^/(\d+)(\.bib|\.ris|\.pdf|\.xml)?$
  redir @numeric /reviewed-preprints/{re.num.1}{re.num.2} 301

  file_server
  try_files {path} {path}.html {path}/index.html =404
}
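
The three redirect rules can be mirrored as a pure function and used to generate expected Location values for the Phase 1 redirect tests. A minimal sketch; expectedRedirect is a hypothetical test helper, not part of the Caddy configuration:

```typescript
// Mirrors the three redir rules above; returns the expected Location path,
// or null when no rule matches and the request falls through to file_server.
function expectedRedirect(path: string): string | null {
  const articles = /^\/articles\/(.+)$/.exec(path);
  if (articles) return `/reviewed-preprints/${articles[1]}`;

  const previews = /^\/previews\/(.+)$/.exec(path);
  if (previews) return `/reviewed-preprints/${previews[1]}`;

  // Bare numeric ids, optionally with a citation or download extension.
  const numeric = /^\/(\d+)(\.bib|\.ris|\.pdf|\.xml)?$/.exec(path);
  if (numeric) return `/reviewed-preprints/${numeric[1]}${numeric[2] ?? ''}`;

  return null;
}
```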

5d — Rebuild webhook

With fallback: false, new articles return 404 until a rebuild completes. Configure enhanced-preprint-server to emit a webhook on article publication that triggers the CI/CD pipeline. The pipeline runs next build, then updates Caddy's document root atomically via a symlink swap or rolling deploy.

Add a scheduled rebuild (for example, hourly) as a fallback in case a webhook delivery fails silently. The scheduled rebuild bounds the worst-case delay for a new article becoming visible.

Align tooling with project intent

Next.js is a full-stack framework optimised for server-rendered applications. A static site that publishes scientific articles is better served by a tool designed for exactly that use case. Because the habitability phase decoupled the component library from Next.js-specific imports and replaced i18n with a typed config, and the emerging-standards phase migrated the data source to OXA + DocMaps, this phase is limited to Astro page files and content collections. The 147 React components and their stories do not change.

7a — Project setup and content collections

Astro’s content layer (Astro 5) is what enables incremental builds. It replaces getStaticProps and getStaticPaths with a centralised data-fetching layer that caches results between builds in .astro/data-store.json. Each entry carries an id and a digest. On every build Astro compares digests against the cache; only entries whose digest changed get their pages re-rendered. One new article = one changed digest = only that article’s pages rebuilt.
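
The digest comparison can be pictured in isolation as follows. This is a conceptual sketch only, not Astro's implementation; the Entry type and the changedIds name are illustrative.

```typescript
type Entry = { id: string; digest: string };

// Return the ids whose digest is new or differs from the cached digest.
function changedIds(cached: Map<string, string>, current: Entry[]): string[] {
  return current
    .filter(({ id, digest }) => cached.get(id) !== digest)
    .map(({ id }) => id);
}
```

An entry that keeps its digest between builds is skipped; a new article, or an article whose digest moved, is the only work left.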

The content collection loader uses the DocMaps adapter from step 6c to enumerate articles and derive digests from DocMaps version identifiers — giving true incremental build semantics tied to actual content changes rather than arbitrary timestamps:

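// src/content/config.ts
import { defineCollection } from 'astro:content';
import { fetchAllPublishingHistories } from '../adapters/docmaps';

const articles = defineCollection({
  // Object loader: digests are attached through store.set(), which an inline
  // loader returning a plain array does not expose.
  loader: {
    name: 'docmaps-articles',
    load: async ({ store }) => {
      const histories = await fetchAllPublishingHistories();
      for (const history of histories) {
        // Digest from DocMaps: changes only when the DocMap itself changes.
        // store.set() returns false and leaves the entry alone when the digest is unchanged.
        store.set({
          id: history.msid,
          data: { ...history },
          digest: history.docmapDigest,
        });
      }
    },
  },
});

export const collections = { articles };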

Each article page then fetches its OXA content at build time:

// src/pages/reviewed-preprints/[msid].astro
const entry = await getEntry('articles', msid);
const oxaContent = await fetchOxaDocument(entry.data.doi); // from Phase 6 adapter

For the incremental cache to work in CI, .astro/data-store.json must be persisted between runs. Without this, every CI build is a full rebuild:

# GitHub Actions
- uses: actions/cache@v4
  with:
    path: .astro
    key: astro-${{ github.ref_name }}-${{ github.run_id }}
    restore-keys: astro-${{ github.ref_name }}-

7b — Page files and BaseLayout

Astro pages live in src/pages/ as .astro files. The existing React components are used directly — no component rewrites required.

---
// src/pages/reviewed-preprints/[msid].astro
import { getCollection, getEntry } from 'astro:content';
import BaseLayout from '../../layouts/BaseLayout.astro';
import { ArticlePage } from '../../components/pages/article/article-page';
import { elifeConfig, biophysicsColabConfig } from '../../site-config';

export async function getStaticPaths() {
  const articles = await getCollection('articles');
  return articles.map((entry) => ({ params: { msid: entry.id } }));
}

const { msid } = Astro.params;
const entry = await getEntry('articles', msid);
const siteName = import.meta.env.PUBLIC_SITE_NAME ?? 'elife';
const siteConfig = siteName === 'biophysics-colab'
  ? biophysicsColabConfig : elifeConfig;
---
<BaseLayout title={entry.data.title} siteName={siteName}>
  <ArticlePage article={entry.data} activeTab="fulltext"
    siteConfig={siteConfig} />
</BaseLayout>

_app.page.tsx and _document.page.tsx become src/layouts/BaseLayout.astro, handling GTM, Cookiebot, font variables, and the layout switch. Replace next/font/google with @fontsource/noto-serif / @fontsource/noto-sans. NEXT_PUBLIC_* → PUBLIC_* via import.meta.env. No I18nextProvider is needed; SiteConfig is a plain prop.

7c — Split catch-all into per-tab pages

In Astro, each tab is a separate .astro file. Tab navigation becomes plain <a href> links between static HTML files — no JavaScript required, works without JS, indexed independently by crawlers:

src/pages/reviewed-preprints/
  [msid].astro          → fulltext (canonical)
  [msid]/figures.astro  → figures tab
  [msid]/reviews.astro  → peer review tab

Both /reviewed-preprints/msid and /msid/fulltext currently show fulltext. Make [msid].astro canonical; add a one-line Caddy 301 from /msid/fulltext → /msid. The PDF tab can become a static page or be retired in favour of the direct PDF link from step 3c.

7d — Audit client:* directives

Astro renders React components to static HTML at build time and ships zero JavaScript for them by default. A component only hydrates on the client when opted into with a client:* directive. Missing a directive on an interactive component produces no error — the component renders correctly but ignores user events.

Directive             Hydrates when             Use for
client:load           Page loads immediately    Components needed on first interaction
client:idle           Browser reaches idle      Lower-priority interactive components
client:visible        Scrolled into viewport    Components below the fold
client:only="react"   Never SSR’d               Components that cannot render server-side

Most of the 147 components are presentational and need no directive. Interactive components identified by Phase 1 E2E tests: authors toggle (client:visible), clipboard copy (client:visible), jump-to menu (client:idle), modal (client:load).

7e — Build-time image optimisation

Replace the plain logo <img> tags (from step 4a) with astro:assets <Image> for build-time WebP generation:

---
import { Image } from 'astro:assets';
import elifeLogo from '../../images/elife-logo.svg';
---
<Image src={elifeLogo} width={120} height={40} alt="eLife" />

7f — Verify visual regression baselines

Run the Docker visual regression suite against the Astro build. Run Phase 1 E2E tests against the Caddy-served Astro output. Both should pass unchanged.

Use CDN directly for hosting

Static files don’t need a container. Uploading the build output to Fastly Object Storage and serving directly from the CDN eliminates the k8s Deployment, Service, IngressRoute, and Caddy container in a single phase. The output directory is written as out/ throughout this phase; Astro writes to dist/ unless outDir is set to out/. Requests are handled at Fastly’s edge, closer to readers, with no infrastructure to manage.

8a — Provision Fastly Object Storage bucket and configure the Fastly service

Create a Fastly Object Storage bucket via the Fastly API or CLI. There is no "static website hosting" mode; the CDN handles URL routing via VCL. Authentication uses a Fastly API token, so no IAM or cross-cloud credentials are needed.

Point the Fastly service backend at the bucket. Add VCL for:

  • Directory-style URLs — append index.html when path has no file extension
  • 404 handling — serve 404/index.html with status 404 on missing objects
  • Surrogate keys — tag HTML responses by msid for per-article cache purge
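
# Example Fastly VCL fragment: directory index rewriting
sub vcl_recv {
  # Test the path only, so a query string cannot hide or fake an extension.
  if (req.url.path !~ "\.[a-zA-Z0-9]+$") {
    # Add the missing slash so /a/b becomes /a/b/index.html, not /a/bindex.html.
    if (req.url.path ~ "/$") {
      set req.url = req.url.path + "index.html";
    } else {
      set req.url = req.url.path + "/index.html";
    }
  }
}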
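
The intended mapping from request URL to stored object can also be mirrored in TypeScript, so that CI can check every built page resolves to an object before the VCL goes live. This is a sketch of that mapping under the same assumption as the VCL rule: a path whose last segment ends in a dot followed by letters or digits is treated as a file. The function name is hypothetical.

```typescript
// Map a request URL to the object path the CDN should fetch.
function toObjectPath(url: string): string {
  const path = url.split("?")[0]; // the query string plays no part in the lookup
  const lastSegment = path.slice(path.lastIndexOf("/") + 1);
  if (/\.[a-zA-Z0-9]+$/.test(lastSegment)) {
    return path; // already names a file, e.g. /assets/app.3f2a.js
  }
  // Directory-style URL: serve its index.html, adding the slash if missing.
  return path.endsWith("/") ? `${path}index.html` : `${path}/index.html`;
}
```

One consequence worth checking against real data: an msid containing a dot would be treated as a file name by this rule and by the VCL alike.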

Test before DNS moves: point the Fastly service at Object Storage while Caddy stays live in k8s. Use a Fastly staging service, or send the Fastly-Debug: 1 request header, to verify article pages, the homepage, and 404 handling.

8b — Update CI/CD: upload to Object Storage and purge Fastly cache

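Replace Docker build + kubectl apply with an Object Storage upload and Fastly purge. Fastly Object Storage is S3-compatible, so aws s3 sync --endpoint-url works. Two upload passes set the correct cache headers; a final delete-only pass removes stale objects once the new HTML is in place:

# Hashed assets: cache forever
aws s3 sync out/ s3://epp-static/ \
  --endpoint-url "$FASTLY_STORAGE_ENDPOINT" \
  --exclude "*.html" \
  --cache-control "public, max-age=31536000, immutable"

# HTML: always revalidate (exclude everything, then re-include HTML)
aws s3 sync out/ s3://epp-static/ \
  --endpoint-url "$FASTLY_STORAGE_ENDPOINT" \
  --exclude "*" --include "*.html" \
  --cache-control "no-cache"

# Stale objects: delete last; --size-only avoids re-uploading unchanged files
aws s3 sync out/ s3://epp-static/ \
  --endpoint-url "$FASTLY_STORAGE_ENDPOINT" \
  --size-only --delete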

After upload, purge by surrogate key (single article) or purge the whole service (full rebuild):

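# One purge per surrogate key (the CLI flag is --key); use --all for a full rebuild
fastly purge --service-id "$FASTLY_SERVICE_ID" --key "rp/$MSID"
fastly purge --service-id "$FASTLY_SERVICE_ID" --key "vor/$MSID"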
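
Where the CI step is a Node script rather than a shell step, the same purge can go through the Fastly purge API. The sketch below only builds the request so it can be unit-tested without network access. It assumes the rp/ and vor/ key scheme shown above; the function names are hypothetical. A single article uses the batch purge endpoint, which takes the keys in a JSON body and so avoids putting the slash in a key into the URL path.

```typescript
interface PurgeRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Surrogate keys attached to one article's HTML (key scheme from the plan above).
function surrogateKeysFor(msid: string): string[] {
  return [`rp/${msid}`, `vor/${msid}`];
}

// Build a purge request: one article when msid is given, the whole service otherwise.
function buildPurgeRequest(serviceId: string, token: string, msid?: string): PurgeRequest {
  const base = `https://api.fastly.com/service/${serviceId}`;
  if (msid === undefined) {
    return { method: "POST", url: `${base}/purge_all`, headers: { "Fastly-Key": token } };
  }
  return {
    method: "POST",
    url: `${base}/purge`,
    headers: { "Fastly-Key": token, "Content-Type": "application/json" },
    body: JSON.stringify({ surrogate_keys: surrogateKeysFor(msid) }),
  };
}
```

A caller would pass the result to fetch, for example fetch(req.url, req), and fail the build on a non-2xx response.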

Run the hashed-asset pass first — HTML must not reference an asset hash before the asset is uploaded.
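
The ordering and header rules can be expressed as a small planning function, which is handy if the upload later moves from the AWS CLI to an S3 client library. This is a minimal sketch: the function names are hypothetical, and the Cache-Control values are the two used in the commands above.

```typescript
interface UploadItem {
  path: string;
  cacheControl: string;
}

// HTML is always revalidated; everything else is treated as a hashed, immutable asset.
function cacheControlFor(path: string): string {
  return path.endsWith(".html")
    ? "no-cache"
    : "public, max-age=31536000, immutable";
}

// Order uploads so every asset exists before any HTML that references it.
function planUpload(paths: string[]): UploadItem[] {
  const items = paths.map((path) => ({ path, cacheControl: cacheControlFor(path) }));
  const assets = items.filter((item) => !item.path.endsWith(".html"));
  const html = items.filter((item) => item.path.endsWith(".html"));
  return [...assets, ...html];
}
```

Treating every non-HTML file as immutable is only safe if all such files carry a content hash in their name; unhashed files such as robots.txt would need their own rule.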

8c — VOR redirect in Fastly VCL

If the VOR redirect was not resolved in step 5a: use a Fastly Edge Dictionary mapping each msid to its type (vor or rp), populated by CI via the Fastly API. By Phase 6 the article type is available directly from DocMaps (the version-of-record step type), so CI populates the dictionary from the DocMaps index without querying the EPP server. VCL issues a 301 before Object Storage is consulted:

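# Populate the dictionary during the CI build.
# Assumes articles.tsv (hypothetical) holds one "<msid><TAB><type>" row per article,
# with type "vor" or "rp" taken from DocMaps, and FASTLY_API_TOKEN set in CI.
while IFS=$'\t' read -r msid type; do
  # PUT upserts the item: created if absent, updated otherwise
  curl -fsS -X PUT \
    -H "Fastly-Key: $FASTLY_API_TOKEN" \
    --data-urlencode "item_value=$type" \
    "https://api.fastly.com/service/$FASTLY_SERVICE_ID/dictionary/$DICT_ID/item/$msid"
done < articles.tsv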
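
# VCL recv: redirect if the URL prefix does not match the article type.
# article_types is the Edge Dictionary populated by CI.
sub vcl_recv {
  declare local var.msid STRING;
  declare local var.type STRING;
  if (req.url.path ~ "^/(articles|reviewed-preprints)/([^/]+)") {
    set var.msid = re.group.2;
    set var.type = table.lookup(article_types, var.msid, "");
    if (var.type == "vor" && req.url.path ~ "^/reviewed-preprints/") {
      set req.http.X-Redirect-To = "/articles/" + var.msid;
      error 801 "Redirect";
    }
    if (var.type == "rp" && req.url.path ~ "^/articles/") {
      set req.http.X-Redirect-To = "/reviewed-preprints/" + var.msid;
      error 801 "Redirect";
    }
  }
}

# Fastly VCL has no synth(); the custom 801 status becomes the 301 here.
sub vcl_error {
  if (obj.status == 801) {
    set obj.status = 301;
    set obj.http.Location = req.http.X-Redirect-To;
    return(deliver);
  }
}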
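
Edge logic is awkward to test in place, so it can help to keep a TypeScript mirror of the redirect decision and run the same cases against both. The sketch below reproduces the rule described above: look up the article type by msid and redirect when the URL prefix disagrees with it. The function name and the use of a Map in place of the Edge Dictionary are assumptions for illustration.

```typescript
type ArticleType = "vor" | "rp";

// Return the redirect target for a path, or null when no redirect applies.
function redirectTarget(path: string, types: Map<string, ArticleType>): string | null {
  const match = /^\/(articles|reviewed-preprints)\/([^/?]+)/.exec(path);
  if (!match) return null; // not an article URL
  const [, prefix, msid] = match;
  const type = types.get(msid); // undefined when the msid is not in the dictionary
  if (type === "vor" && prefix === "reviewed-preprints") return `/articles/${msid}`;
  if (type === "rp" && prefix === "articles") return `/reviewed-preprints/${msid}`;
  return null;
}
```

As in the VCL, the target drops any tab suffix such as /figures; whether that is acceptable is part of the VOR redirect decision from step 5a.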

8d — Remove k8s manifests and cut over DNS

Lower the DNS TTL to 60s at least 24 hours before the cutover. Switch the Fastly backend from the k8s ingress to Object Storage and verify via Fastly's hostname. If DNS already points to Fastly, the backend switch is the entire cutover and DNS does not move.

After a monitoring period, delete from the k8s manifests repo:

  • Deployment — EPP client pods and Caddy container
  • Service — ClusterIP or NodePort for the Caddy container
  • IngressRoute (Traefik) — routing rule from Traefik to the service
  • Any HorizontalPodAutoscaler targeting the EPP deployment

Remove the Caddy Dockerfile and Caddyfile. URL rewrite rules are now in Fastly VCL (step 8c) or were already removed in step 5c.

Step overview and risk register

Step What changes Key risk
1a Playwright screenshot assertions on Storybook stories run inside pinned Docker image; baseline screenshots committed

Medium — Pixel-diff threshold must be calibrated against real rendering noise before the check becomes gating.

1b Playwright webServer config starts both Wiremock and Next.js automatically; hardcoded URLs replaced with baseURL

Low — No VOR Wiremock fixture exists; VOR redirect test in 1c is blocked until one is added.

1c E2E tests for article tabs, URL redirects, VOR redirect, homepage, 404, and both EPP flavours

Medium — VOR Wiremock fixture is a prerequisite. Biophysics-colab flavour assertions depend on i18n being correctly initialised until step 2a replaces it.

1d Unit tests for all four getServerSideProps branches

Low — Standard unit testing with mockable dependencies.

2a i18n namespace abuse replaced with typed SiteConfig; both flavours explicitly covered in Storybook stories and E2E tests

Medium — 17 components to update; nested $t() references require a full audit of i18n.ts before starting. Visual regression baseline from step 1a is the safety net.

2b DB migration completed: ProcessedArticle flattened into EnhancedArticle; MetaData made an explicit type; second VersionSummary renamed; EnhancedArticleWithPublishedDate moved to types

Low — Mechanical rename and type consolidation. Eliminates .article.article.* double-nesting and invisible spread fields. Do before step 6a.

2c Hollow z.any() placeholders replaced with real Zod schemas mirroring the TypeScript types; safeParse failures return null with a logged error

Low — Low risk but high diagnostic value; surfaces API contract drift at the boundary. Do after step 2b once the types are stable.

3a Reviewed-preprints format adapter (~400 lines) moves to enhanced-preprint-server; API specs move with it; external consumers re-pointed

High — Complex transform logic; server may be a different language; external consumer coordination requires another team.

3b Citation proxy moves to server; article pages link directly to server citation endpoints

Low — Server already resolves msid → DOI. 30 lines of proxy logic.

3c PDF and XML download proxies removed; article pages use download attribute direct links

Medium — Browser ignores download on cross-origin URLs. CDN must add Content-Disposition: attachment or links open in-browser.

3d robots.txt becomes a static file; /ping becomes a static file; /status moves to server

Low — Confirm nothing monitors /ping or /status on the client before removing them.

4a Seven components: next/image → <img>. Page component: useRouter() → activeTab prop. Storybook: @storybook/nextjs → @storybook/react

Medium — Framework swap breaks stories that still use router context. Complete the prop refactor first. Step 2a having already removed the i18n provider means no temporary decorator is needed.

5a getServerSideProps → getStaticProps + getStaticPaths; VOR redirect decision made and implemented

Medium — Build time scales with article count; benchmark early. VOR redirect logic must be resolved before this step ships.

5b output: 'export' added; images: { unoptimized: true } set

Low — No user-visible effect for this project's image types (plain <img> logos after step 4a, IIIF figures).

5c URL rewrites move from next.config.js to Caddyfile; Phase 1 redirect tests verify the result

Medium — Caddy rewrite syntax differs from Next.js. Dynamic VOR redirect cannot be a static rule; requires a resolution from step 5a.

5d EPP server emits rebuild webhook on article publication; CI rebuilds and redeploys; scheduled fallback rebuild added

Medium — Silent webhook failure leaves new articles returning 404. Scheduled rebuild is the fallback. Endpoint must be authenticated.

6a Every field in EnhancedArticleWithVersions classified as OXA, DocMaps, residual, or derived; residual surface defined in writing

Low — Design step, no code changes. Output is a field-mapping document that prevents parity surprises in step 6d.

6b OXA content adapter: fetchOxaDocument(doi) → ProcessedArticle; all node types mapped; OXA fixture files committed

Medium — OXA is at v0.0.1; pin library version and treat schema changes as deliberate adapter failures rather than silently absorbing them.

6c DocMaps publishing-history adapter: fetchPublishingHistory(msid) → { versions, peerReview, timeline }; Wiremock stubs added for DocMaps endpoint

Medium: JSON-LD parsing complexity; use the DocMaps TypeScript SDK (published on npm as @docmaps/sdk; confirm the package name before pinning) rather than parsing raw JSON-LD. Assessment term extraction has no DocMaps equivalent; becomes a post-processing step.

6d Parity integration tests: OXA + DocMaps output asserted against EPP API output field-by-field; intentional divergences documented

Medium — Parity gaps are expected output, not failures. Complete before 6e; undiscovered gaps after the switch produce hard-to-attribute rendering errors.

6e Feature flag NEXT_PUBLIC_USE_OXA_DOCMAPS; staging deployment verified against Phase 1 E2E suite and visual regression; flag removed and EPP API data path deleted

High — Replaces data source for all pages. Phase 1 E2E suite and visual regression baseline are the safety nets. Keep EPP API path live behind the flag for at least one production build cycle before deleting it.

7a Astro project initialised with React integration; content collections use DocMaps adapter for article enumeration and digests, OXA adapter for per-article content; CI configured to persist .astro/data-store.json

High — CI cache must be explicitly persisted or every build is a full rebuild. React 19 + @astrojs/react compatibility must be verified upfront.

7b Astro page files and BaseLayout.astro written; SiteConfig passed as prop (no i18n provider); Noto fonts via @fontsource

Medium — GTM and Cookiebot script placement differs from Next.js <Head>. Verify in built output and test both site configurations.

7c Catch-all replaced by per-tab .astro files; tab navigation becomes plain <a href> links

Low — Fulltext URL duplication resolved by a one-line Caddy redirect. Phase 1 tab tests verify correctness.

7d client:* directives added to interactive components: authors toggle, clipboard, jump-to-menu, modal

High — Missing a directive silently breaks interactivity with no error. Only detectable through E2E interaction tests. Phase 1 interaction tests must be complete first.

7e Logo <img> tags replaced with astro:assets <Image> for build-time WebP generation

Low — SVGs pass through unchanged. Check for externally-referenced SVGs before enabling.

7f Docker visual regression suite re-run against Astro output; Phase 1 E2E suite run against Caddy-served Astro build

Low — Confirmation step. Main failure mode is a missed client:* directive from step 7d.

8a Fastly Object Storage bucket provisioned; Fastly service backend switched to Object Storage with directory-index VCL, 404 handling, and per-article surrogate keys

Low — Newer product; confirm it is available on the account plan. Review the Fastly static-site tutorial for current bucket and backend syntax before starting.

8b CI/CD replaces Docker build and kubectl apply with aws s3 sync --endpoint-url to Fastly Object Storage (separate passes for HTML and hashed assets) and Fastly surrogate-key purge

Medium — The --delete flag opens a short window where a deleted object is not yet replaced. Upload new and changed objects first; delete stale ones last.

8c Fastly Edge Dictionary populated by CI from DocMaps index (article type sourced from version-of-record step, no EPP server query needed); VCL recv issues 301 if URL prefix does not match article type

Medium — Dictionary must be updated after the Object Storage upload completes, not before. A race between the dictionary update and the upload would cause a correctly-redirected URL to 404 transiently.

8d k8s Deployment, Service, and IngressRoute deleted; Caddy Dockerfile retired; DNS TTL lowered before cutover; Fastly backend switch used as the cutover mechanism

Low — If Fastly already fronts the cluster, cutover is a single backend change with instant rollback. Confirm Fastly's position in the stack before planning the sequence.

Background

OXA

OXA (Open eXchange for Articles) is a JSON-based typed node tree for structured article content, backed by eLife, Stencila, Quarto, and Curvenote. It is a web-native alternative to JATS XML, representing headings, paragraphs, figures, math, and code as composable, typed nodes. The specification is at version 0.0.1.
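
To make "composable, typed nodes" concrete, the sketch below models a tiny document tree as a TypeScript discriminated union and walks it. The node names and fields are invented for illustration and are not the OXA schema; the point is only that a typed tree lets the compiler check that every node kind is handled.

```typescript
// Illustrative shapes only. These are NOT the OXA node definitions.
type Inline = { type: "Text"; value: string };

type Block =
  | { type: "Heading"; level: number; children: Inline[] }
  | { type: "Paragraph"; children: Inline[] }
  | { type: "Code"; language?: string; value: string };

interface Doc {
  type: "Document";
  children: Block[];
}

// Flatten one block to plain text. The switch is exhaustive over Block,
// so adding a new node kind becomes a compile error until it is handled.
function blockText(node: Block): string {
  switch (node.type) {
    case "Heading":
    case "Paragraph":
      return node.children.map((child) => child.value).join("");
    case "Code":
      return node.value;
  }
}

// Flatten a document to one line of text per block.
function plainText(doc: Doc): string {
  return doc.children.map(blockText).join("\n");
}
```

The adapter in step 6b would play the same role against the real schema: an unmapped node type should fail loudly rather than render as nothing.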

DocMaps

DocMaps is a JSON-LD framework for representing peer review and publishing history as a graph of editorial events — preprint posting, submission for review, individual evaluations, revisions, and VOR transitions. eLife/Sciety, EMBO, and bioRxiv have collectively published thousands of DocMaps.

OXA + DocMaps as source of truth

Together OXA and DocMaps replace the EPP server’s proprietary transformation API as the source of truth for both article content and publishing history. The "Use emerging standards" phase introduces the adapters and validates parity; the "Align tooling with project intent" phase consumes them directly in Astro content collections, with incremental build digests derived from DocMaps version identifiers.
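
A digest of this kind can be sketched as a hash over the sorted version identifiers: if the set of versions is unchanged, the digest is unchanged and the article can be skipped. The VersionRef shape and function names below are assumptions for illustration, not the DocMaps data model or the Astro loader API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal view of one article version taken from a DocMap.
interface VersionRef {
  versionIdentifier: string;
  published?: string; // ISO date, when known
}

// Stable digest: independent of the order in which versions are listed.
function digestFor(versions: VersionRef[]): string {
  const stable = versions
    .map((v) => `${v.versionIdentifier}@${v.published ?? ""}`)
    .sort()
    .join("|");
  return createHash("sha256").update(stable).digest("hex");
}

// An article needs rebuilding when it is new or its digest has changed.
function needsRebuild(previousDigest: string | undefined, versions: VersionRef[]): boolean {
  return previousDigest !== digestFor(versions);
}
```

A digest built only from version identifiers will not notice an in-place content correction to an existing version, so the scheduled full rebuild from step 5d remains a useful backstop.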