| Google Books Ngram Viewer | Frequency time series | English corpora through 2022; queried at smoothing 0 before local transforms | Google Books Ngram terms; attribution required |
| Project Gutenberg | Public-domain context text | Selected public-domain books, mainly eighteenth to early twentieth century | Project Gutenberg License; public-domain status varies outside the US |
| Library of Congress / Chronicling America | Historical newspaper evidence | Digitized US newspapers, chiefly 1770s-1960s depending on collection availability | Library of Congress rights statements; item-level rights vary |
| Wikimedia / Wikinews / MediaWiki APIs | Modern context and attention signals | Contemporary page, article, and context metadata where relevant | CC BY / CC BY-SA family; project-specific terms apply |
| Lexical references | Attestation and sense-history checks | OED candidate checks, Online Etymology Dictionary, Wiktionary, Merriam-Webster, Cambridge | Publisher-specific; entries are not reproduced |
| Policy, clinical, and technical references | Domain context anchors | EU AI Act, GDPR/ICO, FTC/NIST/OECD, PubMed/MeSH, WHO/NIMH, APA/DSM-history pointers, Stanford HAI and related pages | Source-specific; used as citation targets and metadata, not republished text |
| Public law and human-rights repositories | Legal and rights anchors | Wikisource, CourtListener, Justia/Oyez, Cornell Wex, NY Senate, UN, ECHR, EUR-Lex, eCFR, govinfo, DOJ/HHS, FTC, OECD, CPPA and related public pages | Source-specific; legal text, summaries, and court opinions are cited or paraphrased, not redistributed as a corpus |
| Geographic and demographic context sources | Aggregate context signals | OpenAlex, GDELT, World Bank indicators, Our World in Data fallback values, Open-Elevation, and Google Trends availability checks | Source-specific; used only as aggregate metrics, metadata, or audit records for unavailable sources |