Quick Start

Type (or cut and paste) the URL of the page you want to validate into the text field on the form, then press the "Check now" button.


Calling/Linking to the Validator

You can link directly to the Validator home page, or you can call the Validator CGI program. The home page is at the moment (and for the foreseeable future) and the CGI program can be reached at

What kind of checks does ArchiveReady perform?

ArchiveReady checks several website attributes, such as:

  1. Hypertext validity and format (HTML and CSS validation),
  2. Page content structure,
  3. Compliance with open standards,
  4. Hyperlink validity,
  5. robots.txt contents,
  6. sitemap.xml contents and validity,
  7. RSS feed structure.
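
As an illustration of what a check such as item 6 might involve, here is a minimal sketch that inspects a sitemap for well-formedness and missing entries. It is not ArchiveReady's actual implementation: the function name check_sitemap, the sample document, and the specific rules applied are assumptions made for this example. Only the namespace URI comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_text):
    """Return a list of problems found in a sitemap (empty if none).

    Hypothetical helper: the rules below are illustrative only.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    # This sketch only accepts a plain <urlset>, not a sitemap index.
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    for index, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        text = (loc.text or "").strip() if loc is not None else ""
        if not text:
            problems.append(f"<url> entry {index} has no <loc>")
        elif not text.startswith(("http://", "https://")):
            problems.append(f"<url> entry {index} has a non-HTTP <loc>")
    return problems


# Made-up sample: the second entry is missing its <loc> element.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/</loc></url>
  <url><lastmod>2013-01-01</lastmod></url>
</urlset>"""

print(check_sitemap(SAMPLE))
```

The sketch parses the document from a string, so it runs without network access. A real check would first fetch sitemap.xml from the site being tested.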

What do people say about web archiving?

Web archiving is the process of collecting portions of the World Wide Web and ensuring the collection is preserved in an archive, such as an archive site, for future researchers, historians, and the public.

To enable the archiving of a site by the Portuguese Web Archive, it is fundamental that the site presents a crawler-friendly homepage. (Source: Portuguese Web Archive Crawler)

Heritrix (ExtractorJS) has trouble finding links that are not hardcoded strings in JavaScript. (Source: Heritrix Known Issues)

If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site. (Source: How Googlebot sees your webpages)
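
The point made in the quotes above can be illustrated with a small sketch: a client that only reads static markup finds links written directly in the HTML, but not links that a script assembles at run time. The LinkCollector class and the sample page are made up for this example. This is a deliberately simple stand-in and does not reproduce how Heritrix or Googlebot actually extract links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in static markup.

    Hypothetical helper for illustration; it never executes scripts.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up page: one static link, one link built by JavaScript.
PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <script>
    var section = "news";
    document.write('<a href="/' + section + '/index.html">News</a>');
  </script>
</body></html>
"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)  # only the static link is found: ['/about.html']
```

The "News" link exists only after the script runs, so a parser that treats script content as opaque text never sees it.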

The content of your robots.txt file tells search engine crawlers how they should visit your site. (Source: Google Webmaster Guidelines)
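
To see how robots.txt rules are interpreted, the sketch below uses Python's standard urllib.robotparser module on an in-memory file. The rules, the crawler name ExampleArchiveBot, and the example.org URLs are all invented for this example, and the helper allowed is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: everyone is kept out of /private/,
# and one hypothetical crawler is blocked from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleArchiveBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """Return True if the parsed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


print(allowed("OtherBot", "https://example.org/index.html"))
print(allowed("OtherBot", "https://example.org/private/data.html"))
print(allowed("ExampleArchiveBot", "https://example.org/index.html"))
```

With these rules, a generic crawler may fetch the homepage but not anything under /private/, while ExampleArchiveBot may fetch nothing at all. A site that wants to be archived should make sure its rules do not exclude the archive's crawler in this way.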

What technologies are used to build ArchiveReady?

ArchiveReady is built with Python and libraries such as requests and Beautiful Soup. Nginx and uWSGI are also used. Everything is written in Vim.


