check_site

synopsis: Site validation spider

A site validator which uses webtoolbox.clients.Spider to crawl an entire site, checking for bad links and 404s and, optionally, validating the HTML. It generates either text or HTML reports and can also produce lists of site URLs for use with load-testing tools like tornado_bench or wk_bench.

--help
Display all available options and full help
-v
--verbosity
Increase the amount of information displayed or logged
--validate-html
Process all HTML using HTML Tidy and report any validation errors
--format=REPORT_FORMAT
Generate the report as HTML or text
--report=REPORT_FILE
Save report to a file instead of stdout
--skip-media
Skip media files: <img>, <object>, etc.
--skip-resources
Skip resources: <script>, <link>
Skip links whose URL matches the specified regular expression
--save-page-list=PAGE_LIST
Save a list of URLs for HTML pages in the specified file for use with a tool like tornado_bench or wk_bench
--save-resource-list=RESOURCE_LIST
Save a list of URLs for page resources in the specified file
--log=LOG_FILE
Write log output to the specified file instead of stderr
--simultaneous-connections=2
Adjust the number of simultaneous connections which will be opened to the server
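
The options above can be combined in a single run. The following Python sketch only assembles and prints a command line; it does not execute anything. It assumes that the tool is installed as an executable named check_site and that the start URL is passed as the final positional argument, neither of which is stated on this page. The helper name build_check_site_command, the example URL, and the file names are hypothetical.

```python
import shlex


def build_check_site_command(start_url, report_file=None, page_list=None,
                             validate_html=False, skip_media=False):
    """Assemble an argument list for a check_site run.

    Only flags documented in the option list above are used.
    """
    command = ["check_site"]  # assumption: the executable is on PATH under this name
    if validate_html:
        command.append("--validate-html")
    if skip_media:
        command.append("--skip-media")
    if report_file:
        command.append(f"--report={report_file}")
    if page_list:
        command.append(f"--save-page-list={page_list}")
    # Assumption: the site to crawl is given as a positional argument.
    command.append(start_url)
    return command


command = build_check_site_command(
    "http://example.com/",
    report_file="report.txt",
    page_list="pages.txt",
    validate_html=True,
    skip_media=True,
)

# Print a shell-quoted form that can be reviewed before running it by hand.
print(shlex.join(command))
```

Building the arguments as a list rather than a single string keeps each option separate, so file names containing spaces would be quoted correctly if the list were later handed to subprocess.run.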
