check_site
| synopsis: | Site validation spider |
A site validator which uses webtoolbox.clients.Spider to crawl an entire
site, checking for bad links and 404s and, optionally, validating HTML. It
generates either text or HTML reports and can also produce lists of
site URLs for use with load-testing tools like tornado_bench or
wk_bench.
-
--help
- Display all available options and full help
-
-v
-
--verbosity
- Increase the amount of information displayed or logged
-
--validate-html
- Process all HTML using HTML Tidy and
report any validation errors
-
--format=REPORT_FORMAT
- Generate the report as HTML or text
-
--report=REPORT_FILE
- Save report to a file instead of stdout
-
--skip-media
- Skip media files: <img>, <object>, etc.
-
--skip-resources
- Skip resources: <script>, <link>
-
--skip-link-re=SKIP_LINK_RE
- Skip links whose URL matches the specified regular
expression
-
--save-page-list=PAGE_LIST
- Save a list of URLs for HTML pages in the specified
file for use with a tool like tornado_bench or wk_bench
-
--save-resource-list=RESOURCE_LIST
- Save a list of URLs for page resources in the
specified file
-
--log=LOG_FILE
- Write log output to the specified file instead of stderr
-
--simultaneous-connections=2
- Adjust the number of simultaneous connections which will be opened to the
server
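
Before passing an expression to --skip-link-re, it can help to try it against a few URLs from the site. The Python sketch below does that. The sample URLs, the pattern, and the helper name links_to_skip are hypothetical. The use of re.search (a match anywhere in the URL) is an assumption, and check_site itself may apply the expression differently.

```python
import re


def links_to_skip(pattern, urls):
    """Return the URLs that the given regular expression would match.

    Assumption: matching uses re.search, so the pattern may match
    anywhere in the URL. check_site's actual matching rule may differ.
    """
    compiled = re.compile(pattern)
    return [url for url in urls if compiled.search(url)]


# Hypothetical sample URLs, for illustration only.
sample_urls = [
    "http://example.org/",
    "http://example.org/about/",
    "http://example.org/admin/login",
    "http://example.org/logout",
]

# Candidate value for --skip-link-re: skip admin pages and the logout link.
skipped = links_to_skip(r"/(admin/|logout$)", sample_urls)
print(skipped)
```

If the printed list contains only the links that should be excluded, the same expression is a reasonable candidate for SKIP_LINK_RE. Quote it on the command line so the shell does not interpret characters such as | and $.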