The meaning of the first three optional parameters is exactly the same
as in the Allow command.
Use this command to scan an HTML page for links ("href" attributes) without
indexing the page's contents. It applies to pages whose URLs match (or do
not match) the given arguments.
HrefOnly commands have a global effect across the whole configuration file.
# For example, when indexing large mailing list archives, the index and
# thread index pages (such as mail.10.html, thread.21.html, etc.) should be
# scanned for links but should not be indexed themselves:
HrefOnly */mail*.html */thread*.html
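#
# A further sketch, not taken from the original text: assuming the optional
# parameters behave as they do for Allow and that a "Regex" match type with
# POSIX extended syntax is available (both are assumptions to verify against
# the Allow description), the same pages might be selected with a single
# regular expression instead of two wildcard patterns:
HrefOnly Regex \/(mail|thread)\.[0-9]+\.html$
# Unlike the wildcard form, this pattern requires a numeric part in the file
# name, so a page such as mailbox.html would still be indexed normally.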