Description
HoldBadHrefs defines how long to keep
unavailable documents (those with a bad status, e.g.
404 Not Found or 503 Service Unavailable)
before deleting them from the database.
When indexer finds that a remote host is down,
the documents from that site are not deleted from
the database immediately, and search.cgi keeps using the previously
indexed content of these documents. However, if the site does not
respond for a long period of time (e.g. a month), it is usually safe to
remove its documents from the database.
The default value is 0, which means that unavailable
documents are never deleted from the database automatically
(for better crawling performance).
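As an illustration, the fragment below is a minimal sketch of how this command might appear in indexer.conf. The value 30d is an assumption chosen to match the "one month" case above, and it assumes the Period time format accepts a number followed by a d (days) suffix. Check the Period description before relying on it.

```
# Keep documents with a bad status (e.g. 404, 503) for about a month
# before indexer deletes them from the database.
# "30d" is an assumed value in the Period time format.
HoldBadHrefs 30d

# Default behaviour: never delete unavailable documents automatically.
#HoldBadHrefs 0
```

A non-zero hold period trades some crawling performance for automatic cleanup. With the default of 0, cleanup is left to manual runs of indexer.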
Note:
You can periodically delete bad documents from the database manually
by running indexer with the -s command line parameter
(status limit), for example: indexer -Cw -s404.
See Period for the time format description.