mnoGoSearch is an open source, full-featured, SQL-based web search engine. It consists of two parts. The first is the indexing program, indexer, which follows HTML hyperlinks and stores information about the documents it finds in the database. The second is a CGI web front-end, search.cgi, which displays an HTML search form and the search results in the browser, using the information collected by indexer.
Note: PHP and Perl front-ends are also available.
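The two-part design described above (a crawler that writes to a database and a front-end that only reads from it) can be illustrated with a small sketch. This is not mnoGoSearch code and does not reflect its database schema: the PAGES dictionary, the one-table "word" layout, and the index and search functions are all hypothetical, chosen only to show how the two halves can communicate purely through SQL storage.

```python
import re
import sqlite3
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML body (stands in for HTTP fetches).
PAGES = {
    "http://example.test/": '<a href="http://example.test/a">A</a> welcome page',
    "http://example.test/a": "search engines index documents",
}

class LinkAndText(HTMLParser):
    """Collects hyperlink targets and visible text from one document."""
    def __init__(self):
        super().__init__()
        self.links, self.text = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += [v for k, v in attrs if k == "href" and v]

    def handle_data(self, data):
        self.text.append(data)

def index(db, start_url):
    """Toy 'indexer': walk links breadth-first and store (word, url) rows."""
    db.execute("CREATE TABLE IF NOT EXISTS word (word TEXT, url TEXT)")
    queue, seen = [start_url], set()
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        parser = LinkAndText()
        parser.feed(PAGES[url])
        words = set(re.findall(r"\w+", " ".join(parser.text).lower()))
        for w in words:
            db.execute("INSERT INTO word VALUES (?, ?)", (w, url))
        queue.extend(parser.links)
    db.commit()

def search(db, term):
    """Toy 'front-end': read-only lookup against what the indexer stored."""
    rows = db.execute(
        "SELECT DISTINCT url FROM word WHERE word = ? ORDER BY url",
        (term.lower(),),
    )
    return [r[0] for r in rows]

db = sqlite3.connect(":memory:")
index(db, "http://example.test/")
```

Calling search(db, "documents") after indexing returns the one URL whose text contains that word. Because the search side never touches the crawler, the two can run as separate processes sharing only the database, which is the property the description above relies on.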
The first public version of mnoGoSearch was released in November 1998. The search engine was named UDMSearch until October 2000, when the project was acquired by Lavtech.Com Corp. and renamed mnoGoSearch.
The latest changelog can be found on our website.
The main mnoGoSearch features are:
MySQL, PostgreSQL, SQLite, Mimer, Virtuoso, Interbase, Oracle (see the Section called Oracle notes in Chapter 7), MS SQL, DB2 (see the Section called IBM DB2 notes in Chapter 7), Sybase, InterSystems Cache databases can be used as storage. mnoGoSearch can also work with various ODBC libraries: iODBC, unixODBC, EasySoft ODBC-ODBC bridge.
HTTP proxy support.
NNTP support (news:// and nntp:// URL schemes).
HTDB virtual URL scheme support. You can build an index and search through the large text fields of an SQL database and thus use mnoGoSearch as an external fulltext search solution in your database applications.
Built-in parsers for HTML, XML, text, RTF, DOCX, message/rfc822 (*.eml and *.mht) and MP3 file types.
External parser support for any other document types.
Basic authorization support to index password protected HTTP servers.
Proxy authorization support.
Reentrant operation. You can run multiple indexing and searching processes at the same time, even against the same database. Multi-threaded crawling and search are also supported.
Robots exclusion standard support, including <META NAME="robots" content="...">, <a rel="nofollow">, the X-Robots-Tag HTTP header and robots.txt (with the * and $ patterns).
Web front-ends in C (CGI), PHP, and Perl.
You can embed search into your own application with the help of a C API library.
Boolean query language.
Result ordering by relevance, popularity rank, or modification time, as well as user-defined ordering.
Fuzzy search: word forms (stemming), synonyms, substrings, dehyphenation, transliteration, accent-insensitive search.
Support for most modern character sets.
HTML templates to customize search results easily.
Advanced search options such as modification time limits and document type limits.
Phrase segmentation for Chinese and Japanese.
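The robots.txt item in the list above mentions the * and $ patterns. As a minimal sketch of how such patterns are commonly interpreted (assuming the widely used convention that * matches any run of characters, a trailing $ anchors the end of the path, and a rule without $ is a prefix match), the following translates a rule path into a regular expression. The function names are hypothetical, and this is not taken from the mnoGoSearch sources.

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed convention: '*' matches any sequence of characters and a
    trailing '$' requires the path to end there; otherwise the rule is
    treated as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def path_matches(pattern, path):
    """Return True if the URL path falls under the given rule pattern."""
    # re.match anchors at the start, which gives the prefix semantics.
    return robots_pattern_to_regex(pattern).match(path) is not None
```

Under these assumptions, a rule such as /private/*.pdf$ covers /private/a/b.pdf but not /private/a.pdf?x=1, while a plain /tmp rule covers every path beginning with /tmp.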