mnoGoSearch for Windows Version History
Changes in 3.2.42
(12 April 2007)
- Minor indexing performance improvements were made.
- Bug#1016 "Indexer is selecting wrong Content-Type" was fixed.
- Bug#1024 "Clear database limitations do not work: error ORA-01795" was fixed.
- Bug#1044 "-Ewordstat: incorrect unicode sequence" was fixed.
- Bug#1110 "'invalid UTF-8 byte sequence detected' when INSERT INTO dictXX"
was fixed. This error happened when indexing into PostgreSQL with DBMode=multi.
The "intag" column type was changed from TEXT to BYTEA in the tables
"dict00".."dictFF".
- Bug#1182 "Indexer crashes with -a -y 'content/type'" was fixed.
- Bug#1398 "DateFactor does not work with DBMode=blob" was fixed.
- Bug#1427 "ORA-01785: maximum number of expressions in a list is 1000" was fixed.
- Bug#1436 "Cannot run -Ewordstat, ORA-01400: cannot insert NULL" was fixed.
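Bug#1024 and Bug#1427 both mention Oracle's limit of 1000 expressions in a single list. A common workaround, sketched here in Python, is to split a long id list into batches. This is an illustration only and is not taken from the mnoGoSearch sources; the helper names are hypothetical, and the "url" table and "rec_id" column are used purely as example names.

```python
from typing import Iterator, List, Sequence

ORACLE_MAX_IN_LIST = 1000  # Oracle rejects longer IN lists


def chunked(ids: Sequence[int], size: int = ORACLE_MAX_IN_LIST) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def delete_statements(ids: Sequence[int]) -> List[str]:
    """Build one DELETE per batch; table and column names are illustrative."""
    return [
        "DELETE FROM url WHERE rec_id IN (%s)" % ",".join(str(i) for i in batch)
        for batch in chunked(ids)
    ]
```

With 2500 ids this produces three statements, none of which exceeds the limit.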
Changes in 3.2.41
(03 February 2007)
- DBMode=blob is now supported with Firebird/Interbase.
- The "UserScore" and "UserScoreFactor" search.htm commands were added.
These commands allow mixing the score calculated by mnoGoSearch with
a user-defined score. The "us" search.cgi parameter was added to
choose which UserScore to use for the current search session.
- Hindi language map was added. Thanks to Yannick LE NY for the contribution.
- "Create fast search index" performance improvements for DBMode=blob
were made: it is now up to two times faster, depending on the
database. Converting now uses fewer memory allocations and
memory moves, and also utilizes the DISABLE/ENABLE KEYS technique
with MySQL when writing to the table "bdict".
- The "DEFAULT" clause was removed from all BLOB/TEXT fields
in MySQL create scripts, to avoid errors when running with
mysqld in "strict" mode.
- The "ServerTable" command is now ignored when loading search.htm.
This is useful when ServerTable/DBAddr commands are written
in a separate file which is included from both indexer.conf and
search.htm. Thanks to Michael Hanselmann for the patch.
- The search cache now respects the "u" variable.
Thanks to Michael Hanselmann for the patch.
- search.cgi now detects and reports missing template
ENDIF and ENDWHILE operators. Previously it fell into an endless loop.
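The missing ENDIF/ENDWHILE detection can be pictured as a stack-based balance check. The Python sketch below is not the search.cgi implementation; it assumes a simplified tag syntax (openers written as <!IF ...> or <!WHILE ...>, closers as <!ENDIF> or <!ENDWHILE>) and ignores every other template operator.

```python
import re
from typing import List

# Simplified assumption: only plain IF/WHILE openers and their closers count.
TAG = re.compile(r"<!(ENDIF|ENDWHILE|IF|WHILE)\b", re.IGNORECASE)
CLOSER_FOR = {"IF": "ENDIF", "WHILE": "ENDWHILE"}


def find_unclosed(template: str) -> List[str]:
    """Return error messages for unbalanced IF/WHILE blocks."""
    errors: List[str] = []
    stack: List[str] = []  # expected closers, innermost last
    for match in TAG.finditer(template):
        name = match.group(1).upper()
        if name in CLOSER_FOR:
            stack.append(CLOSER_FOR[name])  # opener: remember its closer
        elif stack and stack[-1] == name:
            stack.pop()                     # closer matches innermost opener
        else:
            errors.append("unexpected " + name)
    # Anything left on the stack was opened but never closed.
    errors.extend("missing " + closer for closer in reversed(stack))
    return errors
```

Reporting the leftover stack entries instead of continuing to scan for a closer is what avoids the endless loop described in the note.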
3.2.40.1
10 November 2006
- Fixed that cookies with path "/foo" matched neither "/foobar" nor
"/foo/bar.html". Only cookies with trailing slashes in path (e.g. "/foo/")
worked as expected.
- Fixed that section references like 'title:word1 body:word2' didn't work
in DBMode=single and DBMode=multi.
- Bug#1142 "indexer gives PG errors when indexing" was fixed.
- Fixed that substring search with DBMode=blob returned extra empty
documents in some cases.
- Fixed that "fl" was ignored with "ul" or date limit specified at the
same time.
- Fixed that indexer crashed with "UseCookies yes" in some cases.
- Bug#1300 "Low ranking with GroupBySite=yes" was fixed.
- A fix in the MySQL driver was made to make indexer work with mysql-3.23 again:
it no longer uses SQL syntax that appeared in mysql-4.0 or later (a bug since 3.2.38).
- Fixed that search.cgi crashed because of stack overflow in some cases.
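The cookie path fix above implies plain prefix matching. A minimal Python sketch of that rule follows; the function name is hypothetical, and the prefix rule is inferred from the note rather than from the indexer sources. Later cookie specifications define stricter path matching.

```python
def cookie_path_matches(cookie_path: str, request_path: str) -> bool:
    """Plain prefix match, as the release note implies (an assumption)."""
    return request_path.startswith(cookie_path)
```

Under this rule a cookie with path "/foo" applies to both "/foobar" and "/foo/bar.html", while a cookie with path "/foo/" applies only to the latter.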
3.2.39.1
05 June 2006
- "Locale" search.htm command was added. Month and day names in $(Last-Modified)
are now printed according to the desired locale. Example: "Locale french".
- Search query syntax now understands section name references.
For example, "title:web body:server" will find documents having "web"
in title and "server" in body.
- Automatic phrase search was implemented for complex words having dots,
dashes, underscores, commas and slashes (-_.,/) as delimiters between
word parts. For example, `max_allowed_packet' now automatically searches
for the phrase `"max allowed packet"', not just for three separate words.
- Better excerpt generation for phrase search. Separate words are not included
in excerpts anymore.
- Minor query tracking performance improvements were made for MySQL,
Oracle, Mimer and Interbase.
- "CREATE TABLE url" and "CREATE TABLE urlinfo" MySQL statements were fixed.
Now these tables can be bigger than 4 GB.
- Fixed that "Custom HTTP headers" on the "Common" tab didn't work.
- Fixed that path and file names were not URL-unescaped when talking
to an FTP server.
- Bug#769 "Missing Alias var in clone template" was fixed.
- Bug#1105 "Column 'rec_id' in field list is ambiguous" was fixed.
- Bug#1099 "Illegal using sequences in Oracle" was fixed.
- Fixed that the variables declared using ReplaceVar were not converted
from LocalCharset to BrowserCharset with an empty search query typed.
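The automatic phrase search for complex words can be sketched as a split on the listed delimiters. The Python below is an illustration under that reading of the note, not the query parser itself; the function name is hypothetical.

```python
import re

DELIMITERS = re.compile(r"[-_.,/]")  # the delimiter set named in the note


def as_phrase(word: str) -> str:
    """Turn a complex word into a quoted phrase; leave simple words alone."""
    parts = [p for p in DELIMITERS.split(word) if p]
    if len(parts) < 2:
        return word
    return '"' + " ".join(parts) + '"'
```

For example, max_allowed_packet becomes the phrase "max allowed packet", while a word without delimiters is returned unchanged.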
3.2.38.1
15 March 2006
3.2.37.1
17 February 2006
- Fixed that crosswords didn't take into account the "fl"
parameter.
- Fixed that an empty "fl" value didn't return any results,
instead of returning all matching results without filtering.
- Fixed that search unnecessarily loaded the "fl" limits
with an empty search query, e.g. when switching between "simple" and
"extended" search form.
- Fixed that the search cache did not take into account the
"fl" parameter, so one could get the same result with different limits
if Cache is on.
- Fixed that the "de" parameter didn't work inclusively, e.g.
de=01/01/2006 included only those documents modified before
"01/01/2006 00:00:00", instead of "01/01/2006 23:59:59".
- Several minor memory leaks were fixed.
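The inclusive "de" behaviour amounts to comparing against the last second of the given day rather than its first. A small Python sketch, assuming a dd/mm/yyyy field order (the example value 01/01/2006 does not disambiguate it) and a hypothetical function name:

```python
from datetime import datetime, time


def inclusive_end(de: str) -> datetime:
    """Parse a 'de' value and return the last second of that day.

    The dd/mm/yyyy field order is an assumption.
    """
    day = datetime.strptime(de, "%d/%m/%Y").date()
    return datetime.combine(day, time(23, 59, 59))
```

A document modified at noon on that day then satisfies `modified <= inclusive_end(de)`, which the old midnight boundary would have excluded.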
3.2.36.1
02 February 2006
- indexer now supports DBMode=blob, which is now the fastest
DBMode for both indexing and searching.
- It's now possible to use variables in an external parser
command line. This example passes URL and TAG values in the
parser command line:
Mime "text/pdf" "text/plain" "/path/to/parser -u ${URL} -t ${TAG}"
See the list of all available variables in "indexer -v6" output,
in the lines beginning with "Response." prefix.
- "SQLWordForms sql" search.htm command was added. It introduces
a new fuzzy search method that loads synonyms or
word forms from the SQL database. It can be used as a faster
replacement for Synonym and Ispell fuzzy search methods.
- Synonym files now understand "Mode: reverse" and "Mode: oneway"
commands to change word expansion behaviour between
"all words expand to all words on the same line" and
"only the leftmost word expands to other words on the same line".
- "NumWordFactor num" search.htm command was added,
where num is between 0 and 255. It specifies how much the number
of found words in a document affects its final score.
255 means maximum effect, 0 means ignore the count of found words.
- "MinCoordFactor num" search.htm command was added.
Use this command to give more score for those documents
having the first found word closer to the beginning
of the document. Use with a number between 0 and 255.
The default value is 0, which means no effect.
- "URLDataThreshold num" search.htm command was added. It can
improve search performance with DBMode=blob for queries
returning a small number of results (not more than several hundred).
If search returns fewer than "num" documents,
full URL information is not loaded from the "bdict" table
and the "url" table is used instead. The default value is 0,
which means always read URL data from the "bdict" table.
Find the number which is good for your installation experimentally.
- "UseNumericOperators yes/no" search.htm command was added.
When set to "yes", the "<" and ">" signs are treated as numeric
comparison operators, e.g. "<100" finds all documents
which have numbers less than 100 in their body or title or
other sections according to the "wf" settings.
Default value is "no", i.e. numeric operators are ignored.
- New character set name aliases were added: "armscii8", "koi8r",
"koi8u" and "ujis", for MySQL names compatibility.
- Fixed that XML character set declaration was not processed, e.g.:
<?xml version="1.0" encoding="utf-8"?>
- Fixed that query tracking didn't work with Oracle, DB2,
Firebird, Mimer, Sybase (Bug#742).
- Fixed that "crossdict" table wasn't created for Oracle, DB2,
Mimer and Interbase/Firebird (Bug#748).
- Fixed that $(PerSite) value was calculated incorrectly
with several DBAddr search.htm commands.
- Fixed that template operators inside a HTML comment
were interpreted instead of being printed just as
a comment part (Bug#708, part2).
- Fixed that <!EREG> didn't work with "<" and ">" characters
inside REPLACE attribute (Bug#1010).
- Fixed that <META NAME="ROBOTS" CONTENT="NOINDEX"> didn't
prevent indexing of the url.file, url.path, url.site,
url.proto sections (Bug#679).
- indexer now chooses character set value in this
order: "Content-Type" HTTP header, "Content-Type" META tag,
RemoteCharset value from indexer.conf. Previously RemoteCharset
was incorrectly selected in the first instance (Bug#575).
- Fixed that "Sun, 6 Nov 1994 08:49:37 GMT" date format was
not recognized when indexing a NEWS server (Bug#694).
- Syntax error in PostgreSQL trigger was fixed (Bug#784).
- Fixed that search.cgi could crash when running with DBMode=blob
in some cases. Thanks to Goga for proposing the fix.
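The character set selection order described above (HTTP header, then META tag, then RemoteCharset) can be sketched as a simple fallback chain. The Python below is illustrative only; the function names and the charset-extraction regular expression are assumptions, not the indexer's code.

```python
import re
from typing import Optional

CHARSET = re.compile(r"charset\s*=\s*\"?([\w.:-]+)", re.IGNORECASE)


def charset_of(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if any."""
    if not content_type:
        return None
    match = CHARSET.search(content_type)
    return match.group(1).lower() if match else None


def choose_charset(http_header: Optional[str],
                   meta_tag: Optional[str],
                   remote_charset: str) -> str:
    """Apply the documented priority: HTTP header, META tag, RemoteCharset."""
    return charset_of(http_header) or charset_of(meta_tag) or remote_charset
```

RemoteCharset is consulted only when neither the header nor the META tag names a character set.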
3.2.35.1
05 November 2005
- msvcrtd.dll is no longer required (Bug#927).
- A new "wtime" column was added into "qtrack" table
to store time spent for search, in milliseconds.
Everyone who uses the "trackquery" feature needs to add this
column (e.g. using ALTER TABLE) or recreate the "qtrack" table.
- IndexIf/NoIndexIf now understand variables, e.g.
the following command means not to index documents
having content type "text/plain" from the site 'site':
NoIndexIf "${URL}#${Content-Type}" "http://site/*#text/plain"
- indexer and search.cgi now load my.cnf file by default.
Use "DBAddr mysql://user:passwd@host/dbname/?MyCnfGroup=group"
to read options from the named group. If MyCnfGroup=no is
specified, then the option file is not loaded (Bug#771).
- "DateFactor number" search.htm command was added. Use with a
number in the range 0..255 to change effect of Last-Modified
of a document on its score. The default value is 0, which means
don't take Last-Modified into account. If DateFactor is set
to a non-zero value, then a more recent document gets a better
score than an older document with the same content.
- Indexer now treats the documents having "xml" and "rss" substrings
in Content-Type header as XML documents. E.g. "application/xml",
"application/rss" are now understood as XML as well.
Previously only the exact "text/xml" string worked.
- DBType=blob now works with PostgreSQL.
- "Deflate" DBAddr parameter was added into indexer.conf, e.g.
"DBAddr mysql://root@localhost/test/?DBType=multi&Deflate=yes".
With "Deflate=yes" specified, indexer compresses data
when converting with "indexer -Eblob", which makes
a smaller database size and faster search.
- It is possible to rewrite only URL data for DBMode=blob:
"indexer -Erewriteurl". It's useful for very quick rewrite
of URL data after adding "Deflate=yes", without touching
word information.
- CustomLog indexer.conf command was added to log to stdout using
a user defined format, e.g.:
"CustomLog '[${PID}] ${CurrentTime} ${Status} ${URL} ${Content-Type}'".
- Several minor search performance improvements were made.
- Several bugs in "AlwaysFoundWord" were fixed.
- Fixed that loading URL data in "DBMode=blob" didn't work
on big endian platforms (e.g. MacOS X). As a result search
loaded data from "url" table, which was slow.
- Fixed that "Section url.file" and "Section url.path" didn't
work well when indexing FTP sites having national letters
in directory and file names (Bug#658). Directory and file
names (after %XX URL-unescaping) are considered to have the
same character set as the one specified in RemoteCharset
(or iso-8859-1 by default). A new indexer.conf command
"RemoteFileNameCharset" was added for the case when URL
character set is different from RemoteCharset.
- Fixed that MySQL-4.1 running in utf8 failed to create "qinfo"
table with "Specified key was too long" error (Bug#1041).
- Fixed that the "<!DOCTYPE...>" tag was removed from the template
(Bug#781, Bug#1026).
- Fixed that "<![CDATA[]]>" tags were not correctly processed
by XML parser.
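The IndexIf/NoIndexIf variable support above can be pictured as two steps: expand ${Name} placeholders from the document's properties, then match the result against a wildcard pattern. The Python sketch below uses the standard fnmatch module as a stand-in for the indexer's own wildcard matching, which is an assumption; the function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from typing import Mapping

VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand(template: str, doc: Mapping[str, str]) -> str:
    """Replace each ${Name} with the document's value (empty if unknown)."""
    return VARIABLE.sub(lambda m: doc.get(m.group(1), ""), template)


def no_index_if(template: str, pattern: str, doc: Mapping[str, str]) -> bool:
    """True when the expanded template matches the wildcard pattern."""
    return fnmatchcase(expand(template, doc), pattern)
```

With the note's example, a text/plain document under http://site/ expands to a string such as "http://site/readme.txt#text/plain", which matches "http://site/*#text/plain" and is therefore skipped.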
3.2.34.1
17 October 2005
- Fixed that changes in "Length" parameter on "Sections" tab disappeared after reopening indexer.conf and length was always set to the default value (256).
- Per session Cookie support was added, use new "UseCookie yes/no" indexer.conf command to switch on/off.
- "sybase" database type was added. e.g. sybase://sa@localhost/db/. Tested with ASE-12.5 with native ctlib as well as unixODBC interfaces.
- Relevancy improvements: "WordDistanceWeight number" search.htm command was added. Use with a number in the range 0..255 to change effect of distance between the searched words on the resulting score. The default value is 255, which means maximum effect of word distance.
- Relevancy improvements: "DocSizeWeight number" search.htm command was added. Use with a number in the range 0..255 to give lower score to a longer document and higher score to a shorter document if both documents contain the same number of found words. The default value is 255, which means maximum effect of document size.
- New "nfw" search.cgi parameter. It uses the same format as "fw". If all found words appear in only one section, then the resulting score becomes lower. It can be used, for example, to ignore spam in the KEYWORDS meta tag: if you use high "fw" and "nfw" values for the section corresponding to KEYWORDS, then the score will be high only if a word appeared in KEYWORDS and also in title/section, but not only in KEYWORDS.
- New "StrictModeThreshold number" search.htm command. If search returned fewer results than the given number, then search automatically switches from m=all mode (all words) to the less strict m=any mode (any word). Default value is 0, which means don't switch automatically to the less strict mode.
- "Cached Copy" now looks better for "text/vnd.wap.wml" (WAP documents).
- Language guesser now understands "cn" as synonym for "zh" to detect Chinese.
- "DefaultContentType" search.htm command was added. Helps when "Content-Type" header is not stored in the database and automatic guesser fails to detect a document type. Previously "text/plain" was assumed.
- search.cgi now can do Cyrillic->Latin and Latin->Cyrillic transliteration. New "tl=yes" search.cgi parameter was added to activate transliteration.
- Self-links (i.e. when a page has a link to itself) do not affect popularity rank anymore.
- It is possible to use phrase as a synonym now.
- Added "AlwaysFoundWord" search template command. It specifies dummy word that is always considered found.
- PgSQL driver has been slightly optimized.
- Several improvements to search template to be compatible with XHTML.
- Fixed that "<![CDATA[...]]>" entries didn't work well in search.htm.
- Fixed search.cgi crash, which showed up on Debian and Suse in some cases (Bug#1004).
- Fixed that after indexing with MinWordLength in indexer.conf phrase search didn't work properly.
- Fixed that search could split words into parts because of invoking Chinese/Thai segmenter in wrong cases.
- Fixed that search query and word statistics were displayed in LocalCharset instead of BrowserCharset when no documents were found.
- Fixed that search.cgi crashed if NumSections was smaller than actual number of sections stored in the database.
- Fixed minor bug in synonyms code. One wasn't able to use synonyms feature if there are less than three synonyms defined.
- Several stability and performance improvements were made.
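The StrictModeThreshold fallback described above can be illustrated with a toy in-memory index. This Python sketch only mirrors the described behaviour (try m=all, fall back to m=any when too few documents match); the data structures and function names are hypothetical.

```python
from typing import Dict, List, Set


def search(index: Dict[str, Set[str]], words: List[str], mode: str) -> Set[str]:
    """Toy search over a word -> document-ids index; mode is 'all' or 'any'."""
    postings = [index.get(w, set()) for w in words]
    if not postings:
        return set()
    if mode == "all":
        return set.intersection(*postings)
    return set.union(*postings)


def search_with_threshold(index: Dict[str, Set[str]], words: List[str],
                          strict_mode_threshold: int = 0) -> Set[str]:
    """Run m=all first; fall back to m=any when too few documents match."""
    found = search(index, words, "all")
    if strict_mode_threshold > 0 and len(found) < strict_mode_threshold:
        found = search(index, words, "any")
    return found
```

A threshold of 0 never triggers the fallback, matching the documented default.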
3.2.33.1
25 June 2005
- Japanese stoplist was added. Thanks to Alexander Sharapov.
- <!EREG> template operator was added.
- <!IFLE>, <!IFLT>, <!IFGE>, <!IFGT> template operators
were added for less-or-equal, less, greater-or-equal, greater numeric comparison.
- "Realm site <li>" now follows only links from the same site
as the current URL.
- $(CurrentTimestamp) and $(Last-Modified-Timestamp) search.htm
variables were added, representing the current date and a document
modification date in numeric (Unix timestamp) format.
- New "dstmp" search parameter was added. It can be used instead of dy/dd/dm.
- New "ExcerptStopword yes/no" search.htm command was added,
to choose whether stopwords should be highlighted in excerpts.
- MaxDocPerSite server setting was added.
- Relevancy improvements were made (better word distance
calculation, word count is taken into account now).
- Excerpt generating performance improvements were made.
- Fixed that entities like  didn't work with Big5 (bug#755).
- Fixed that <!INCLUDE> didn't work in 3.2.32.
- Fixed that indexer exited with "Duplicate error" message with PostgreSQL 8.0.
- Fixed that indexer could crash when processing a malformed BASE HREF tag.
- Fixed that search results were wrongly displayed
if search limits returned no documents in some cases.
- Fixed that a page was not removed from search index
in some cases even if it was already removed from site.
- Fixed that "Alias regex" didn't work in search.htm.
- Fixed that "Pro" version silently didn't start indexing
if some record in the list on "Server" tab contained a bad URL.
- Several stability improvements were made.
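The numeric comparison operators added in this release differ from string comparison, which is worth a short illustration. The Python sketch below maps the four operator names to numeric comparisons; it is not the template engine's code, and the evaluate helper is hypothetical.

```python
import operator

# Operator names from the release note, mapped to numeric comparisons.
COMPARATORS = {
    "IFLE": operator.le,
    "IFLT": operator.lt,
    "IFGE": operator.ge,
    "IFGT": operator.gt,
}


def evaluate(op: str, left: str, right: str) -> bool:
    """Compare two template values as numbers rather than as strings."""
    return COMPARATORS[op](float(left), float(right))
```

Compared as strings, "9" sorts after "10"; compared numerically, as here, 9 is less than 10. This matters for values such as the Unix timestamps exposed by $(CurrentTimestamp) and $(Last-Modified-Timestamp).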
3.2.32.1
03 April 2005
- Misspelled word suggestions were added. If a search
query didn't return any results, a "Did you mean: ..."
link is displayed.
- Faster loading of ispell dictionaries.
- HTMLENCODE function in template language was added.
- Only quotes are now accepted as phrase delimiters. Apostrophes
don't work as delimiters anymore.
- A number of speed and stability improvements were made.
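The release note does not say how the "Did you mean" suggestions are computed. As a stand-in, the Python sketch below uses difflib from the standard library to pick the closest word from a known vocabulary; both the technique and the function name are assumptions for illustration.

```python
from difflib import get_close_matches
from typing import List, Optional


def did_you_mean(query_word: str, vocabulary: List[str]) -> Optional[str]:
    """Return the closest vocabulary word, or None if nothing is similar."""
    matches = get_close_matches(query_word, vocabulary, n=1)
    return matches[0] if matches else None
```

A front-end would call this only when the original query returned no results, and render the suggestion as a link that repeats the search.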
3.2.28.1
17 December 2004
- Popularity Rank calculation was added.
- COM interface and ASP front-end were added.
- Fixed that a user defined User-Agent header was not sent by Indexer.
- Fixed that <!INCLUDE> template directive didn't work in some cases.
- Various minor fixes and improvements were made.
3.2.24.1
05 November 2004
- PDF, DOC, XLS and RTF plug-in components are now shipped
with the distribution.
- Several stability improvements were made.
- Better error reporting in search results.
3.2.16.1
05 May 2004
This release is built with the new generation mnoGoSearch engine 3.2,
which includes the following major features:
- Better storage engine: mnoGoSearch Lite now uses SQLite
(http://www.sqlite.org) engine to store its data,
which made the features previously available only in SQL
and Pro versions working in Lite version, including
incremental indexing, smart reindexing, subsection control.
New engine also makes Lite version suitable for bigger sites,
up to several tens of thousands of documents.
- Better internationalization: support for all modern widely
used character sets was added, including multibyte character
sets for Chinese, Japanese and Korean, as well as Unicode UTF-8.
- Better templates: search.htm language was significantly
improved. New features include conditional operators,
external includes, and simple/extended search modes.
- Cached copy: you can see a snapshot of each page
as it looked when indexer processed it, with search
words highlighted.
- Smart excerpts: search now fetches relevant excerpts
containing query words.
- Results ordering: it is now possible to choose relevancy/date
results order, as well as reverse order.
- Content encoding: indexer now supports gzip/compress/deflate
HTTP compression. It significantly reduces traffic for HTTP
servers supporting compression.
- Meta tags: you can define custom META tags you want to index.
Only Description and Keywords META tags were available in the
previous version.
- A lot of other improvements and enhancements were made.
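Handling compressed HTTP responses, as in the content encoding feature above, comes down to decoding the body according to the Content-Encoding header. The Python sketch below covers gzip and deflate with the standard library; "compress" is left out because the standard library has no decoder for it. The function name is hypothetical and this is not the indexer's code.

```python
import gzip
import zlib


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decode an HTTP body according to its Content-Encoding value."""
    encoding = content_encoding.strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            return zlib.decompress(body)  # zlib-wrapped stream
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    return body  # unknown or absent encoding: pass through unchanged
```

A client advertises support with an Accept-Encoding request header and applies a decoder like this one to the response.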
3.1.15.15
09 June 2003
- Several minor memory leaks were fixed
- Code optimization for faster indexing and searching
3.1.15.14
26 February 2003
- Users can now set special characters such as + - ! to be treated as parts of a word.
3.1.15.13
20 September 2002
- Sorting servers operation added
- Group servers deletion added
3.1.15.12
01 July 2002
- Search COM objects added
- ASP frontend added
3.1.15.11
24 January 2002
- Import of servers list from text file now possible
- Limitation on number of registered templates removed
- Various minor bug fixes
3.1.15.10
13 November 2001
- Template configuration fixed
- Introduced thread priority configuration (to lower CPU usage)
- Various minor bug fixes
3.1.15.9
12 October 2001
- IBM DB2 support added
- Indexing speed is much faster, especially with MSSQL
- The problem with running Pro version on some NT 4 machines is now fixed.
- Various minor bug fixes
3.1.15.8
8 October 2001
- Fixed several installation issues
- Updated manual
- Various minor bug fixes
3.1.15.7
6 September 2001
- FTP indexing now supported
- News (NNTP) indexing now supported
- Various minor bug fixes
3.1.15
1 August 2001
New mnoGoSearch Pro for Windows NT/2000 with NT Service for remote administration
and scheduling.
- Improved support for doc2txt parser.
- Intranet indexing using login/password now possible
- Indexer now supports Content-language META tag.
- Various bug fixes.
3.1.14
1 June 2001
- This version is based on mnoGoSearch 3.1.14 for Unix/Linux.
- Now in SQL version it is possible to import ispell dictionaries to database to speed up search.
- Fixed bug in Lite version that caused "Can't open dict file '...\dict.txt'" error message.
3.1.12.12
15 May 2001
- Added support for working with several templates;
- Category and Tag filtering bug fixed;
- A bug in MS SQL database creation script fixed.
3.1.12.11
03 May 2001
- net_error_delay value is now saved;
- Categories bug fixed;
3.1.12
24 April 2001
Try mnoGoSearch for Windows!