From DBSight Full-Text Search Engine/Platform Wiki
- SalesForce.com adapter to crawl and synchronize data.
- Virtual Site is added to allow department-specific sites on a single DBSight instance.
- The number of search results explained via debug option is now adjustable.
- Option to switch between default Lucene parser and ComplexPhraseQueryParser is added.
- Improve performance and remove the 255 length limit for CommaSemicolonExtendedAnalyzer.
- Fix display error when showing content via primary key column.
- Fix exception when using Synonyms and Reserved Words to analyze content.
- CommaSemicolonExtendedAnalyzer now works correctly during searching.
- Configurable Query Timeout is added to avoid waiting on non-responding database connections.
- Lucene 3.0 compatible.
- Freemarker template is added to Facet Search definition xml file.
- Facet Types Editing Window is added.
- Synonyms and Stop Words are configured individually.
- Avoid overwriting reserved, spell_check, stopwords, synonyms file.
- Configurable Reserved Words is added.
- Update POI library to process Word Doc, Excel, and PowerPoint files.
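The CommaSemicolonExtendedAnalyzer entries above concern fields whose values are delimited lists, such as tags. The real analyzer is a Java class and its exact rules are not given here. The following is only a minimal Python sketch, assuming tokens are separated by commas or semicolons, trimmed, and lowercased; the function name `comma_semicolon_tokens` is hypothetical.

```python
import re


def comma_semicolon_tokens(value, lowercase=True):
    """Split a delimited field value into trimmed tokens.

    Assumption (not taken from DBSight source): tokens are separated by
    ',' or ';', and spaces inside a token are kept, so a multi-word tag
    stays a single token. No maximum token length is applied.
    """
    tokens = []
    for part in re.split(r"[,;]", value):
        token = part.strip()
        if not token:
            continue  # skip empty pieces, e.g. those produced by ",,"
        tokens.append(token.lower() if lowercase else token)
    return tokens


print(comma_semicolon_tokens("Java, Full-Text Search;Lucene ,, DBSight"))
```

Keeping "Full-Text Search" as one token is what makes this style of analysis useful for tags: a search for the whole tag matches exactly, regardless of case.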
New Lucene 3.0 API
- Most existing customized Analyzers need to be adapted to the Lucene 3.0 API.
New Templating System
- One template can have several sub-pages, great for AJAX.
- Added partial scaffolding for search suggestion, tag cloud, etc.
- Released Javadoc of SearchResult.
- Better and clearer scaffolding. Experts can even create their own scaffold, or a partial scaffold, and reuse it.
- Support Facet search on numbers. Numbers can be grouped, for example, into configurable price ranges.
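To illustrate the idea of grouping numbers into configurable ranges for facet search, here is a small Python sketch. It is not DBSight's implementation: the function names, the half-open range convention, and the label format are all assumptions made for the example.

```python
from collections import Counter


def range_label(low, high):
    """Human-readable label for one range; None marks an open end."""
    if low is None:
        return f"under {high}"
    if high is None:
        return f"{low} and over"
    return f"{low} to {high}"


def facet_counts(values, boundaries):
    """Count values per half-open range [low, high).

    The first and last ranges are open-ended, so every value,
    including zero, falls into exactly one range.
    """
    if not boundaries:
        raise ValueError("at least one boundary is required")
    edges = [None] + sorted(boundaries) + [None]
    ranges = list(zip(edges[:-1], edges[1:]))
    counts = Counter()
    for value in values:
        for low, high in ranges:
            if (low is None or value >= low) and (high is None or value < high):
                counts[range_label(low, high)] += 1
                break
    return [(range_label(low, high), counts[range_label(low, high)])
            for low, high in ranges]


prices = [5, 12, 0, 49.99, 50, 120]
for label, count in facet_counts(prices, [10, 50]):
    print(label, count)
```

Changing the `boundaries` list is the "configurable" part: the same values can be regrouped without touching the data.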
Improved Database Crawling
- Batch mode to make subsequent queries more efficient
- Re-organized JDBC library structure to support more complicated JDBC drivers that need several files. Classpath handling is also better, so there is no crosstalk with libraries on the system classpath.
- Alternative Incremental Indexing SQL is now more flexible.
- Now you can select outside database!
- Added Search API for Java in remote procedure call style.
- Added Submit API to submit document, to make it instantly searchable, without waiting for scheduled indexing.
- Published the binary protocol between the searching client and DBSight
- Multiple scheduled operations for the same index, including incremental-indexing, re-creating index, spell checking index, remotely subscribed index etc.
- Adjusting SQL won't lose information of the fields (idea of Ken from Berkeley)
- Keep old database and scheduler settings when uploading new index configurations. Good for production, staging, testing, development instances, or synchronizing settings with indexing node and searching node. (idea of James from Costco)
- Much Easier to apply Tag Cloud, AJAX Search Suggestions.
- Remote Index Replication now includes incremental updates and spell check indexes.
- Configurable Email Notification for indexing processes.
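The batch mode for subsequent queries mentioned above can be pictured as follows. This is a hedged Python sketch using an in-memory SQLite database, not DBSight code: the table `item_tag`, the function `fetch_tags_batched`, and the batch size are assumptions chosen for the example. The point is that related rows for many main-query records are fetched with one query per batch instead of one query per record.

```python
import sqlite3


def fetch_tags_batched(conn, item_ids, batch_size=500):
    """Fetch tags for many items using one query per batch of ids."""
    tags = {item_id: [] for item_id in item_ids}
    for start in range(0, len(item_ids), batch_size):
        batch = item_ids[start:start + batch_size]
        # One "?" placeholder per id in this batch.
        placeholders = ",".join("?" for _ in batch)
        rows = conn.execute(
            "SELECT item_id, tag FROM item_tag "
            f"WHERE item_id IN ({placeholders}) ORDER BY item_id, tag",
            batch,
        )
        for item_id, tag in rows:
            tags[item_id].append(tag)
    return tags


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_tag (item_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO item_tag VALUES (?, ?)",
    [(1, "red"), (1, "large"), (2, "blue"), (4, "green")],
)
print(fetch_tags_batched(conn, [1, 2, 3, 4], batch_size=2))
```

With four ids and a batch size of two, this runs two queries rather than four; the saving grows with the number of records per batch.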
Other Features You Should Know
- Add AJAX Search Suggestions.
- Add statistics of top searches that return no results.
- Generate index-specific spell check dictionary.
- Add multiple schedules for the same index.
DBSight 2.x is currently backward compatible with 1.x, but it is better to upgrade now than later. Compatible means that when you download the index configuration, and even copy the index data, from 1.x to 2.x, search and indexing should still work in most cases after a DBSight restart. You will most likely still need to go through the configuration manually and verify that everything is right.
- IKAnalyzer is added for better support for international languages, especially Chinese, with a customizable dictionary.
- Added DBSight version to index configuration file.
- Incremental Indexing SQL improves performance by not requiring order by modified date.
- Sum and Avg for any field is added in facet search results.
- Old history data reading error is ignored.
- Paginating Postgres resultset is added.
- Configuration loading exception is fixed.
- Single Valued Facet warm up is added.
- Boost factor defaults to 1 if it's zero or negative.
- Compressed UnIndexed field type is added.
- UnStored index field now supports stop words.
- Thorough or Fast Deletion Detection choices are added.
- Action to list all configurations is created.
- MinIntegerValue and MaxIntegerValue are added to FilterColumn class.
- Added counter to make sure current search finishes during index hot swapping.
- Support for Lucene Query in Multi-Search mode is added.
- Support for numbers in Metaphone Analyzer and Double Metaphone Analyzer is added.
- Dynamically searchable columns are fixed for multiple-word searches.
- Sum and Average for each Facet are added.
- Merge Index only during Non-Peak hours is added.
- Scaffolding error if no results are found is fixed.
- Dynamically searchable columns are added to the Java search API.
- Option to dynamically disable facet search is added to the Java search API.
- Fix errors when re-creating period table after indexing stops abnormally.
- Spell check dictionary building option is added to scheduling.
- Scheduled jobs are properly cleaned after jobs are re-configured.
- Dynamically searchable columns is added.
- JSON Date is represented as a long number.
- Added support for time before 1970. Need to re-create index when upgrading.
- Shutdown hanging with tomcat is fixed.
- Fix Scaffolding error.
- Removed hard coded indexing log level.
- beta version of Google Map Scaffolding is added.
- Multi-Valued Facet is more memory-efficient.
- Server-Side table ajax-sorting query is fixed.
- JSONP scaffolding is added.
- Result-Per-Page partial scaffolding is added.
- PatternSyntaxException of Advanced Incremental SQL is fixed.
- NarrowBy in boolean operator OR mode is fixed.
- Boosting a document by a value is fixed.
- SqlServer Driver name is fixed.
- Spell Checking suggestion is avoided if a field query is included.
- User Input value is added to SearchResult.
- Configurable number of NarrowBy (Facet Search) choices is added.
- Page for empty results is fixed.
- Added Multiple-Choice Query
- Multiple-Choice Partial Scaffolding is added.
- Suggest-as-you-type now works for 1 character also.
- Column-specific partial scaffolding is added so that it can be applied several times.
- Scaffolding now can be applied to sub directories.
- Performance for spell checking dictionary generation is greatly improved for large indexes.
- Performance for deletion detection is greatly improved.
- Different versions of same jdbc drivers can be used.
- Scaffolding for rss syndication is fixed.
- Fixed jQuery.js-included-twice problem.
- Added support for negative-only queries.
- Added arbitrary directory structure support for customized scaffolding.
- Added hierarchical date search.
- Added support for more memory efficient duplication checking for searching outside database in incremental mode.
- Added support for direct lucene query in Search API.
- Smartly adjust default index field types based on column type.
- Filtered column link also removes child filtered columns.
- Delete action also deletes buffer index entries.
- Improve performance by skipping empty buffer index.
- Added numeric range search with inclusive/exclusive on both ends.
- Clearer Deletion SQL choice with incremental indexing.
- Multiple schedules can be added/changed/deleted.
- Updated to latest Lucene library
- Buffer Index updates only the submitted fields of documents.
- Fixed flushing buffer index entries.
- Added configurable email notification for indexing processes.
- Added Search Suggestions, including single words and phrases.
- Added AJAX Search Suggestions partial scaffolding.
- Index Cluster mode supports subscribing to incremental index changes and spell-check index changes
- Simplifying tag cloud generation via partial scaffold.
- Added default IBM DB2 Type 2 and Type 4 configuration.
- Added option to schedule deletion detection for incremental indexing.
- Added job scheduling for spell check dictionary.
- Added customizable fetcher for data sources other than database.
- Fix possible format error for period table.
- Fix problems handling negative queries like "night -title:sky".
- Added multiple schedules for the same index.
- Include counting integer zeros for facet searching.
- Improve numeric facet search by adding open-ended ranges.
- Fix bug related to reading float numbers into integers.
- Added caching for multi-rows query.
- Added batching for multi-rows query.
- Added index specific spell checking.
- Fix highlighter highlighting without summarizing.
- Fix job scheduling error for standard license.
- Adding optional classpath for indexing, resolving Tomcat jsdk classpath issue.
- Simplifying narrowBy rendering logic.
- Improve Search Performance by Facet Search on child columns only when parent column is already filtered.
- Default to getting auto commit database connection to avoid extra operations.
- Improve search performance by avoiding repeatedly loading configuration files.
- Fix wrong error reporting for BufferIndex submit.do
- Fix memory leak when refreshing the index in memory-only mode.
- Added configurable max field length.
- Process empty query by matching all documents in multi-index search mode.
- Handle SqlServer special empty or all zero date time format.
- Use field-specific analyzers with Lucene query parser.
- Fix intermittent file synchronization error due to resource leaking in cluster mode.
- Correctly set query log size.
- Added configurable logging for indexing processes.
- Query Translator fixes a negative query bug.
- Query Translator fixes a bug when booleanOperator is OR.
- Fix bug when saving URL To Ping.
- Increasing speed for data retrieval by batching de-duplication checking
- Fixed file deletion error on windows OS
- Added Javadoc for rendering search result
- Added logging for direct Lucene queries
- Improved search usage logging based on IP for servers behind proxy
- Added field-specific highlighting via wizard generated code
- Added freemarker template servlet for rendering search results
- Added support for efficiently searching multiple same-structured datasource
- Avoid exception when option "Empty Query Match All" is unchecked
- Enhancing templateName parameter by adding options for a different index from current index, a different file from default main.vm
- Added configurable sorting for narrowBy results
- Added Regular Expression Search
- Added time-based ranking for sortBy=_relevance_ during searching
- Fix possible hanging during shutdown
- Improved Time-based Ranking
- Added Reserved Words to bypass analyzers
- Handle unpaired double quote in user search string
- Improve performance when updating existing index documents
- Added javadoc for writing JSP templates
- Fixed bug on reserved words with analyzer
- For search queries including stopwords, exact matches rank higher
- Upgrade to latest lucene.jar, r639002
- Added sortBy=_relevance_ during searching
- Fix cron scheduler display bug with the new quartz scheduler
- Do not continue with partial data if errors happen while retrieving data when re-creating index
- showIndexUsage action is now open to the allowed IP list
- Added indexing Zip file inside BLOB columns
- Do not continue with partial data if errors happen while retrieving the full id list
- Fix time-based ranking exception when date value is null
- Enhanced HTML filter, to support html text with less memory usage
- Prevent integer overflow when indexed documents are around Integer.MAX_VALUE
- Upgraded quartz scheduler to 1.6.0 to fix scheduler error on Mac
- Enabled Wildcard highlighting
- Added default support for Apache Derby database.
- Added PortugueseAnalyzer.
- Added Case-insensitive search for comma-semi colon separated values.
- Added comma separated synonyms, in addition to empty space separated synonyms.
- Added configurable maximum summarizer length.
- Added initial version of time-based ranking
- Made Comma-Semicolon Analyzer case insensitive, ideal for tags
- Added case insensitive Keywords
- Added more logging message when refreshing index
- Fix highlighting for Chinese Analyzer
- Reduce memory for facet search for fields with multiple integer values (Keywords type).
- Added "Html Text" field type for avoiding html tags in the text.
- Fix detecting jdbc driver class with multiple interface inheritance, for Firebird jdbc.
- Reduce more memory for facet search with multiple valued categories
- Fix possible race condition of modifying column list
- Fix possible shifted time for scheduling jobs
- Refresh index if only deletion occurred
- Added OneWordNumberLowerCaseAnalyzer to search for connected words.
- Allowing spaces in Range Queries.
- Added text/xml mime type to XML search result templates.
- Display Index in Memory mode on dashboard.
- Added capability to use customized Similarity class
- Show tooltip for SQL queries having double quotes inside
- MultiSearch supports sorting and facet search
- Avoid validation error for mysql blob type
- Avoid lowering indexing process priority on Linux
- Increase indexing speed by a factor of 2 to 3.
- Upgrading to latest Lucene, build 08/06/2007
- Added ISOLatin1AccentFilter to FrenchAnalyzer
- Added JSP, or other templating capability to render search results (http://wiki.dbsight.com/index.php?title=Debug_Template#Velocity_is_great.2C_but_can_I_use_JSP.2C_or_other_templating_technology_instead.3F)
- Added SynchronizedUpdateIndex and SynchronizedUpdateTempIndex for optimal performance in Storage Area Network
- Added booleanOperator to dynamically change boolean logic operator AND or OR. http://wiki.dbsight.com/index.php?title=Search_Multiple_Indexes#Dynamically_change_boolean_operator
- Server wide free index size increased to 100GB.
- Exposing Filter class for customization. http://wiki.dbsight.com/index.php?title=File_Types#How_to_extend_your_own_file_types
- Added CommaSemicolonAnalyzer for Keywords type.
- Fix ignored min wildcard word length, introduced in 1.4.1.
- Use jQuery to view indexing log.
- Added Keywords support for Number
- Replicate indexes in Server Clustering mode when only deletion occurs
- Correctly analyze Lucene syntax in lq=... mode
- Download index configuration and templates in IE
- Avoid security checking for local requests
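Two entries in the list above, "JSON Date is represented as a long number" and "Added support for time before 1970", fit together naturally if the long number is an epoch offset. The unit is not stated in this changelog; the Python sketch below assumes milliseconds since 1970-01-01 UTC (the convention of Java's `Date.getTime()`), and the helper names are hypothetical. Under that assumption, dates before 1970 simply become negative numbers.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch.

    Moments before 1970 produce negative values.
    """
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis):
    """Inverse of to_epoch_millis; accepts negative values."""
    return EPOCH + timedelta(milliseconds=millis)


before_epoch = datetime(1969, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
print(to_epoch_millis(before_epoch))   # one second before the epoch
print(from_epoch_millis(-1000).isoformat())
```

A client that consumes such JSON should therefore parse the field as a signed 64-bit integer rather than assuming it is non-negative.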
June 16th, 2007
- Upgrade jTDS driver to version 1.2
- Added UI to find frequent words for smaller index size
- Added jdbc instructions to support Sql Server cluster
- Lucene grammar can be used together with DBSight grammar http://wiki.dbsight.com/index.php?title=Search_Multiple_Indexes#Combined_Lucene-grammar_and_DBSight-grammar_mode
May 26th, 2007
- Added customizable stop words list for all analyzers
- Fix spelling index creation
- Do not spell check words that exist in the dictionary
- Spell checking does not handle capitalized words
- Compile with JDK 1.5 to improve search performance (60% less time is reported)
- Less memory for multiple-valued narrowBy facet search
May 6th, 2007
- Added synonyms.
- Added multiple keywords, support displaying tag cloud in search results.
- Added JRuby Scripting !
- Always use faster adding indexes without merging
- Much faster bulk deleting
- Faster copying on linux environment
- Memory-efficiently find out deleted documents
- Upgrade to jdk 1.5.
- Fix query parser: analyzer was applied twice.
- Fix index file uploading error from windows to linux systems.
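The entry above about finding deleted documents memory-efficiently does not say how it is done. One common way to keep memory flat is to walk two sorted id streams in step. The Python sketch below shows that general technique only; it is an assumption, not DBSight's actual algorithm, and `find_deleted_ids` is a hypothetical name.

```python
def find_deleted_ids(indexed_ids, database_ids):
    """Yield ids that are in the index but no longer in the database.

    Both inputs must be iterables sorted in ascending order. Only one
    element from each stream is held at a time, so memory use does not
    grow with the number of documents.
    """
    db_iter = iter(database_ids)
    current = next(db_iter, None)
    for indexed in indexed_ids:
        # Advance the database stream until it catches up with the index.
        while current is not None and current < indexed:
            current = next(db_iter, None)
        if current is None or current != indexed:
            yield indexed


print(list(find_deleted_ids([1, 2, 3, 5, 8], [1, 3, 4, 8, 9])))
```

Because the function is a generator, the caller can delete documents from the index as ids arrive instead of collecting the full list first.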
Apr 15th, 2007
- synchronized deletion
- time-weighted result ranking control
- no time-out 100000M index size for free version
- primary key column and modified date column can only be Keyword
- added compressed text field type
- wait for merging to complete
- somewhat faster cache warmup
- options to merge indexes without optimizing
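The "time-weighted result ranking control" entry above does not give a formula. As an illustration of what such a control could look like, the Python sketch below blends a relevance score with an exponential recency factor. The half-life decay, the `weight` knob, and the function name are all assumptions, not DBSight's ranking.

```python
import math


def time_weighted_score(relevance, age_days, half_life_days=30.0, weight=0.5):
    """Blend a relevance score with a recency factor.

    weight=0 ignores time entirely; weight=1 scales relevance fully by
    recency. The recency factor halves every half_life_days.
    """
    recency = math.pow(0.5, max(age_days, 0.0) / half_life_days)
    return relevance * ((1.0 - weight) + weight * recency)


# Two documents with equal relevance: the newer one scores higher.
print(time_weighted_score(2.0, age_days=1))
print(time_weighted_score(2.0, age_days=365))
```

Exposing `weight` as a setting is one way to let an administrator decide how strongly freshness should compete with textual relevance.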
Mar 30th, 2007
- upgrade to jdk 1.5
- start searcherMaxidle on startup, instead of (searcherMaxactive-searcherMaxidle)
- more accurate indexing hot-swapping for large indexes
- upgrade to latest lucene.jar
- fix killing long running indexing jobs blocked for 30 minutes
- multi valued results from previous subsequent queries can be used as parameters now
- fix SyBase jdbc driver loading
- make range query field case-insensitive
Mar 7th, 2007
- Improved XML, JSON templates, can automatically populate configurable columns
- added a load balancing helper: ping a url when its index is ready
- Support windows services, DBSight can run in windows background
Feb 26th, 2007
- added SoundEx, Metaphone, DoubleMetaphone analyzers, good for languages like Bengali, Hindi, Gujarati, Kannada, Malayalam, Tamil, Telugu
- added support for older MySQL versions, where "limit ? offset ?" is not supported
Jan 29th, 2007
- fix: extra subsequent query is created
- range query field name changed to non case-sensitive
- add parameter string concatenation mode, in addition to variable binding, for greater flexibility
- fix column name overlapping for range query
- synchronized log writing to prevent mangled log format
- debug mode in search for easier remote debug
- fix xml template's encoding
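The list above mentions a parameter string concatenation mode alongside variable binding. The difference between the two can be sketched in Python with an in-memory SQLite database. This is not DBSight code, and which situations DBSight uses concatenation for is not stated here; the tables, function names, and the identifier check are assumptions. The example shows one general reason concatenation gives more flexibility: bound placeholders can carry values but cannot stand in for identifiers such as a table name.

```python
import sqlite3


def query_with_binding(conn, category):
    """Variable binding: the value travels separately from the SQL text."""
    return conn.execute(
        "SELECT name FROM product WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()


def query_with_concatenation(conn, table, category):
    """Concatenation: the table name is spliced into the SQL text.

    Spliced text becomes part of the statement, so it is validated first.
    """
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return conn.execute(
        f"SELECT name FROM {table} WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
for table in ("product", "archived_product"):
    conn.execute(f"CREATE TABLE {table} (name TEXT, category TEXT)")
conn.execute("INSERT INTO product VALUES ('lamp', 'home')")
conn.execute("INSERT INTO archived_product VALUES ('stool', 'home')")

print(query_with_binding(conn, "home"))
print(query_with_concatenation(conn, "archived_product", "home"))
```

Binding remains the safer default for values; concatenation is worth reaching for only when the variable part cannot be expressed as a bound value.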
Dec 3rd, 2006
- much improved search performance in multiple narrowBy mode
- fix NPE in doc.getValues
- remove session usage to avoid possible J2EE memory growth
- reduce memory consumption by avoiding open index several times
Nov 8th, 2006
- reduce memory consumption for Keywords type
- configure velocity logging to use log4j
- default getConnection loginTimeout to 100 seconds, to prevent blocking for connections
- correct field:"value" behavior under "OR" mode
Oct 12th, 2006
- Added options to show all documents for empty queries
- Added status report in xml format for system and specific index, for integration purpose
- Avoid applying global stop words
- PingAction to check DBSight health
- configurable maximum number of active, idle searchers and wait time
Sep 18th, 2006
- same analyzer is used when analyzing queries and indexing content
- greatly improved mysql main query performance for large dataset
- http action for garbage collection
- robust and faster remote index replications
- avoid refreshing _local.uri when starting up if the file already exists
- download and upload index configuration files and template files
- avoid carry ".svn", "CVS" directory when creating templates from example templates
Sep 1st, 2006
- fix bug: type UnStored can be searched now
- fix bug: duplicated records when incremental indexing and scheduled job is reCreating index
- fix bug: close resources when load-index-into-memory throws exceptions
- In index hot-switching stage, no need to load temporary index into memory
- added xml result template
- added json result template
- replace highlight/summarize macro with parameterized highlighter/summarizer
- multiple range search with "+" and "-" support
- display version and build number
- fix bug with multiple range search queries
Aug 22nd, 2006
- fix security access filter bug that caused release failures and duplicated docs
- support wildcard search in minus mode
Aug 6th, 2006
- support Tomcat 5.5 by moving tld files inside struts.jar to WEB-INF/tlds
- add local and server URL to server status page
- correct list size estimation for re-creating indexing mode
- options to control whether to search the new indices or not when the old index is not available and indexing is creating the new indices.
- sortableTable.vm fix for numeric column matching, by Justin from Computer Sciences Corporation
- Only setFetchSize when it's a positive number
- sortBy and desc can both support comma separated values
- set classpath to WEB-INF/indexingLib if it exists; if not, use WEB-INF/lib
- code change for new licensing model
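The entry above says that `sortBy` and `desc` can both take comma separated values. How the two lists are paired is not spelled out in this changelog. The Python sketch below assumes they are matched by position, that `desc` values are the strings "true" or "false", and that missing `desc` entries mean ascending; the helper names are hypothetical.

```python
def parse_sort(sort_by, desc=""):
    """Pair each sortBy field with its descending flag, by position."""
    fields = [f.strip() for f in sort_by.split(",") if f.strip()]
    flags = [d.strip().lower() == "true" for d in desc.split(",")] if desc else []
    # Fields without a matching desc entry default to ascending.
    flags += [False] * (len(fields) - len(flags))
    return list(zip(fields, flags))


def sort_rows(rows, sort_spec):
    """Sort dict rows by several fields, each with its own direction."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant
    # field first yields a correct multi-field ordering.
    for field, descending in reversed(sort_spec):
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


rows = [
    {"category": "a", "price": 2},
    {"category": "b", "price": 1},
    {"category": "a", "price": 5},
]
spec = parse_sort("category,price", "false,true")
for row in sort_rows(rows, spec):
    print(row["category"], row["price"])
```

Here rows are ordered by category ascending, and within each category by price descending.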
June 5th, 2006
- Escape advanced extra JDBC properties in URL
- Support allowed IP or Host names for remote-operation
Apr 28th, 2006
- Improve robustness of file deletion on Windows
- fix OutOfMemory when previous index exists and indexing large database tables
- support User-Level Security Filtered Search
- support deleting indexed documents
- support range query
Apr 9th, 2006
- can vertically concatenate all String columns
- fix: multi-search regression
- fix: Unix system does not need double quotes for file names
- upgrade to lucene 2.0RC1, clean any existing deprecated API
Apr 4th, 2006
- use getColumnLabel() instead of getColumnName() for mysql
- catch setFetchSize exception in some mysql drivers
- fixed: Indexing startup error when directory names contain spaces
Mar 19, 2006
- analyzers configurable for individual fields
- increase to 1000 documents to better estimate documents list size
- index configuration and data directory can be set to any directory
- fall back to http when site is running under https mode
Mar 06, 2006
- add noResults.vm to templates to display search instructions to end user
- only search indexes with non-negative displayOrder in multi search
- shut down indexing when the jdbc driver mismatches
- correct error message when wildcard search matches too many records
- fix CJKAnalyzer highlighting problem when searching on ID
Feb 12, 2006
- fix weblogic startup problem
- dashboard layout changes
- clean java exception when index reindexed after configuration changed
- set $isMultiSearch to true in multiSearch mode
Feb 05, 2006
- set column name to non-editable when configuring SQL
- escape html in editTemplateFile.vm
- render empty partial result values to italic null
- make SQL editing area wider, fixed font size
- clean narrowBy results when query is empty
Jan 29, 2006
- can trigger indexing/re-indexing/full-indexing from localhost by localhost
- multiple indexes search display only selected indexes in global navigation
- single index search can configure to display all available indexes in global navigation
- beta: different field can have different analyzer
Jan 15, 2006
- UI to add OR mode -- You can switch between AND or OR mode easily.
- beta version of multiple indexes search -- Search the indexes at the same time
- Upgrade oracle jdbc driver to 10.0.2.0.1, ojdbc4.jar
- Upgrade postgres jdbc driver to 8.1v404, jdbc3
- added lucene raw query parser search mode
- Support searching multiple indexes
- Support multiple sortBy fields
- OR mode support "+" operator
- Added OR mode search option