Search Multiple Indexes

From DBSight Full-Text Search Engine/Platform Wiki

How it works

A second action is available for searching multiple indexes. Instead of sending search queries to "search.do", send them to "multiSearch.do". The two actions compare as follows:

| Action | Description | Results Template |
| --- | --- | --- |
| search.do | Search one index, specified by "indexName" | Specified by "templateName", or the default template |
| multiSearch.do | Search all available indexes | Renders results to /templates/multiSearch/multiSearch.ftl, which renders each index's results with "documents.ftl" under that index's default template |

Special Parameters for multiSearch.do

| Parameter | Example | Description |
| --- | --- | --- |
| indexName | indexName=index1, or comma-separated: indexName=index1,index2,index3 | Specifies the indexes to search |
| start | start=0, or comma-separated: start=10,5,30 | Specifies the start document number for each index |
| length | length=0, or comma-separated: length=10,25,30 | Specifies the number of results per page for each index |
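The parameters above can be assembled programmatically. The following is a minimal Python sketch, not part of DBSight: the base URL and the function name are hypothetical, and it assumes that comma-separated "start" and "length" values line up positionally with the names in "indexName".

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with your own DBSight deployment.
BASE_URL = "http://localhost:8080/dbsight"


def multi_search_url(query, index_names, starts=None, lengths=None):
    """Build a multiSearch.do URL from per-index start/length values."""
    params = {"q": query, "indexName": ",".join(index_names)}
    for name, values in (("start", starts), ("length", lengths)):
        if values is None:
            continue
        # Assumption: either one value for all indexes, or one per index.
        if len(values) not in (1, len(index_names)):
            raise ValueError(f"{name} needs 1 or {len(index_names)} values")
        params[name] = ",".join(str(v) for v in values)
    # safe="," keeps the comma separators readable in the URL.
    return f"{BASE_URL}/multiSearch.do?{urlencode(params, safe=',')}"


url = multi_search_url(
    "abc", ["index1", "index2", "index3"], starts=[10, 5, 30], lengths=[10, 25, 30]
)
print(url)
```

The length check is there only to catch mismatched lists early, before the request is sent.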

Search Sharded Indexes

Note: if multiple indexes have the same structure and you want to search all of them at the same time, see Search Multiple Data Sources.

What's in the result page

The result page will contain all indexes with matching results, and result count for each index.

For each index, only the first 5 results are shown by default. To change this, pass a "length" parameter, such as "length=7". This value applies to all indexes.

All the variables are passed to the templates. To view all the variables, you can add one line at the end of the template.

The order in which indexes are displayed is controlled by each index's display order.

Sort Multiple Columns

The goal is to sort a result set first by one column, then by a second column.

The recommended way in DBSight is:

sortBy=-f1,+f2,-f3

Alternatively, you can sort this way:

sortBy=f1,f2,f3&desc=Y,N,Y

This sorts by fields f1, f2, and f3, in descending, ascending, and descending order respectively.

To sort by relevance, use the name

sortBy=_relevance_

as one of the sortBy fields. Relevance is always sorted in descending order, so the corresponding desc=Y or desc=N value is ignored.
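The two sortBy syntaxes carry the same information. As an illustration, here is a small Python sketch that converts the signed form into the sortBy/desc pair. The function name is hypothetical, and because the source does not say how an unsigned field is treated in the signed form, the sketch rejects it rather than guessing.

```python
def split_sort_spec(spec):
    """Convert a signed spec such as '-f1,+f2,-f3' into (sortBy, desc) values."""
    fields, flags = [], []
    for token in spec.split(","):
        token = token.strip()
        if len(token) < 2 or token[0] not in "+-":
            # Unsigned or empty entries are rejected instead of guessed at.
            raise ValueError(f"expected '+field' or '-field', got {token!r}")
        fields.append(token[1:])
        # '-' means descending (desc=Y), '+' means ascending (desc=N).
        flags.append("Y" if token[0] == "-" else "N")
    return ",".join(fields), ",".join(flags)


sort_by, desc = split_sort_spec("-f1,+f2,-f3")
print(f"sortBy={sort_by}&desc={desc}")  # prints sortBy=f1,f2,f3&desc=Y,N,Y
```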

Relevance + Time-Based Ranking

A user asked: "We want to sort our results first by relevance and then by modified date. Modified date is a column in our query. Is there a way to do this in DBSight’s query?"

Relevance is a floating-point number, so it is hard to sort by it first and then by other fields: two results rarely share exactly the same score, so the secondary sort would almost never apply. However, you can adjust relevance by other fields, such as price or number of votes. What about time?

Time-based ranking was added in v1.5.3.

For each index, you can enable time-based ranking by going to "Configure Search"->"Time-Based Ranking". You can adjust the default ranking by editing the <date-weight-formula> section of the file WEB-INF/data/<index-name>-dataset-config.xml. You can add as many <time-weight> entries as you want for finer-grained control.

Using Lucene's own query parser

Lucene-query-only mode

Some users asked for access to Lucene's raw query parser.

It was added in release 1.1.7. To use it, pass the query option "lucene=y". The query may then look like

?q=abc&indexName=oneIndex&lucene=y

However, Lucene's parser knows nothing about the ranking weight of each field, so this is a plain search without field weights.

If you are missing a feature, we can add it to the current DBSight parser.

Combined Lucene-grammar and DBSight-grammar mode

This mode was added in release 1.4.3. To use it, pass a Lucene query in the "lq" parameter, like "lq=field_name:good". The query may then look like

?q=abc&indexName=oneIndex&lq=field_name:good

The Lucene query will be "AND"ed with the normal DBSight query from "q=...".

The idea behind this feature is that "&lq=..." is usually supplied programmatically, while "&q=..." carries the user's input, so both sides are served. For example, if you want to run a query like this:

u2 beautiful day type_id:(1 OR 2)

You can do

q=u2 beautiful day&lq=type_id:(1 OR 2)

Remember to encode the parameters in the URL!
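As an illustration of the encoding step, the following Python sketch uses the standard library's urllib.parse.urlencode to encode the example above. Only the parameter names "q" and "lq" come from DBSight; the rest is ordinary URL encoding.

```python
from urllib.parse import urlencode

# "q" carries the user's input; "lq" carries the programmatic Lucene filter.
params = {"q": "u2 beautiful day", "lq": "type_id:(1 OR 2)"}

# urlencode escapes spaces, colons, and parentheses so the URL stays valid.
query_string = urlencode(params)
print(query_string)  # prints q=u2+beautiful+day&lq=type_id%3A%281+OR+2%29
```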

Note

  • When using Lucene query syntax, field names are case-sensitive. Field names that come from the database are often in upper case, so, for example, instead of

lq=has_brand:(0 OR 1)

you may need to use:

lq=HAS_BRAND:(0 OR 1)


Search Debug Mode

To look at the Lucene queries generated by the DBSight parser, pass the query option "debug=y". The query may then look like

?q=abc&indexName=oneIndex&debug=y

Adjust number of explained search results

The debug parameter also accepts an integer. For example,

?q=abc&indexName=oneIndex&debug=100

will display search explanation for the first 100 results.

Dynamically change boolean operator

The booleanOperator parameter was added in the latest beta release. Append one of the following to the URL:

&booleanOperator=and
&booleanOperator=or

Either one overrides the statically defined boolean operator.

For example:

q=dbsight+java&booleanOperator=and

will find results with all of the words, and

q=dbsight+java&booleanOperator=or

will find results with at least one of the words.

A similar parameter is available for lq:

lq=field:value&lqBooleanOperator=and
lq=field:value&lqBooleanOperator=or
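A caller that sets these operators per request may want to validate them first. The Python sketch below is one way to do that. The helper names are hypothetical, and restricting the values to "and" and "or" is an assumption based on the examples above.

```python
from urllib.parse import urlencode

# Assumption: "and" and "or" are the only accepted values, as in the examples.
VALID_OPERATORS = ("and", "or")


def _checked(operator):
    """Normalize an operator to lower case and reject unknown values."""
    value = operator.lower()
    if value not in VALID_OPERATORS:
        raise ValueError(f"unsupported boolean operator: {operator!r}")
    return value


def with_boolean_operators(q, lq=None, operator=None, lq_operator=None):
    """Build a query string that overrides the configured boolean operators."""
    params = {"q": q}
    if operator is not None:
        params["booleanOperator"] = _checked(operator)
    if lq is not None:
        params["lq"] = lq
        if lq_operator is not None:
            params["lqBooleanOperator"] = _checked(lq_operator)
    return urlencode(params)


print(with_boolean_operators("dbsight java", operator="and"))
```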

Dynamically change searchable columns

Searchable columns are usually configured statically. However, you may sometimes want to limit a search to specific columns. To do so, pass a query option like:

&searchable=x,y,z              //x,y,z are the column names, separated by comma

Columns passed this way must already be set as searchable in the web UI.

Dynamically change filterable columns

As with dynamic searchable columns, you may want to limit the facet search to specific columns. To do so, pass a query option like:

&filterable=x,y,z              //x,y,z are the column names, separated by comma

Columns passed this way must already be set as filterable in the web UI.

Dynamically change selectable columns

If an index has a long list of columns but the search result template displays only a few of them, you can use this option to cut down transmission time. The idea is the same as in SQL, where "select columnA from tableA" is faster than "select * from tableA".

The usage is:

&selectable=x,y,z              //x,y,z are case-sensitive column names, separated by comma
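The three column options share the same comma-separated form, so they can be built by one helper. This is a minimal Python sketch with a hypothetical function name and example column names; it only formats the parameters and does not check that the columns are enabled in the web UI.

```python
from urllib.parse import urlencode


def column_options(searchable=None, filterable=None, selectable=None):
    """Format per-request column restrictions as URL query parameters."""
    options = {}
    for name, columns in (
        ("searchable", searchable),
        ("filterable", filterable),
        ("selectable", selectable),
    ):
        if columns:
            # Column names are passed through unchanged (case matters).
            options[name] = ",".join(columns)
    # safe="," keeps the comma separators readable in the URL.
    return urlencode(options, safe=",")


print(column_options(searchable=["title", "body"], selectable=["ID", "TITLE"]))
# prints searchable=title,body&selectable=ID,TITLE
```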

Dynamically change filterable columns length limit

For narrowBy columns, the number of narrowBy items can be large. For example, narrowBy can show how many matches each author has, but the number of authors may be very large.

By default, DBSight limits narrowBy items to 17. In DBSight 4.0.8, you can adjust the limit dynamically like this:

&facetLimit=25             //change the narrowBy items limit to 25

Older DBSight versions have no such limit, so performance can suffer when the number of narrowBy items is very large.

Randomize Search Results

If you have a large number of records and want to randomize the results while keeping pagination consistent, use:

&randomQuerySeed=1234          //1234 can be replaced with any valid non-zero integer

As long as you keep using the same seed and the same query, you get the same randomized ordering on every request, so pagination stays consistent. If you change the seed, you will see a different ordering of the same results, which gives end users a sense of freshness.
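One way to keep the seed stable across pages while still varying it between visitors is to derive it from a session identifier. The Python sketch below illustrates this. The function names and the session identifier are hypothetical, and it assumes that any positive 32-bit signed integer is a valid seed.

```python
import hashlib
from urllib.parse import urlencode


def seed_for_session(session_id):
    """Derive a stable, non-zero seed from a session identifier."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Assumption: seeds in the range 1 .. 2**31 - 1 are accepted.
    return int.from_bytes(digest[:4], "big") % (2**31 - 1) + 1


def randomized_query(q, session_id):
    """Build a query string whose random ordering is fixed per session."""
    return urlencode({"q": q, "randomQuerySeed": seed_for_session(session_id)})


# The same session always yields the same seed, so paging stays consistent.
print(randomized_query("abc", "session-42"))
```

A new session produces a different seed, and therefore a different ordering of the same results.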