Search Multiple Indexes
From DBSight Full-Text Search Engine/Platform Wiki
Search Multiple Indexes
How it works
We added one more action for searching multiple indexes. Instead of sending search queries to "search.do", which searches a single index, send them to "multiSearch.do":
| Action | Description | Results Template |
| search.do | Search one index, specified by "indexName" | Specified by "templateName", or the default template. |
| multiSearch.do | Search all available indexes | Renders results with /templates/multiSearch/multiSearch.ftl, which in turn renders each index's results with "documents.ftl" from that index's default template |
Special Parameters for multiSearch.do
| Parameter | Example | Description |
| indexName | indexName=index1 or comma-separated indexName=index1,index2,index3 | Specifies the indexes to search |
| start | start=0 or comma-separated start=10,5,30 | Specifies the starting document number for each index |
| length | length=0 or comma-separated length=10,25,30 | Specifies the number of results per page for each index |
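As a rough sketch of how these parameters combine, the following Python snippet builds a multiSearch.do URL with one start and length value per index. The base URL, the index names, and the helper name multi_search_url are placeholders, and the snippet assumes that multiSearch.do takes the user's query in "q" just as search.do does; only indexName, start, and length come from the table above.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with your own DBSight deployment.
BASE_URL = "http://localhost:8080/dbsight/multiSearch.do"


def multi_search_url(query, pages):
    """Build a multiSearch.do URL.

    pages: non-empty list of (index_name, start, length) tuples, one per index.
    """
    names, starts, lengths = zip(*pages)
    params = {
        "q": query,  # assumption: same query parameter as search.do
        "indexName": ",".join(names),
        "start": ",".join(str(s) for s in starts),
        "length": ",".join(str(n) for n in lengths),
    }
    # urlencode percent-encodes the commas (%2C); the server decodes them.
    return BASE_URL + "?" + urlencode(params)


url = multi_search_url(
    "java",
    [("index1", 10, 10), ("index2", 5, 25), ("index3", 30, 30)],
)
print(url)
```

Keeping the per-index values together as tuples makes it harder to pass lists of different lengths by accident.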
Search Sharded Indexes
Note: if multiple indexes have the same structure and you want to search all of them at the same time, see Search Multiple Data Sources.
What's in the result page
The result page will contain all indexes with matching results, and result count for each index.
For each index, only the first 5 results are shown by default. To change this, pass in the "length" parameter, such as "length=7". It affects all indexes.
All the variables are passed to the templates. To view all the variables, you can add one line at the end of the template.
The order of displayed indexes is controlled by each index's display order.
Sort Multiple Columns
Requested:
Is it possible to sort a return set first by one column, then by a second column? I only seem to be able to make it sort by one column. -- Julie
As of release 1.1.6, the sortBy parameter accepts comma-separated column names.
So you can use
sortBy=f1,f2,f3&desc=Y,N,Y
This sorts by fields f1, f2, and f3, in descending, ascending, and descending order respectively.
If you want to sort by relevance, you can use the name
sortBy=_relevance_
as one of the sortBy parameters. However, relevance is always sorted in descending order, so the corresponding desc value (Y or N) is ignored.
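The sortBy and desc lists have to stay aligned position by position, which is easy to get wrong when they are assembled by hand. Here is a minimal Python sketch that derives both from a single list of (field, descending) pairs; the helper name sort_params and the sample index name are illustrative only.

```python
from urllib.parse import urlencode


def sort_params(sort_spec):
    """Turn [(field, descending), ...] into aligned sortBy/desc values."""
    return {
        "sortBy": ",".join(field for field, _ in sort_spec),
        "desc": ",".join("Y" if descending else "N" for _, descending in sort_spec),
    }


params = {"q": "abc", "indexName": "oneIndex"}
# f1 descending, then f2 ascending, then f3 descending.
params.update(sort_params([("f1", True), ("f2", False), ("f3", True)]))

query_string = urlencode(params)
print("?" + query_string)
```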
Relevance + Time-Based Ranking
We wanted to sort our results first by relevance and then by modified date. Modified date is a column in our query. Is there a way to do it in DBSight’s query?
Relevance is actually a floating-point number, so it is hard to sort by it first and then by other fields. However, you can adjust relevance by other fields, like price, number of votes, etc. What about time?
Here is a new feature added in v1.5.3.
For each index, you can enable time-based ranking by going to "Configure Search"->"Time-Based Ranking". You can adjust the default ranking by editing the <date-weight-formula> section of the file WEB-INF/data/<index-name>-dataset-config.xml. You can add as many <time-weight> entries as you want for finer-grained control.
Using Lucene's own query parser
Lucene-query-only mode
Some users asked for Lucene's raw query parser.
It was added in release 1.1.7. To use it, pass in a query option like "lucene=y". So the query may look like
?q=abc&indexName=oneIndex&lucene=y
However, since Lucene has no knowledge of the ranking of each field, it will be just a search without field weights.
If anyone really misses some features, we can surely add them to our current parser.
Combined Lucene-grammar and DBSight-grammar mode
It was added in release 1.4.3. To use it, pass in a Lucene query like "lq=field_name:good". So the query may look like
?q=abc&indexName=oneIndex&lq=field_name:good
The Lucene query will be "AND"ed with the normal DBSight query from "q=...".
The philosophy of this feature is that "&lq=..." is usually supplied programmatically, while "&q=..." is the user's input, so both sides are satisfied. For example, if you want to do a query like this:
u2 beautiful day type_id:(1 OR 2)
You can do
q=u2 beautiful day&lq=type_id:(1 OR 2)
Remember to encode the parameters in the URL!
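For instance, the example above can be encoded with Python's standard urllib.parse.urlencode, as in this short sketch. The index name is a placeholder, and the variable names are only for illustration.

```python
from urllib.parse import urlencode

user_input = "u2 beautiful day"      # typed by the end user
program_filter = "type_id:(1 OR 2)"  # supplied by the application

query_string = urlencode({
    "q": user_input,
    "indexName": "oneIndex",
    "lq": program_filter,
})

print("?" + query_string)
# prints: ?q=u2+beautiful+day&indexName=oneIndex&lq=type_id%3A%281+OR+2%29
```

Spaces become "+", and the colon and parentheses in the Lucene query are percent-encoded, so the "&lq=" value cannot be confused with the rest of the URL.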
Note
- When using Lucene Query Syntax, the field name is case-sensitive! Quite often, field names from the database are in upper case, so, for example, instead of
lq=has_brand:(0 OR 1)
You may need to use:
lq=HAS_BRAND:(0 OR 1)
- Lucene Query Syntax can be found here: http://lucene.apache.org/java/docs/queryparsersyntax.html
Search Debug Mode
When you want to look at the Lucene queries generated by the DBSight parser, you can pass in a query option like "debug=y". So the query may look like
?q=abc&indexName=oneIndex&debug=y
Adjust number of explained search results
The debug=... parameter also accepts an integer. So
?q=abc&indexName=oneIndex&debug=100
will display search explanation for the first 100 results.
Dynamically change boolean operator
booleanOperator is added in the latest beta release. Basically, append one of these to the URL:
&booleanOperator=and
&booleanOperator=or
It will override the statically defined boolean logic operator.
For example:
q=dbsight+java&booleanOperator=and
will find results with all of the words, and
q=dbsight+java&booleanOperator=or
will find results with at least one of the words.
Dynamically change searchable columns
Usually, searchable columns are statically configured. However, you may sometimes want to limit the search to specific columns. To do so, you can pass in query options like:
&searchable=x,y,z //x,y,z are the column names, separated by comma
The columns listed here must already be set as searchable in the web UI.
Dynamically change filterable columns
Similar to dynamic searchable columns, you may want to limit the facet search to specific columns. To do so, you can pass in query options like:
&filterable=x,y,z //x,y,z are the column names, separated by comma
The columns listed here must already be set as filterable in the web UI.
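A small Python sketch of adding these optional column lists to a search query string follows. The helper name and the column names (title, summary, category) are made up for illustration, and combining searchable and filterable in one request is an assumption rather than something stated above.

```python
from urllib.parse import urlencode


def search_query_string(query, index_name, searchable=None, filterable=None):
    """Build a search.do query string with optional dynamic column lists."""
    params = {"q": query, "indexName": index_name}
    if searchable:
        # Columns to search in, comma-separated.
        params["searchable"] = ",".join(searchable)
    if filterable:
        # Columns to facet on, comma-separated.
        params["filterable"] = ",".join(filterable)
    return urlencode(params)


qs = search_query_string(
    "abc",
    "oneIndex",
    searchable=["title", "summary"],  # hypothetical column names
    filterable=["category"],          # hypothetical column name
)
print("?" + qs)
```

When neither list is given, the parameters are left out entirely, so the statically configured columns apply.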
Randomize Search Results
If you have a large number of records and want to randomize the results while keeping pagination consistent, this is what you need:
&randomQuerySeed=1234 //1234 can be replaced with any valid non-zero integer
As long as you keep using the same seed number and the same query, you get the same randomized ordering of search results on every request. If you change the seed number, you will see a different ordering of the same results, which gives end users a sense of freshness.
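One possible way to choose the seed is to derive it from something that stays fixed while a user pages through results, such as a session ID. The Python sketch below does that. The session ID, the helper names, the use of "start" with search.do, and the choice to keep the seed within a positive 31-bit range are all assumptions made for illustration, not DBSight requirements.

```python
import zlib
from urllib.parse import urlencode


def seed_for_session(session_id):
    """Map a session ID to a stable, non-zero, positive 31-bit integer."""
    seed = zlib.crc32(session_id.encode("utf-8")) & 0x7FFFFFFF
    return seed or 1  # the seed must be non-zero, so fall back to 1


def page_query_string(query, index_name, session_id, start):
    """Query string for one result page; the seed is the same on every page."""
    return urlencode({
        "q": query,
        "indexName": index_name,
        "start": start,  # assumption: search.do pages with "start"
        "randomQuerySeed": seed_for_session(session_id),
    })


# Two pages requested in the same (hypothetical) session share one seed.
page1 = page_query_string("abc", "oneIndex", "session-42", 0)
page2 = page_query_string("abc", "oneIndex", "session-42", 10)
print(page1)
print(page2)
```

A new session produces a different seed, so returning users see a different ordering while each individual session still paginates consistently.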
