Search Multiple Data Sources
From DBSight Full-Text Search Engine/Platform Wiki
Search Multiple Indexes
Search Multiple Indexes On the Same Server
If you have multiple databases with a common structure, this may be what you are looking for.
Prerequisite
- For each database, define the index configuration: connection info, main query and subsequent queries, indexing schedule, etc.
Searching
When searching, use this format:
search.do?indexName=index1,index2,index3&... //search index1, index2, index3. The first index is "index1"
Rendering
If templateName is not specified, DBSight uses the template of the first specified index. This is not recommended, because the templates of the different indexes may not be the same.
Here are some examples showing how to specify the indexName and templateName:
indexName=a,b,c&templateName=x|y|main.jsp => search on index:a,b,c, render by index:x, template:y, file: main.jsp
indexName=a,b,c&templateName=x|y => search on index:a,b,c, render by index:x, template:y, file: main.vm
indexName=a,b,c&templateName=y|main.jsp => search on index:a,b,c, render by index:a, template:y, file: main.jsp
indexName=x&templateName=y|main.jsp => search on index:x, render by index:x, template:y, file: main.jsp
indexName=x&templateName=y => search on index:x, render by index:x, template:default template, file: main.vm
indexName=&templateName=x|y|main.ftl => search on index:x, render by index:x, template:y, file: main.ftl
indexName=&templateName=x|y => search on index:x, render by index:x, template:y, file: main.vm
As you may notice, you can specify main.jsp to render results with a JSP, or an ftl file to render them with the FreeMarker servlet. You can also use any other file name, such as documents.jsp, to render only part of a template's results. This comes in handy when using AJAX.
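The combinations above can be assembled programmatically. The following Python sketch composes a templateName value and a search URL from their parts. The helper names `template_name` and `search_url` are hypothetical and not part of DBSight. The sketch only builds the documented forms (index|template|file, index|template, template|file) and makes no claim about how DBSight parses them internally.

```python
from urllib.parse import urlencode


def template_name(template, render_index=None, file=None):
    """Compose a templateName value as [render_index|]template[|file].

    Hypothetical helper: parts that are omitted are simply left out,
    which yields the forms shown in the examples above.
    """
    parts = [p for p in (render_index, template, file) if p]
    return "|".join(parts)


def search_url(base, index_names, template=None, **params):
    """Build a search URL for one or more indexes on the same server."""
    query = {"indexName": ",".join(index_names)}
    if template:
        query["templateName"] = template
    query.update(params)
    # Commas stay literal for readability; "|" is percent-encoded as %7C.
    return base + "?" + urlencode(query, safe=",")


print(template_name("y", render_index="x", file="main.jsp"))  # x|y|main.jsp
print(template_name("y", render_index="x"))                   # x|y
print(template_name("y", file="main.jsp"))                    # y|main.jsp
print(search_url("search.do", ["a", "b", "c"],
                 template_name("y", "x", "main.jsp")))
# search.do?indexName=a,b,c&templateName=x%7Cy%7Cmain.jsp
```

Passing the template explicitly for every multi-index search avoids depending on the template of whichever index happens to be listed first.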
Search Multiple Indexes Across Multiple Servers
Let's say you have 3 indexes, one on each of 3 DBSight servers. Their normal search URLs look like this:
http://hostname1:port1/dbsight/search.do?indexName=index1&...
http://hostname2:port2/dbsight/search.do?indexName=index2&...
http://hostname3:port3/dbsight_different/search.do?indexName=index3&...
You can then combine the search results from the 3 indexes and render them using the configuration defined on hostname1:
http://hostname1:port1/dbsight/search.do?indexName=index1,index2@hostname2:port2/dbsight,index3@hostname3:port3/dbsight_different
Shard Search Parameters
Most parameters are the same as for a regular search, such as "start" and "length"/"limit".
One parameter unique to Shard Search is "timeout" (in milliseconds). When talking to several remote nodes, a node may go down or slow down unexpectedly, and you don't want the whole search to wait for that one node. The default value is 10 seconds (10000 milliseconds). Adjust it to fit your network conditions.
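As an illustration, the Python sketch below builds a sharded search URL with an explicit timeout. The helper names `remote_shard` and `sharded_search_url` are hypothetical. The host names and ports are the placeholders used above, and the 3000 ms timeout is only an example value.

```python
from urllib.parse import urlencode


def remote_shard(index, host_port, context):
    """Format a remote shard as index@host:port/context."""
    return f"{index}@{host_port}/{context}"


def sharded_search_url(base, shards, timeout_ms=10000, **params):
    """Build a search URL over several shards.

    timeout_ms defaults to the documented 10000 milliseconds.
    """
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    query = {"indexName": ",".join(shards), "timeout": timeout_ms}
    query.update(params)
    # Keep the shard separators (, @ : /) literal so the URL stays readable.
    return base + "?" + urlencode(query, safe=",@:/")


shards = [
    "index1",  # local index on the server that renders the results
    remote_shard("index2", "hostname2:port2", "dbsight"),
    remote_shard("index3", "hostname3:port3", "dbsight_different"),
]
print(sharded_search_url("http://hostname1:port1/dbsight/search.do",
                         shards, timeout_ms=3000, start=0))
# http://hostname1:port1/dbsight/search.do?indexName=index1,index2@hostname2:port2/dbsight,index3@hostname3:port3/dbsight_different&timeout=3000&start=0
```

A shorter timeout trades completeness for latency: results from a node that does not answer in time may be missing from the combined result.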
This way, you can achieve a very scalable sharded search solution.
Maintain Multiple Indexes
For the searches above to work, the indexes should be mostly the same, and in particular should have the same set of columns. The JDBC connection and the SQL can vary a little. Most likely, what you want to vary is how the data is partitioned.
For example, you can partition the data by date, using a different Main Query on each box:
select * from table1 where created_date <= to_date('2000-01-01')
select * from table1 where created_date > to_date('2000-01-01') and created_date <= to_date('2005-01-01')
select * from table1 where created_date > to_date('2005-01-01') and created_date <= to_date('2010-01-01')
Partitioning the data by date is a common approach. Old data can be left untouched so that it does not slow down the indexing process, and it can be phased out when convenient.
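When there are many partitions, the Main Queries can be generated from a list of boundary dates. The Python sketch below uses a hypothetical helper, `partition_queries`, and assumes that the table name, column name and dates come from trusted configuration. It uses `<=` for the first range so that a row dated exactly on the first boundary falls into a partition.

```python
def partition_queries(table, column, boundaries):
    """Return one Main Query per partition.

    boundaries is a list of ISO dates (YYYY-MM-DD) in ascending order.
    The first query covers everything up to and including the first
    boundary; each later query covers the range (lower, upper].
    """
    if not boundaries:
        raise ValueError("at least one boundary date is required")
    if list(boundaries) != sorted(boundaries):
        raise ValueError("boundaries must be in ascending order")
    queries = [
        f"select * from {table} where {column} <= to_date('{boundaries[0]}')"
    ]
    # Pair each boundary with the next one to form non-overlapping ranges.
    for lower, upper in zip(boundaries, boundaries[1:]):
        queries.append(
            f"select * from {table} where {column} > to_date('{lower}') "
            f"and {column} <= to_date('{upper}')"
        )
    return queries


for q in partition_queries("table1", "created_date",
                           ["2000-01-01", "2005-01-01", "2010-01-01"]):
    print(q)
```

As with the three queries above, rows newer than the last boundary are not covered by any partition, so a further boundary or an open-ended query is needed for current data.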
