Setup For Performance

From DBSight Full-Text Search Engine/Platform Wiki

Environment Setup

Hardware Configuration

For multiple concurrent searches, a conventional hard disk is the bottleneck. Tests have shown that a solid-state drive (SSD) helps considerably.

External Links:

  1. http://wiki.statsbiblioteket.dk/summa/Hardware

DBSight 4.x JVM Configuration

The memory setup has changed! Please read carefully.

DBSight 4.x defaults to a separated searching mode, in which a separate search process is launched specifically to serve all search requests. When the index is updated, a new search process is created and the previous search process is stopped. This way, search nodes start and stop much faster, without JVM GC pauses.

In this mode, the DBSight J2EE web application itself acts only as a proxy, so it consumes little memory. The search process takes most of the memory.
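The swap described above can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: `SearchProxySketch`, `SearchNode`, and `onIndexUpdated` are invented names, and the "process" is an in-memory object rather than a real child JVM. It is not DBSight's actual implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SearchProxySketch {
    /** Hypothetical stand-in for a separately launched search process. */
    static class SearchNode {
        final int indexVersion;
        volatile boolean running = true;

        SearchNode(int indexVersion) {
            this.indexVersion = indexVersion;
        }

        String search(String query) {
            // A real node would query the index; here we only tag the version.
            return "v" + indexVersion + ":" + query;
        }

        void stop() {
            running = false;
        }
    }

    // The proxy holds only a reference to the node currently serving requests.
    private final AtomicReference<SearchNode> current =
            new AtomicReference<>(new SearchNode(1));

    /** Forwards the request to whichever node is current. */
    String search(String query) {
        return current.get().search(query);
    }

    /** On an index update: start a new node, switch over, then stop the old one. */
    void onIndexUpdated() {
        SearchNode fresh = new SearchNode(current.get().indexVersion + 1);
        SearchNode old = current.getAndSet(fresh);
        old.stop();
    }

    public static void main(String[] args) {
        SearchProxySketch proxy = new SearchProxySketch();
        System.out.println(proxy.search("camera"));
        proxy.onIndexUpdated();
        System.out.println(proxy.search("camera"));
    }
}
```

Because the proxy keeps only a reference to the current node, its own memory footprint stays small, which matches the small heap suggested for the web application.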

So no special memory settings are needed for starting or stopping DBSight 4.x itself. However, you need to configure the search process specifically if it needs more than the default 1G heap size.

There are two parts to the memory settings:

1. Configure memory for DBSight J2EE web application

To start or stop DBSight, normal settings like the following are enough:

-Xmx128m -Xnoclassgc
java -Xmx128m -Xnoclassgc -jar start.jar

2. Configure memory for DBSight search process

To specify a larger heap size for each index, go to the "Advanced Settings" page.

The maximum heap size defaults to 1G on a 32-bit JVM and 12G on a 64-bit JVM. This is only an upper limit, and you can configure it as necessary.
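To see which heap limit a given JVM actually ends up with, a small check like the one below can help. It is a generic sketch, not part of DBSight: `HeapCheck` and `toMiB` are hypothetical names, and the program reports only on the JVM it runs in (for example, one started with the same `-Xmx` flag), not on a running DBSight search process.

```java
public class HeapCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most memory this JVM will try to use (roughly -Xmx).
        System.out.println("Max heap (MiB): " + toMiB(rt.maxMemory()));
        // sun.arch.data.model is HotSpot-specific and may be null on other JVMs.
        System.out.println("JVM data model (bits): "
                + System.getProperty("sun.arch.data.model"));
    }
}
```

For example, running it as `java -Xmx128m HeapCheck` should report a value near 128; the exact number can differ slightly between JVMs.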

DBSight Setup

Caching Facet Search (NarrowBy) Results

This setting is on the "Advanced Settings" page.

A narrowBy search takes much longer than a basic search, usually 4 to 5 times as long, especially for searches with a large number of hits. Enabling facet search caching improves performance in these cases.

Facet caching lets you

  1. cache the narrowBy results for the searches with the largest number of hits (call it TopFacetCache)
  2. cache the most recent narrowBy searches (call it LatestFacetCache)

Configure NarrowBy Caching

Here is how you can choose the number of cache entries:

For TopFacetCache, you can choose something like 1000 or more, as long as memory allows.

LatestFacetCache mainly serves users who are paginating through results, so it does not need to be very large. A value of 10 or 100 could be enough.
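The two caches can be sketched as follows to show why their sizes differ. This is an illustrative sketch under assumptions, not DBSight's code: the class names `FacetCacheSketch`, `LruCache`, and `TopCache` are hypothetical, facet results are represented as plain strings, and neither cache is thread-safe.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetCacheSketch {
    /** "Latest" cache: drops the least recently used entry once over capacity. */
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    /** "Top" cache: keeps the facet results of the queries with the most hits. */
    static class TopCache {
        private final int capacity;
        private final Map<String, Integer> hits = new HashMap<>();
        private final Map<String, String> results = new HashMap<>();

        TopCache(int capacity) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.capacity = capacity;
        }

        void put(String query, int hitCount, String facets) {
            if (!results.containsKey(query) && results.size() >= capacity) {
                // Linear scan for the cached query with the fewest hits.
                String smallest = null;
                for (Map.Entry<String, Integer> e : hits.entrySet()) {
                    if (smallest == null || e.getValue() < hits.get(smallest)) {
                        smallest = e.getKey();
                    }
                }
                if (hits.get(smallest) >= hitCount) {
                    return; // the new query does not have enough hits to qualify
                }
                hits.remove(smallest);
                results.remove(smallest);
            }
            hits.put(query, hitCount);
            results.put(query, facets);
        }

        String get(String query) {
            return results.get(query);
        }
    }

    public static void main(String[] args) {
        TopCache top = new TopCache(1000);                     // sized as suggested above
        LruCache<String, String> latest = new LruCache<>(100); // small: serves pagination
        top.put("camera", 250000, "brand, price");
        latest.put("camera lens", "brand, mount");
        System.out.println(top.get("camera"));
        System.out.println(latest.get("camera lens"));
    }
}
```

The top cache pays off for expensive, high-hit searches that many users repeat, while the latest cache only needs to hold a query long enough for one user to page through its results, which is why a much smaller size is enough there.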