Click here to view and discuss this page in DocCommentXchange. In the future, you will be sent there automatically.

SQL Anywhere 11.0.0 » SQL Anywhere Server - SQL Usage » Querying and Modifying Data » Querying data » Full text search » Text indexes

 

Creating text indexes

When you create a text index, you can specify the text configuration object to use. Text configuration objects contain settings that affect how strings are broken into terms for indexing. If you do not specify a text configuration object, a default text configuration object is used. For more information about text configuration objects, including the defaults, see Text configuration objects.

When you create a text index, you must also choose a refresh type. Refreshing text indexes can impact performance, as well as the availability of the underlying tables. Choosing a refresh type involves weighing the tradeoff between performance and data staleness.

Refresh types for text indexes

There are three refresh types supported for text indexes: immediate, automatic, and manual.

  • Immediate refresh   The text index is automatically refreshed when data in the underlying table changes. Use the IMMEDIATE REFRESH clause of the CREATE TEXT INDEX statement to specify this refresh type. This is the default behavior if the refresh type is not specified at text index creation time.

    If you update a column that requires a change to an immediate refresh text index, then the index is changed during the update statement, which may make it significantly more expensive. Therefore, IMMEDIATE REFRESH is only recommended if the data must be up-to-date at all times, or the indexed columns are relatively short.

    An IMMEDIATE REFRESH text index is populated at creation time, and an exclusive lock is held on the table during this initial refresh.

    IMMEDIATE REFRESH indexes support all isolation levels.

    If you alter a column that a text index defined as IMMEDIATE REFRESH is built on, then the text index is immediately updated.

  • Automatic refresh   The text index is automatically refreshed using a time interval that you specify. Use the AUTO REFRESH clause of the CREATE TEXT INDEX statement or ALTER TEXT INDEX statement to specify this refresh type.

    If some data staleness is acceptable, you can define the text index as either MANUAL REFRESH, or as AUTO REFRESH and set the interval at which to refresh the text index. A query on a stale index returns matching data as of the last refresh. Rows that have been inserted, deleted, or updated since the last refresh are not returned by a query.

    You can manually refresh an AUTO REFRESH text index at any time using the REFRESH TEXT INDEX statement. See REFRESH TEXT INDEX statement.

    Refreshes for AUTO REFRESH text indexes always scan the table using isolation level 0. See isolation_level option [compatibility].

    Text indexes defined as AUTO REFRESH may be refreshed more often than the interval specified. The database server executes an internal event every minute to check for required refreshes. It refreshes an index if either the time since the last refresh is larger than the refresh interval, or the total length of all pending rows (pending_length as returned by the sa_text_index_stats system procedure) exceeds 20% of the total index size (doc_length as returned by sa_text_index_stats). See sa_text_index_stats system procedure.

    An AUTO REFRESH index is not automatically refreshed (that is, it contains no data) at creation time, and is not available for use until after the first refresh. The first refresh occurs when an internal refresh event runs, usually within the first minute after the text index is created.

    Changes to indexed values are never reflected in a text index, or in queries that use the text index, until after a refresh.

    AUTO REFRESH text indexes are not refreshed during a reload unless the -g option is specified for dbunload. See Unload utility (dbunload).

  • Manual refresh   The text index is refreshed only when you execute a REFRESH TEXT INDEX statement. Use the MANUAL REFRESH clause of the CREATE TEXT INDEX statement to specify this refresh type.

    MANUAL REFRESH is recommended if data in the underlying table is rarely changed, if a greater degree of data staleness is acceptable, or if you want to tie refreshing to an event or condition. If you define a text index as MANUAL REFRESH, you can design your own refresh strategy. For example, you can use a procedure like the following which refreshes all MANUAL REFRESH text indexes using a refresh interval that is passed as an argument, and rules that are similar to those used for AUTO REFRESH text indexes.

    CREATE PROCEDURE refresh_manual_text_indexes( 
       refresh_interval UNSIGNED INT )
    BEGIN
     FOR lp1 AS c1 CURSOR FOR
       SELECT ts.*
       FROM SYS.SYSTEXTIDX ti JOIN sa_text_index_stats( ) ts
       ON ( ts.index_id = ti.index_id )
       WHERE ti.refresh_type = 1 -- manual refresh indexes only
     DO
       BEGIN
        IF last_refresh IS null 
        OR cast(pending_length as float) / (
           IF doc_length=0 THEN NULL ELSE doc_length ENDIF) > 0.2
        OR DATEDIFF( MINUTE, CURRENT TIMESTAMP, last_refresh )
           > refresh_interval THEN
         EXECUTE IMMEDIATE 'REFRESH TEXT INDEX ' || text-index-name || ' ON "'
         || table-owner || '"."' || table-name || '"';
        END IF;
       END;
      END FOR;
    END;

    At any time, you can use the sa_text_index_stats system procedure to decide if a refresh is needed, and whether the refresh should be a complete rebuild or an incremental update. See sa_text_index_stats system procedure.

    A MANUAL REFRESH text index is not refreshed (populated with data) at creation time, and is not usable until it is refreshed.

    You refresh a MANUAL REFRESH text index using the REFRESH TEXT INDEX statement. See REFRESH TEXT INDEX statement.

    Changes to indexed values are never reflected in a text index, or in queries that use the text index, until after a refresh.

    MANUAL REFRESH text indexes are not refreshed during a reload unless the -g option is specified for dbunload. See Unload utility (dbunload).


Create a text index
Refresh a text index