This page documents the SQL-level interface to the full-text-search module. =fts3= usage is documented, =fts2= and =fts1= are mostly identical. Status: *DRAFT*. I consider this information correct, but haven't done a final review. **Creating and Destroying Tables** _: _sql-command_ ::= *CREATE VIRTUAL TABLE* [ _database-name_ .] _table-name_ *USING fts3* [( [ _argument_ [, _argument_ ]*] )] _: _argument_ ::= _name_ | *TOKENIZE* _tokenizer_ _: _tokenizer_ ::= *SIMPLE* | *PORTER* | _user-defined_ If no tokenizer is requested, then the =SIMPLE= tokenizer is used. Additional user-defined tokenizers may be built into a particular SQLite library, =SIMPLE= and =PORTER= are the only tokenizers provided by default. If no column names are requested, then the single column =content= is provided. Thus, the following statements all create a table with a single column =content= using tokenizer =SIMPLE=: CREATE VIRTUAL TABLE t USING fts3; CREATE VIRTUAL TABLE t USING fts3(); CREATE VIRTUAL TABLE t USING fts3(content); CREATE VIRTUAL TABLE t USING fts3(TOKENIZE simple); CREATE VIRTUAL TABLE t USING fts3(content, TOKENIZE simple); The column names =docid= and =tokenize= are reserved and should not be used in =fts3= create statements. Multiple column names can be provided: CREATE VIRTUAL TABLE t USING fts3(name, address); Only the first word of a column definition is considered in creating the table - all extraneous text is ignored. So the following two statements are identical: CREATE VIRTUAL TABLE t USING fts3(name); CREATE VIRTUAL TABLE t USING fts3(name NOT NULL UNIQUE); Any additional information, including constraints and type information, is ignored. Destroying an =fts= table uses the standard SQLite {link: http://www.sqlite.org/lang_droptable.html DROP TABLE} syntax. **Accessing fts tables** =fts= tables are accessed using =SELECT=, =INSERT=, =UPDATE=, and =DELETE= statements exactly as regular SQLite tables. The set of columns which can be referenced are exactly those enumerated in the create statement, all of which will appear to be of type =TEXT=. The table has the implicit unique index on =rowid=, plus the full-text index (see next section), but otherwise allows for no indices. A query such as: SELECT rowid FROM t WHERE content LIKE 'This%'; will perform a full table scan over all rows. In addition to the explicitly-enumerated columns, the following columns are implicitely present: rowid - a unique 64-bit integer per row, as for a regular table. docid - the preferred alias for rowid. - a "magic" column with the same name as the table, used for snippet generation. These columns should not be referenced in the create statement, and will not appear to exist except when explicitly referenced by name in =SELECT= or =INSERT= statements. When inserting into an =fts= table, if no =docid= is provided, then one is generated in the same way as values are generated for =INTEGER PRIMARY KEY= columns in regular SQLite tables. This =docid= can be accessed using =LAST_INSERT_ROWID()=: INSERT INTO t VALUES ('This is a test'); SELECT LAST_INSERT_ROWID(); The =rowid= or =docid= can be provided to =INSERT= statements, but providing both is an error: INSERT INTO t (rowid, content) VALUES (10, 'This is a test'); INSERT INTO t (docid, content) VALUES (11, 'This is a test'); -- Generates a new docid INSERT INTO t (docid, content) VALUES (null, 'This is a test'); -- Throws an error INSERT INTO t (docid, rowid, content) VALUES (10, 11, 'This is a test'); =fts= tables do not allow =UPDATE= on =docid=, the following statements throw an error: UPDATE t SET docid = docid + 1 WHERE content = 'This is a test'; UPDATE t SET rowid = rowid + 1 WHERE content = 'This is a test'; **Using the Fulltext Index** =fts= tokenizes its input data and builds an index over the resulting terms. To do queries using this index, a new =MATCH= operator is introduced. An example =fts= table: CREATE VIRTUAL TABLE recipe USING fts3(name, ingredients); INSERT INTO recipe VALUES ( 'broccoli stew', 'broccoli peppers cheese tomatoes' ); INSERT INTO recipe VALUES ( 'pumpkin stew', 'pumpkin onions garlic celery' ); INSERT INTO recipe VALUES ( 'broccoli pie', 'broccoli cheese onions flour' ); INSERT INTO recipe VALUES ( 'pumpkin pie', 'pumpkin sugar flour butter' ); To find recipes which are made with onions and cheese: SELECT docid FROM recipe WHERE ingredients MATCH 'onions cheese'; The string to the right of the =MATCH= is tokenized, each token is found in the index, and the set of documents where both tokens are present is returned. Since the =MATCH= has column =ingredients= on the left, this only finds recipes where the tokens are present in that column. A search over all columns can be done using the column with the same name as the table: SELECT docid FROM recipe WHERE recipe MATCH 'stew'; The query language for the right-hand-side of the =MATCH= operator has a few variants, here described by example: -- Recipes with 'onions' and 'cheese' each in any column of the row. SELECT * FROM recipe WHERE recipe MATCH 'onions cheese'; -- Recipes with either 'onions' or 'cheese' each in any column of -- the row. OR must be upper-case. SELECT * FROM recipe WHERE recipe MATCH 'onions OR cheese'; -- Recipes with 'stew' in the name column and 'onions' in the -- ingredients column. SELECT * FROM recipe WHERE recipe MATCH 'name:stew ingredients:onions'; -- Recipes with 'green' and 'onions' next to each other in that -- order, in any field. Known as "phrase search". SELECT * FROM recipe WHERE recipe MATCH '"green onions"'; -- Recipes which use onions but not cheese. SELECT * FROM recipe WHERE recipe MATCH 'onions -cheese'; -- Recipes which have 'onions' in any field and 'cheese' in none. SELECT * FROM recipe WHERE recipe MATCH 'onions -cheese'; -- Prefix search, finds recipes with words that start with the -- characters 'bu' (such as "butter"). SELECT * FROM recipe WHERE recipe MATCH 'bu*'; *NOTE* that the table-named column will not participate if you alias a table in a statement. So this statement will not work: SELECT docid FROM recipe AS ra WHERE ra MATCH 'stew'; it must be written as: SELECT docid FROM recipe AS ra WHERE recipe MATCH 'stew'; *NOTE* that MATCH cannot be used multiple times against the same table. So the following query, while valid syntax, will cause an error: SELECT docid FROM recipe WHERE recipe MATCH 'onions' AND recipe MATCH 'cheese'; It must be written as: SELECT docid FROM recipe WHERE recipe MATCH 'onions cheese';