Tuesday, April 8, 2014

How to use the Path Hierarchy Tokenizer in ElasticSearch

How to use the Path Hierarchy Tokenizer:

0. If you just want to check the output of this tokenizer, you can run:

curl -XGET 'localhost:9200/_analyze?tokenizer=path_hierarchy&filters=lowercase' -d '/something/something/else'
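
To get a feel for what that call should return without a running cluster, the prefix expansion can be imitated in plain shell. This is only a rough sketch: path_hierarchy_tokens is a made-up helper, not part of ElasticSearch, and it assumes the default forward tokenizer with "/" as delimiter followed by the lowercase filter.

```shell
# Hypothetical helper (not an ElasticSearch command): imitates a forward
# path_hierarchy tokenizer with "/" as delimiter plus a lowercase filter.
path_hierarchy_tokens() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | awk -F/ '{
    token = $1
    if (token != "") print token      # path without a leading "/"
    for (i = 2; i <= NF; i++) {
      token = token "/" $i            # grow the prefix one level at a time
      print token
    }
  }'
}

path_hierarchy_tokens '/something/something/else'
# prints:
# /something
# /something/something
# /something/something/else
```

Each line is one token, so a document with this path would be findable under any of its parent directories.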


1. Create the index with its analyzer settings and mapping:

$ curl -XPUT localhost:9200/index_files3/ -d '
{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_path_hierarchy":{
                 "tokenizer":"my_path_hierarchy_tokenizer",
                 "filter":"lowercase"
              }
           },
           "tokenizer" : {
              "my_path_hierarchy_tokenizer" : {
                 "type" : "path_hierarchy",
                 "delimiter" : "/",
                 "replacement" : "*",
                 "buffer_size" : "1024",
                 "reverse" : "true",
                 "skip" : "0"
              }
           }
        }
     }
  },
  "mappings":{
     "file":{
        "properties":{
           "path":{
              "analyzer":"analyzer_path_hierarchy",
              "type":"string"
           }
        }
     }
  }
}'


2. Get the mapping:

curl -XGET 'http://localhost:9200/index_files3/file/_mapping'


3. Add a new document:

$ curl -XPUT 'http://localhost:9200/index_files3/file/1' -d '{
    "name" : "c1",
    "text" : "t1",
    "path" : "/c1/c2/c3"
}'


4. Check if it is there:

$ curl -XGET 'http://localhost:9200/index_files3/file/1'


5. Search:

Fail: curl -XGET 'http://localhost:9200/index_files3/_search?q=path:c1'

Success: curl -XGET 'http://localhost:9200/index_files3/_search?q=path://c1'
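
Why the plain c1 query finds nothing becomes easier to see once you look at the terms that actually get indexed. The sketch below imitates, in plain shell, what the settings from step 1 (reverse set to true, delimiter "/", replacement "*", lowercase filter) are assumed to produce. reverse_path_hierarchy_tokens is a hypothetical helper, and the behaviour it imitates is my reading of the reverse tokenizer, so it is worth confirming against the real analyzer with the commented curl call.

```shell
# Hypothetical helper (not an ElasticSearch command): imitates the assumed
# behaviour of path_hierarchy with reverse=true, a delimiter and a replacement.
# $1 = path, $2 = delimiter (single character), $3 = replacement
reverse_path_hierarchy_tokens() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | awk -v d="$2" -v r="$3" '{
    n = split($0, parts, d)
    for (i = 1; i <= n; i++) {
      token = parts[i]
      # join the remaining levels with the replacement character
      for (j = i + 1; j <= n; j++) token = token r parts[j]
      if (token != "") print token
    }
  }'
}

reverse_path_hierarchy_tokens '/c1/c2/c3' '/' '*'
# prints:
# *c1*c2*c3
# c1*c2*c3
# c2*c3
# c3

# To confirm against the real analyzer (needs the index from step 1):
# curl -XGET 'localhost:9200/index_files3/_analyze?analyzer=analyzer_path_hierarchy' -d '/c1/c2/c3'
```

If the real analyzer agrees with this sketch, c1 on its own is never indexed as a term, which is consistent with the failing query above, while a suffix such as c3 would be.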