In this post I will combine the popularity of a document with the most searched terms, in order to boost results based on past searches issued against a particular index.
PUT blogposts
PUT statistics
PUT /blogposts/post/1
{
"title": "About popularity",
"content": "In this post we will talk about...",
"votes": 6
}
PUT /blogposts/post/2
{
"title": "About elasticsearch",
"content": "In this post we will talk about...",
"votes": 3
}
PUT /blogposts/post/3
{
"title": "About popularity",
"content": "In this post we will talk about...",
"votes": 7
}
PUT /statistics/queries/1
{
"user_query": "popularity"
}
PUT /statistics/queries/2
{
"user_query": "popularity in elasticsearch"
}
PUT /statistics/queries/3
{
"user_query": "boost"
}
PUT /statistics/queries/4
{
"user_query": "boost in elasticsearch"
}
PUT /statistics/queries/5
{
"user_query": "elasticsearch is the best search engine"
}
GET blogposts/post/_mapping
GET statistics/queries/_mapping
POST statistics/queries/_search
{
"query" : {
"match_all" : {}
},
"facets": {
"keywords": {
"terms": {
"field": "user_query"
}
}
}
}
POST blogposts/post/_search
{
"sort" : [
{ "votes" : {"order" : "desc"}},
"_score"
],
"query" : {
"match" : {
"title":{
"query":"elasticsearch popularity"
}
}
}
}
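To make the statistics step more concrete, here is a minimal plain-Java sketch of the kind of count the terms facet above returns: for each term, the number of stored queries that contain it. It does not call Elasticsearch. The class name `QueryTermCounts` and the method `count` are hypothetical, and lowercasing plus splitting on whitespace is only an assumption standing in for the analyzer configured on "user_query".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the Elasticsearch API.
class QueryTermCounts {

    // For each term, counts how many queries contain it at least once.
    static Map<String, Integer> count(List<String> queries) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String query : queries) {
            // Assumption: lowercase + whitespace split approximates the real analyzer.
            Set<String> terms = new HashSet<>(
                    Arrays.asList(query.toLowerCase().trim().split("\\s+")));
            for (String term : terms) {
                if (!term.isEmpty()) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The five sample queries indexed into statistics/queries above.
        List<String> queries = Arrays.asList(
                "popularity",
                "popularity in elasticsearch",
                "boost",
                "boost in elasticsearch",
                "elasticsearch is the best search engine");
        System.out.println(count(queries));
    }
}
```

With the sample data, "elasticsearch" appears in three of the five queries, while "popularity" and "boost" appear in two each. These are the terms a boost based on past searches would favour.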
Thursday, September 11, 2014
Wednesday, April 23, 2014
[ElasticSearch] An example of using SetFetchSource
setFetchSource is a new feature in ElasticSearch 1.1. Here is an example of how to use it:
String [] excludes = {"field0"};
String [] includes = {"field1","field2"};
SearchResponse searcher = client.getClient().prepareSearch(INDEX).setFetchSource(includes, excludes).setQuery(qb).execute().actionGet();
If you do not include the "addFields()" command, you will not be able to iterate over the fields of the hits. However, you can iterate using "sourceAsMap()".
for (SearchHit hit : searcher.getHits().getHits()) {
Map<String, Object> hits = hit.sourceAsMap();
for (String key : hits.keySet()) {
//do something
}
Object [] fieldValues = hits.values().toArray();
for (Object fieldValue:fieldValues) {
//do something
}
}
As you will notice, the fields listed in "excludes" will not show up. Also, if you call "hit.getFields()", the result will be empty.
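As a rough illustration of the include/exclude behaviour described above, the following plain-Java sketch applies the same idea to an ordinary map. It is not the Elasticsearch implementation: `SourceFilterSketch` and `filter` are hypothetical names, the sample values are made up, and only exact field names are matched here, whereas the real feature also accepts wildcard patterns.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that mimics include/exclude filtering on a source map.
class SourceFilterSketch {

    static Map<String, Object> filter(Map<String, Object> source,
                                      List<String> includes,
                                      List<String> excludes) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            // An empty include list means "everything is included".
            boolean included = includes.isEmpty() || includes.contains(entry.getKey());
            boolean excluded = excludes.contains(entry.getKey());
            if (included && !excluded) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up document using the field names from the example above.
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("field0", "a");
        source.put("field1", "b");
        source.put("field2", "c");
        source.put("field3", "d");

        List<String> includes = Arrays.asList("field1", "field2");
        List<String> excludes = Collections.singletonList("field0");

        // Only field1 and field2 survive the filter.
        System.out.println(filter(source, includes, excludes));
    }
}
```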
References:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html
https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/source/SourceFetchingTests.java
Labels:
ElasticSearch,
Handson,
SetFetchSource
Tuesday, April 22, 2014
[ElasticSearch] QueryBuild vs. String: ElasticsearchParseException[Failed to derive xcontent from...
I had a problem while writing tests for the new setFetchSource feature of ES 1.1. After some testing, research, and thinking (trial and error), I finally realised that the problem arises when you use a plain string query instead of a QueryBuilder.
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to execute phase [query_fetch], all shards failed; shardFailures {[KMpFYBpxRECgm3gJC_1-uw][content][0]: SearchParseException[[content][0]: from[-1],size[10]: Parse Failure [Failed to parse source [{"size":10,"query_binary":"VGVzdHRleHQ=","_source":{"includes":["CONTENTID","URL"],"excludes":["CONTENT"]},"fields":["CONTENTID","URL"]}]]]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=8): [84, 101, 115, 116, 116, 101, 120, 116]]; }
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:272)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:224)
at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:307)
at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
This makes sense, since the error says: "failed to derive xcontent from (offset". However, as a non-native English speaker, it took me some time to understand. Just change:
String query="";
SearchResponse response = client.prepareSearch(CONTENT_INDEX).setFetchSource(includes, excludes).setQuery(query).addFields(includes)
.setSize(10).execute().actionGet();
To:
String query="";
QueryStringQueryBuilder qb = QueryBuilders.queryString(query);
SearchResponse response = client.prepareSearch(CONTENT_INDEX).setFetchSource(includes, excludes).setQuery(qb).addFields(includes)
.setSize(10).execute().actionGet();
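One way to read the error message is to decode what it reports. The small sketch below (plain Java 8+, no Elasticsearch dependency; the class and method names are hypothetical) decodes the "query_binary" value and the byte list copied from the stack trace above. Both turn out to be the raw query text rather than a JSON query body, which appears to be why ES could not derive the content type from it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; the inputs are copied verbatim from the error message.
class QueryBinaryDecode {

    // Decodes the Base64 "query_binary" value into text.
    static String decodeQueryBinary(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Turns the list of byte values printed by the exception into text.
    static String bytesToText(int[] codes) {
        StringBuilder text = new StringBuilder();
        for (int code : codes) {
            text.append((char) code);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String fromBase64 = decodeQueryBinary("VGVzdHRleHQ=");
        String fromBytes = bytesToText(new int[] {84, 101, 115, 116, 116, 101, 120, 116});

        // Both print the same plain string: Testtext
        System.out.println(fromBase64);
        System.out.println(fromBytes);
    }
}
```

Wrapping the same text in a QueryBuilder, as in the fix above, produces a structured query body instead of bare text.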
Labels:
ElasticSearch,
failed to derive xcontent from,
QueryBuilders
Tuesday, April 8, 2014
How to use the Path Hierarchy Tokenizer in ElasticSearch
How to use the Path Hierarchy Tokenizer:
0. If you just want to check the output of this tokenizer, you can run:
curl -XGET 'localhost:9200/_analyze?tokenizer=path_hierarchy&filters=lowercase' -d '/something/something/else'
1. Create the index with the analyzer settings and the mapping:
$ curl -XPUT localhost:9200/index_files3/ -d '
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"analyzer_path_hierarchy":{
"tokenizer":"my_path_hierarchy_tokenizer",
"filter":"lowercase"
}
},
"tokenizer" : {
"my_path_hierarchy_tokenizer" : {
"type" : "path_hierarchy",
"delimiter" : "/",
"replacement" : "*",
"buffer_size" : "1024",
"reverse" : "true",
"skip" : "0"
}
}
}
}
},
"mappings":{
"file":{
"properties":{
"path":{
"analyzer":"analyzer_path_hierarchy",
"type":"string"
}
}
}
}
}'
2. Get the mapping:
curl -XGET 'http://localhost:9200/index_files3/file/_mapping'
3. Add a new document:
$ curl -XPUT 'http://localhost:9200/index_files3/file/1' -d '{
"name" : "c1",
"text" : "t1",
"path" : "/c1/c2/c3"
}'
4. Check if it is there:
$ curl -XGET 'http://localhost:9200/index_files3/file/1'
5. Search:
Fail: curl -XGET 'http://localhost:9200/index_files3/_search?q=path:c1'
Success: curl -XGET 'http://localhost:9200/index_files3/_search?q=path://c1'
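For readers who want to see what step 0 returns without a running node, here is a simplified plain-Java imitation of the default (forward) path_hierarchy tokenizer followed by a lowercase filter. It is only a sketch: `PathHierarchySketch` and `tokenize` are hypothetical names, and it does not model the "reverse", "replacement", "skip", or "buffer_size" options used in the index settings of step 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified imitation of the default path_hierarchy tokenizer.
class PathHierarchySketch {

    // Emits one token per path prefix that ends just before a delimiter,
    // plus the full path itself, all lowercased.
    static List<String> tokenize(String path, char delimiter) {
        String lowered = path.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        // Start at 1 so a leading delimiter does not produce an empty token.
        for (int i = 1; i < lowered.length(); i++) {
            if (lowered.charAt(i) == delimiter) {
                tokens.add(lowered.substring(0, i));
            }
        }
        if (!lowered.isEmpty()) {
            tokens.add(lowered);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same input as the _analyze call in step 0.
        // prints [/something, /something/something, /something/something/else]
        System.out.println(tokenize("/something/something/else", '/'));
    }
}
```

Each token keeps the whole prefix of the path, which is consistent with the bare term "c1" in step 5 not matching any indexed token.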
Labels:
ElasticSearch,
Path Hierarchy Tokenizer
Tuesday, March 11, 2014
[DRAFT] Collection of texts on how to modify the ElasticSearch similarity function
It is necessary to extend the classes:
org.apache.lucene.search.similarities.Similarity;
org.elasticsearch.index.similarity.AbstractSimilarityProvider;
Since ES 0.90 it is possible to change the similarity function for each field. This document explains how:
http://elasticsearchserverbook.com/elasticsearch-0-90-similarities/
General information:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-similarity.html
About the tf/idf and BM25:
https://www.found.no/foundation/similarity/
http://stackoverflow.com/questions/19423423/simple-explanation-of-different-elasticsearch-similarity-algorithms
https://groups.google.com/forum/#!topic/elasticsearch/UZYa49_9AFg
Implementation of a custom function:
http://elasticsearch-users.115913.n3.nabble.com/How-to-use-ElasticSearch-Custom-Similarity-provider-classes-td4047683.html
https://github.com/awnuxkjy/es-custom-similarity-provider/tree/master/src/main/java/org/elasticsearch/index/similarity
http://elasticsearch-users.115913.n3.nabble.com/Configuring-a-Custom-Similarity-td4034063.html