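These notes document installing the IK Analysis Chinese word-segmentation plugin for ElasticSearch, then testing tokenization and search.
Related notes:
Installing ElasticSearch from an RPM package on CentOS 6.9
Installing ElasticSearch on CentOS 6.9
Installing IK
Method 1: download the prebuilt package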
wget -c https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.6.0/elasticsearch-analysis-ik-6.6.0.zip
Create the plugin folder: cd your-es-root/plugins/ && mkdir ik
Unzip the plugin into the folder your-es-root/plugins/ik
Method 2: install with elasticsearch-plugin (supported from v5.5.1)
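The manual steps of method 1 can also be scripted. The following is a minimal Python sketch, not part of the original notes: the helper names `release_url` and `extract_plugin` are hypothetical, and the URL pattern is simply generalized from the wget command above on the assumption that other releases follow the same naming.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

# URL pattern generalized from the wget command above (assumption: other
# releases follow the same naming scheme).
RELEASE_URL = (
    "https://github.com/medcl/elasticsearch-analysis-ik/releases/download/"
    "v{version}/elasticsearch-analysis-ik-{version}.zip"
)


def release_url(version: str) -> str:
    """Return the download URL of the prebuilt IK package for one version."""
    return RELEASE_URL.format(version=version)


def download(url: str) -> bytes:
    """Fetch the archive over HTTP and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_plugin(zip_bytes: bytes, es_root: str) -> Path:
    """Unpack the archive into <es_root>/plugins/ik and return that folder."""
    target = Path(es_root) / "plugins" / "ik"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target)
    return target
```

A call such as `extract_plugin(download(release_url("6.6.0")), "your-es-root")` would mirror the three steps above, with `your-es-root` replaced by the real installation path.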
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.6.0/elasticsearch-analysis-ik-6.6.0.zip
Restart elasticsearch after installation
service elasticsearch restart
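After the restart it can be worth confirming that the node actually loaded the plugin. The sketch below queries the `_cat/plugins` API and filters the rows. It assumes the plugin registers under the component name `analysis-ik`, which the original notes do not state, and the helper names are hypothetical.

```python
import json
import urllib.request


def list_plugins(base_url: str) -> list:
    """Fetch the plugins reported by every node via the _cat/plugins API."""
    with urllib.request.urlopen(base_url + "/_cat/plugins?format=json") as resp:
        return json.loads(resp.read().decode("utf-8"))


def nodes_with_plugin(rows: list, component: str) -> list:
    """Return the names of the nodes that report the given plugin component."""
    return [row["name"] for row in rows if row.get("component") == component]
```

For example, `nodes_with_plugin(list_plugins("http://192.168.75.135:9200"), "analysis-ik")` should return a non-empty list if the installation worked, under the naming assumption above.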
Tokenization test
Create an index
curl -XPUT http://192.168.75.135:9200/ikindex
Create a mapping
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/_mapping -H 'Content-Type:application/json' -d'
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_max_word"
    }
  }
}'
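The mapping body above can be generated rather than typed by hand, which helps when several fields need the same analyzers. This is only an illustrative sketch: `text_field_mapping` is a hypothetical helper, and the second call uses `ik_smart`, the coarser analyzer that IK also provides, purely as an example of a different search-time setting.

```python
import json
from typing import Optional


def text_field_mapping(field: str, analyzer: str,
                       search_analyzer: Optional[str] = None) -> dict:
    """Build a mapping body for one text field."""
    return {
        "properties": {
            field: {
                "type": "text",
                "analyzer": analyzer,
                # Fall back to the index-time analyzer when none is given.
                "search_analyzer": search_analyzer or analyzer,
            }
        }
    }


# Same body as the curl request above.
same_as_above = json.dumps(text_field_mapping("content", "ik_max_word"))

# Variant: fine-grained tokens at index time, coarser tokens at search time.
variant = json.dumps(text_field_mapping("content", "ik_max_word", "ik_smart"))
```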
Test with the default analyzer
curl -XGET 'http://192.168.75.135:9200/ikindex/_analyze?pretty=true' -H 'Content-Type:application/json' -d'{"text":"我的学习笔记"}'
{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "的",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "学",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "习",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    },
    {
      "token" : "笔",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "<IDEOGRAPHIC>",
      "position" : 4
    },
    {
      "token" : "记",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "<IDEOGRAPHIC>",
      "position" : 5
    }
  ]
}
Test with the IK analyzer
curl -XGET 'http://192.168.75.135:9200/ikindex/_analyze?pretty=true' -H 'Content-Type:application/json' -d'{"analyzer":"ik_max_word","text":"我的学习笔记"}'
{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "的",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "学习",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "笔记",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}
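The two responses differ in how the text is split: the default analyzer emits one token per character, while ik_max_word keeps 学习 and 笔记 together as words. A small Python sketch can make that comparison explicit. The function names are hypothetical, and the two sample dictionaries are copies of the responses above trimmed down to the `token` field.

```python
import json
import urllib.request


def analyze(base_url, index, text, analyzer=None):
    """Call the _analyze API; leave `analyzer` unset to use the index default."""
    payload = {"text": text}
    if analyzer is not None:
        payload["analyzer"] = analyzer
    request = urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


def token_strings(response):
    """Reduce an _analyze response to its list of token strings."""
    return [entry["token"] for entry in response["tokens"]]


# Responses from the two tests above, trimmed to the "token" field.
default_sample = {"tokens": [{"token": t} for t in "我的学习笔记"]}
ik_sample = {"tokens": [{"token": t} for t in ["我", "的", "学习", "笔记"]]}

print(token_strings(default_sample))
print(token_strings(ik_sample))
```

Against a live node, `token_strings(analyze("http://192.168.75.135:9200", "ikindex", "我的学习笔记", "ik_max_word"))` would perform the same request as the curl command.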
Search test
Index some documents
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/1 -H 'Content-Type:application/json' -d'{"content":"家明的家,我的学习笔记"}'
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/2 -H 'Content-Type:application/json' -d'{"content":"php是世界上最好的语言"}'
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/3 -H 'Content-Type:application/json' -d'{"content":"我喜欢户外,喜欢钓鱼,喜欢自驾,我是家明"}'
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/4 -H 'Content-Type:application/json' -d'{"content":"这是我的个人博客,家明的学习笔记"}'
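curl -XPOST http://192.168.75.135:9200/ikindex/iktype/4 -H 'Content-Type:application/json' -d'{"content":"爱学习的孩纸,才是个好孩纸"}'
Note: this last request reuses ID 4, so it overwrites the document indexed by the previous request. That is why the search results below show this content under _id 4. Give it a different ID if both documents should be kept.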
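Indexing documents one request at a time makes it easy to reuse an ID by accident. As an alternative, the documents could be sent in a single request to the `_bulk` API, which expects newline-delimited JSON with the header `Content-Type: application/x-ndjson`. The sketch below only builds the request body; `build_bulk_body` is a hypothetical helper, and deriving IDs from the list position is an assumption made here so that every document gets a distinct ID.

```python
import json


def build_bulk_body(index, doc_type, contents):
    """Build an NDJSON body for the _bulk API, one index action per document."""
    lines = []
    for doc_id, content in enumerate(contents, start=1):
        # Action line: where the document goes and which ID it gets.
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        # Source line: the document itself, keeping Chinese text readable.
        lines.append(json.dumps({"content": content}, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("ikindex", "iktype", [
    "家明的家,我的学习笔记",
    "php是世界上最好的语言",
])
print(body)
```

Because every document keeps its own ID in this variant, a search over data indexed this way may return more hits than the results recorded in these notes.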
Search with highlighted results
curl -XPOST http://192.168.75.135:9200/ikindex/iktype/_search -H 'Content-Type:application/json' -d'
{
  "query" : { "match" : { "content" : "学习笔记" }},
  "highlight" : {
    "pre_tags" : ["<tag1>", "<tag2>"],
    "post_tags" : ["</tag1>", "</tag2>"],
    "fields" : {
      "content" : {}
    }
  }
}'
Search results
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.66301036,
    "hits": [{
      "_index": "ikindex",
      "_type": "iktype",
      "_id": "4",
      "_score": 0.66301036,
      "_source": {
        "content": "爱学习的孩纸,才是个好孩纸"
      },
      "highlight": {
        "content": ["爱<tag1>学习</tag1>的孩纸,才是个好孩纸"]
      }
    }, {
      "_index": "ikindex",
      "_type": "iktype",
      "_id": "1",
      "_score": 0.5753642,
      "_source": {
        "content": "家明的家,我的学习笔记"
      },
      "highlight": {
        "content": ["家明的家,我的<tag1>学习</tag1><tag1>笔记</tag1>"]
      }
    }]
  }
}
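A client usually needs only the document ID, the score, and the highlighted fragments from such a response. The following sketch shows one way to pull those out. `summarize_hits` is a hypothetical helper, and the sample dictionary is the response above trimmed to the fields the function reads.

```python
def summarize_hits(response, field="content"):
    """Return (id, score, highlight fragments) for every hit, in ranked order."""
    rows = []
    for hit in response["hits"]["hits"]:
        # A hit has no "highlight" entry when nothing in the field matched.
        fragments = hit.get("highlight", {}).get(field, [])
        rows.append((hit["_id"], hit["_score"], fragments))
    return rows


# The search response above, trimmed to the fields used here.
sample = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "4", "_score": 0.66301036,
             "highlight": {"content": ["爱<tag1>学习</tag1>的孩纸,才是个好孩纸"]}},
            {"_id": "1", "_score": 0.5753642,
             "highlight": {"content": ["家明的家,我的<tag1>学习</tag1><tag1>笔记</tag1>"]}},
        ],
    }
}

for doc_id, score, fragments in summarize_hits(sample):
    print(doc_id, score, fragments)
```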