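1. Elasticsearch single-node deployment
1. Introduction
1. Lucene
Lucene is a jar that packages the code for building inverted indexes and searching them, together with the supporting algorithms. When developing in Java, you add the Lucene jar to the project and build on the Lucene API.
With it you can index your existing data; Lucene organizes the index data structures on the local disk. You can then use the Lucene API to search the index data stored on disk.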
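To make the term "inverted index" concrete, here is a toy sketch in shell. It is not how Lucene stores its index on disk; it only shows the idea of mapping each term to the IDs of the documents that contain it. The function name build_inverted_index and the "id<TAB>text" input format are assumptions made up for this illustration.

```shell
# Toy inverted index: reads "doc_id<TAB>text" lines on stdin and prints
# "term<TAB>doc_id,doc_id,..." sorted by term. Illustration only.
build_inverted_index() {
  awk -F'\t' '
    {
      n = split(tolower($2), words, /[^a-z0-9]+/)
      for (i = 1; i <= n; i++) {
        w = words[i]
        if (w == "") continue            # skip empty tokens
        if (seen[w, $1]++) continue      # count each term once per document
        if (w in postings) postings[w] = postings[w] "," $1
        else postings[w] = $1
      }
    }
    END { for (w in postings) print w "\t" postings[w] }
  ' | sort
}

# Two sample documents; show which documents contain "love" and "rock".
printf '1\tI love to go rock climbing\n2\tI like to collect rock albums\n' \
  | build_inverted_index | grep -E '^(love|rock)'
```

A search for "rock" then becomes a lookup of one line in this table instead of a scan over every document.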
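2. Elasticsearch
A distributed search and analytics engine.
Elasticsearch is also written in Java and uses Lucene at its core for all indexing and search functionality, but its aim is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy.
Summary:
ES is Lucene wrapped in a shell that simplifies Lucene's complicated workflow. Elasticsearch is not a new technology in itself: it combines full-text search (Lucene), data analytics, and distributed-systems techniques, and that combination is what makes ES unique.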
2. Installation
1. Install Java
yum install -y java-1.8.0-openjdk.x86_64
2. Download and install the package
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.0.rpm
rpm -ivh elasticsearch-6.6.0.rpm
warning: elasticsearch-6.6.0.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY
Preparing... ################################# [100%]
Creating elasticsearch group... OK
Creating elasticsearch user... OK
Updating / installing...
1:elasticsearch-0:6.6.0-1 ################################# [100%]
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch
### Enable and start the service
systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch.service
systemctl status elasticsearch.service
### Check that it started successfully
[ soft]# netstat -lntup |grep 9200
tcp6 0 0 127.0.0.1:9200 :::* LISTEN 10317/java
tcp6 0 0 ::1:9200 :::* LISTEN 10317/java
[ soft]# curl 127.0.0.1:9200
{
"name" : "ixFqenL",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "IOiVybhlTe6X6j1YUNOrMA",
"version" : {
"number" : "6.6.0",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "a9861f4",
"build_date" : "2019-01-24T11:27:09.439740Z",
"build_snapshot" : false,
"lucene_version" : "7.6.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
3. Configuration files
rpm -ql elasticsearch #list every file and directory installed by the elasticsearch package
rpm -qc elasticsearch #list the package's configuration files
/etc/elasticsearch/elasticsearch.yml #main configuration file
/etc/elasticsearch/jvm.options #JVM options
/etc/init.d/elasticsearch #init startup script
/etc/sysconfig/elasticsearch #environment variables
/usr/lib/sysctl.d/elasticsearch.conf #sysctl settings (sets vm.max_map_count)
/usr/lib/systemd/system/elasticsearch.service #systemd unit file
/var/lib/elasticsearch #data directory
/var/log/elasticsearch #log directory
/var/run/elasticsearch #pid directory
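1. Edit the configuration file
[ soft]# grep "^[a-zA-Z]" /etc/elasticsearch/elasticsearch.yml
## Node name
node.name: node-1
## Data directory
path.data: /data/elasticsearch
## Log directory
path.logs: /var/log/elasticsearch
## Lock memory (claim it up front so it cannot be swapped out)
bootstrap.memory_lock: true
## Bind address; if unset it defaults to 127.0.0.1, so only the local machine can connect
network.host: 192.168.100.29
## HTTP port, 9200 by default
http.port: 9200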
2. Check the minimum and maximum heap size
[ soft]# vi /etc/elasticsearch/jvm.options
-Xms1g
-Xmx1g
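Memory guidelines from the official documentation:
1. Do not give the heap more than 32 GB
2. Set the minimum and maximum heap size to the same value
3. Enable memory locking in the configuration file
4. Leave at least 50% of the server's memory to the operating system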
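As a rough sketch of how these guidelines turn into numbers, the helper below suggests a heap size from the machine's total RAM. The name suggest_heap_mb is hypothetical, and the 31 GB cap (31744 MB) is an assumed safety margin under the 32 GB limit, not a value taken from these notes.

```shell
# Suggest a heap size in MB for a given amount of total RAM in MB:
# half of RAM (leaving the rest to the OS), capped at an assumed 31 GB.
suggest_heap_mb() {
  ram_mb=$1
  half_mb=$((ram_mb / 2))
  cap_mb=31744
  if [ "$half_mb" -gt "$cap_mb" ]; then
    echo "$cap_mb"
  else
    echo "$half_mb"
  fi
}

# On Linux, read total RAM from /proc/meminfo and print matching
# -Xms/-Xmx lines (same value for both) for /etc/elasticsearch/jvm.options.
if [ -r /proc/meminfo ]; then
  ram_total_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
  heap_mb=$(suggest_heap_mb "$ram_total_mb")
  printf '%s\n' "-Xms${heap_mb}m" "-Xmx${heap_mb}m"
fi
```

The output is only a starting point; the values still have to be written into jvm.options by hand, followed by a restart of the service.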
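3. Create the data directory
mkdir -p /data/elasticsearch
## The package creates an elasticsearch user; the directory must belong to that user, otherwise ES cannot write data
chown -R elasticsearch:elasticsearch /data/elasticsearch/
4. Start the service (this fails at this point)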
tail -f /var/log/elasticsearch/elasticsearch.log
[1]: memory locking requested for elasticsearch process but memory is not locked
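Official fix:
### Edit the unit file or create an override file
Method 1: systemctl edit elasticsearch
Method 2: vim /usr/lib/systemd/system/elasticsearch.service (changes made here are lost when the package is upgraded)
### Add the following setting
[Service]
LimitMEMLOCK=infinity
### Restart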
systemctl daemon-reload
systemctl restart elasticsearch
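After the restart it is worth confirming that the lock actually took effect. The nodes info API reports an mlockall flag per node; the sketch below checks that flag in the JSON response. The helper name mlockall_enabled is hypothetical, and it uses grep instead of a JSON parser to avoid extra dependencies.

```shell
# Reads a _nodes JSON response on stdin. Succeeds only if at least one node
# reports "mlockall": true and no node reports "mlockall": false.
mlockall_enabled() {
  body=$(cat)
  printf '%s' "$body" \
    | grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*true' || return 1
  if printf '%s' "$body" \
    | grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*false'; then
    return 1
  fi
  return 0
}

# Usage against the node configured above (needs a running Elasticsearch):
#   curl -s '192.168.100.29:9200/_nodes?filter_path=**.mlockall' \
#     | mlockall_enabled && echo "memory is locked" || echo "memory is NOT locked"
```

If the check fails, the "memory is not locked" line will usually show up again in /var/log/elasticsearch/elasticsearch.log.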
5. Configuration complete
[ ~]# netstat -lntup |grep 9200
tcp6 0 0 192.168.100.29:9200 :::* LISTEN 10848/java
[ ~]# curl 192.168.100.29:9200
{
"name" : "node-1",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "q04N04iKQ-KaU8_KPCLJBw",
"version" : {
"number" : "6.6.0",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "a9861f4",
"build_date" : "2019-01-24T11:27:09.439740Z",
"build_snapshot" : false,
"lucene_version" : "7.6.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
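}
4. Installing the head plugin
vi /etc/elasticsearch/elasticsearch.yml
http.cors.enabled: true
http.cors.allow-origin: "*"
systemctl restart elasticsearch #apply the new settings
Then load the head extension package into Google Chrome.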


Elasticsearch    Relational database
====================================
Document         Row
Type             Table
Index            Database
Field            Column
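In Elasticsearch 6.x this mapping is visible directly in the REST path of a document: /<index>/<type>/<id>. A minimal sketch, where doc_url is a hypothetical helper and the index, type and ID values are the ones used in these notes:

```shell
# Build the REST URL of a single document from host:port, index, type and ID.
doc_url() {
  printf 'http://%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Index "vipinfo" (database), type "user" (table), document 1 (row).
doc_url 192.168.100.29:9200 vipinfo user 1
# prints http://192.168.100.29:9200/vipinfo/user/1
```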
Creating an index
curl -XPUT '192.168.100.29:9200/vipinfo?pretty'
curl -XPUT '192.168.100.29:9200/vipinfo/user/1?pretty' -H 'Content-Type: application/json' -d '
{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [ "sports", "music" ]
}'
curl -XPUT '192.168.100.29:9200/vipinfo/user/2?pretty' -H 'Content-Type: application/json' -d '
{
"first_name": "Jane",
"last_name": "Smith",
"age": 32,
"about": "I like to collect rock albums",
"interests": [ "music" ]
}'
curl -XPUT '192.168.100.29:9200/vipinfo/user/3?pretty' -H 'Content-Type: application/json' -d '
{
"first_name": "Douglas",
"last_name": "Fir",
"age": 35,
"about": "I like to build cabinets",
"interests": [ "forestry" ]
}'
The ID is the document's primary key and must be unique: indexing with an ID that already exists overwrites that document. If no ID is given (a POST without an ID in the path), Elasticsearch generates a random one.
Letting Elasticsearch generate the ID improves indexing speed, because with an explicit ID it has to check on every insert whether that ID already exists.
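The difference between an explicit and a generated ID comes down to the HTTP method and path: PUT to /<index>/<type>/<id> versus POST to /<index>/<type>. The sketch below wraps both cases. The names index_doc, ES_HOST and CURL are made up for this example, the sample documents in the usage comments are invented values, and the address is the one used in these notes.

```shell
ES_HOST=${ES_HOST:-192.168.100.29:9200}
CURL=${CURL:-curl}   # override with a stub to preview requests without sending

# index_doc <index> <type> <json-body> [id]
# With an ID: PUT /<index>/<type>/<id>. Without: POST /<index>/<type>,
# which lets Elasticsearch generate the ID.
index_doc() {
  index=$1
  doc_type=$2
  body=$3
  id=${4:-}
  if [ -n "$id" ]; then
    method=PUT
    url="http://$ES_HOST/$index/$doc_type/$id?pretty"
  else
    method=POST
    url="http://$ES_HOST/$index/$doc_type?pretty"
  fi
  "$CURL" -s -X "$method" "$url" -H 'Content-Type: application/json' -d "$body"
}

# Usage (needs a running Elasticsearch; documents are invented examples):
#   index_doc vipinfo user '{"first_name":"Ann","age":28}'      # generated ID
#   index_doc vipinfo user '{"first_name":"Bob","age":41}' 4    # explicit ID 4
```

With a generated ID, the response body contains the new "_id" value, which is needed to fetch or update the document later.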