A First Try at Deploying a Hadoop Cluster
Installation steps:
1. Plan the machines
2. Change the hostnames, set up passwordless SSH, install the JDK
3. Edit the configuration files and create directories
4. Start the services
1. Plan the machines (centos1 as master)
Plan for three machines: centos1 serves as the master, and the other two act as slaves.
10.240.139.101 centos1
10.240.140.20 centos2
10.240.139.72 centos3
centos1 runs the NameNode, SecondaryNameNode, and ResourceManager
centos2 and centos3 each run a DataNode and a NodeManager
2. Change the hostnames, set up passwordless SSH, install the JDK
[root@centos1 bin]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=centos1
NTPSERVERARGS=iburst
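Editing /etc/sysconfig/network only takes effect on the next boot; on CentOS 6 the new name can also be applied to the running system right away (run the matching command on centos2 and centos3):
hostname centos1                # apply the new hostname without a reboot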
[root@centos1 bin]# vi /etc/hosts
10.240.139.101 centos1
10.240.140.20 centos2
10.240.139.72 centos3
Set up passwordless SSH. Once it works, ssh localhost and ssh centos2 should log in without prompting for a password.
# On centos1
cd ~/.ssh
rm ./id_rsa*                    # remove any old keys
ssh-keygen -t rsa               # generate a new key pair; just press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys
scp authorized_keys root@centos2:~/.ssh/authorized_keys_from_centos1
# On centos2 (repeat the same steps for centos3)
cd ~/.ssh
cat authorized_keys_from_centos1 >> ./authorized_keys
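As an aside, the copy-and-append steps can usually be replaced with ssh-copy-id, which appends the local public key to the remote authorized_keys in one step:
ssh-copy-id root@centos2        # asks for the password once, then appends id_rsa.pub remotely
ssh-copy-id root@centos3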
Disable the firewall:
sudo service iptables stop      # stop the firewall service
sudo chkconfig iptables off     # disable it at boot, so it never has to be stopped by hand again
JDK installation is skipped here.
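Since it is skipped, only a minimal sketch of the environment setup, assuming the JDK was unpacked to /usr/local/jdk1.7.0 (adjust to the actual install path):
# in /etc/profile or ~/.bashrc; the JDK path is an assumption
export JAVA_HOME=/usr/local/jdk1.7.0
export PATH=$JAVA_HOME/bin:$PATH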
3. Edit the configuration files
hadoop-env.sh:
Hadoop's environment script; JAVA_HOME must be set here (see the one-line sketch after this list)
yarn-env.sh:
YARN's environment script; JAVA_HOME must be set here as well
core-site.xml:
Hadoop's global default parameters
hdfs-site.xml:
HDFS parameters
yarn-site.xml:
YARN parameters
mapred-site.xml:
MapReduce parameters
slaves:
the list of slave nodes
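For both env scripts the change is a single line; the JDK path below is the same assumption as above and should match wherever the JDK actually lives:
# in etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/local/jdk1.7.0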
core-site.xml:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://centos1:9000</value>
    </property>
</configuration>
hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>centos1:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
slaves (one hostname per line):
centos2
centos3
yarn-site.xml:
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>centos1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>centos1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>centos1:19888</value>
    </property>
</configuration>
Create the directories /usr/local/hadoop/tmp/dfs/name and /usr/local/hadoop/tmp/dfs/data.
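For example (the name directory is only used on centos1 and the data directory only on the slaves, but creating both on every node does no harm):
mkdir -p /usr/local/hadoop/tmp/dfs/name /usr/local/hadoop/tmp/dfs/data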
4. Start the services
./bin/hdfs namenode -format     # format HDFS; run this once, on centos1 only
./sbin/start-dfs.sh             # centos1 should now show NameNode and SecondaryNameNode; centos2 and centos3 should show DataNode
./sbin/start-yarn.sh            # centos1 should now show a ResourceManager process; centos2 and centos3 should show NodeManager
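"Should show" means via jps, which lists the running JVMs on each node; the output looks roughly like this (PIDs will differ):
[root@centos1 ~]# jps
2481 NameNode
2665 SecondaryNameNode
2820 ResourceManager
2905 Jps
Note that mapred-site.xml above configures a JobHistory server on centos1, but start-yarn.sh does not launch it; in Hadoop 2.x it is started separately:
./sbin/mr-jobhistory-daemon.sh start historyserver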
HDFS status (the SecondaryNameNode web UI configured above): http://10.240.139.101:50090/ (the NameNode's own UI defaults to port 50070)
YARN status (the NodeManager web UI on centos3): http://10.240.139.72:8042/node/node
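As a quick smoke test that HDFS actually accepts writes (the target path is an arbitrary example):
./bin/hdfs dfs -mkdir -p /user/root
./bin/hdfs dfs -put etc/hadoop/core-site.xml /user/root
./bin/hdfs dfs -ls /user/root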