Hadoop Cluster HA Configuration

1. Basic Linux Configuration

1.1 Set a Static IP

Host machine configuration:
VMware: NAT mode, with the virtual network's DHCP disabled
VMnet8 adapter: assign a fixed IPv4 address and subnet mask manually

1.2 Virtual Machine Configuration

vim /etc/sysconfig/network-scripts/ifcfg-ens33

TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="yes"
NAME="ens33"
UUID="a428bf24-b245-408a-88b6-d0934885c452"
DEVICE="ens33"
ONBOOT="yes"
IPADDR="192.168.12.130"
GATEWAY="192.168.12.2"
DNS1="192.168.12.2"
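After saving the file, restart the network service so the static address takes effect, then verify it (CentOS 7 commands, consistent with the systemctl usage later in this post):

systemctl restart network
ip addr show ens33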

1.3 Change the Hostname

vim /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=master01

vim /etc/hostname

master01

1.4 Map IPs to Hostnames

vim /etc/hosts
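A minimal sketch of the mapping; master01's address comes from section 1.2, while the other four addresses are assumptions (use whatever static IPs you actually assigned):

192.168.12.130 master01
192.168.12.131 master02
192.168.12.132 slave01
192.168.12.133 slave02
192.168.12.134 slave03

Copy the same entries into /etc/hosts on every node.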

1.5 Set Up Passwordless SSH Login

ssh-keygen

Generate the public and private key pair.

ssh-copy-id -i ~/.ssh/id_rsa.pub <user>@<target-host>

Copy the public key to the target machine.

ssh <user>@<target-host>

Log in to confirm that no password is prompted.

1.6 Disable SELinux and the Firewall

1.6.1 SELinux

vim /etc/sysconfig/selinux

Set SELINUX to disabled.
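The line after the edit, plus setenforce to relax SELinux immediately (the disabled setting itself only takes full effect after a reboot):

# in /etc/sysconfig/selinux
SELINUX=disabled

# relax enforcement for the current session (permissive until reboot)
setenforce 0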

1.6.2 Firewall

systemctl stop firewalld
systemctl disable firewalld

1.7 Uninstall OpenJDK

Check which JDK packages are installed:

rpm -qa | grep java

If OpenJDK is installed, remove it:

rpm -e --nodeps <the openjdk packages found above, separated by spaces>

Then extract the Oracle JDK and configure the environment variables.
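A sketch of those two steps; the archive name jdk-8u191-linux-x64.tar.gz and the install prefix are assumptions chosen to match the /opt/modules layout used for Hadoop below:

tar -zxvf jdk-8u191-linux-x64.tar.gz -C /opt/modules/

# append to /etc/profile, then apply with: source /etc/profile
export JAVA_HOME=/opt/modules/jdk1.8.0_191
export PATH=$PATH:$JAVA_HOME/bin

Confirm with java -version.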

2. Hadoop Configuration

2.1 Cluster Plan

There are five virtual machines: master01, master02, slave01, slave02, and slave03.

Node      Components
master01  NameNode, DFSZKFailoverController, ResourceManager, JobHistory
master02  NameNode, DFSZKFailoverController
slave01   DataNode, NodeManager, JournalNode
slave02   DataNode, NodeManager, JournalNode
slave03   DataNode, NodeManager, JournalNode

2.2 Extract Hadoop and Remove the Bundled Docs

tar -zxvf hadoop-2.7.7.tar.gz -C /opt/modules/
cd /opt/modules/
mv hadoop-2.7.7/ hadoop277

Delete the documentation directory hadoop277/share/doc to save space.

2.3 Specify the Java Path

Set JAVA_HOME in each of these files: hadoop-env.sh, mapred-env.sh, yarn-env.sh.
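The line to set in all three files; the exact path depends on your JDK version (here continuing the assumed path from section 1.7):

export JAVA_HOME=/opt/modules/jdk1.8.0_191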

2.4 HDFS-Related Changes

2.4.1 Edit core-site.xml

<configuration>
    <!-- Logical name of the HA nameservice; clients address HDFS by this name -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns</value>
    </property>
    <!-- Base directory for Hadoop's temporary and metadata files -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/hadoop277/data/tmp</value>
    </property>
    <!-- ZooKeeper quorum used for automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>master01:2181,master02:2181,slave01:2181,slave02:2181,slave03:2181</value>
    </property>
</configuration>

2.4.2 Edit hdfs-site.xml

<configuration>
    <!-- Replication factor (kept at 1 for this test cluster) -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Logical nameservice, matching fs.defaultFS in core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>
    <!-- The two NameNodes in the nameservice -->
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC addresses of the two NameNodes -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn1</name>
        <value>master01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>master02:8020</value>
    </property>
    <!-- HTTP (web UI) addresses of the two NameNodes -->
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>master01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns.nn2</name>
        <value>master02:50070</value>
    </property>
    <!-- JournalNode quorum that stores the shared edit log -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://slave01:8485;slave02:8485;slave03:8485/ns</value>
    </property>
    <!-- Proxy provider clients use to locate the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Local directory where each JournalNode keeps its edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/modules/hadoop277/data/dfs/jn</value>
    </property>
    <!-- Fence the failed NameNode over SSH during failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Enable automatic failover via ZKFC -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>

2.4.3 Edit slaves

List the worker hostnames:

slave01
slave02
slave03

2.5 MapReduce and YARN Changes

2.5.1 Edit mapred-site.xml

<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory server RPC and web UI addresses -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master01:19888</value>
    </property>
</configuration>

2.5.2 Edit yarn-site.xml

<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Cluster id for the RM pair -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>RM_HA_ID</value>
    </property>
    <!-- Logical ids of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Host for each ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>master01</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>master02</value>
    </property>
    <!-- Recover RM state from ZooKeeper after a restart -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>master01:2181,master02:2181,slave01:2181,slave02:2181,slave03:2181</value>
    </property>
    <!-- Shuffle service for MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

Finally, distribute the configured Hadoop directory to the other nodes, for example as sketched below.
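A sketch using scp (rsync would also work), run from master01 with the hostnames from the cluster plan:

for host in master02 slave01 slave02 slave03; do
    scp -r /opt/modules/hadoop277 $host:/opt/modules/
done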

2.6 Starting the Cluster

2.6.1 Start ZooKeeper, Then the JournalNodes
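ZooKeeper must come up first on all five machines; the jps output in section 2.6.6 shows QuorumPeerMain on every node. Assuming a standard ZooKeeper install directory on each node, a sketch:

bin/zkServer.sh start
bin/zkServer.sh status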

Then start the JournalNodes on slave01, slave02, and slave03:

sbin/hadoop-daemon.sh start journalnode

2.6.2 Format the NameNode

On master01:

bin/hdfs namenode -format

2.6.3 Synchronize Metadata

Start the NameNode on master01.
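The single-daemon start script used for the JournalNodes above works here too:

sbin/hadoop-daemon.sh start namenode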

Then on master02, pull the formatted metadata from the active NameNode:

bin/hdfs namenode -bootstrapStandby

2.6.4 Initialize ZKFC

On master01:

bin/hdfs zkfc -formatZK
# a hadoop-ha znode appearing in ZooKeeper indicates success

2.6.5 Start the HDFS Processes

On master01:

sbin/start-dfs.sh

Starting namenodes on [master01 master02]
master02: starting namenode, logging to /opt/modules/hadoop277/logs/hadoop-root-namenode-master02.out
master01: starting namenode, logging to /opt/modules/hadoop277/logs/hadoop-root-namenode-master01.out
slave01: starting datanode, logging to /opt/modules/hadoop277/logs/hadoop-root-datanode-slave01.out
slave02: starting datanode, logging to /opt/modules/hadoop277/logs/hadoop-root-datanode-slave02.out
slave03: starting datanode, logging to /opt/modules/hadoop277/logs/hadoop-root-datanode-slave03.out
Starting journal nodes [slave01 slave02 slave03]
slave02: starting journalnode, logging to /opt/modules/hadoop277/logs/hadoop-root-journalnode-slave02.out
slave03: starting journalnode, logging to /opt/modules/hadoop277/logs/hadoop-root-journalnode-slave03.out
slave01: starting journalnode, logging to /opt/modules/hadoop277/logs/hadoop-root-journalnode-slave01.out
Starting ZK Failover Controllers on NN hosts [master01 master02]
master02: starting zkfc, logging to /opt/modules/hadoop277/logs/hadoop-root-zkfc-master02.out
master01: starting zkfc, logging to /opt/modules/hadoop277/logs/hadoop-root-zkfc-master01.out
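start-dfs.sh does not launch YARN, yet the process listing in the next section shows a ResourceManager on master01, so the YARN daemons need to be started as well; presumably:

sbin/start-yarn.sh

With ResourceManager HA configured, the standby RM on master02 would be started separately with sbin/yarn-daemon.sh start resourcemanager (the master02 process list below does not show one running, so that step was evidently skipped here).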

2.6.6 Check the Processes

Processes on master01 (jps):

6392 Jps
6026 DFSZKFailoverController
2027 QuorumPeerMain
5711 NameNode
6191 ResourceManager

Processes on master02 (jps):

3635 NameNode
3752 DFSZKFailoverController
3884 Jps
1805 QuorumPeerMain

Processes on slave01, slave02, slave03 (jps):

3507 DataNode
2085 QuorumPeerMain
3606 JournalNode
3863 Jps
3759 NodeManager

2.6.7 Verification Tests

HDFS webUI: http://master02:50070
YARN webUI: http://master01:8088
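To confirm which NameNode is active, query each one by the ids defined in hdfs-site.xml:

bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2

With automatic failover enabled, killing the active NameNode's process should flip the other to active within seconds.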

Test YARN with the bundled wordcount example:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /in/README.txt /out
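Before running the job, the input path must exist in HDFS and the output path must not. A sketch that stages Hadoop's bundled README.txt first, then inspects the result afterwards:

bin/hdfs dfs -mkdir -p /in
bin/hdfs dfs -put README.txt /in/
# ... run the wordcount job above ...
bin/hdfs dfs -cat /out/part-r-00000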


Postscript: by the time I finished writing this post, it was almost dawn. Advance one step every day; no effort is ever wasted. Keep going!!
