Hadoop 2.9.2 Deployment

This is mainly to get familiar with HDFS; the cluster is deployed across 4 virtual machine nodes.

Note: run the following steps on all nodes!

Disable iptables and SELinux, take care of time synchronization (ntpdate), and so on.
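
For reference, a minimal sketch of these preparation steps, assuming CentOS 7 (adjust for your distribution; the NTP server is just an example):

# stop and disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# disable SELinux now and after reboot
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# one-off time sync
yum install -y ntpdate && ntpdate pool.ntp.org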

Set the hostname on each node:

hostnamectl set-hostname namenode
hostnamectl set-hostname datanode01
hostnamectl set-hostname datanode02
hostnamectl set-hostname datanode03

Append the cluster hosts to /etc/hosts:

cat  >> /etc/hosts <<'EOF'
172.17.2.30 namenode
172.17.2.31 datanode01
172.17.2.32 datanode02
172.17.2.33 datanode03
EOF

Create the hadoop user with password 123123:

useradd hadoop && echo 123123 | passwd --stdin hadoop
# echo "hadoopALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

Since I prepared a dedicated disk for HDFS, configure an additional mount point:

mkfs.xfs /dev/vdb
mkdir /hadoop
cat >> /etc/fstab <<'EOF'
/dev/vdb    /hadoop    xfs defaults,noatime    0 0
EOF
mount -a
chown -R hadoop: /hadoop
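
Optionally verify the mount point and its ownership before moving on:

df -h /hadoop
ls -ld /hadoop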

Install Java:

curl -o java.sh https://wiki2.xbits.net:4430/_export/code/linux:others:install_java?codeblock=0
bash java.sh
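
The java.sh script is expected to install a JDK; a quick check (the exact version string depends on what the script installs):

java -version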

Install Hadoop 2.9.2:

wget https://files.xbits.net:4430/hadoop-2.9.2.tar.gz
tar zxf hadoop-2.9.2.tar.gz -C /usr/local/
ln -s /usr/local/hadoop-2.9.2/ /usr/local/hadoop
chown -R hadoop: /usr/local/hadoop/
ll /usr/local/hadoop/
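
A quick sanity check of the unpacked distribution (the PATH additions come later, so call it by full path here):

/usr/local/hadoop/bin/hadoop version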

Note: do the following configuration on the namenode!

Configure passwordless SSH & environment variables:

# switch to the hadoop user
su - hadoop
# generate a key pair
ssh-keygen
# copy the public key for passwordless ssh (including namenode itself), hadoop password is 123123
ssh-copy-id namenode  
ssh-copy-id datanode01
ssh-copy-id datanode02
ssh-copy-id datanode03
# 环境变量
cat >> ~/.bash_profile <<'EOF'
export HADOOP_HOME=/usr/local/hadoop                            
export HADOOP_COMMON_HOME=$HADOOP_HOME                          
export HADOOP_HDFS_HOME=$HADOOP_HOME                            
export HADOOP_MAPRED_HOME=$HADOOP_HOME                          
export HADOOP_YARN_HOME=$HADOOP_HOME                            
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native     
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
EOF
 
source ~/.bash_profile
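
Optionally confirm that passwordless SSH and the new environment variables work:

# should print the remote hostname without prompting for a password
ssh datanode01 hostname
# hadoop is now on the PATH
hadoop version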

Configure hdfs-site.xml and copy it to the other nodes:

cat > /usr/local/hadoop/etc/hadoop/hdfs-site.xml<<'EOF'
<?xml version="1.0" encoding="UTF-8"?>                       
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>  
<configuration>                                              
  <property>                                                  
    <name>dfs.replication</name>                             
    <value>3</value>                                         
  </property>                                                
  <property>                                                 
    <name>dfs.datanode.data.dir</name>                       
    <value>file:///hadoop/datanode</value>                         
  </property>                                                
</configuration>
EOF
# distribute to the other datanode nodes
scp /usr/local/hadoop/etc/hadoop/hdfs-site.xml datanode01:/usr/local/hadoop/etc/hadoop/hdfs-site.xml
scp /usr/local/hadoop/etc/hadoop/hdfs-site.xml datanode02:/usr/local/hadoop/etc/hadoop/hdfs-site.xml
scp /usr/local/hadoop/etc/hadoop/hdfs-site.xml datanode03:/usr/local/hadoop/etc/hadoop/hdfs-site.xml

Configure core-site.xml and copy it to the other nodes:

cat >/usr/local/hadoop/etc/hadoop/core-site.xml<<'EOF'
<?xml version="1.0" encoding="UTF-8"?>                     
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>                                            
<property>                                                 
    <name>fs.defaultFS</name>                              
    <value>hdfs://namenode:9000/</value>                     
  </property>                                              
</configuration>
EOF
 
# sed -i -e 's/\${JAVA_HOME}/\/usr\/java\/default/' /usr/local/hadoop/etc/hadoop/hadoop-env.sh
scp /usr/local/hadoop/etc/hadoop/core-site.xml datanode01:/usr/local/hadoop/etc/hadoop/core-site.xml
scp /usr/local/hadoop/etc/hadoop/core-site.xml datanode02:/usr/local/hadoop/etc/hadoop/core-site.xml
scp /usr/local/hadoop/etc/hadoop/core-site.xml datanode03:/usr/local/hadoop/etc/hadoop/core-site.xml
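
The per-node scp commands can also be written as a loop; a minimal sketch that pushes both config files in one go:

for node in datanode01 datanode02 datanode03; do
  scp /usr/local/hadoop/etc/hadoop/{hdfs-site.xml,core-site.xml} ${node}:/usr/local/hadoop/etc/hadoop/
done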

Create the required directories:

mkdir /hadoop/{datanode,namenode}
ssh datanode01 'mkdir /hadoop/{datanode,namenode}'
ssh datanode02 'mkdir /hadoop/{datanode,namenode}'
ssh datanode03 'mkdir /hadoop/{datanode,namenode}'

Continue configuring hdfs-site.xml; this part is namenode-only, so there is no need to distribute it. Add the following property inside <configuration>:

vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
 
<property>                             
	<name>dfs.namenode.name.dir</name>  
	<value>file:///hadoop/namenode</value>
</property>
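
After this edit, hdfs-site.xml on the namenode should look roughly like this (the copies already pushed to the datanodes keep the earlier two-property version):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///hadoop/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///hadoop/namenode</value>
  </property>
</configuration>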

Configure mapred-site.xml:

cat > /usr/local/hadoop/etc/hadoop/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

Configure yarn-site.xml:

cat >/usr/local/hadoop/etc/hadoop/yarn-site.xml<<'EOF'
<?xml version="1.0"?>                          
<configuration>                                
  <property>                                   
    <name>yarn.resourcemanager.hostname</name> 
    <value>namenode</value>                      
  </property>                                  
  <property>                                   
    <name>yarn.nodemanager.hostname</name>     
    <value>namenode</value>                      
  </property>                                  
  <property>                                   
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value>           
  </property>                                  
</configuration> 
EOF

Configure slaves (the namenode is listed as well, so it will also run DataNode/NodeManager daemons):

cat > /usr/local/hadoop/etc/hadoop/slaves<<'EOF'
namenode 
datanode01
datanode02
datanode03
EOF

Format NameNode and start Hadoop services:

hdfs namenode -format
start-dfs.sh
start-yarn.sh

Check the status:

tree -N /hadoop
jps
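
If everything came up, jps on the namenode should list roughly NameNode, SecondaryNameNode, DataNode, ResourceManager and NodeManager (it also acts as a worker per the slaves file), while the datanodes run DataNode and NodeManager. The cluster view can also be checked with:

hdfs dfsadmin -report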

Open the following URLs for more information (the Hadoop 2.x default web UI ports):
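
http://namenode:50070     # NameNode web UI
http://namenode:8088      # YARN ResourceManager web UI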

Run a simple job to get a feel for the basic usage of Hadoop:

# create a directory
hdfs dfs -mkdir /test
# copy a local file to /test
hdfs dfs -copyFromLocal /usr/local/hadoop/NOTICE.txt /test
# show contents of a file
hdfs dfs -cat /test/NOTICE.txt
# execute a sample program
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /test/NOTICE.txt /output01
# show results
hdfs dfs -ls /output01    
hdfs dfs -cat /output01/part-r-00000
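
Once done, the test data can be cleaned up again:

hdfs dfs -rm -r /test /output01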
