hadoop-3

Fully distributed mode.

Design

NameNode: s201  
DataNode: s202, s203, s204
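
The layout above can be turned into the two host lists that the later steps rely on. This is a sketch under stated assumptions, not part of the original setup: the function names `cluster_hosts` and `datanodes` are hypothetical, and the `192.168.137` prefix is taken from the Web UI address used later in these notes.

```shell
#!/bin/bash
# Sketch only: print the host lists implied by the design above.
# Assumption: node sN has the address <prefix>.N, as s201 does in the Web UI URL.

# Lines in /etc/hosts format for s201..s204.
cluster_hosts() {
    prefix=${1:-192.168.137}
    for n in 201 202 203 204; do
        printf '%s.%s s%s\n' "$prefix" "$n" "$n"
    done
}

# One DataNode per line, the format of Hadoop's worker list
# (etc/hadoop/slaves in Hadoop 2.x, etc/hadoop/workers in Hadoop 3.x).
datanodes() {
    printf 's%s\n' 202 203 204
}
```

The output of `cluster_hosts` can be reviewed and appended to `/etc/hosts` on each node, and the output of `datanodes` can be written to the worker list on s201.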

Config of Hadoop

  1. core-site.xml

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://s201/</value>
    </property>
  2. hdfs-site.xml

    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>

    The replication factor is 3 because there are 3 DataNodes.

  3. mapred-site.xml
    No changes needed.
  4. yarn-site.xml

    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>s201</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

    The ResourceManager runs on s201.

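  5. Clean up
    Remove everything under ${HADOOP_HOME}/logs and /tmp, so that old logs and data are not copied into the clones.
    (By default Hadoop keeps its data under /tmp/hadoop-<user>.)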
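
Before the VM is cloned, it can be worth reading the values from steps 1 to 4 back out of the files, since a typo would otherwise be copied to every node. A minimal sketch, assuming each `<name>` and `<value>` element sits on a single line as in the snippets above; `get_prop` is a hypothetical helper, not a Hadoop command.

```shell
#!/bin/bash
# Sketch only: print the <value> that follows a given <name> in a *-site.xml file.
# Assumption: <name>...</name> and <value>...</value> each fit on one line.
# Example call (path assumed):
#   get_prop "${HADOOP_HOME}/etc/hadoop/core-site.xml" fs.defaultFS
get_prop() {
    # $1 = site file, $2 = property name
    awk -v key="$2" '
        # Remember that the wanted <name> element has been seen.
        index($0, "<name>" key "</name>") { found = 1 }
        # Print the text between the next <value> and </value>, then stop.
        found && match($0, /<value>[^<]*<\/value>/) {
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
```

The helper prints nothing when the property is absent. On a node where Hadoop is already on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.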

Clone VMs and Test

  6. Clone
    Clone -> Create a full clone
  7. Configure hostname and network
    • hostname
      Edit /etc/sysconfig/network
      Change HOSTNAME=s20x
    • Network
      • IP
        Edit /etc/sysconfig/network-scripts/ifcfg-ethx
        Remove HWADDR and UUID
        Change IPADDR=xxx.xxx.xxx.20x
      • Card info
        Remove /etc/udev/rules.d/70-persistent-net.rules
      • Restart network
  8. Passwordless SSH
    Configure s201 to log in to the other nodes without a password.
  9. Start Hadoop

    $> hadoop namenode -format
    $> start-all.sh

    Web UI: http://192.168.137.201:50070
    Click Datanodes -> 3 DataNodes should be listed.
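
Step 8 does not spell out how the passwordless login is set up. One common way is to create an RSA key pair on s201 and copy the public key to every node with `ssh-copy-id`. The sketch below only prints the commands so they can be reviewed first; `ssh_setup_commands` is a hypothetical name, and the key path and the choice to include s201 itself are assumptions.

```shell
#!/bin/bash
# Sketch only: print the commands for passwordless ssh from s201 to all nodes.
# Nothing is executed here; review the output, then run the lines on s201.
ssh_setup_commands() {
    user=${1:-$(whoami)}
    # Key pair with an empty passphrase (assumed path: ~/.ssh/id_rsa).
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    # s201 is included because the start scripts may also ssh to the local node.
    for n in 201 202 203 204; do
        echo "ssh-copy-id ${user}@s${n}"
    done
}
```

Once the printed commands have been run, `ssh s202 hostname` from s201 should return without asking for a password.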

Config by script

  1. Install rsync package
  2. Create xcall.sh to run a command on all nodes

    #!/bin/bash
    # Run the given command on every node (s201..s204) over ssh.
    if [ $# -lt 1 ]; then
        echo "usage: $0 <command> [args...]"
        exit 1
    fi
    param="$*"
    for (( ip=201; ip<=204; ip++ )); do
        echo "============= host: s${ip} ============="
        ssh "s${ip}" "${param}"
    done
  3. Create xrsync.sh to copy a file to the other nodes

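    #!/bin/bash
    # Copy a file or directory to the same absolute path on s202..s204.
    if [ $# -lt 1 ]; then
        echo "usage: $0 <path>"
        exit 1
    fi
    path=$1
    dir=$(dirname "$path")
    filename=$(basename "$path")
    # Resolve the physical parent directory; the target nodes use the same path.
    cd "$dir" || exit 1
    fullpath=$(pwd -P)
    user=$(whoami)
    for (( ip=202; ip<=204; ip++ )); do
        echo "============ node: s${ip} ============="
        # Use the basename: after cd, a relative $path would no longer resolve.
        rsync -lr "${filename}" "${user}@s${ip}:${fullpath}"
    done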
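
Both scripts hard-code the node range. A possible variation, shown here as a sketch rather than as the scripts actually used in these notes, takes the host list as an argument and lets the remote runner be swapped out for a dry run. The function name `xcall_hosts` and the `XCALL_RUNNER` variable are hypothetical.

```shell
#!/bin/bash
# Sketch only: run one command on each host in a space-separated list.
# XCALL_RUNNER (hypothetical) defaults to ssh; set it to echo for a dry run.
xcall_hosts() {
    if [ $# -lt 2 ]; then
        echo "usage: xcall_hosts \"<host> ...\" <command> [args...]" >&2
        return 1
    fi
    hosts=$1
    shift
    runner=${XCALL_RUNNER:-ssh}
    # $hosts is left unquoted on purpose so it splits into one word per host.
    for host in $hosts; do
        echo "============= host: ${host} ============="
        "$runner" "$host" "$*"
    done
}
```

For example, `xcall_hosts "s201 s202 s203 s204" jps` would list the Java processes on each node, assuming `jps` is on the PATH of the remote shell. With `XCALL_RUNNER=echo` the same call only prints what would be run.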