hadoop-4

Hadoop start procedure.

Processes

  1. start-dfs.sh
    NameNode
    SecondaryNameNode
    DataNode
  2. start-yarn.sh
    ResourceManager
    NodeManager
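
After both scripts finish, `jps` should list the daemons above (a quick check, assuming the JDK's jps is on the PATH; on a multi-node cluster each node only shows the daemons it actually hosts):

  $> jps // expect NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager on a single-node setup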

Scripts

  1. sbin/start-all.sh
    libexec/hadoop-config.sh
    start-dfs.sh
    start-yarn.sh
  2. sbin/start-dfs.sh
    libexec/hadoop-config.sh
    sbin/hadoop-daemon.sh --config ... --hostname ... start namenode ...
    sbin/hadoop-daemons.sh --config ... --hostname ... start datanode ...
    sbin/hadoop-daemon.sh --config ... --hostname ... start secondarynamenode ...
    sbin/hadoop-daemon.sh --config ... --hostname ... start zkfc ...
  3. sbin/start-yarn.sh
    libexec/hadoop-config.sh
    sbin/yarn-daemon.sh start resourcemanager
    sbin/yarn-daemon.sh start nodemanager
  4. sbin/hadoop-daemons.sh
    libexec/hadoop-config.sh
    read slave nodes from the slaves file
    hadoop-daemon.sh (invoked on each slave node; see the sketch after this list)
  5. sbin/hadoop-daemon.sh
    libexec/hadoop-config.sh
    bin/hdfs ... (launches the requested daemon; see the sketch after this list)
  6. Start processes separately
    1. $> hadoop-daemon.sh start namenode // start the namenode (run on the master node)
    2. $> hadoop-daemons.sh start datanode // start datanodes on the 3 slave machines (run on the master node; `hadoop-daemons.sh` reads the slaves file)
    3. $> hadoop-daemon.sh stop namenode // stop the namenode (run on the master node)
    4. $> hadoop-daemon.sh stop datanode // stop the datanode on the node that runs this command
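
The fan-out in item 4 comes down to a loop over the slaves file. A simplified sketch of the effect, not the literal script (the real hadoop-daemons.sh delegates to sbin/slaves.sh; the lines below assume HADOOP_HOME and HADOOP_CONF_DIR are already set):

  # slaves file: one worker hostname per line, e.g. slave1 / slave2 / slave3
  SLAVES_FILE="$HADOOP_CONF_DIR/slaves"
  for host in $(cat "$SLAVES_FILE"); do
    # run the single-node daemon script on each slave in the background
    ssh "$host" "$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode" &
  done
  wait  # wait for every ssh session to return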
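
For item 5, hadoop-daemon.sh ultimately launches the daemon through bin/hdfs (bin/yarn plays the same role for yarn-daemon.sh). A minimal sketch of what `hadoop-daemon.sh start namenode` boils down to (the real script also writes a pid file and manages log rollover; the log path below is only illustrative):

  # run the namenode in the background, detached from the terminal
  nohup "$HADOOP_HOME/bin/hdfs" --config "$HADOOP_CONF_DIR" namenode \
    > "$HADOOP_LOG_DIR/hadoop-namenode.out" 2>&1 &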