hadoop-8

HDFS writing process.

Set block size

  1. Add property
    The default block size is 128 MB and the minimum allowed block size is 1 MB.
    Edit hdfs-site.xml and set dfs.blocksize; to go below 1 MB, also lower
    dfs.namenode.fs-limits.min-block-size.

    <property>
      <name>dfs.blocksize</name>
      <value>2k</value>
    </property>
    <property>
      <name>dfs.namenode.fs-limits.min-block-size</name>
      <value>1024</value>
    </property>

    Note: A block size this small will degrade performance; it is only suitable for testing.

  2. Check

    $> hdfs getconf -confKey dfs.blocksize   # show the configured block size
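
    The same value can also be read from client code. A minimal sketch, assuming
    hdfs-site.xml is on the classpath (ShowBlocksize is just an illustrative class name):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.HdfsConfiguration;

    public class ShowBlocksize {
        public static void main(String[] args) {
            // HdfsConfiguration makes sure hdfs-default.xml/hdfs-site.xml are loaded.
            Configuration conf = new HdfsConfiguration();
            // getLongBytes() resolves size suffixes such as "2k" into plain bytes.
            long blocksize = conf.getLongBytes("dfs.blocksize", 128 * 1024 * 1024L);
            System.out.println("dfs.blocksize = " + blocksize + " bytes");
        }
    }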
  3. Restart Hadoop and put a file

    $> stop-dfs.sh
    $> start-dfs.sh
    $> hdfs dfs -put a.txt
  4. Put a file via the Hadoop API

    import java.io.FileInputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public void putFileWithBlocksize() {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.137.201:8020/");
        try {
            FileSystem fs = FileSystem.get(conf);
            // create(path, overwrite, bufferSize, replication, blockSize):
            // replication = 2, per-file block size = 1024 bytes (allowed because
            // dfs.namenode.fs-limits.min-block-size was lowered to 1024 above).
            FSDataOutputStream fsdo = fs.create(new Path("/usr/win7admin/blocksize.txt"),
                    true, 1024, (short) 2, 1024);
            FileInputStream fis = new FileInputStream("D:/HexoSourceCode/source/_posts/new1.md");
            IOUtils.copyBytes(fis, fsdo, 1024);
            fis.close();
            fsdo.close();
            System.out.println("over!");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
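
    To confirm that the block size was actually recorded by the NameNode, the file
    status can be read back after the upload. A minimal sketch using the same
    connection settings (checkBlocksize is just an illustrative method name):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public void checkBlocksize() throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.137.201:8020/");
        FileSystem fs = FileSystem.get(conf);
        // Block size and replication are stored per file, as set at create time.
        FileStatus status = fs.getFileStatus(new Path("/usr/win7admin/blocksize.txt"));
        System.out.println("blocksize=" + status.getBlockSize()
                + " replication=" + status.getReplication());
    }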

Writing process

  1. Concept
    • Packet
      The client streams data to the DataNode pipeline in packets of 64 KB
      (dfs.client-write-packet-size).
    • Chunk
      Each packet is split into 512-byte chunks (dfs.bytes-per-checksum), and a
      checksum is computed for each chunk; see the sketch after this list.
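
    To make the packet/chunk relationship concrete, here is a toy sketch. It is not
    HDFS's actual writer; it only mimics splitting one 64 KB packet into 512-byte
    chunks and checksumming each chunk (plain CRC32 here, while HDFS defaults to
    CRC32C, configurable via dfs.checksum.type):

    import java.util.zip.CRC32;

    public class ChunkChecksumDemo {
        static final int CHUNK_SIZE = 512; // dfs.bytes-per-checksum default

        public static void main(String[] args) {
            byte[] packet = new byte[64 * 1024]; // one 64 KB packet of dummy data
            for (int i = 0; i < packet.length; i++) packet[i] = (byte) i;

            // Checksum each 512-byte chunk before it would be sent downstream.
            for (int off = 0; off < packet.length; off += CHUNK_SIZE) {
                int len = Math.min(CHUNK_SIZE, packet.length - off);
                CRC32 crc = new CRC32();
                crc.update(packet, off, len);
                System.out.printf("chunk@%-5d crc=%08x%n", off, crc.getValue());
            }
        }
    }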