
Deploying Hadoop for the First Time – Part 2 – Hadoop Installation and Configuration

A basic Hadoop cluster setup example

  1. Preparation

    1. Choose the machines to deploy on (this example uses 5 machines, with IPs ending in 30-34)

    2. This example uses Ubuntu 18.04, with VMware to make cloning multiple machines convenient

    3. Resources are allocated as follows:

    1 machine for the NameNode
    1 machine for the ResourceManager
    3 machines for Workers
    
    Each machine has:
    CPU: 4 cores
    RAM: 8 GB
    

Following on from Part 1, with the base environment in place, we can proceed to install and configure Hadoop.

  1. Download and install Hadoop (as the administrator)

    1. Download
    cd
    wget http://ftp.tc.edu.tw/pub/Apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
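    Mirrors prune old releases, so the link above may no longer work by the time you read this. As a fallback (my assumption, using the standard Apache archive layout), every release is kept at archive.apache.org:

    wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz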
    
    2. Extract
    tar -tvf hadoop-3.2.1.tar.gz  # list the archive contents first
    tar -xvf hadoop-3.2.1.tar.gz -C /usr/local
    
    3. Rename
    mv /usr/local/hadoop-3.2.1 /usr/local/hadoop
    
    4. Change the owner of the directory and its files
    chown -R hadoop:hadoop /usr/local/hadoop
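    A quick sanity check that ownership changed as intended:

    ls -ld /usr/local/hadoop   # owner and group should both read: hadoop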
    

  2. Set the hadoop user's environment variables (as the hadoop user)

    1. Edit .bashrc
    nano ~/.bashrc
    

    # Set HADOOP_HOME
    export HADOOP_HOME=/usr/local/hadoop
    # Set HADOOP_MAPRED_HOME
    export HADOOP_MAPRED_HOME=${HADOOP_HOME} 
    # Add Hadoop bin and sbin directory to PATH
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    2. Reload the configuration file
    source ~/.bashrc  # equivalently: . ~/.bashrc
    
    3. Check the environment variables (for example, as shown below)
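    The original step gives no command; two straightforward checks, assuming the exports above were saved:

    echo ${HADOOP_HOME}   # should print /usr/local/hadoop
    hadoop version        # also confirms that PATH now finds the hadoop binary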


  3. Edit Hadoop's runtime environment script (as the hadoop user)
nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
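
JAVA_HOME must match the JDK actually installed on each machine; on Ubuntu you can confirm the real path with, for example:

update-alternatives --list java   # lists the installed java binaries
readlink -f "$(which java)"       # resolves symlinks to the actual JDK location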

  4. Edit Hadoop core-site.xml (as the hadoop user)
nano /usr/local/hadoop/etc/hadoop/core-site.xml 

<property>
   <name>hadoop.tmp.dir</name>
   <value>/home/hadoop/data</value>
   <description>Temporary Directory.</description>
</property>
<property>
   <name>fs.defaultFS</name>
   <value>hdfs://test30.example.org</value>
   <description>Use HDFS as file storage engine</description>
</property>
  • Hadoop 3.2.0 and later include a configuration syntax check command
hadoop conftest
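With no arguments it validates every .xml file under ${HADOOP_CONF_DIR}; the stock -conffile flag narrows the check to a single file:

hadoop conftest -conffile /usr/local/hadoop/etc/hadoop/core-site.xml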


  • See Hortonworks' recommended parameter settings for each component

Source code on GitHub: click here


  5. Edit Hadoop mapred-site.xml (as the hadoop user)
nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

<property>
	<name>mapreduce.map.memory.mb</name>
	<value>2048</value>
</property>
<property>
	<name>mapreduce.map.java.opts</name>
	<value>-Xmx1638m</value>
</property>
<property>
	<name>mapreduce.reduce.memory.mb</name>
	<value>4096</value>
</property>
<property>
	<name>mapreduce.reduce.java.opts</name>
	<value>-Xmx3276m</value>
</property>
<property>
	<name>yarn.app.mapreduce.am.resource.mb</name>
	<value>4096</value>
</property>
<property>
	<name>yarn.app.mapreduce.am.command-opts</name>
	<value>-Xmx3276m</value>
</property>
<property>
	<name>mapreduce.task.io.sort.mb</name>
	<value>819</value>
</property>
<property>
	<name>mapreduce.framework.name</name>
	<value>yarn</value>
</property>
<property>
	<name>mapreduce.jobhistory.address</name>
	<value>test32.example.org:10020</value>
</property>
<property>
	<name>mapreduce.jobhistory.webapp.address</name>
	<value>test32.example.org:19888</value>
</property>
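The heap and buffer sizes above are not arbitrary: they match the common Hortonworks sizing heuristic of roughly 80% of container memory for the JVM heap and about 40% of map memory for the sort buffer (my reading of the numbers; the formula is not stated in the post):

echo $((2048 * 80 / 100))   # 1638 -> -Xmx1638m for the 2048 MB map container
echo $((4096 * 80 / 100))   # 3276 -> -Xmx3276m for the 4096 MB reduce/AM containers
echo $((2048 * 40 / 100))   # 819  -> mapreduce.task.io.sort.mb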

  6. Edit Hadoop yarn-site.xml (as the hadoop user)
nano /usr/local/hadoop/etc/hadoop/yarn-site.xml

<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.nodemanager.env-whitelist</name>
   <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
   <name>yarn.scheduler.minimum-allocation-mb</name>
   <value>2048</value>
</property>
<property>
   <name>yarn.scheduler.maximum-allocation-mb</name>
   <value>6144</value>
</property>
<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <value>6144</value>
</property>
<property>
   <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
   <value>true</value>
</property>
<property>
   <name>yarn.nodemanager.resource.cpu-vcores</name>
   <value>3</value>
</property>
<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>test31.example.org</value>
</property>	
<property>
   <name>yarn.resourcemanager.scheduler.class</name>
   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
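
These values line up with the 8 GB / 4-core workers: 6144 MB and 3 vcores go to containers, leaving roughly 2 GB and one core for the OS plus the DataNode and NodeManager daemons. At the 2048 MB minimum allocation, each node can therefore host at most:

echo $((6144 / 2048))   # 3 minimum-size containers per NodeManager

Note that, as far as I know, yarn.nodemanager.resource.detect-hardware-capabilities only takes effect when the memory and vcore settings are left at their default of -1, so the explicit 6144/3 values above are the ones actually used.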

  7. Edit Hadoop hdfs-site.xml (as the hadoop user)
nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<property>
   <name>dfs.permissions.superusergroup</name>
   <value>hadoop</value>
   <description>The name of the group of super-users. The value should be a single group name.</description>
</property>
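
One way to confirm that HDFS picks up the value (hdfs getconf is a built-in command):

hdfs getconf -confKey dfs.permissions.superusergroup   # should print: hadoop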

  8. Create the Hadoop workers file (as the administrator); it lists one worker hostname per line, as sketched below
nano /usr/local/hadoop/etc/hadoop/workers
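The post does not show the file's contents. Assuming the three workers continue the test3x.example.org naming used throughout (test30 being the NameNode and test31 the ResourceManager in the configs above), the file would read:

test32.example.org
test33.example.org
test34.example.org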

That concludes Part 2 of the deployment tutorial, installation and configuration. Please continue with Part 3, machine cloning and cluster startup…


If you found this content helpful, buy me a coffee~
