  Install Hadoop 2.6 (CDH 5) on CentOS 7
     
  Add Date : 2018-11-21      
         
       
         
1. Hadoop Introduction

Hadoop, from the Apache Software Foundation, is an open source distributed computing platform. With the Hadoop Distributed File System (HDFS) and MapReduce (an open source implementation of Google's MapReduce) at its core, Hadoop gives users a distributed infrastructure whose low-level details are transparent.

The roles in a Hadoop cluster fall into two categories: master and slave. An HDFS cluster consists of one NameNode and several DataNodes. The NameNode acts as the master server: it manages the file system namespace and client access to the file system. The DataNodes manage the storage attached to the nodes they run on. The MapReduce framework consists of a single JobTracker running on the master node and a TaskTracker running on each slave node. The master is responsible for scheduling all the tasks that make up a job; the tasks are distributed across the slave nodes, the master monitors their execution and re-runs any that fail, and the slaves simply execute the tasks assigned to them. When a job is submitted, the JobTracker receives the job and its configuration, distributes the configuration to the slave nodes, schedules the tasks, and monitors the TaskTrackers as they run them.

As this description shows, HDFS and MapReduce together form the core of the Hadoop distributed system architecture. HDFS provides a distributed file system across the cluster, while MapReduce provides distributed computing and task handling on top of it. HDFS supplies the file storage that MapReduce tasks operate on; MapReduce handles the distribution, tracking and execution of tasks on top of HDFS and collects the results. Working together, the two accomplish the main mission of a Hadoop distributed cluster.

1.2 Environment

master 192.168.0.201

slave 192.168.0.220

Both nodes run CentOS 7.

1.3 Preparing the Environment

Turn off the firewall and SELinux on both nodes:

systemctl disable firewalld
systemctl stop firewalld
setenforce 0
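
Note that setenforce 0 only switches SELinux off for the current boot. To keep it off after a reboot, one option (not part of the original steps) is to set SELINUX=disabled in /etc/selinux/config, for example:

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config      # persists across reboots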

1.4 Network configuration

Set the host names of the two machines to master and slave respectively.

Add entries to /etc/hosts on both machines so that they can resolve each other by name (see the sketch below).
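
A minimal sketch of those two steps, using the IP addresses from section 1.2:

hostnamectl set-hostname master      # run on 192.168.0.201
hostnamectl set-hostname slave       # run on 192.168.0.220

# append to /etc/hosts on both machines
192.168.0.201 master
192.168.0.220 slave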

1.5 Configuring SSH trust

On master:
  yum -y install sshpass
  ssh-keygen                # press Enter at every prompt to accept the defaults
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.220
On slave:
  yum -y install sshpass
  ssh-keygen                # press Enter at every prompt to accept the defaults
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.201

If you can ssh to the other host without being asked for a password, the trust is set up correctly.
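
A quick check from each side (using the host names assumed above):

ssh root@192.168.0.220 hostname      # from master: should print "slave" without a password prompt
ssh root@192.168.0.201 hostname      # from slave: should print "master" without a password prompt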

2. Install the JDK

Install on both machines:

tar zxvf jdk-8u65-linux-x64.tar.gz
mv jdk1.8.0_65 /usr/jdk

2.1 Setting environment variables

Set on both machines (append to /etc/profile):

export JAVA_HOME=/usr/jdk
export JRE_HOME=/usr/jdk/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

Then run source /etc/profile to apply the changes.

3. Test JDK

java -version
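
If the environment variables are picked up correctly, the first line of the output should name the version unpacked above, along the lines of:

java version "1.8.0_65"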

3.1 Installing Hadoop

Download the CDH 5 build of Hadoop 2.6 from the official archive: archive.cloudera.com/cdh5

tar zxvf hadoop-2.6.0-cdh5.4.8.tar.gz
mv hadoop-2.6.0-cdh5.4.8 /usr/hadoop
cd /usr/hadoop
mkdir -p dfs/name
mkdir -p dfs/data
mkdir -p tmp
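
Optionally (a convenience not in the original steps), HADOOP_HOME and PATH can also be added to /etc/profile so the hadoop commands can be run without the ./bin and ./sbin prefixes used below:

export HADOOP_HOME=/usr/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin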

3.2 Adding the slave

cd /usr/hadoop/etc/hadoop
vim slaves
  192.168.0.220      # add the slave IP

3.3 Modify hadoop-env.sh and yarn-env.sh

vim hadoop-env.sh / vim yarn-env.sh
export JAVA_HOME=/usr/jdk      # set the Java path in both files

3.4 Modify core-site.xml

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://192.168.0.201:9000</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131702</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/hadoop/tmp</value>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.hosts</name>
                <value>*</value>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.groups</name>
                <value>*</value>
        </property>
</configuration>

3.5 Modify hdfs-site.xml

<configuration>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/hadoop/dfs/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>192.168.0.201:9001</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>

3.6 Modify mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>192.168.0.201:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>192.168.0.201:19888</value>
        </property>
</configuration>
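
If mapred-site.xml does not already exist in /usr/hadoop/etc/hadoop, it can be created from the template that ships with the tarball before adding the properties above (whether this particular CDH build includes a ready-made file or only the template is an assumption here):

cp mapred-site.xml.template mapred-site.xml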

3.7 Modify yarn-site.xml

<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>192.168.0.201:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>192.168.0.201:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>192.168.0.201:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>192.168.0.201:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>192.168.0.201:8088</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>768</value>
        </property>
</configuration>


4. Copy the configured Hadoop directory to the slave node

scp -r /usr/hadoop root@192.168.0.220:/usr/

5. Format the NameNode

./bin/hdfs namenode -format
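
Near the end of the format output there should be a line reporting that the name directory was successfully formatted; with the paths used here it reads roughly like the sample below (typical Hadoop 2.x wording, not quoted from this article):

Storage directory /usr/hadoop/dfs/name has been successfully formatted.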

5.1 Start HDFS and YARN

./sbin/start-dfs.sh
./sbin/start-yarn.sh
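
One way to confirm the daemons came up is jps, which ships with the JDK installed earlier (a sketch; the exact process list depends on the configuration above):

/usr/jdk/bin/jps      # on master, expect NameNode, SecondaryNameNode and ResourceManager
/usr/jdk/bin/jps      # on slave, expect DataNode and NodeManager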

5.2 Check that everything started

Open http://192.168.0.201:8088 in a browser (the YARN ResourceManager web UI).

Open http://192.168.0.201:9001 (the secondary NameNode address configured in hdfs-site.xml).
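
The DataNode registration can also be checked from the command line on the master (a sketch, run from /usr/hadoop):

./bin/hdfs dfsadmin -report      # should list 192.168.0.220 as a live datanode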
     
         
       
         