CentOS 7 install Hadoop-cdh-2.6

Add Date : 2018-11-21
1. Hadoop Introduction

Hadoop, an Apache Software Foundation project, is an open-source distributed computing platform. Built around the Hadoop Distributed File System (HDFS) and MapReduce (an open-source implementation of Google's MapReduce), Hadoop gives users a distributed infrastructure whose low-level details remain transparent to them.

The nodes of a Hadoop cluster play one of two roles: master or slave. An HDFS cluster consists of one NameNode and several DataNodes. The NameNode is the master server: it manages the file system namespace and client access to the file system, while the DataNodes manage the storage attached to the nodes they run on. The MapReduce framework consists of a single JobTracker running on the master node and a TaskTracker running on each slave node. The master node schedules all the tasks that make up a job, distributes them across the slave nodes, monitors their execution, and re-runs any task that fails; the slave nodes simply execute the tasks assigned to them. When a job is submitted, the JobTracker receives the job and its configuration information, distributes the work to the slave nodes, and schedules and monitors the TaskTrackers as they execute it.

As this description shows, HDFS and MapReduce together form the core of the Hadoop distributed architecture. HDFS provides the distributed file system across the cluster, and MapReduce provides distributed computing and task processing on top of it. HDFS supplies the file storage that MapReduce tasks need; on that foundation MapReduce distributes tasks, tracks them, executes jobs, and collects the results. The interaction between the two accomplishes the main mission of a Hadoop distributed cluster.

1.2 Environment description

master 192.168.0.201

slave 192.168.0.220

Both nodes run CentOS 7.

1.3 Preparing the Environment

Permanently disable the firewall and SELinux:

systemctl disable firewalld
systemctl stop firewalld
setenforce 0
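
Note that setenforce 0 only switches SELinux to permissive mode for the current boot. A minimal sketch of how to make the change survive a reboot, assuming the standard /etc/selinux/config file:

# Persist the SELinux setting across reboots
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config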

1.4 Network Configuration

Set the host names of the two machines to master and slave.

Add entries to /etc/hosts on both machines so that they can resolve each other by name.
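
For example, using hostnamectl (the standard tool on CentOS 7) and the two addresses from section 1.2, a minimal sketch:

hostnamectl set-hostname master    # run on 192.168.0.201
hostnamectl set-hostname slave     # run on 192.168.0.220

# Append to /etc/hosts on both machines
192.168.0.201 master
192.168.0.220 slave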

1.5 Configuring ssh trust

On master:
  yum -y install sshpass
  ssh-keygen                # press Enter at every prompt to accept the defaults
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.220
On slave:
  yum -y install sshpass
  ssh-keygen                # press Enter at every prompt to accept the defaults
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.201
Test by ssh-ing to the other host from each machine; if no password prompt appears, the trust is set up correctly.
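
For instance, a quick check from the master; the command should print the slave's host name without asking for a password:

ssh root@192.168.0.220 hostname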

2. Install the JDK

Install on both machines:

tar zxvf jdk-8u65-linux-x64.tar.gz
mv jdk1.8.0_65 /usr/jdk

2.1 Setting environment variables

Set on both machines (append the following to /etc/profile):

export JAVA_HOME=/usr/jdk
export JRE_HOME=/usr/jdk/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

Then run source /etc/profile.

3. Test JDK

java -version
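
If the variables from /etc/profile have been picked up, a quick sanity check might look like the following (the expected paths are the ones used above):

echo $JAVA_HOME    # should print /usr/jdk
which java         # should typically resolve to /usr/jdk/bin/java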

3.1 Install Hadoop

Download hadoop-2.6.0-cdh5.4.8 from the official CDH archive: archive.cloudera.com/cdh5

tar zxvf hadoop-2.6.0-cdh5.4.8.tar.gz
mv hadoop-2.6.0-cdh5.4.8 /usr/hadoop
cd /usr/hadoop
mkdir -p dfs/name
mkdir -p dfs/data
mkdir -p tmp

3.2 Add the slave node

cd /usr/hadoop/etc/hadoop
vim slaves
  192.168.0.220    # add the slave IP

3.3 modify hadoop-env.sh and yarn-env.sh

vim hadoop-env.sh / vim yarn-env.sh
export JAVA_HOME=/usr/jdk    # add the Java variable

3.4 modify core-site.xml

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://192.168.0.201:9000</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131702</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/hadoop/tmp</value>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.hosts</name>
                <value>*</value>
        </property>
        <property>
                <name>hadoop.proxyuser.hadoop.groups</name>
                <value>*</value>
        </property>
</configuration>

3.5 modify hdfs-site.xml

<configuration>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/hadoop/dfs/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>192.168.0.201:9001</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>

3.6 modify mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>192.168.0.201:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>192.168.0.201:19888</value>
        </property>
</configuration>

3.7 modify yarn-site.xml

<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>192.168.0.201:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>192.168.0.201:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>192.168.0.201:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>192.168.0.201:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>192.168.0.201:8088</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>768</value>
        </property>
</configuration>

4. Copy the configuration to the slave node

scp -r /usr/hadoop root@192.168.0.220:/usr/
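
A quick way to confirm the copy arrived, relying on the passwordless SSH configured earlier:

ssh root@192.168.0.220 ls /usr/hadoop/etc/hadoop/core-site.xml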

5. Format the NameNode

On the master, from /usr/hadoop:

./bin/hdfs namenode -format
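
After a successful format, the name directory configured in hdfs-site.xml should have been populated; a hedged sanity check:

ls /usr/hadoop/dfs/name/current    # should contain a VERSION file and an fsimage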

5.1 Start HDFS and YARN

./sbin/start-dfs.sh
./sbin/start-yarn.sh
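
If the cluster needs to be shut down later, Hadoop ships matching stop scripts in the same sbin directory:

./sbin/stop-yarn.sh
./sbin/stop-dfs.sh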

5.2 Verify the startup

Open http://192.168.0.201:8088 in a browser (the yarn.resourcemanager.webapp.address configured above).

Open http://192.168.0.201:9001 (the dfs.namenode.secondary.http-address configured above).
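
On the command line, the usual checks are jps (ships with the JDK) and the HDFS admin report; a minimal sketch of what to expect with this configuration:

jps    # on master: NameNode, SecondaryNameNode, ResourceManager
       # on slave:  DataNode, NodeManager
/usr/hadoop/bin/hdfs dfsadmin -report    # should list the DataNode at 192.168.0.220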
     
         
         
         