Apache Spark 1.1.0 deployment and development environment setup
     
  Add Date : 2018-11-21      
         
         
         
Spark is a parallel computing framework from the Apache project that works with data stored in the Hadoop Distributed File System (HDFS). Unlike MapReduce, Spark is not limited to writing map and reduce methods. It provides a more powerful in-memory computing model: a program can load data into the memory of the cluster and then query it repeatedly and quickly, which makes Spark well suited to machine learning algorithms. This article describes how to deploy Apache Spark 1.1.0 and set up a development environment for it.
0. Preparation

For learning purposes, this article deploys Spark in a virtual machine running under VMware Workstation. The following software needs to be installed in the virtual machine:

Ubuntu 14.04.1 LTS 64-bit desktop version
hadoop-2.4.0.tar.gz
jdk-7u67-linux-x64.tar.gz
scala-2.10.4.tgz
spark-1.1.0-bin-hadoop2.4.tgz
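
Before starting, it may help to confirm that the four archives listed above have actually been downloaded. The following is only a minimal sketch: the function name missing_archives is hypothetical, and it assumes all archives sit in a single directory passed as the first argument.

```shell
# Print the name of every required archive that is absent from the given
# directory (default: current directory). Returns non-zero if any is missing.
# NOTE: missing_archives is a hypothetical helper, not part of any tool.
missing_archives() {
    local dir="${1:-.}" missing=0 f
    for f in hadoop-2.4.0.tar.gz jdk-7u67-linux-x64.tar.gz \
             scala-2.10.4.tgz spark-1.1.0-bin-hadoop2.4.tgz; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling missing_archives with the download directory prints nothing and returns success when every archive is present.
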
For the Spark development environment, this article uses Windows 7 with IntelliJ IDEA as the IDE. The following software needs to be installed on Windows:

IntelliJ IDEA 13.1.4 Community Edition
apache-maven-3.2.3-bin.zip (installation is straightforward and is left to the reader)
1. Install JDK

Extract the JDK package into the /usr/lib directory:

sudo cp jdk-7u67-linux-x64.tar.gz /usr/lib
cd /usr/lib
sudo tar -xvzf jdk-7u67-linux-x64.tar.gz
sudo gedit /etc/profile

Add the following environment variables to the end of /etc/profile:

export JAVA_HOME=/usr/lib/jdk1.7.0_67
export JRE_HOME=/usr/lib/jdk1.7.0_67/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
Save the file and reload /etc/profile:

source /etc/profile
Check that the JDK was installed successfully:

java -version
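
If java -version fails or reports a different JDK, the JAVA_HOME setting is the first thing to check. The sketch below is one possible way to do that. The function name check_java_home is hypothetical, and it only tests that an executable bin/java exists under the directory, not which version it is.

```shell
# Succeeds only if the given directory (default: $JAVA_HOME) contains an
# executable bin/java. check_java_home is a hypothetical helper name.
check_java_home() {
    local home="${1:-${JAVA_HOME:-}}"
    if [ -z "$home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable java under $home/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $home"
}
```

Run without arguments, check_java_home inspects the current value of JAVA_HOME. A directory such as /usr/lib/jdk1.7.0_67 can also be passed explicitly.
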



 

2. Install and configure SSH

sudo apt-get update
sudo apt-get install openssh-server
sudo /etc/init.d/ssh start
Generate a key pair and add the public key to the authorized keys:

ssh-keygen -t rsa -P ""
cd /home/hduser/.ssh
cat id_rsa.pub >> authorized_keys
Test the SSH login:

ssh localhost
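
If ssh localhost still asks for a password, one common cause is file permissions: with its default StrictModes setting, sshd ignores keys when the .ssh directory or the authorized_keys file is writable by other users. The following sketch tightens those permissions. The function name secure_ssh_dir is hypothetical, and the default path assumes the current user's home directory rather than /home/hduser.

```shell
# Restrict permissions on an SSH directory and its authorized_keys file.
# secure_ssh_dir is a hypothetical helper; the default path assumes the
# current user's home directory.
secure_ssh_dir() {
    local dir="${1:-$HOME/.ssh}"
    # Only the owner may read, write, or enter the directory
    chmod 700 "$dir" || return 1
    if [ -f "$dir/authorized_keys" ]; then
        # Only the owner may read or write the key list
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}
```

After running secure_ssh_dir, repeating ssh localhost shows whether key-based login now works.
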


 

3. Install Hadoop 2.4.0

Install Hadoop 2.4.0 in pseudo-distributed mode. Extract hadoop-2.4.0.tar.gz into the /usr/local directory:

sudo cp hadoop-2.4.0.tar.gz /usr/local/
cd /usr/local
sudo tar -xzvf hadoop-2.4.0.tar.gz
Add the following environment variables to the end of /etc/profile:

export HADOOP_HOME=/usr/local/hadoop-2.4.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Save the file and reload /etc/profile:

source /etc/profile
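
Because /etc/profile is edited by hand more than once in this setup, it is easy to end up with duplicate export lines. As an alternative to the editor, the sketch below appends a line only when an identical one is not already in the file. The function name append_once is hypothetical. Writing to /etc/profile requires root privileges, so the function would have to be run from a root shell.

```shell
# Append a line to a file only if an identical line is not already present,
# so re-running the setup does not duplicate exports.
# append_once is a hypothetical helper name.
append_once() {
    local file="$1" line="$2"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example, append_once /etc/profile 'export HADOOP_HOME=/usr/local/hadoop-2.4.0' adds the line on the first call and does nothing on later calls. The single quotes keep the shell from expanding variables before the line is written.
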
Set the JDK path in the hadoop-env.sh and yarn-env.sh files, both located in /usr/local/hadoop-2.4.0/etc/hadoop:

cd /usr/local/hadoop-2.4.0/etc/hadoop
sudo gedit hadoop-env.sh
sudo gedit yarn-env.sh
hadoop-env.sh:

#The java implementation to use.
export JAVA_HOME=/usr/lib/jdk1.7.0_67

yarn-env.sh:

#some Java parameters
#export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/lib/jdk1.7.0_67
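
The same change to both files can also be made without opening an editor. The following is a minimal sketch under stated assumptions: the function name set_java_home is hypothetical, the JDK path must not contain the characters | or &, and the files must be writable by the user running it (root, for files under /usr/local).

```shell
# Point JAVA_HOME in a Hadoop env script at the given JDK directory.
# Uncommented "export JAVA_HOME=" lines are rewritten; if there is none,
# one is appended. set_java_home is a hypothetical helper name.
set_java_home() {
    local file="$1" jdk="$2" tmp rc
    if grep -q '^export JAVA_HOME=' "$file"; then
        tmp=$(mktemp) || return 1
        # '|' is the sed delimiter so slashes in the path need no escaping.
        # Writing back with cat keeps the file's owner and permissions.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file" > "$tmp" &&
            cat "$tmp" > "$file"
        rc=$?
        rm -f "$tmp"
        return "$rc"
    else
        printf 'export JAVA_HOME=%s\n' "$jdk" >> "$file"
    fi
}
```

For example, set_java_home hadoop-env.sh /usr/lib/jdk1.7.0_67 followed by the same call for yarn-env.sh would apply the setting shown above to both files. Commented-out lines such as the jdk1.6.0 example are left untouched.
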


Modify core-site.xml:

sudo gedit core-site.xml
Add the following between <configuration> and </configuration>: