The following configuration sets up Ubuntu for developing Spark applications in Python.
Basic environment configuration (Ubuntu 64-bit)
Install the JDK: download jdk-8u45-linux-x64.tar.gz and extract it to /opt/jdk1.8.0_45.
Download: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Install Scala: download scala-2.11.6.tgz and extract it to /opt/scala-2.11.6.
Download: http://www.scala-lang.org/
Install Spark: download spark-1.3.1-bin-hadoop2.6.tgz and extract it to /opt/spark-hadoop.
Download: http://spark.apache.org/downloads.html
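If you prefer to script the extraction instead of unpacking by hand, a minimal Python sketch (the filenames and the /opt destination mirror the steps above; this is illustrative, not part of the original setup) would be:

import tarfile

# Extract each downloaded archive into /opt.
for archive in ("jdk-8u45-linux-x64.tar.gz",
                "scala-2.11.6.tgz",
                "spark-1.3.1-bin-hadoop2.6.tgz"):
    with tarfile.open(archive) as tar:
        tar.extractall("/opt")

Note that the Spark archive unpacks to a folder named spark-1.3.1-bin-hadoop2.6, so rename or symlink it to /opt/spark-hadoop to match the paths used below.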
Configure the environment variables: edit /etc/profile by executing the following command:
python@ubuntu:~$ sudo gedit /etc/profile
At the end of the file, add:
#Setting JDK environment variables
export JAVA_HOME=/opt/jdk1.8.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH
#Setting Scala environment variables
export SCALA_HOME=/opt/scala-2.11.6
export PATH=${SCALA_HOME}/bin:$PATH
#Setting Spark environment variables
export SPARK_HOME=/opt/spark-hadoop/
#PYTHONPATH: add Spark's pyspark module to the Python environment
export PYTHONPATH=/opt/spark-hadoop/python
Restart the computer so that /etc/profile takes effect permanently. To apply it temporarily, open a command window and execute source /etc/profile, which takes effect in the current window only.
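To confirm that the variables are visible, a quick check from Python (illustrative, not part of the original steps) is:

import os

# Print each variable set in /etc/profile; None means the profile
# has not been sourced in this session.
for var in ("JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "PYTHONPATH"):
    print("%s=%s" % (var, os.environ.get(var)))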
Test whether the installation succeeded
Open a command window and switch to the Spark root directory.
Execute ./bin/spark-shell to open a Scala shell connected to Spark.
If no error messages appear at startup and the scala> prompt is shown, the shell started successfully.
Execute ./bin/pyspark to open a Python shell connected to Spark.
If startup produces no errors and output like that shown above appears, the launch was successful.
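As an additional sanity check (an illustrative example, not from the original walkthrough), run a tiny job in the pyspark shell; sc is the SparkContext the shell creates automatically:

>>> sc.parallelize([1, 2, 3, 4]).map(lambda x: x * 2).collect()
[2, 4, 6, 8]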
You can also access Spark through a browser (the Spark shell's web UI runs at http://localhost:4040 by default); the following page appears.
This confirms that Spark is available.
Developing a Spark application in Python
The PYTHONPATH set earlier adds the pyspark module to Python's search path.
Open the Spark installation directory and copy the py4j folder from python/build into the Python directory, as shown:
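The copy can also be scripted. In the sketch below the destination path is an assumption (a common location for Python 2.7 packages on Ubuntu); adjust it to your own installation:

import shutil

# Copy py4j out of Spark's python/build folder so the Python
# interpreter can import it; the destination path is assumed.
shutil.copytree("/opt/spark-hadoop/python/build/py4j",
                "/usr/lib/python2.7/dist-packages/py4j")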
Open a command-line window and enter python. The Python version here is 2.7.6, as shown; note that this version of Spark does not support Python 3.
Enter import pyspark; if it succeeds, as shown below, the setup work done so far is complete.
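In the interpreter the check looks roughly like this (illustrative session):

>>> import pyspark            # succeeds only if PYTHONPATH and py4j are in place
>>> pyspark.__file__          # should point somewhere under /opt/spark-hadoop/python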
Create a new project in PyCharm and test it using the code in the red box.
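The screenshot with the red box is not reproduced here, so the following is a minimal sketch of that kind of test script, assuming a local master and an arbitrary application name:

from pyspark import SparkConf, SparkContext

# Run Spark locally in a single process; the app name is arbitrary.
conf = SparkConf().setMaster("local").setAppName("SimpleTest")
sc = SparkContext(conf=conf)

# A tiny deterministic job: square the numbers 1..5 and sum them.
data = sc.parallelize([1, 2, 3, 4, 5])
print(data.map(lambda x: x * x).sum())  # expected result: 55
sc.stop()

If the script prints 55 without errors, the PyCharm project can reach Spark.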