A simple example of a PySpark-based project.
- JDK (>=7) (as the local test downloads Spark 1.6.0, which dropped JDK 6 support)
- Python pep8 (for checking code style against PEP 8 in the local test)
pip install pep8
- Python Py4J == 0.9.0 (for starting Spark in the local test)
pip install -r requirements.txt
- To run the tests locally, run:
./dev/run_tests
The script downloads Spark 1.6.0 to run the tests, so it can be run as-is without any modifications.
- PySpark currently cannot be installed via pip install due to this issue, so Spark has to be installed separately. To use this library outside this project after building, set:
SPARK_HOME="your Spark path"
PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
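
If exporting the variables in the shell is inconvenient, roughly the same effect can be obtained from inside Python before importing pyspark. The following is only a minimal sketch: the helper names spark_python_paths and add_spark_to_sys_path are hypothetical and not part of this project, and the python/lib/py4j-*-src.zip location is an assumption about how Spark distributions bundle Py4J.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the paths to put on sys.path for a given SPARK_HOME (hypothetical helper)."""
    python_dir = os.path.join(spark_home, "python")
    # Assumption: Spark bundles Py4J as a zip under python/lib.
    # If Py4J was installed with pip instead, this glob simply finds nothing.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_spark_to_sys_path(environ=None):
    """Prepend the PySpark paths to sys.path, based on SPARK_HOME (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    paths = spark_python_paths(spark_home)
    # Insert in reverse so the final order on sys.path matches `paths`.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    return paths
```

Calling add_spark_to_sys_path() at the top of a script mirrors the PYTHONPATH setting above; a later import of pyspark should then resolve, provided Spark is actually installed at that location.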