First, follow the documented setup instructions for IntelliJ IDEA.
Next, ensure you have a valid Python version installed on your machine, and set up an SDK pointing to it.  One that
is managed by pyenv will work fine.
Next, add the python directory as a module, using File/New/Module from Existing Sources.... Next, in the settings
for the new module, associate it with the Python SDK just created above.
Now, do a full project build to ensure there are no errors.
With all of the above done, you should be able to debug the tests under python/pyspark/tests, after a few environment
variables are set up.  The easiest way to do this is to create a new debug configuration for one of the Python tests
(which will probably fail initially), then edit it.  You will need to set the following:
- Working directory: /path/to/source/spark
- Environment variables:
- PYSPARK_PYTHON=- /path/to/your/python(the same one the SDK points to)
- PYSPARK_DRIVER_VERSION=- $PYSPARK_PYTHON
 
At this point, you should be able to re-run the debug configuration and hit breakpoints.


