Commit c971c8c

sam authored
fix: prevent environment pollution when importing pyspark (#1552)
* fix: prevent environment pollution when importing pyspark

  Signed-off-by: Sam Goodwin <[email protected]>

* fix: only pop if dirty

  Signed-off-by: Sam Goodwin <[email protected]>

---------

Signed-off-by: Sam Goodwin <[email protected]>
1 parent 18717fb commit c971c8c

1 file changed: +10 -3 lines changed

pandera/external_config.py

Lines changed: 10 additions & 3 deletions
@@ -2,6 +2,9 @@
 
 import os
 
+is_spark_local_ip_dirty = False
+is_pyarrow_ignore_timezone_dirty = False
+
 try:
     # try importing pyspark to see if it exists. This is important because the
     # pandera.typing module defines a Series type that inherits from
@@ -10,12 +13,16 @@
     # https://spark.apache.org/docs/3.2.0/api/python/user_guide/pandas_on_spark/typehints.html#type-hinting-with-names
     # pylint: disable=unused-import
     if os.getenv("SPARK_LOCAL_IP") is None:
+        is_spark_local_ip_dirty = True
         os.environ["SPARK_LOCAL_IP"] = "127.0.0.1"
     if os.getenv("PYARROW_IGNORE_TIMEZONE") is None:
+        is_pyarrow_ignore_timezone_dirty = True
         # This can be overriden by the user
         os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
 
     import pyspark.pandas
-except ImportError:
-    os.environ.pop("SPARK_LOCAL_IP")
-    os.environ.pop("PYARROW_IGNORE_TIMEZONE")
+finally:
+    if is_spark_local_ip_dirty:
+        os.environ.pop("SPARK_LOCAL_IP")
+    if is_pyarrow_ignore_timezone_dirty:
+        os.environ.pop("PYARROW_IGNORE_TIMEZONE")
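
The "only pop if dirty" idea in this change can be sketched in isolation. The helper below is hypothetical and not part of pandera: `temporary_env_defaults` is an illustrative name, and it uses a context manager rather than the module-level flags in the diff. It assumes the goal is to set a default only when the variable is unset, and on exit to remove only the variables it set, so values supplied by the user are never touched.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env_defaults(defaults):
    """Hypothetical helper: apply defaults for unset variables only.

    On exit, remove only the variables this helper set ("dirty" ones),
    whether or not the body raised.
    """
    dirty = []
    for name, value in defaults.items():
        if os.getenv(name) is None:
            os.environ[name] = value
            dirty.append(name)
    try:
        yield dirty
    finally:
        for name in dirty:
            # Default of None avoids a KeyError if the body removed it already.
            os.environ.pop(name, None)


# Usage: a value set by the user survives, a default set by the helper does not.
os.environ.pop("SPARK_LOCAL_IP", None)       # simulate "unset by the user"
os.environ["PYARROW_IGNORE_TIMEZONE"] = "0"  # simulate "set by the user"

with temporary_env_defaults(
    {"SPARK_LOCAL_IP": "127.0.0.1", "PYARROW_IGNORE_TIMEZONE": "1"}
) as dirty:
    inside = (os.environ["SPARK_LOCAL_IP"], os.environ["PYARROW_IGNORE_TIMEZONE"])

after = (os.getenv("SPARK_LOCAL_IP"), os.environ["PYARROW_IGNORE_TIMEZONE"])
```

Tracking which names were set, instead of popping unconditionally, is what keeps a user-provided `SPARK_LOCAL_IP` or `PYARROW_IGNORE_TIMEZONE` intact. Doing the cleanup in `finally` means it runs on both the success and the failure path.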
