diff --git a/docs/notes/dev-tools/google-colab/files.qmd b/docs/notes/dev-tools/google-colab/files.qmd
index 20e3dd1..4a7fd85 100644
--- a/docs/notes/dev-tools/google-colab/files.qmd
+++ b/docs/notes/dev-tools/google-colab/files.qmd
@@ -2,31 +2,32 @@
Let's take a few moments to explore the \"Files\" menu in the Google Colab left sidebar.
-We see there are some example files in the \"sample_data\" directory.
+We see there are some example files in the \"sample_data\" directory:
-![Example files in the Colab filesystem](../../../images/colab-filesystem.png){height=350 fig-align="center"}
+![Example files in the Colab filesystem.](../../../images/colab-filesystem.png){height=350 fig-align="center"}
## Downloading Files
-Observe, it is possible to download files like these from the Colab filesystem to your local machine, by right-clicking on them.
+It is possible to download files like these from the Colab filesystem to your local machine by right-clicking on them:
![Downloading files from the Colab filesystem.](../../../images/colab-file-download.png){height=350 fig-align="center"}
## Uploading Files
-And it is possible to upload files from your local machine to the Colab filesystem as well, using the "Files > Upload to session storage" menu option (i.e. the button with the file upload icon).
+It is also possible to upload files from your local machine to the Colab filesystem, using the "Files > Upload to session storage" menu option (i.e. the button with the file upload icon):
![Uploading files to the Colab filesystem.](../../../images/colab-file-upload.png){height=350 fig-align="center"}
## Accessing and Manipulating Files
-Once we have the files in the Colab filesystem, we can write Python code to access and manipulate them.
+Once we have the files in the Colab filesystem, we can write Python code to access and manipulate them:
+ One way of interacting with the filesystem in Python is by using the capabilities of [the `os` module](../../python-modules/os.ipynb).
+ For reading and writing text (\".txt\") files, we can leverage the `open` function (see [Text File Operations](../../python-lang/file-operations.qmd)).
- + For reading and writing tabular data (\".csv\") files, we can leverage the `pandas` package (see [Getting Started with Pandas](https://prof-rossetti.github.io/applied-data-science-python-book/notes/pandas/obtaining-dataframes.html)).
+ + For reading and writing tabular data (\".csv\") files, we can leverage the `pandas` package. The `pandas` package is a foundational component of the Python ecosystem, and provides capabilities for processing tabular data. Although outside the scope of this book, working with tabular data is covered in more detail in the professor's [Applied Data Science in Python](https://prof-rossetti.github.io/applied-data-science-python-book/notes/pandas/obtaining-dataframes.html) book.
+
Some of these examples might seem a bit complicated for beginners at the moment. For now, the main takeaway is that we can write Python code to interact with the surrounding environment, specifically by accessing and manipulating the filesystem.
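
As a minimal sketch of this idea, the example below writes a small text file, lists the directory that contains it, and reads the contents back. The "my_notes.txt" filename and the temporary directory are assumptions made for illustration only; in Colab we might work in the "content" directory instead:

```python
import os
import tempfile

# Working inside a throwaway directory, so the sketch does not touch any real files:
with tempfile.TemporaryDirectory() as workdir:
    # "my_notes.txt" is a hypothetical filename, chosen for illustration:
    filepath = os.path.join(workdir, "my_notes.txt")

    # Writing a text file, using the built-in open function:
    with open(filepath, "w") as file:
        file.write("Hello from the filesystem!")

    # Listing the contents of the directory, using the os module:
    filenames = os.listdir(workdir)
    print(filenames)

    # Reading the file contents back:
    with open(filepath, "r") as file:
        contents = file.read()
    print(contents)
```

The temporary directory and everything in it are removed automatically when the `with` block ends, so the example cleans up after itself.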
diff --git a/docs/notes/fetching-data/csv.qmd b/docs/notes/fetching-data/csv.qmd
index b45e0e5..3d3dc70 100644
--- a/docs/notes/fetching-data/csv.qmd
+++ b/docs/notes/fetching-data/csv.qmd
@@ -61,3 +61,5 @@ Calculating the average grade (using series aggregation methods):
print(grades_column.mean())
print(grades_column.median())
```
+
+The `pandas` package is a foundational component of the Python ecosystem, and provides many additional capabilities for processing tabular data. Although outside the scope of this book, working with tabular data is covered in more detail in the professor's [Applied Data Science in Python](https://prof-rossetti.github.io/applied-data-science-python-book/notes/pandas/obtaining-dataframes.html) book.
diff --git a/docs/notes/python-modules/os.ipynb b/docs/notes/python-modules/os.ipynb
index 56e8910..0e32c95 100644
--- a/docs/notes/python-modules/os.ipynb
+++ b/docs/notes/python-modules/os.ipynb
@@ -1,976 +1,1004 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "qNAdiHklyLhf"
- },
- "source": [
- "# The `os` Module"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "vgTIx3lx4QMT"
- },
- "source": [
- "## Filesystem Operations\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FzZ8bCfCodn6"
- },
- "source": [
- "\n",
- "The [`os` module](https://docs.python.org/3/library/os.html) helps us access and manipulate the file system.\n",
- "\n",
- "\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "tHcQfhazBX2m"
- },
- "source": [
- "### Current Working Directory"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "o0EYbKxJpScy"
- },
- "source": [
- "Detecting the name of the current working directory, using the [`getcwd` function](https://docs.python.org/3/library/os.html#os.getcwd):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/",
- "height": 35
- },
- "id": "ZiSowuruobqs",
- "outputId": "517edc8c-bcd3-4689-f038-779814517482"
- },
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "'/content'"
- ],
- "application/vnd.google.colaboratory.intrinsic+json": {
- "type": "string"
- }
- },
- "metadata": {},
- "execution_count": 1
- }
- ],
- "source": [
- "import os\n",
- "\n",
- "os.getcwd()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "pVJCJphfBX2o"
- },
- "source": [
- "We see in Google Colab the default working directory is called \"content\"."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "CtE1TQbIBX2o"
- },
- "source": [
- "### Listing Files"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "DpkyEiouqLXg"
- },
- "source": [
- "Listing all files and folders that exist in a given directory (for example in the \"content\" directory where we are right now), using the [`listdir` function](https://docs.python.org/3/library/os.html#os.listdir):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "stXz9D5PqOfC",
- "outputId": "0aa7e816-3dee-48a4-b945-f95db7179f6d"
- },
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['.config', 'sample_data']"
- ]
- },
- "metadata": {},
- "execution_count": 2
- }
- ],
- "source": [
- "os.listdir(\"/content\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "0tzsYbvrrRsR"
- },
- "source": [
- "We see there is a \"sample_data\" directory.\n",
- "\n",
- "After further inspection, we see it contains some example text and data files:"
- ]
- },
- {
- "cell_type": "code",
- "source": [
- "os.listdir(\"/content/sample_data\")"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "3Zjx7ZzJCdNG",
- "outputId": "dcc98d5d-12e0-4b47-b812-5230ad3c5448"
- },
- "execution_count": 3,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['anscombe.json',\n",
- " 'README.md',\n",
- " 'california_housing_train.csv',\n",
- " 'mnist_train_small.csv',\n",
- " 'mnist_test.csv',\n",
- " 'california_housing_test.csv']"
- ]
- },
- "metadata": {},
- "execution_count": 3
- }
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- ":::{.callout-note}\n",
- "So far we have used an absolute file reference, but since we are already in the \"content\" directory, it is possible to use a relative file references instead. These references are relative to the \"content\" directory, where we are right now."
- ],
- "metadata": {
- "id": "vKI0O7xwIMny"
- }
- },
- {
- "cell_type": "code",
- "source": [
- "os.listdir()"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "wWzDFlsXLAi0",
- "outputId": "44e39f99-5815-410d-a7a3-ffac2fefcb94"
- },
- "execution_count": 4,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['.config', 'sample_data']"
- ]
- },
- "metadata": {},
- "execution_count": 4
- }
- ]
- },
- {
- "cell_type": "code",
- "source": [
- "os.listdir(\"sample_data\")"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "yvofS2THINPI",
- "outputId": "fb385438-9aca-4e99-b22b-d0a3b9c3fb09"
- },
- "execution_count": 5,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['anscombe.json',\n",
- " 'README.md',\n",
- " 'california_housing_train.csv',\n",
- " 'mnist_train_small.csv',\n",
- " 'mnist_test.csv',\n",
- " 'california_housing_test.csv']"
- ]
- },
- "metadata": {},
- "execution_count": 5
- }
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- ":::"
- ],
- "metadata": {
- "id": "RHrWnjdFJAHX"
- }
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "veHbkh2sBX2p"
- },
- "source": [
- "### Detecting Directories and Files"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "z4JD64T3rcBq"
- },
- "source": [
- "Checking to see whether a given directory or file exists, using the `isdir` and `isfile` functions from the `os.path` sub-module:\n"
- ]
- },
- {
- "cell_type": "code",
- "source": [
- "os.path.isdir(\"sample_data\")"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "QtSSz9FiLnHj",
- "outputId": "44859195-f80a-46da-b2bb-05402519d374"
- },
- "execution_count": 6,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "metadata": {},
- "execution_count": 6
- }
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "RLZdJSILrgiy",
- "outputId": "ace33995-bd71-45e4-b39d-20ad16cf3c18"
- },
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "metadata": {},
- "execution_count": 7
- }
- ],
- "source": [
- "os.path.isfile(\"sample_data/README.md\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "rC65gC5IBX2r"
- },
- "source": [
- "### Reading and Writing Files"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "c84DC2iZr4Y_"
- },
- "source": [
- "See the [Text File Operations](../python-lang/file-operations.qmd) chapter for examples of how to read and write text files using the `open` function:"
- ]
- },
- {
- "cell_type": "code",
- "source": [
- "with open(\"sample_data/README.md\", \"r\") as file:\n",
- " print(file.read())"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "pw1FwYUEEAiy",
- "outputId": "de78fe0f-3ffc-421b-bfe8-17061e551de6"
- },
- "execution_count": 8,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "This directory includes a few sample datasets to get you started.\n",
- "\n",
- "* `california_housing_data*.csv` is California housing data from the 1990 US\n",
- " Census; more information is available at:\n",
- " https://docs.google.com/document/d/e/2PACX-1vRhYtsvc5eOR2FWNCwaBiKL6suIOrxJig8LcSBbmCbyYsayia_DvPOOBlXZ4CAlQ5nlDD8kTaIDRwrN/pub\n",
- "\n",
- "* `mnist_*.csv` is a small sample of the\n",
- " [MNIST database](https://en.wikipedia.org/wiki/MNIST_database), which is\n",
- " described at: http://yann.lecun.com/exdb/mnist/\n",
- "\n",
- "* `anscombe.json` contains a copy of\n",
- " [Anscombe's quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet); it\n",
- " was originally described in\n",
- "\n",
- " Anscombe, F. J. (1973). 'Graphs in Statistical Analysis'. American\n",
- " Statistician. 27 (1): 17-21. JSTOR 2682899.\n",
- "\n",
- " and our copy was prepared by the\n",
- " [vega_datasets library](https://github.com/altair-viz/vega_datasets/blob/4f67bdaad10f45e3549984e17e1b3088c731503d/vega_datasets/_data/anscombe.json).\n",
- "\n"
- ]
- }
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- "See [Getting Started with Pandas](https://prof-rossetti.github.io/applied-data-science-python-book/notes/pandas/obtaining-dataframes.html) for examples of how to read and write tabular data files using the `pandas` package:"
- ],
- "metadata": {
- "id": "kvSy9q9PHpXq"
- }
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qNAdiHklyLhf"
+ },
+ "source": [
+ "# The `os` Module"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vgTIx3lx4QMT"
+ },
+ "source": [
+ "## Filesystem Operations\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FzZ8bCfCodn6"
+ },
+ "source": [
+ "\n",
+ "The [`os` module](https://docs.python.org/3/library/os.html) helps us access and manipulate the file system.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tHcQfhazBX2m"
+ },
+ "source": [
+ "### Current Working Directory"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "o0EYbKxJpScy"
+ },
+ "source": [
+ "Detecting the name of the current working directory, using the [`getcwd` function](https://docs.python.org/3/library/os.html#os.getcwd):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
},
+ "id": "ZiSowuruobqs",
+ "outputId": "517edc8c-bcd3-4689-f038-779814517482"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "source": [
- "from pandas import read_csv\n",
- "\n",
- "df = read_csv(\"sample_data/california_housing_test.csv\")\n",
- "df.head()"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/",
- "height": 226
- },
- "id": "kIbHer2eHoPv",
- "outputId": "d80821d8-852b-4535-ea5c-80e67207e22a"
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
},
- "execution_count": 9,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- " longitude latitude housing_median_age total_rooms total_bedrooms \\\n",
- "0 -122.05 37.37 27.0 3885.0 661.0 \n",
- "1 -118.30 34.26 43.0 1510.0 310.0 \n",
- "2 -117.81 33.78 27.0 3589.0 507.0 \n",
- "3 -118.36 33.82 28.0 67.0 15.0 \n",
- "4 -119.67 36.33 19.0 1241.0 244.0 \n",
- "\n",
- " population households median_income median_house_value \n",
- "0 1537.0 606.0 6.6085 344700.0 \n",
- "1 809.0 277.0 3.5990 176500.0 \n",
- "2 1484.0 495.0 5.7934 270500.0 \n",
- "3 49.0 11.0 6.1359 330000.0 \n",
- "4 850.0 237.0 2.9375 81700.0 "
- ],
- "application/vnd.google.colaboratory.intrinsic+json": {
- "type": "dataframe",
- "variable_name": "df",
- "summary": "{\n \"name\": \"df\",\n \"rows\": 3000,\n \"fields\": [\n {\n \"column\": \"longitude\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 1.9949362939550161,\n \"min\": -124.18,\n \"max\": -114.49,\n \"num_unique_values\": 607,\n \"samples\": [\n -121.15,\n -121.46,\n -121.02\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"latitude\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 2.1296695233438325,\n \"min\": 32.56,\n \"max\": 41.92,\n \"num_unique_values\": 587,\n \"samples\": [\n 40.17,\n 33.69,\n 39.61\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"housing_median_age\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 12.555395554955755,\n \"min\": 1.0,\n \"max\": 52.0,\n \"num_unique_values\": 52,\n \"samples\": [\n 14.0,\n 49.0,\n 7.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"total_rooms\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 2155.59333162558,\n \"min\": 6.0,\n \"max\": 30450.0,\n \"num_unique_values\": 2215,\n \"samples\": [\n 1961.0,\n 1807.0,\n 680.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"total_bedrooms\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 415.6543681363232,\n \"min\": 2.0,\n \"max\": 5419.0,\n \"num_unique_values\": 1055,\n \"samples\": [\n 532.0,\n 764.0,\n 2162.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"population\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 1030.5430124122422,\n \"min\": 5.0,\n \"max\": 11935.0,\n \"num_unique_values\": 1802,\n \"samples\": [\n 947.0,\n 1140.0,\n 2019.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"households\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 365.42270980552604,\n \"min\": 2.0,\n \"max\": 4930.0,\n \"num_unique_values\": 1026,\n \"samples\": [\n 646.0,\n 
629.0,\n 504.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"median_income\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 1.854511729691481,\n \"min\": 0.4999,\n \"max\": 15.0001,\n \"num_unique_values\": 2578,\n \"samples\": [\n 1.725,\n 0.7403,\n 2.6964\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"median_house_value\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 113119.68746964433,\n \"min\": 22500.0,\n \"max\": 500001.0,\n \"num_unique_values\": 1784,\n \"samples\": [\n 71900.0,\n 63000.0,\n 115800.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"
- }
- },
- "metadata": {},
- "execution_count": 9
- }
+ "text/plain": [
+ "'/content'"
]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import os\n",
+ "\n",
+ "os.getcwd()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "pVJCJphfBX2o"
+ },
+ "source": [
+ "We see in Google Colab the default working directory is called \"content\"."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CtE1TQbIBX2o"
+ },
+ "source": [
+ "### Listing Files"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DpkyEiouqLXg"
+ },
+ "source": [
+ "Listing all files and folders that exist in a given directory (for example in the \"content\" directory where we are right now), using the [`listdir` function](https://docs.python.org/3/library/os.html#os.listdir):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "stXz9D5PqOfC",
+ "outputId": "0aa7e816-3dee-48a4-b945-f95db7179f6d"
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {
- "id": "_17RzoYDBX2q"
- },
- "source": [
- "### Deleting Files"
+ "data": {
+ "text/plain": [
+ "['.config', 'sample_data']"
]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.listdir(\"/content\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0tzsYbvrrRsR"
+ },
+ "source": [
+ "We see there is a \"sample_data\" directory.\n",
+ "\n",
+ "After further inspection, we see it contains some example text and data files:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "3Zjx7ZzJCdNG",
+ "outputId": "dcc98d5d-12e0-4b47-b812-5230ad3c5448"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['anscombe.json',\n",
+ " 'README.md',\n",
+ " 'california_housing_train.csv',\n",
+ " 'mnist_train_small.csv',\n",
+ " 'mnist_test.csv',\n",
+ " 'california_housing_test.csv']"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.listdir(\"/content/sample_data\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vKI0O7xwIMny"
+ },
+ "source": [
+ ":::{.callout-note}\n",
+ "So far we have used absolute file references, but since we are already in the \"content\" directory, it is possible to use relative file references instead. These references are relative to the \"content\" directory, where we are right now."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "wWzDFlsXLAi0",
+ "outputId": "44e39f99-5815-410d-a7a3-ffac2fefcb94"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "source": [
- "filepath = \"sample_data/anscombe.json\"\n",
- "\n",
- "# verifying the file exists:\n",
- "os.path.isfile(filepath)"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "it4IQL4sNC7o",
- "outputId": "55dc1ebc-ee67-4ec1-d8fd-8aaf2913fdb6"
- },
- "execution_count": 10,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "metadata": {},
- "execution_count": 10
- }
+ "data": {
+ "text/plain": [
+ "['.config', 'sample_data']"
]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.listdir()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "yvofS2THINPI",
+ "outputId": "fb385438-9aca-4e99-b22b-d0a3b9c3fb09"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['anscombe.json',\n",
+ " 'README.md',\n",
+ " 'california_housing_train.csv',\n",
+ " 'mnist_train_small.csv',\n",
+ " 'mnist_test.csv',\n",
+ " 'california_housing_test.csv']"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.listdir(\"sample_data\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For the remainder of this chapter we will continue using relative references, for simplicity."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RHrWnjdFJAHX"
+ },
+ "source": [
+ ":::"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "veHbkh2sBX2p"
+ },
+ "source": [
+ "### Detecting Directories and Files"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "z4JD64T3rcBq"
+ },
+ "source": [
+ "Checking to see whether a given directory or file exists, using the `isdir` and `isfile` functions from the `os.path` sub-module:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "QtSSz9FiLnHj",
+ "outputId": "44859195-f80a-46da-b2bb-05402519d374"
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {
- "id": "wLw8JfYL2tdc"
- },
- "source": [
- "Deleting a file, using the [`remove` function](https://docs.python.org/3/library/os.html#os.remove):"
+ "data": {
+ "text/plain": [
+ "True"
]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.path.isdir(\"sample_data\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "RLZdJSILrgiy",
+ "outputId": "ace33995-bd71-45e4-b39d-20ad16cf3c18"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {
- "id": "oGSWOswR2u9A"
- },
- "outputs": [],
- "source": [
- "os.remove(filepath)"
+ "data": {
+ "text/plain": [
+ "True"
]
- },
- {
- "cell_type": "code",
- "source": [
- "# verifying the file was deleted:\n",
- "os.path.isfile(filepath)"
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "os.path.isfile(\"sample_data/README.md\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "_17RzoYDBX2q"
+ },
+ "source": [
+ "### Deleting Files"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "it4IQL4sNC7o",
+ "outputId": "55dc1ebc-ee67-4ec1-d8fd-8aaf2913fdb6"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {
- "id": "vugRrjZvpyA5"
- },
- "outputs": [],
- "source": [
- "os.makedirs(\"my_data\", exist_ok=True)"
+ "data": {
+ "text/plain": [
+ "True"
]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "filepath = \"sample_data/anscombe.json\"\n",
+ "\n",
+ "# verifying the file exists:\n",
+ "os.path.isfile(filepath)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "wLw8JfYL2tdc"
+ },
+ "source": [
+ "Deleting a file, using the [`remove` function](https://docs.python.org/3/library/os.html#os.remove):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "oGSWOswR2u9A"
+ },
+ "outputs": [],
+ "source": [
+ "os.remove(filepath)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "zLBd1XF-BdSo",
+ "outputId": "eb30b9ef-4393-4997-dfab-d9fafa726358"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "hUSjUFttq24k",
- "outputId": "5ad3fd93-8221-40e9-89c9-13cdc10219aa"
- },
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['.config', 'my_data', 'sample_data']"
- ]
- },
- "metadata": {},
- "execution_count": 14
- }
- ],
- "source": [
- "# verifying the \"my_data\" directory got created:\n",
- "os.listdir()"
+ "data": {
+ "text/plain": [
+ "False"
]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# verifying the file was deleted:\n",
+ "os.path.isfile(filepath)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "wYhlyEdzBX2q"
+ },
+ "source": [
+ "### Creating Directories"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "xgRz036LpyiW"
+ },
+ "source": [
+ "Creating a new directory using the [`makedirs` function](https://docs.python.org/3/library/os.html#os.makedirs):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "id": "vugRrjZvpyA5"
+ },
+ "outputs": [],
+ "source": [
+ "os.makedirs(\"my_data\", exist_ok=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
},
+ "id": "hUSjUFttq24k",
+ "outputId": "5ad3fd93-8221-40e9-89c9-13cdc10219aa"
+ },
+ "outputs": [
{
- "cell_type": "code",
- "source": [
- "#| echo: false\n",
- "\n",
- "#df.to_csv(\"my_data/california_housing_test.csv\", index=False)"
- ],
- "metadata": {
- "id": "t_vqxL5IQuUZ"
- },
- "execution_count": 19,
- "outputs": []
- },
- {
- "cell_type": "markdown",
- "source": [
- "To delete an empty directory, we can use the `rmdir` function from the `os` module, however it only works for empty directories and throws an error if the directory does not exist. So for a more robust solution, we can use the [`rmtree` function](https://docs.python.org/3/library/shutil.html#shutil.rmtree) from the `shutil` module:"
- ],
- "metadata": {
- "id": "2rtE_R03PSFQ"
- }
- },
- {
- "cell_type": "code",
- "source": [
- "#os.rmdir(\"my_data\")"
- ],
- "metadata": {
- "id": "x8cRgG0IPqo3"
- },
- "execution_count": 21,
- "outputs": []
- },
- {
- "cell_type": "code",
- "source": [
- "from shutil import rmtree\n",
- "\n",
- "rmtree(\"my_data\", ignore_errors=True)"
- ],
- "metadata": {
- "id": "LjpFLVTFOoIx"
- },
- "execution_count": 16,
- "outputs": []
- },
- {
- "cell_type": "code",
- "source": [
- "# verifying the \"my_data\" directory got deleted:\n",
- "os.listdir()"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "fOk_jjKTOrgJ",
- "outputId": "c86a055e-3ba6-42dd-fdce-2bec8269b4e6"
- },
- "execution_count": 17,
- "outputs": [
- {
- "output_type": "execute_result",
- "data": {
- "text/plain": [
- "['.config', 'sample_data']"
- ]
- },
- "metadata": {},
- "execution_count": 17
- }
+ "data": {
+ "text/plain": [
+ "['.config', 'my_data', 'sample_data']"
]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
}
- ],
- "metadata": {
+ ],
+ "source": [
+ "# verifying the \"my_data\" directory got created:\n",
+ "os.listdir()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "id": "t_vqxL5IQuUZ"
+ },
+ "outputs": [],
+ "source": [
+ "#| echo: false\n",
+ "\n",
+ "#df.to_csv(\"my_data/california_housing_test.csv\", index=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Deleting Directories"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2rtE_R03PSFQ"
+ },
+ "source": [
+ "To delete a directory, we can use the `rmdir` function from the `os` module. However, it only works for empty directories, and raises an error if the directory does not exist. For a more robust solution, we can use the [`rmtree` function](https://docs.python.org/3/library/shutil.html#shutil.rmtree) from the `shutil` module, which also deletes any contents of the directory:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {
+ "id": "x8cRgG0IPqo3"
+ },
+ "outputs": [],
+ "source": [
+ "#os.rmdir(\"my_data\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "id": "LjpFLVTFOoIx"
+ },
+ "outputs": [],
+ "source": [
+ "from shutil import rmtree\n",
+ "\n",
+ "rmtree(\"my_data\", ignore_errors=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
"colab": {
- "provenance": []
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
+ "base_uri": "https://localhost:8080/"
},
- "language_info": {
- "name": "python"
+ "id": "fOk_jjKTOrgJ",
+ "outputId": "c86a055e-3ba6-42dd-fdce-2bec8269b4e6"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['.config', 'sample_data']"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
}
+ ],
+ "source": [
+ "# verifying the \"my_data\" directory got deleted:\n",
+ "os.listdir()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
},
- "nbformat": 4,
- "nbformat_minor": 0
-}
\ No newline at end of file
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}