diff --git a/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb b/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb index 177d21df6..92b56dcc9 100644 --- a/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb +++ b/section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb @@ -4640,9 +4640,7 @@ "id": "wRIwqSrRUtqm" }, "source": [ - "### TK - The magic of transfer learning\n", - "\n", - "UPTOHERE\n", + "### The magic of transfer learning\n", "\n", "**Transfer learning** is the process of getting an existing working model and adjusting it to your own problem.\n", "\n", @@ -4663,12 +4661,16 @@ "\n", "And the good news is, there are plenty of places to find pretrained models!\n", "\n", - "* [`tf.keras.applications`](https://www.tensorflow.org/api_docs/python/tf/keras/applications) - A module built-in to TensorFlow and Keras with a series of pretrained models ready to use.\n", - "* TK - keras CV, keras NLP \n", - "* [`Hugging Face Models Hub`](https://huggingface.co/models) - A large collection of pretrained models on a wide range on tasks, from computer vision to natural language processing to audio processing. \n", - "* [`Kaggle Models`](https://www.kaggle.com/models) - A huge collection of different pretrained models for many different tasks.\n", + "| Resource | Description |\n", + "| :--- | :--- |\n", + "| [`tf.keras.applications`](https://www.tensorflow.org/api_docs/python/tf/keras/applications) | A module built into TensorFlow and Keras with a series of pretrained models ready to use. |\n", + "| [KerasNLP](https://keras.io/keras_nlp/) and [KerasCV](https://keras.io/keras_cv/) | Two dedicated libraries for NLP (natural language processing) and CV (computer vision), each of which includes many modality-specific APIs and is capable of running with TensorFlow, JAX or PyTorch. 
|\n", + "| [Hugging Face Models Hub](https://huggingface.co/models) | A large collection of pretrained models on a wide range on tasks, from computer vision to natural language processing to audio processing. |\n", + "| [Kaggle Models](https://www.kaggle.com/models) | A huge collection of different pretrained models for many different tasks. |\n", "\n", - "TK - image/table of where to find pretrained models\n", + "\"Four\n", + "\n", + "*Different locations to find pretrained models. This list is consistantly expanding as machine learning becomes more and more open-source.* \n", "\n", "> **Note:** For most new machine learning problems, if you're looking to get good results quickly, you should generally look for a pretrained model similar to your problem and use transfer learning to adapt it to your own domain.\n", "\n", @@ -4728,799 +4730,19 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "3IQe-aO4aOGp", - "outputId": "40cd7f57-7451-4823-fc19-833b0d6c44d6" + "outputId": "40cd7f57-7451-4823-fc19-833b0d6c44d6", + "tags": [] }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model: \"efficientnetv2-b0\"\n", - "__________________________________________________________________________________________________\n", - " Layer (type) Output Shape Param # Connected to \n", - "==================================================================================================\n", - " input_1 (InputLayer) [(None, 224, 224, 3)] 0 [] \n", - " \n", - " rescaling (Rescaling) (None, 224, 224, 3) 0 ['input_1[0][0]'] \n", - " \n", - " normalization (Normalizati (None, 224, 224, 3) 0 ['rescaling[0][0]'] \n", - " on) \n", - " \n", - " stem_conv (Conv2D) (None, 112, 112, 32) 864 ['normalization[0][0]'] \n", - " \n", - " stem_bn (BatchNormalizatio (None, 112, 112, 32) 128 ['stem_conv[0][0]'] \n", - " n) \n", - " \n", - " stem_activation (Activatio (None, 
112, 112, 32) 0 ['stem_bn[0][0]'] \n", - " n) \n", - " \n", - " block1a_project_conv (Conv (None, 112, 112, 16) 4608 ['stem_activation[0][0]'] \n", - " 2D) \n", - " \n", - " block1a_project_bn (BatchN (None, 112, 112, 16) 64 ['block1a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block1a_project_activation (None, 112, 112, 16) 0 ['block1a_project_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block2a_expand_conv (Conv2 (None, 56, 56, 64) 9216 ['block1a_project_activation[0\n", - " D) ][0]'] \n", - " \n", - " block2a_expand_bn (BatchNo (None, 56, 56, 64) 256 ['block2a_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block2a_expand_activation (None, 56, 56, 64) 0 ['block2a_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block2a_project_conv (Conv (None, 56, 56, 32) 2048 ['block2a_expand_activation[0]\n", - " 2D) [0]'] \n", - " \n", - " block2a_project_bn (BatchN (None, 56, 56, 32) 128 ['block2a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block2b_expand_conv (Conv2 (None, 56, 56, 128) 36864 ['block2a_project_bn[0][0]'] \n", - " D) \n", - " \n", - " block2b_expand_bn (BatchNo (None, 56, 56, 128) 512 ['block2b_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block2b_expand_activation (None, 56, 56, 128) 0 ['block2b_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block2b_project_conv (Conv (None, 56, 56, 32) 4096 ['block2b_expand_activation[0]\n", - " 2D) [0]'] \n", - " \n", - " block2b_project_bn (BatchN (None, 56, 56, 32) 128 ['block2b_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block2b_drop (Dropout) (None, 56, 56, 32) 0 ['block2b_project_bn[0][0]'] \n", - " \n", - " block2b_add (Add) (None, 56, 56, 32) 0 ['block2b_drop[0][0]', \n", - " 'block2a_project_bn[0][0]'] \n", - " \n", - " block3a_expand_conv (Conv2 (None, 28, 28, 128) 36864 ['block2b_add[0][0]'] \n", - " D) \n", - " \n", - " block3a_expand_bn (BatchNo (None, 28, 28, 128) 512 ['block3a_expand_conv[0][0]'] \n", - " 
rmalization) \n", - " \n", - " block3a_expand_activation (None, 28, 28, 128) 0 ['block3a_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block3a_project_conv (Conv (None, 28, 28, 48) 6144 ['block3a_expand_activation[0]\n", - " 2D) [0]'] \n", - " \n", - " block3a_project_bn (BatchN (None, 28, 28, 48) 192 ['block3a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block3b_expand_conv (Conv2 (None, 28, 28, 192) 82944 ['block3a_project_bn[0][0]'] \n", - " D) \n", - " \n", - " block3b_expand_bn (BatchNo (None, 28, 28, 192) 768 ['block3b_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block3b_expand_activation (None, 28, 28, 192) 0 ['block3b_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block3b_project_conv (Conv (None, 28, 28, 48) 9216 ['block3b_expand_activation[0]\n", - " 2D) [0]'] \n", - " \n", - " block3b_project_bn (BatchN (None, 28, 28, 48) 192 ['block3b_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block3b_drop (Dropout) (None, 28, 28, 48) 0 ['block3b_project_bn[0][0]'] \n", - " \n", - " block3b_add (Add) (None, 28, 28, 48) 0 ['block3b_drop[0][0]', \n", - " 'block3a_project_bn[0][0]'] \n", - " \n", - " block4a_expand_conv (Conv2 (None, 28, 28, 192) 9216 ['block3b_add[0][0]'] \n", - " D) \n", - " \n", - " block4a_expand_bn (BatchNo (None, 28, 28, 192) 768 ['block4a_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block4a_expand_activation (None, 28, 28, 192) 0 ['block4a_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block4a_dwconv2 (Depthwise (None, 14, 14, 192) 1728 ['block4a_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block4a_bn (BatchNormaliza (None, 14, 14, 192) 768 ['block4a_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block4a_activation (Activa (None, 14, 14, 192) 0 ['block4a_bn[0][0]'] \n", - " tion) \n", - " \n", - " block4a_se_squeeze (Global (None, 192) 0 ['block4a_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block4a_se_reshape 
(Reshap (None, 1, 1, 192) 0 ['block4a_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block4a_se_reduce (Conv2D) (None, 1, 1, 12) 2316 ['block4a_se_reshape[0][0]'] \n", - " \n", - " block4a_se_expand (Conv2D) (None, 1, 1, 192) 2496 ['block4a_se_reduce[0][0]'] \n", - " \n", - " block4a_se_excite (Multipl (None, 14, 14, 192) 0 ['block4a_activation[0][0]', \n", - " y) 'block4a_se_expand[0][0]'] \n", - " \n", - " block4a_project_conv (Conv (None, 14, 14, 96) 18432 ['block4a_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block4a_project_bn (BatchN (None, 14, 14, 96) 384 ['block4a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block4b_expand_conv (Conv2 (None, 14, 14, 384) 36864 ['block4a_project_bn[0][0]'] \n", - " D) \n", - " \n", - " block4b_expand_bn (BatchNo (None, 14, 14, 384) 1536 ['block4b_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block4b_expand_activation (None, 14, 14, 384) 0 ['block4b_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block4b_dwconv2 (Depthwise (None, 14, 14, 384) 3456 ['block4b_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block4b_bn (BatchNormaliza (None, 14, 14, 384) 1536 ['block4b_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block4b_activation (Activa (None, 14, 14, 384) 0 ['block4b_bn[0][0]'] \n", - " tion) \n", - " \n", - " block4b_se_squeeze (Global (None, 384) 0 ['block4b_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block4b_se_reshape (Reshap (None, 1, 1, 384) 0 ['block4b_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block4b_se_reduce (Conv2D) (None, 1, 1, 24) 9240 ['block4b_se_reshape[0][0]'] \n", - " \n", - " block4b_se_expand (Conv2D) (None, 1, 1, 384) 9600 ['block4b_se_reduce[0][0]'] \n", - " \n", - " block4b_se_excite (Multipl (None, 14, 14, 384) 0 ['block4b_activation[0][0]', \n", - " y) 'block4b_se_expand[0][0]'] \n", - " \n", - " block4b_project_conv (Conv (None, 14, 14, 96) 36864 ['block4b_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " 
block4b_project_bn (BatchN (None, 14, 14, 96) 384 ['block4b_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block4b_drop (Dropout) (None, 14, 14, 96) 0 ['block4b_project_bn[0][0]'] \n", - " \n", - " block4b_add (Add) (None, 14, 14, 96) 0 ['block4b_drop[0][0]', \n", - " 'block4a_project_bn[0][0]'] \n", - " \n", - " block4c_expand_conv (Conv2 (None, 14, 14, 384) 36864 ['block4b_add[0][0]'] \n", - " D) \n", - " \n", - " block4c_expand_bn (BatchNo (None, 14, 14, 384) 1536 ['block4c_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block4c_expand_activation (None, 14, 14, 384) 0 ['block4c_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block4c_dwconv2 (Depthwise (None, 14, 14, 384) 3456 ['block4c_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block4c_bn (BatchNormaliza (None, 14, 14, 384) 1536 ['block4c_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block4c_activation (Activa (None, 14, 14, 384) 0 ['block4c_bn[0][0]'] \n", - " tion) \n", - " \n", - " block4c_se_squeeze (Global (None, 384) 0 ['block4c_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block4c_se_reshape (Reshap (None, 1, 1, 384) 0 ['block4c_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block4c_se_reduce (Conv2D) (None, 1, 1, 24) 9240 ['block4c_se_reshape[0][0]'] \n", - " \n", - " block4c_se_expand (Conv2D) (None, 1, 1, 384) 9600 ['block4c_se_reduce[0][0]'] \n", - " \n", - " block4c_se_excite (Multipl (None, 14, 14, 384) 0 ['block4c_activation[0][0]', \n", - " y) 'block4c_se_expand[0][0]'] \n", - " \n", - " block4c_project_conv (Conv (None, 14, 14, 96) 36864 ['block4c_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block4c_project_bn (BatchN (None, 14, 14, 96) 384 ['block4c_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block4c_drop (Dropout) (None, 14, 14, 96) 0 ['block4c_project_bn[0][0]'] \n", - " \n", - " block4c_add (Add) (None, 14, 14, 96) 0 ['block4c_drop[0][0]', \n", - " 'block4b_add[0][0]'] \n", - " \n", - " 
block5a_expand_conv (Conv2 (None, 14, 14, 576) 55296 ['block4c_add[0][0]'] \n", - " D) \n", - " \n", - " block5a_expand_bn (BatchNo (None, 14, 14, 576) 2304 ['block5a_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block5a_expand_activation (None, 14, 14, 576) 0 ['block5a_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block5a_dwconv2 (Depthwise (None, 14, 14, 576) 5184 ['block5a_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block5a_bn (BatchNormaliza (None, 14, 14, 576) 2304 ['block5a_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block5a_activation (Activa (None, 14, 14, 576) 0 ['block5a_bn[0][0]'] \n", - " tion) \n", - " \n", - " block5a_se_squeeze (Global (None, 576) 0 ['block5a_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block5a_se_reshape (Reshap (None, 1, 1, 576) 0 ['block5a_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block5a_se_reduce (Conv2D) (None, 1, 1, 24) 13848 ['block5a_se_reshape[0][0]'] \n", - " \n", - " block5a_se_expand (Conv2D) (None, 1, 1, 576) 14400 ['block5a_se_reduce[0][0]'] \n", - " \n", - " block5a_se_excite (Multipl (None, 14, 14, 576) 0 ['block5a_activation[0][0]', \n", - " y) 'block5a_se_expand[0][0]'] \n", - " \n", - " block5a_project_conv (Conv (None, 14, 14, 112) 64512 ['block5a_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block5a_project_bn (BatchN (None, 14, 14, 112) 448 ['block5a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block5b_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5a_project_bn[0][0]'] \n", - " D) \n", - " \n", - " block5b_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5b_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block5b_expand_activation (None, 14, 14, 672) 0 ['block5b_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block5b_dwconv2 (Depthwise (None, 14, 14, 672) 6048 ['block5b_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block5b_bn (BatchNormaliza (None, 14, 14, 672) 2688 
['block5b_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block5b_activation (Activa (None, 14, 14, 672) 0 ['block5b_bn[0][0]'] \n", - " tion) \n", - " \n", - " block5b_se_squeeze (Global (None, 672) 0 ['block5b_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block5b_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5b_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block5b_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5b_se_reshape[0][0]'] \n", - " \n", - " block5b_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5b_se_reduce[0][0]'] \n", - " \n", - " block5b_se_excite (Multipl (None, 14, 14, 672) 0 ['block5b_activation[0][0]', \n", - " y) 'block5b_se_expand[0][0]'] \n", - " \n", - " block5b_project_conv (Conv (None, 14, 14, 112) 75264 ['block5b_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block5b_project_bn (BatchN (None, 14, 14, 112) 448 ['block5b_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block5b_drop (Dropout) (None, 14, 14, 112) 0 ['block5b_project_bn[0][0]'] \n", - " \n", - " block5b_add (Add) (None, 14, 14, 112) 0 ['block5b_drop[0][0]', \n", - " 'block5a_project_bn[0][0]'] \n", - " \n", - " block5c_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5b_add[0][0]'] \n", - " D) \n", - " \n", - " block5c_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5c_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block5c_expand_activation (None, 14, 14, 672) 0 ['block5c_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block5c_dwconv2 (Depthwise (None, 14, 14, 672) 6048 ['block5c_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block5c_bn (BatchNormaliza (None, 14, 14, 672) 2688 ['block5c_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block5c_activation (Activa (None, 14, 14, 672) 0 ['block5c_bn[0][0]'] \n", - " tion) \n", - " \n", - " block5c_se_squeeze (Global (None, 672) 0 ['block5c_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block5c_se_reshape (Reshap (None, 1, 1, 672) 0 
['block5c_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block5c_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5c_se_reshape[0][0]'] \n", - " \n", - " block5c_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5c_se_reduce[0][0]'] \n", - " \n", - " block5c_se_excite (Multipl (None, 14, 14, 672) 0 ['block5c_activation[0][0]', \n", - " y) 'block5c_se_expand[0][0]'] \n", - " \n", - " block5c_project_conv (Conv (None, 14, 14, 112) 75264 ['block5c_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block5c_project_bn (BatchN (None, 14, 14, 112) 448 ['block5c_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block5c_drop (Dropout) (None, 14, 14, 112) 0 ['block5c_project_bn[0][0]'] \n", - " \n", - " block5c_add (Add) (None, 14, 14, 112) 0 ['block5c_drop[0][0]', \n", - " 'block5b_add[0][0]'] \n", - " \n", - " block5d_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5c_add[0][0]'] \n", - " D) \n", - " \n", - " block5d_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5d_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block5d_expand_activation (None, 14, 14, 672) 0 ['block5d_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block5d_dwconv2 (Depthwise (None, 14, 14, 672) 6048 ['block5d_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block5d_bn (BatchNormaliza (None, 14, 14, 672) 2688 ['block5d_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block5d_activation (Activa (None, 14, 14, 672) 0 ['block5d_bn[0][0]'] \n", - " tion) \n", - " \n", - " block5d_se_squeeze (Global (None, 672) 0 ['block5d_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block5d_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5d_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block5d_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5d_se_reshape[0][0]'] \n", - " \n", - " block5d_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5d_se_reduce[0][0]'] \n", - " \n", - " block5d_se_excite (Multipl (None, 14, 14, 672) 0 ['block5d_activation[0][0]', 
\n", - " y) 'block5d_se_expand[0][0]'] \n", - " \n", - " block5d_project_conv (Conv (None, 14, 14, 112) 75264 ['block5d_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block5d_project_bn (BatchN (None, 14, 14, 112) 448 ['block5d_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block5d_drop (Dropout) (None, 14, 14, 112) 0 ['block5d_project_bn[0][0]'] \n", - " \n", - " block5d_add (Add) (None, 14, 14, 112) 0 ['block5d_drop[0][0]', \n", - " 'block5c_add[0][0]'] \n", - " \n", - " block5e_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5d_add[0][0]'] \n", - " D) \n", - " \n", - " block5e_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5e_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block5e_expand_activation (None, 14, 14, 672) 0 ['block5e_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block5e_dwconv2 (Depthwise (None, 14, 14, 672) 6048 ['block5e_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block5e_bn (BatchNormaliza (None, 14, 14, 672) 2688 ['block5e_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block5e_activation (Activa (None, 14, 14, 672) 0 ['block5e_bn[0][0]'] \n", - " tion) \n", - " \n", - " block5e_se_squeeze (Global (None, 672) 0 ['block5e_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block5e_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5e_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block5e_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5e_se_reshape[0][0]'] \n", - " \n", - " block5e_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5e_se_reduce[0][0]'] \n", - " \n", - " block5e_se_excite (Multipl (None, 14, 14, 672) 0 ['block5e_activation[0][0]', \n", - " y) 'block5e_se_expand[0][0]'] \n", - " \n", - " block5e_project_conv (Conv (None, 14, 14, 112) 75264 ['block5e_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block5e_project_bn (BatchN (None, 14, 14, 112) 448 ['block5e_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block5e_drop (Dropout) (None, 14, 14, 
112) 0 ['block5e_project_bn[0][0]'] \n", - " \n", - " block5e_add (Add) (None, 14, 14, 112) 0 ['block5e_drop[0][0]', \n", - " 'block5d_add[0][0]'] \n", - " \n", - " block6a_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5e_add[0][0]'] \n", - " D) \n", - " \n", - " block6a_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block6a_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6a_expand_activation (None, 14, 14, 672) 0 ['block6a_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6a_dwconv2 (Depthwise (None, 7, 7, 672) 6048 ['block6a_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6a_bn (BatchNormaliza (None, 7, 7, 672) 2688 ['block6a_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6a_activation (Activa (None, 7, 7, 672) 0 ['block6a_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6a_se_squeeze (Global (None, 672) 0 ['block6a_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6a_se_reshape (Reshap (None, 1, 1, 672) 0 ['block6a_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6a_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block6a_se_reshape[0][0]'] \n", - " \n", - " block6a_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block6a_se_reduce[0][0]'] \n", - " \n", - " block6a_se_excite (Multipl (None, 7, 7, 672) 0 ['block6a_activation[0][0]', \n", - " y) 'block6a_se_expand[0][0]'] \n", - " \n", - " block6a_project_conv (Conv (None, 7, 7, 192) 129024 ['block6a_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6a_project_bn (BatchN (None, 7, 7, 192) 768 ['block6a_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6b_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6a_project_bn[0][0]'] \n", - " D) \n", - " \n", - " block6b_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6b_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6b_expand_activation (None, 7, 7, 1152) 0 ['block6b_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6b_dwconv2 (Depthwise (None, 
7, 7, 1152) 10368 ['block6b_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6b_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6b_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6b_activation (Activa (None, 7, 7, 1152) 0 ['block6b_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6b_se_squeeze (Global (None, 1152) 0 ['block6b_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6b_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6b_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6b_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6b_se_reshape[0][0]'] \n", - " \n", - " block6b_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6b_se_reduce[0][0]'] \n", - " \n", - " block6b_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6b_activation[0][0]', \n", - " y) 'block6b_se_expand[0][0]'] \n", - " \n", - " block6b_project_conv (Conv (None, 7, 7, 192) 221184 ['block6b_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6b_project_bn (BatchN (None, 7, 7, 192) 768 ['block6b_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6b_drop (Dropout) (None, 7, 7, 192) 0 ['block6b_project_bn[0][0]'] \n", - " \n", - " block6b_add (Add) (None, 7, 7, 192) 0 ['block6b_drop[0][0]', \n", - " 'block6a_project_bn[0][0]'] \n", - " \n", - " block6c_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6b_add[0][0]'] \n", - " D) \n", - " \n", - " block6c_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6c_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6c_expand_activation (None, 7, 7, 1152) 0 ['block6c_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6c_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 ['block6c_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6c_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6c_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6c_activation (Activa (None, 7, 7, 1152) 0 ['block6c_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6c_se_squeeze (Global (None, 
1152) 0 ['block6c_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6c_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6c_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6c_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6c_se_reshape[0][0]'] \n", - " \n", - " block6c_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6c_se_reduce[0][0]'] \n", - " \n", - " block6c_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6c_activation[0][0]', \n", - " y) 'block6c_se_expand[0][0]'] \n", - " \n", - " block6c_project_conv (Conv (None, 7, 7, 192) 221184 ['block6c_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6c_project_bn (BatchN (None, 7, 7, 192) 768 ['block6c_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6c_drop (Dropout) (None, 7, 7, 192) 0 ['block6c_project_bn[0][0]'] \n", - " \n", - " block6c_add (Add) (None, 7, 7, 192) 0 ['block6c_drop[0][0]', \n", - " 'block6b_add[0][0]'] \n", - " \n", - " block6d_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6c_add[0][0]'] \n", - " D) \n", - " \n", - " block6d_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6d_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6d_expand_activation (None, 7, 7, 1152) 0 ['block6d_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6d_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 ['block6d_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6d_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6d_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6d_activation (Activa (None, 7, 7, 1152) 0 ['block6d_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6d_se_squeeze (Global (None, 1152) 0 ['block6d_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6d_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6d_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6d_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6d_se_reshape[0][0]'] \n", - " \n", - " block6d_se_expand (Conv2D) (None, 1, 1, 1152) 56448 
['block6d_se_reduce[0][0]'] \n", - " \n", - " block6d_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6d_activation[0][0]', \n", - " y) 'block6d_se_expand[0][0]'] \n", - " \n", - " block6d_project_conv (Conv (None, 7, 7, 192) 221184 ['block6d_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6d_project_bn (BatchN (None, 7, 7, 192) 768 ['block6d_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6d_drop (Dropout) (None, 7, 7, 192) 0 ['block6d_project_bn[0][0]'] \n", - " \n", - " block6d_add (Add) (None, 7, 7, 192) 0 ['block6d_drop[0][0]', \n", - " 'block6c_add[0][0]'] \n", - " \n", - " block6e_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6d_add[0][0]'] \n", - " D) \n", - " \n", - " block6e_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6e_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6e_expand_activation (None, 7, 7, 1152) 0 ['block6e_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6e_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 ['block6e_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6e_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6e_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6e_activation (Activa (None, 7, 7, 1152) 0 ['block6e_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6e_se_squeeze (Global (None, 1152) 0 ['block6e_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6e_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6e_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6e_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6e_se_reshape[0][0]'] \n", - " \n", - " block6e_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6e_se_reduce[0][0]'] \n", - " \n", - " block6e_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6e_activation[0][0]', \n", - " y) 'block6e_se_expand[0][0]'] \n", - " \n", - " block6e_project_conv (Conv (None, 7, 7, 192) 221184 ['block6e_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6e_project_bn (BatchN (None, 7, 7, 192) 768 
['block6e_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6e_drop (Dropout) (None, 7, 7, 192) 0 ['block6e_project_bn[0][0]'] \n", - " \n", - " block6e_add (Add) (None, 7, 7, 192) 0 ['block6e_drop[0][0]', \n", - " 'block6d_add[0][0]'] \n", - " \n", - " block6f_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6e_add[0][0]'] \n", - " D) \n", - " \n", - " block6f_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6f_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6f_expand_activation (None, 7, 7, 1152) 0 ['block6f_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6f_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 ['block6f_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6f_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6f_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6f_activation (Activa (None, 7, 7, 1152) 0 ['block6f_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6f_se_squeeze (Global (None, 1152) 0 ['block6f_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6f_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6f_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6f_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6f_se_reshape[0][0]'] \n", - " \n", - " block6f_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6f_se_reduce[0][0]'] \n", - " \n", - " block6f_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6f_activation[0][0]', \n", - " y) 'block6f_se_expand[0][0]'] \n", - " \n", - " block6f_project_conv (Conv (None, 7, 7, 192) 221184 ['block6f_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6f_project_bn (BatchN (None, 7, 7, 192) 768 ['block6f_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6f_drop (Dropout) (None, 7, 7, 192) 0 ['block6f_project_bn[0][0]'] \n", - " \n", - " block6f_add (Add) (None, 7, 7, 192) 0 ['block6f_drop[0][0]', \n", - " 'block6e_add[0][0]'] \n", - " \n", - " block6g_expand_conv (Conv2 (None, 7, 7, 1152) 221184 
['block6f_add[0][0]'] \n", - " D) \n", - " \n", - " block6g_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6g_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6g_expand_activation (None, 7, 7, 1152) 0 ['block6g_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6g_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 ['block6g_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6g_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6g_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6g_activation (Activa (None, 7, 7, 1152) 0 ['block6g_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6g_se_squeeze (Global (None, 1152) 0 ['block6g_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6g_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6g_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6g_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6g_se_reshape[0][0]'] \n", - " \n", - " block6g_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6g_se_reduce[0][0]'] \n", - " \n", - " block6g_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6g_activation[0][0]', \n", - " y) 'block6g_se_expand[0][0]'] \n", - " \n", - " block6g_project_conv (Conv (None, 7, 7, 192) 221184 ['block6g_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6g_project_bn (BatchN (None, 7, 7, 192) 768 ['block6g_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6g_drop (Dropout) (None, 7, 7, 192) 0 ['block6g_project_bn[0][0]'] \n", - " \n", - " block6g_add (Add) (None, 7, 7, 192) 0 ['block6g_drop[0][0]', \n", - " 'block6f_add[0][0]'] \n", - " \n", - " block6h_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6g_add[0][0]'] \n", - " D) \n", - " \n", - " block6h_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6h_expand_conv[0][0]'] \n", - " rmalization) \n", - " \n", - " block6h_expand_activation (None, 7, 7, 1152) 0 ['block6h_expand_bn[0][0]'] \n", - " (Activation) \n", - " \n", - " block6h_dwconv2 (Depthwise (None, 7, 7, 1152) 10368 
['block6h_expand_activation[0]\n", - " Conv2D) [0]'] \n", - " \n", - " block6h_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6h_dwconv2[0][0]'] \n", - " tion) \n", - " \n", - " block6h_activation (Activa (None, 7, 7, 1152) 0 ['block6h_bn[0][0]'] \n", - " tion) \n", - " \n", - " block6h_se_squeeze (Global (None, 1152) 0 ['block6h_activation[0][0]'] \n", - " AveragePooling2D) \n", - " \n", - " block6h_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6h_se_squeeze[0][0]'] \n", - " e) \n", - " \n", - " block6h_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6h_se_reshape[0][0]'] \n", - " \n", - " block6h_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6h_se_reduce[0][0]'] \n", - " \n", - " block6h_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6h_activation[0][0]', \n", - " y) 'block6h_se_expand[0][0]'] \n", - " \n", - " block6h_project_conv (Conv (None, 7, 7, 192) 221184 ['block6h_se_excite[0][0]'] \n", - " 2D) \n", - " \n", - " block6h_project_bn (BatchN (None, 7, 7, 192) 768 ['block6h_project_conv[0][0]']\n", - " ormalization) \n", - " \n", - " block6h_drop (Dropout) (None, 7, 7, 192) 0 ['block6h_project_bn[0][0]'] \n", - " \n", - " block6h_add (Add) (None, 7, 7, 192) 0 ['block6h_drop[0][0]', \n", - " 'block6g_add[0][0]'] \n", - " \n", - " top_conv (Conv2D) (None, 7, 7, 1280) 245760 ['block6h_add[0][0]'] \n", - " \n", - " top_bn (BatchNormalization (None, 7, 7, 1280) 5120 ['top_conv[0][0]'] \n", - " ) \n", - " \n", - " top_activation (Activation (None, 7, 7, 1280) 0 ['top_bn[0][0]'] \n", - " ) \n", - " \n", - " avg_pool (GlobalAveragePoo (None, 1280) 0 ['top_activation[0][0]'] \n", - " ling2D) \n", - " \n", - " top_dropout (Dropout) (None, 1280) 0 ['avg_pool[0][0]'] \n", - " \n", - " predictions (Dense) (None, 1000) 1281000 ['top_dropout[0][0]'] \n", - " \n", - "==================================================================================================\n", - "Total params: 7200312 (27.47 MB)\n", - "Trainable params: 7139704 (27.24 MB)\n", - 
"Non-trainable params: 60608 (236.75 KB)\n", - "__________________________________________________________________________________________________\n" - ] - } - ], + "outputs": [], "source": [ - "base_model.summary()" + "# Note: Uncomment to see full output\n", + "# base_model.summary()" ] }, { @@ -5529,7 +4751,44 @@ "id": "6nBMeHrVa1-y" }, "source": [ - "TK image - what our base model looks like\n", + "Truncated output of above: \n", + "\n", + "```\n", + "Model: \"efficientnetv2-b0\"\n", + "__________________________________________________________________________________________________\n", + " Layer (type) Output Shape Param # Connected to \n", + "==================================================================================================\n", + " input_1 (InputLayer) [(None, 224, 224, 3)] 0 [] \n", + " \n", + " rescaling (Rescaling) (None, 224, 224, 3) 0 ['input_1[0][0]'] \n", + " \n", + " normalization (Normalizati (None, 224, 224, 3) 0 ['rescaling[0][0]'] \n", + " on) \n", + " \n", + " stem_conv (Conv2D) (None, 112, 112, 32) 864 ['normalization[0][0]'] \n", + " \n", + " stem_bn (BatchNormalizatio (None, 112, 112, 32) 128 ['stem_conv[0][0]'] \n", + " n) \n", + " \n", + " stem_activation (Activatio (None, 112, 112, 32) 0 ['stem_bn[0][0]'] \n", + " n) \n", + " \n", + " >>> [Lots more layers here, removed 90% of them for brevity] <<< \n", + " \n", + " \n", + " avg_pool (GlobalAveragePoo (None, 1280) 0 ['top_activation[0][0]'] \n", + " ling2D) \n", + " \n", + " top_dropout (Dropout) (None, 1280) 0 ['avg_pool[0][0]'] \n", + " \n", + " predictions (Dense) (None, 1000) 1281000 ['top_dropout[0][0]'] \n", + " \n", + "==================================================================================================\n", + "Total params: 7200312 (27.47 MB)\n", + "Trainable params: 7139704 (27.24 MB)\n", + "Non-trainable params: 60608 (236.75 KB)\n", + "__________________________________________________________________________________________________\n", + "```\n", 
"\n", "Woah! Look at all those layers... this is what the \"deep\" in deep learning means! A *deep* number of layers.\n", "\n", @@ -5590,7 +4849,7 @@ "id": "uE1wOxjCMVAg" }, "source": [ - "### TK - Model input and output shapes\n", + "### Model input and output shapes\n", "\n", "One of the most important practical steps in using a deep learning model is input and output shapes.\n", "\n", @@ -5599,6 +4858,12 @@ "* What is the shape of my input data?\n", "* What is the ideal shape of my output data?\n", "\n", + "We ask about shapes because in all deep learning models input and output data comes in the form of tensors.\n", + "\n", + "This goes for text, audio, images and more.\n", + "\n", + "The raw data gets converted to a numerical representation first before being passed to a model. \n", + "\n", "In our case, our input data has the shape of `[(32, 224, 224, 3)]` or `[(batch_size, height, width, colour_channels)]`.\n", "\n", "And our ideal output shape will be `[(32, 120)]` or `[(batch_size, number_of_dog_classes)`.\n", @@ -5763,10 +5028,12 @@ "\n", "You can then used those extracted features and further tailor them to your own use case.\n", "\n", - "TK - image of customizing top layer\n", + "\"\n",\n", "\n", - "Let's create an instance of `base_model` without a top layer.\n", - "\n" + "*Example of how we can take a pretrained model and customize it to our own use case. This kind of transfer learning workflow is often referred to as a feature extracting workflow. Note: In this image the EfficientNetB0 architecture is being demonstrated, however we're going to be using the EfficientNetV2B0 architecture which is slightly different. 
I've used the older architecture image from the research paper as a newer one wasn't available.*\n", + "\n", + "Let's create an instance of `base_model` without a top layer.\n" ] }, { @@ -5870,7 +5137,7 @@ "id": "zC2t2P_P0aSp" }, "source": [ - "### TK - Model parameters\n", + "### Model parameters\n", "\n", "In traditional programming, you write a list of rules for inputs to go in, get manipulated in some predefined way and then outputs come out.\n", "\n", @@ -6119,7 +5386,7 @@ "id": "f-1WXphXBfok" }, "source": [ - "### TK - Passing data through our model\n", + "### Passing data through our model\n", "\n", "We've spoken a couple of times how our `base_model` is a \"feature extractor\" or \"pattern extractor\".\n", "\n", @@ -6135,9 +5402,12 @@ "\n", "Take a large input (e.g. an image tensor of shape `[224, 224, 3]`) and compress it into a smaller output (e.g. a [**feature vector**](https://en.wikipedia.org/wiki/Feature_(machine_learning)#Feature_vectors) of shape `[1280]`) that captures a useful representation of the input.\n", "\n", + "\"A\n", + "\n", + "*Example of how a model can take an input piece of data and compress its representation into a feature vector with much lower dimensionality than the original data.*\n", + "\n", "> **Note:** A feature vector is also referred to as an [**embedding**](https://developers.google.com/machine-learning/crash-course/embeddings/video-lecture), a compressed representation of a data sample that makes it useful. The concept of embeddings is not limited to images either, the concept of embeddings stretches across all data types (text, images, video, audio + more).\n", "\n", - "TK image - compression of input image\n", "\n", "We can see this in action by passing a single image through our `base_model`." 
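As a standalone sketch of that image-to-feature-vector compression (not the notebook's exact cell), the idea looks like this. Note the hedges: `weights=None` is used here purely so the sketch runs without downloading the pretrained ImageNet weights the notebook's real `base_model` uses, and a random tensor stands in for a real dog image.

```python
import tensorflow as tf

# Sketch only: the notebook's base_model uses pretrained weights ("imagenet"),
# weights=None is used here so this example runs without a download.
base_model = tf.keras.applications.efficientnet_v2.EfficientNetV2B0(
    include_top=False,  # remove the 1000-class ImageNet prediction head
    pooling="avg",      # global average pool final feature maps into one vector
    weights=None,
)

# Stand-in for a single preprocessed dog image: (batch, height, width, channels).
random_image = tf.random.uniform(shape=(1, 224, 224, 3))

# Forward pass: [1, 224, 224, 3] gets compressed to a [1, 1280] feature vector.
feature_vector = base_model(random_image)
print(feature_vector.shape)  # (1, 1280)
```

The `pooling="avg"` argument is what turns the final `7x7x1280` feature maps into a single `1280`-dimensional feature vector (embedding) per image.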
] @@ -6625,7 +5895,7 @@ "id": "wMlBRuOShYMs" }, "source": [ - "### TK - Going from image to feature vector (practice)\n", + "### Going from image to feature vector (practice)\n", "\n", "We've covered a fair bit in the past few sections.\n", "\n", @@ -6824,7 +6094,7 @@ "id": "wbIVNQp4Dg-t" }, "source": [ - "### TK - Creating a custom model for our dog vision problem\n", + "### Creating a custom model for our dog vision problem\n", "\n", "The main steps when creating any kind of deep learning model from scratch are:\n", "\n", @@ -6856,7 +6126,7 @@ "id": "XMRnPYuVdVEK" }, "source": [ - "### TK - Creating a model with the Sequential API\n", + "### Creating a model with the Sequential API\n", "\n", "The Sequential API is the most straightforward way to create a model.\n", "\n", @@ -7250,7 +6520,7 @@ "id": "tObIiaKAdbH2" }, "source": [ - "### TK - Creating a model with the Functional API\n", + "### Creating a model with the Functional API\n", "\n", "As mentioned before, the [Keras Functional API](https://www.tensorflow.org/guide/keras/functional_api) is a way/design pattern for creating more complex models.\n", "\n", @@ -7399,7 +6669,7 @@ "id": "SL1k0m-_WIyc" }, "source": [ - "### TK - Functionizing model creation\n", + "### Functionizing model creation\n", "\n", "We've created two different kinds of models so far.\n", "\n", @@ -7585,7 +6855,7 @@ "id": "QVPLT6dsvXvN" }, "source": [ - "## TK - 7. Model 0 - Train a model on 10% of the training data\n", + "## 7. Model 0 - Train a model on 10% of the training data\n", "\n", "We've seen our model make a couple of predictions on our data.\n", "\n", @@ -7675,7 +6945,7 @@ "id": "CdQ_JkWlbdk4" }, "source": [ - "### TK - Compiling a model\n", + "### Compiling a model\n", "\n", "After we've created a model, the next step is to compile it.\n", "\n", @@ -7688,8 +6958,7 @@ "2. **The loss function** - this measures how wrong the model is (e.g. 
how far off are its predictions from the truth, an ideal loss value is 0, meaning the model is perfectly predicting the data).\n", "3. **The metric(s)** - this is a human-readable value that shows how your model is performing, for example, accuracy is often used as an evaluation metric.\n", "\n", - "These three settings work together to help improve a model.\n", - "\n" + "These three settings work together to help improve a model.\n" ] }, { @@ -7698,9 +6967,7 @@ "id": "Q4lKvBBcWQuW" }, "source": [ - "\n", - "\n", - "### TK - Which optimizer should I use?\n", + "### Which optimizer should I use?\n", "\n", "An optimizer tells a model how to improve its internal parameters (weights) to hopefully improve a loss value.\n", "\n", @@ -7718,11 +6985,13 @@ "\n", "For now, the main takeaway is that neural networks learn in the following fashion:\n", "\n", - "TK - graphic for learning paradigm\n", + "Start with random patterns/weights -> Look at data (forward pass) -> Try to predict data (with current weights) -> Measure performance of predictions (loss function, backpropagation calculates gradients of loss with respect to weights) -> Update patterns/weights (optimizer, gradient descent adjusts weights in the opposite direction of the gradients to minimize loss) -> Look at data (forward pass) -> Try to predict data (with updated weights) -> Measure performance (loss function) -> Update patterns/weights (optimizer) -> Repeat all of the above X times. \n", + "\n", + "\"A\n", "\n", - "Start with random patterns/weights -> look at data -> try to predict data -> measure performance of predictions (loss function) -> update patterns/weights (optimizer) -> try to predict data -> measure performance (loss function) -> update patterns/weights (optimizer) -> ...\n", + "*Example of how a neural network learns (in brief). Note the cyclical nature of the learning. 
You can think of it as a big game of guess and check, where the guess (hopefully) gets better over time.*\n", "\n", - "I'll leave the intricasies of gradient descent and backpropagation to your own extra-curricula research.\n", + "I'll leave the intricacies of gradient descent and backpropagation to your own extracurricular research.\n", "\n", "We're going to focus on using the tools TensorFlow has to offer to implement this process.\n", "\n", @@ -7795,9 +7064,7 @@ "id": "Zn8ZITXmuDuf" }, "source": [ - "\n", - "\n", - "### TK - Which loss function should I use?\n", + "### Which loss function should I use?\n", "\n", "A loss function measures how *wrong* your model's predictions are.\n", "\n", @@ -7912,7 +7179,7 @@ "id": "r4WCY1lkuOzt" }, "source": [ - "### TK - Which mertics should I use?\n", + "### Which metrics should I use?\n", "\n", "The evaluation metric is a human-readable value which is used to see how well your model is performing.\n", "\n", @@ -7962,9 +7229,7 @@ "id": "pdUGCUJ6ajrB" }, "source": [ - "\n", - "\n", - "### TK - Learn more on how a model learns\n", + "### Learn more on how a model learns\n", "\n", "We've briefly touched on optimizers, loss functions, gradient descent and backpropagation, the backbone of neural network learning. However, for a more in-depth look at each of these, I'd check out the following:\n", "\n", @@ -7978,7 +7243,7 @@ "id": "bIBKGOwLamXG" }, "source": [ - "### TK - Putting it all together and compiling our model\n", + "### Putting it all together and compiling our model\n", "\n", "Phew!\n", "\n", @@ -8018,7 +7283,7 @@ "id": "PnyeJsqid8F9" }, "source": [ - "### TK - Fitting a model on the data\n", + "### Fitting a model on the data\n", "\n", "Model created and compiled!\n", "\n", @@ -8127,7 +7392,7 @@ "id": "YdDhrasey9QX" }, "source": [ - "## TK - 8. Putting it all together: create, compile, fit\n", + "## 8. 
Putting it all together: create, compile, fit\n", "\n", "Let's practice what we've done so far to train our first neural network.\n", "\n", @@ -8205,7 +7470,7 @@ "id": "QGc2yDG1vr3f" }, "source": [ - "### TK - Evaluate Model 0 on the test data\n", + "### Evaluate Model 0 on the test data\n", "\n", "Alright, the next step in our journey is to evaluate our trained model.\n", "\n", @@ -8512,7 +7777,7 @@ "id": "Ycprd6gfi7_I" }, "source": [ - "## TK - 9. Model 1 - Train a model on 100% of the training data\n", + "## 9. Model 1 - Train a model on 100% of the training data\n", "\n", "Time to step it up a notch!\n", "\n", @@ -8608,7 +7873,7 @@ "id": "Gt79lK6vv2Nt" }, "source": [ - "### TK - Evaluate Model 1 on the test data\n", + "### Evaluate Model 1 on the test data\n", "\n", "How about we evaluate our `model_1`?\n", "\n", @@ -8707,7 +7972,7 @@ "id": "4_6cy_dQTxeY" }, "source": [ - "## TK - 10. Make and evaluate predictions of the best model\n", + "## 10. Make and evaluate predictions of the best model\n", "\n", "Now we've trained a model, it's time to make predictions with it!\n", "\n", @@ -9027,7 +8292,7 @@ "id": "nJmQ9KFhzcda" }, "source": [ - "### TK - Visualizing predictions from our best trained model\n", + "### Visualizing predictions from our best trained model\n", "\n", "We could sit there looking at single image predictions of dogs all day.\n", "\n", @@ -9110,7 +8375,7 @@ "id": "beMqsagUVUe8" }, "source": [ - "### TK - Finding the accruacy per class\n", + "### Finding the accuracy per class\n", "\n", "Our model's overall accuracy is ~90%.\n", "\n", @@ -9124,6 +8389,8 @@ "\n", "You'll see on the original [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) website that the authors reported the accuracy per class of each of the dog breeds. 
Their best performing class, `african_hunting_dog` achieved close to 60% accuracy.\n", "\n", "TK - image of results from Stanford Dogs paper/dataset\n", "\n", "How about we try and replicate the same plot?\n",
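Before replicating the plot, the per-class accuracy calculation itself can be sketched with plain NumPy. This is a hedged illustration: `class_names`, `y_true` and `y_pred` are made-up stand-ins, not the notebook's actual variables (which would come from the test labels and the argmax of the model's prediction probabilities).

```python
import numpy as np

# Hypothetical example data: true and predicted labels as integer class IDs.
class_names = ["beagle", "pug", "whippet"]
y_true = np.array([0, 0, 1, 1, 1, 2, 2])  # ground truth classes
y_pred = np.array([0, 1, 1, 1, 0, 2, 2])  # model's predicted classes

# Accuracy per class: of all samples whose true label is class c,
# what fraction did the model predict correctly?
for class_id, name in enumerate(class_names):
    mask = y_true == class_id                      # samples belonging to this class
    acc = (y_pred[mask] == y_true[mask]).mean()    # fraction predicted correctly
    print(f"{name}: {acc:.2f}")
```

The same boolean-mask pattern scales straight up to 120 dog breeds, and the resulting per-class values are what would be plotted to mirror the Stanford Dogs results.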