AI6127: Deep Neural Networks For Natural Language Processing

Course Objectives

Natural language processing (NLP) is one of the most important fields in artificial intelligence (AI). It has become crucial in the information age because most information exists as unstructured text. NLP technologies are applied everywhere people communicate in language: language translation, web search, customer support, emails, forums, advertising, radiology reports, to name a few.

There are several core NLP tasks and machine learning models behind NLP applications. Deep learning, a sub-field of machine learning, has brought a paradigm shift from traditional task-specific feature engineering to end-to-end systems, and has achieved high performance across many different NLP tasks and downstream applications. Tech companies like Google, Baidu, Alibaba, Apple, Amazon, Facebook, Tencent, and Microsoft are actively working on deep learning methods to improve their products. For example, Google recently replaced its traditional statistical machine translation and speech-recognition systems with systems based on deep learning methods.

Optional Textbooks

  • Deep Learning by Goodfellow, Bengio, and Courville free online
  • Machine Learning: A Probabilistic Perspective by Kevin Murphy online
  • Natural Language Processing by Jacob Eisenstein free online
  • Speech and Language Processing by Dan Jurafsky and James H. Martin (3rd ed. draft)

Intended Learning Outcomes

In this course, students will learn state-of-the-art deep learning methods for NLP. Through lectures and practical assignments, students will learn the necessary tricks for making their models work on practical problems. They will learn to implement, and possibly invent, their own deep learning models using available deep learning libraries such as PyTorch.

Our Approach

  • Thorough and Detailed: How to write from scratch, debug, and train deep neural models

  • State of the art: Most lecture materials are drawn from research of the past 1-5 years.

  • Practical: Focus on practical techniques for training models, including training on GPUs.

  • Fun: Cover exciting new advancements in NLP (e.g., Transformer, ChatGPT).

Assessment Approach

Weekly Workload

  • Lecture and practical problems implemented in PyTorch.
  • There will be NO office hours.

Assignments (individually graded)

  • Two (2) assignments will contribute to 2 * 25% = 50% of the total assessment.
  • Students will be graded individually on the assignments. They may discuss the homework assignments with each other, but they are required to submit individual write-ups and coding exercises.

Final Project (Group work but individually graded)

  • There will be a final project contributing to the remaining 50% of the total coursework assessment.
    • 3–6 students per group
    • Presentation: 20%, report: 30%
  • The project is group work, but students will be graded individually, depending on their contribution to the group. The final project presentation will assess each student's understanding of the project.

Course Prerequisites

  • Proficiency in Python (using NumPy and PyTorch). There is a lecture for those who are not familiar with Python.
  • Linear Algebra, basic Probability and Statistics
  • Machine Learning basics

Teaching

Instructor

Luu Anh Tuan

Teaching Assistants

Nguyen Tran Cong Duy

[email protected]

Schedule & Course Content

Week 1: Introduction

Lecture Slide

Lecture Content

  • What is Natural Language Processing?
  • Why is language understanding difficult?
  • What is Deep Learning?
  • Deep learning vs. other machine learning methods?
  • Why deep learning for NLP?
  • Applications of deep learning to NLP
  • Knowing the target group (background, field of study, programming experience)
  • Expectations for the course

Python & PyTorch Basics
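
A minimal taste of the basics covered here: creating tensors and letting autograd compute gradients. The values are illustrative.

```python
import torch

# Create tensors and track gradients through a tiny computation.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)

y = (w * x).sum()   # scalar: dot product of w and x
y.backward()        # autograd fills in .grad

print(x.grad)  # dy/dx = w -> tensor([ 0.5000, -1.0000,  2.0000])
print(w.grad)  # dy/dw = x -> tensor([1., 2., 3.])
```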

Week 2: Machine Learning Basics

Lecture Slide

Lecture Content

  • What is Machine Learning?
  • Supervised vs. unsupervised learning
  • Linear Regression
  • Logistic Regression
  • Multi-class classification
  • Parameter estimation (MLE & MAP)
  • Gradient-based optimization & SGD

Practical exercise with PyTorch
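
A minimal sketch of the kind of exercise this week involves: logistic regression trained with SGD in PyTorch. The synthetic data and hyperparameters are illustrative, not the actual course exercise.

```python
import torch
import torch.nn as nn

# Toy binary classification data (illustrative only).
X = torch.randn(100, 4)                      # 100 examples, 4 features
y = (X.sum(dim=1) > 0).float().unsqueeze(1)  # labels in {0, 1}

model = nn.Linear(4, 1)           # logits = Wx + b
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid + cross-entropy (MLE view)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)   # negative log-likelihood of the batch
    loss.backward()               # gradients via autograd
    opt.step()                    # SGD parameter update
```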

Week 3: Neural Networks & Optimization Basics

Lecture Slide

Recording of Lecture 3

Lecture Content

  • From Logistic Regression to Feed-forward NN
    • Activation functions
  • SGD with Backpropagation
  • Adaptive SGD (AdaGrad, Adam, RMSProp)
  • Regularization (weight decay, dropout, batch normalization, gradient clipping)

Practical exercise with PyTorch
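
A minimal sketch tying the week's topics together: a feed-forward network with a nonlinearity, dropout, weight decay, gradient clipping, and an adaptive optimizer (Adam). Sizes and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

# Two-layer feed-forward network with dropout; sizes are illustrative.
model = nn.Sequential(
    nn.Linear(4, 32),
    nn.ReLU(),          # activation function
    nn.Dropout(p=0.5),  # regularization
    nn.Linear(32, 2),   # two-class logits
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(64, 4)          # toy batch
y = torch.randint(0, 2, (64,))  # toy labels

opt.zero_grad()
loss_fn(model(X), y).backward()  # backpropagation
nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
opt.step()                       # adaptive update
```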

Week 4: Word Vectors

Lecture Slide

Lecture Content

  • Word meaning
  • Denotational semantics
  • Distributed representation of words
  • Word2Vec models (Skip-gram, CBOW)
  • Negative sampling
  • FastText
  • Evaluating word vectors
    • Intrinsic evaluation
    • Extrinsic evaluation
  • Cross-lingual word embeddings

Practical exercise with PyTorch

Skip-gram training
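
A minimal sketch of the skip-gram objective with negative sampling; the vocabulary size, dimensions, and random batch are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

V, D = 10_000, 100            # vocabulary size, embedding dimension
in_emb = nn.Embedding(V, D)   # center-word ("input") vectors
out_emb = nn.Embedding(V, D)  # context-word ("output") vectors

def sgns_loss(center, context, negatives):
    """center: (B,), context: (B,), negatives: (B, K) word indices."""
    v = in_emb(center)                       # (B, D)
    u_pos = out_emb(context)                 # (B, D)
    u_neg = out_emb(negatives)               # (B, K, D)
    pos = F.logsigmoid((v * u_pos).sum(-1))  # log sigma(u . v)
    neg = F.logsigmoid(-(u_neg @ v.unsqueeze(-1)).squeeze(-1)).sum(-1)
    return -(pos + neg).mean()               # maximize both terms

# One toy step: batch of 8 (center, context) pairs, 5 negatives each.
B, K = 8, 5
loss = sgns_loss(torch.randint(0, V, (B,)),
                 torch.randint(0, V, (B,)),
                 torch.randint(0, V, (B, K)))
loss.backward()
```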

Suggested Readings

Week 5: Window-based Approach and Convolutional Nets

Lecture Slide

Final Project instruction

Lecture Content

  • Classification tasks in NLP
  • Window-based Approach for language modeling
  • Window-based Approach for NER, POS tagging, and Chunking (see the sketch below)
  • Convolutional Neural Net for NLP
  • Max-margin Training
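
A minimal sketch of the window-based tagging approach referenced above: the embeddings of a fixed window are concatenated and fed to a small feed-forward classifier that scores the center word's tag. All sizes are illustrative.

```python
import torch
import torch.nn as nn

V, D, win, n_tags = 10_000, 50, 5, 9   # vocab, emb dim, window size, tag set

emb = nn.Embedding(V, D)
mlp = nn.Sequential(
    nn.Linear(win * D, 128),  # concatenated window -> hidden layer
    nn.Tanh(),
    nn.Linear(128, n_tags),   # scores for the center word's tag
)

window = torch.randint(0, V, (32, win))  # batch of 32 windows of word ids
scores = mlp(emb(window).view(32, -1))   # (32, n_tags)
```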

Suggested Readings

Week 6: Recurrent Neural Nets

Lecture Slide

Lecture Content

  • Language modeling with RNNs
  • Backpropagation through time
  • Text generation with RNN LM
  • Sequence labeling with RNNs
  • Sequence classification with RNNs
  • Issues with Vanilla RNNs
  • Gated Recurrent Units (GRUs) and LSTMs
  • Bidirectional RNNs
  • Multi-layer RNNs

Practical exercise with PyTorch (CNN and RNN for NER)
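
A minimal sketch of an RNN sequence labeler in the spirit of this exercise: a bidirectional LSTM over word embeddings with a per-token tag projection. Sizes are illustrative.

```python
import torch
import torch.nn as nn

V, D, H, n_tags = 10_000, 100, 128, 9   # vocab, emb dim, hidden dim, tag set

emb = nn.Embedding(V, D)
lstm = nn.LSTM(D, H, batch_first=True, bidirectional=True)
proj = nn.Linear(2 * H, n_tags)  # forward + backward states -> tag scores

tokens = torch.randint(0, V, (16, 40))  # 16 sentences of length 40
h, _ = lstm(emb(tokens))                # (16, 40, 2H)
tag_scores = proj(h)                    # per-token tag scores: (16, 40, n_tags)
```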

Suggested Readings

Week 7: Machine Translation and Seq2Seq Models

Lecture Slide

Assignment 1 is out here. Deadline: 25 March 2024, 11:59pm.

Lecture Content

  • Machine translation
    • Early days (1950s)
    • Statistical machine translation or SMT (1990-2010)
    • Alignment in SMT
    • Decoding in SMT
    • Neural machine translation or NMT (2014-present)
  • Encoder-decoder model for NMT
  • Advantages and disadvantages of NMT
  • Greedy vs. beam-search decoding (see the sketch below)
  • MT evaluation
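
A hedged sketch of greedy decoding for an encoder-decoder NMT model; `model.encode` and `model.decode` are hypothetical stand-ins for whatever interface the actual model exposes.

```python
import torch

def greedy_decode(model, src, bos_id, eos_id, max_len=50):
    """Emit the argmax token at each step until EOS or max_len."""
    memory = model.encode(src)  # hypothetical encoder call
    ys = [bos_id]
    for _ in range(max_len):
        logits = model.decode(memory, torch.tensor([ys]))  # (1, len, V)
        next_id = logits[0, -1].argmax().item()  # greedy: most likely token
        ys.append(next_id)
        if next_id == eos_id:
            break
    return ys

# Beam search instead keeps the k highest-scoring prefixes at every step,
# trading extra computation for higher-probability translations.
```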

Suggested Readings

Week 8: Seq2Seq Models, Attention, Subwords

Lecture Slide

Lecture Content

  • Information bottleneck issue with vanilla Seq2Seq
  • Attention to the rescue
  • Details of attention mechanism
  • Sub-word models
  • Byte-pair encoding
  • Hybrid models

Practical exercise with PyTorch
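
A minimal sketch of the attention mechanism itself: the decoder state scores every encoder state, and a softmax-weighted sum of the encoder states forms the context vector. Shapes are illustrative.

```python
import torch
import torch.nn.functional as F

enc = torch.randn(1, 12, 256)  # encoder states: (batch, src_len, d)
dec = torch.randn(1, 256)      # current decoder hidden state: (batch, d)

scores = torch.bmm(enc, dec.unsqueeze(-1)).squeeze(-1)   # (1, 12) alignment scores
alpha = F.softmax(scores, dim=-1)                        # attention weights
context = torch.bmm(alpha.unsqueeze(1), enc).squeeze(1)  # (1, 256) context vector
```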

Suggested Readings

Week 9: Seq2Seq Variants and Transformer

Lecture Slide

Lecture Content

  • Seq2Seq Variants (Pointer Nets, Pointer-Generator Nets)
    • Machine Translation
    • Summarization
  • Transformer architecture
    • Self-attention
    • Positional encoding
    • Multi-head attention

The Annotated Transformer
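
A minimal sketch of single-head scaled dot-product self-attention, the core operation of the Transformer; the projections and sizes are illustrative.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 10, 64)  # (batch, seq_len, d_model)
Wq, Wk, Wv = nn.Linear(64, 64), nn.Linear(64, 64), nn.Linear(64, 64)

Q, K, V = Wq(x), Wk(x), Wv(x)                     # project the same sequence
scores = Q @ K.transpose(-2, -1) / math.sqrt(64)  # (1, 10, 10), scaled
attn = F.softmax(scores, dim=-1)                  # each position attends to all
out = attn @ V                                    # (1, 10, 64)
# Multi-head attention runs h such maps in parallel on d_model/h-sized slices.
```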

Suggested Readings

Week 10: Contextual Embeddings and Self-supervised Learning

Lecture Slide

Lecture Content

  • Why semi-supervised?
  • Semi-supervised learning dimensions
  • Pre-training and fine-tuning methods
    • CoVe
    • TagLM
    • ELMo
    • GPT
    • ULMFiT
    • BERT
    • BART
  • Evaluation benchmarks
    • GLUE
    • SQuAD
    • NER
    • SuperGLUE
    • XNLI
Pre-train Fine-tune with HF
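
A minimal sketch of the pre-train then fine-tune recipe with the Hugging Face transformers library; the checkpoint name, two-label setup, and toy batch are illustrative choices, not the course's exact notebook.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pre-trained encoder plus a fresh classification head.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

batch = tok(["a great movie", "a dull movie"],
            padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss  # cross-entropy on the [CLS] head
loss.backward()                            # fine-tune all weights end-to-end
opt.step()
```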

Suggested Readings

Week 11: Large Language Models & Multilingual NLP

Assignment 2 is out here. Deadline: 23 Apr 2025, 11:59pm.

Final project report instruction

Lecture Slide

Lecture Content

  • Large Language Models
  • Examples of Large Pretrained Language Models (see the sketch below)
  • Multilingual NLP
    • Why do we need Multilingual NLP?
    • Low-resource NLP
    • Cross-lingual models
    • Multilingual models
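
A minimal sketch of prompting a pre-trained causal language model via Hugging Face transformers, as mentioned above; "gpt2" stands in here for any larger pretrained LM.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("Natural language processing is", return_tensors="pt").input_ids
out = lm.generate(ids, max_new_tokens=20, do_sample=False)  # greedy continuation
print(tok.decode(out[0], skip_special_tokens=True))
```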

Suggested Readings

Week 12: Bias, Robustness, Hallucination, Multimodal NLP & Recap

Lecture Slide

Lecture Content

  • Bias Problem in Deep Learning for NLP
  • Robustness of NLP Deep Learning Models
  • Hallucination of LLMs
  • Multimodal NLP
  • Recap

Suggested Readings
