Course Syllabus: Natural Language Processing (NLP)
1. Course Schedule (20 Weeks)
This schedule merges the 5 Modules from your course documents with a deep dive into modern LLM applications.
| Phase | Weeks | Focus Topics |
|---|---|---|
| I: Foundations | 1–4 | NLP History, Regex, Pipeline (Segmentation, Tokenization), Lemmatization vs. Stemming. |
| II: Machine Learning | 5–8 | Text Statistics, Naïve Bayes, Sentiment Analysis, Logistic Regression, and Vector Space Models. |
| III: Neural NLP | 9–12 | Word Embeddings (Word2Vec/GloVe), Feedforward Networks, RNNs, and LSTMs. |
| IV: The LLM Era | 13–16 | Attention, Transformers (BERT/GPT), Pre-trained Models, and Question Answering. |
| V: Mastery | 17–20 | RAG Pipelines, Ethics & Bias, Cutting-edge Trends, and Capstone Project Presentations. |
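As a small preview of the Phase I topics, the sketch below chains sentence segmentation, regex tokenization, and a deliberately crude stemmer using only the standard library. The function names and the suffix list are illustrative assumptions, not course material or a library API, and the stemmer is a toy rather than the Porter algorithm.

```python
import re

def segment_sentences(text: str) -> list[str]:
    """Naive segmentation: split on whitespace that follows ., ! or ?."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence: str) -> list[str]:
    """Regex tokenization: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def crude_stem(token: str) -> str:
    """Toy suffix stripping (an assumption for illustration, not Porter)."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words like "is" survive.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

text = "The models were trained quickly. Tokenizing text is the first step!"
for sentence in segment_sentences(text):
    tokens = tokenize(sentence.lower())
    print([crude_stem(t) for t in tokens])
# ['the', 'model', 'were', 'train', 'quickly', '.']
# ['tokeniz', 'text', 'is', 'the', 'first', 'step', '!']
```

The stem `tokeniz` is not a dictionary word. That is the practical difference from lemmatization, which maps a token to a valid base form instead of just trimming characters.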
2. Evaluation & Grading
Weights are based on the SST University Evaluation Criteria.
- Final Exam (30%): Comprehensive assessment of theory and math.
- Exercises (20%): Hands-on Python coding assignments.
- Group Project (30%): Building a functional NLP application (e.g., Sentiment Analyzer or Chatbot).
- Quizzes (10%): Bi-weekly knowledge checks.
- Class Participation (10%): Engagement in discussions and remote sessions.
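To see how the weights combine, here is a minimal sketch of the final-grade arithmetic. It assumes every component is scored on a 0 to 100 scale, which the syllabus does not state, and the example scores are made up.

```python
# Component weights in percent, copied from the list above.
WEIGHTS = {
    "final_exam": 30,
    "exercises": 20,
    "group_project": 30,
    "quizzes": 10,
    "participation": 10,
}

def final_grade(scores: dict[str, float]) -> float:
    """Weighted average of component scores (assumed 0-100 each)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

# The weights should account for the whole grade.
assert sum(WEIGHTS.values()) == 100

# Hypothetical student scores, for illustration only.
example_scores = {
    "final_exam": 80,
    "exercises": 90,
    "group_project": 85,
    "quizzes": 70,
    "participation": 100,
}
print(final_grade(example_scores))  # 84.5
```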
3. Generative AI & Academic Integrity Policy
As this is an NLP course, use of AI is permitted under strict transparency rules.
- Transparency: Any code or text generated by AI must be disclosed with a Conversation Share Link (e.g., a ChatGPT or Claude shared link).
- The “Human-in-the-loop” Rule: You may use AI to debug or brainstorm, but you must be able to explain the “how” and “why” of your submitted work during random oral spot-checks.
- Prohibited Use: Using AI to generate direct answers for quizzes or the final exam without critical engagement is considered academic misconduct.
4. Automation: Building Slides with Quarto
Quarto allows you to create interactive Reveal.js slides using simple Markdown and Python.
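Rendering a deck is a single call to the Quarto CLI (`quarto render <file>`). The sketch below wraps that call in Python so it can sit alongside other course scripts. It assumes the Quarto CLI is installed and on your PATH, and that the deck is saved as `presentation.qmd` as in section A; the helper names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

def build_render_command(qmd_path: str) -> list[str]:
    """Build the argument list for `quarto render <file>`."""
    return ["quarto", "render", qmd_path]

def render_slides(qmd_path: str) -> bool:
    """Render the deck; return False if Quarto or the file is missing."""
    if shutil.which("quarto") is None:
        print("Quarto CLI not found on PATH.")
        return False
    if not Path(qmd_path).is_file():
        print(f"File not found: {qmd_path}")
        return False
    # check=True raises CalledProcessError if the render fails.
    subprocess.run(build_render_command(qmd_path), check=True)
    return True

if __name__ == "__main__":
    render_slides("presentation.qmd")
```

Quarto runs Python code cells through Jupyter, so a deck with executable cells also needs a working Python and Jupyter installation.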
A. Your Slide Template (presentation.qmd)
Copy this into a text file named presentation.qmd to start.
---
title: "Week 1: Introduction to NLP"
subtitle: "Natural Language Processing & Semantic Analysis"
author: "Francisco Suárez"
format:
  revealjs:
    theme: sky
    transition: slide
    incremental: true
    chalkboard: true
    code-fold: true
---
## What is NLP?
- Intersection of Computer Science, AI, and Linguistics.
- Goal: Enable machines to understand and derive meaning from text/speech.
## The NLP Pipeline
::: {.columns}
::: {.column width="50%"}
1. Data Collection
2. Text Cleaning
3. Pre-processing
:::
::: {.column width="50%"}
```{python}
# Example of simple tokenization in Python
text = "The sky is clear."
tokens = text.split()  # whitespace split; punctuation stays attached
print(tokens)  # ['The', 'sky', 'is', 'clear.']
```
:::
:::
## Semantic Analysis
> Semantic analysis is the process of drawing meaning from text.
$$ \text{Meaning} = f(\text{Grammar}, \text{Context}, \text{Relationship}) $$
B. Prompt for Automating Content
When you want an AI to write the code for the next 19 weeks of slides, use this specific prompt:
“Act as a college professor. Generate the Quarto Markdown (`.qmd`) code for a 15-slide presentation on [Insert Topic, e.g., Transformers]. Use the Reveal.js format. Include:
1. A YAML header with the ‘sky’ theme.
2. At least two slides with executable Python code blocks (`{python}`).
3. One slide with a LaTeX equation for the math.
4. Use `::: {.incremental}` for bullet points.
5. Base the content on the standard NLP pipeline (Tokenization $\rightarrow$ Embedding $\rightarrow$ Model).”
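If you want one prompt per week, the topic slot can be filled programmatically. This is a rough sketch: the `[TOPIC]` marker and the `build_prompt` helper are assumptions introduced here, and the topic list is simply read off the Phase IV row of the schedule, since the syllabus does not give a week-by-week split.

```python
# The prompt text from this section, with the topic slot marked as [TOPIC].
# Arrows are written as "->" to keep the string free of LaTeX escapes.
PROMPT_TEMPLATE = (
    "Act as a college professor. Generate the Quarto Markdown (.qmd) code "
    "for a 15-slide presentation on [TOPIC]. Use the Reveal.js format. "
    "Include: 1. A YAML header with the 'sky' theme. "
    "2. At least two slides with executable Python code blocks ({python}). "
    "3. One slide with a LaTeX equation for the math. "
    "4. Use ::: {.incremental} for bullet points. "
    "5. Base the content on the standard NLP pipeline "
    "(Tokenization -> Embedding -> Model)."
)

def build_prompt(topic: str) -> str:
    """Fill the topic slot in the prompt template."""
    # str.replace is used instead of str.format because the prompt
    # itself contains literal braces such as {python}.
    return PROMPT_TEMPLATE.replace("[TOPIC]", topic)

# Hypothetical topic list based on the Phase IV row of the schedule.
topics = [
    "Attention",
    "Transformers (BERT/GPT)",
    "Pre-trained Models",
    "Question Answering",
]

for topic in topics:
    print(build_prompt(topic))
    print("-" * 40)
```

Whatever the AI returns still falls under the transparency rules in section 3, so keep the conversation share link with each generated deck.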