Course Syllabus: Natural Language Processing (NLP)

1. Course Schedule (20 Weeks)

This schedule merges the five modules from your course documents with a deep dive into modern LLM applications.

Phase                | Weeks | Focus Topics
---------------------|-------|-------------
I: Foundations       | 1–4   | NLP History, Regex, Pipeline (Segmentation, Tokenization), Lemmatization vs. Stemming.
II: Machine Learning | 5–8   | Text Statistics, Naïve Bayes, Sentiment Analysis, Logistic Regression, and Vector Space Models.
III: Neural NLP      | 9–12  | Word Embeddings (Word2Vec/GloVe), Feedforward Networks, RNNs, and LSTMs.
IV: The LLM Era      | 13–16 | Attention, Transformers (BERT/GPT), Pre-trained Models, and Question Answering.
V: Mastery           | 17–20 | RAG Pipelines, Ethics & Bias, Cutting-edge Trends, and Capstone Project Presentations.

2. Evaluation & Grading

Weights are based on the SST University Evaluation Criteria.

  • Final Exam (30%): Comprehensive assessment of theory and math.
  • Exercises (20%): Hands-on Python coding assignments.
  • Group Project (30%): Building a functional NLP application (e.g., Sentiment Analyzer or Chatbot).
  • Quizzes (10%): Bi-weekly knowledge checks.
  • Class Participation (10%): Engagement in discussions and remote sessions.
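
As a rough illustration of how these weights combine, the sketch below computes a weighted final grade. The 0–100 score scale, the dictionary keys, and the sample scores are assumptions made for the example; they are not taken from the evaluation criteria.

```python
# Weights from the grading breakdown above, in percent of the final grade.
WEIGHTS = {
    "final_exam": 30,
    "exercises": 20,
    "group_project": 30,
    "quizzes": 10,
    "participation": 10,
}


def final_grade(scores):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    # Multiply each score by its percent weight, then divide by 100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student, for illustration only.
example = {
    "final_exam": 80,
    "exercises": 90,
    "group_project": 85,
    "quizzes": 70,
    "participation": 100,
}
print(final_grade(example))  # 84.5
```

Storing the weights as whole percentages keeps the arithmetic exact and makes it easy to check that they add up to 100.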

3. Generative AI & Academic Integrity Policy

As this is an NLP course, use of AI is permitted under strict transparency rules.

  1. Transparency: Any code or text generated by AI must be disclosed with a Conversation Share Link (e.g., ChatGPT or Claude shared link).
  2. The “Human-in-the-loop” Rule: You may use AI to debug or brainstorm, but you must be able to explain the “how” and “why” during random oral spot-checks.
  3. Prohibited Use: Using AI to generate direct answers for quizzes or the final exam without critical engagement is considered academic misconduct.

4. Automation: Building Slides with Quarto

Quarto allows you to create interactive Reveal.js slides using simple Markdown and Python.

A. Your Slide Template (presentation.qmd)

Copy this into a text file named presentation.qmd to start.

---
title: "Week 1: Introduction to NLP"
subtitle: "Natural Language Processing & Semantic Analysis"
author: "Francisco Suárez"
format: 
  revealjs:
    theme: sky
    transition: slide
    incremental: true 
    chalkboard: true
    code-fold: true
---

## What is NLP?
- Intersection of Computer Science, AI, and Linguistics.
- Goal: Enable machines to understand and derive meaning from text/speech.

## The NLP Pipeline
::: {.columns}
::: {.column width="50%"}
1. Data Collection
2. Text Cleaning
3. Pre-processing
:::

::: {.column width="50%"}


```{python}
# Example of simple tokenization in Python
text = "The sky is clear."
tokens = text.split()  # splits on whitespace only
print(tokens)  # ['The', 'sky', 'is', 'clear.'] (punctuation stays attached)
```


:::
:::

## Semantic Analysis

> Semantic analysis is the process of drawing meaning from text.

$$ \text{Meaning} = f(\text{Grammar}, \text{Context}, \text{Relationship}) $$
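
Once the template above is saved as presentation.qmd, it still has to be rendered into slides. The sketch below is one possible way to do that from Python, assuming the Quarto CLI is installed and on your PATH (executable Python cells also require Jupyter). The helper names `render_command` and `render_slides` are made up for this example; only the `quarto render` command itself belongs to Quarto.

```python
import shutil
import subprocess
from pathlib import Path


def render_command(qmd_path, to_format="revealjs"):
    """Build the `quarto render` command line for one .qmd file."""
    return ["quarto", "render", str(qmd_path), "--to", to_format]


def render_slides(qmd_path):
    """Render the slides if possible; return True on success, False if skipped."""
    if shutil.which("quarto") is None:
        print("Quarto CLI not found on PATH; install it before rendering.")
        return False
    if not Path(qmd_path).is_file():
        print(f"{qmd_path} does not exist; save the template first.")
        return False
    # check=True raises CalledProcessError if Quarto reports a failure.
    subprocess.run(render_command(qmd_path), check=True)
    return True


if __name__ == "__main__":
    render_slides("presentation.qmd")
```

Running the same command directly in a terminal works equally well; the wrapper is only convenient if you plan to render many weekly decks in a loop.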

B. Prompt for Automating Content

When you want an AI to write the code for the next 19 weeks of slides, use this specific prompt:

“Act as a college professor. Generate the Quarto Markdown (.qmd) code for a 15-slide presentation on [Insert Topic, e.g., Transformers].

Use the Reveal.js format. Include:

  1. A YAML header with the ‘sky’ theme.
  2. At least two slides with executable Python code blocks ({python}).
  3. One slide with a LaTeX equation for the math.
  4. ::: {.incremental} blocks for bullet points.
  5. Content based on the standard NLP pipeline (Tokenization → Embedding → Model).”
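
To avoid retyping the prompt for every week, you could fill in the topic programmatically. The sketch below is a minimal example under stated assumptions: the `[TOPIC]` placeholder, the `build_prompt` helper, and the week-to-topic mapping are all hypothetical (the schedule only lists topics per phase, so the specific week numbers here are illustrative).

```python
# The prompt above, with a [TOPIC] placeholder where the topic goes.
PROMPT_TEMPLATE = (
    "Act as a college professor. Generate the Quarto Markdown (.qmd) code "
    "for a 15-slide presentation on [TOPIC].\n\n"
    "Use the Reveal.js format. Include:\n"
    "1. A YAML header with the 'sky' theme.\n"
    "2. At least two slides with executable Python code blocks ({python}).\n"
    "3. One slide with a LaTeX equation for the math.\n"
    "4. ::: {.incremental} blocks for bullet points.\n"
    "5. Content based on the standard NLP pipeline "
    "(Tokenization -> Embedding -> Model)."
)

# Hypothetical week-to-topic mapping, for illustration only.
WEEKLY_TOPICS = {
    2: "Regular Expressions",
    9: "Word Embeddings (Word2Vec/GloVe)",
    14: "Transformers (BERT/GPT)",
}


def build_prompt(topic):
    """Return the slide-generation prompt with the topic filled in."""
    return PROMPT_TEMPLATE.replace("[TOPIC]", topic)


for week, topic in sorted(WEEKLY_TOPICS.items()):
    print(f"--- Week {week} ---")
    print(build_prompt(topic))
    print()
```

Plain `str.replace` is used instead of `str.format` because the prompt itself contains literal braces (`{python}`, `{.incremental}`), which `format` would try to interpret as fields.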


Quarto Tutorial: Markdown to Slideshow

This video provides a complete walkthrough on setting up Quarto to transform your Markdown files into professional, code-ready presentations.

http://googleusercontent.com/youtube_content/0
