Quick Reference

Quick command reference for running vocabulary extraction: aliases, environment setup, and example commands.

Dictionary Extraction - Quick Commands

✅ Easy Ways to Run the Script (From Anywhere!)

Method 1: Using Shell Wrapper

From any directory:

# Activate your conda environment first
conda activate ags-dictionary

# Run from anywhere (the script will handle paths)
/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"

Or create an alias in your ~/.bashrc:

# Add this to ~/.bashrc
alias dict-extract="/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh"

# Then you can run from anywhere:
dict-extract --video-url "URL" --topic "my-topic" --create-hugo-page

Method 2: Using Python Wrapper

conda activate ags-dictionary

# From project root
python scripts/dictionary/extract_vocab.py \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"

Method 3: From Project Root (Traditional)

conda activate ags-dictionary
cd /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog

python scripts/dictionary/extract_vocabulary.py \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"

🔑 Environment Setup

One-time Setup

I don't need the following because I use a paid GitHub Copilot account, but I am keeping it here for reference and for anyone else who uses this project.

  1. Set your OpenAI API Key:
# Option A: Set in your shell profile (recommended)
echo 'export OPENAI_API_KEY="sk-your-key-here"' >> ~/.bashrc
source ~/.bashrc

# Option B: Create .env file in project root
cat > /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/.env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF

# Option C: Create .env in scripts/dictionary/
cat > /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/.env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF
  2. Verify installation:
conda activate ags-dictionary
python -c "import yaml, openai, youtube_transcript_api; print('✅ All dependencies installed!')"
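
To see which of the three options above is actually in effect, a small check like the following may help. This is a minimal sketch: `find_openai_key` is a hypothetical helper, not part of the project, and the order it checks (environment first, then the two .env files) is an assumption rather than the script's documented precedence. It reports only where a key is defined and never prints the key itself.

```shell
# Hypothetical helper: report where OPENAI_API_KEY is defined.
# Usage: find_openai_key [project-root]   (defaults to the current directory)
# Assumption: the check order below is illustrative, not the script's own precedence.
find_openai_key() {
    local root="${1:-.}" f
    # Option A: key exported in the shell environment
    if [ -n "${OPENAI_API_KEY:-}" ]; then
        echo "environment"
        return 0
    fi
    # Options B and C: .env in the project root or in scripts/dictionary/
    for f in "$root/.env" "$root/scripts/dictionary/.env"; do
        if [ -f "$f" ] && grep -q '^OPENAI_API_KEY=.' "$f"; then
            echo "$f"
            return 0
        fi
    done
    echo "OPENAI_API_KEY not found" >&2
    return 1
}
```

For example, `find_openai_key /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog` prints `environment` or the path of the first .env file that defines the key, and returns a non-zero status if none does.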

📝 Example Commands

Basic Usage

dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic"

With Hugo Page Creation

dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic" \
  --create-hugo-page \
  --source-name "Video Title"

Append to Existing Vocabulary

dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "existing-topic" \
  --append

Control Word Count and Difficulty

dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic" \
  --max-words 30 \
  --difficulty-threshold 7

🐛 Troubleshooting

Error: No module named 'yaml'

Solution: Install dependencies

conda activate ags-dictionary
pip install youtube-transcript-api openai PyYAML python-dotenv requests beautifulsoup4

Error: OpenAI API key not found

Solution: Set your API key (see Environment Setup above)

Error: Could not extract video ID

Solution: Make sure you’re using the full YouTube URL:

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • ❌ Just VIDEO_ID
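
A quick pre-check of the URL before running the script could look like the sketch below. `yt_video_id` is a hypothetical helper written for this reference; it is not how the extraction script itself parses URLs. It accepts only the two URL forms listed above (with or without `www.`) and rejects a bare video ID.

```shell
# Hypothetical helper: print the video ID for a supported YouTube URL,
# or return 1 for anything else (for example a bare VIDEO_ID).
yt_video_id() {
    local url="$1" id
    case "$url" in
        https://www.youtube.com/watch\?v=*|https://youtube.com/watch\?v=*)
            id="${url#*\?v=}"   # keep everything after "?v="
            id="${id%%&*}"      # drop extra query parameters such as &t=42s
            ;;
        https://youtu.be/*)
            id="${url#https://youtu.be/}"
            id="${id%%\?*}"     # drop any query string
            ;;
        *)
            echo "unsupported URL: $url" >&2
            return 1
            ;;
    esac
    [ -n "$id" ] || return 1
    printf '%s\n' "$id"
}
```

Running `yt_video_id "https://youtu.be/VIDEO_ID"` prints `VIDEO_ID`, while `yt_video_id "VIDEO_ID"` reports an unsupported URL and returns a non-zero status.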

Error: Could not fetch transcript

Solution: The video might not have captions. Try a different video or use a manual transcript.


Add to your ~/.bashrc or ~/.zshrc:

# Dictionary extraction aliases
export GHAFOORS_BLOG="/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog"
alias dict-extract="$GHAFOORS_BLOG/scripts/dictionary/dict-extract.sh"
alias dict-cd="cd $GHAFOORS_BLOG"

# Quick function for common usage
extract-vocab() {
    conda activate ags-dictionary
    dict-extract --video-url "$1" --topic "$2" --create-hugo-page --source-name "$3"
}

# Usage: extract-vocab "URL" "topic-name" "Video Title"

Then reload:

source ~/.bashrc

Now you can run from anywhere:

extract-vocab "https://youtube.com/watch?v=ABC123" "my-topic" "My Video"
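
The extract-vocab function above passes its arguments straight through, so a missing argument only shows up as an error from the script. A stricter variant might look like this sketch. The name `extract_vocab_checked` is hypothetical. It checks the argument count first and calls the wrapper script by path instead of through the alias, because aliases are not expanded in non-interactive shells.

```shell
# Hypothetical stricter variant of extract-vocab.
# Usage: extract_vocab_checked "URL" "topic-name" "Video Title"
extract_vocab_checked() {
    if [ "$#" -ne 3 ]; then
        echo 'usage: extract_vocab_checked "URL" "topic-name" "Video Title"' >&2
        return 2
    fi
    if [ -z "${GHAFOORS_BLOG:-}" ]; then
        echo "GHAFOORS_BLOG is not set" >&2
        return 1
    fi
    # Stop early if the conda environment cannot be activated
    conda activate ags-dictionary || return 1
    "$GHAFOORS_BLOG/scripts/dictionary/dict-extract.sh" \
        --video-url "$1" --topic "$2" --create-hugo-page --source-name "$3"
}
```

It takes the same three arguments as extract-vocab. Called with the wrong number of arguments, it prints a usage line and returns status 2 without touching conda or the script.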

📂 Output Locations

The script will create files in these locations (relative to project root):

  • YAML data: data/dictionary/[topic]/vocabulary.yaml
  • Hugo page: content/docs/dictionary/[topic]/index.md (if --create-hugo-page used)
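
To confirm that a run produced the files listed above, the two paths can be built from the topic name and checked. This is a minimal sketch: `dict_output_paths` is a hypothetical helper, and it falls back to the current directory when GHAFOORS_BLOG is not set.

```shell
# Hypothetical helper: print the two documented output paths for a topic.
# Usage: dict_output_paths TOPIC [project-root]
dict_output_paths() {
    local topic="$1" root="${2:-.}"
    printf '%s\n' \
        "$root/data/dictionary/$topic/vocabulary.yaml" \
        "$root/content/docs/dictionary/$topic/index.md"
}

# Report which of the files exist for the "capitalism" topic
dict_output_paths capitalism "${GHAFOORS_BLOG:-.}" | while IFS= read -r f; do
    if [ -e "$f" ]; then
        echo "found:   $f"
    else
        echo "missing: $f"
    fi
done
```

The Hugo page is expected to be missing when --create-hugo-page was not used.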

✅ Your Fixed Command

conda activate ags-dictionary

# From anywhere - using shell wrapper (easiest!)
/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"

Note: I fixed the typo in your topic name (“Capitalil sm” → “capitalism”)


Created: 2026-05-05
Status: Ready to use!