Quick command reference for running vocabulary extraction: aliases, environment setup, and example commands.
From ANY directory in your project:
# Activate your conda environment first
conda activate ags-dictionary

# Run from anywhere (the script will handle paths)
/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"
Or create an alias in your ~/.bashrc:
# Add this to ~/.bashrc
alias dict-extract="/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh"

# Then you can run from anywhere:
dict-extract --video-url "URL" --topic "my-topic" --create-hugo-page

conda activate ags-dictionary

# From project root
python scripts/dictionary/extract_vocab.py \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"

conda activate ags-dictionary
cd /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog

python scripts/dictionary/extract_vocabulary.py \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"
The following is not needed in my setup, since I use a paid GitHub Copilot account, but I am keeping it here for reference and for anyone else who might use this project.
# Option A: Set in your shell profile (recommended)
echo 'export OPENAI_API_KEY="sk-your-key-here"' >> ~/.bashrc
source ~/.bashrc

# Option B: Create .env file in project root
cat > /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/.env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF

# Option C: Create .env in scripts/dictionary/
cat > /home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/.env << 'EOF'
OPENAI_API_KEY=sk-your-key-here
EOF
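To see which of the three options is actually in effect on a machine, a small check along these lines may help. It is only a sketch: the function name `find_api_key_source` is hypothetical, and the lookup order (shell environment, then the project-root `.env`, then `scripts/dictionary/.env`) is an assumption, not necessarily the order the extraction script itself uses.

```shell
# Hypothetical helper: report where an OPENAI_API_KEY is defined.
# Assumed lookup order: environment, <root>/.env, <root>/scripts/dictionary/.env
# Usage: find_api_key_source /path/to/project/root
find_api_key_source() {
  root="$1"
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "environment"
    return 0
  fi
  for f in "$root/.env" "$root/scripts/dictionary/.env"; do
    # Look for a line starting with OPENAI_API_KEY= (the value is never printed)
    if [ -f "$f" ] && grep -q '^OPENAI_API_KEY=' "$f"; then
      echo "$f"
      return 0
    fi
  done
  echo "OPENAI_API_KEY not found" >&2
  return 1
}
```

The function prints only the location of the key, never the key itself, so its output is safe to paste into a bug report.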
# Verify that the dependencies are installed
conda activate ags-dictionary
python -c "import yaml, openai, youtube_transcript_api; print('✅ All dependencies installed!')"

# Minimal run: video URL and topic only
dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic"

# Also create a Hugo page and record the source name
dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic" \
  --create-hugo-page \
  --source-name "Video Title"

# Append to an existing topic
dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "existing-topic" \
  --append

# Set the maximum word count and the difficulty threshold
dict-extract \
  --video-url "https://www.youtube.com/watch?v=VIDEO_ID" \
  --topic "my-topic" \
  --max-words 30 \
  --difficulty-threshold 7
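The example commands above can be combined to process several videos into one topic. The sketch below is illustrative only: `batch_extract` and the `DRY_RUN` variable are hypothetical names, and it assumes that the first run creates the topic while later runs need `--append`, which the source does not state explicitly. It also assumes `GHAFOORS_BLOG` points at the project root.

```shell
# Hypothetical batch wrapper around the flags shown above.
# Usage: batch_extract TOPIC URL [URL...]
# With DRY_RUN=1 the commands are printed instead of executed.
batch_extract() {
  topic="$1"
  shift
  first=1
  for url in "$@"; do
    if [ "$first" -eq 1 ]; then
      append_flag=""          # assumption: the first run creates the topic
      first=0
    else
      append_flag="--append"  # later runs add to the existing topic
    fi
    if [ "${DRY_RUN:-0}" -eq 1 ]; then
      echo "dict-extract --video-url \"$url\" --topic \"$topic\" $append_flag"
    else
      # $append_flag is left unquoted on purpose so it vanishes when empty
      "$GHAFOORS_BLOG/scripts/dictionary/dict-extract.sh" \
        --video-url "$url" --topic "$topic" $append_flag
    fi
  done
}
```

Running with `DRY_RUN=1` first is a cheap way to review the generated commands before any transcript is fetched.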
No module named 'yaml'
Solution: Install dependencies
conda activate ags-dictionary
pip install youtube-transcript-api openai PyYAML python-dotenv requests beautifulsoup4

OpenAI API key not found
Solution: Set your API key (see Environment Setup above)

Could not extract video ID
Solution: Make sure you’re using the full YouTube URL:
https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
VIDEO_ID

Could not fetch transcript
Solution: Video might not have captions. Try a different video or use manual transcript.
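When the "Could not extract video ID" error appears, it can help to test the URL outside the script. The following is a minimal sketch of normalizing the three input forms listed above to a bare ID. The function name `extract_video_id` is hypothetical, and this is not the parser the extraction script uses; the character check assumes IDs consist of letters, digits, `_` and `-`.

```shell
# Hypothetical helper: reduce the three input forms to a bare video ID.
extract_video_id() {
  input="$1"
  case "$input" in
    *"youtube.com/watch?"*v=*)
      id="${input#*v=}"        # drop everything up to and including "v="
      id="${id%%&*}"           # drop any trailing "&param=..." pairs
      ;;
    *"youtu.be/"*)
      id="${input#*youtu.be/}" # keep what follows the short-link host
      id="${id%%\?*}"          # drop a trailing "?t=..." query
      ;;
    *)
      id="$input"              # assume a bare ID was passed
      ;;
  esac
  # Reject empty results and anything with characters outside the assumed set
  case "$id" in
    ""|*[!A-Za-z0-9_-]*)
      echo "Could not extract video ID" >&2
      return 1
      ;;
  esac
  echo "$id"
}

# Example: extract_video_id "https://youtu.be/9M_dq_0ljsc"
```

If the function rejects an input, the URL is most likely a playlist, channel, or other non-video link.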
Add to your ~/.bashrc or ~/.zshrc:
# Dictionary extraction aliases
export GHAFOORS_BLOG="/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog"
alias dict-extract="$GHAFOORS_BLOG/scripts/dictionary/dict-extract.sh"
alias dict-cd="cd $GHAFOORS_BLOG"

# Quick function for common usage
extract-vocab() {
  conda activate ags-dictionary
  dict-extract --video-url "$1" --topic "$2" --create-hugo-page --source-name "$3"
}

# Usage: extract-vocab "URL" "topic-name" "Video Title"
Then reload:
source ~/.bashrc
Now you can run from anywhere:
extract-vocab "https://youtube.com/watch?v=ABC123" "my-topic" "My Video"
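Because `extract-vocab` takes three positional arguments, a forgotten title silently passes an empty `--source-name`. A variant that checks its arguments first might look like the sketch below. The name `extract_vocab_checked` is hypothetical; the sketch assumes `GHAFOORS_BLOG` is exported as shown above and that `conda` is initialized in the current shell.

```shell
# Hypothetical variant of extract-vocab that validates its arguments first.
# Usage: extract_vocab_checked "URL" "topic-name" "Video Title"
extract_vocab_checked() {
  if [ "$#" -ne 3 ]; then
    echo 'Usage: extract_vocab_checked "URL" "topic-name" "Video Title"' >&2
    return 2
  fi
  # Stop early if the environment cannot be activated
  conda activate ags-dictionary || return 1
  # Call the script by path: aliases are not expanded in non-interactive shells
  "$GHAFOORS_BLOG/scripts/dictionary/dict-extract.sh" \
    --video-url "$1" --topic "$2" --create-hugo-page --source-name "$3"
}
```

Calling the script by its full path instead of through the `dict-extract` alias keeps the function usable from scripts and cron jobs, where bash does not expand aliases by default.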
The script will create files in these locations (relative to project root):
data/dictionary/[topic]/vocabulary.yaml
content/docs/dictionary/[topic]/index.md (if --create-hugo-page used)

conda activate ags-dictionary

# From anywhere - using shell wrapper (easiest!)
/home/ag-sayyed/Documents/projects/hbstack/ghafoors-blog/scripts/dictionary/dict-extract.sh \
  --video-url "https://www.youtube.com/watch?v=9M_dq_0ljsc" \
  --topic "capitalism" \
  --create-hugo-page \
  --source-name "Capitalism is not natural"
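After a run like this one, the two output locations listed earlier can be checked with a short helper. This is a sketch under stated assumptions: `check_outputs` is a hypothetical name, and the paths are taken from the list of output locations, relative to the project root passed as the first argument.

```shell
# Hypothetical helper: report which expected output files exist for a topic.
# Usage: check_outputs /path/to/project/root topic-name
check_outputs() {
  root="$1"
  topic="$2"
  rc=0
  for rel in \
    "data/dictionary/$topic/vocabulary.yaml" \
    "content/docs/dictionary/$topic/index.md"; do
    if [ -f "$root/$rel" ]; then
      echo "found:   $rel"
    else
      echo "missing: $rel"
      rc=1
    fi
  done
  # Non-zero when at least one file is absent
  return "$rc"
}
```

A missing `index.md` is expected when the run did not include `--create-hugo-page`.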
Note: I fixed the typo in your topic name (“Capitalil sm” → “capitalism”)
Created: 2026-05-05
Status: Ready to use!