Ready-to-use prompt for extracting vocabulary using GitHub Copilot
Replace [TOPIC] and [TRANSCRIPT] with your content.

I need help extracting vocabulary from a video transcript for my English learning dictionary.
**Task**: Analyze this video transcript about [TOPIC] and extract 25 challenging English words suitable for an advanced ESL learner (first language: Urdu).
**Selection Criteria**:
- Difficulty level: 6-9 out of 10
- Important for understanding the topic
- Not common everyday words
- Technical or domain-specific terms
- Useful for academic/professional contexts
**For each word, provide**:
1. word: the English word (lowercase)
2. part_of_speech: (noun, verb, adjective, adverb, phrase, technical-term, etc.)
3. urdu_meaning: Urdu translation in Urdu script
4. example_en: A clear example sentence from the transcript or similar context
5. example_ur: Natural, contextual Urdu translation of the example
6. additional_example_ur: (optional) Another Urdu example showing different usage
**Output Format**: Valid YAML array only, no markdown formatting, no explanations.
**Example Entry**:
```yaml
- word: retrieval
  part_of_speech: noun
  urdu_meaning: بازیافت، واپس لانا
  example_en: Efficient retrieval of information is crucial for RAG systems.
  example_ur: RAG سسٹمز کے لیے معلومات کی موثر بازیافت بہت اہم ہے۔
  additional_example_ur: ڈیٹا بیس سے دستاویزات کی بازیافت تیزی سے ہونی چاہیے۔
```
Transcript: [PASTE YOUR TRANSCRIPT HERE]
Please extract the vocabulary now in YAML format.
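
Filling the placeholders by hand gets tedious once you process more than a couple of videos. The following is a minimal sketch, assuming the main prompt is kept as a plain string that contains the `[TOPIC]` and `[PASTE YOUR TRANSCRIPT HERE]` markers exactly as written above. `fill_prompt` is a hypothetical helper for illustration, not part of Copilot or any tool mentioned here.

```python
def fill_prompt(template: str, topic: str, transcript: str) -> str:
    """Substitute the two placeholders used by the main prompt template."""
    markers = ("[TOPIC]", "[PASTE YOUR TRANSCRIPT HERE]")
    for marker in markers:
        if marker not in template:
            raise ValueError(f"template is missing placeholder {marker}")
    # Replace the topic first, so a transcript that happens to contain
    # the literal text "[TOPIC]" is inserted unchanged afterwards.
    filled = template.replace("[TOPIC]", topic)
    return filled.replace("[PASTE YOUR TRANSCRIPT HERE]", transcript.strip())


if __name__ == "__main__":
    demo = "Analyze this transcript about [TOPIC].\nTranscript: [PASTE YOUR TRANSCRIPT HERE]"
    print(fill_prompt(demo, "RAG and Vector Databases", "Vectors are stored in an index."))
```

Raising an error on a missing marker is a deliberate choice here: a silently unfilled prompt would still be accepted by the chat window and produce vocabulary for the wrong input.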
---
## 📝 Alternative: Shorter Version for Quick Extraction
Extract 20 difficult English words from this video transcript about [TOPIC] for an Urdu-speaking advanced learner.
For each word provide YAML format:
Transcript: [PASTE TRANSCRIPT]
Output only valid YAML array.
---
## 🎨 Customization Options
### Adjust Difficulty Level
Focus on words with difficulty level [LEVEL] out of 10:
### Focus on Specific Word Types
Prioritize:
### Adjust Number of Words
Extract [NUMBER] words:
---
## 💡 Pro Tips
### 1. Provide Context
Include this in your prompt for better results:
Context: I’m learning [FIELD/DOMAIN] and my current English level is [LEVEL]. I already know common words like [EXAMPLES], so focus on more advanced terms.
### 2. Request Specific Translation Style
For Urdu translations:
### 3. Ask for Additional Information
Also provide for each word:
### 4. Iterative Refinement
If results aren't perfect:
Please refine the translations:
---
## 🔧 Post-Processing Checklist
After Copilot generates the YAML:
- [ ] Copy the output
- [ ] Validate YAML syntax (use yamllint.com or Python)
- [ ] Review Urdu translations for accuracy
- [ ] Check example sentences for clarity
- [ ] Verify all required fields are present
- [ ] Remove any duplicates
- [ ] Save to `data/dictionary/[topic]/vocabulary.yaml`
- [ ] Create/update Hugo content page
- [ ] Test locally with `npm run dev:memory`
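
The field and duplicate checks in this list can be partly automated. The sketch below assumes the YAML has already been parsed into a list of dictionaries (for example with `yaml.safe_load` from PyYAML) and that the five non-optional fields from the main prompt are the required ones. `check_entries` and `REQUIRED_FIELDS` are hypothetical names chosen for this example.

```python
# Fields the main prompt asks for; additional_example_ur is optional there.
REQUIRED_FIELDS = ("word", "part_of_speech", "urdu_meaning", "example_en", "example_ur")


def check_entries(entries):
    """Return a list of human-readable problems found in parsed vocabulary entries."""
    problems = []
    seen = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not a mapping")
            continue
        # A field counts as missing if it is absent, None, or blank.
        missing = [f for f in REQUIRED_FIELDS if not str(entry.get(f) or "").strip()]
        if missing:
            problems.append(f"entry {index}: missing {', '.join(missing)}")
            continue
        key = str(entry["word"]).strip().lower()
        if key in seen:
            problems.append(f"entry {index}: duplicate word '{key}'")
        seen.add(key)
    return problems
```

An empty result means the structural checks passed. The Urdu translations and example sentences still need a human review.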
---
## 📊 Example Workflow
### Step 1: Get Transcript
```bash
# For YouTube videos (the pip package youtube-transcript-api installs a CLI named youtube_transcript_api)
youtube_transcript_api VIDEO_ID --format text > transcript.txt
```

Copy the main prompt template above and fill in:

- [TOPIC] → “RAG and Vector Databases”
- [TRANSCRIPT] → Paste your transcript content

Paste into Copilot Chat and wait for the response.

```bash
# Save to file
cat > data/dictionary/rag-course/vocabulary.yaml << 'EOF'
[PASTE COPILOT OUTPUT HERE]
EOF
```

```bash
# Validate YAML (requires PyYAML); prints the number of entries
python -c "import yaml; print(len(yaml.safe_load(open('data/dictionary/rag-course/vocabulary.yaml', encoding='utf-8'))))"
```

Use the archetype or manual creation:

```bash
hugo new content/docs/dictionary/rag-course/index.md --kind dictionary
```

For multiple videos in a series:
I have 5 video transcripts from a course about [TOPIC]. I'll provide them one at a time.
For EACH transcript, extract 15-20 unique words (don't repeat words from previous transcripts).
Maintain consistent Urdu translation style across all outputs.
Transcript 1:
[PASTE TRANSCRIPT 1]
[Wait for response, then continue with next transcript]
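
The model may still repeat words across transcripts despite the instruction. One way to enforce uniqueness afterwards is sketched below, assuming each response has already been parsed into a list of entries that each carry a `word` key. `merge_batches` is a hypothetical helper for illustration.

```python
def merge_batches(batches):
    """Combine per-transcript entry lists, keeping the first occurrence of each word."""
    merged = []
    seen = set()
    for batch in batches:
        for entry in batch:
            # Compare case-insensitively, ignoring stray whitespace.
            key = entry["word"].strip().lower()
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return merged
```

Keeping the first occurrence means the entry from the earliest transcript wins, which matches the order in which a learner would meet the word in the course.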
After using this several times, you’ll notice:
Save your own improved prompts in this file!
**Last Updated**: 2026-05-05

**Status**: Ready to Use

**Tested With**: GitHub Copilot Chat, ChatGPT-4, Claude 3