Every chapter of a thesis or dissertation starts the same way. You have a draft, a thesis statement, a methodology. Now you need to find the papers that support it, challenge it, or situate it in the existing literature. You run searches on Google Scholar, Semantic Scholar, maybe PubMed. Open tabs. Check citation counts. Copy references. Do it again with slightly different keywords.
For a single chapter, that’s 2-4 hours of searching before you’ve written a word.
I built a workflow that runs the search automatically when a chapter draft lands in Google Drive.
What it does
Chapter PDF dropped in Drive → extracts thesis statement, methodology, theoretical framework, key concepts, and keywords → searches Semantic Scholar, PubMed, and ArXiv using those keywords → returns top 10 papers sorted by citation count → posts full research summary to Slack → logs to tracker sheet
Takes about 20-25 seconds per document.
What gets extracted from your chapter
- Title, author, institution, degree type
- Chapter title
- Thesis statement / main argument
- Key concepts and terms
- Research methodology
- Theoretical framework
- Main findings or claims
- Research questions
- Keywords used for the academic search
How the search works
The workflow uses your extracted keywords, key concepts, and theoretical framework to build a search query automatically — no manual keyword entry needed. It searches three databases simultaneously:
- Semantic Scholar: broad coverage across all disciplines
- PubMed: medical and life sciences
- ArXiv: physics, mathematics, CS, AI preprints
The search returns up to 15 papers. The workflow sorts them by citation count and keeps the top 10.
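The sort-and-trim step can be sketched in a few lines. This is only an illustration of the logic, not the code inside the workflow's Compile Research node; the `title` and `citations` field names are assumptions about what the search results look like.

```python
from typing import Dict, List


def rank_papers(papers: List[Dict], keep: int = 10) -> List[Dict]:
    """Sort papers by citation count (highest first) and keep the top `keep`."""
    # Field name "citations" is hypothetical; a missing or None count is treated as 0.
    ranked = sorted(papers, key=lambda p: p.get("citations") or 0, reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    results = [
        {"title": "Paper A", "citations": 5},
        {"title": "Paper B", "citations": None},
        {"title": "Paper C", "citations": 50},
    ]
    for paper in rank_papers(results, keep=2):
        print(paper["title"], paper["citations"])
```

Treating a missing count as 0 pushes papers without citation data to the bottom instead of raising an error during the sort.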
What lands in Slack
📚 Thesis Research Summary
Document: chapter-2-methodology.pdf
Title: Digital Labor Platforms and Worker Agency
Chapter: Chapter 2 — Theoretical Framework
---
📝 Thesis Statement:
This chapter argues that platform-mediated labor creates a
paradox of autonomy — workers experience formal independence
while facing substantive algorithmic control that mirrors
traditional employment subordination.
🔬 Methodology: Qualitative comparative analysis
📐 Framework: Labour process theory, algorithmic management
---
🔑 Key Concepts:
gig economy, algorithmic control, labor process,
platform capitalism, worker agency, precarious work
❓ Research Questions:
RQ1: How do algorithmic management systems constrain worker
decision-making on digital labor platforms?
RQ2: To what extent does platform design replicate traditional
managerial control mechanisms?
---
📖 Top Related Papers (10 found):
1. The Managed Heart (1983) - Hochschild, A. - 14,203 citations
2. Control and Resistance in the Workplace (2006) - Thompson - 3,847 citations
3. Algorithmic Management and the Future of Work (2019) - Duggan et al. - 1,204 citations
4. Platform Labor: On the Gendered and Racialised... (2018) - 891 citations
...
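For anyone who wants to adjust the message layout, here is a rough sketch of how a summary like the one above could be assembled from the extracted fields. The dictionary keys (`file_name`, `thesis_statement`, and so on) are made up for illustration and are not the workflow's actual field names.

```python
from typing import Dict, List


def format_summary(doc: Dict, papers: List[Dict]) -> str:
    """Render extracted fields and ranked papers as a plain-text Slack message."""
    lines = [
        "📚 Thesis Research Summary",
        f"Document: {doc['file_name']}",
        f"Title: {doc['title']}",
        f"Chapter: {doc['chapter']}",
        "---",
        "📝 Thesis Statement:",
        doc["thesis_statement"],
        f"🔬 Methodology: {doc['methodology']}",
        f"📐 Framework: {doc['framework']}",
        "---",
        "🔑 Key Concepts:",
        ", ".join(doc["key_concepts"]),
        "❓ Research Questions:",
    ]
    # Number the research questions RQ1, RQ2, ...
    lines += [f"RQ{i}: {q}" for i, q in enumerate(doc["research_questions"], start=1)]
    lines += ["---", f"📖 Top Related Papers ({len(papers)} found):"]
    for rank, p in enumerate(papers, start=1):
        # The ":," format spec adds thousands separators, e.g. 14203 -> 14,203.
        lines.append(
            f"{rank}. {p['title']} ({p['year']}) - {p['authors']} - {p['citations']:,} citations"
        )
    return "\n".join(lines)
```

Building the message as a list of lines keeps each section easy to reorder or drop.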
What lands in Google Sheets
Each row: Title, Chapter, Thesis Statement, Methodology, Framework, Key Concepts, Search Terms, Papers Found (count), Research Questions, Processed Date
One row per chapter or document. Track your entire thesis across all chapters in one view.
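A minimal sketch of how one tracker row could be built so the values line up with the column order listed above. The input keys and the ISO date format are assumptions for illustration; the workflow's Log Research node handles this mapping itself.

```python
from datetime import date
from typing import Dict, List, Optional

# Column order taken from the tracker sheet described above.
HEADERS = [
    "Title", "Chapter", "Thesis Statement", "Methodology", "Framework",
    "Key Concepts", "Search Terms", "Papers Found", "Research Questions",
    "Processed Date",
]


def build_row(doc: Dict, search_terms: str, papers: List[Dict],
              processed: Optional[date] = None) -> List[str]:
    """Return one sheet row as a list of strings, ordered like HEADERS."""
    processed = processed or date.today()
    values = {
        "Title": doc["title"],
        "Chapter": doc["chapter"],
        "Thesis Statement": doc["thesis_statement"],
        "Methodology": doc["methodology"],
        "Framework": doc["framework"],
        "Key Concepts": ", ".join(doc["key_concepts"]),
        "Search Terms": search_terms,
        "Papers Found": str(len(papers)),  # a count, not the papers themselves
        "Research Questions": " | ".join(doc["research_questions"]),
        "Processed Date": processed.isoformat(),
    }
    return [values[header] for header in HEADERS]
```

Keying the values by header name means a reordered sheet only needs a change to `HEADERS`.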
Setup
You’ll need:
- Google Drive (folder for thesis chapters)
- Google Sheets (free)
- n8n instance (self-hosted; uses the PDF Vector community node)
- PDF Vector account (free tier: 100 credits/month; ~6-8 credits per document)
- Slack (for research summaries)
About 15 minutes to configure.
Download
Workflow JSON:
thesis-research-assistant.json
Full workflow collection:
Setup Guide
Step 1: Get your PDF Vector API key
Sign up at pdfvector.com — free plan works for testing. Go to API Keys and generate a key.
Step 2: Create your Google Drive folder
Create a folder called “Thesis Chapters.” Copy the folder ID from the URL.
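The folder ID is the path segment that follows `/folders/` in the Drive URL. If you'd rather not copy it by hand, a small helper like this can pull it out; the function name and the sample URL are made up for illustration.

```python
import re
from typing import Optional


def drive_folder_id(url: str) -> Optional[str]:
    """Return the ID that follows '/folders/' in a Google Drive URL, or None."""
    match = re.search(r"/folders/([A-Za-z0-9_-]+)", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical URL; substitute the one from your own browser.
    print(drive_folder_id("https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing"))
```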
Step 3: Create your Google Sheet
Headers in Row 1:
Title | Chapter | Thesis Statement | Methodology | Framework | Key Concepts | Search Terms | Papers Found | Research Questions | Processed Date
Step 4: Import the workflow
Download JSON from GitHub → n8n → Import from File.
Step 5: Configure the nodes
Google Drive Trigger:
- Connect Google Drive account (OAuth2)
- Paste your folder ID
- Event: File Created
Download Document / PDF Vector - Extract Content:
- Add new PDF Vector credential (Bearer Token)
- Paste your API key
PDF Vector - Find Related Papers:
- Same credential
- Query is built automatically from extracted keywords + theoretical framework
Compile Research:
- No config needed — sorting and formatting run automatically
Log Research:
- Connect Google Sheets
- Paste your Sheet ID
Send Research Summary:
- Connect Slack
- Select your research channel
Step 6: Test it
Drop any thesis chapter or dissertation section PDF into your Drive folder. Check Slack after about 30 seconds.
Accuracy
Tested on PhD dissertation chapters and Masters thesis sections across humanities, social sciences, and STEM fields.
- Thesis statement extraction: ~88%; works well when the argument is explicitly stated, less reliable on chapters that build the argument gradually
- Key concepts and keywords: ~93%; strong on academic writing that uses consistent terminology
- Methodology and framework: ~90%; best on documents with dedicated methodology sections
- Related paper quality: depends on database coverage for your field; strongest for STEM, CS, and biomedical, weaker for niche humanities topics
Cost
Each chapter uses 6-8 PDF Vector credits for extraction plus academic search. The free tier of 100 credits per month covers roughly 12-16 chapters, enough for a full thesis draft cycle.
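The arithmetic behind that estimate is simple integer division. A quick sketch, using only the credit figures quoted above (the function name is made up):

```python
def chapters_per_month(monthly_credits: int, credits_per_chapter: int) -> int:
    """Whole chapters that fit in a monthly credit allowance."""
    return monthly_credits // credits_per_chapter


if __name__ == "__main__":
    print(chapters_per_month(100, 8))  # worst case, 8 credits each: 12
    print(chapters_per_month(100, 6))  # best case, 6 credits each: 16
```

Budgeting against the 8-credit case is the safer assumption if your chapters are long.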
Customizing it
Refine the search query:
In the PDF Vector - Find Related Papers node, the query is built from keywords + theoreticalFramework. If results are too broad, add methodology to the query string in the node's expression.
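To show what that change amounts to, here is the query-building logic sketched in Python. The node itself uses an n8n expression, so treat this as an illustration only; the function name, the space-joined format, and the duplicate filtering are assumptions, not the workflow's exact behavior.

```python
from typing import List, Optional


def build_query(keywords: List[str], theoretical_framework: str,
                methodology: Optional[str] = None) -> str:
    """Join keywords and framework (and optionally methodology) into one query."""
    parts = list(keywords) + [theoretical_framework]
    if methodology:
        # Extra term to narrow results that come back too broad.
        parts.append(methodology)
    seen = set()
    terms = []
    for part in parts:
        term = part.strip()
        # Skip blanks and case-insensitive duplicates, keeping first-seen order.
        if term and term.lower() not in seen:
            seen.add(term.lower())
            terms.append(term)
    return " ".join(terms)
```

Calling `build_query(["gig economy"], "Labour process theory", methodology="Qualitative comparative analysis")` appends the methodology as one more search term.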
Add citation generation:
After Compile Research, add a Code node that formats the top papers as APA citations using the same pattern from the academic paper finder workflow.
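I can't reproduce the academic paper finder pattern here, so this is a separate, simplified sketch of what APA-style formatting looks like. It is written in Python to show the logic; the `authors`, `year`, and `title` field names are assumptions, and it ignores journal, volume, and DOI details that a full APA reference needs.

```python
from typing import Dict


def apa_citation(paper: Dict) -> str:
    """Format a paper as a simplified APA-style reference: Authors (Year). Title."""
    authors = paper.get("authors") or []
    year = paper.get("year") or "n.d."
    title = (paper.get("title") or "").rstrip(".")
    if not authors:
        # With no author, the title moves to the author position.
        return f"{title}. ({year})."
    if len(authors) == 1:
        author_text = authors[0]
    else:
        # APA reference lists put ", & " before the final author.
        author_text = ", ".join(authors[:-1]) + ", & " + authors[-1]
    return f"{author_text} ({year}). {title}."
```

Author names are expected to arrive already in "Surname, Initial." form; converting from full names would need an extra step.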
Track across multiple thesis iterations:
Since each file upload creates a new row, you get a history of how your thesis statement and framework evolved across drafts.
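If you export the tracker sheet, a few lines are enough to line up the drafts of each chapter in order. This sketch assumes the rows are dictionaries keyed by the sheet headers and that Processed Date is stored as an ISO date string (YYYY-MM-DD), which sorts correctly as plain text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def thesis_history(rows: List[Dict[str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group rows by chapter and list (date, thesis statement) pairs, oldest first."""
    history = defaultdict(list)
    # ISO date strings sort chronologically when compared as text.
    for row in sorted(rows, key=lambda r: r["Processed Date"]):
        history[row["Chapter"]].append((row["Processed Date"], row["Thesis Statement"]))
    return dict(history)
```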
Limitations
- Requires self-hosted n8n (PDF Vector is a community node)
- Thesis statement extraction is less reliable on early-stage drafts without clear argument structure
- Academic database coverage varies by discipline
- ArXiv preprints have lower citation counts than final published versions
- Search quality depends on how well keywords represent your actual research area
Questions? Drop a comment.
