Literature reviews are the part of research nobody enjoys. You download a paper, manually pull out the authors, year, journal, DOI, copy-paste the abstract somewhere, then go hunting for related papers one by one. Multiply that by 50 papers and it’s a week of tedious work.
I built a workflow that handles all of it automatically. Drop a PDF in a Google Drive folder and it does the rest.
What it does
New paper in Drive → extracts all metadata → searches Semantic Scholar + PubMed for related papers → generates APA citation → logs to Google Sheets literature database → creates a Notion summary page
The whole thing takes about 20-30 seconds per paper.
What gets extracted
Paper metadata:
- Title (exact), all authors (full names), journal name, publication year, DOI
- Full abstract, keywords, research field
- Study type: Experimental / Observational / Review / Meta-analysis / Case Study / Qualitative / Mixed Methods / Theoretical
- Sample size, methodology summary
- Main findings (numbered list)
- Conclusions, limitations, future research suggestions
Related papers search:
Uses the paper’s top 3 keywords (or first 5 words of the title if no keywords) to query both Semantic Scholar and PubMed simultaneously. Returns up to 10 results, each with title, year, and citation count. The top 5 get logged to your Sheet.
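A rough sketch of the query rule described above, in Python for illustration only. The function names are hypothetical, and the real workflow does this inside n8n nodes; the ordering of results is assumed to be whatever the search returns.

```python
def build_search_query(keywords, title):
    """Top 3 keywords if any exist, otherwise the first 5 words of the title."""
    cleaned = [k.strip() for k in (keywords or []) if k and k.strip()]
    if cleaned:
        return " ".join(cleaned[:3])
    return " ".join(title.split()[:5])


def pick_for_sheet(results, fetched=10, logged=5):
    """Keep at most `fetched` results, then the first `logged` of those.

    Assumption: results arrive already ranked by the search provider.
    """
    return results[:fetched][:logged]


query = build_search_query(
    ["sleep", "working memory", "aging", "fMRI"],
    "Effects of sleep deprivation on working memory in adults",
)
print(query)  # sleep working memory aging
```

With an empty keyword list, the same call falls back to "Effects of sleep deprivation on".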
APA citation:
Auto-generated from extracted metadata in this format:
Author, A., & Author, B. (2024). Title of the paper. Journal Name. https://doi.org/10.xxxx
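To make the format concrete, here is a minimal Python sketch that produces a citation in that shape. It is not the workflow's actual code: the function names are made up, and it assumes each author is given as "Given names Family name", which will be wrong for some names (particles such as "van der", suffixes, single-name authors).

```python
def apa_author(full_name):
    """'Ada Marie Lovelace' -> 'Lovelace, A. M.' (assumes the family name is last)."""
    parts = full_name.split()
    if len(parts) < 2:
        return full_name.strip()
    initials = " ".join(p[0].upper() + "." for p in parts[:-1])
    return f"{parts[-1]}, {initials}"


def apa_citation(authors, year, title, journal, doi=None):
    """Build 'Author, A., & Author, B. (2024). Title. Journal. https://doi.org/...'."""
    names = [apa_author(a) for a in authors]
    if len(names) > 1:
        author_part = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        author_part = names[0] if names else "Unknown author"
    citation = f"{author_part} ({year}). {title.rstrip('.')}. {journal}."
    if doi:
        citation += f" https://doi.org/{doi}"
    return citation


print(apa_citation(["Ann Author", "Bob Author"], 2024,
                   "Title of the paper", "Journal Name", "10.1234/abcd"))
# Author, A., & Author, B. (2024). Title of the paper. Journal Name. https://doi.org/10.1234/abcd
```

Volume, issue, and page numbers are left out here because the format above does not include them, which is one more reason to check the generated citation by hand.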
What lands in Google Sheets
Each row gets: Title, Authors, Year, Journal, DOI, Research Field, Study Type, Sample Size, Keywords, Abstract, Conclusions, APA Citation, Related Papers Found (count), File Link, Added Date
Your entire literature database in one Sheet. Filterable by year, study type, research field.
What lands in Notion
Creates a new page under your chosen parent page with:
- Full citation header
- Authors, year, journal
- Truncated abstract (800 chars) + conclusions
Good for annotation — you can add your own notes directly in Notion after it’s created.
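If you want to reproduce the 800-character truncation elsewhere, something like this Python sketch would do it. Cutting at a word boundary and appending "..." are my assumptions; the workflow may simply slice the string.

```python
def truncate_abstract(abstract, limit=800):
    """Collapse whitespace and cut to roughly `limit` characters."""
    text = " ".join(abstract.split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Back off to the last space so a word is not cut in half (assumption).
    if " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip(" ,;:") + "..."


short = truncate_abstract("A short abstract.")
long = truncate_abstract("word " * 400)
print(short)       # A short abstract.
print(long[-7:])   # word...
```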
Setup
You’ll need:
- Google Drive and Sheets (free)
- Notion account (free)
- n8n instance (self-hosted; uses the PDF Vector community node)
- PDF Vector account (free tier: 100 credits/month, roughly 20-25 papers)
About 20 minutes to configure.
Download
Workflow JSON:
github.com/khanhduyvt0101/workflows
Full workflow collection:
Setup Guide
Step 1: Get your PDF Vector API key
Sign up at https://www.pdfvector.com — free plan works fine. Go to API Keys and generate a key.
Step 2: Create your Google Drive folder
Create a folder called “Research Papers” in Google Drive. Copy the folder ID from the URL (string after /folders/).
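If you'd rather not eyeball the URL, a small Python helper can pull the folder ID out. This is only a convenience sketch; the function name and the example URL are made up.

```python
import re


def drive_folder_id(url):
    """Return the string after /folders/ in a Google Drive folder URL."""
    match = re.search(r"/folders/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError("No /folders/<id> segment found in URL")
    return match.group(1)


# Hypothetical URL, for illustration only.
print(drive_folder_id("https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing"))
# 1AbC_dEf-123
```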
Step 3: Set up your Google Sheet
Create a new spreadsheet with these exact headers in Row 1:
Title | Authors | Year | Journal | DOI | Research Field | Study Type | Sample Size | Keywords | Abstract | Conclusions | Citation | Related Papers Found | File Link | Added Date
Copy the Sheet ID from the URL (long string between /d/ and /edit).
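Because the headers have to match exactly, a quick check before the first run can save some debugging. The Python sketch below is illustrative only: it assumes you paste in your Row 1 values as a list, and the helper names are hypothetical.

```python
import re
from itertools import zip_longest

EXPECTED_HEADERS = [
    "Title", "Authors", "Year", "Journal", "DOI", "Research Field",
    "Study Type", "Sample Size", "Keywords", "Abstract", "Conclusions",
    "Citation", "Related Papers Found", "File Link", "Added Date",
]


def sheet_id(url):
    """Return the string between /d/ and the next slash in a Sheets URL."""
    match = re.search(r"/d/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError("No /d/<id> segment found in URL")
    return match.group(1)


def header_problems(actual):
    """List (column number, expected, found) for every mismatched header."""
    return [
        (i, want, got)
        for i, (want, got) in enumerate(zip_longest(EXPECTED_HEADERS, actual), start=1)
        if want != got
    ]


print(sheet_id("https://docs.google.com/spreadsheets/d/1aBcD_efg-HIJ/edit#gid=0"))
# 1aBcD_efg-HIJ
print(header_problems(EXPECTED_HEADERS))  # []
```

A typo such as "Study type" in column 7 would show up as `(7, 'Study Type', 'Study type')`.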
Step 4: Set up Notion
In Notion, create a page called “Literature Review” (or whatever you want). Copy the page ID — it’s the last part of the page URL, the 32-character string after the last -.
Connect your Notion account in n8n via the Notion credential node.
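For the page ID, a Python sketch like the following extracts the trailing 32-character string from a page URL. The function name and URL are invented for the example, and it assumes the ID is a 32-digit hexadecimal string at the end of the path.

```python
import re


def notion_page_id(url):
    """Return the 32-character hex ID at the end of a Notion page URL."""
    path = url.split("?")[0].split("#")[0].rstrip("/")
    match = re.search(r"([0-9a-fA-F]{32})$", path)
    if match is None:
        raise ValueError("No 32-character page ID found at the end of the URL")
    return match.group(1)


# Hypothetical URL, for illustration only.
print(notion_page_id(
    "https://www.notion.so/Literature-Review-0123456789abcdef0123456789abcdef?pvs=4"
))
# 0123456789abcdef0123456789abcdef
```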
Step 5: Import the workflow
Download the JSON from GitHub and import into n8n via Import from File.
Step 6: Configure the nodes
Google Drive Trigger:
- Connect your Google account
- Paste your Research Papers folder ID
Download Paper:
- Same Google credential
PDF Vector - Extract Paper Info:
- Add new credential (Bearer Token type)
- Paste your API key
PDF Vector - Find Related Papers:
- Same PDF Vector credential
- Uses academic search, which queries Semantic Scholar and PubMed automatically
Add to Literature Database:
- Connect Google Sheets
- Paste your Sheet ID
- Sheet tab name should match (default “Sheet1”)
Create Notion Summary:
- Connect Notion account
- Paste your parent page ID
Step 7: Test it
Activate the workflow and drop any research paper PDF into your Drive folder. After about 30 seconds, check your Sheet: you should see a fully populated row. Then check Notion for the new summary page.
Accuracy
Tested across papers from medicine, psychology, computer science, and economics.
- Metadata extraction (title, authors, year, journal): ~98% on digital PDFs
- Abstract and conclusions: ~95%
- Study type classification: ~90%; occasionally misclassifies reviews as meta-analyses
- Keyword extraction: ~92%
- Related papers search: depends on how niche the topic is. Well-indexed fields (medicine, CS) return 10 results consistently; very niche topics may return 3-5
Scanned papers drop to about 85% on metadata accuracy.
Cost
Each paper uses about 4-5 PDF Vector credits (extraction + academic search). Free tier of 100 credits gets you roughly 20-25 papers per month.
Basic plan is $25/month for 3,000 credits if you’re doing a large review.
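The arithmetic behind those numbers, as a quick Python sketch using only the figures quoted above (4-5 credits per paper, 100 free credits, $25 for 3,000 credits):

```python
def papers_per_month(credits, credits_per_paper):
    """Whole papers that fit in a monthly credit allowance."""
    return credits // credits_per_paper


for plan, credits in [("Free", 100), ("Basic", 3000)]:
    low = papers_per_month(credits, 5)   # pessimistic: 5 credits per paper
    high = papers_per_month(credits, 4)  # optimistic: 4 credits per paper
    print(f"{plan}: {low}-{high} papers/month")
# Free: 20-25 papers/month
# Basic: 600-750 papers/month

# A 50-paper review needs 200-250 credits, so it will not fit in one free month.
print(50 * 4, 50 * 5)  # 200 250

# Cost per paper on the Basic plan, in dollars.
print(round(25 / 750, 3), round(25 / 600, 3))  # 0.033 0.042
```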
Customizing it
Change how many related papers are fetched:
In the PDF Vector - Find Related Papers node, change limit: 10 to whatever you want. The Code node currently takes the top 5 for the Notion page.
Search only one database:
In the academic search node, change providers from ["semantic-scholar", "pubmed"] to just one. Useful if your field is primarily indexed in one database.
Add email digest:
Drop a Gmail node at the end to send yourself a daily digest of papers added. Use a scheduled trigger instead of Drive trigger and loop through new Sheet rows.
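The digest logic could look roughly like this Python sketch. It is an illustration rather than n8n code: it assumes each Sheet row is a dict keyed by the header names and that Added Date starts with an ISO date (YYYY-MM-DD), which may not match how your Sheet stores dates.

```python
from datetime import date


def build_digest(rows, day):
    """Return an email body listing papers added on `day`, or None if there are none."""
    prefix = day.isoformat()
    added = [r for r in rows if str(r.get("Added Date", "")).startswith(prefix)]
    if not added:
        return None
    lines = [f"{len(added)} paper(s) added on {prefix}:", ""]
    for r in added:
        lines.append(f"- {r['Title']} ({r['Year']}), {r['Journal']}")
    return "\n".join(lines)


# Made-up rows, for illustration only.
rows = [
    {"Title": "Paper A", "Year": 2023, "Journal": "Journal X", "Added Date": "2025-01-15T09:30:00"},
    {"Title": "Paper B", "Year": 2021, "Journal": "Journal Y", "Added Date": "2025-01-14T18:00:00"},
]
print(build_digest(rows, date(2025, 1, 15)))
```

Returning None on days with no new papers lets the workflow skip sending an empty email.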
Skip Notion:
Delete the last node if you don’t use Notion. The workflow works fine without it — everything important is already in Google Sheets.
Add more study types:
Edit the studyType enum in the PDF Vector extraction node schema to add types relevant to your field.
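As an illustration of what that edit amounts to, here is a Python sketch that treats the schema as a JSON-Schema-style dict. The exact shape of the node's schema is an assumption on my part, and "Longitudinal" is just an example value; only the studyType name and the eight default types come from the workflow.

```python
import copy

DEFAULT_STUDY_TYPES = [
    "Experimental", "Observational", "Review", "Meta-analysis",
    "Case Study", "Qualitative", "Mixed Methods", "Theoretical",
]

# Assumed shape of the relevant schema fragment.
schema_fragment = {"studyType": {"type": "string", "enum": list(DEFAULT_STUDY_TYPES)}}


def with_extra_study_types(fragment, extra):
    """Return a copy of the fragment with new study types appended, skipping duplicates."""
    updated = copy.deepcopy(fragment)
    enum = updated["studyType"]["enum"]
    for study_type in extra:
        if study_type not in enum:
            enum.append(study_type)
    return updated


updated = with_extra_study_types(schema_fragment, ["Longitudinal", "Review"])
print(updated["studyType"]["enum"][-1])  # Longitudinal
print(len(updated["studyType"]["enum"]))  # 9
```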
Limitations
- Requires self-hosted n8n (PDF Vector is a community node)
- Related paper search quality depends on the paper’s keyword quality; poorly keyworded papers may return irrelevant results
- Notion page content is text-only (no tables or formatted sections in this version)
- APA citation is auto-generated and should be verified before use in formal writing
- Doesn’t handle multi-paper batch uploads; each file triggers individually
Links
- PDF Vector academic search docs: n8n Integration - PDF Vector
- Full workflow collection: github.com/khanhduyvt0101/workflows (Awesome PDF Automation Workflows, a curated collection of ready-to-use automation workflows for PDF processing and document extraction)
- n8n docs: https://docs.n8n.io
Questions? Drop a comment if something’s not working or you want to adjust it for your research workflow.
