How I stopped copying listing details into spreadsheets manually

Real estate agents and property investors deal with a lot of PDFs. Every listing comes as a sheet — address, price, beds, baths, square footage, HOA, taxes, agent contact. If you’re comparing 10-15 properties at once, someone has to open each PDF and manually enter the data before you can sort or filter anything.

For an agent managing a buyer looking at 20+ listings, that’s an hour of data entry per search batch. For an investor tracking multiple markets, it never ends.

I built a workflow that reads every listing PDF the moment it lands in Drive and adds it to a comparison database automatically.

What it does

Listing PDF dropped in Drive → extracts all property data → calculates price/sqft and estimated monthly costs → logs to property database → posts summary to Slack

About 10 seconds per listing.

What gets extracted

Location:

  • Street, city, state, zip — formatted as full address

Listing basics:

  • List price, property type, listing status

  • MLS number, days on market, list date

Property details:

  • Bedrooms, bathrooms, square footage

  • Lot size, year built

  • Parking type and spaces

Financials:

  • HOA fees and frequency

  • Annual property taxes

Auto-calculated:

  • Price per square foot

  • Estimated monthly cost — mortgage estimate + monthly taxes + HOA
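For anyone curious how the two calculated fields work, here is a minimal Python sketch. It is not the workflow's actual code: the function names are made up for illustration, the mortgage term uses the rough 0.6%-of-price monthly figure mentioned under Limitations, and the annual tax figure is an assumption (the sample alert's $572/mo times 12).

```python
from typing import Optional


def price_per_sqft(list_price: float, sqft: float) -> Optional[float]:
    """List price divided by square footage; None when sqft is missing or zero."""
    if not sqft:
        return None
    return list_price / sqft


def estimated_monthly(list_price: float, annual_taxes: float = 0.0,
                      hoa_monthly: float = 0.0, mortgage_rate: float = 0.006) -> float:
    """Rough monthly cost: mortgage estimate + monthly taxes + HOA."""
    mortgage = list_price * mortgage_rate   # rough estimate, 0.6% of price per month
    return mortgage + annual_taxes / 12 + hoa_monthly


# Figures from the sample listing; annual taxes assumed to be 572 * 12.
ppsf = price_per_sqft(685_000, 2_240)
monthly = estimated_monthly(685_000, annual_taxes=6_864, hoa_monthly=125)
print(round(ppsf))     # 306
print(round(monthly))  # 4807
```

With these inputs the sketch gives about $4,807 a month, close to but not the same as the $4,890 in the sample alert, so the real workflow presumably uses slightly different inputs or rounding.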

Agent info:

  • Agent name, phone, email, brokerage

Features:

  • Top 10 amenities and features as a list

What lands in Slack


🏡 New Property Listing

Address: 412 Oakwood Drive, Austin, TX 78704

Price: $685,000 | Type: Single Family

🏠 Details:

• Beds/Baths: 4/3

• Sq Ft: 2,240

• Price/Sq Ft: $306

• Year Built: 2018

💵 Est. Monthly: $4,890

• HOA: $125/month

• Taxes: $572/mo

✨ Features: Hardwood floors, granite counters,

stainless appliances, covered patio, 2-car garage,

master suite with walk-in closet...

🔗 View Listing
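If you want to rebuild or tweak the alert text yourself, a formatter along these lines would produce most of the message above. This is a sketch under assumptions: the dictionary keys (`address`, `price`, `sqft`, and so on) are hypothetical names, not the fields the PDF Vector node actually returns, and it only covers part of the message.

```python
def format_slack_message(listing: dict) -> str:
    """Build the plain-text alert body from one extracted listing record."""
    features = ", ".join(listing["features"][:10])  # cap at the top 10 features
    return "\n".join([
        "🏡 New Property Listing",
        f"Address: {listing['address']}",
        f"Price: ${listing['price']:,} | Type: {listing['type']}",
        "🏠 Details:",
        f"• Beds/Baths: {listing['beds']}/{listing['baths']}",
        f"• Sq Ft: {listing['sqft']:,}",
        f"• Price/Sq Ft: ${round(listing['price'] / listing['sqft']):,}",
        f"• Year Built: {listing['year_built']}",
        f"✨ Features: {features}",
    ])


# Values taken from the sample alert; key names are illustrative only.
sample = {
    "address": "412 Oakwood Drive, Austin, TX 78704",
    "price": 685000,
    "type": "Single Family",
    "beds": 4,
    "baths": 3,
    "sqft": 2240,
    "year_built": 2018,
    "features": ["Hardwood floors", "granite counters", "stainless appliances"],
}
print(format_slack_message(sample))
```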

What lands in Google Sheets

Each row: Address, Price, Type, Beds, Baths, Sq Ft, $/Sq Ft, Year Built, Lot Size, HOA, Taxes/Year, Est. Monthly, MLS #, Status, DOM, Agent, Agent Phone, Listing Link, Added Date

Every listing in the same format. Sort by $/Sq Ft to find best value. Sort by Est. Monthly to filter by affordability. Filter by DOM to find motivated sellers.
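The same sorts and filters are easy to reproduce outside Sheets once the rows are exported. A small Python sketch, assuming rows loaded as dictionaries keyed by the sheet's column headers; the three rows are made-up sample data, and the $4,500 budget and 60-day threshold are arbitrary choices for illustration.

```python
# Hypothetical sample rows; keys mirror the sheet's column headers.
rows = [
    {"Address": "Listing A", "$/Sq Ft": 306, "Est. Monthly": 4890, "DOM": 12},
    {"Address": "Listing B", "$/Sq Ft": 281, "Est. Monthly": 5200, "DOM": 74},
    {"Address": "Listing C", "$/Sq Ft": 340, "Est. Monthly": 3950, "DOM": 45},
]

# Best value first: lowest price per square foot at the top.
by_value = sorted(rows, key=lambda r: r["$/Sq Ft"])

# Affordability filter: estimated monthly cost at or under a budget.
affordable = [r for r in rows if r["Est. Monthly"] <= 4500]

# Long time on market, which may point to a motivated seller.
long_listed = [r for r in rows if r["DOM"] >= 60]

print([r["Address"] for r in by_value])      # ['Listing B', 'Listing A', 'Listing C']
print([r["Address"] for r in affordable])    # ['Listing C']
print([r["Address"] for r in long_listed])   # ['Listing B']
```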

Setup

You’ll need:

  • Google Drive (folder for listing PDFs)

  • Google Sheets (free)

  • n8n instance (self-hosted — uses PDF Vector community node)

  • PDF Vector account (free tier: 100 credits/month)

  • Slack (for listing alerts)

About 10 minutes to configure.

Download

Workflow JSON:

property-listing-extractor.json (workflows/n8n-workflows/property-listing-extractor.json at main · khanhduyvt0101/workflows · GitHub)

Full workflow collection:

khanhduyvt0101/workflows


Setup Guide

Step 1: Get your PDF Vector API key

Sign up at pdfvector.com — free plan works for testing.

Step 2: Create Drive folder and Sheet

Folder: create “Property Listings” in Drive and copy its folder ID (the last segment of the folder's URL).

Sheet headers:


Address | Price | Type | Beds | Baths | Sq Ft | $/Sq Ft | Year Built | Lot Size | HOA | Taxes/Year | Est. Monthly | MLS # | Status | DOM | Agent | Agent Phone | Listing Link | Added Date

Step 3: Import and configure

Download JSON → n8n → Import from File.

New Listing (Drive Trigger):

  • Connect Google Drive (OAuth2), paste folder ID

PDF Vector Extract:

  • Add PDF Vector credential (Bearer Token), paste API key

Log to Database:

  • Connect Google Sheets, paste Sheet ID

Send to Slack:

  • Connect Slack, select your channel

Accuracy

Tested on MLS listing sheets, broker PDFs, and Zillow/Redfin exported listing documents.

  • Address, price, beds/baths: ~97%

  • Square footage and year built: ~95%

  • HOA and property taxes: ~91% when the listing discloses them

  • Days on market: ~88% — varies by listing format

  • Features list: ~90% — reliable on standard listing sheets

Cost

3-4 credits per listing. The free tier (100 credits/month) handles roughly 25 to 33 listings per month.

Customizing it

Price drop alerts:

Add a Sheets lookup to check if the same MLS number already exists. If the new price is lower, post a separate “Price Reduced” Slack alert.
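The comparison logic behind that alert could look roughly like this in Python. It is only a sketch: `detect_price_drop` is a hypothetical helper, the rows are assumed to be dictionaries keyed by the sheet headers with numeric prices, and the MLS number in the example is invented.

```python
from typing import List, Optional


def detect_price_drop(existing_rows: List[dict], new_listing: dict) -> Optional[str]:
    """Return a 'Price Reduced' message if this MLS number was logged at a higher price."""
    previous = [r for r in existing_rows if r["MLS #"] == new_listing["MLS #"]]
    if not previous:
        return None  # first time this listing has been seen
    last_price = previous[-1]["Price"]  # most recently logged row for this MLS number
    if new_listing["Price"] >= last_price:
        return None  # unchanged or increased, no alert
    drop = last_price - new_listing["Price"]
    return (f"Price Reduced: {new_listing['Address']} "
            f"now ${new_listing['Price']:,} (down ${drop:,})")


# Invented example data.
logged = [{"MLS #": "1234567", "Address": "412 Oakwood Drive", "Price": 685000}]
incoming = {"MLS #": "1234567", "Address": "412 Oakwood Drive", "Price": 670000}
print(detect_price_drop(logged, incoming))
# Price Reduced: 412 Oakwood Drive now $670,000 (down $15,000)
```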

Filter by criteria:

Add an IF node after Process Listing — only log and notify for listings matching your criteria (e.g., price under $700K, more than 3 beds, less than 30 DOM).
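Expressed as code, the example criteria come down to one predicate. A minimal Python sketch, assuming numeric `Price`, `Beds`, and `DOM` fields; the function name and default thresholds are illustrative, taken from the example above.

```python
def matches_criteria(listing: dict, max_price: int = 700_000,
                     more_than_beds: int = 3, max_dom: int = 30) -> bool:
    """True when the listing is under the price cap, has enough beds, and is fresh."""
    return (
        listing["Price"] < max_price
        and listing["Beds"] > more_than_beds
        and listing["DOM"] < max_dom
    )


print(matches_criteria({"Price": 685000, "Beds": 4, "DOM": 12}))  # True
print(matches_criteria({"Price": 685000, "Beds": 3, "DOM": 12}))  # False
```

In n8n itself the same three comparisons would go into the IF node's conditions, combined with AND.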

Connect to a CRM:

After Sheets logging, add an HTTP Request to create a contact or deal in HubSpot, Salesforce, or any real estate CRM with an API.


Limitations

  • Requires self-hosted n8n (PDF Vector is a community node)

  • Estimated monthly cost uses a rough mortgage estimate (0.6% of list price per month), not a precise calculation

  • HOA and tax data only extracted when disclosed in the listing document

  • No deduplication: re-uploading the same listing creates a new row
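Until deduplication is wired into the workflow, duplicates can be cleaned up after the fact. One way to sketch it in Python, assuming exported rows keyed by the sheet headers: keep only the most recent row per MLS number, falling back to the address when the MLS number is missing. The helper name and sample rows are made up.

```python
from typing import List


def dedupe_by_mls(rows: List[dict]) -> List[dict]:
    """Keep the last row seen for each MLS number (or address if MLS # is blank)."""
    latest = {}
    for row in rows:
        key = row.get("MLS #") or row["Address"]
        latest[key] = row  # later rows overwrite earlier ones with the same key
    return list(latest.values())


# Invented example: the first listing was uploaded twice.
rows = [
    {"MLS #": "1234567", "Address": "412 Oakwood Drive", "Price": 685000},
    {"MLS #": "7654321", "Address": "Listing B", "Price": 540000},
    {"MLS #": "1234567", "Address": "412 Oakwood Drive", "Price": 670000},
]
clean = dedupe_by_mls(rows)
print(len(clean))          # 2
print(clean[0]["Price"])   # 670000
```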


PDF Vector n8n integration

Full workflow collection

Questions? Drop a comment.


solid real-world use case, and the accuracy numbers are useful to have. extraction on listing PDFs varies a ton, so benchmarks actually matter before you trust it on client data. adding the MLS dedup early makes sense, since the same listing shows up from multiple sources all the time in a real pipeline. that's probably the first thing i'd wire in before going live with a client.
