How to Build an AI Invoice Processing Pipeline Without Hiring Anyone
Automate invoice extraction, validation, and routing using n8n, GPT-4o, and Google Sheets -- no accountant or admin required.
Processing invoices by hand is one of the most avoidable costs a small business pays. You open a PDF, type numbers into a spreadsheet, match them against a purchase order, and email the result to someone who does the same thing again. This tutorial replaces that loop with an automated pipeline that extracts, validates, and routes invoices, and involves a human only when something looks wrong.
By the end you will have a working system that:
- Watches a Gmail inbox for incoming invoice emails
- Extracts structured data from PDF attachments using GPT-4o
- Validates the totals and supplier details
- Logs approved invoices to Google Sheets
- Flags anomalies for human review via Slack
What you need:
- An n8n account (cloud or self-hosted)
- A Google account with Gmail and Sheets access
- An OpenAI API key (GPT-4o access)
- A Slack workspace (free tier works)
Step 1: Set Up Your Gmail Trigger in n8n
Open n8n and create a new workflow. Add a Gmail Trigger node.
In the node settings:
- Set Trigger on to New Email
- Set a Filter for emails with attachments: has:attachment filename:pdf
- Set the polling interval to 5 minutes
This trigger fires every time a new PDF arrives in the inbox. If you use a dedicated invoices@yourdomain.com address, you avoid noise from unrelated emails. Forward invoices there from your main inbox using a Gmail filter rule.
Connect your Google account using OAuth when prompted. n8n will store the token securely.
Step 2: Extract the PDF Attachment
Add a Gmail node (not the trigger -- the action node) set to Get Email to pull the full message including attachments. Wire it after the trigger.
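Then add a Code node. This node picks up the base64-encoded PDF and prepares it for the AI step:
// Take the first attachment from the Get Email output.
// Adjust the path if your n8n version exposes attachments under item.binary instead.
const attachment = ($input.all()[0].json.attachments || [])[0];
if (!attachment) throw new Error("No attachment found on this email");
// The data is already base64-encoded in n8n, so it passes through unchanged
return [{ json: { base64Pdf: attachment.data, filename: attachment.filename } }];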
If there are multiple attachments, loop through them using a Split In Batches node first.
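If you would rather handle several attachments inside one Code node, the selection logic can be sketched as a plain function that keeps only PDFs and emits one item per file. The attachment shape used here (filename, mimeType, data) is an assumption for illustration, so check it against the actual output of your Gmail node.

```javascript
// Hypothetical attachment shape: { filename, mimeType, data }, where data is base64.
function pdfItems(attachments) {
  return (attachments || [])
    .filter((a) =>
      a.mimeType === "application/pdf" ||
      (a.filename || "").toLowerCase().endsWith(".pdf"))
    // One n8n-style item per PDF, matching the fields used in the next step
    .map((a) => ({ json: { base64Pdf: a.data, filename: a.filename } }));
}

// Example: two attachments, only one of them is a PDF.
const items = pdfItems([
  { filename: "invoice-001.pdf", mimeType: "application/pdf", data: "JVBERi0=" },
  { filename: "logo.png", mimeType: "image/png", data: "iVBORw0=" },
]);
console.log(items.map((i) => i.json.filename));
```

Inside a Code node you would return the result of pdfItems(...) directly, so each PDF flows through the rest of the workflow as its own item.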
Step 3: Send the PDF to GPT-4o for Extraction
Add an HTTP Request node. Configure it as follows:
- Method: POST
- URL: https://api.openai.com/v1/chat/completions
- Authentication: Header Auth, key Authorization, value Bearer YOUR_OPENAI_API_KEY
- Body (JSON):
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Extract the following fields from this invoice PDF and return valid JSON only: supplier_name, invoice_number, invoice_date, due_date, line_items (array of {description, quantity, unit_price, total}), subtotal, tax, total_amount, currency. If a field is missing, use null."
        },
        {
          "type": "file",
          "file": {
            "filename": "{{ $json.filename }}",
            "file_data": "data:application/pdf;base64,{{ $json.base64Pdf }}"
          }
        }
      ]
    }
  ],
  "response_format": { "type": "json_object" }
}
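GPT-4o can read both scanned and typeset PDFs, though extraction is typically less reliable on low-quality scans, which is why the validation in Step 4 matters. The response_format flag makes the model return syntactically valid JSON with no markdown wrapping, which saves you a parsing step. It does not guarantee that every requested field is present.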
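Because the model can still omit a field or return nothing usable, a defensive parse is worth considering. The following is a minimal sketch, not part of the original workflow: parseExtraction and EXPECTED_FIELDS are hypothetical names, and the field list simply mirrors the prompt above.

```javascript
// Field names copied from the extraction prompt.
const EXPECTED_FIELDS = [
  "supplier_name", "invoice_number", "invoice_date", "due_date",
  "line_items", "subtotal", "tax", "total_amount", "currency",
];

// Takes the raw Chat Completions response body and returns an object
// that always has every expected field (null when the model omitted it).
function parseExtraction(response) {
  const content = response?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("OpenAI response contained no message content");
  }
  const parsed = JSON.parse(content); // throws if the content is not valid JSON
  const result = {};
  for (const field of EXPECTED_FIELDS) {
    result[field] = parsed[field] ?? null;
  }
  return result;
}

// Example with a hand-written response that is missing most fields.
const sampleResponse = {
  choices: [{ message: { content: '{"supplier_name":"Acme Ltd","total_amount":120}' } }],
};
console.log(parseExtraction(sampleResponse));
```

Normalising missing fields to null up front means the validation step can rely on simple presence checks.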
Step 4: Parse and Validate the Extracted Data
Add a Code node to parse the AI response and run basic validation:
const raw = JSON.parse($input.all()[0].json.choices[0].message.content);
const errors = [];

// Required fields
if (!raw.supplier_name) errors.push("Missing supplier name");
if (!raw.invoice_number) errors.push("Missing invoice number");
if (!raw.total_amount) errors.push("Missing total");

// Validate that line items sum to subtotal (within 1 cent tolerance)
if (Array.isArray(raw.line_items) && raw.subtotal != null) {
  const computed = raw.line_items.reduce((sum, item) => sum + (Number(item.total) || 0), 0);
  if (Math.abs(computed - raw.subtotal) > 0.01) {
    errors.push(`Line item sum ${computed} does not match subtotal ${raw.subtotal}`);
  }
}

return [{
  json: {
    ...raw,
    // The HTTP Request output no longer carries the filename, so read it from Step 2.
    // Assumes that node kept its default name "Code"; change it if you renamed the node.
    filename: $('Code').first().json.filename,
    validationErrors: errors,
    isValid: errors.length === 0
  }
}];
This catches the most common data quality problems: missing required fields and arithmetic mismatches. You can extend the logic to check against an approved supplier list stored in Google Sheets or Airtable.
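As one way to extend the checks, the sketch below adds an approved-supplier lookup and a subtotal-plus-tax check against the total. It assumes the supplier list has already been loaded into an array of names (for example from a Sheets lookup node); extraChecks is a hypothetical helper, not something n8n provides.

```javascript
// Returns a list of extra validation errors for one extracted invoice.
function extraChecks(invoice, approvedSuppliers) {
  const errors = [];
  const normalise = (s) => String(s || "").trim().toLowerCase();

  // Case-insensitive match against the approved supplier names
  const approved = new Set(approvedSuppliers.map(normalise));
  if (!approved.has(normalise(invoice.supplier_name))) {
    errors.push(`Supplier "${invoice.supplier_name}" is not on the approved list`);
  }

  // Compare in whole cents to avoid floating-point noise (1 cent tolerance)
  if (invoice.subtotal != null && invoice.total_amount != null) {
    const expected = invoice.subtotal + (invoice.tax || 0);
    const diffCents = Math.abs(
      Math.round(expected * 100) - Math.round(invoice.total_amount * 100));
    if (diffCents > 1) {
      errors.push(
        `Subtotal plus tax ${expected.toFixed(2)} does not match total ${invoice.total_amount}`);
    }
  }
  return errors;
}

// Example: unknown supplier and a total that is 5.00 too high.
console.log(extraChecks(
  { supplier_name: "Unknown Co", subtotal: 100, tax: 20, total_amount: 125 },
  ["Acme Ltd"],
));
```

The returned messages can be pushed onto the same errors array used above, so they reach Slack with the other validation errors.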
Step 5: Route Based on Validation Result
Add an IF node:
- Condition: {{ $json.isValid }} equals true
The true branch goes to the Google Sheets logging step. The false branch goes to a Slack alert.
Step 6: Log Approved Invoices to Google Sheets
On the true branch, add a Google Sheets node set to Append Row.
Map the columns:
| Sheet Column | Value |
|---|---|
| Supplier | {{ $json.supplier_name }} |
| Invoice # | {{ $json.invoice_number }} |
| Date | {{ $json.invoice_date }} |
| Due Date | {{ $json.due_date }} |
| Total | {{ $json.total_amount }} |
| Currency | {{ $json.currency }} |
| Status | Pending Approval |
| Logged At | {{ $now }} |
You now have a running ledger that updates automatically every time a new invoice comes in.
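If you prefer to shape the row in a Code node before the Sheets node, the mapping in the table can be sketched as a small function. The column names come from the table above; toSheetRow is a hypothetical helper name.

```javascript
// Builds one row object keyed by the sheet's column headers.
function toSheetRow(invoice, loggedAt = new Date()) {
  return {
    "Supplier": invoice.supplier_name,
    "Invoice #": invoice.invoice_number,
    "Date": invoice.invoice_date,
    "Due Date": invoice.due_date,
    "Total": invoice.total_amount,
    "Currency": invoice.currency,
    "Status": "Pending Approval",
    "Logged At": loggedAt.toISOString(), // ISO timestamps sort correctly as text
  };
}

// Example with illustrative values.
console.log(toSheetRow({
  supplier_name: "Acme Ltd",
  invoice_number: "INV-001",
  invoice_date: "2026-01-10",
  due_date: "2026-02-09",
  total_amount: 120,
  currency: "EUR",
}));
```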
Step 7: Flag Invalid Invoices in Slack
On the false branch, add a Slack node set to post to a #invoice-exceptions channel:
*Invoice Exception: {{ $json.supplier_name || "Unknown Supplier" }}*
File: {{ $json.filename }}
Errors:
- {{ $json.validationErrors.join("\n- ") }}
Review the original email and correct the record manually.
This keeps humans in the loop only for edge cases. The Slack message includes enough context to act without going back to the email.
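The same message can also be assembled in a Code node, which makes the formatting easier to test. This is a sketch under the assumption that the item carries supplier_name, filename, and validationErrors as produced in Step 4; formatException is a hypothetical name.

```javascript
// Builds the Slack message text for one failed invoice.
function formatException(invoice) {
  const supplier = invoice.supplier_name || "Unknown Supplier";
  // Prefix every error with "- " so each one renders as its own bullet
  const errors = (invoice.validationErrors || []).map((e) => `- ${e}`).join("\n");
  return [
    `*Invoice Exception: ${supplier}*`,
    `File: ${invoice.filename || "unknown"}`,
    "Errors:",
    errors,
    "Review the original email and correct the record manually.",
  ].join("\n");
}

// Example: an invoice with no supplier and no total.
console.log(formatException({
  supplier_name: null,
  filename: "inv.pdf",
  validationErrors: ["Missing supplier name", "Missing total"],
}));
```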
Step 8: Activate and Test
Before activating:
- Send yourself a test invoice PDF by email
- Run the workflow manually using the Test Workflow button
- Confirm the Google Sheet row appears and all fields are populated
- Test a broken invoice (delete a field from a copy) to verify the Slack alert fires
Once both paths work correctly, toggle the workflow to Active. n8n will now poll Gmail every five minutes indefinitely.
Step 9: Optional Additions
Duplicate detection: Before appending to Sheets, query the existing rows and check if the invoice number already exists. If it does, post a Slack warning instead of creating a duplicate.
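A minimal sketch of that check, assuming the existing rows have been read from the sheet as objects keyed by the column headers from Step 6. As a small variation on the text, it matches on supplier plus invoice number, since two suppliers can legitimately use the same number; isDuplicate is a hypothetical helper.

```javascript
// Returns true when the sheet already holds this supplier/invoice-number pair.
function isDuplicate(invoice, existingRows) {
  const key = (supplier, number) =>
    `${String(supplier || "").trim().toLowerCase()}|${String(number || "").trim().toLowerCase()}`;
  const seen = new Set(existingRows.map((r) => key(r["Supplier"], r["Invoice #"])));
  return seen.has(key(invoice.supplier_name, invoice.invoice_number));
}

// Example rows as they might come back from the sheet.
const rows = [{ "Supplier": "Acme Ltd", "Invoice #": "INV-001" }];
console.log(isDuplicate({ supplier_name: "acme ltd", invoice_number: "INV-001" }, rows));
console.log(isDuplicate({ supplier_name: "Other Co", invoice_number: "INV-001" }, rows));
```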
Currency normalisation: Add a call to an exchange rate API to convert all totals to your base currency before logging. This matters if you pay international suppliers.
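The conversion itself is simple arithmetic once you have the rates. The sketch below assumes a rates object that maps each currency code to the number of base-currency units per one unit of that currency; the rate shown is a placeholder, not real market data, and toBaseCurrency is a hypothetical helper.

```javascript
// Converts an amount into the base currency, rounded to whole cents.
function toBaseCurrency(amount, currency, rates, base = "USD") {
  if (currency === base) return amount;
  const rate = rates[currency];
  if (typeof rate !== "number" || rate <= 0) {
    // Fail loudly so the invoice can be routed to review instead of logged wrongly
    throw new Error(`No exchange rate for ${currency}`);
  }
  return Math.round(amount * rate * 100) / 100;
}

// Placeholder rate: 1 EUR = 1.25 USD (illustrative only).
const exampleRates = { EUR: 1.25 };
console.log(toBaseCurrency(100, "EUR", exampleRates));
```

Logging both the original and the converted amount keeps the sheet auditable when rates change.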
Approval workflow: Replace the Pending Approval status with a two-button Slack message using Block Kit. When the approver clicks Approve, a webhook updates the Sheet row and triggers your payment tool.
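As a rough illustration, the Block Kit payload for such a message could be built like this. The action_id values and the choice to carry the invoice number in value are assumptions; your webhook handler would need to match whatever you pick.

```javascript
// Builds a Block Kit "blocks" array with a summary line and two buttons.
function approvalBlocks(invoice) {
  const summary =
    `*${invoice.supplier_name}* invoice ${invoice.invoice_number}: ` +
    `${invoice.total_amount} ${invoice.currency}`;
  // Shared button shape; value is what your webhook receives on click
  const button = (label, style, actionId) => ({
    type: "button",
    text: { type: "plain_text", text: label },
    style,
    action_id: actionId, // hypothetical identifiers
    value: String(invoice.invoice_number),
  });
  return [
    { type: "section", text: { type: "mrkdwn", text: summary } },
    {
      type: "actions",
      elements: [
        button("Approve", "primary", "invoice_approve"),
        button("Reject", "danger", "invoice_reject"),
      ],
    },
  ];
}

console.log(JSON.stringify(approvalBlocks({
  supplier_name: "Acme Ltd", invoice_number: "INV-001", total_amount: 120, currency: "EUR",
}), null, 2));
```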
Archive to Google Drive: After logging, upload the original PDF to a structured Drive folder (/Invoices/2026/Supplier Name/) so you have a searchable archive without touching your inbox.
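The folder path can be derived from the extracted fields. This sketch assumes invoice_date starts with a four-digit year (for example 2026-03-04) and falls back to the current year otherwise; archivePath is a hypothetical helper.

```javascript
// Builds the Drive folder path for one invoice.
function archivePath(invoice) {
  const date = invoice.invoice_date || "";
  const year = /^\d{4}/.test(date)
    ? date.slice(0, 4)
    : String(new Date().getFullYear());
  // Slashes in a supplier name would create unintended subfolders
  const supplier = (invoice.supplier_name || "Unknown Supplier")
    .replace(/[\/\\]/g, "-")
    .trim();
  return `/Invoices/${year}/${supplier}/`;
}

console.log(archivePath({ invoice_date: "2026-03-04", supplier_name: "Acme/West" }));
```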
What This Replaces
A small business can easily spend two to four hours a week on manual invoice processing. This pipeline cuts that to the time spent on exceptions. The only time a human acts is when validation flags a problem -- and even then, the Slack message tells them exactly what failed.
The components (n8n, GPT-4o, Google Sheets, Slack) are all available on free or low-cost tiers. At moderate invoice volumes you are looking at a few dollars a month in OpenAI API costs.
Build this once and it runs until you change your process. No recurring labour cost, no training new staff, no chasing approvals over email.