Who wants to be my HERO ??
- Mostly I would like explanation to learn, so if someone could guide me it would be wonderful.
I retrieve a PDF from my bank every month, and I would like to take the text from the PDF and extract the table from it.
- I’m not sure Text Manipulation is the best to do that
Todo
- remove the garbage before the first
\nValue date:
- and after the last transaction
- dividing transaction per line (again, I focus on
\nValue date:
)
- pushing line by line in Google Sheet
data sample {{ $json.text }} received by Text Manipulation
Holder: Last Name Middle Name Name\nAccount: ES00 0000 0000 0000 0000 0000\nAvailable balance: 0.000,00€ (on 02 Oct. 2023)\nDocument on: 02 Oct. 2023 Page 1 of 5\nYour account activity (05 Jul. 2023–02 Oct. 2023)\nTransaction date Transaction Amount Balance\n02 Oct. 2023\nValue date: 02 Oct. 2023\nRecibo Clientes, S.A.U Nº Recibo 0000 0000 000\nBbjmqbh Ref. Mandato 000000000000, De\n-00,00€ 0.000,00€\n02 Oct. 2023\nValue date: 02 Oct. 2023\nTransferencia A Favor De Name LastName Concepto: Name\nLastName / Comercial xxxx Con xxxxxxx\n-00,00€ 0.000,00€\n02 Oct. 2023\nValue date: 02 Oct. 2023\nTraspaso: Personal Expenses -00,00€ 0.000,00€\n02 Oct. 2023\nValue date: 02 Oct. 2023\nTraspaso: Personal Expense -00,00€ 0.000,00€\n02 Oct. 2023
output I would like
- “26 Sep. 2023
Value date: 26 Sep. 2023”,“Transferencia De La Base Culture, Maison Des Francophonies, Concepto Remboursement Flam.”,000.00€,“0,000.00€”
- where
"Date","Transaction","Amount","Balance"
because I wish pushing line by line in a Google Sheet (but maybe I’m wrong in my logic)
My WiP Workflow
barn4k
3
hi @JOduMonT
I do not have google account, but suppose that should work (if the sample is correct)
example output
1 Like
Pretty close, but not perfect 
Obviously, I forgot to mention that the PDF is multipage, in that case 5
each page has a header and a footer which should be thrown in the garbage.
Like in the header, every line who starts with
- Holder:
- Account:
- Available balance:
- Document on:
- Your account activity (
- Transaction date Transaction Amount Balance
I say start with because I do something like this in my VS-Codium and they are new line

I replace \n by a return carriage
Also for an unknown reason
when the transaction started with Bizum or Traspaso: they are on one line while order are multilines
here a cut and paste directly from the PDF
25 Sep. 2023
Value date: 24 Sep. 2023
Bizum De xxx xxxxx xxxxxxx Concepto Cumpleaños 00,00€ 00.000,08€
25 Sep. 2023
Value date: 24 Sep. 2023
Transaccion Contactless En Fun Jump Sl, Massanassa, Tarj.
:*000000
-000,00€ 00.015,08€
25 Sep. 2023
Value date: 25 Sep. 2023
Traspaso: Family: Lunch At Artysana -00,40€ 00.003,08€
25 Sep. 2023
Value date: 25 Sep. 2023
Traspaso: Grocery: Consum -00,51€ 00.002,48€
In any case; I see I have to learn javascript, do you have any good resources about manipulating data with javascript ?
barn4k
5
@JOduMonT
try this one (it is more complex as it uses regex, but maybe it will act better)
In general, that request is not about javascript itself, it is rather about data manipulation and working with strings.
Some useful resources:
1 Like
I greatly appreciate your help and those link.
I’ll try it soon and come back to you 