Noob searching for help with data manipulation / transformation

Ghost · August 27, 2022, 3:09pm

Describe the issue/error/question

So I am fairly new to n8n and I have a few things I’d like to automate / improve using it. One of them is a list of links that another system I have compiles and publishes on a portal… I got to the point where I download that file with an HTTP Request node, but the file is made up of lines, some of which contain things like “SSG01…” or “SPT03”, and the line right below each of those is the link to the media.

I would like to pick out those 2 lines from among the others that don’t really interest me. It’s always like that… something along the lines of:

#SSG01 - Drifting lines in Suzuka
https://somewebsite.in.internet/link.mkv

and so on…

Any pointers on how I can achieve this, and with which node exactly?
After extracting this, I want to write it to a file (a txt for instance) and finally publish it to a service.
I can manage those last steps (I think…); what I can’t manage is this data filtering.

Thanks in advance!

Information on my n8n setup

n8n version: 0.192.2
Running n8n via Docker

BillAlex · August 28, 2022, 7:20pm

Hi @Ghost and welcome to our community!

If I understand you correctly, you get a list according to the structure: title, URL, title, URL, …

There is no node that does this directly, but with JavaScript this should not be a problem.

Depending on what exactly is needed and how the text file is structured, I would simply split the text into an array of lines with .split('\n'), keep only the lines that start with 'http' using .filter(e => e.indexOf('http') === 0), and then map the result into the item format n8n expects.

Now you have a list of all links, but not the titles. If the structure is always identical (title, link, title, link, …), this is no problem either: the title is always the line right before its link, so you can fetch it by index. But if the structure is title, link, link, title, link, title, link, link, link … you can’t assign the title unambiguously, unless you can tell us the algorithm for it.
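
The split / filter / map steps above can be sketched roughly as follows. This is only a minimal sketch in plain JavaScript: the names sampleText, extractLinks and toItems are made up for illustration, the third sample line is invented, and how the downloaded text reaches the code inside n8n depends on your workflow.

```javascript
// Hypothetical sample; in a real workflow the text comes from the HTTP Request node.
const sampleText = [
  '#SSG01 - Drifting lines in Suzuka',
  'https://somewebsite.in.internet/link.mkv',
  'some unrelated line',
].join('\n');

// Split the text into lines and keep only those that start with "http".
function extractLinks(text) {
  return text
    .split(/\r?\n/) // handles both \n and \r\n line endings
    .map((line) => line.trim())
    .filter((line) => line.startsWith('http'));
}

// n8n passes data between nodes as an array of objects with a `json` key.
function toItems(links) {
  return links.map((url) => ({ json: { url } }));
}

const links = extractLinks(sampleText);
const items = toItems(links);
console.log(items);
```

Inside a Function node, the last step would be returning the mapped array instead of logging it.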

Ghost · August 29, 2022, 8:05pm

Alright, I see the way! Thanks for your help

Ghost · September 4, 2022, 8:10pm

I am dumb… officially!

So, that workflow you shared works, but I only get the http line. I’d like to keep the #SSG line as well.
Some lines are #SSG, others #TLK and others #MPP. I want to target the #SSG ones and extract each of them plus the http line after it.
Is this possible?

Thanks once again for your help!

BillAlex · September 5, 2022, 3:42pm

I hope the new workflow will help you.

If not, real data (with fake URLs or something) would help a lot.
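
For reference, one possible way to keep each #SSG line together with the link below it is sketched here. It is not necessarily what the shared workflow does. The function name extractTagged, the tag parameter and every sample line except the Suzuka one are made up for illustration, and the sketch assumes the link always sits on the line directly after its header.

```javascript
// Hypothetical sample list mixing #SSG and #TLK entries.
const sampleList = [
  '#SSG01 - Drifting lines in Suzuka',
  'https://somewebsite.in.internet/link.mkv',
  '#TLK02 - Made-up talk entry',
  'https://somewebsite.in.internet/talk.mkv',
  '#SSG02 - Made-up second entry',
  'https://somewebsite.in.internet/second.mkv',
].join('\n');

// Return { title, url } pairs for every header that starts with `tag`
// and is immediately followed by a line starting with "http".
function extractTagged(text, tag) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  const pairs = [];
  for (let i = 0; i < lines.length - 1; i++) {
    if (lines[i].startsWith(tag) && lines[i + 1].startsWith('http')) {
      pairs.push({ title: lines[i], url: lines[i + 1] });
    }
  }
  return pairs;
}

const ssgPairs = extractTagged(sampleList, '#SSG');
console.log(ssgPairs);
```

Headers with other prefixes (#TLK, #MPP) are skipped along with their links, because a link is only kept when the line before it matches the tag.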

Ghost · September 5, 2022, 8:47pm

The data you’ve placed in the first node perfectly matches the kind of list I’m working with.
The output comes in a table… My objective is to filter / capture only those “headers” and “urls” dropping everything else.
The list should come out in the same format (basically a text file, with the lines separated again).

I tried to fiddle with the javascript but I must confess I don’t know where to start

BillAlex · September 5, 2022, 9:34pm

I hope that meets your requirements
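
As a rough idea of how the filtered lines can come out as plain text again, here is a minimal sketch. The function name filterList, its tag parameter and the sample lines other than the Suzuka one are hypothetical, and turning the resulting string into an actual file is left to whichever node follows in the workflow.

```javascript
// Hypothetical input in the same shape as the published list.
const rawList = [
  'Some intro line nobody needs',
  '#SSG01 - Drifting lines in Suzuka',
  'https://somewebsite.in.internet/link.mkv',
  '',
  '#MPP03 - Made-up entry to drop',
  'https://somewebsite.in.internet/drop.mkv',
  '',
  '#SSG02 - Made-up second entry',
  'https://somewebsite.in.internet/second.mkv',
].join('\n');

// Keep each header starting with `tag` plus the link right below it,
// drop everything else, and join the kept lines back into one text.
function filterList(text, tag) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  const kept = [];
  lines.forEach((line, i) => {
    const next = lines[i + 1] || '';
    if (line.startsWith(tag) && next.startsWith('http')) {
      kept.push(line, next);
    }
  });
  return kept.join('\n');
}

const filteredText = filterList(rawList, '#SSG');
console.log(filteredText);
```

Joining with '\n' gives one header or link per line; use '\n\n' between pairs instead if you want a blank line after each link, as in the original list.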

If you want to learn JavaScript, I would recommend this site: JavaScript Tutorial. For n8n I very often need array methods. These are explained well and comprehensively by Mozilla: Array - JavaScript | MDN

Ghost · September 6, 2022, 5:22pm

Thank you, thank you, thank you!!!
To be honest I tried to learn and change stuff around, but I’m an ape when it comes to programming.
Thanks for your help! Absolute Legend!