I am pulling data from html using the extract html node. here is an example of the html I am pulling
I find the selector and set the key, then set it to text. at the bottom I select clean up text
here is what the output gives me:
The node is removing spaces between words…it happens about every 10 words or so.
I thought maybe there were line breaks in there I couldn’t see and it was removing them for some reason but not putting a space to replace them, but here is the raw input, no line breaks or anything of that nature
Whether to remove leading and trailing whitespaces, line breaks (newlines) and condense multiple consecutive whitespaces into a single space
If you look into the actual HTML code (without having “Clean Up Text” on) you will see that there is a “new line” in the text, \n. Removing that character produces two words connected together