Node postlight-parser

It would help if there was a node for:

Parsing scraped web content into a more readable format.

My use case:

My use case is scraping job postings. Paths, ids, classes, etc. are not reliable enough to select on, so it would be nice to grab the whole page, parse it, and use the result further down in the workflow.
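
A rough sketch of the idea, not a finished node. It assumes the parser is passed in as a function with the same shape as `Parser.parse(url, options)` from `@postlight/parser`, and that the result carries `title`, `content` and `word_count` fields. The helper name `toJobItem` and the output field names are made up for illustration.

```javascript
// Sketch only. `parse` is injected so the real library call can be swapped in,
// e.g. (url, opts) => require('@postlight/parser').parse(url, opts).
// `toJobItem` and its output field names are hypothetical.
async function toJobItem(url, html, parse) {
  // Passing `html` reuses a page that was already scraped earlier in the
  // workflow; contentType 'text' asks for plain text instead of cleaned HTML.
  const result = await parse(url, { html, contentType: 'text' });

  // Flatten the parser result into one item for the following steps.
  return {
    url,
    title: result.title || null,
    text: (result.content || '').trim(),
    wordCount: result.word_count ?? 0,
  };
}
```

The following steps could then filter or store items on `title` and `text` without depending on page-specific selectors.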

Any resources to support this?

Are you willing to work on this?

I would like to, but I am not experienced enough with JavaScript.