HTML Extract: getting divs into an array?

I’m working on a parsing workflow for Tesla SEC filings from their investor relations website.

So far I am able to pull out the block of HTML that I’m looking to iterate on. I have an HTTP Request node and an HTML Extract node that pull the HTML snippet below into a JSON field called “filings”.

Is there an easy way to get this into an array of items using out-of-the-box nodes?

As you can see, the quarterly-disclosures__quarterly-disclosure quarterly-disclosure div is repeated multiple times.

I am looking to pull out the following items from each one:
Time
Title
Description
Accession Number
PDF Link
HTML Link

<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-22T12:00:00Z">Feb 22, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000177136422000003/xslF345X03/edgardoc.xml"> 4 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Statement of changes in beneficial ownership of securities</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001771364-22-000003</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000177136422000003/xslF345X03/edgardoc.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000177136422000003/000177136422000003-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-14T12:00:00Z">Feb 14, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000119312522042366/d255338dsc13ga.htm"> SC 13G/A </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">An amendment to the SC 13G filing</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001193125-22-042366</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000119312522042366/d255338dsc13ga.htm">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000119312522042366/d255338dsc13ga-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-14T12:00:00Z">Feb 14, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000089924322006240/xslF345X03/doc5.xml"> 5 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">5</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0000899243-22-006240</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000089924322006240/xslF345X03/doc5.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000089924322006240/000089924322006240-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-14T12:00:00Z">Feb 14, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000110465922022369/tm225754d13_sc13ga.htm"> SC 13G/A </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">An amendment to the SC 13G filing</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001104659-22-022369</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000110465922022369/tm225754d13_sc13ga.htm">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000110465922022369/tm225754d13_sc13ga-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-10T12:00:00Z">Feb 10, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000110465922018700/tv02024-teslainc.htm"> SC 13G/A </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">An amendment to the SC 13G filing</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001104659-22-018700</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000110465922018700/tv02024-teslainc.htm">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000110465922018700/tv02024-teslainc-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-08T12:00:00Z">Feb 08, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000083423722008340/us88160r1014_020822.txt"> SC 13G </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">A statement of beneficial ownership of common stock by certain persons</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0000834237-22-008340</div>
   <ul>
      <li><a href="/_flysystem/s3/sec/000083423722008340/us88160r1014_020822-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-04T12:00:00Z">Feb 04, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000095017022000796/tsla-20211231.htm"> 10-K </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Annual report which provides a comprehensive overview of the company for the past year</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0000950170-22-000796</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000095017022000796/tsla-20211231.htm">HTML</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-02-03T12:00:00Z">Feb 03, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000177134022000001/xslF345X03/edgardoc.xml"> 4 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Statement of changes in beneficial ownership of securities</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001771340-22-000001</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000177134022000001/xslF345X03/edgardoc.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000177134022000001/000177134022000001-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-01-31T12:00:00Z">Jan 31, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000179056522000001/xslF345X03/edgardoc.xml"> 4 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Statement of changes in beneficial ownership of securities</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001790565-22-000001</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000179056522000001/xslF345X03/edgardoc.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000179056522000001/000179056522000001-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-01-26T12:00:00Z">Jan 26, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000156459022002476/tsla-8k_20220126.htm"> 8-K </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Report of unscheduled material events or corporate event</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001564590-22-002476</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000156459022002476/tsla-8k_20220126.htm">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000156459022002476/tsla-8k_20220126-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-01-20T12:00:00Z">Jan 20, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000177136422000002/xslF345X03/edgardoc.xml"> 4 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Statement of changes in beneficial ownership of securities</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0001771364-22-000002</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000177136422000002/xslF345X03/edgardoc.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000177136422000002/000177136422000002-gen.pdf">PDF</a></li>
   </ul>
</div>
<div class="quarterly-disclosures__quarterly-disclosure quarterly-disclosure">
   <div class="quarterly-disclosure__date tds-text--caption"><time datetime="2022-01-07T12:00:00Z">Jan 07, 2022</time> </div>
   <div class="quarterly-disclosure__title tds-text--h4">
      <svg class="tcl-table__icon">
         <use xlink:href="#tds-icon-pdf"></use>
      </svg>
      <a href="https://www.sec.gov/Archives/edgar/data/1318605/000089924322001396/xslF345X03/doc4.xml"> 4 </a> 
   </div>
   <div class="quarterly-disclosure__description tds-text--caption">Statement of changes in beneficial ownership of securities</div>
   <div class="quarterly-disclosure__accession-number tds-text--caption">Acc-no: 0000899243-22-001396</div>
   <ul>
      <li><a href="https://www.sec.gov/Archives/edgar/data/1318605/000089924322001396/xslF345X03/doc4.xml">HTML</a></li>
      <li><a href="/_flysystem/s3/sec/000089924322001396/000089924322001396-gen.pdf">PDF</a></li>
   </ul>
</div>

Hey @nin9creative, you’d need to find suitable CSS selectors for each of the elements you’re interested in. You can then use these selectors in the HTML Extract node to extract an array for each of the values you need.
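As a rough sketch of what such selectors might look like, here is a mapping derived from the class names in the snippet above. The field names (`date`, `title`, and so on) are made up for illustration, and the selectors have only been read off the posted HTML, not tested against the live page. The PDF and HTML links are left out on purpose: some filings in the snippet have only one of the two, so an array of links would not line up one-to-one with the other arrays.

```javascript
// Hypothetical field names mapped to CSS selectors taken from the class
// names in the posted snippet. Each selector is meant to match exactly
// one element per filing div.
const selectors = {
  // <time datetime="..."> inside the date div
  date: ".quarterly-disclosure__date time",
  // the link inside the title div (its text is the form type, e.g. "10-K")
  title: ".quarterly-disclosure__title a",
  // plain-text description div
  description: ".quarterly-disclosure__description",
  // div containing "Acc-no: ..."
  accessionNumber: ".quarterly-disclosure__accession-number",
};

console.log(Object.keys(selectors));
```

Each selector would go into its own extraction value in the HTML Extract node, configured to return an array.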

In the last step, you’d then transform these arrays into proper n8n items, for example using the Function node. I’ve put together a quick example for the date and text which can easily be extended to suit your needs:

Function Example Workflow
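A minimal sketch of the transformation step, assuming the HTML Extract node has produced one array per field and that all arrays have the same length (one entry per filing). The helper name `zipToItems` and the field names are hypothetical; the only n8n convention relied on is that a Function node returns an array of objects with a `json` property.

```javascript
// Turns parallel arrays, e.g. { date: [...], title: [...] }, into one
// n8n-style item per index: [{ json: { date, title } }, ...].
// Pairing by index only works if every array has one entry per filing,
// so a length mismatch is treated as an error instead of being padded.
function zipToItems(extracted) {
  const keys = Object.keys(extracted);
  const lengths = keys.map((key) => extracted[key].length);
  const count = lengths.length > 0 ? lengths[0] : 0;

  if (lengths.some((n) => n !== count)) {
    throw new Error("Extracted arrays differ in length: " + lengths.join(", "));
  }

  const result = [];
  for (let i = 0; i < count; i++) {
    const json = {};
    for (const key of keys) {
      json[key] = extracted[key][i];
    }
    result.push({ json });
  }
  return result;
}

// Sample values copied from the first two filings in the snippet.
const sample = {
  date: ["2022-02-22T12:00:00Z", "2022-02-14T12:00:00Z"],
  title: ["4", "SC 13G/A"],
};
console.log(JSON.stringify(zipToItems(sample)));

// Inside a Function node this might be used as (assuming the arrays
// sit on the first incoming item):
// return zipToItems(items[0].json);
```

Throwing on a length mismatch is a deliberate choice here: if one filing lacks a field, padding the shorter array would silently attach values to the wrong filing.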

If you want to avoid using the Function node, you could alternatively use the Item Lists node to split up the extracted arrays and the Merge node to merge them into n8n items. That’s a bit convoluted, though, as you would need one additional Item Lists node and one additional Merge node for every single array.

No-Code Example Workflow

The result of both workflows would be a set of items, one per filing, containing the data you previously extracted.

Wow thank you! That was easier than I had imagined.

This platform is super powerful - thanks for the tips. Looking forward to gaining expertise :slight_smile: