The output will be {"result":"success"} or error. However, I’m still confused about how iteration works without the crawler piece. Where is the increment value actually stored? I can’t seem to find an example that can console.log the iterated values.
Where is the increment value actually stored?
It is stored in count. However, the problem is not where it is stored but how. I told you so many times that the logic you apply does not sound right to me. Here’s my quote:
The problem is rather with your logic of processing the crawling. Nowhere in that logic I see the condition to increment the count during item iteration. It increments only once during initialization and then it is set to “complete”.
– Iterate on each item - #16 by ihortom
All the workflows I shared with you were addressing the iterations only (the subject of this post). I have not touched your counting logic but rather was pointing out that something was not right with it.
The workflow represents the logic you presented, namely
foreach( $sources as $source_id ) {
    $complete = false;
    for( $i = 0; $i <= 10; $i++ ) {
        if( $complete == false ) {
            $result_from_crawl = do_the_crawl();
            if( $result_from_crawl == true ) {
                $complete = true;
            }
        }
        if( $i == 10 && $complete == false ) {
            send_an_email();
        }
    }
}
This logic appears to be flawed to me because you mark the crawling as complete as soon as a single crawling is done. You do not give a chance to crawl more than once. Perhaps you meant this instead?
foreach( $sources as $source_id ) {
    $complete = false;
    $count = 0; // init count
    for( $i = 0; $i <= 10; $i++ ) {
        if( $complete == false ) {
            $count += 1; // missing count increment
            $result_from_crawl = do_the_crawl();
            if(
                $result_from_crawl == true &&
                $result_from_crawl.request == "success" // mark completed on success (?)
            ) {
                $complete = true;
            }
        }
        if( $i == 10 && $complete == false ) {
            send_an_email();
        }
    }
}
Once you have corrected/confirmed your crawling logic, I could try to help you with implementing that as well.
So, should {"result":"success"} mark the crawling as “complete”, and when an error is encountered should only count increment and the source be retried for crawling? When you say “or error”, do you mean the HTTP Request node returning something like {"error": "some error"}, or just the node erroring (a status code other than 2xx, for example)?
$count is redundant as $i is the counter.
You do not give a chance to crawl more than once.
Each source iterates 10 times at most.
you mark the crawling as complete as soon as a single crawling is done.
Each source starts as “not complete” ($complete = false); it is the outside loop. For n sources there will be n * 10 iterations.
Allow me to explain again with comments:
foreach( $sources as $source_id ) {              // outer loop, iterates on n sources
    $complete = false;                           // each source starts as "not complete"
    for( $counter = 0; $counter <= 10; $counter++ ) { // inner loop of 10x for each source;
                                                      // counter increments for each iteration
        if( $complete == false ) {               // attempt the crawl if $complete is false
            $result_from_crawl = do_the_crawl();
            if( $result_from_crawl.request == "success" ) { // if the result is "success" set the
                $complete = true;                           // complete flag to true
                $counter = 10;                   // break from the inner loop
            }
        }
        if( $counter == 10 && $complete == false ) { // if the counter is at 10, attempts have been
            send_an_email();                         // exhausted and complete is false, send an email
        }
    }
}
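The commented pseudocode above can be sketched as runnable JavaScript. Note that doTheCrawl and sendAnEmail here are hypothetical stand-ins for the actual crawler and email nodes, and a while loop replaces the `$counter = 10` trick for breaking out early:

```javascript
// Sketch of the retry logic above in plain JavaScript.
// doTheCrawl and sendAnEmail are hypothetical stand-ins for the real nodes.
function crawlSources(sources, doTheCrawl, sendAnEmail, maxAttempts = 10) {
  const attempts = {};                     // how many crawls each source actually needed
  for (const sourceId of sources) {        // outer loop over n sources
    let complete = false;                  // each source starts as "not complete"
    let counter = 0;
    while (counter < maxAttempts && !complete) { // inner loop, at most maxAttempts
      counter += 1;
      const result = doTheCrawl(sourceId);
      if (result.request === "success") {
        complete = true;                   // stop retrying this source
      }
    }
    if (!complete) {
      sendAnEmail(sourceId);               // attempts exhausted without success
    }
    attempts[sourceId] = counter;
  }
  return attempts;
}

// Example: "a" succeeds on the first try, "b" never succeeds.
const emailed = [];
const stats = crawlSources(
  ["a", "b"],
  (id) => ({ request: id === "a" ? "success" : "error" }),
  (id) => emailed.push(id),
  3
);
console.log(stats, emailed); // { a: 1, b: 3 } [ 'b' ]
```

The attempts map also shows why keeping the real count is useful: it tells you how many crawls each source needed, instead of every source ending at 10.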
If I could ask for a basic example of a loop showing where the counter value is being stored and output, I believe I could complete the process.
If I could ask for a basic example of a loop showing where the counter value is being stored and output, I believe I could complete the process.
Sure, I have placed notes in the workflow explaining what is happening. Note that count is controlled in the Set nodes themselves! I also made a slight rearrangement to ensure that when the workflow completes you have the stats available in the “Done” node.
Finally, when the crawling completes successfully I do not set count to 10. There is no need for that, and this approach also provides more accurate information on how the source was crawled (how many times).
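Outside of n8n, a minimal plain-JavaScript analogy of where the count lives and how to console.log it (the source ids are made up for illustration):

```javascript
// Minimal sketch: the increment value is just a variable, here `count`,
// scoped to each source; every pass reads it, adds 1, and stores it back.
const sources = ["source-a", "source-b"]; // hypothetical source ids
const log = [];

for (const sourceId of sources) {
  let count = 0;                 // the counter is stored here
  for (let i = 0; i < 3; i++) {  // three attempts per source, for illustration
    count += 1;                  // read, increment, store back
    log.push(`${sourceId} attempt ${count}`);
  }
  console.log(sourceId, "finished after", count, "attempts");
}
```

In the workflow the Set node plays the role of `count += 1`: each iteration reads the incoming count, adds one, and passes it along with the item.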
You didn’t answer the error question. Depending on that answer, you need to make sure that your HTTP Request node (the crawler) is configured accordingly. For example, I set it to always output data, assuming that by the error you mean the node fails to crawl/reach the destination and returns no JSON containing response.
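For illustration only, the two failure modes discussed above (an error payload inside the JSON vs. the request itself failing) could be told apart roughly like this; the function and field names are assumptions, not the actual node output:

```javascript
// Hypothetical classifier for the crawl result shapes discussed above.
function classifyCrawlResult(statusCode, body) {
  if (statusCode < 200 || statusCode >= 300) {
    return "node-error";       // the request itself failed (non-2xx status)
  }
  if (body && body.error) {
    return "response-error";   // crawler answered with {"error": "some error"}
  }
  if (body && body.result === "success") {
    return "success";          // {"result":"success"} marks completion
  }
  return "unknown";
}

console.log(classifyCrawlResult(200, { result: "success" })); // success
```

With “always output data” enabled, both failure modes reach the workflow as items, so a branch like this can decide whether to retry or give up.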
That is wonderful, and it is working. I completed the error handling. Please let me know where I can send a beer.
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.