The output will be {"result":"success"} or error. However, I’m still confused about how iteration works without the crawler piece. Where is the increment value actually stored? I can’t seem to find an example that can console.log the iterated values.
Where is the increment value actually stored?
It is stored in count. However, the problem is not where it is stored but how. I told you so many times that the logic you apply does not sound right to me. Here’s my quote:
The problem is rather with your logic of processing the crawling. Nowhere in that logic I see the condition to increment the count during item iteration. It increments only once during initialization and then it is set to “complete”.
– Iterate on each item - #16 by ihortom
All the workflows I shared with you were addressing the iterations only (the subject of this post). I have not touched your counting logic but rather was pointing out that something was not right with it.
The workflow represents the logic you presented, namely
foreach( $sources as $source_id ) {
    $complete = false;
    for( $i = 0; $i <= 10; $i++ ) {
        if( $complete == false ) {
            $result_from_crawl = do_the_crawl();
            if( $result_from_crawl == true ) {
                $complete = true;
            }
        }
        if( $i == 10 && $complete == false ) {
            send_an_email();
        }
    }
}
This logic appears to be flawed to me because you mark the crawling as complete as soon as a single crawling is done. You do not give a chance to crawl more than once. Perhaps you meant this instead?
foreach( $sources as $source_id ) {
    $complete = false;
    $count = 0; // init count
    for( $i = 0; $i <= 10; $i++ ) {
        if( $complete == false ) {
            $count += 1; // missing count increment
            $result_from_crawl = do_the_crawl();
            if(
                $result_from_crawl == true &&
                $result_from_crawl.request == "success" // mark completed on success (?)
            ) {
                $complete = true;
            }
        }
        if( $i == 10 && $complete == false ) {
            send_an_email();
        }
    }
}
Once you have corrected/confirmed your crawling logic, I could try to help you with implementing that as well.
So, should {"result":"success"} mark the crawling as “complete”, and when an error is encountered should only count increment and the source be retried for crawling? When you say “or error”, do you mean the HTTP Request node returning something like {"error": "some error"}, or just the node erroring (a status code other than 2xx, for example)?
$count is redundant as $i is the counter.
You do not give a chance to crawl more than once.
Each source iterates 10 times at most.
you mark the crawling as complete as soon as a single crawling is done.
Each source starts as “not complete” ($complete = false); it is the outside loop. For n sources there will be n * 10 iterations.
Allow me to explain again with comments:
foreach( $sources as $source_id ) {              // outer loop, iterates on n sources
    $complete = false;                           // each source starts as "not complete"
    for( $counter = 0; $counter <= 10; $counter++ ) { // inner loop of 10x for each source;
                                                      // counter increments for each iteration
        if( $complete == false ) {               // attempt the crawl if $complete is false
            $result_from_crawl = do_the_crawl();
            if( $result_from_crawl.request == "success" ) { // if the result is "success" set the
                $complete = true;                           // complete flag to true
                $counter = 10;                   // break from the inner loop
            }
        }
        if( $counter == 10 && $complete == false ) { // if the counter is at 10, attempts have been
            send_an_email();                         // exhausted and complete is false, send an email
        }
    }
}
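The commented pseudocode above can be sketched as runnable JavaScript. Note that doTheCrawl and sendAnEmail here are hypothetical stand-ins for the actual crawler and email nodes, and a while loop replaces the `$counter = 10` trick for breaking out early:

```javascript
// Sketch of the retry logic above in plain JavaScript.
// doTheCrawl and sendAnEmail are hypothetical stand-ins for the real nodes.
function crawlSources(sources, doTheCrawl, sendAnEmail, maxAttempts = 10) {
  const attempts = {};                     // how many crawls each source actually needed
  for (const sourceId of sources) {        // outer loop over n sources
    let complete = false;                  // each source starts as "not complete"
    let counter = 0;
    while (counter < maxAttempts && !complete) { // inner loop, at most maxAttempts
      counter += 1;
      const result = doTheCrawl(sourceId);
      if (result.request === "success") {
        complete = true;                   // stop retrying this source
      }
    }
    if (!complete) {
      sendAnEmail(sourceId);               // attempts exhausted without success
    }
    attempts[sourceId] = counter;
  }
  return attempts;
}

// Example: "a" succeeds on the first try, "b" never succeeds.
const emailed = [];
const stats = crawlSources(
  ["a", "b"],
  (id) => ({ request: id === "a" ? "success" : "error" }),
  (id) => emailed.push(id),
  3
);
console.log(stats, emailed); // { a: 1, b: 3 } [ 'b' ]
```

The attempts map also shows why keeping the real count is useful: it tells you how many crawls each source needed, instead of every source ending at 10.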
If I could ask for a basic example of a loop showing where the counter value is being stored and output, I believe I could complete the process.
If I could ask for a basic example of a loop showing where the counter value is being stored and output, I believe I could complete the process.
Sure, I have placed notes in the workflow explaining what is happening. Note that count is controlled in the Set nodes themselves! I also made a slight rearrangement to ensure that when the workflow completes you have the stats available in the “Done” node.
Finally, when the crawling completes successfully I do not set count to 10. There is no need for that, and this approach also provides more accurate information on how the source was crawled (how many times).
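Outside of n8n, a minimal plain-JavaScript analogy of where the count lives and how to console.log it (the source ids are made up for illustration):

```javascript
// Minimal sketch: the increment value is just a variable, here `count`,
// scoped to each source; every pass reads it, adds 1, and stores it back.
const sources = ["source-a", "source-b"]; // hypothetical source ids
const log = [];

for (const sourceId of sources) {
  let count = 0;                 // the counter is stored here
  for (let i = 0; i < 3; i++) {  // three attempts per source, for illustration
    count += 1;                  // read, increment, store back
    log.push(`${sourceId} attempt ${count}`);
  }
  console.log(sourceId, "finished after", count, "attempts");
}
```

In the workflow the Set node plays the role of `count += 1`: each iteration reads the incoming count, adds one, and passes it along with the item.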
You didn’t answer the error question. Depending on that answer, you need to make sure that your HTTP Request node (the crawler) is configured accordingly. For example, I set it to always output data, assuming that by the error you mean the node fails to crawl/reach the destination and returns no JSON containing response.
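For illustration only, the two failure modes discussed above (an error payload inside the JSON vs. the request itself failing) could be told apart roughly like this; the function and field names are assumptions, not the actual node output:

```javascript
// Hypothetical classifier for the crawl result shapes discussed above.
function classifyCrawlResult(statusCode, body) {
  if (statusCode < 200 || statusCode >= 300) {
    return "node-error";       // the request itself failed (non-2xx status)
  }
  if (body && body.error) {
    return "response-error";   // crawler answered with {"error": "some error"}
  }
  if (body && body.result === "success") {
    return "success";          // {"result":"success"} marks completion
  }
  return "unknown";
}

console.log(classifyCrawlResult(200, { result: "success" })); // success
```

With “always output data” enabled, both failure modes reach the workflow as items, so a branch like this can decide whether to retry or give up.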
That is wonderful, and it is working. I completed the error handling. Please let me know where I can send a beer.
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.