How to build state aware Bots

https://youtu.be/izHzChsnR8E

Hi, guys. Good evening. Today I would like to talk about stateful programming, or state aware bots: self-recovery, self-healing bots, self-repair features.

So what do I mean by that? The thing is, when you code in a straightforward way, like, for example, this code you see here. You have a navigate command and then you have an alert command. Or not an alert, but you’re using a scrape command where you put some data into a list or a table or whatever.

So there’s nothing wrong with that. The only problem is that you don’t know what is going on, and the bot does not know what is going on. When you just execute a navigate command, and then you have a lot of other stuff going on, like clicking buttons, extracting data, doing stuff, you always expect that it works if you code it that way. So in this code, there are no checks. You’re not validating if the browser actually navigated, if the data you want to extract is actually there, if you got some results.

So that’s what I mean by state aware bot. Because if you don’t do that, then you cannot detect errors or failures and recover from them. Now, if you write a bot that is just for yourself, and it’s a very simple thing: you go to one page, you do some stuff, and you normally monitor that, because it’s a task that you do a couple of times and you just don’t want to click manually. You have your tool, your bot, to do that for you. Then you do not have to invest a lot of time and energy to develop state aware bots.

But let’s say it’s a more complex thing. So you have a list of URLs, 100, 500, whatever websites that you go to and that you want to extract some data from, for example. Or you develop a bot that you sell, that does multiple things. So it can be multiple URLs, or it could be doing multiple things on one page, or whatever, with multithreading or without multithreading. Then I highly recommend that you add checks into your code, first of all.

So there are a couple of things. Validate that what you expect the bot to do is actually happening. So after a navigate command, you could have a little scrape command and check if a page is actually loaded, and if it is the right page. Find something that is unique on that page. For example, scrape for the text Amazon in the HTML source code after you navigate. Check the ExBrowser URL, if it’s actually on that page. So then you get checks.
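The check after a navigate command could look roughly like the following. This is a minimal Python sketch of the idea, not ExBrowser code: the function name `page_loaded` and its parameters are made up for illustration, and it assumes you have already read the current URL and the HTML source from the browser.

```python
from urllib.parse import urlparse


def page_loaded(current_url: str, html: str, expected_host: str, marker: str) -> bool:
    """Return True only if the browser is on the expected site AND the
    unique marker text shows up in the page source.

    All names here are hypothetical; current_url and html stand in for
    whatever the browser plugin reports after the navigate command.
    """
    host = urlparse(current_url).hostname or ""
    # Accept the bare domain and its subdomains (www.amazon.com, smile.amazon.com, ...)
    on_expected_site = host == expected_host or host.endswith("." + expected_host)
    # Case-insensitive search for something that is unique on that page
    return on_expected_site and marker.lower() in html.lower()
```

An empty URL or an empty page source both make the check fail, which is exactly the situation you want to catch before you start clicking and extracting.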

And you can put that into define commands. That’s one example. There are many ways to do that. I just want to get you thinking about it and implementing it. You have to code that yourself. So I’m not giving you a ready-to-use solution here. I’m just telling you the mindset you need, or that you should apply here, and then some ideas for what to do.

So for example, if you have a click command, you could also write your own define. You write define and it’s called My XP Click, or whatever, with some parameter. And then you add that. And then, instead of… It doesn’t matter really. I’m just making up some stuff here. Let’s see, just a second. So add that variable here, or whatever. You know what I mean.

So now you can add other things. You can add element exists. And you put some checks here. If element exists, browser… This is slow today here. Element exists, whatever. Then fine. If it does not exist, write to log file or whatever or make an alert. Error.

So best, of course, is that you write here to an internal variable and then you save that to a log file. Log file, for example: error with the XPath expression, so that you have the data, the URL, for example. Whatever. You can add the variables here that you want to see in the log file. Okay, what’s the current URL? What do I try to do here? You could also add some info text here. So when you call that command, you add an info that says, “Okay, I have that My XP Click command for that expression…” Just for example, here’s the XPath expression, whatever. First click on login button, for example.

And then you always have that data, and that’s very useful in your code, because then for all your commands you have a unique define, for click and for the others. You can do that for all the commands and functions you use. You have your own define for a click command. You have your own define for a scrape command, for dropdown, whatever you need.

And you can reuse that in all your bots. So you write that once and then you have a template. And in that define you check if the element you want to interact with is there. If not, you write to a log file. Put some additional data into that log file: the XPath expression, then something like an info. So when you see that error message or that log file, you know, okay, it failed here, at that point, at step number 27, for example, when we tried to fill out that box with the username or whatever it is. Then you have the checks there. If it’s there, great. Resume. If not, add it to a log file and move on from there.
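To make the idea of a reusable define concrete, here is a small Python sketch of a click wrapper that checks first and logs on failure. Everything in it is an assumption for illustration: `BotLog`, `checked_click`, and the `element_exists` and `click` callables are hypothetical stand-ins for whatever your plugin commands are, not real ExBrowser names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List


@dataclass
class BotLog:
    """Collects log lines in memory; save() appends them to a file."""
    entries: List[str] = field(default_factory=list)

    def error(self, url: str, xpath: str, info: str) -> None:
        stamp = datetime.now().isoformat(timespec="seconds")
        self.entries.append(f"{stamp} ERROR url={url} xpath={xpath} info={info}")

    def save(self, path: str) -> None:
        if self.entries:
            with open(path, "a", encoding="utf-8") as fh:
                fh.write("\n".join(self.entries) + "\n")


def checked_click(
    element_exists: Callable[[str], bool],  # hypothetical "element exists" check
    click: Callable[[str], None],           # hypothetical click command
    current_url: str,
    xpath: str,
    info: str,
    log: BotLog,
) -> bool:
    """Click only if the element is there; otherwise log and report failure."""
    if not element_exists(xpath):
        log.error(current_url, xpath, info)
        return False
    click(xpath)
    return True
```

The `info` text is what makes the log readable later, for example "step 27: fill username". The same pattern works for a scrape or dropdown wrapper.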

So that’s something I would add into more complex bots, or when you sell a bot. So that’s the first thing. Check if what you tried to do really worked. Also, when you want to extract some data, validate that you got some output. So that’s the first thing. So let’s see. So it’s state aware. And we write to a log file.

Please take into account that when you do multithreading, this is a little bit more complicated to do, because within the threads you cannot write to a log file. The file could be locked or stuff could be overwritten. So from within the threads you have to write to a global variable or into a table, which is much better. When, for example, you scrape through 100 URLs, you make a table with 100 lines and every URL is one line. And you put the URL in the first column and then the error message in the second column, if there’s any, and then you have the data for all your 100 URLs. Was it successful? Green. Unsuccessful? Red. So that’s what I mean by state aware.
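The table idea can be sketched like this in Python. It is only an illustration of the pattern: the worker threads return one row per URL and never touch a file, and the main level writes the whole table once at the end. `process_url` is a hypothetical placeholder for the real navigate, check, and extract work.

```python
import csv
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def process_url(url: str) -> Dict[str, str]:
    """Hypothetical stand-in for the work one thread does for one URL."""
    if "bad" in url:  # pretend this page returned no data
        return {"url": url, "status": "failed", "error": "no results extracted"}
    return {"url": url, "status": "ok", "error": ""}


def run_batch(urls: List[str], workers: int = 4) -> List[Dict[str, str]]:
    """Run the URLs in threads; each thread only returns its own row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields the results in the same order as the input URLs
        return list(pool.map(process_url, urls))


def save_table(rows: List[Dict[str, str]], path: str) -> None:
    """Write the table from the main level, after the threads are done."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["url", "status", "error"])
        writer.writeheader()
        writer.writerows(rows)
```

Because only one place writes the file, nothing can be locked or overwritten by another thread.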

And then you know which URLs worked. For example, 100 URLs: for 80, everything is great. The stuff you wanted to extract was there. You actually got some results. So 80 of the URLs you processed correctly. 20 you did not.

So if you have that in a log file, then you always have the ability in your code to repeat the failed ones. For example, okay, you run 100 threads and then you check the result. Okay, 80 were successful, 20 were not. So repeat the 20, for example.
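Repeating only the failed ones is a small loop over that result table. The sketch below is in Python and assumes each row is a dict with `url`, `status`, and `error` keys; the function name `retry_failed` and the `process` callback are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def retry_failed(rows: List[Row], process: Callable[[str], Row], max_rounds: int = 2) -> List[Row]:
    """Re-run only the rows that are not 'ok', up to max_rounds times."""
    for _ in range(max_rounds):
        failed = [i for i, row in enumerate(rows) if row["status"] != "ok"]
        if not failed:
            break  # everything succeeded, nothing left to repeat
        for i in failed:
            # Replace the old row with the result of the new attempt
            rows[i] = process(rows[i]["url"])
    return rows
```

Whatever is still marked as failed after the last round is a candidate for the next level, like restarting the browser.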

You could also add that into a routine so that you can restart your bot. There are ways to resume. If you have a state aware bot, every time the bot starts, it will load the last log or the last status file. As I said, there are many ways to do that. So it knows, okay, I have those 100 URLs. We already processed 78 URLs. So there are 22 left. Please start at line item 79 or whatever. So when it resumes, maybe because it crashed or because you restart it from time to time, the bot knows where to start again and what to repeat.
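A status file for resuming can be very small. Here is a Python sketch, assuming a JSON file that stores the index of the next URL and the list of failed URLs; the file layout and the names `load_status`, `save_status`, and `run` are made up for illustration.

```python
import json
import os
from typing import Callable, Dict, List


def load_status(path: str) -> Dict:
    """Load the last status, or start fresh if there is no file yet."""
    if not os.path.exists(path):
        return {"next_index": 0, "failed": []}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_status(path: str, status: Dict) -> None:
    """Write to a temp file first, then swap it in, so a crash in the
    middle of writing is unlikely to leave a half-written status file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(status, fh)
    os.replace(tmp, path)


def run(urls: List[str], path: str, process: Callable[[str], bool]) -> Dict:
    """Process the URLs, starting where the last run stopped."""
    status = load_status(path)
    for i in range(status["next_index"], len(urls)):
        if not process(urls[i]):
            status["failed"].append(urls[i])
        status["next_index"] = i + 1
        save_status(path, status)  # save after every URL, not only at the end
    return status
```

If the bot dies while working on item 79, the file still says 78 are done, so the next start picks up at item 79 again.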

So don’t expect that everything always works. If you have like 500 URLs and you’re not checking the status, if you do not track at what position in the process the bot is, what URLs you already processed and so on, then it’s really hard to find errors in the first place. You don’t know where something happened and you have no way to repeat the stuff that failed. So that’s the thing. Always have checks in your code to validate what’s going on.

And if you write that to a status file, then you have the possibility to restart and resume. Restart the whole bot. Or, for example, restart the browser process. We have the launcher command, as you know.

So for example, you launched that bot and then you detect, okay, it did not navigate and I did not get any results. So that first URL failed. Then you try the next one and that failed again, and then the next one. The third one failed again. So at that point you could have an error counter, like errors in a row. So after, for example, three errors in a row, navigating to different URLs, there might be a different problem. So maybe the browser crashed or the driver for the browser crashed.

As you know, here we have that Chrome 80 process. So I’m using the portable browser and then we have something called Chrome Driver 80 something. That’s the connector between the ExBrowser Plugin and the actual browser, kind of a man-in-the-middle communication thing. The driver is necessary. So that one can crash as well. And if that crashes, then the browser might still be there, but it’s not taking any commands and not giving you any results. And that’s something you maybe don’t see directly.

So that means if you have like three commands that are failing and not giving you results, you could say, “Okay, something is strange. Now let’s stop the process and relaunch the browser just to be sure.” You can also do that with multithreading, but it’s a bit more complicated to code. But don’t expect that everything works. So if it fails three times in a row, okay, I run cleanup and I relaunch the browser and then retry the three URLs.

If it fails again three times with different URLs, okay, maybe the whole bot has a problem. I relaunch the bot. It loads the status file. It knows, okay, here I was at position 27. That’s where I start. It tries it again and in your status log file, you know, okay, we already restarted the browser and it’s the first time we restarted the bot. Now we try those URLs again. If they fail again, then something else must be wrong. And then you could maybe send an email alert or whatever.

So you have different levels of escalation. Retry the URL. Restart the browser process with cleanup and launcher. Restart the whole bot. And then if it still does not do what it should do, send an email message and then get a notification for you or your staff to check that. Maybe then the proxy is not working or internet connection is broken or whatever.
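The escalation levels can be driven by a simple counter of errors in a row. The following Python sketch shows one possible shape for that logic; the class name `Escalation`, the `Action` values, and the threshold of three are assumptions taken from the example above, not something built into the plugin.

```python
from enum import Enum


class Action(Enum):
    RETRY_URL = 1
    RESTART_BROWSER = 2
    RESTART_BOT = 3
    ALERT_ADMIN = 4


class Escalation:
    """Counts errors in a row and moves one level up each time the
    threshold is reached. A success resets everything."""

    _LADDER = [Action.RESTART_BROWSER, Action.RESTART_BOT, Action.ALERT_ADMIN]

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.errors_in_a_row = 0
        self.level = 0  # how many times we already escalated

    def success(self) -> None:
        self.errors_in_a_row = 0
        self.level = 0

    def failure(self) -> Action:
        self.errors_in_a_row += 1
        if self.errors_in_a_row < self.threshold:
            return Action.RETRY_URL
        # Threshold reached: reset the counter and go one level higher
        self.errors_in_a_row = 0
        self.level += 1
        return self._LADDER[min(self.level, len(self._LADDER)) - 1]
```

One thing to keep in mind: restarting the whole bot loses everything in memory, so the current level would have to be written into the status file and loaded again on startup.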

So that’s what I mean by state aware. If it’s a more complex thing, or you want to run it 24/7 on a system, the bot needs to be able to auto-recover and heal itself. So you know what could fail. The site could not load. The proxy might be giving you an error. The internet connection is down. The browser crashes. The driver crashes. The whole Ubot crashes. The PC crashes. You can take all that into account.

Of course, more complicated. Make sure the bot restarts when the PC reboots. Make sure when the bot crashes that you have some kind of control software running that checks if that process is still running and if not, it will restart it. The bot will load the status files or it knows what URLs, or whatever it already processed at what position it was. What was the last step I finished correctly and where should I continue?

So all of that stuff can make your bots, your botting and your results way better. Of course it’s a little bit more work to do and to integrate that. And yeah, maybe I will do some examples in the future, but you guys are smart. You know how to do that. It’s not that complicated and there are many ways to do this.

And just another example, or let’s see if I can show that to you here. So please, if you’re not sure what cleanup exactly does, please re-watch the launcher tutorial that is in the ExBrowser version two membership area. It’s very important that you understand the difference between what no means, what full means, what full plus kill means, and what light means. So if you have no idea and you cannot explain right now what the difference is, then re-watch that. It’s very critical for state aware botting, and even more critical if you’re using multithreading, or want to in the future. That’s mandatory. You have to know the difference here and what this command is doing. I mean cleanup here. No and yes are similar to full and full plus kill.

So if there’s a yes here, that is the full plus kill in the launcher. If there’s a no, then this is a full. No is no, no cleanup at all. And light is running cleanup light. So I’m not going through what it is now again, because that’s in that other tutorial I mentioned. But you have to understand that.

You can use it in the launcher. You can use it in a separate cleanup. That depends on what you try to do. And if you’re using multithreading, the launcher command will be within your thread and then you don’t want to run cleanup from within the thread. Then you have the whole management layer to launch browsers and to run cleanup on the main level. The threads are running below that and the control layer’s above kind of. You know what I mean.

So, for example, if you have 100 URLs. Just recently, yesterday or two days ago, I had a support ticket from a guy. He said, “Well, I have a bot that is running all the time and it’s scraping 300 URLs. And at some point my driver crashes.” So yeah, that can happen. It’s very difficult to kind of fix that because you never know what’s going on. Maybe it’s a memory leak. Maybe there is some website that’s loading some strange JavaScript. Maybe after running the browser for 10 hours it gets messed up somehow.

So what can you do? Instead of launching one browser and then navigating to the first URL, the second, the third, and so on up to URL 500, why not restart the browser every 10 URLs? So run launcher with portable. Process 10 URLs, restart the browser. Next 10 URLs, restart the browser, next 10 URLs.

And you run it with cleanup. Talking without multithreading now. You launch it with cleanup full, which means it will try to kill all the browsers that were launched by the ExBrowser Plugin. It will use the internal database to do that, which is a database within the ExBrowser Plugin that keeps track of how many browser sessions you launched. But the cleanup full will also terminate the Chrome driver process itself. So it will look for the Chrome 80.0 dot and so on EXE on your system and also the Chrome driver, and it will kill all of them.

So be careful here if you run multiple bots on the same system or you’re using multithreading. As soon as you run cleanup with that option, it will kill all of them, even though another bot might also be using the same Chrome version.

So you have to know what you’re doing there. But you could run that cleanup every 10 or 20 iterations, just to clean it up and have a fresh browser, and then process the next URLs. So don’t expect that the browser will be fine for three days in a row processing 5,000 URLs. Probably not going to happen.
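Breaking the run into batches with a fresh browser for each one could be sketched like this in Python. The `launch`, `process`, and `cleanup` callables are hypothetical placeholders for the launcher command, your per-URL work, and the cleanup command; the sketch only shows the order in which they are called, without multithreading.

```python
from typing import Callable, Iterator, List


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield the list in chunks of `size`; the last chunk may be shorter."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    urls: List[str],
    size: int,
    launch: Callable[[], None],      # hypothetical: start a fresh browser
    process: Callable[[str], None],  # hypothetical: handle one URL
    cleanup: Callable[[], None],     # hypothetical: kill browser and driver
) -> None:
    for batch in batches(urls, size):
        launch()
        try:
            for url in batch:
                process(url)
        finally:
            # Always clean up, even if processing raised an error
            cleanup()
```

With 25 URLs and a batch size of 10, that gives three launches and three cleanups, the last batch holding the remaining 5 URLs.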

So you can be proactive and just say, as I said in the multithreading video, don’t run 10,000 threads all at once. Run 10. Relaunch the stuff. Run the next 10 or 50 or whatever makes sense for you. So that’s one thing: you can be proactive. Break it down into 10, 20, whatever.

And besides being proactive, make the bot state aware so it knows when it failed. As I said before, okay, retry the URL. Failed again. Strange. Okay, let’s restart the browser with cleanup as well. Make that correct with monitoring and so on. Restart the browser with cleanup again, just to have a new browser, to ensure that the Chrome driver is loaded again and not crashed or whatever. Then try again and so on. So that’s what I mean by state aware bot. Check if what you’re looking for is there, that the URL is loaded, that the site is showing the data, that you get the data you want. Write to a log file so that you can validate afterwards if some errors happened. That’s especially important if you sell bots and give them to your customers.

Do you have some kind of tracking? Write your own defines for the important commands like click and so on, so that you don’t have to write the “if element exists: yes, go on; no, write to the log file” every time. Put that into define commands so that you don’t have to do that 500 times. Then write it to a log file so you can check that later. And have routines and processes in your bot that are able to recover, to resume at the right point, to repeat the stuff that failed, and to restart. Restart the processes. Try again. If not, restart the bot. Try again. If not, alert the admin.

So yeah, that’s the kind of thing I wanted to mention. So maybe a question that could come up is how do I know that the browser crashed? So this is sometimes a little bit tricky to understand because at some point the browser count will go down.

Now it’s showing one. If I, for example… task manager. If I for example, go here and close the Chrome driver, just end that process, nothing will happen. The browser’s still there, but you can’t navigate anymore. So I run that, but nothing happens and it’s not directly giving me an error. So, but nothing happens.

So how do I detect that? I mean, I can detect that now because if I, for example, run alert URL, the result is empty, which it is not if everything is correct. Or I could run a simple scrape command and look for some stuff. So that way… Like what’s the thing here? Scrape element. Whatever. It doesn’t really matter.

All right. I could do that and it will give me an error here or not return anything. So, and then you have that in the debug log, ExBrowser debug log. It’s saying no browser running. And at some point it will also clear the browser count, but that can take a while. So it’s still showing one. Because in the internal database it still has that reference that we launched one browser instance even though the browser, the driver crashed. But the bot cannot detect that automatically out of the box.

So all right. But you can detect that because it’s not returning the results you expect it to return. So now you see that. And you try, okay, let’s navigate to the URL again, and then run the script command again. Still not working. Okay, write that to your log file. URL one failed again, second time. Okay, next URL. Try the second URL. Navigate to that. Try to extract your data. Still not getting any results. The second URL that failed. Let’s retry the second URL and if it fails again, so two URLs failed two times, then say, “Okay, maybe something is wrong with the browser itself. So let’s relaunch the whole browser.”

So we run the launcher again with the cleanup, or with multithreading you run cleanup separately and then launch the browser threads again. Or whatever makes sense for your scenario. We launch the browser process again, which also then relaunches the Chrome driver, as you can see here. Then we navigate to the URL. Yeah. Nice. See cleanup. Try license. Okay. Navigate to the URL again. Then we have our check. Okay. Multiple XPath expression. No, I don’t want to do that or have that URL thing here that are running. Okay, now I get a result again.

So okay, now it works. And maybe you write that to the log file as well. Okay. We restarted the browser. Now it works again. You can add all that status information into the log file as well. Okay, restarting the browser session because we have two failed URLs. We retried two times. Doesn’t work. Restarting the browser. Add that to the log file. Maybe also add time and date to that log file, so you know when it happened, all that kind of stuff.

Then the bot is able to heal itself. So it’s not expecting everything to work correctly. Have different levels of checks and different steps to recover. Retry with a new proxy. Restart the browser. Restart the bot. Restart the PC. Call that.

So that’s what I mean when I say state aware bots. And that definitely will fix a lot of your problems, guys. Everyone who has issues with stuff like, “My bot is running a long time or I process a lot of URLs and at some point something happened.” Then this is for you. You should know when and at what point it failed. Because very, very often with those things, with browsers and Ubot and automation, there is not a specific reason why it fails. And it’s not always the same URL. There can be many reasons for that. There was a timeout in the website. It took too long. Then the plugin ran into a timeout. Then some stuff crashed, whatever.

And it doesn’t really matter because you cannot always fix it. If it’s always a very specific thing, always that one URL and when you load it or click that one button or execute that one JavaScript, then the browser will always crash, and that happens on different PCs always, that’s something we can look into if that can be fixed.

But if it’s an error like, sometimes it happens, sometimes not. Sometimes it runs for two days, no problems. Sometimes it crashes after two hours and I’m not doing anything different. Then this is the kind of stuff I’m talking about. Then your bot needs to be able to detect, okay, now it crashed. Okay, let’s restart and retry. That’s the best thing you can do. Build that logic and build that intelligence into your bots so that they can check the status and are able to recover. That’s very, very important to make bots that are, yeah, working better and longer. And you will have a lot less stress and more happy customers.

So let me know what you think. If you have specific questions, how to do certain things here, please let me know. Open a support ticket or if you see that video in the Ubot forum, or in the Facebook group, reply there as well. If you want to discuss that with the other guys, no problem. No problem at all. Yeah. Think about that and maybe try to implement it step by step into your future projects and bots.

Thanks a lot guys. Happy coding. Talk to you soon. Bye-bye. Ciao.
