Job description
Requirements:
- technology: Python, Java, Node.js, or a comparable alternative
- the bot must run continuously on the server and start a search cycle every x minutes
- input: a file with search phrases and, for each, how many subpages to search; output: a CSV file with the results and search details
- eliminates duplicate domains both within a run and by comparing against the entire existing output file
- after finding a page, it also searches that site's other subpages to collect as much data as possible
- error/exception handling so that an unexpected error does not interrupt the operation
- please advise how often (every how many minutes) the bot can run without being blocked by Google
- please advise whether a free or paid proxy can be set up on the server, and how that would affect how often the bot can run
- normalizes phone numbers to a 9-digit format
- cleans up the data and removes duplicates before saving to the output file
- must run on a Linux server
- please quote the cost of development, deployment, and, if applicable, the proxy
- additionally, a configurable parameter for how many Google results to search, for example up to 100 or 200
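
To illustrate the "run every x minutes" and error-handling requirements, here is a minimal Python sketch. The names run_forever, cycle, max_cycles and sleep are hypothetical and exist only for this example. A real deployment might use cron or a systemd timer instead.

```python
import logging
import time
from typing import Callable, Optional


def run_forever(
    cycle: Callable[[], None],
    interval_minutes: float,
    max_cycles: Optional[int] = None,          # hypothetical: None means run non-stop
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call cycle() every interval_minutes; log errors instead of stopping."""
    completed = 0
    while max_cycles is None or completed < max_cycles:
        try:
            cycle()
        except Exception:
            # An unexpected error in one cycle must not interrupt the bot.
            logging.exception("Search cycle failed; continuing with the next one")
        completed += 1
        sleep(interval_minutes * 60)
    return completed
```

Passing max_cycles and a no-op sleep makes the loop easy to test; in production both would be left at their defaults.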
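
The duplicate-domain requirement could be approached roughly as sketched below. This is an illustration under assumptions: the output CSV is assumed to have a header row with a column named "domain", and the helper names domain_of, load_seen_domains and filter_new are hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator, Set
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercased host without a leading 'www.' ('' if none)."""
    if not url.startswith(("http://", "https://", "//")):
        url = "//" + url  # let urlsplit treat a bare 'example.com/x' as a host
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def load_seen_domains(out_path: Path, column: str = "domain") -> Set[str]:
    """Read the domains already saved in the output CSV (assumed column name)."""
    if not out_path.exists():
        return set()
    with out_path.open(newline="", encoding="utf-8") as f:
        return {row[column] for row in csv.DictReader(f) if row.get(column)}


def filter_new(urls: Iterable[str], seen: Set[str]) -> Iterator[str]:
    """Yield URLs whose domain is not in seen, updating seen during the run."""
    for url in urls:
        domain = domain_of(url)
        if domain and domain not in seen:
            seen.add(domain)
            yield url
```

Seeding seen from the existing output file covers the comparison with earlier runs, while updating it inside filter_new covers duplicates within the current run.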
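
The phone-number requirement might look something like the following sketch. It assumes that 9 digits is the national number length and that numbers may carry an international prefix. The function name normalize_phone and the default country_code of "48" are assumptions for illustration, not part of the requirements.

```python
import re
from typing import Optional


def normalize_phone(raw: str, country_code: str = "48") -> Optional[str]:
    """Return the number as exactly 9 digits, or None if that is not possible."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, '+', brackets
    # Strip an international '00' prefix only when the number is too long.
    if digits.startswith("00") and len(digits) > 9:
        digits = digits[2:]
    # Strip the (assumed) country code when it precedes a 9-digit number.
    if len(digits) == 9 + len(country_code) and digits.startswith(country_code):
        digits = digits[len(country_code):]
    return digits if len(digits) == 9 else None
```

Returning None for numbers that cannot be reduced to 9 digits lets the caller decide whether to drop the record or keep it without a phone number.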