Job description
Build a crawler to retrieve data from various websites.
The application is to read crawl target URLs from a .csv file, retrieve the designated data, and save it to a database.
Requirements:
- Application running on a server (AWS, OVH, DigitalOcean, etc.)
- Ability to configure existing pages and add new ones
- Saving the results to the database (updating existing records when re-downloading)
- Proxy support and other methods for evading bot-detection mechanisms
- Statistics and logs
- Ability to download data as a logged-in user and store cookies
- Ability to set a schedule for starting downloads
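To illustrate the core flow described above (read URLs from a .csv, fetch each one, upsert into a database so re-crawls update existing records), here is a minimal Python sketch. It uses only the standard library; the `fetch` stub, the CSV column name `url`, and the `results` table schema are assumptions for illustration, and a real implementation would replace `fetch` with an HTTP client handling proxies, cookies, and logins.

```python
import csv
import io
import sqlite3

def fetch(url):
    # Placeholder for the real HTTP fetch (proxies, cookies, login sessions).
    return f"data for {url}"

# In-memory database for the sketch; production would use a real DB.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE results (
    url        TEXT PRIMARY KEY,
    data       TEXT,
    fetched_at TEXT DEFAULT CURRENT_TIMESTAMP)""")

# Stand-in for the .csv file of crawl targets (column name is assumed).
csv_text = "url\nhttps://example.com/a\nhttps://example.com/b\n"

for row in csv.DictReader(io.StringIO(csv_text)):
    data = fetch(row["url"])
    # Upsert: re-crawling the same URL updates the record instead of duplicating it.
    conn.execute(
        "INSERT INTO results (url, data) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET data = excluded.data",
        (row["url"], data),
    )
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM results").fetchone()[0])  # 2
```

Scheduling the run (the last requirement) could then be as simple as a cron entry or an APScheduler job invoking this loop.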