Hello Python scraping developers!
I'm looking only for developers experienced in Python scraping; beginners, please pass! For now I need a tool that lets me extract specific data from websites. Further Python projects are quite possible if the job is done perfectly!
Objectives: The goal is for the tool to import an XML sitemap, crawl the links listed in it, check for and extract several pieces of information from each page, and save the results from all pages to a CSV file. This means that the CSV file has the same structure as the sitemap (= the entire site) but is complemented with the results.
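To illustrate the intended flow (not a required design), here is a minimal sketch using only the standard library. It assumes the sitemap follows the sitemaps.org 0.9 format; `read_sitemap_urls`, `extract_fields`, `write_csv` and the `html_length` column are hypothetical names chosen for this example, and the real extraction rules are the ones in the red Excel columns.

```python
import csv
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol (assumed sitemap format).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap_urls(xml_text):
    """Return the <loc> URLs of a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def extract_fields(url, html):
    """Hypothetical placeholder for the checks described in the Excel file."""
    return {"url": url, "html_length": len(html)}


def write_csv(rows, path):
    """Write one CSV row per sitemap URL, keeping the sitemap order."""
    fieldnames = ["url", "html_length"]
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Because the rows are produced in sitemap order, the CSV mirrors the sitemap line by line, with the extracted values added as extra columns.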
The "information to be extracted" is described in detail in the attached Excel file. Look only at the columns in RED (the rest is optional for now; we will probably handle it in a second job once this first job is finished). Be aware that you must scroll far to the right inside the Excel file to see all points! Please read the red text carefully to understand it fully.
If any error occurs while scraping, the tool must continue with the next task and must not stop or hang.
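One possible way to meet this requirement, shown only as a sketch: give every request a timeout so it cannot hang, catch errors per page, and record the failure in that page's row before moving on. The names `fetch` and `crawl`, the 10-second timeout and the `status`/`error` columns are assumptions made for this example.

```python
import urllib.request


def fetch(url, timeout=10):
    """Download a page; the timeout keeps a dead server from hanging the run."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(urls, fetch_page=fetch):
    """Visit every URL; a failure is recorded in its row, then the loop goes on."""
    rows = []
    for url in urls:
        row = {"url": url, "status": "ok", "error": "", "html_length": ""}
        try:
            html = fetch_page(url)
            row["html_length"] = len(html)
        except Exception as exc:  # broad on purpose: no single page may stop the run
            row["status"] = "error"
            row["error"] = f"{type(exc).__name__}: {exc}"
        rows.append(row)
    return rows
```

With this shape, a failed page still gets its line in the CSV, so the output keeps the same structure as the sitemap and the failures are easy to find afterwards.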
Coding: It MUST be written in Python.
The code must fully comply with the current standards of the programming language used.
Interface: There must be a simple interface with an import (= upload) function for a local XML sitemap, a Run button and, if possible, a Stop button. After the script has run through everything, it must offer me a download link for the CSV file (or simply save the file in the same directory as the script).
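As a rough idea of what such an interface could look like (a desktop window is only one option; a small web page would also be acceptable), here is a sketch using Tkinter from the standard library. `run_job` and `build_ui` are hypothetical names; `process_url` and `load_urls` stand for whatever scraping and sitemap-reading functions the tool ends up with. Writing the CSV is left out here.

```python
import queue
import threading


def run_job(urls, process_url, stop_event):
    """Process URLs in order until finished or until stop_event is set."""
    done = []
    for url in urls:
        if stop_event.is_set():
            break
        done.append(process_url(url))
    return done


def build_ui(process_url, load_urls):
    """Build a small window with Import, Run and Stop buttons."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.title("Sitemap scraper")
    state = {"path": None}
    stop_event = threading.Event()
    results = queue.Queue()
    status = tk.StringVar(value="No sitemap selected")

    def on_import():
        path = filedialog.askopenfilename(filetypes=[("XML sitemap", "*.xml")])
        if path:
            state["path"] = path
            status.set(f"Loaded: {path}")

    def worker(path):
        # Runs in a background thread so the window stays responsive.
        try:
            results.put(run_job(load_urls(path), process_url, stop_event))
        except Exception as exc:
            results.put(exc)

    def poll():
        # Runs in the Tk thread; widgets are only touched from here.
        try:
            outcome = results.get_nowait()
        except queue.Empty:
            root.after(200, poll)  # still running, check again shortly
            return
        if isinstance(outcome, Exception):
            status.set(f"Failed: {outcome}")
        else:
            status.set(f"Finished: {len(outcome)} pages processed")

    def on_run():
        if not state["path"]:
            status.set("Import a sitemap first")
            return
        stop_event.clear()
        status.set("Running...")
        threading.Thread(target=worker, args=(state["path"],), daemon=True).start()
        poll()

    tk.Button(root, text="Import sitemap", command=on_import).pack(fill="x")
    tk.Button(root, text="Run", command=on_run).pack(fill="x")
    tk.Button(root, text="Stop", command=stop_event.set).pack(fill="x")
    tk.Label(root, textvariable=status).pack(fill="x")
    return root
```

The window would be started with `build_ui(process_url, load_urls).mainloop()`. The Stop button only sets a flag that the loop checks between pages, so the page currently being processed finishes first and the results gathered so far are kept.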
Deliverables:
- Python program/script with all dependencies/extensions etc., if any are needed