scraping websites

Cancelled · Posted Feb 4, 2012 · Paid on delivery

Hello coders.

Background:

This project is part of a site intended to better serve consumers who want to compare prices and find better deals on the web.

Project description:

The data is meant to be the cornerstone of the site.

Scraping 10 sites for the same data.

Writing the scraped data to CSV format.
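
As a rough illustration of the CSV step, here is a minimal sketch using only Python's standard `csv` module. The column names in `FIELDNAMES` and the helper `write_rows` are assumptions made for the example; the posting does not say which fields are to be scraped.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# Hypothetical column set; the posting does not list the actual fields.
FIELDNAMES = ["site", "product", "price", "currency", "url"]


def write_rows(rows: Iterable[Dict[str, str]], out: TextIO) -> int:
    """Write rows to an open text stream as CSV and return the row count."""
    # extrasaction="raise" rejects rows that carry unexpected keys.
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, extrasaction="raise")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows(
        [{"site": "example-shop", "product": "Widget", "price": "9.99",
          "currency": "USD", "url": "https://example.com/widget"}],
        buffer,
    )
    print(buffer.getvalue())
```

Using one shared header for every site keeps the ten outputs directly comparable. When writing to a real file, open it with `newline=""` so the `csv` module controls line endings.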

The scraping scripts are meant to run in a loop so that the data is always up to date.

In addition, the data should be obtained without the need to log in to the sites.

Looping the scripts is not part of the requirements for this project.

Preference:

[url removed, login to view] + Scrapy for scraping the sites.

[url removed, login to view] - should cover both speed and space.

The scripts should consume minimal memory and run as fast as possible.
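
One common way to keep memory use bounded, sketched here as an assumption rather than a required design, is to process scraped records lazily in fixed-size batches instead of collecting everything in a list first. The helper name `batched` is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` items without loading the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next `size` items from the source.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

For example, `list(batched(range(5), 2))` gives `[[0, 1], [2, 3], [4]]`. Each batch can be written out and discarded before the next is fetched, so peak memory depends on the batch size rather than on the number of scraped pages.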

[url removed, login to view] programmer should be smart, think outside the box, make decisions, and make everything work as expected.

4. I will be available, and I expect to be consulted when needed.

## Deliverables

Sites:

[[url removed, login to view]][1]

http://www.[[url removed, login to view]][2]

http://www.[[url removed, login to view]][3]

[[url removed, login to view]][4]

[[url removed, login to view]][5]

[[url removed, login to view]][6]

[[url removed, login to view]][7]

[[url removed, login to view]][8]

[[url removed, login to view]][9]

All scripts/code delivered by the programmer should run on InMotion Hosting under the "Launch" plan (<[url removed, login to view]>).

All code will be handed over to the Employer and will be solely owned by him.

The scraping should be 100% accurate, and no errors will be accepted. The data should be coherent and accurately reflect the data on the scraped sites.

Accuracy is extremely important in this case and is a condition for the full payment to be released to the worker.
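
A per-row sanity check is one way to catch obvious scraping errors before the data is delivered. The sketch below is only an illustration: the field names in `REQUIRED` and the function `validate_row` are assumptions, and a check like this cannot by itself prove that a row matches the source site.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List

# Hypothetical required fields; the posting does not name the columns.
REQUIRED = ("site", "product", "price")


def validate_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one scraped row (empty if none)."""
    problems = []
    for field in REQUIRED:
        if not row.get(field, "").strip():
            problems.append(f"missing {field}")
    price = row.get("price", "").strip()
    if price:
        try:
            value = Decimal(price)
            # Reject NaN, infinities, and negative prices.
            if not value.is_finite() or value < 0:
                problems.append("price out of range")
        except InvalidOperation:
            problems.append("price is not a number")
    return problems
```

Rows that return a non-empty list can be logged and re-scraped instead of being written to the CSV file.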

Some sites might require logging in.

The script should scrape all data regardless of whether it requires logging in or not.

Where logging in is required, the script should do it as a preliminary step.

It is the worker's responsibility to open a fictitious account for that purpose.
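
The "log in first, then scrape" flow could look roughly like the following standard-library sketch. The form field names `username` and `password`, the helper names, and any URL passed in are assumptions; real login forms differ per site and often need extra hidden fields such as CSRF tokens.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session() -> urllib.request.OpenerDirector:
    """Create an opener that keeps cookies between requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )


def build_login_request(
    login_url: str, username: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST request for a login page."""
    # Field names are hypothetical; inspect the real form to find them.
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        login_url, data=form.encode("utf-8"), method="POST"
    )
```

A script would call `session = make_session()`, send the login request with `session.open(request)`, and then fetch the data pages through the same `session` so the login cookies are reused. In a Scrapy spider, `FormRequest.from_response` serves a similar purpose.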

Perl PHP

Project ID: #2708209

About the project

Remote project · Active Feb 10, 2012