Closed

Update an existing web scraping tool that gathers data for a Startups database. It needs to gather data from known sites

First of all, apologies for the change of budget. The project description is now totally different: it is just a modification/upgrade of an existing scraper, not the development of a new one :)

NOT A BIG PROJECT, BUT AN INTERESTING ONE FOR SURE :)

THE WEB SCRAPER IS ALREADY DEVELOPED ACCORDING TO THE INSTRUCTIONS BELOW, BUT NEEDS TO BE UPGRADED (GUI/UX & SOME FUNCTIONS). You will find the dev files and documentation in the attached zip file. You can also check the project out on GitHub ([login to view URL]). We need to upgrade it so that it can adapt to changes on all the websites it scrapes data from, and we also want to add this website: www (dot) startupblink (dot) com

Original instructions: a web scraping tool that gathers data for a Startups database. It needs to gather data from known sites (more info in attached documents):

The web scraper should be able to gather basic lead info, such as the startup name and some contact information, and possibly the startup description and logo.

The web scraper should accept a URL parameter (where to scrape for data) and a depth level (how deep the scraper should dig, e.g. how many sub-links and sub-sections per URL, and whether the scraper should go outside the specified URL, e.g. follow external links).
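A minimal sketch of how the URL, depth level and external-link parameters could interact, assuming a breadth-first crawl. The names `crawl` and `fetch_links` are hypothetical and not taken from the existing scraper; `fetch_links` stands in for whatever downloads a page and extracts its links, and counting the starting URL as depth 1 is an assumption of this sketch.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, max_depth, follow_external, fetch_links):
    """Breadth-first crawl; returns the URLs to scrape, in discovery order.

    max_depth is an int, or None for no limit (assumed convention).
    fetch_links(url) must return the list of URLs linked from `url`.
    """
    start_host = urlparse(start_url).netloc
    found = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 1)])  # the starting URL counts as depth 1
    while queue:
        url, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # pages at the depth limit are kept but not expanded
        for link in fetch_links(url):
            if link in seen:
                continue
            if not follow_external and urlparse(link).netloc != start_host:
                continue  # skip links that leave the starting host
            seen.add(link)
            found.append(link)
            queue.append((link, depth + 1))
    return found


# Hypothetical link graph used instead of real HTTP requests.
SITE = {
    "https://a.example/": ["https://a.example/list", "https://b.example/"],
    "https://a.example/list": ["https://a.example/startup/1"],
    "https://b.example/": ["https://b.example/x"],
}


def fake_fetch(url):
    return SITE.get(url, [])


internal = crawl("https://a.example/", 2, False, fake_fetch)
everything = crawl("https://a.example/", None, True, fake_fetch)
```

Passing the link fetcher in as a function keeps the traversal logic independent of the scraping library, which fits the requirement to reuse an existing scraper regardless of its stack.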

At a later stage, the same web scraper should be configurable to search for leads other than startups, such as investment entities, service providers, etc.

Background and strategic fit.

All scraped info should be saved into two databases: one for startups (max 3 years old) and one for other companies. There should be a simple way to convert the DBs into CSV files.
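As an illustration of the CSV conversion, here is a minimal sketch that assumes the databases are SQLite files. The brief does not name a database engine, so SQLite, the `startups` table and its columns are all assumptions made for this example.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header row first."""
    # Table names cannot be bound as SQL parameters, so accept plain identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Demo with an in-memory database standing in for the real startups DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE startups (name TEXT, email TEXT, founded INTEGER)")
conn.execute("INSERT INTO startups VALUES ('Acme Labs', 'hi@acme.example', 2019)")
buffer = io.StringIO()
export_table_to_csv(conn, "startups", buffer)
csv_text = buffer.getvalue()
```

In real use `out` would be a file opened with `newline=""`, and the same function could be called once per database.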

The web scraping tool should be configurable so that admins can enter a starting URL and define what they are looking for (for example: startups, investment entities, service providers, etc.) as well as the list of data they want, such as company (startup) name, contact data, descriptions, and/or other properties.

From a tech perspective, the tool should use an existing web scraper, regardless of its tech stack. There are some pretty cool Java, Python and Node based web scrapers.

From a tech perspective, it must be an easily deployable tool that does not require additional server resources or a specific infrastructure stack that would create overhead. Basically, whatever can be run from a container or similar environment could work for us, as long as it is not resource-hungry and does not cost a lot to operate.

When the scraping tool is started, it should find the required data at the specified URL, then check whether we already have that data in our databases, and if not, save it into our Startup database.
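The check-then-save step could look roughly like the sketch below. It assumes SQLite and treats two leads as the same when their names match after lowercasing and whitespace cleanup; the brief does not define what counts as a duplicate, so that rule, the `startups` table layout and the helper names are all hypothetical.

```python
import sqlite3


def normalize_name(name):
    """Lowercase and collapse whitespace so 'Acme  Labs' and 'acme labs' match."""
    return " ".join(name.lower().split())


def save_if_new(conn, name, contact):
    """Insert the lead unless one with the same normalized name already exists.

    Returns True when a new row was written, False when it was a duplicate.
    """
    cursor = conn.execute(
        "INSERT OR IGNORE INTO startups (name_key, name, contact) VALUES (?, ?, ?)",
        (normalize_name(name), name, contact),
    )
    conn.commit()
    return cursor.rowcount == 1


# In-memory database for the demo; name_key is the uniqueness constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE startups (name_key TEXT PRIMARY KEY, name TEXT, contact TEXT)"
)
first = save_if_new(conn, "Acme Labs", "hi@acme.example")
second = save_if_new(conn, "  acme   LABS ", "other@acme.example")
row_count = conn.execute("SELECT COUNT(*) FROM startups").fetchone()[0]
```

Letting the database enforce uniqueness avoids a separate lookup query before each insert.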

Assumptions

1. Starting URL. As an operator, I want to be able to input the starting point (URL) for web scraping. MUST HAVE

The operator inputs the starting URL for scraping.

2. Search params. As an operator, I want to be able to input the parameters I am looking for. MUST HAVE

The operator inputs the types of data and properties to look for, such as: startup name, startup contact data, startup descriptions, startup logo.

The params should be added dynamically because they will vary from URL to URL

Each search param should accept multiple selectors. On some websites the startup name is labeled "startup name", while on others it is "company name" or just "name". We need to be able to define multiple param names and group them under a single title.

3. Depth level. As an operator, I want to be able to input the depth level for my starting URL. MUST HAVE

The operator can select the depth level for scraping from a dropdown with the values "1, 2, 3, 4, 5, any", defining how deep the scraper should dig from the starting URL.

4. Follow external links. As an operator, I want to be able to choose whether my scraping tool should follow any external links from my starting URL. MUST HAVE

The operator chooses Yes or No.
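For the search params requirement (several param names grouped under a single title), one possible shape is a mapping from each title to its accepted labels. This sketch treats "selectors" as field labels found on a page; if they are meant as CSS selectors, the same grouping idea applies with a different lookup. `FIELD_ALIASES` and both helper functions are hypothetical names.

```python
# Each canonical title lists the labels that different sites use for it.
FIELD_ALIASES = {
    "name": ["startup name", "company name", "name"],
    "contact": ["contact", "email", "e-mail"],
}


def build_alias_index(groups):
    """Invert {title: [aliases]} into {alias: title} for direct lookup."""
    index = {}
    for title, aliases in groups.items():
        for alias in aliases:
            index[alias.strip().lower()] = title
    return index


def group_fields(raw, groups):
    """Map scraped label/value pairs onto canonical titles.

    Labels that match no alias are dropped; the first match per title wins.
    """
    index = build_alias_index(groups)
    record = {}
    for label, value in raw.items():
        title = index.get(label.strip().lower())
        if title is not None and title not in record:
            record[title] = value
    return record


site_a = group_fields(
    {"Startup Name": "Acme Labs", "Email": "hi@acme.example"}, FIELD_ALIASES
)
site_b = group_fields({"company name": "Beta Co", "Founded": "2020"}, FIELD_ALIASES)
```

Because the groups are plain data, an operator could add params per URL at runtime without code changes, which matches the requirement that params be added dynamically.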

User interaction and design

The tool needs to have a very simple interface for operators, and it must require authorization before it can be used.

Skills: Web Scraping, PHP, Data Mining, Web Search, Data Entry


About the employer:
(3 reviews) Ljubljana, Slovenia

Project ID: #25625798

18 freelancers are bidding on average €303 for this job

wpoppo

Hi, As per need, you need to Update an already existing web scraping tool and gather data for Startups database, We have shared our company portfolio please go through it: We have a team of professional Full Stack de…

€400 EUR in 20 days
(193 reviews)
8.4
djlucas1234

Hello, there. I'm good at scraping with python script. I already used that source. So no problem to improve UI and other features. Let me know if possible to work. Thanks David Lucas

€175 EUR in 1 day
(56 reviews)
6.0
vladzolotukhin

Hello. As a web scraping and data mining expert by python and node.js, selenium web driver, I am glad to place the bid on your project. I have experienced LinkedIn profile scraping, amazon products scraping and ticket…

€400 EUR in 7 days
(23 reviews)
5.2
arturkandalyan

Hi sir, I am a Web & Data Scraping Expert who have career for 6 years over. I am very happy to bid on your job. I have already worked on several similar projects for collect data & contact & business information such a…

€200 EUR in 2 days
(6 reviews)
4.8
StanislavRezer

Hi, Manger! Here is a Web Scraping expert. I have experience web scrapping and auto send mail app. So I think it's best for this project. I can work 24/7. if you require, I can work 60hours per week. if you hire me, y…

€175 EUR in 7 days
(7 reviews)
4.0
PKonstiantyn

@Hello!@ I have read your description and understand your idea. I have also checked your attachment files. Web scrapping and auto script are my favorite skill. I did so many scrapping and auto script projects using py…

€500 EUR in 7 days
(2 reviews)
3.4
havrentiy

Hi, there. I have read your description carefully. I am very interested in your web scraper updating project. I have rich experience with web scraping using Python and PHP. Looking forward to hearing from you. Best wi…

€175 EUR in 3 days
(4 reviews)
2.8
ivanovic0216niko

Hi, how are you? I read your description carefully and understood all concept right. I've beend a PHP full stack developer for more than 4 years and have extensive experience in building the multi vendor marketplace…

€999 EUR in 30 days
(1 review)
2.0
€510 EUR in 10 days
(0 reviews)
0.0
Ashokomkar

I am interested for this job

€556 EUR in 25 days
(0 reviews)
0.0
€23 EUR in 2 days
(0 reviews)
0.0
muhammadsufyan85

Hi sir, I'm a Professional Person in this field, i can do this job efficiently. Why me ? Strict Confidentiality I won't provide client's data to any one You will get me The work will not be outsourced to anyon…

€250 EUR in 4 days
(0 reviews)
0.0
AnujMB

I will send you a set of lecture slides and notes and you will need to summarize and make them into concise notes.

€194 EUR in 10 days
(0 reviews)
0.0
Sunilbg1

I have been working with Software Developer for more than 08 years. I’ve come to know that you are looking for a software developer expert who knows the work very well. I want to let you know that I can fulfill your r…

€600 EUR in 7 days
(0 reviews)
0.0
ramsinghrawat184

Hi, there. I have read your description carefully. I am very interested in your web scraper updating project. I have rich experience with web scraping using Python Looking forward to hearing from you. Best wishes.

€175 EUR in 7 days
(0 reviews)
0.0
Babarali572

Hello! I’m Babar Ali. I’m a Data Entry/Data Processing Expert who knows the value of time, very hard working and always delivers the work on time. My Motive is to make my employer happy without adding additional charge…

€8 EUR in 1 day
(0 reviews)
0.0
AlekseevnaKF

I have rich experience of webscraping using python. I interest your project. If you award me this project, i'll work very hard for you Thanks

€100 EUR in 3 days
(0 reviews)
0.0
Kristijan08

Hello there, if you are looking for high quality results for your project, we are ready to cooperate. Feel free to contact us in private for more details. Thanks & Kind Regards [login to view URL]

€19 EUR in 7 days
(0 reviews)
0.0