
Parse web pages for specific string contents

$8-15 USD / hour

Closed
Posted about 9 years ago


I have a list of approx. 20,000-50,000 URLs. These URLs have totally different formats, and I need a parser to identify strings within the websites. The websites are job postings found via a specific search on [login to view URL]. The goal is to parse the job postings to extract some information.

Examples of what to do:
- Identify an e-mail address on the webpage.
- There is an existing DB with 40,000 short strings. All of them must be checked for whether they occur in the text of the website.

Experience with such numbers is needed, also performance-wise (it is probably best to download/cache the webpages).

Not needed: analyzing the structure or finding certain fields is NOT NEEDED, only finding strings or specific string/number combinations anywhere on the webpage.

The freelancer must be experienced with parsing different web formats, regular expressions, and string operations; semantic search is a plus. Besides 1:1 string search, it is an absolute bonus to bring ideas for semantic ways / hacks to identify strings that are similar to the existing ones. E.g., if a known string in the DB is "Human Ressources", then ideally "human resource" is also identified as a match. If the freelancer is good at this, I am happy to pay a bonus or set up a longer cooperation.

Note: access to the parser or handover of the script/tool must be given because the parser is needed on a regular basis.

You might also be interested in the other job, the creation of the URL list: https://www.freelancer.com/jobs/Data-Entry/Crawl-Indeed-create-lists-job/ I'm happy if somebody can take over both jobs; to get experts, I split crawling and parsing into two jobs.
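The two examples in the description (finding an e-mail address, and recognizing "human resource" as similar to the known string "Human Ressources") could be sketched roughly as follows. This is only an illustration, not the client's or any bidder's method: the function names, the simplified e-mail pattern, and the 0.85 similarity threshold are all assumptions, and the input is assumed to be page text that has already been downloaded and stripped of markup.

```python
import re
from difflib import SequenceMatcher

# Simplified e-mail pattern (an assumption; it does not cover every valid address).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return distinct e-mail-like strings in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def normalize(s):
    """Lowercase and reduce the text to letter/digit runs joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", s.lower()))


def similar_phrases(text, known, threshold=0.85):
    """Return (window, ratio) pairs for word windows in text that resemble known.

    Windows have as many words as the known string; threshold is a
    hypothetical cut-off that would need tuning on real data.
    """
    words = normalize(text).split()
    target = normalize(known)
    n = len(target.split())
    results = []
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        ratio = SequenceMatcher(None, target, window).ratio()
        if ratio >= threshold:
            results.append((window, round(ratio, 2)))
    return results


if __name__ == "__main__":
    page = "Contact hr@example.com. Our human resource team is hiring."
    print(find_emails(page))
    print(similar_phrases(page, "Human Ressources"))
```

Comparing every known string against every window this way is slow, so at the scale described it would likely serve only as a second pass over candidates that an exact search missed.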
Project ID: 7247754

About this project

17 proposals
Remote project
Active 9 years ago

17 freelancers are bidding on this job at an average of $10 USD/hour
Hello Sir, I am from vSol CORP. We are a 19-person team that has completed more than 500 projects, and we provide round-the-clock service. I have checked your project requirements. As this is a somewhat unusual requirement, can we talk to get a better understanding of the project? Can we chat here? Please reply when you are available. We would really like to discuss this project with you, and we are open to any type of negotiation. Let's talk. Thanks, vSol CORP
$8 USD in 40 days
4.6 (124 reviews)
6.8
Hello sir, we are an 8-member team and can start work right now. Please give us one chance; we will do our best.
$12 USD in 3 days
4.8 (240 reviews)
6.6
A proposal has not yet been provided
$12 USD in 3 days
4.8 (136 reviews)
6.4
Hi, I'm really interested in your project. I'll code a VBA macro that loops through the whole URL list and, for each URL, downloads the content so that the second part can run: searching for the e-mail and for your list of strings. The problem here is that your list of strings is very big, so it will take a lot of time to check every string against every page. Suppose you have 40,000 URLs and 40,000 strings: that is 1,600,000,000 iterations, which is huge! We have to find a solution for this (perhaps by dividing the strings into sets that share the same lexicon (ontology)) so we can decrease the number of iterations. Please note that I'm also interested in the second project. I have extensive experience scraping data from the web using my own software or by coding my own macros. I have a PhD in computer science and know what semantic search is. I have a small idea to make the macro run faster: using the metadata of each page instead of searching the webpage content. Looking for a long-term cooperation. Waiting for your reply. Best regards, Issam
$9 USD in 3 days
4.9 (81 reviews)
6.2
Hello sir, I am ready to accept both projects as you posted them. I have a total of 6 workers in my own office, and we are experts in all kinds of admin support work. Kindly reply so I can do sample work for you. Awaiting your positive response. Thanks & best regards, Faheem.
$10 USD in 6 days
5.0 (36 reviews)
5.3
I am interested in doing this project. Can start now. Let's start.
$8 USD in 112 days
4.8 (44 reviews)
4.7
Hello! I believe I'm a person who can handle both jobs or either of them. You can check out my recent work in my portfolio. If you notice this message, PM me to discuss details if you have any questions about my experience in crawling, big data, or regexp. Though I'm not good at semantic search, I have extensive experience building crawlers and a nice, clean code base to start with.
$12 USD in 3 days
5.0 (1 review)
1.4
A proposal has not yet been provided
$12 USD in 3 days
0.0 (0 reviews)
0.0
I don't know how to do it, but if I get to know how, then I will do it no matter what! I have been in the computer field for a long time, so I will do it easily, and I have more time for it than the others!
$12 USD in 3 days
0.0 (0 reviews)
0.0
Hi, expert programmer and web/data scraper here with over 19 years' experience in programming and RDBMS; please see my reviews. I use Perl parsers for this kind of job and am able to extract data fast.
$15 USD in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$8 USD in 24 days
0.0 (0 reviews)
0.0
Hello. My name is James Urbanic, I am 19 years old, and a high school graduate. I am looking for something part-time or full-time. I type 127 wpm and have done data entry and Excel work for my father's business, GIE Media. Cannot wait to hear back from you.
$12 USD in 40 days
0.0 (0 reviews)
0.0
Data scraping from websites is one of my regular jobs. If you choose me, I'll do my best. Thank you.
$14 USD in 3 days
0.0 (0 reviews)
0.0
I can input the data in this project for you in a way that will interest anyone who reads or uses it. I have great skills in data management and information policy. I can help you research and enter any information or data you need and craft that information into powerful pieces that will captivate and interest any reader. I have done projects for people, and all my employers applauded my professionalism. If you award me this project, I guarantee you 100% unique and error-free work. I work to meet deadlines, with no delays. Thanks.
$8 USD in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$12 USD in 15 days
0.0 (0 reviews)
0.0
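
One of the bids above estimates 40,000 URLs x 40,000 strings = 1,600,000,000 checks for a naive loop over every page and every string. A possible way around that, sketched below under stated assumptions, is to put the known strings into a hash-based index once and then look up each run of words from the page in that index, so the work per page depends on the page length rather than on the size of the DB. The helper names are hypothetical, the sample strings are invented, and the sketch only matches whole words; matching arbitrary substrings would need a different technique, such as an Aho-Corasick automaton.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(known_strings):
    """Map each normalized known string to its original form.

    Also returns the longest known string's length in words, which bounds
    the window sizes that have to be tried per page position.
    """
    index = {}
    max_words = 1
    for s in known_strings:
        tokens = tokenize(s)
        if tokens:
            index[" ".join(tokens)] = s
            max_words = max(max_words, len(tokens))
    return index, max_words


def find_known(text, index, max_words):
    """Return the set of known strings that occur in text as whole-word runs."""
    words = tokenize(text)
    found = set()
    for i in range(len(words)):
        for n in range(1, max_words + 1):
            if i + n > len(words):
                break
            key = " ".join(words[i:i + n])
            if key in index:  # average O(1) dictionary lookup
                found.add(index[key])
    return found


if __name__ == "__main__":
    # Naive cost quoted in the bid: every string checked against every page.
    print(40_000 * 40_000)

    # Invented sample data standing in for the 40,000-string DB.
    index, max_words = build_index(["Human Ressources", "Python", "SQL Server"])
    page = "We need Python and sql server skills."
    print(sorted(find_known(page, index, max_words)))
```

With this layout each page costs roughly (number of words) x (longest known string in words) lookups, whether the DB holds 40 strings or 40,000.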

About the client

Muenchen, Australia
0.0
0
Payment method verified
Member since May 29, 2014

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)