
Parallel Python Code That Counts How Many Websites Have Canvas

$10-30 USD

Completed
Posted over 6 years ago

Paid on delivery
I need a simple Python script that scrapes a list of websites in a CSV file (e.g. the top 500,000 Alexa sites, attached) and checks whether each website uses Canvas in its HTML (by checking for "<Canvas>") or in JavaScript (by checking for "createElement("canvas")" or "createElement('canvas')"). The script should output the number and percentage of websites in the list that use Canvas. It is recommended that the code use the Python libraries "Requests" and/or "BeautifulSoup4", with logic similar to the code I started writing (attached).

The following points need to be satisfied:
• The code uses parallel computing for efficiency, so that it does not run for too long.
• The HTTP header has to look like it came from a real browser, so that websites don't block the request.
• Reading a website should not take more than 30 seconds: if there is no response within 30 seconds, the request should time out and the script should move on to the next website.
• The script needs to count and print the number of successfully read and unread sites from the CSV file of top sites (as the attached script already does for the unread ones). A site may be unread because it is no longer available, is unresponsive, or for any other reason.
• The script needs to handle errors without crashing.
• The script has to print the duration of execution (in hours, minutes, or seconds).
• The script has to print the number and percentage of sites containing Canvas in either the HTML source code or the JavaScript.

It would be great to also have a non-parallel version for comparing performance, but this is not essential.
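The requirements above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the attached script or the delivered solution: it uses the standard library's `urllib` instead of Requests so that it runs without extra dependencies, it assumes the CSV has `rank,domain` rows, and the header values, the worker count, and all function names (`uses_canvas`, `fetch`, `check_site`, `survey`, `load_domains`, `format_duration`) are hypothetical.

```python
import csv
import re
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Assumed browser-like headers; the exact User-Agent string is illustrative.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8",
}
TIMEOUT_SECONDS = 30  # applies to each blocking socket operation, not the whole download

# HTML tag names are case-insensitive, so <Canvas> and <canvas id="x"> both match.
CANVAS_TAG = re.compile(r"<canvas[\s>/]", re.IGNORECASE)
# createElement("canvas") or createElement('canvas') in inline JavaScript.
CANVAS_JS = re.compile(r"createElement\(\s*[\"']canvas[\"']\s*\)")


def uses_canvas(page_source):
    """Return True if the page source contains a canvas tag or creates one in JS."""
    return bool(CANVAS_TAG.search(page_source) or CANVAS_JS.search(page_source))


def fetch(domain):
    """Download a page's source; raises on timeout, DNS failure, or HTTP error."""
    request = Request("http://" + domain, headers=HEADERS)
    with urlopen(request, timeout=TIMEOUT_SECONDS) as response:
        return response.read().decode("utf-8", errors="replace")


def check_site(domain, fetcher=fetch):
    """Return True/False for canvas use, or None if the site could not be read."""
    try:
        return uses_canvas(fetcher(domain))
    except Exception:  # any failure counts as "unread" so the run never crashes
        return None


def survey(domains, fetcher=fetch, workers=50):
    """Check all domains with a thread pool; workers=1 gives a sequential baseline."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda d: check_site(d, fetcher), domains))
    read = sum(r is not None for r in results)
    with_canvas = sum(r is True for r in results)
    return {
        "total": len(results),
        "read": read,
        "unread": len(results) - read,
        "canvas": with_canvas,
        # Percentage is taken over the whole list, as the posting asks.
        "canvas_pct": 100.0 * with_canvas / len(results) if results else 0.0,
        "seconds": time.perf_counter() - start,
    }


def load_domains(csv_path):
    """Read domains from a CSV assumed to contain rank,domain rows."""
    with open(csv_path, newline="") as handle:
        return [row[1].strip() for row in csv.reader(handle) if len(row) > 1]


def format_duration(seconds):
    """Format a duration in seconds as hours, minutes, and seconds."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"
```

A run might look like `stats = survey(load_domains("top-sites.csv"))` followed by printing `stats` and `format_duration(stats["seconds"])`, where the file name is a placeholder. Threads are used here because the work is dominated by waiting on the network; calling `survey` with `workers=1` gives the non-parallel version for comparison. Note that this only inspects the HTML document and its inline scripts, so canvas use inside external JavaScript files would be missed.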
Project ID: 15614099

About the project

1 proposal
Remote project
Active 6 years ago

Awarded to:
I am a Python expert and I can do your work. I can start immediately and complete your work on time.
$30 USD in 1 day
4.9 (161 reviews)
6.2

About the client

Alkhobar, Saudi Arabia
5.0
3
Payment method verified
Member since February 9, 2014

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)