Data Mining Project

已取消 已发布的 Jan 3, 2011 货到付款
已取消 货到付款

Requirements:

Produce a database program or Excel Spreadsheet Program that fetches Event and Ticket information from the internet and allows the user to add a Markup percentage to the Ticket Price, do lookups on Events that display the available tickets and the pricing with a markup.

Specifically the user should be able to filll the program with a list of URL’s then hit an Update Button which populates the database from the list and the data from each url. There should also be a field in the TICKETS table that is entered by the user as a percentage markup. There should be a look up facility so that the user can find by any field in the ARTIST Table that produces a report (worksheet) showinf results from the search. (see spreadsheet attached)

Customer populates records by entering URL/s into a field, the program is ran and goes to each URL and extracts the following data from the page and enters it into a number of tables

Table 1 = ARTIST

artist_event_id (assigned by system)

artist_event_name (retreived from web page)

artist_venue (retreived from web page)

artist_location (retreived from web page)

artist_event_date (ideally entered as a database date-time value)

Example of code to extract from:

Sample URL= [url removed, login to view]

WWE Presents: Raw World Tour

Madison Square Garden

New York, NY

Sun, Dec 26, 2010 07:00 PM

Table 2 = Tickets

artist_event_id (assigned by system as above)

tickets_id (assigned by system)

Section (retreived from web page)

Row (retreived from web page)

Seats (retreived from web page)

Qty (retreived from web page)

Price/Ticket (retreived from web page)

Markup (variable assigned by the user)

Example of code to extract from:

Sample URL= [url removed, login to view]

Section

Row

Seats

Qty

Price/Ticket

Here are some other sample url’s

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

It would be nice to include some kind of method to change the FETCH functions incase the existing format of the web page URL's changes.

An additional table of Artist_event_names will be joined to the program, this table will be populated by the Owner, When adding new URL's to be scraped the Owner will choose the event Artist_event_name from a drop down list, Upon Scraping the program will check for any instance of Artist_event_name in the relevant field eg The owner may have entered 'Bon Jovi' but the web page says 'Bon Jovie 2011 World Tour' the program should enter 'Bon Jovi' into the local database.

If no Artist_event_name has been entered by the owner or if the text is not found within the field then the program should enter the exact text that appears on the web page for Artist_event_name .

An Additional field to be entered by the user TIcket ID Start - will be the starting number for the Ticket ID field which will increment automatically by 1

Before Scraping the program should check for and delete any existing entries where the Event Date/Time has passed and delete those entries so that those url's do not get scraped again.

Prior to exporting the information to a spreadsheet this rule should be applied

If there is more than one record containing the exact same artist_event_name, Section and Row AND the difference in price is LESS THAN $30 AND the seats are consecutive THEN the records should be combined into one record using the SUM of the QTY's and using the Greatest Price.

There should be the following items on the main menu of the program

ENTER Artist_event_name DATA - This

数据挖掘 数据处理 MySQL PHP 网页搜罗

项目ID: #901629

关于项目

10个方案 远程项目 活跃的Jan 7, 2011

有10名威客正在参与此工作的竞标,均价$192/小时

SigmaVisual

We can help in your project, please check PMB and our ratings/reviews to get idea of our experience.

$250 USD 在10天内
(238条评论)
7.8
aruhat

Hello Pleases see PM. Regards, Chandni

$285 USD 在10天内
(12条评论)
6.1
developer13

Please see your PMB - thank you!

$250 USD 在14天内
(8条评论)
5.1
manishadk

please view PMB.

$125USD 在1天里
(9条评论)
4.2
ChinaCoCo

Please check PMB

$250 USD 在5天内
(2条评论)
3.6
LambodarL

Good processing knowledge

$250 USD 在15天内
(0条评论)
0.0
samkhan

Please see PMB.

$50 USD 在7天内
(0条评论)
0.0