Cancelled

Data collector engine to continuously query Twitter, Facebook, and other APIs and store the data

We are looking for someone who can build a web application, hosted on Amazon AWS, that continuously queries Twitter, Facebook, and possibly some other public APIs to fetch all available fields for a provided list of IDs and store them in MySQL (Amazon RDS).

More information is in the detailed requirements section.

## Deliverables

Spec: You will be given access to the following:

-- All the keys for the APIs

-- Access to our AWS dashboard, so that you can spin up test instances and use any other services

-- An initial set of Twitter handles and Facebook IDs (the task also involves the ability to add more IDs in the future)

There are around 1500 IDs that we want to monitor continuously (near real time), updating the database with the information found. In the future we want to build a visualization platform on top of this data, so the MySQL schema has to be normalized. The APIs limit the number of calls, and your system should honor those limits without compromising too much on information availability.
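To make the trade-off between API limits and freshness concrete, here is a minimal sketch of how the polling pace could be planned. The class name `PollPlanner`, the quota of 900 calls per 15 minutes, and the batch size of 100 IDs per call are assumptions for illustration only; the real limits have to be taken from each API's documentation.

```java
import java.time.Duration;

/** Hypothetical helper: spreads a fixed API quota evenly over the monitored IDs. */
final class PollPlanner {
    private PollPlanner() {}

    /** Delay between two consecutive calls so that at most callsPerWindow calls fit in one window. */
    static Duration delayBetweenCalls(int callsPerWindow, Duration window) {
        if (callsPerWindow <= 0) {
            throw new IllegalArgumentException("callsPerWindow must be positive");
        }
        // Round up so the quota is never exceeded.
        long millis = (window.toMillis() + callsPerWindow - 1) / callsPerWindow;
        return Duration.ofMillis(millis);
    }

    /** Time needed for one full pass over all IDs when each call can fetch idsPerCall IDs. */
    static Duration refreshCycle(int idCount, int idsPerCall, int callsPerWindow, Duration window) {
        if (idsPerCall <= 0) {
            throw new IllegalArgumentException("idsPerCall must be positive");
        }
        int calls = (idCount + idsPerCall - 1) / idsPerCall; // ceiling division
        return delayBetweenCalls(callsPerWindow, window).multipliedBy(calls);
    }

    public static void main(String[] args) {
        // Assumed quota, not a documented limit: 900 calls per 15 minutes, 100 IDs per call.
        Duration window = Duration.ofMinutes(15);
        System.out.println("Delay between calls: " + delayBetweenCalls(900, window));
        System.out.println("Full refresh of 1500 IDs: " + refreshCycle(1500, 100, 900, window));
    }
}
```

The refresh cycle is the figure to compare against the "near real time" goal: if it comes out too long for a given API, the options are batching more IDs per call or polling the more active IDs more often than the rest.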

Platform requirements:

1) Jetty- or Tomcat-based web application

2) Data storage engine is MySQL (you can pick Amazon RDS so that setup and maintenance are much easier)

3) Some Java servlets to respond to basic queries

4) Some mechanism for alerting (Amazon SES) or other health checks to monitor the liveness of the web application (optional)

5) We can discuss the data storage layout and the important fields to populate once you start working on the project.
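For the optional liveness check in item 4, one possible approach is to record the time of the last successful collection run and treat the collector as down when that time is too old. The sketch below uses only the JDK; the class name `LivenessMonitor` and the idea of a maximum silence threshold are assumptions, and the servlet or alerting job that would consult it is deliberately left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical liveness tracker: the collector is alive if it succeeded recently enough. */
final class LivenessMonitor {
    private final Duration maxSilence;
    // Null until the first successful run has been recorded.
    private final AtomicReference<Instant> lastSuccess = new AtomicReference<>();

    LivenessMonitor(Duration maxSilence) {
        this.maxSilence = maxSilence;
    }

    /** Called by the collector after each successful fetch-and-store run. */
    void recordSuccess(Instant when) {
        lastSuccess.set(when);
    }

    /** True if a success was recorded no longer than maxSilence before now. */
    boolean isAlive(Instant now) {
        Instant last = lastSuccess.get();
        return last != null && Duration.between(last, now).compareTo(maxSilence) <= 0;
    }
}
```

A health-check servlet could return an error status when `isAlive(Instant.now())` is false, and a scheduled job could send an alert on the same condition.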

The project is considered done when we have a live version of the web application running on AWS and populating the data continuously. This is the very early stage of an interesting project, so there is huge scope for further work.

A note regarding the usage of this web application: We are only trying to collect some metrics exposed by public APIs and apply data mining on top of them to extract insights. This is not an external user-facing website, and the information mined is used only for analytics purposes. Hence we are not violating any TOS of the APIs under consideration.

Skills: Amazon Web Services, Software Architecture

About the Employer:
( 0 reviews ) United States

Project ID: #2722374