Mail Delivery: Automated Parsing and Storage of Bounced and Deliverable Addresses
The goal of this project is to centralize the PMTA delivery accounting files from a large fleet of PowerMTA (PMTA) mail server nodes and to build a central database of non-deliverable (bounced) email addresses and known-good ("deliverable") email addresses.
We want to automatically deploy Logstash on each node to push all of the PMTA accounting log files back to an Elasticsearch cluster. A dedicated "bounce-processing" server will continuously parse the new logs and store bounce addresses, deliverable addresses, and other metrics in the centralized database.
To do this, we need a PMTA accounting-log parser written in both Clojure and Python (to compare performance).
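As a starting point, the Python side of the parser could look something like the sketch below. It assumes the accounting files are CSV with a header row containing at least `type` and `rcpt` columns; PMTA accounting field sets are configurable per install, so the exact columns (and any `bounceCat` field) would need to match the actual `<acct-file>` configuration on the nodes.

```python
import csv
import io

def parse_accounting(fileobj):
    """Split PMTA accounting records into deliverable and bounced addresses.

    Assumes a CSV file with a header row including 'type' and 'rcpt'.
    The field set is configurable in PMTA, so adjust to your config.
    """
    deliverable, bounced = [], []
    for row in csv.DictReader(fileobj):
        rtype = row.get("type", "").strip()
        if rtype == "d":        # successful delivery record
            deliverable.append(row["rcpt"])
        elif rtype == "b":      # permanent (hard) bounce record
            bounced.append(row["rcpt"])
        # other record types (e.g. transient failures) ignored in this sketch
    return deliverable, bounced

# Example with a synthetic two-record accounting file:
sample = io.StringIO(
    "type,rcpt,bounceCat\n"
    "d,good@example.com,\n"
    "b,gone@example.com,bad-mailbox\n"
)
ok, bad = parse_accounting(sample)
```

In the real pipeline the same logic would run against documents pulled from Elasticsearch rather than raw files, but the classification step is identical.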
For example, we may have 100+ PMTA servers that we'd like to sync to a central log-processing server, with everything logged into a central MySQL DB, plus a basic report panel that shows graphs based on the DB data. The solution should scale to 1000+ systems simply by expanding the Elasticsearch cluster as activity grows.
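The central address store can stay simple: one row per address with its latest known status. The sketch below uses an illustrative schema and SQLite as an in-process stand-in for MySQL (swap in PyMySQL or mysql-connector and equivalent MySQL DDL in production); the table and column names are assumptions, not an existing schema.

```python
import sqlite3

# Illustrative schema; sqlite3 stands in here for the central MySQL DB.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS email_status (
        email  TEXT PRIMARY KEY,
        status TEXT NOT NULL CHECK (status IN ('deliverable', 'bounced'))
    )
""")

def record_status(conn, email, status):
    # Upsert, last write wins: a later hard bounce overrides 'deliverable'.
    conn.execute(
        "INSERT INTO email_status (email, status) VALUES (?, ?) "
        "ON CONFLICT(email) DO UPDATE SET status = excluded.status",
        (email, status),
    )

record_status(conn, "user@example.com", "deliverable")
record_status(conn, "user@example.com", "bounced")
row = conn.execute(
    "SELECT status FROM email_status WHERE email = ?",
    ("user@example.com",),
).fetchone()
```

A single-row-per-address design keeps the report panel's queries cheap; per-event history would live in Elasticsearch/Graphite rather than MySQL.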
The solution will be similar to the one described here: [url removed, login to view]
For more info on similar projects, see this comment as well: [url removed, login to view]. Note that this deployment will include both Linux and Windows PMTA boxes syncing back to the central log-processing server (this shouldn't matter on the log-cluster side; each node environment will still run a Logstash install pushing logs back to the central store).
The main infrastructure in this cluster will consist of an Elasticsearch + Logstash + Redis-broker setup, with Graphite for stats/monitoring (deliveries, bounces, connections, and perhaps a few other metrics such as error types and soft vs. hard bounces).
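Feeding those counters into Graphite is straightforward: Graphite's carbon daemon accepts a plaintext protocol of one `<metric path> <value> <timestamp>` line per metric, sent over TCP (port 2003 by default). A minimal formatter, with an assumed `pmta.*` metric namespace:

```python
import time

def graphite_lines(metrics, prefix="pmta", ts=None):
    """Format counters as Graphite plaintext-protocol lines.

    Each line is "<path> <value> <timestamp>"; the resulting lines
    would be joined with newlines and written to carbon's TCP port.
    The 'pmta' prefix and metric names are illustrative.
    """
    ts = int(ts if ts is not None else time.time())
    return [f"{prefix}.{name} {value} {ts}"
            for name, value in sorted(metrics.items())]

lines = graphite_lines({"deliveries": 120, "bounces": 7}, ts=1700000000)
```

In practice the bounce processor would aggregate counts per flush interval (or emit them via StatsD, which handles the aggregation itself) rather than sending one line per log record.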
We would also like an instance of Kibana for basic reporting and overview, and Logstash + StatsD + Graphite for more sophisticated analysis.