EuroPython 2015

Actionable data analytics in retail marketing analysis

We present a web-based BI dashboard system for companies that operate in a large market made up of several points of sale (POS) and that provide services such as stocking, distribution logistics, commercial support, and promotional actions.

We have equipped the infrastructure with a set of statistical machine learning tools typical of high-throughput bioinformatics, e.g., clustering procedures for time series. Machine learning functions are actionable directly from online graphs, such as biclustering panels in which subsets of retailers and sales categories can be selected interactively. The system currently manages 250 million entries from the sales stream. Network analysis (detection of community structure and co-occurrence patterns), combined with geospatial and socio-economic data, is being developed as a strategic tool.
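To make the idea of clustering sales time series concrete, here is a minimal sketch in pure Python. It is not the method used in the system, which the abstract does not specify: it assumes correlation distance (1 minus the Pearson correlation) and average-linkage agglomerative clustering, and the POS identifiers and sales figures are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat series carries no shape information
    return cov / sqrt(var_a * var_b)


def cluster_series(series, n_clusters):
    """Average-linkage agglomerative clustering on correlation distance.

    `series` maps a label to a list of values; returns a list of label sets.
    """
    labels = list(series)
    # Pairwise distance: 0 for identical shapes, 2 for opposite trends.
    dist = {
        frozenset((p, q)): 1.0 - pearson(series[p], series[q])
        for p, q in combinations(labels, 2)
    }
    clusters = [{label} for label in labels]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster pairs.
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > max(n_clusters, 1):
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical weekly sales per point of sale (illustrative values only).
weekly_sales = {
    "pos_a": [10, 12, 15, 20, 26],
    "pos_b": [100, 118, 155, 210, 250],
    "pos_c": [30, 24, 19, 15, 9],
    "pos_d": [300, 260, 180, 140, 100],
}
groups = cluster_series(weekly_sales, 2)
print([sorted(g) for g in groups])
```

Because correlation distance ignores scale, small and large outlets with the same sales trend end up in the same group; a production system would more likely rely on an optimized library or, as described for this system, on in-database procedures.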

The system is implemented as a Django web application deployed on an AWS machine, using Celery and Redis to distribute tasks. This scalable framework can be accessed through a web interface by the strategic marketing and R&D departments and by other senior managers; a similar but leaner interface is available to individual POS owners. The web interface integrates JavaScript libraries (D3.js, Highcharts, Sigma.js, Heatmap.js, Leaflet, InCHlib) to provide interactive displays that connect machine learning with data exploration. In particular, we forked the django-highchart repository to improve the functionality available for the Django framework. Actionable dendrogram structures and sunburst plots make it possible to handle the large taxonomies typical of category management reference structures. Internally, the statistical machine learning methods are deployed as stored procedures in a PostgreSQL/PostGIS database, powered by the PL/R and PL/Python extensions.
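As an illustration of how a category taxonomy could be prepared for a sunburst plot, the sketch below folds flat (category path, sales value) rows into the nested name/children/value layout commonly fed to D3 hierarchy layouts. The abstract does not describe the actual data format; the function names, category labels, and figures here are hypothetical.

```python
import json


def build_hierarchy(rows, root_name="all categories"):
    """Fold (category path, sales value) rows into a nested tree."""
    root = {"name": root_name, "children": {}}
    for path, value in rows:
        node = root
        for part in path:
            # Create each level of the taxonomy on first sight.
            node = node["children"].setdefault(
                part, {"name": part, "children": {}}
            )
        # Accumulate sales on the node where the path ends.
        node["value"] = node.get("value", 0) + value
    return _finalize(root)


def _finalize(node):
    """Turn the children dicts into lists and omit them on leaves."""
    children = [_finalize(child) for child in node["children"].values()]
    out = {"name": node["name"]}
    if children:
        out["children"] = children
    if "value" in node:
        out["value"] = node["value"]
    return out


# Hypothetical category paths and sales values (illustrative only).
rows = [
    (("food", "dairy", "milk"), 120),
    (("food", "dairy", "cheese"), 80),
    (("food", "bakery"), 50),
    (("household", "cleaning"), 30),
]
tree = build_hierarchy(rows)
payload = json.dumps(tree)  # e.g. the body of a JSON response for the plot
```

Keeping only leaf values in the payload leaves the aggregation of inner nodes to the plotting library, which keeps the server-side code small; a Django view could return `payload` as a JSON response.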
