Are you wondering why your subscribers leave?
Have you collected data to predict how your customers will behave, but don't know where to start?
Do you have a new product, but your landing page doesn't convince potential customers to try it out?
Or maybe you want to recommend products to your customers that they will definitely like?
I bring a unique combination of expertise.
It takes careful planning and thorough cleaning to get a neat dataset ready for analysis. I know how to clean dirty data and save weeks of working time with parallelized data analysis pipelines.
I can build a quick, insightful model when you need better predictions or want to know which part of your operation needs more attention. I can also develop a sophisticated, fine-tuned model that powers your decisions with high predictive accuracy.
Shaping an idea into a working piece of software requires knowledge, experience and hard work. My software development skills, experience and intuition are always ready for new challenges.
Stack: scikit-learn, rpy2, Bayesian CCA, logistic regression (plain here; for the publication I used Bayesian logistic regression with a Laplace prior, which acts like L1 regularization)
Bayesian methods make it possible to build models that incorporate prior knowledge and to make sense of models that are otherwise too complicated to solve analytically. In the example presented here, I provide a tutorial that analyses real data on the moral judgments of American and Russian individuals. A similar approach extends naturally to A/B testing, where it can provide more accurate estimates.
Stack: PyMC3, plot.ly, matplotlib
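To give a feel for how the Bayesian approach carries over to A/B testing, here is a minimal sketch. It is not the PyMC3 model from the tutorial: it assumes the simplest case, a uniform Beta(1, 1) prior on each conversion rate, so the posterior is a Beta distribution that can be sampled with the standard library alone. The function name and the visitor and conversion counts are hypothetical, chosen only for illustration.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from Beta posteriors.

    Assumes a Beta(1, 1) prior on each conversion rate; with binomial data
    the posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical experiment: 1000 visitors per variant.
p_better = prob_b_beats_a(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"P(B converts better than A) = {p_better:.3f}")
```

The result is a direct probability statement about the two variants rather than a p-value, which is usually easier to act on. An informative prior, for example one based on earlier campaigns, would replace the two 1s in each `betavariate` call.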
Subscription-based enterprises rely on long-lasting clients.
My experience with logistic regression and ensemble methods such as random forests can help you pinpoint the problematic service areas that lead to churn and flag clients who are likely to leave soon and may need additional attention.
Here is a model that successfully predicts which individuals are going to churn and reveals which aspects of a subscription may increase the chance of churning.
Classifier used: Random Forest
Predictive accuracy: 95%
Stack: pandas, scikit-learn, plot.ly
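The general shape of such a churn model can be sketched with scikit-learn. This is an illustration under stated assumptions, not the model behind the figures above: the data is synthetic, the feature names (`monthly_charge`, `support_calls`, `tenure_months`) are hypothetical, and the accuracy it prints belongs to this toy data only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for subscriber data; feature names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
feature_names = ["monthly_charge", "support_calls", "tenure_months"]
monthly_charge = rng.uniform(20, 100, n)
support_calls = rng.poisson(3, n)
tenure_months = rng.uniform(1, 60, n)
X = np.column_stack([monthly_charge, support_calls, tenure_months])

# Assumed ground truth: more support calls and shorter tenure raise churn risk.
logit = 2.0 * (support_calls - 3) - 0.03 * (tenure_months - 30)
churned = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

X_train, X_test, y_train, y_test = train_test_split(
    X, churned, test_size=0.25, random_state=0, stratify=churned
)

model = RandomForestClassifier(n_estimators=200, max_depth=4, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))

# Impurity-based importances: a rough ranking of which features drive churn.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

# Churn probability per held-out client, to flag those needing attention.
churn_risk = model.predict_proba(X_test)[:, 1]
at_risk = np.argsort(churn_risk)[::-1][:10]  # indices of the 10 riskiest clients

print(f"held-out accuracy: {accuracy:.2f}")
for name, importance in ranked:
    print(f"{name}: {importance:.2f}")
```

The importance ranking points to the service areas worth investigating, while `churn_risk` gives a per-client score that can be used to decide who to contact first. Impurity-based importances can overstate features with many distinct values, so permutation importance is a reasonable cross-check on real data.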
Dashboard template built with ♥ by Keen IO
Stack: Flask, Pandas, PostgreSQL, D3.js. Click on a region to zoom.
11/01/2011 - Present
01/12/2009 - 18/11/2010
14/10/2007 - 09/05/2009