Interactive analytics and predictions on Restaurant tips

Imagine you own a restaurant and you want to analyze not only the trend of your revenue, but also the reason behind periods of particularly high earnings, moment of the day where a particular kind of clients comes to your restaurant, why some days tips are higher than others and so on. Knowing all those […]

Time Series: why do we need Stationarity and Ergodicity

A time series is a series of data points indexed in time order, normally with equally spaced points in time. Examples of time series are stocks’ prices, monthly returns, company’s sales and so forth. Time series can be seen as data with a target variable (price, returns, amount of sales…) and one feature only: time. […]

Optimization algorithms: the Newton Method

Predictive Statistics and Machine Learning aim at building models with parameters such that the final output/prediction is as close as possible to the actual value. This implies the optimization of an objective function, which might be either minimized (like loss functions) or maximized (like Maximum Likelihood function). The idea behind optimization routine is starting from […]

Building Machine Learning Apps with Streamlit

Streamlit is an open-source Python library that makes it easy to build beautiful apps for machine learning. You can easily install it via pip in your terminal and then start writing your web app in Python. In this article, I’m going to show some interesting features about Streamlit, building an app with the purpose of […]

Cross-Validation for model selection

When you are dealing with a Machine Learning task, you have to properly identify your problem so that you can pick the most suitable algorithm. As first thing, namely, you could categorize your task either as supervised or unsupervised and, if supervised, either as classification or as regression (you can read more about it here). […]

Building a ML model in 3 lines of code? Yes you can

Machine Learning as a subject is not easy. It is indeed a set of tools (mainly algorithms and optimization procedures) whose comprehension involves, inevitably, a deep understanding of Maths and Stats. Nevertheless, the implementation of a ML model to a real scenario might be easier than expected. Indeed, once you got familiar with theoretical concepts, […]

Ensemble Methods for Machine Learning: AdaBoost

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could not be obtained from any of the constituent learning algorithms alone. The idea of combining multiple algorithms was first developed by computer scientist and Professor Michael Kerns, who was wondering whether “weakly learnability is equivalent to strong learnability”. The goal was turning a weak […]

One-way Analysis of Variance (ANOVA) with Python

When you are dealing with data which are presented to you in different groups or sub-populations, you might be interested in knowing whether they arise from the same population, or they represent different populations (with different parameters). Let’s consider the following picture: As you can see, there are three different footpaths. Now the question is: […]

5 Python Packages a Data Scientist can’t live without

Python is a general purpose language and, as such, it offers a great number of extensions which range from scientific programming to data visualization, from statistical tools to machine learning. It is almost impossible knowing every available extension, however there are a few of them which are pivotal if your task consists of analyzing data […]

Handling missing values with Missingo

Whenever you are about to inspect and manage some data, one of the first inconvenient which might arises is the presence of some missing values. Together with eventual outliers, they might affect the robustness of your Machine Learning model, it is worth spending some extra time during your cleaning procedure and investigating about the nature […]