Data science and analytics apply to a huge variety of fields, essentially wherever information is delivered in the form of data. The sports industry is no exception. There is a large business around it, and the ability to study the sports market with powerful analytics tools is a great […]

# Author Archives: valentinaalto

## Building Machine Learning Apps with Streamlit

Streamlit is an open-source Python library that makes it easy to build beautiful apps for machine learning. You can easily install it via pip in your terminal and then start writing your web app in Python. In this article, I’m going to show some interesting features of Streamlit by building an app with the purpose of […]

## Maximum Likelihood Estimation

The main goal of statistical inference is learning from data. However, the data we want to learn from are not always available or easy to handle. Imagine we want to know the average income of American women: it might be infeasible or highly expensive to collect income data for all American women. Besides, even in a scenario where this […]

## Markov Chain Montecarlo

A Markov chain can be defined as a stochastic process Y in which the value at each time t depends only on the value at time t-1. This means that the probability of our stochastic process being in state x at time t, given all its past states, is equal to the probability […]
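As a rough illustration of the property described above, here is a minimal sketch of a two-state chain in which the next state is drawn using only the current state. The weather states, the transition probabilities, and the `simulate_chain` helper are hypothetical choices for illustration and do not come from the article.

```python
import random

# Hypothetical two-state chain: states and probabilities are illustrative
# assumptions, not taken from the article.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def simulate_chain(transitions, start, n_steps, seed=0):
    """Simulate a Markov chain for n_steps transitions from `start`."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(n_steps):
        # The next state depends only on the current state (Markov property):
        # nothing earlier in `path` is consulted.
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


path = simulate_chain(TRANSITIONS, "sunny", 10)
print(path)
```

Each row of `TRANSITIONS` sums to 1, so it can be read as the conditional distribution of the state at time t given the state at time t-1.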

## Cross-Validation for model selection

When you are dealing with a Machine Learning task, you have to properly identify your problem so that you can pick the most suitable algorithm. As a first step, you could categorize your task as either supervised or unsupervised and, if supervised, as either classification or regression (you can read more about it here). […]

## Building a ML model in 3 lines of code? Yes you can

Machine Learning as a subject is not easy. It is a set of tools (mainly algorithms and optimization procedures) whose comprehension inevitably requires a deep understanding of maths and stats. Nevertheless, applying an ML model to a real scenario might be easier than expected. Indeed, once you are familiar with the theoretical concepts, […]

## Analyzing U.S. exports with Plotly

In my previous article, I provided an introduction to some useful graphical tools available in Plotly, an open-source library which can be used in both Python and R. Here, I’m going to play a bit more with Plotly’s functionalities, using as input some data about USA exports in 2011. So let’s import and explore […]

## Understanding Rejection Sampling method

Rejection sampling is a computational technique whose aim is to generate random numbers from a target probability distribution f(x). It belongs to the general field of Monte Carlo methods, whose core idea is repeated random sampling to numerically estimate unknown parameters. Some words about randomness: one might ask why a random variable with probability […]
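To make the idea concrete, here is a minimal sketch of rejection sampling under stated assumptions: the target f(x) = 6x(1-x) on [0, 1] (a Beta(2, 2) density), the uniform proposal, and the envelope constant 1.5 are illustrative choices, not taken from the article.

```python
import random


def target_pdf(x):
    """Assumed target density f(x) = 6x(1-x) on [0, 1] (Beta(2, 2))."""
    return 6.0 * x * (1.0 - x)


def rejection_sample(n_samples, seed=0):
    """Draw n_samples from target_pdf using a Uniform(0, 1) proposal."""
    rng = random.Random(seed)
    envelope = 1.5  # maximum of target_pdf, reached at x = 0.5
    samples = []
    while len(samples) < n_samples:
        x = rng.random()  # candidate from the proposal g(x) = 1 on [0, 1)
        u = rng.random()  # uniform draw used for the accept/reject test
        # Accept x with probability f(x) / (envelope * g(x)).
        if u * envelope <= target_pdf(x):
            samples.append(x)
    return samples


samples = rejection_sample(5000)
sample_mean = sum(samples) / len(samples)
print(f"sample mean: {sample_mean:.3f} (the assumed target has mean 0.5)")
```

With this envelope, about two candidates out of three are accepted on average, since the acceptance rate equals 1 divided by the envelope constant.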

## Ensemble Methods for Machine Learning: AdaBoost

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. The idea of combining multiple algorithms was first developed by computer scientist and Professor Michael Kearns, who was wondering whether “weak learnability is equivalent to strong learnability”. The goal was turning a weak […]

## Combinatorics: permutations, combinations and dispositions

Combinatorics is the field of mathematics primarily concerned with counting elements from one or more sets. It can help us count the number of orders in which something can happen. In this article, I’m going to dwell on three different types of techniques: permutations, dispositions, and combinations. Permutations: these are the easiest to compute. Imagine we have n objects, all different from one another. […]
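As a small illustration of the permutation case, the sketch below enumerates every ordering of three distinct objects with the standard library and checks the count against the standard result that n distinct objects can be ordered in n! ways. The list `objects` is a hypothetical example.

```python
import itertools
import math

# Hypothetical example: three distinct objects.
objects = ["a", "b", "c"]

# Enumerate every possible ordering of the objects.
orderings = list(itertools.permutations(objects))

# For n distinct objects the number of orderings is n! (n factorial).
expected_count = math.factorial(len(objects))

for ordering in orderings:
    print(ordering)
print(f"{len(orderings)} orderings, n! = {expected_count}")
```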