Data Science

Analyzing Epidemiological Time Series With The R Package {distantia}

Tutorial on the applications of the R package {distantia} to the analysis of epidemiological time series.

Coding a Minimalistic Dynamic Time Warping Library with R

Tutorial on how to implement dynamic time warping in R.

R package collinear

R package for multicollinearity management in data frames with numeric and categorical variables.

R package distantia

R package to compare multivariate time series with dynamic time warping and lock-step methods.

A Gentle Intro to Dynamic Time Warping

Brief introduction to Dynamic Time Warping with a conceptual step-by-step breakdown.

My Reading List: Data Science

Live post with a curated list of high-quality data science posts and videos I found enlightening.

Mapping Categorical Predictors to Numeric With Target Encoding

Target encoding is commonly used to map categorical variables to numeric ones in order to facilitate exploratory data analysis and machine learning modeling. This post covers the basics of the method and explains how and when to use it.

Everything You Don't Need to Know About Variance Inflation Factors

In-depth explanation of what Variance Inflation Factors (VIF) are, how they work, what they really mean, and how they are used to manage multicollinearity in linear models.

Multicollinearity Hinders Model Interpretability

In this post, I delve into the intricacies of model interpretation under the influence of multicollinearity, and use R and a toy data set to demonstrate how this phenomenon impacts both linear and machine learning models.

Documenting, storing, and executing models in Ecology: A conceptual framework and real implementation in a global change monitoring program

Many of the best practices concerning the development of ecological models or analytic techniques published in the scientific literature are not fully available to modelers but rather are stored in scientists' digital or biological memories. We propose that it is time to address the problem of storing, documenting, and executing ecological models and analytical procedures. In this paper, we propose a conceptual framework to design and implement a web application that will help to meet this challenge. This tool will foster cooperation among scientists, enhancing the creation of relevant knowledge that could be transferred to environmental managers. We have implemented this conceptual framework in a tool called ModeleR. This is being used to document, share, and execute more than 200 models and analytical processes associated with a global change monitoring program that is being undertaken in the Sierra Nevada Mountains (south Spain). ModeleR uses the concept of scientific workflow to connect and execute different types of models and analytical processes. Finally, we have envisioned the creation of a federation of model repositories where models documented within a local repository could be linked and even executed by other researchers.