Segmentation: from grouping to classification
Segmentation is a key process in data analysis: it divides a dataset into relatively homogeneous groups based on specific criteria. Its purpose is to identify hidden patterns, differences and similarities between objects in the dataset, enabling more precise and relevant analyses. Two …
Recoding quantitative variables into qualitative ones – techniques and their practical application
When analysing the data, we take into account both quantitative information (such as salary, age, number of products ordered) and qualitative information (e.g. gender, education, level of satisfaction with service). In order to make it easier to work with the data or to adapt it to a specific stati…
Outlier or anomaly? Detection of abnormal observations
Can one abnormal occurrence cause concern? Based on one deviation from the norm, should a red light start flashing? Of course! In many industries and businesses, an anomaly is a sign that must be reacted to quickly and efficiently in order to prevent consequences. So how do you recognise an anomaly…
Statistical inference
Statistical inference is the branch of statistics that makes it possible to describe, analyse and draw conclusions about a whole population on the basis of a sample.
Outlier cases. Identification and significance in data analysis
In data analysis, it is important to identify unusual observations that are significantly different from the others. Such values, called outliers or outlier cases, can affect the results of statistical analysis and lead to erroneous conclusions. In this material we will look at what outliers are, t…
Levels of measurement
The level of measurement is one of the most important properties of variables. It determines which statistical tests will be available to the researcher during the course of the analysis. But what information does it convey to us specifically? A level of measurement is a pattern of measurement that…
Pearson's chi-square correlation test
Pearson's chi-square test is among the most popular statistical tests. It is worth noting at the outset that it has more than one application. In this material, I will discuss the main differences between its variants and introduce the most important issues related to the chi-square test.
Gini index
The Gini index is a measure of the concentration of the distribution of a variable.