Blog
Automatic preparation of data for analysis
Data preparation plays a key role in data analysis and machine learning. Its importance stems from several factors that affect the quality and reliability of the results. High-quality data leads to more accurate and reliable statistical models. Raw, unprocessed data often cont…
Data gaps in quantitative data analysis - what are they and how to deal with them?
Missing data in the context of data analysis refers to situations where there are no values for certain variables or observations in a dataset. In other words, they are places where a number, text, or some other form of data was expected, but for various reasons was not there. Missing data can take…
When is a value ‘good’ or ‘bad’? About thresholds and alerts in dashboards
A dashboard is a form of reporting that allows the analyst to convey information concisely.
Customer Effort Score (CES)
The Customer Effort Score (CES) is, along with the Net Promoter Score (NPS) and Customer Satisfaction (CSAT), one of the main indicators related to customer satisfaction.
Building charts with bricks
Data visualisation helps present, in an attractive way, information that would be difficult to interpret in the form of a table. Undoubtedly, for many managers, the phrase ‘time is money’ is a life motto, not a cliché. They receive a lot of information in tables but often need to quickly verify the p…
Median
The median is a statistic that we classify as a measure of central tendency. It is one of the most popular descriptive statistics next to the arithmetic mean. For students of analytics, it is one of the first statistics they become familiar with. In addition to its simple interpretation …
The power of a test
The power of a test is the probability of detecting a statistically significant effect when one actually occurs in the population under study.
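As a rough illustration of this definition, the sketch below computes power for one simple case: a one-sided, one-sample z-test with a known standard deviation. The function name `z_test_power` and its parameters are assumptions made for this example, not something taken from the article; other tests need other formulas.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test with known standard deviation.

    effect_size: true mean shift in standard-deviation units (hypothetical name).
    n: sample size.
    alpha: significance level of the test.
    """
    std_normal = NormalDist()
    # Critical value: the test rejects H0 when the z statistic exceeds it.
    critical_z = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred at effect_size * sqrt(n).
    shift = effect_size * n ** 0.5
    # Power is the probability that the statistic lands in the rejection region.
    return 1 - std_normal.cdf(critical_z - shift)


print(round(z_test_power(0.5, 25), 2))
```

Under these assumptions, a true effect of half a standard deviation with 25 observations gives a power of roughly 0.80, and with no true effect the function returns the significance level itself.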
Net Promoter Score (NPS)
Customer satisfaction and loyalty surveys are now an integral part of a business focused on growth and building competitive advantage. The NPS (Net Promoter Score) index is now the standard in this area.
General linear models and generalised linear models - differences and similarities
In data analysis, the use of general linear models is common due to their simplicity and ease of interpretation of the results obtained. However, there are times when the analyst encounters situations where the assumptions of classical linear models are difficult or impossible to meet. This may be …
Outlier cases. Identification and significance in data analysis
In data analysis, it is important to identify unusual observations that are significantly different from the others. Such values, called outliers or outlier cases, can affect the results of statistical analysis and lead to erroneous conclusions. In this material we will look at what outliers are, t…
The three sigma rule
The three sigma rule is an important tool in statistics and quality management. In the context of data analysis, it allows the identification of outlier points that are significantly different from the rest of the data. The use of the three-sigma rule in quality control also allows anomalies to be …
Segmentation: from grouping to classification
Segmentation is a key process in data analysis, dividing a data set into relatively homogeneous groups based on specific criteria. The purpose of segmentation is to identify hidden patterns, differences and similarities between objects in a dataset, enabling more precise and relevant analyses. Two …
Data sourcing: pros and cons of desk research
What is data? What is data needed for? Can data be divided according to specific criteria? Does data only come from surveys?
Tables for multiple questions measured using the same scale
Surveys often use questions where respondents assess different elements on the same scale.
Pearson's chi-square independence test
The chi-square test of independence is one of the most common statistical tests. It is used to test whether there is a statistically significant relationship between two qualitative variables.
Pearson's chi-square correlation test
Popular statistical tests include Pearson's chi-square tests. It is worth noting at the outset that this test has more than one application. In this material, I will discuss the main differences between the tests and introduce the most important issues related to the chi-square test.
Parametric versus non-parametric tests. Which test to choose for analysis?
Statistical analysis is an integral part of scientific research and working with data. In order to draw valid conclusions, the use of appropriate statistical tests is essential. The analyst is often faced with the choice of which test to choose in a given situation. This is important because the wr…
Student's t-tests
Student's t-tests are a group of tests used to compare two groups of results with each other in terms of their arithmetic means.
Series plot
The series plot is a type of line or layered plot most often used to represent changes over time. The plot may represent size for a primary variable category or statistics for a selected quantitative variable. Individual points of the plot representing data are connected with a line from the first v…
Hierarchical graph
Hierarchical data can be presented using various visualisations. Today’s article will focus on the hierarchical graph.
What do waterfalls have to do with data visualization?
A waterfall graph, otherwise known as a cascade chart, may mean very little to those of us from outside the financial services industry.
Column chart, bar chart and histogram
Column and bar charts have long been some of the most popular ways of visualising data. Before deciding to use any of them, it is worth taking a closer look at them.
Scatter plot
A scatter plot (also known as a dot plot or scatterplot) is a graph with two perpendicular axes on which two variables are presented.
Violin plot
Before starting a more complex data analysis, it is worth taking a closer look at variable distributions which are of interest to us.