Sequential rule induction

Sequential rule induction is an advanced data mining technique that enables the discovery of patterns occurring in sequences of events. 

Unlike traditional association rule mining, such as basket analysis, which looks at the co-occurrence of products within a single basket, sequential rule induction focuses on the order in which products or events occur over time. This method is particularly useful for understanding customer behaviour, predicting future events and creating effective marketing strategies.

What is sequential rule induction?

Sequential rule induction involves identifying regularities in sequential data, where it is important not only what elements occur, but also in what order. 

Sequential rules take the form of ‘if A happens and then B happens, then with high probability C will happen’. For example, an analysis of purchasing behaviour may reveal that customers who first buy a camera and then a lens will often buy a tripod in the next step. Order matters here, because the rule works in one direction only. Someone who has just bought a camera may well go on to buy several different lenses, or new film for a popular instant camera, but a customer who is buying lenses or film most likely owns a camera already. Offering such customers a new camera on the basis of these purchases is therefore unlikely to achieve the expected sales increase.

With food products, on the other hand, a purchase does not rule out a repeat purchase of the same item. Having information on the sequence of a customer's purchases (e.g. through the use of a loyalty card) allows us to better understand the pattern of choices and to personalise offers for subsequent transactions.

Sequential rules and their characteristics

Sequential rules are an extension of traditional association rules, taking into account not only the co-occurrence of elements in a set, but also their order of occurrence over time. Each rule consists of two parts: one or more predecessors and a successor.

Predecessors are sequences of events or products that occur prior to the occurrence of other elements in the rule. A successor is an element that appears after certain conditions have been met, i.e. after the predecessors in the rule have occurred.
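
The predecessor and successor structure can be illustrated with a minimal Python sketch. The names `SequentialRule` and `occurs_in_order` are hypothetical and chosen for this example only. The sketch also assumes that other events may occur between the elements of a rule, which a given tool may or may not allow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequentialRule:
    """Hypothetical container for one rule (illustration only)."""
    predecessors: tuple  # events that must occur first, in this order
    successor: str       # event expected to follow the predecessors


def occurs_in_order(history, pattern):
    """Return True if the items of `pattern` appear in `history` in the
    same order. Other events are allowed in between (an assumption)."""
    events = iter(history)
    # `item in events` advances the iterator until the item is found,
    # so each later item is searched for only after the earlier one.
    return all(item in events for item in pattern)


rule = SequentialRule(predecessors=("camera", "lens"), successor="tripod")

# Camera first, lens later: the predecessor part of the rule is met.
predecessor_met = occurs_in_order(["camera", "bag", "lens"], rule.predecessors)

# Lens first, camera later: same products, wrong order, so not met.
reversed_met = occurs_in_order(["lens", "camera"], rule.predecessors)
```

The second check shows the difference from basket analysis: both histories contain the same products, but only the first one satisfies the rule.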

As with traditional association rules, sequential rules are evaluated using several key indicators:

  • Coverage – measures how often the predecessor of a rule occurs in the dataset, expressed as a percentage of all analysed cases.
  • Confidence – measures the probability of a successor occurring after the predecessor has occurred. 
  • Growth – shows how many times more likely the successor is to occur after the predecessor than it is to occur in general. A growth value greater than 1 indicates a positive correlation between the predecessor and the successor. This measure is commonly known as lift.
  • Implementability – indicates what proportion of cases meet the predecessor but do not meet the successor. This is important for identifying cross-selling opportunities, because these are the cases in which the rule can still be acted upon.
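
As a rough illustration, the four indicators can be computed by hand on a toy dataset. The sketch below is one possible reading of the definitions above, not the formulas of any particular tool. It assumes one purchase sequence per customer, takes coverage as the share of sequences containing the predecessor, and uses made-up data. The function names are hypothetical.

```python
def occurs_in_order(history, pattern):
    """True if the items of `pattern` appear in `history` in that order."""
    events = iter(history)
    return all(item in events for item in pattern)


def rule_metrics(sequences, predecessors, successor):
    """Compute the four indicators for one rule (illustrative definitions)."""
    total = len(sequences)
    if total == 0:
        raise ValueError("at least one sequence is required")

    # Sequences in which the predecessor events occur in the right order.
    with_predecessor = [s for s in sequences if occurs_in_order(s, predecessors)]
    # Of those, sequences in which the successor follows the predecessor.
    full_pattern = list(predecessors) + [successor]
    with_rule = [s for s in with_predecessor if occurs_in_order(s, full_pattern)]
    # Sequences containing the successor anywhere (baseline for growth).
    with_successor = [s for s in sequences if successor in s]

    coverage = len(with_predecessor) / total
    confidence = len(with_rule) / len(with_predecessor) if with_predecessor else 0.0
    baseline = len(with_successor) / total
    growth = confidence / baseline if baseline else 0.0
    implementability = (len(with_predecessor) - len(with_rule)) / total
    return {
        "coverage": coverage,
        "confidence": confidence,
        "growth": growth,
        "implementability": implementability,
    }


# Made-up purchase histories, one list per customer.
sequences = [
    ["camera", "lens", "tripod"],
    ["camera", "lens"],
    ["camera", "lens", "bag", "tripod"],
    ["tripod", "camera"],
    ["lens", "camera"],
    ["film"],
    ["camera", "lens", "film"],
    ["bag"],
]

metrics = rule_metrics(sequences, ["camera", "lens"], "tripod")
```

In this toy data, four of the eight customers meet the predecessor and two of those four go on to buy a tripod. The remaining two are the cases counted by implementability.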

Applications of sequential rule induction

Sequential rule induction is widely used in various fields of business and science, enabling a deeper understanding of the dynamics of processes over time. 

In retail and e-commerce, it enables the analysis of customers' purchase paths, which translates into personalised offers and recommendations. For example, if customers frequently visit computer accessories pages after buying a laptop, the shop can offer them add-ons at the right moment, increasing the chances of additional sales.

In the area of marketing, sequence rule induction helps optimise promotional campaigns by identifying effective sequences of marketing activities. This allows personalised content to be delivered to customers at the most effective times for them. For example, sending a newsletter with information about a promotion and then displaying an ad on social media can significantly increase audience engagement.

The financial sector uses this method to detect fraud and abuse by analysing unusual sequences of transactions. By monitoring the sequence of financial transactions, it is possible to quickly identify suspicious activities and take appropriate preventive measures. 

In logistics and supply chain management, the induction of sequential rules enables operational processes to be streamlined. Analysing the sequence of events, such as raw material deliveries or production steps, allows bottlenecks to be identified and schedules to be optimised. Companies can thus increase efficiency, reduce costs and improve on-time delivery.

In the medical field, too, this method is finding applications, supporting the analysis of the course of diseases and the effectiveness of therapies. Studying the sequence of symptoms and responses to treatment helps doctors to make more accurate diagnostic and therapeutic decisions. 

Sequence rule induction is also used in telecommunications to create personalised service offers based on the order in which customers use various functions. Operators can thus better tailor their offers to individual users' needs, increasing user satisfaction and loyalty.

Challenges of sequential rule induction

As with basket analysis, effective induction of sequential rules requires a suitable tool. The Sequences node in PS CLEMENTINE PRO is one such tool. It is based on the CARMA association rule algorithm, extended to take the order of events into account.

To apply sequential rules to your data, you first need to define the order of events. A time variable can play this role; in transactional data it is usually easy to obtain, e.g. the time a product was purchased or a transfer was made. If no such timestamp is available, the algorithm will take the order of the observations in the dataset as the order in which they occurred. In that case, special care should be taken to ensure that the dataset is ordered correctly, as simply changing the way the records are sorted will render the results useless for sequence analysis.

The dataset also needs an identifying variable so that subsequent events can be linked to a given entity, e.g. subsequent purchases made by a given customer.
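
The two requirements, a time variable and an identifying variable, can be sketched as a small data preparation step. This is an illustrative example only. The record layout (customer ID, ISO date string, product) and the function name `build_sequences` are assumptions, and the data is made up.

```python
from collections import defaultdict


def build_sequences(transactions):
    """Turn (customer_id, timestamp, product) records into one ordered
    list of products per customer."""
    sequences = defaultdict(list)
    # Sort by customer and then by time, so the input order of the
    # records no longer affects the result. ISO dates (YYYY-MM-DD)
    # sort correctly as plain strings.
    for customer_id, _timestamp, product in sorted(
        transactions, key=lambda record: (record[0], record[1])
    ):
        sequences[customer_id].append(product)
    return dict(sequences)


# Made-up transactions, deliberately not in chronological order.
transactions = [
    ("C2", "2024-03-05", "lens"),
    ("C1", "2024-01-10", "camera"),
    ("C1", "2024-02-02", "lens"),
    ("C2", "2024-01-20", "camera"),
    ("C1", "2024-03-15", "tripod"),
]

sequences = build_sequences(transactions)
```

Without the timestamp, the only available order would be the order of the rows themselves, which is why re-sorting such a dataset changes the sequences and therefore the rules.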

This type of analysis belongs to data mining, so for its conclusions to be meaningful it has to be carried out on large datasets. This requires the right tools and infrastructure.

Sequential rule induction algorithms are often complex and need to be optimised for performance. The analysis can generate a very large number of rules, which poses challenges not only in terms of computation, but also in interpreting the results. The key is to select the most valuable rules, based on indicators such as coverage, confidence, growth or implementability.
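
Selecting the most valuable rules can be sketched as a simple filter and sort. The thresholds, the dictionary layout and the function name `select_rules` below are assumptions made for illustration, and the metric values are invented. Suitable cut-offs depend on the data and the business context.

```python
def select_rules(rules, min_coverage=0.05, min_confidence=0.5, min_growth=1.0):
    """Keep rules that pass all thresholds, strongest growth first."""
    kept = [
        rule for rule in rules
        if rule["coverage"] >= min_coverage
        and rule["confidence"] >= min_confidence
        and rule["growth"] > min_growth  # growth above 1: positive correlation
    ]
    return sorted(kept, key=lambda rule: rule["growth"], reverse=True)


# Invented example rules with invented indicator values.
rules = [
    {"rule": "camera, lens -> tripod", "coverage": 0.50, "confidence": 0.50, "growth": 1.33},
    {"rule": "laptop -> mouse", "coverage": 0.20, "confidence": 0.70, "growth": 2.10},
    {"rule": "bread -> milk", "coverage": 0.60, "confidence": 0.80, "growth": 0.95},
    {"rule": "phone, case -> charger", "coverage": 0.01, "confidence": 0.90, "growth": 3.00},
]

selected = select_rules(rules)
selected_names = [rule["rule"] for rule in selected]
```

Here the rule with growth below 1 and the rule with very low coverage are dropped, even though both have high confidence. This is why a single indicator is rarely enough to judge a rule.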

In order for the results of the analysis to translate into real implementations and results, it is important that the generated rules are interpreted in the appropriate business or scientific context. Combining the results of the analysis with expert knowledge helps to better understand and exploit the patterns discovered.

Summary

Sequential rule induction is a data analysis method used to discover relationships between events that occur in a specific temporal order. It allows the identification of patterns in which one event follows another with a certain probability. It is used in various fields such as marketing, customer behaviour analysis or product recommendations. These rules help to understand what sequences of events occur most frequently and can be used to predict future actions based on past behaviour. 

As technology develops and more data becomes available, the importance of sequence rule induction will increase. Companies that use this method effectively will gain a competitive advantage through a deeper understanding of their customers and the ability to react quickly to changing trends.
