Business intelligence in e-commerce

Smart product matching for price comparison


An international pure-play online retailer wants to make its competitive intelligence process more productive. The process runs in two stages: first, an external service provider monitors competitor sites and delivers the results as an Excel file; then the pricing and purchasing teams match products manually, on a sample basis, by cross-referencing the product catalogue with the competitor files.

This amounts to some 7,000 new product references to process every day.

The work is unrewarding and time-consuming, and the cost of the external service provider eats into the margin.

This customer has three objectives:

  • keep a clear view of its market positioning
  • give buyers instant access to the information they need to evaluate and renegotiate supplier offers
  • automate the process to gain accuracy, save time, and improve margins



Data from competitor sites is collected and processed on our Big Data as a Service (BDaaS) platform, in compliance with applicable legislation. Data from the customer's catalogue is stored on the same platform.

Before any matching takes place, the data goes through a data preparation phase. We extract the words (tokenization), strip punctuation, and remove stop words, that is, all the elements that add no semantic value to the text. Colors, weights and volumes are normalized as well.
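To make the preparation steps concrete, here is a minimal sketch of what such a pipeline could look like. It is an illustration, not our production code: the stop-word list, the color map, the unit table and the function names (`prepare`, `normalize_quantities`) are all hypothetical and would be tuned to the catalogue's languages and product categories.

```python
import re

# Hypothetical reference data, for illustration only.
STOP_WORDS = {"the", "a", "an", "with", "for", "and", "of", "in"}
COLOR_MAP = {"grey": "gray", "anthracite": "gray", "navy": "blue"}
# Conversion to a base unit: grams for weight, millilitres for volume.
UNIT_FACTORS = {
    "kg": ("g", 1000), "g": ("g", 1),
    "l": ("ml", 1000), "cl": ("ml", 10), "ml": ("ml", 1),
}

# A number (decimal point or comma), optional space, then a known unit.
QUANTITY = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g|ml|cl|l)\b")


def normalize_quantities(text):
    """Rewrite quantities such as '1.5 l' or '1,5l' as '1500ml'."""
    def to_base_unit(match):
        value = float(match.group(1).replace(",", "."))
        unit, factor = UNIT_FACTORS[match.group(2)]
        return f"{round(value * factor)}{unit}"
    return QUANTITY.sub(to_base_unit, text)


def prepare(title):
    """Lowercase, normalize quantities, tokenize, drop stop words, map colors."""
    text = normalize_quantities(title.lower())
    tokens = re.findall(r"[a-z0-9]+", text)  # tokenization also strips punctuation
    return [COLOR_MAP.get(t, t) for t in tokens if t not in STOP_WORDS]


print(prepare("Coffee Maker, 1.5 L - Grey (with filter)"))
```

With this sketch, two titles that describe the same product with different punctuation, units or color spellings end up with largely the same tokens, which is what the matching step relies on.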

The data is then submitted to our NLP-based model, which relies on TF-IDF (Term Frequency-Inverse Document Frequency), a term-weighting scheme that makes it possible to measure the similarity between two texts. The model accounts for problems commonly encountered in product matching, such as site-specific product references, the same information appearing under different labels, differing writing conventions, and spelling mistakes.

For each product in the customer's catalogue, we look for the competitor's product that matches it most closely.
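The similarity scoring and best-match selection can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the model itself: titles are assumed to be already tokenized, the IDF weights are fitted on the catalogue title together with the candidate titles, a smoothed IDF formula is used, and the helper names (`tfidf_vectors`, `cosine`, `best_match`) are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption): rare terms weigh more, common terms less.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (count / len(doc)) * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(catalogue_tokens, competitor_docs):
    """Return (index, score) of the competitor title closest to the catalogue title."""
    vectors = tfidf_vectors([catalogue_tokens] + competitor_docs)
    query, candidates = vectors[0], vectors[1:]
    scores = [cosine(query, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


catalogue = "acme coffee maker 1500ml gray".split()
competitors = [
    "acme toaster 2 slices gray".split(),
    "coffee maker acme 1500ml grey".split(),
    "garden hose 20m".split(),
]
index, score = best_match(catalogue, competitors)
print(index, round(score, 3))
```

In this toy example the second competitor title is selected even though its word order and color spelling differ, because it shares the most distinctive terms with the catalogue title.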

We then compute a confidence score, normalized with respect to the business context and the customer's data. This makes it possible to strike a good balance between false positives and false negatives.
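One way to picture the trade-off between false positives and false negatives is to choose the acceptance threshold on a small set of hand-labelled pairs. The sketch below is only an illustration of that idea, assuming such a validation set exists; the F-beta criterion, the function names and the sample scores are assumptions made for the example, not a description of how our score is actually normalized.

```python
def confusion_at(threshold, scored_pairs):
    """Count true positives, false positives and false negatives at a threshold.

    scored_pairs: list of (similarity_score, is_true_match) tuples.
    """
    tp = sum(1 for s, y in scored_pairs if s >= threshold and y)
    fp = sum(1 for s, y in scored_pairs if s >= threshold and not y)
    fn = sum(1 for s, y in scored_pairs if s < threshold and y)
    return tp, fp, fn


def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score: beta < 1 penalizes false positives more, beta > 1 false negatives."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def pick_threshold(scored_pairs, beta=1.0):
    """Return the observed score that maximizes F-beta when used as the cut-off."""
    candidates = sorted({s for s, _ in scored_pairs})
    return max(candidates, key=lambda t: f_beta(*confusion_at(t, scored_pairs), beta=beta))


# Made-up validation data: (similarity score, did a human confirm the match?)
validation = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.65, True), (0.40, False), (0.20, False),
]
print(pick_threshold(validation))            # balanced trade-off
print(pick_threshold(validation, beta=0.5))  # stricter: fewer false positives
```

The `beta` parameter is where the business side weighs in: a pricing team that cannot tolerate wrong matches would favor a lower `beta` and a higher threshold, at the cost of missing some true matches.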

AI does not do everything. The strength of our approach is that it does not leave every decision to the AI. Business users must be able to decide how much freedom the AI is given. Hence the importance of close collaboration between AI experts and business experts to maximize matching quality.


A user interface lets managers explore the enriched data, visualize the results, and share them with one another.

Automation has delivered significant time savings: any volume of references can now be processed in almost one click, with results at least as good as manual work or even better (5 to 10% more), and KPIs can be visualized instantly.

Teams that used to spend their days cross-referencing data can now be reassigned to more rewarding tasks.

Why Octopeek?

Retail and e-commerce companies need to move from ad hoc experimentation with AI to enabling everyone in the organization to use it, both to improve operational efficiency and to deliver business value through true business apps.

Octopeek aims to become the platform that democratizes AI in the enterprise: it enables the dynamic creation and provisioning of full-fledged enterprise AI apps, customizable by business analysts.

