E-commerce: a price monitoring solution using Big Data


The golden rule for an online retailer is to stay competitive on the prices of items and to manage their promotion. E-commerce players have to justify that the price drops advertised in their special offers or sales are real. The same applies to event sales specialists. To meet this need, Octopeek has developed a price watch solution based on a Big Data platform.

Online sales sites offer hundreds of thousands of items. Some are general e-commerce sites covering all types of categories: clothing, shoes, care products, furniture, small household appliances, food, etc. Others specialize in a smaller range of products. All of them publish new offers online every day. Thousands of products, and therefore prices, must be verified by the pricing and purchasing teams as part of promotional sales*. Additionally, they must do this in an extremely short period of time, sometimes less than a week before the launch of these sales. Distributors only have a few days to research and check several hundred prices on the sites of sellers and brands. These brand sites are the only ones to show the Recommended Retail Price (RRP) of the item when it comes onto the market.
How can a team find hundreds of prices within the time limit? How can it mobilize the necessary internal resources while preserving its margin? The workload is overwhelming and extremely time-consuming for distributors.

Construction of the Big Data platform and price watch software

To meet this need, Octopeek has developed price watch software on a Big Data platform. The Octopeek solution queries the e-shops of product suppliers and manufacturers to gather the price, SKU, availability and other pertinent details. The data is stored in our Big Data infrastructure, where it is sourced and structured so that the customer can exploit and query it as needed.
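As an illustration only, the kind of record described above could be modelled as follows. This is a minimal sketch, not Octopeek's actual schema: the class name PriceRecord, its field names and the helper discount_percent are all assumptions made for the example. It shows one collected observation serialized as a JSON line, and how an advertised reduction could be checked against the RRP.

```python
import json
from dataclasses import dataclass, asdict
from decimal import Decimal


@dataclass(frozen=True)
class PriceRecord:
    """One price observation for one item (hypothetical field names)."""
    sku: str
    price: Decimal
    currency: str
    available: bool
    source_url: str
    collected_at: str  # ISO 8601 timestamp, supplied by the collector

    def to_json_line(self) -> str:
        payload = asdict(self)
        # Serialize the price as a string so no precision is lost in JSON.
        payload["price"] = str(self.price)
        return json.dumps(payload, sort_keys=True)


def discount_percent(rrp: Decimal, sale_price: Decimal) -> Decimal:
    """Return the reduction of sale_price relative to rrp, in percent."""
    if rrp <= 0:
        raise ValueError("RRP must be positive")
    return ((rrp - sale_price) / rrp * 100).quantize(Decimal("0.01"))
```

Decimal is used rather than float so that a price such as 79.90 is stored and compared exactly, which matters when a reduction has to be justified to the cent.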

The architecture of the solution is divided into 3 parts:

  1. Data collection
  2. Data ingestion
  3. Data restitution

The dataflows are implemented with Apache NiFi. Python scripts take care of collecting the data and depositing it in directories monitored by MiNiFi (minimalist NiFi processes). As soon as a new file is dropped into the target directory, MiNiFi ingests it and sends it to the central NiFi cluster. The NiFi cluster implements the different dataflows that feed the databases used by the price watch application. The data is first fed into an Apache Hive database that allows both archiving (on HDFS) and one-off reporting.
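The hand-off between a collection script and the monitored directory can be sketched as below. This is a hedged example, not the scripts Octopeek runs: the function name drop_records, the JSON-lines file format and the file naming are assumptions. It also assumes the directory watcher is configured to ignore hidden (dot-prefixed) files, so the file is written under a temporary hidden name and then renamed, which prevents a half-written file from being picked up.

```python
import json
import os
import tempfile
from pathlib import Path


def drop_records(records, watched_dir, filename):
    """Write records as JSON lines, then move the file into place atomically."""
    watched = Path(watched_dir)
    watched.mkdir(parents=True, exist_ok=True)
    # The temporary file lives in the same directory so that the final
    # rename stays on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=watched, prefix=".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            for record in records:
                handle.write(json.dumps(record) + "\n")
        final_path = watched / filename
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

A collector would call it once per batch, for example drop_records(batch, "/data/incoming", "prices-0001.jsonl"), where both the path and the file name are placeholders.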

Using ElasticSearch as a business database

Another NiFi workflow is responsible for creating the ElasticSearch indexes that serve the web application via an API. Using ElasticSearch as a business database makes it possible to take advantage of its query speed (low latency) and of its underlying search engine (Lucene). The database thus comes with search features such as auto-completion, automatic correction and management of synonyms.
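To make the search features more concrete, here is a minimal sketch of a query body that such an API could send to ElasticSearch. The function name build_product_search and the field name "name" are assumptions, not part of the Octopeek application. It combines a match_phrase_prefix clause (completion of the text being typed) with a match clause using fuzziness (tolerance to typos). Synonyms are normally handled in the index analyzer rather than in the query, so they do not appear here.

```python
def build_product_search(text, size=10):
    """Build an ElasticSearch query body for a product search box."""
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    # Completes the last word being typed ("sneak" -> "sneakers").
                    {"match_phrase_prefix": {"name": {"query": text}}},
                    # Tolerates small spelling mistakes in the query text.
                    {"match": {"name": {"query": text, "fuzziness": "AUTO"}}},
                ],
                # At least one of the two clauses has to match.
                "minimum_should_match": 1,
            }
        },
    }
```

The function only builds a dictionary, so it can be passed to whichever ElasticSearch client or HTTP library the application uses.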

Finally, an important part of the solution implemented by Octopeek is the restitution of data through an intuitive and easy-to-use web application, validated by the customer's business teams. Octopeek's goal is for a simple tutorial to be enough for a user to become productive with the tool quickly.

*In 2017, 19 e-commerce companies were fined by the Direction Générale de la Concurrence, de la Consommation et de la Répression des Fraudes – DGCCRF (Directorate General for Competition, Consumption and the Prevention of Fraud) for a total amount of 2.4 million euros. 116,000 establishments and 11,000 websites were audited. This was a first in France, and it was accompanied by requests for documents and prerequisites. 15,000 reports were drawn up, and distributors are not the only ones concerned. Brands running their own promotions on their e-shop sites have also had to justify an effective price reduction compared to their initial RRP (Recommended Retail Price). The same goes for the airline sector, where advertised promotions on airline tickets were not consistent with the RRP.

Learn more about the Octopeek offers: https://octopeek.com/consulting-big-data-et-data-science/

Contact us about our price watch solution: contact@octopeek.com or +33 953 737 474