Real-time evaluation platform for recommending news articles

Octopeek funds many R & D topics in artificial intelligence, including the work of Julien Hay, PhD and Data Scientist, who specializes in artificial intelligence and natural language processing (NLP).

Julien was recently interviewed about a project he presented at NeurIPS:

The Renewal Project

Last December, you were in Montreal for the NeurIPS 2018 conference to present a project you are working on. Before telling us about your project, can you explain the purpose of this conference to the uninitiated?

The NeurIPS conference is THE big annual event for machine learning and neural networks. It covers artificial intelligence and computational neuroscience, and it is one of the largest artificial intelligence conferences in the world. This year, we met researchers from many companies, including Google, Netflix, Nvidia and Amazon.

What were your observations resulting from your research?

We identified gaps in the state of the art of challenge platforms, particularly in the available data (content of news articles and users’ reading history), but also in the evaluation bias introduced by the technical constraints of certain platforms such as NewsREEL. In 2018 we then developed a first prototype of this project.

You presented your project “Renewal”. Can you tell us more about its origin and its focus?

The project was presented at the CiML workshop (Machine Learning competitions “in the wild”: Playing in the real world or real time), which covers challenges and platforms for real-time interaction with the world and, in the case of the Renewal project, real-time interaction with users. The main aim of this workshop is to gather researchers around specific tasks and thus to promote research on common issues.

The Renewal platform is a Challenge Platform project dedicated to recommendation systems. It will allow different research teams to propose recommendation algorithms that will be evaluated and compared in real time.

The Renewal project stems from two things: the desire of the Octopeek R & D team to contribute to research by organizing challenges, and the lack of reliable evaluation frameworks in the recommendation system community. Challenges provide solid indicators of the performance of the algorithms and models proposed in a community. They also make it possible to compare those models and thus determine which one is the most efficient for which task. Specifying an evaluation framework such as the one proposed is a fully-fledged research project in its own right: it explores methods to debug evaluations and to design an architecture for algorithmic competition.

Can you describe the design and development stages of the application?

The first step was to develop the mobile app to send recommendations of news articles to users, in order to set up our experimental framework.

In parallel, we developed an architecture adapted to this type of comparison platform. The platform manages both the indexing of news articles on the web and the assignment of users to the research teams’ different recommendation algorithms. A first prototype was developed during 2018. The diagram below illustrates the overall Renewal architecture. The back-end platform (on the left) indexes articles on the web by drawing on different sources. All the information is then sent to the research teams’ algorithms (bottom right of the diagram), which send their recommendations to the users’ mobile devices (top right). Finally, through the mobile app, we observe all user interactions in order to evaluate each system involved.
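
As a rough illustration of the user-assignment step described above, the sketch below maps each user to one competing system in a stable way. It is only an assumption about how such an assignment could be done, not a description of the Renewal back end: the function name `assign_system`, the hash-based bucketing, and the system names are all hypothetical.

```python
import hashlib


def assign_system(user_id: str, systems: list[str]) -> str:
    """Map a user to one recommender system, deterministically.

    Hypothetical helper: hashing the user id gives a stable bucket, so the
    same user always sees recommendations from the same system.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an integer bucket index.
    bucket = int.from_bytes(digest[:8], "big") % len(systems)
    return systems[bucket]


# Hypothetical system names standing in for the research teams' algorithms.
SYSTEMS = ["system_a", "system_b"]
chosen = assign_system("user-42", SYSTEMS)
```

Hashing rather than drawing a random number on each request keeps a user's assignment fixed across sessions, which an A/B comparison generally needs.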

What does the mobile app consist of, and how will you evaluate the performance of the algorithms?

The application presents news articles recommended by different systems (see diagram on the left). If users choose to read articles from System A more often than articles from System B, we establish in real time that the A versus B comparison favors A (averaged, of course, over all users assigned to systems A and B). We therefore use the evaluation technique known as A/B testing.
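
The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's actual evaluation code: the event format `(user_id, system, clicked)`, the function name `compare_systems`, and the choice of click rate as the metric are all hypothetical.

```python
from collections import defaultdict


def compare_systems(events):
    """Return the mean per-user click rate for each system.

    `events` is an iterable of (user_id, system, clicked) tuples, a
    hypothetical log format: one tuple per recommended article shown.
    """
    # (user, system) -> [articles read, articles shown]
    per_user = defaultdict(lambda: [0, 0])
    for user_id, system, clicked in events:
        stats = per_user[(user_id, system)]
        stats[0] += int(clicked)
        stats[1] += 1

    # Average over users so that very active users do not dominate.
    rates = defaultdict(list)
    for (_user_id, system), (clicks, shown) in per_user.items():
        rates[system].append(clicks / shown)
    return {system: sum(r) / len(r) for system, r in rates.items()}


# Toy data, invented for illustration only.
events = [
    ("u1", "A", True), ("u1", "A", False), ("u2", "A", True),
    ("u3", "B", False), ("u3", "B", True), ("u4", "B", False),
]
scores = compare_systems(events)
winner = max(scores, key=scores.get)
```

With this toy data, System A has the higher average rate. A real campaign would also need a significance test before declaring a winner, which is left out of this sketch.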

Thank you Julien. That makes it very clear. What are the next steps?

To start with, we will finalize the development of the platform by implementing the real-time evaluation part, as well as the communication between the mobile app and our back-end architecture. Then we plan to organize evaluation campaigns as well as workshops about the platform and the task of recommending news articles. The development of the project resumed in September 2019 with funding from the LRI.