Early prediction of cart abandonement
This shared task is a simplified version of the Coveo Data Challenge organised at the eCommerce workshop co-located with the Special Interest Group for Information Retrieval (SIGIR) Conference in July 2021 in Montreal. Your goal will be to perform early prediction about whether a user will abandon their cart in the current shopping session after at least a product has been added to the cart. This is therefore a binary classification problem: a session can either feature in the abandon category or in the purchase category.
Your baseline model is a Naïve Bayes classifier of order 3, which thus predicts cart abandonement based on 3grams of symbolised clicks. No other piece of information should go into this model.
Early prediction should be evaluated at 5, 10, and 15 clicks after the first add-to-cart event in a session. You should ignore sessions where no product is added to the cart. Be careful when you test the model: if a session ends 12 clicks after the first add-to-cart event, it should not feature in the model evaluation at 15 clicks post add-to-cart.
You will receive a training set right away and use it to develop your pipeline, models and such. Then, one week before the deadline you will receive a test set. The notebook you submit should have run on the test set and provide results about it, not the training set. Hence, make sure to develop your solution and then apply everything to the right file.
You should carry out the following tasks:
-
pre-process the sessions (you can execute these tasks in the order you prefer, but make sure to specify which task is being solved in which block of code):
- sessionise (1pt)
- select sessions with at least one add-to-cart (1pt)
- add class labels: treat purchase as the positive class (1pt)
- cut purchase sessions to the last event before the first purchase (1pt)
- remove sessions shorter than 5 and longer than 155 clicks (1pt)
- symbolise actions (1pt)
-
implement an oracle model to get the upper bound on performance and compute the oracle's f1 at
- 5 clicks post add-to-cart (1 pt for correct f1 at 5 clicks)
- 10 clicks post add-to-cart (1 pt for correct f1 at 10 clicks)
- 15 clicks post add-to-cart (1 pt for correct f1 at 15 clicks)
-
implement the baseline model defined above (Naïve Bayes, order 3) and obtain the f1 score on the dev set at:
- 5 clicks post add-to-cart (2pts for correct f1 at 5 clicks)
- 10 clicks post add-to-cart (2pts for correct f1 at 10 clicks)
- 15 clicks post add-to-cart (2pts for correct f1 at 15 clicks)
-
implement another model of your choice (an SVM, a Markov Chain, a neural network, an anomaly detection algorithm if you feel more adventurous, or something else) and obtain the f1 score on the dev set at:
- 5 clicks post add-to-cart
- 10 clicks post add-to-cart
- 15 clicks post add-to-cart
You are not allowed to submit another Naïve Bayes Classifier where you only changed the ngram size to solve this task, but you can feed your model any information you can retrieve from the dataset, including dwell times, day-of-the-week, time of the day, and SKUs. Therefore, a NBC fed with more than just clicks is a valid model. However, you are not allowed to use additional data sources other than the training set: your notebook has to run considering only the information provided in the file you downloaded from the WeTransfer link. One week prior to the deadline I will share a test set: you'll have to run your code on the test set and the grading will be done on your output for the test set: make sure you run the code on the correct file!! (10 pts attributed to the model with the best f1 at 15 clicks post add-to-cart. Then, the difference in f1 between the baseline and the best model will be partitioned in deciles. Models falling in the top decile will get 9 pts, models falling in the second decile will get 8.5 pts and so on. Models falling in the lowest decile will get 4.5 pts while working models not outperforming the baseline will get 2 pts. If you don't show f1 scores at 5, 10, and 15 clicks post add-to-cart you won't get points, even if the grading only considers f1 at 15 clicks post add-to-cart.
- do error analysis on one of the models (5 pts: you can use any approach, you will get points based on the conclusions you draw from the error analysis, there isn't a right and wrong here, but sensible and not)
Make sure to indicate clearly which output matches a certain task, using the task ID (e.g., 3A for the baseline's f1 at 5 clicks post-add-to-cart). No points will be awarded if you fail to indicate which task a certain result, code-block or markdown cell is addressing.
You should submit a complete notebook which takes in the test set. If dependencies are required to run your code, make sure to install them at the beginning of the notebook. Make sure you comment the code explaining what happens (use doc-strings to define what functions do, inline comments for more detailed information inside functions, as well as markdown cells to highlight the flow). If you submit undocumented code, you will automatically get a 5.
If you submit a complete notebook, but I cannot re-run your code, you automatically get a 5, so make sure your code run from top to bottom after re-starting the kernel.
You need 17 points out of 30 to pass, meaning that 17/30 translates to a 5.5/10. In principle you can pass if you submit a running model which doesn't beat the baseline, but you have to do tasks 1-3 perfectly. We have seen how to pre-process the data, build an oracle model and how to implement a Naive Bayes Classifier during the practicals, so you have a blueprint for the first three tasks. If you stop here, however, you won't pass the assignment, so you have to engage with the implementation of a new model or feed new info to the baseline architecture. If you want to get a grade higher than 8 or edge your chances to pass, you have to engage with the error analysis.
IMPORTANT: submit a single model to beat the baseline. You might have to test more than one, but only submit the one you think works best!! If you include more than one experimental model in the notebook, I will only consider the first regardless of hwich performs best.