This is an extra solution to the Melon Playlist Continuation Challenge by the *** Team.
It was inspired by the following two papers: A hybrid two-stage recommender system for automatic playlist continuation, which won 3rd place in the RecSys Challenge ’18; and Relational Learning via Collective Matrix Factorization.
As stated in the Challenge README, the dataset in data.tar.gz contains 150K playlists that have been created by Melon users.
To untar the dataset:

    tar -xvzf data.tar.gz

data/train.json contains all the data, whereas data/val.json and data/test.json are intended for submission, so they include only some of each playlist's songs and tags.
For this repository, we just consider data/val.json and data/test.json as additional information.
- Phase 1: Extract candidates using CMF Recommendation (song+tag matrix)
- Phase 2: Re-rank candidates using Learning-To-Rank Boosting
For local evaluation, we create a new evaluation dataset. The part2 and part3 splits serve as the training and validation datasets for boosting, respectively.
These are divided into question (_q) and answer (_a) parts.
In Phase 1, we train on part1+part2_q+part3_q+evaluation_q and optionally include val.json+test.json as additional information.
In Phase 2, we use part2_q and part3_q as inputs and use part2_a and part3_a as labels, respectively.
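The question/answer split can be pictured with a small sketch. It assumes each playlist is a dict with an `id` and `songs` and `tags` lists, and it hides a fixed fraction of each list. The helper name `split_playlist`, the `hide_ratio` parameter, and the masking rule are hypothetical and are not taken from preprocess.py.

```python
import random


def split_playlist(playlist, hide_ratio=0.5, seed=0):
    """Split one playlist into a question (visible) and an answer (hidden) part.

    Hypothetical helper: the real masking rule in preprocess.py may differ.
    """
    rng = random.Random(seed)
    question = {"id": playlist["id"]}
    answer = {"id": playlist["id"]}
    for key in ("songs", "tags"):
        items = list(playlist[key])
        rng.shuffle(items)
        n_hidden = int(len(items) * hide_ratio)
        hidden = set(items[:n_hidden])
        # Preserve the original ordering inside each part.
        question[key] = [x for x in playlist[key] if x not in hidden]
        answer[key] = [x for x in playlist[key] if x in hidden]
    return question, answer


playlist = {"id": 1, "songs": [10, 11, 12, 13], "tags": ["rock", "drive"]}
q, a = split_playlist(playlist)
print(q)
print(a)
```

In this picture, the question part plays the role of a `_q` file (model input) and the answer part plays the role of the matching `_a` file (labels).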
Please refer to A hybrid two-stage recommender system for automatic playlist continuation for detailed partitioning.
    python3 preprocess.py run ./data/train.json

After running the above command, the preprocessed directory looks as follows.
    preprocessed
    ├── inputs
    │   ├── part1.json
    │   ├── part2_q.json
    │   ├── part3_q.json
    │   └── evaluation_q.json
    └── labels
        ├── part2_a.json
        ├── part3_a.json
        └── evaluation_a.json
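Before moving on, it can help to confirm that preprocessing produced every file listed above. The following is a minimal sketch that only checks for the file names shown in the tree. The `missing_files` helper is hypothetical and not part of this repository.

```python
import os

# File names taken from the directory listing above.
EXPECTED = {
    "inputs": ["part1.json", "part2_q.json", "part3_q.json", "evaluation_q.json"],
    "labels": ["part2_a.json", "part3_a.json", "evaluation_a.json"],
}


def missing_files(root):
    """Return the expected paths under `root` that do not exist as files."""
    missing = []
    for subdir, names in EXPECTED.items():
        for name in names:
            path = os.path.join(root, subdir, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    # An empty list means the layout is complete.
    print(missing_files("./preprocessed"))
```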
    python3 run.py --dir ./preprocessed --additional ./data/val.json ./data/test.json

The --additional flag is optional:

    python3 run.py --dir ./preprocessed

    python3 evaluate.py --result ./result.json --answer ./preprocessed/labels/evaluation_a.json

Music nDCG: 0.250488
Tag nDCG: 0.413651
Final Score: 0.274963
Final Score = Music nDCG * 0.85 + Tag nDCG * 0.15
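As a rough illustration of how these numbers relate, the sketch below implements a common binary-relevance nDCG and the weighted sum from the formula above. The `ndcg` function is an assumption about the general form of the metric and is not necessarily what evaluate.py computes. Only the 0.85/0.15 weighting comes from this README.

```python
import math


def ndcg(ground_truth, recommended):
    """Binary-relevance nDCG (assumed form; evaluate.py may differ)."""
    relevant = set(ground_truth)
    # Each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended)
        if item in relevant
    )
    # Ideal ordering places all attainable hits at the top of the list.
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def final_score(music_ndcg, tag_ndcg):
    """Weighted combination stated in the README."""
    return music_ndcg * 0.85 + tag_ndcg * 0.15


print(round(final_score(0.250488, 0.413651), 6))
```

Plugging in the reported nDCG values gives roughly 0.27496, which agrees with the reported Final Score up to the last digit, presumably because the displayed nDCG values are themselves rounded.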
We tested this implementation using Python 3.6.9 with an Intel Core i7-9700 CPU and 32GB RAM.