A Recommender System

A recommender system for personalized tourist destinations.

An intelligent tourist information system that recommends destinations based on users’ points of interest, designed to help travelers find attractions, historical sites, and other interesting places.

Challenge

Quite often, when we travel, the main puzzle in planning a journey is the question: “What is there to see in the city, and how can we have an exciting time there?” It is impossible to read every travel guide, and doing so may be useless anyway, because you can’t be sure the listed destinations suit you. For this reason, we wondered how to apply our artificial intelligence capabilities in tourism to develop a system that recommends the best places to visit, tailored to a particular person.

Hypothesis

Our proof-of-concept solution gives recommendations to travelers based on their points of interest (POI), either by suggesting places similar to a selected one or by suggesting destinations visited by similar users. The metric-based algorithms work with the Flickr User-POI Visits Dataset, which comprises a set of users and their visits to various points in eight cities, and build recommendations using several data features: the visited city, a unique user number, a unique number of the point of interest, and the attraction category.

The research and development process was based on our DIET approach to Computer Vision project development and its Discovery, Ideation, Experiment, and Transformation stages. This methodology provides our clients and us with measurable and actionable results at every project step.

Hypothesis research

Discovery

First, we identified a pain point for most tourists: they need an AI system that recommends interesting destinations based on their preferences. We explored two approaches to meet this challenge:

  1. Content filtering. The algorithm recommends places similar to those selected: for example, if you choose some museums and temples as desired destinations, you will likely be advised to visit historical heritage locations.
  2. User filtering. The algorithm recommends places that similar users have visited. For example, suppose you’ve been to two or more locations, and the algorithm finds several users who followed the same route and visited other destinations as well. In that case, you get advice based on those users’ experiences.
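
The user-filtering idea can be illustrated with a minimal sketch. It assumes Jaccard similarity over sets of visited POI ids; the text does not name the similarity metric that was actually chosen, and the function names, the `min_similarity` threshold, and the sample data are hypothetical.

```python
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets of visited POI ids (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target_visits: set, other_users: dict,
              min_similarity: float = 0.3, top_n: int = 3) -> list:
    """Suggest POIs that sufficiently similar users visited and the target did not."""
    scores = Counter()
    for visits in other_users.values():
        sim = jaccard(target_visits, visits)
        if sim >= min_similarity:
            # Each unseen POI is weighted by the similarity of the user who visited it.
            for poi in visits - target_visits:
                scores[poi] += sim
    # Sort by descending score, then by POI id so ties are resolved deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [poi for poi, _ in ranked[:top_n]]


# Hypothetical visit history: user id -> set of visited POI ids.
visits_by_user = {
    "u1": {1, 2, 3},
    "u2": {1, 2, 4},
    "u3": {7, 8},
}
result = recommend({1, 2}, visits_by_user)
print(result)
```

In this toy data, users `u1` and `u2` share two places with the target, so their extra places are suggested, while `u3` falls below the threshold and contributes nothing.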

For further work, we chose the second approach, since recommendations based on user filtering could be more diverse.

We also identified the conditions the system had to handle in order to perform accurately. The first is the absence of explicit user ratings: the data records only the fact of a visit to a particular place. The second is the cold start, when there is no information about the user yet. In addition, we had to deal with two opposite situations: low similarity with other users, which yields few suggestions, and a large number of similar users, which yields too many suggestions and hurts the system’s accuracy.

Ideation

At the Ideation stage, we searched for a suitable dataset to train a recommendation algorithm. We looked for ready-made travel and tourism datasets with user ratings of cultural and historical attractions, since ready-made data lowers the cost of hypothesis testing, but we found no such data in the public domain. Therefore, we decided to use the Flickr User-POI Visits Dataset, which records user visits to attractions in 8 popular tourist destinations: Budapest, Vienna, Delhi, Edinburgh, Glasgow, Prague, Osaka, and Perth.

As a result, we performed an initial data analysis and obtained a dataset comprising a set of users and their visits to various points of interest in eight cities. The users’ visits were determined from geo-tagged YFCC100M Flickr photos matched to specific point-of-interest locations and categories, such as museums, attractions, etc.

It is worth noting that exploratory analysis of the data showed that most users visited sights in only one city. We therefore split each user whose views covered several cities into several users, each with its own unique ID and views from a single city.
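
The per-city split can be sketched with pandas. This is only an illustration under assumed column names: `cityUserId` is a hypothetical new identifier built from the original user id and the city, and the sample rows are invented.

```python
import pandas as pd

# Hypothetical visits: user "a" has views in two cities, user "b" in one.
visits = pd.DataFrame({
    "userId": ["a", "a", "a", "b"],
    "poiId": [1, 2, 10, 3],
    "city": ["Vienna", "Vienna", "Prague", "Vienna"],
})

# One new identifier per (original user, city) pair, so that every
# resulting "user" has views from exactly one city.
visits["cityUserId"] = visits["userId"] + "_" + visits["city"]

# Sanity check: number of distinct cities per new identifier.
cities_per_user = visits.groupby("cityUserId")["city"].nunique()
print(cities_per_user)
```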

Experiment

As the next step, we defined the data features for building recommendations:

  • userId: unique user number;
  • poiId: unique number of the point of interest;
  • poiTheme: attraction category, one of “Architectural”, “Cultural”, “Museum”, “Entertainment”, “Historical”, “Parks”, “Religious”;
  • city: the visited city.

Before building the user-attraction interaction matrix, we removed all duplicate user entries. By duplicate entries, we mean a user visiting the same place several times: in this implementation of the recommender system, we don’t have ratings of attractions, only the fact of a visit. In addition, all users with fewer than three views were removed from the data sample, because it is rather difficult to make recommendations based on two views or fewer.
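
A minimal pandas sketch of this preprocessing is shown below. The sample data is invented, and building the matrix with `pd.crosstab` is an assumption made for illustration, not necessarily how the prototype did it.

```python
import pandas as pd

# Hypothetical visit log: user "a" visited POI 1 twice, user "b" has only two views.
visits = pd.DataFrame({
    "userId": ["a", "a", "a", "a", "b", "b", "c", "c", "c"],
    "poiId": [1, 1, 2, 3, 1, 2, 2, 3, 4],
})

# Keep one row per (user, POI): repeated visits carry no rating information.
unique_visits = visits.drop_duplicates(subset=["userId", "poiId"])

# Drop users with fewer than three distinct views.
views_per_user = unique_visits.groupby("userId")["poiId"].transform("nunique")
kept = unique_visits[views_per_user >= 3]

# Binary user-attraction interaction matrix: 1 = visited, 0 = not visited.
matrix = pd.crosstab(kept["userId"], kept["poiId"]).clip(upper=1)
print(matrix)
```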

To check the accuracy of the system prototype, we divided the entire dataset into train and test sets. For the test set, we took a third of all users with 3 to 9 views and randomly selected a third of each such user’s views. The remaining two thirds of their views stayed in the train set.
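
One way to read this split is sketched below in plain Python. The function name, the rounding rule, the fixed seed, and the toy data are all assumptions made for illustration.

```python
import random


def split_views(views_by_user, user_fraction=1 / 3, view_fraction=1 / 3, seed=0):
    """Hold out a share of views for a share of the users with 3 to 9 views."""
    rng = random.Random(seed)
    eligible = sorted(u for u, v in views_by_user.items() if 3 <= len(v) <= 9)
    test_users = rng.sample(eligible, round(len(eligible) * user_fraction))

    # Copy the views so the input mapping is left untouched.
    train = {u: set(v) for u, v in views_by_user.items()}
    test = {}
    for user in test_users:
        views = sorted(train[user])
        n_held_out = max(1, round(len(views) * view_fraction))
        held_out = set(rng.sample(views, n_held_out))
        test[user] = held_out
        train[user] -= held_out  # the remaining views stay in the train set
    return train, test


# Hypothetical data: nine users with 3 to 6 views each.
views_by_user = {f"u{i}": set(range(i, i + 3 + i % 4)) for i in range(9)}
train, test = split_views(views_by_user)
print(len(test), "test users")
```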

We made recommendations for each user based on the train dataset. Then, we checked their accuracy against the test dataset using precision, recall, and F1-score, computed from True Positive, False Positive, and False Negative counts. A False Positive is a recommended place that the test user did not visit, and a False Negative is a place the test user visited that was not recommended.
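
For a single test user, these metrics can be computed as in the following sketch. The function name and the example sets are hypothetical; the formulas are the standard definitions of precision, recall, and F1.

```python
def precision_recall_f1(recommended: set, held_out: set):
    """Compare recommended POIs with the user's held-out (test) visits."""
    tp = len(recommended & held_out)   # recommended and actually visited
    fp = len(recommended - held_out)   # recommended but not visited
    fn = len(held_out - recommended)   # visited but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical example: four recommendations, three held-out visits, two hits.
p, r, f1 = precision_recall_f1({1, 2, 3, 4}, {2, 4, 5})
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```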

The results of the system testing allowed us to determine the best parameters for data preprocessing, similarity thresholds for generating recommendations, and a similarity metric.

We also created a UI where the user can select several attractions and, based on them, receive recommendations for tourist destinations. A recommender system can be used successfully not only in AI apps for tourism but also in Retail, Marketing, Media, and many other services. It is ready for the Transformation stage and implementation in the client’s infrastructure to help improve user experience, audience engagement and loyalty, and business performance.

Technologies used:

pandas