DEV Community

Anne Lim

Appropriate Algorithm for Influencers Ranking

Hello everyone,

I have just started working on my first project on ranking influencers and I have a few doubts about it.

DESCRIPTION

The objective of this project is to rank social influencers by their influential power, based on a set of metrics. The idea is to calculate a score for each influencer and rank them by that score.

Metrics collected:

  • username
  • categories (the niche the influencer is in)
  • influencer_type
  • followers
  • follower grow, follower_growth_rate
  • highlightReelCount, igtvVideoCount, postsCount
  • avg_likes, avg_comments
  • likes_comments_ratio (comments per 100 likes, used as an authenticity indicator)
  • engagement_rate
  • authentic_engagement (the number of likes and comments that come from real people)
  • post_per_week
  • etc

Here is what the collected data looks like:

Sample_data

And here are the expected results:

Expected_results

I have tried a number of approaches for the ranking algorithm in my project, as follows:
a) Regression model

b) Classification model

c) Machine learning models such as SVM, decision trees and deep neural networks

d) Learning-to-rank algorithms such as those in CatBoost

e) Any other suitable algorithm

QUESTION

I would like to ask which of the algorithms above would be most appropriate and applicable to this project. If any of you have come across similar projects or have some ideas about it, any suggestions would be much appreciated. Thank you in advance!

Top comments (2)

Fahmi Noor Fiqri

I guess a regression model is the way to go. Try out scikit-learn's decision tree regressor or CatBoost's regression models. CatBoost is very good for data with many categorical features.
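To make the regression idea concrete, here is a minimal sketch with scikit-learn's DecisionTreeRegressor. The feature columns and the influence scores below are made-up toy values, not taken from the post, and the model predicts on its own training rows only to show how predicted scores become a ranking.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data. Columns: followers, engagement_rate, post_per_week.
X = np.array([
    [10_000, 2.5, 3],
    [250_000, 1.2, 5],
    [50_000, 4.8, 7],
    [1_000_000, 0.9, 2],
    [5_000, 6.1, 4],
], dtype=float)

# Hypothetical influence scores used as the regression target (assumed labels).
y = np.array([35.0, 60.0, 72.0, 80.0, 40.0])

# Fit a regression tree that maps the metrics to a score.
model = DecisionTreeRegressor(random_state=0)
model.fit(X, y)

# Predict a score per influencer, then sort so the highest score comes first.
scores = model.predict(X)
order = np.argsort(-scores)
ranking = [(int(i), float(scores[i])) for i in order]

for rank, (row, score) in enumerate(ranking, start=1):
    print(f"rank {rank}: row {row}, predicted score {score:.1f}")
```

In a real project the scores would be predicted for influencers held out from training, and categorical columns such as categories would need encoding first (or a library like CatBoost that handles them directly).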

Also, you might want to consider which features to include in your regression model. I think username is not appropriate here, since it is just an identifier. And don't forget to record the evaluation metrics (MSE, MAE, RMSE, etc.) for each model so you can compare which one performs best.
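As a rough illustration of comparing models on these metrics, the sketch below computes MAE, MSE and RMSE in plain Python. The true scores and the two sets of predictions (model_a, model_b) are hypothetical numbers chosen only for the example.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE for two equal-length sequences."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}

# Hypothetical true scores and predictions from two candidate models.
y_true = [35.0, 60.0, 72.0, 80.0]
predictions = {
    "model_a": [30.0, 65.0, 70.0, 85.0],
    "model_b": [40.0, 50.0, 72.0, 70.0],
}

results = {name: regression_metrics(y_true, pred) for name, pred in predictions.items()}

# Lower is better for all three metrics; here RMSE picks the winner.
best = min(results, key=lambda name: results[name]["RMSE"])

for name, metrics in results.items():
    print(name, metrics)
print("best by RMSE:", best)
```

scikit-learn ships the same metrics in sklearn.metrics (for example mean_absolute_error and mean_squared_error), so the hand-written function is only there to show the formulas.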

You might want to try some hyperparameter search too (grid search, random search, etc.) to fine-tune your model. I suggest using MLflow to keep track of your experiments.
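A grid search could look something like the sketch below, using scikit-learn's GridSearchCV. The data is synthetic and the parameter grid is an arbitrary assumption for illustration; experiment tracking with MLflow is left out to keep the example small.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the influencer metrics (3 numeric features, 60 rows).
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 3))
# Assumed target: a noisy linear mix of the first two features.
y = 50 * X[:, 0] + 30 * X[:, 1] + rng.normal(scale=2.0, size=60)

# Arbitrary example grid; tune the ranges for the real data.
param_grid = {
    "max_depth": [2, 4, 6],
    "min_samples_leaf": [1, 3, 5],
}

# 5-fold cross-validation, scored by (negated) RMSE so that higher is better.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_  # flip the sign back to a plain RMSE

print("best parameters:", best_params)
print("cross-validated RMSE:", round(best_rmse, 3))
```

RandomizedSearchCV has a very similar interface and is usually the cheaper choice once the grid gets large.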

Anne Lim

Okie sure, really appreciate your reply. Will try out these few approaches!