How do News Feed algorithms work? Buckle in, this will go into all sorts of detail, but hopefully you’ll leave with a full overview. These algorithms are really important to the internet, yet I’ve found very little written about how they actually work.
I’ll work through Facebook as an example, but most feeds work the same way. I haven’t worked on News Feed myself, so I’m not divulging any inside knowledge, but how these systems work is widely shared within the industry.
Facebook wants you to keep coming back so that it becomes a daily utility in your life. To keep you engaged, it should offer you interesting content to read. This content comes largely from a pool of posts, photos, etc. created by your friends and the pages you like. Let’s call them all stories for simplicity. The pool of stories you are eligible to see is your set of candidate stories. Facebook has candidate-selection rules around what counts as a candidate: for example, the story should be created or liked by someone in your friend or follow graph, and to maintain freshness there’s a time window from which stories are considered.
When someone creates a post, a reference to that story is pushed to a candidate index/stream. When a friend logs in, News Feed crawls through all of their friends’ indexes to pull up stories they might be interested in, then moves on to rank them. For those of you interested in infrastructure, there are a few interesting problems here, but I’ll move on. Also note that people with a search background will often call everything that happens before ranking the retrieval phase.
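To make the candidate phase concrete, here is a minimal sketch in Python. The in-memory index layout, the one-week window, and the function names are all my own illustrative assumptions, not Facebook’s actual design (real systems use distributed indexes and streams):

```python
import time

# Toy in-memory candidate index: author_id -> list of (timestamp, story_id).
# A hypothetical one-week freshness window for candidate stories.
CANDIDATE_WINDOW_SECS = 7 * 24 * 3600

def gather_candidates(friend_ids, index, now=None):
    """Crawl each friend's index and keep only stories inside the time window."""
    now = time.time() if now is None else now
    candidates = []
    for friend in friend_ids:
        for ts, story_id in index.get(friend, []):
            if now - ts <= CANDIDATE_WINDOW_SECS:
                candidates.append(story_id)
    return candidates
```

Everything this returns then flows into the ranking stage described next.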
Ranking is the bread and butter of these products. To keep you engaged, you need to show captivating stories, which are mainly identified by signals like your likelihood to click, like, comment, or share. These signals are also called actions, and they can be explicit (e.g., liking) or implicit (e.g., time spent on a story before returning to the feed).
The central idea behind news feeds is to use ML on past behavior to predict the probability of each action, determine the most engaging stories, and surface them.
If you have some familiarity with ML, you can frame this as an optimization problem where the objective function (also called the loss function or value function) to be maximized is how accurately you predict actions, given features such as your click-through rate on an author, language features, content type, etc. There are usually thousands to tens of thousands of features, and this is where most of the engineering work takes place and where product intuition is encoded.
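As a toy illustration of action prediction, here is a tiny logistic model over a handful of hand-named features. The feature names and weights are invented for illustration; a production model would have thousands of features and be trained, not hand-written:

```python
import math

def predict_action_probability(features, weights, bias=0.0):
    """A minimal logistic model. `features` and `weights` are dicts keyed by
    feature name; features with no learned weight contribute nothing."""
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))

# Purely illustrative feature names and weights.
weights = {"author_click_through_rate": 2.0, "is_video": -0.5, "caption_length_log": 0.1}
story_features = {"author_click_through_rate": 0.9, "is_video": 1.0, "caption_length_log": 3.0}
p_click = predict_action_probability(story_features, weights)
```

The same shape of model is run once per action type (click, like, share, ...), producing the per-action probabilities combined below.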
To recap: we pull fresh stories from the candidate streams and rank them by an action-prediction score based on many features. Often the objective function is composed of multiple action predictions, e.g. aP(like) + bP(share) − cP(report).
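That combined objective can be sketched directly. The coefficients a, b, c below are made-up placeholders, since choosing them is exactly the kind of product decision described above:

```python
def story_score(p_like, p_share, p_report, a=1.0, b=2.0, c=5.0):
    """Combine predicted action probabilities into a single score.
    The coefficients a, b, c are illustrative placeholders."""
    return a * p_like + b * p_share - c * p_report

def rank_stories(stories):
    """stories: list of (story_id, p_like, p_share, p_report); best first."""
    return sorted(stories, key=lambda s: story_score(*s[1:]), reverse=True)
```

Note how a heavy negative weight on P(report) can push a high-click story below a modest but safe one, which previews the clickbait discussion below.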
I hope you are starting to see the limits of action prediction and why clickbait emerges as a problem. One can make the naive argument that maximizing clicks => maximizing impressions => maximizing ad dollars, and that’s how the industry has more or less operated for a while.
But the burnout is real, and the most money is in high-intent advertising (travel, weddings, etc.) rather than just any clicks, so companies are trying to find ways to optimize for long-term metrics rather than short-term measures. The problem with long-term metrics is that they are too nebulous and too rare to do effective machine learning on: think of a satisfaction survey, a qualitative feedback report, or worse, a user abandoning your product for good.
Thus, we approximate long-term engagement with short-term leading indicators (and often just intuition about human behavior), and encode them into the ranking using strategic boosts (e.g. a multiplier on the predicted click score for stories posted by friends) or constraints (e.g. always show the story of a major event in a friend’s life at the top). We can also encode this in the value function, for example by putting extra weight on stories that are well rated in user surveys, or vice versa.
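A rough sketch of how such boosts and constraints might sit on top of the model score; the 1.5× close-friend multiplier and the story field names are hypothetical, not real production values:

```python
def adjusted_score(base_score, story):
    """Apply a hand-tuned boost on top of the model score.
    The 1.5x close-friend multiplier is a hypothetical value."""
    if story.get("from_close_friend"):
        return base_score * 1.5
    return base_score

def final_order(scored_stories):
    """scored_stories: list of (model_score, story_dict) pairs.
    Constraint: major life events always sort ahead of everything else."""
    return sorted(
        scored_stories,
        key=lambda pair: (pair[1].get("major_life_event", False),
                          adjusted_score(pair[0], pair[1])),
        reverse=True,
    )
```

The boost reshapes scores continuously, while the constraint is a hard override that no score can beat; real systems mix both.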
Often, you need to rank stories of different content types, for example videos and text, in the same feed. A video objective function might be time spent watching, but that will almost always differ from (and often exceed) the equivalent for text, so we can either calibrate model scores by normalizing the score distributions for each type of story, or rely on fixed-composition models, where it is determined that the optimal mix of stories for a user is, e.g., (text, video, photo, photo), etc.
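The calibration option can be sketched as per-type z-score normalization. This is one simple way to make the distributions comparable, not necessarily what any production feed does:

```python
from statistics import mean, pstdev

def calibrate(scores_by_type):
    """Z-score normalize each content type's score distribution so that,
    e.g., video watch-time scores become comparable with text click scores."""
    calibrated = {}
    for ctype, scores in scores_by_type.items():
        mu = mean(scores)
        sigma = pstdev(scores) or 1.0  # guard against a zero-variance type
        calibrated[ctype] = [(s - mu) / sigma for s in scores]
    return calibrated
```

After calibration, a video that is one standard deviation above typical videos scores the same as a text post one standard deviation above typical text posts, so they can compete in one ranked list.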
And that’s how the sausage is made. It is important to note that modern systems do not simply rank by hand-tuned formulas like affinity coefficients or EdgeRank, but use those kinds of signals as features. It’s also important to note that this is just the tip of the iceberg, and there are many fascinating details, varying widely from company to company, that take one of these systems from adequate to an excellent user experience.
This question originally appeared on Quora – the place to gain and share knowledge, empowering people to learn from others and better understand the world. You can follow Quora on Twitter, Facebook, and Google+.