The Yandex ranking source code was leaked in 2022. Around 1,922 ranking factors were leaked publicly in about 45 GB of source files. We took the opportunity to go through those files and extract the ranking factors used by Yandex. The complete file of all ranking parameters is available here on WebMarketingSchool. The code reveals the ranking factors of the Yandex search engine.
We have compiled a list of the leaked Yandex ranking parameters and grouped them into three categories:
On-Page SEO Ranking factors
Off-Page SEO Ranking factors
Technical SEO Ranking factors
The complete list of factors starts below.
On-Page Ranking factors (Yandex).
On-page optimization means the changes we make to our webpage in order to improve its rankings in search engines. Here are a few on-page ranking factors of the Yandex search engine.
Text relevance (Maxfreq is the frequency of the most frequent word that makes sense of the length of the document).
Binary factor: it has a value of 0 for all single-word queries and a value of 1 for almost all queries of two or more words, except for a very small number of results for which no single link has passed the quorum
All the words of the query appear somewhere in the document
All the words of the query appear consecutively in the document.
All the words of the query appear in one link.
(Phras) All the words of the query appear consecutively in one line.
The presence of the exact phrase (the text of the query) in the header (more precisely, in the first sentence of the document). Contextual restrictions and feet are taken into account exactly as in TRP2, i.e. Factor [8] minors factor [5]
There was a plot that has passed the quorum in which all the word positions are designated as having the relevance of Best_relev (title or Meta Keywords).
Long document (the longer the document, the greater the value of the factor).
The sum of the IDFs of the query words. The name does not reflect the essence: for example, for the query 'Gadyach' this factor will be higher than for the query 'Moscow Peter Yekaterinburg Samara'.
Long text without links.
The country of the user and the country of the site coincide.
Coincidence of the thematic spectra of the query and the document. The query theme is the result of the SubquerySearch wizard rules (http://wiki.yandex-team.ru/evgenijkroxalev/subquery). The subject of the document is taken from Yandex.Catalog
The complex Static Rank is assembled from static components in a separate formula
The factor about the number of refines. The queries in the language have a photo of User Refines ('The Word, Before which there is a percentage sign'). As an idea, it means something like
Lemma coincides in text relevance.
DSSM model, trained for reformulations, in the documentary part uses relevant proposals
The document's LR > 20 and the number of query words in the links > 16; a factor about LR. (LR here most likely stands for link relevance, not Load Runner.)
URL with high LR.
The Popularity of the Request
TR divided by the cube of the number of words in the query and transformed by the standard REMAPTR.
Document language - Russian. (For Google this would be English for worldwide targeting, while regional domains are promoted at the local level.)
The time the page was added; a larger value means an older document. The root of the time is mapped to the interval [0, 1] so that 3+ years gives 1.
If this is the main page of the owner (most often a second-level domain, for example, xxxx.ru), the factor is 1. For free hosting, personal blogs, etc. (for example, LiveJournal, narod, etc.), third-level domains (such as xxxxx.narod.ru) will also have a factor equal to 1.
The time the owner's (host's?) main page was added; remapped the same way as Addtime.
How often this URL is clicked for this query - CTR blasting for the correction factor
There is advertising on the site (a negative ranking factor).
The presence of pairs of words in the exact form
The number of sentences in which there are many words in the exact form
The presence of words in the title in the exact form
The presence of pairs of words based on synonyms (> = txtpair)
The presence of words in the title based on synonyms
How often the URLs of this Domainid are clicked for this query - CTR Domainid downstream for the correction factor
The owner's clickability regardless of the query
There are all the words of the request in the links
Link relevance from Gulin.
There is an exact form of all the words of the request in the text/links
There is a lemma of all the words of the request in the text/links
Quality of the text (classifier Alekseev)
The core of the audience of the owners according to Yandex.Browszing
Spam karma (named after Antispamer): the likelihood that the host is spam, based on Whois information
The length of the document in sentences
The length of the URL, divided by 5
Document type - HTML
Link relevance taking into account the quality of each link
Fast document
Page from ru.wikipedia.org
Commercial page (Classifier of Savina)
The document does not have all the words of the request (with the exact synonym)
The percentage of all the words of the request in the text (with the accuracy of the form)
The degree of centralization of the points from which the request is set
Whether the query contains blog vocabulary
The content of content is not used.
The ratio of the number of clicks on this URL to all clicks on request
How often this URL is clicked for this query - CTR dormant for the correction factor
How often this URL is clicked for this query - CTR blasting for the correction factor, by small regions from Relev_regions.web.txt
TR of the best passage - how high-quality a snippet can be made
TR with a discount for the sentence number
Clickable domain by words
There is an old date in the URL, which is used to recognize old news. The factor is 1 if the URL contains a year <= 2007.
The weight of the maximum coincidence of forms in the text and the query
The page contains something about 'payment SMS'.
Three levels of coincidence of the region of links and request
Geographical proximity
The average, across users, of the active continuous time (in seconds) a user spends on the host's pages after arriving from the search engine for the query (the factor depends on the pair (query, Domattr)).
The average, across users, of the number of active actions (clicks, keystrokes) during the user's continuous stay on the host's pages after arriving from the search engine for the query (the factor depends on the pair (query, Domattr)). In the intra core, the Yandex.Bar/Elements/Browser meter
The number of unique visitors from search engines for a specific request
Average for users, the number of active actions (clicks, keystrokes) on the page after the transition on request from the search engine (the factor depends on the pair (request, URL))
Geographical proximity of the user and site
The number of unique visitors, remembrance exponentially
The share of traffic from search engines
The share of visits to the site that do not come via links (typed in by hand or opened from bookmarks)
The number of unique visitors to the URL
The average time users spend on the page, computed as the difference between neighboring transitions.
Link factor about the availability of video on the page.
Geographical distribution of request
The query is made mainly at night
The query is made mainly in the morning
The query is made mainly during the day
The query is made mainly in the evening
The degree of severity of the queries at different times of the day
The weight of the words of the request that is in the text and links
Entropy - distribution of clicks
Entropy - distribution of impressions
Entropy - distribution of clicks/impressions
The ratio of the amount of IDFs met in a sentence+Title to all words.
Request about the video
The owner's clickability, regardless of the query, computed separately by region
Domain in the .com zone
Poemality of the Document
Document language - English
Requestful factors - the result of work
The document has textual relevance
An indicator of the unnaturalness of the text from the point of view of the Russian language: the number of bad word pairs in the text, mapped to the interval [0, 1] by the formula z/(z+10)
The number of words in the text (a word is whatever the lemmatizer selected), mapped to [0, 1] by the formula x/(x+a)
The average length of the word
The percentage of the number of words inside the tag <a> .. </a> from the number of all words
The percentage of the number of words outside the tags (outside the brackets <>) from the number of all words
The percentage of the number of words, which are 200 of the most frequent words of the language, from the number of all words of the text.
The number of the language's 500 most popular words used in the text, divided by 500
The number of the language's 500 most popular words used in the text, divided by 500
Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams), mapped to [0, 1] by the formula -x (x+a)
There is a big picture on the page
The share of pronoun adjectives
The share of pronoun nouns
The share of verbs
The share of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples:
The number of different internal links to the page
Frequency of links to the site
The number of almost periodic links
The number of Owner impressions for the query, normalized as x/(100 + x).
The number of URL impressions for the query, normalized as x/(100 + x).
The popularity of the Owner in the query
A factor indicating that the snippet is good.
The language of the document corresponds to the query language
Popularity of the request within the country
The degree of centralization of the points from which the request (within the country) is set
Geographical distribution of the request within the country
The hour at which this request is given the most
The degree of severity of the task of requests at different times of the day (inside the country)
The country of the document (domain) and the country of the user coincide.
Country classifier of localization - how much the request implies the context of the country
The service factor that was needed to search the site, and in the future it will still be needed.
Probably model built in texts of incoming links
The number of clicks on the owner and the number of clicks on request is more than 5
The factor evaluates the differences between the positions of words in the title and the positions of words in the query
The URL's length, accurate to a single character.
The degree of borenness of the page title.
The number of those in-line links between hosts
The average dwell time; a session's DwellTime is trimmed if it exceeds 180 seconds
The probability that a click on the URL lasts more than 120 seconds
The probability that the URL is not clicked if at least one URL below it is clicked.
The average dwell time; a session's DwellTime is trimmed if it exceeds 3600 seconds.
The average dwell time; a session's DwellTime is trimmed if it exceeds 180 seconds.
The core of the page of the pages on which there is a metric counter
The share of clicks on this URL among all clicks on similar requests
The clickableness of the host according to the latest request
URL contains a token that coincides with the short name of the user country. The factor is considered only on the EU stream.
URL is an offer in the latest version of the market base.
The average length of the logical session in which there was a request
The document contains a name from a request.
The document has a direct link to the file
Regional attendance from search engines for a specific request
Clicks on the URLs shown in the search results for queries after which users went on to search in other search engines
Impressions of the URLs in the search results for queries after which users went on to search in other search engines
Classifier of the commerciality of the site
The document contains user review/comment
The share of clicks on this URL among all clicks on similar queries, the country version
The average number of found on request
The average position of URL on a normalized request
The average position of URL for all requests
The average position of the host for all requests
Number of URL queries
Number of requests for host
The share of the words of the document from segments with Score> 2.
The weight of the document on a monosyllabic dictionary of commercial vocabulary
The number of requests in the group of frequency requests similar to a given
The share of clicks on this URL among all clicks on similar requests calculated according to Popular Search Engine
The length of the Depth Nodes petal calculated for hosts
The dispersion of the angle in the space of Nodes Time, calculated for hosts
The average, per query, probability of downloading a file from the host after a click.
Factor of content of content.
CTR according to click data; the query is normalized according to synsets
Regional CTR according to click data; the query is normalized according to synsets
The number of chains on request / (the number of chains in which URL + the number of chains on request participated).
The number of chains in which URL was the last, normalized for the total number of chains in which this URL was.
Number of transitions to URL from Wikipedia
An indicator of the page as a hub (how many pages Bar users go to from it).
The probability of URL is the last on request in the Hopes chain.
DSSM Prediction of the probability of URL + Title, that the page has one product.
DSSM Prediction of the probability of URL + Title, that the page has a lot of goods.
Factor by name from the original request is considered according to the contents of the document.
Owner has an E Com purchase.
User return on URL
Rank of hacking sites
The share of users who returned within a month
The number of users returned within a month
The share of the capital letters in Title
The share of incoming traffic from search engines among all incoming traffic
The share of direct visits among all incoming traffic
Neural model of content quality for medical subjects
Neural model of content quality for SOS topics
The ratio of the total area of all Flash blocks to the screen area
URL is a channel/post from a verified account of a social network
The distance from the city, from where a request was set
Logarithm of the number of shows.
The probability that the URL is clicked if at least one URL above it is not clicked.
The probability that the URL is not clicked if at least one URL below it is clicked.
Ordinary CTR. Localization to the level of countries.
The average dwelltime, and DwellTime from the session is cut if more than 3600 seconds. Localization to the level of countries.
The probability that the URL click will be more than 120 seconds. Localization to the level of countries.
The percentage of traffic from social networks in all traffic from other hosts and search.
The average number of direct descendants from the host on which more than 90 seconds were spent. Only if there is a link from our page for the descendant and crossed it.
The average maximum depth of word with the root in the current URL is visited from other hosts.
The ratio of the number of times users went to the page to the total number of pages they visited from the SERP. The closer to 1, the more often the page was opened in the session.
The average length of search sessions in which users came from the SERP
Static URL factor from browser logs for the maximum period. The probability that the user will spend > 120 seconds on the page.
The average time spent on the page and in all descendants of the page (URLS to which they crossed) from the host. WHOLE, if the total DT is more than 10 minutes
Probability of jumping from the page
The likelihood of image surges from the page
The document has a turbo page for mobile platforms.
BM25FDPR with standardization on the average length of the document, depending on the language of the document. Only texts hits are used.
The document uses the HTTPS protocol
Domain in the international zone
The request was recognized as having intent to copyright objects protected by the anti-pirate memorandum.
The host contains pirate videos protected by anti-pirate memorandum.
The host contains a video protected by anti-pirate memorandum.
Average surplus of freshness of the host in 30 days
The share of documents with a positive freshness from the host for 30 days
The average position of the owner at the request for the last week.
The average ratio of punctuation to all separators in the documents of the owner.
The value of the freshness detector, calculated in the hippo. Always 0 with the value of the detector less than the threshold.
The host contains a video protected by anti-pirate memorandum.
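Several of the on-page factors above are described by small numeric transforms: the x/(x+a), z/(z+10) and x/(100 + x) normalizations, the dwell time averages trimmed at 180 or 3600 seconds, and the entropy of the click distribution. The sketch below is only an illustration of those descriptions, not Yandex code. The function names are hypothetical, the trimming is assumed to cap each value at the threshold (dropping such visits is another possible reading), and the entropy is assumed to be Shannon entropy in bits.

```python
import math
from typing import Iterable, Sequence


def saturate(x: float, a: float) -> float:
    """Map a non-negative value x into [0, 1) using x / (x + a)."""
    if x < 0 or a <= 0:
        raise ValueError("x must be >= 0 and a must be > 0")
    return x / (x + a)


def capped_mean_dwell(dwell_seconds: Sequence[float], cap: float) -> float:
    """Average dwell time, with each visit capped at `cap` seconds.

    Capping is an assumption about what 'trimmed' means in the list.
    """
    if not dwell_seconds:
        return 0.0
    return sum(min(t, cap) for t in dwell_seconds) / len(dwell_seconds)


def click_entropy(clicks_per_url: Iterable[int]) -> float:
    """Shannon entropy (in bits) of how clicks are spread across URLs."""
    counts = [c for c in clicks_per_url if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts)


# Bad word pairs, z / (z + 10): 10 bad pairs land exactly halfway.
unnatural = saturate(10, 10)
# Impressions for a query, x / (100 + x), with 300 impressions.
shows = saturate(300, 100)
# Three visits with a 180-second cap: 600 seconds is counted as 180.
dwell = capped_mean_dwell([30, 600, 90], cap=180)
# Clicks split evenly over four results.
entropy = click_entropy([25, 25, 25, 25])

print(unnatural, shows, dwell, entropy)
```

The x/(x+a) shape grows quickly for small counts and flattens for large ones, which is presumably why it is used to squeeze unbounded counts into a fixed range.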
Off-Page Ranking Factors (Yandex) (Link Building)
Yandex gives great importance to pages that get backlinks from other webpages. Here are a few ranking factors related to this.
Link relevance. The factor is remapped.
Quality of incoming links (classifier of the bream) - broken, cm
The number of incoming links. Remap.
TFIDF: ordinary TF*IDF over links. The frequency of a word in the links is multiplied by the inverse document frequency and summed over all words, then normalized by the length of the document.
Link relevance, taking into account the non-profitness of each link and the quality of each link
The percentage of incoming links with the words of the request
The number of links that match the text of the request (other Remap)
The average age of the links that contributed to LR: LinkAge = min(log(average age of the link)/7, 1); 3 years is taken as 1
The share of incoming paid links. An algorithm for recognizing commercial links is implemented. The factor is remapped to [0, 1] if the share of such links is 50%, otherwise 0. (http://wiki.yandex-team.ru/svetlanashorina/topseolinks Sample of wound sites)
It characterizes the frequency of words in the links. The factor is large if a word that contributed to link relevance is rare in links.
The presence in the links of the pairs of words taking into account synonyms
The number of links that passed the threshold
Link relevance with pessimization for greater Link age
Quality of incoming links (hausel classifier) corrected
The quality classifier of incoming links 2 corrected
Link relevance without taking into account rare words
Dispersion of the number of words in the links.
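Two of the link factors above come with explicit formulas: the link age remap min(log(average age)/7, 1) and the TF*IDF over link texts. The sketch below illustrates both under stated assumptions and is not taken from the leaked code. The function names and the sample IDF values are hypothetical. The logarithm is assumed to be natural and the age measured in days, which matches the remark that 3 years gives 1, since ln(1095) is about 7. Normalizing by the document length is taken literally as a division.

```python
import math
from typing import Mapping, Sequence


def link_age_factor(avg_age_days: float) -> float:
    """min(ln(average link age in days) / 7, 1); ages under one day give 0."""
    if avg_age_days < 1:
        return 0.0
    return min(math.log(avg_age_days) / 7.0, 1.0)


def link_tfidf(
    link_texts: Sequence[str],
    query_words: Sequence[str],
    idf: Mapping[str, float],
    doc_length: int,
) -> float:
    """Sum of (word frequency in link texts * IDF) over the query words,
    divided by the document length."""
    if doc_length <= 0:
        return 0.0
    tokens = [w for text in link_texts for w in text.lower().split()]
    score = sum(tokens.count(w) * idf.get(w, 0.0) for w in query_words)
    return score / doc_length


# Links that are three years old on average (1095 days) score close to 1.
three_years = link_age_factor(3 * 365)
# Much older links are clamped at 1.
very_old = link_age_factor(5000)

# Hypothetical anchor texts and IDF values for a two-word query.
anchors = ["buy red shoes", "red shoes online"]
sample_idf = {"red": 2.0, "shoes": 1.5}
tfidf = link_tfidf(anchors, ["red", "shoes"], sample_idf, doc_length=100)

print(three_years, very_old, tfidf)
```

Because of the clamp at 1, any link profile older than roughly three years on average gets the same maximum value from this formula.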
You can also read and find 'Is it relevant to buy backlink in increase ranking?'
Yandex Technical SEO Ranking Factors.
We did not find any explicit ranking factor for technical SEO, but the faster a webpage loads, the higher it tends to rank on the search engine.
Conclusion
We have tried to decode these ranking factors so that they can help the SEO community.