SEO and AI – A look into the future

Artificial Intelligence (AI) is on everyone's lips. Unfortunately, most of the time the term is used inappropriately.

The label gets applied to approaches and techniques that are very far from the definition of Artificial Intelligence that experts in the field usually refer to. Today a small algorithm or a minimum of automation is enough for something to be called AI.

So why should we talk seriously about SEO and AI?

Because Google uses Artificial Intelligence.

Google Search has not (yet) become an omnipotent entity with a life of its own, but we know for certain that it uses many tools belonging to the complex system of technologies we currently call artificial intelligence.

And it would be strange if it didn't.

The task of Google Search is immense: to find, crawl, understand and index the web in order to answer any question with relevant content.

With Artificial Intelligence, Google has found a solution.

Google's problem is not so much finding and downloading documents from the web. The hard part comes next: understanding the content of the pages, interpreting the intent of the users who performed the search, and relating the two in order to propose the most relevant results.

At the beginning, Google took the easiest road: it analyzed pages and searches for what they were, strings of characters. If the string corresponding to the user's query was found on a page, the problem was solved. And to choose the best result, it was enough to look at how many inbound links the matching pages had.
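
Purely as a toy illustration (not the actual historical algorithm), that era's logic can be sketched in a few lines of Python: keep the pages containing the exact query string and order them by inbound links. The pages list here is invented for the example.

```python
# Toy sketch of early "string matching + link counting" ranking (illustrative only).
pages = [
    {"url": "a.example", "text": "best chocolate cake recipe ever", "inbound_links": 12},
    {"url": "b.example", "text": "chocolate cake recipe chocolate cake recipe", "inbound_links": 40},
    {"url": "c.example", "text": "running shoes on sale", "inbound_links": 90},
]

def naive_rank(query: str, pages: list[dict]) -> list[dict]:
    """Keep pages containing the exact query string, ordered by inbound links."""
    matches = [p for p in pages if query in p["text"]]
    return sorted(matches, key=lambda p: p["inbound_links"], reverse=True)

for page in naive_rank("chocolate cake recipe", pages):
    print(page["url"], page["inbound_links"])
# b.example outranks a.example purely on link count: content quality never enters the picture.
```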

As many will remember, this led to techniques and terms still in vogue today (keyword stuffing and keyword density), whose goal is to "deceive" the algorithm regardless of the actual quality of the page content.

This solution was not sustainable in the long term: manipulating the search results was relatively easy, and that had deleterious effects on quality. The SERPs filled up with low-value content created solely for the purpose of ranking for a particular search.

Google and Artificial Intelligence


Starting in 2013, with the update called "Hummingbird", Google chose the more difficult path: trying to understand human language.

But this road had a gigantic limitation: for a machine, understanding human language is very difficult. So how do you achieve the goal?

Over the centuries, many theories have been proposed to explain how language carries meaning, without ever arriving at a definitive "algorithm" capable of reliably decoding the meaning of our words. A linguist would talk about understanding the relationship between langue and parole. However, algorithms capable of decent approximations have been created: extraordinarily complex algorithms that for decades remained little more than an exercise in abstraction.

In recent years we have witnessed a great change that has made it possible to put into practice the mathematical and statistical theories developed under the umbrella of Artificial Intelligence: the computing power necessary to run these algorithms has finally become available.

Machine Learning and Natural Language Processing

The real turning point came with the development of a specific category of algorithms that allow machines to learn specific tasks and then perform them independently: so-called Machine Learning.

The basic concept is quite simple in theory, a little less in practice.

You start with data related to questions for which the desired answer is already known. These data are analyzed by statistical algorithms, and the model that comes closest to reproducing the known answers is chosen. The machine has then "learned" and can repeat the same process on new data.

You can provide hints or labeled examples at the start (supervised learning) or let the machine find structure in the data on its own (unsupervised learning). The more layers of computation are added, the more the Learning becomes Deep.
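
As a minimal sketch of the supervised case, assuming scikit-learn and an invented toy dataset: a model is fitted on queries whose answer (a topic label) is already known, and can then label a query it has never seen.

```python
# Minimal supervised-learning sketch (illustrative only, toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Training data: questions for which the desired answer is already known.
queries = ["chocolate cake recipe", "how to bake a sponge cake",
           "running shoes for marathon", "best sneakers for trail running"]
labels = ["recipes", "recipes", "shoes", "shoes"]

# Fit a simple statistical model on the known examples (the "learning" step).
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(queries, labels)

# The machine can now repeat the process on data it has never seen.
print(model.predict(["gluten free cake ingredients"]))  # expected: ['recipes']
```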

For all of this to work, in addition to computing power and statistically significant results, you need lots and lots of data.

As it happens, Google has the entire web at its disposal.

In addition to algorithms, computing power and a large amount of data, only one piece is missing if we want to understand the text of a page without relying on a letter-by-letter comparison with the user's query: making machines digest written text.

Computers are wonderful machines when dealing with numbers and with everything that is "digital" (i.e. everything that is discrete), but they are far less comfortable with everything that is analog, like language.

Thanks to the recently achieved computing power, however, it is finally possible not only to analyze the text as a sequence of letters, but also to relate words, concepts and sentences to each other.

Thanks to models that go under the name of Natural Language Processing (NLP), it is possible to find the logical and semantic connections within a textual corpus of any size.
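
As a very rough sketch of what relating texts to one another can mean, assuming scikit-learn and an invented three-sentence corpus: even a simple bag-of-words representation lets a machine measure how close two texts are, a first crude step toward relating concepts.

```python
# Minimal sketch: turn texts into vectors and compare them (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "The metro takes you from Piazza del Duomo to Bisceglie.",
    "From Piazza del Duomo you can reach Bisceglie by metro.",
    "Whisk the eggs with sugar before folding in the flour.",
]

# Each document becomes a numeric vector: text becomes something "digital".
vectors = TfidfVectorizer().fit_transform(corpus)

# The two Milan sentences score far higher with each other than with the recipe.
print(cosine_similarity(vectors).round(2))
```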

This is the union that made the latest algorithm updates possible (along with the development of technologies such as voice assistants).

What allowed Google to "understand" documents and queries was a combination of Machine Learning and NLP, used on a scale never seen before.

From semantic search to RankBrain

With this constantly updated wealth of knowledge, Google has succeeded in building a huge graph of semantic and syntactic relationships that allows it to answer questions directly and to interpret complex requests.

This graph is the product of the analysis of billions of documents. Google can therefore understand that, if in a hypothetical document I talk about "starting from Piazza del Duomo, taking the subway and getting off at Bisceglie", I am not talking about just any city with a Duomo or a subway, and certainly not about the Apulian municipality of Bisceglie. It understands that I am referring to Milan, even though the word "Milan" never appears in the text. This kind of inference is trivial for us, but getting such "abstract" reasoning out of a machine is remarkable.

Imagine this process applied to all the words, concepts, places and people you can think of, and you get a sense of the size of Google's semantic knowledge graph.
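
As a toy illustration of the kind of inference described above (in no way Google's actual method), a hand-built, hypothetical lookup of entities and their candidate cities is enough to "deduce" Milan from a text that never names it:

```python
# Toy entity disambiguation via a hand-built, hypothetical "knowledge graph".
KNOWN_ENTITIES = {
    "piazza del duomo": {"Milan", "Florence"},     # ambiguous: exists in both cities
    "bisceglie": {"Milan", "Bisceglie (Apulia)"},  # a metro stop and an Apulian town
    "subway": {"Milan"},                           # Florence has no subway
}

def infer_city(text: str) -> set[str]:
    """Intersect the candidate cities of every known entity mentioned in the text."""
    candidates = None
    for entity, cities in KNOWN_ENTITIES.items():
        if entity in text.lower():
            candidates = cities if candidates is None else candidates & cities
    return candidates or set()

print(infer_city("Starting from Piazza del Duomo, take the subway and get off at Bisceglie."))
# -> {'Milan'}
```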

Obviously, this knowledge was immediately put to work, both in returning direct answers (as in the answers shown in the SERP and in the Knowledge Graph) and in identifying the most pertinent documents to show for a particular search (in updates such as Hummingbird and Panda).

At the end of 2015, RankBrain arrived and the explicit reference to the use of Artificial Intelligence brought this revolution to the fore.

RankBrain is the most striking example of the Machine Learning techniques that Google uses or could use. There are many other patents that build on the semantic analysis these techniques make possible.

RankBrain was actually born as a query rewriting algorithm. Google estimates that 15% of daily searches are new, that is, they have never been seen by Google before.

To answer these unknown queries, Google relies on RankBrain to transform them into known queries (with known answers). In practice, Google is able to paraphrase, interpreting new structures and concepts based on what it already knows.
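
The idea can be sketched as follows, assuming scikit-learn and an invented list of known queries. A real system would rely on learned vector representations rather than TF-IDF, so this is only an illustration of the principle: represent queries as vectors and map an unseen query onto the closest known one.

```python
# Illustrative sketch of query rewriting (not Google's implementation).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

known_queries = [
    "chocolate cake recipe",
    "best running shoes",
    "milan metro map",
]

vectorizer = TfidfVectorizer()
known_vectors = vectorizer.fit_transform(known_queries)

def rewrite(new_query: str) -> str:
    """Return the known query most similar to a never-seen-before query."""
    new_vector = vectorizer.transform([new_query])
    scores = cosine_similarity(new_vector, known_vectors)[0]
    return known_queries[scores.argmax()]

print(rewrite("easy recipe for a chocolate birthday cake"))  # -> "chocolate cake recipe"
```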

While experimenting with this model, Google engineers realized that RankBrain helped surface documents more relevant than those returned by the traditional ranking signals alone. So its field of application was expanded to all searches, not only the unknown ones.

Not just text

The application of this method of analysis certainly goes beyond text alone.

We know that Google is able to recognize (quite well) the content of images and is at the forefront of voice interpretation (hence Voice Search and video).

We must also consider that a web page is not just the body of the main text. There are all the meta tags that contain text and are therefore interpretable, plus everything outside the main content, such as menus, sidebars and footers.

Furthermore, to evaluate a page correctly, the site has to be considered as a whole. So, on top of the evaluations derived from the backlink profile, an evaluation of the site's overall content is added (as happens with Panda).
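
As a minimal sketch of what "reading" a page beyond its main text can look like, assuming the requests and BeautifulSoup libraries and a hypothetical URL: extract the interpretable meta information, then strip typical boilerplate containers (menus, sidebars, footers) before collecting the remaining text.

```python
# Minimal sketch: separate meta tags, main text and boilerplate (hypothetical URL).
import requests
from bs4 import BeautifulSoup

html = requests.get("https://example.com/some-article", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

title = soup.title.get_text(strip=True) if soup.title else ""
description_tag = soup.find("meta", attrs={"name": "description"})
description = description_tag["content"] if description_tag else ""

# Drop typical boilerplate containers before extracting the main text.
for tag in soup.find_all(["nav", "aside", "footer", "header", "script", "style"]):
    tag.decompose()
main_text = soup.get_text(" ", strip=True)

print(title, description, main_text[:200], sep="\n")
```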

What can be done?

Google's communication has been quite clear over the years: nothing can be done, other than writing "great content".

In itself it is a great piece of advice, but in light of all the facts above, how do you understand what constitutes “great” content for Google?

If I wanted to write a recipe for a cake or a description of a pair of shoes, what would make my content as close as possible to the Platonic idea of "cake recipe" or "sneaker description" for Google?

A recipe site and a shoe eCommerce are evaluated differently by Google; we can deduce this simply by looking at the SERPs and their different configurations. A cinema blog and a health blog are subject to different algorithmic rules (as clearly seen with the August 2018 Medic Update), because they deal with totally different topics.

Our vision

Our goal is to face Google on its own ground.

We will never be able to compete in the volume of data analyzed, but we can use the same logical processes and the same tools.

SEOs must address this change by embracing complexity and making their actions scalable.

Google is more complex than ever to decipher, but Google itself has shown us the way to tackle such a problem: adopt technological solutions appropriate to it.

Google has learned to understand and judge a text using the tools of AI, so we propose to use the same tools for the reverse process: analyzing and studying the documents that the SERPs return to us, to discover which elements Google uses to determine its answers.

Using ML and NLP, we can approximate the reasoning models in play and obtain concrete answers.
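
As a simplified sketch of this reverse analysis, assuming scikit-learn and using invented placeholder texts in place of the pages that actually rank in a SERP we care about: aggregating term weights across the top-ranking documents hints at which vocabulary characterizes the results Google has chosen.

```python
# Simplified sketch of analyzing top-ranking documents (placeholder texts).
from sklearn.feature_extraction.text import TfidfVectorizer

top_ranking_pages = [
    "Preheat the oven, whisk eggs and sugar, then fold in flour and cocoa.",
    "This chocolate cake recipe needs flour, eggs, sugar, butter and cocoa.",
    "Bake the cake for forty minutes; let it cool before adding the icing.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(top_ranking_pages)

# Average TF-IDF weight of each term across the ranking pages:
# high-scoring terms hint at what the SERP "expects" a document to cover.
mean_weights = matrix.mean(axis=0).A1
terms = vectorizer.get_feature_names_out()
top_terms = sorted(zip(terms, mean_weights), key=lambda t: -t[1])[:8]
print(top_terms)
```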

Our approach aims to be both holistic, through the analysis of pages and sites as a whole together with all the "offsite" factors, and vertical, through the study of how Google treats specific industries and specific search intents.

Above all, we want to go beyond the classic audit method, beyond simple green-light checks and "present / not present" answers. This is why we have developed a tool that can offer targeted, contextual solutions.

Thanks to neural networks we can study the complex and hidden relationships behind a search result.

Above all, we want to offer a proactive system: practical help in daily SEO management, with suggestions and specific solutions for the site we work on, the goals we set and the ecosystem in which we operate.

To find out more about SEO and AI, take part in our Digital Breakfast dedicated to the topic.