Studies
How search tools work: freshness
October 2003 / @position
I - The average freshness for Google, AltaVista, Alltheweb and Inktomi
II - Focus on “AltaVista Dance”.
 
I - The average freshness for Google, AltaVista, Alltheweb and Inktomi

Last June’s edition of the « Revue du Référencement » published a study comparing the database sizes of French-language search engines (see: la taille des outils de recherche francophones, in French).

This month, the focus is on the freshness of the pages returned by search engines.

Among all the satisfaction criteria of web surfers using a search tool, a database containing the most recent information is a key factor.

For a webmaster or a search engine optimiser, being able to determine the operating rhythm of a search engine (the date when pages are captured and the date when they appear in the results) is fundamental to running an efficient web-visibility campaign.

Bear in mind that, for some search tools, the “freshness” of a page (the estimated age of a page or of its information) is becoming a ranking criterion.

Freshness management is one of the most complicated issues for a search engine. The main players handle crawling, document indexing calculations and the publication of new databases in different ways. These differences are significant for internet users.

Study methodology

The number of pages created on a given date and present in each search engine's database was measured every day from August 11th to October 10th, 2003. To avoid overloading the search engines with automated queries, we limited this study to pages created up to 250 days before the measurement date.

Based on these daily measurements, we were able to calculate the average age of the pages available. Please note that this study deals with pages newly added to the databases, not with updates of already indexed pages, which are sometimes treated differently.
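As a rough illustration of how an average page age can be derived from one day's measurements, here is a minimal Python sketch. It assumes (the study does not spell this out) that each measurement yields a count of indexed pages per creation date and that the average is weighted by those counts. The function name and the sample figures are hypothetical.

```python
from datetime import date

MAX_AGE_DAYS = 250  # the study only considers pages created up to 250 days earlier


def average_page_age(counts_by_creation_date, measure_date, max_age_days=MAX_AGE_DAYS):
    """Return the count-weighted average age (in days) of the indexed pages.

    counts_by_creation_date maps a creation date to the number of pages
    created on that date that were found in the engine's database.
    Pages older than max_age_days (or dated in the future) are ignored.
    Returns None when no page falls inside the window.
    """
    total_pages = 0
    total_age = 0
    for created, count in counts_by_creation_date.items():
        age = (measure_date - created).days
        if 0 <= age <= max_age_days:
            total_pages += count
            total_age += age * count
    if total_pages == 0:
        return None
    return total_age / total_pages


# Hypothetical counts for one engine on one measurement day (not study data).
sample = {
    date(2003, 8, 1): 300,  # 10 days old on the measurement day
    date(2003, 5, 3): 100,  # 100 days old
    date(2002, 1, 1): 999,  # outside the 250-day window, ignored
}
print(average_page_age(sample, date(2003, 8, 11)))  # 32.5
```

Repeating this calculation for every measurement day would give one curve of average age per search engine, which is the kind of series discussed in the results.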

Results and Analysis:

From August 11th, the following trends were observed:



  • As expected, Google is the search engine that offers the most recent information. The pages in its database that were created in the last 250 days have an average age of 100 days.
    If a search engine indexed the same number of pages every day and published them immediately, without ever removing them from its database, the average page would be 125 days old.
    Google is the only search engine whose average page age is below this model. This difference means that the average time a page spends in Google's database without being checked is less than 250 days.
    The two-month curve also shows that the Google Dance, as we knew it, has disappeared. Incomplete but daily updates of the database do not cause significant breaks in the average page age.


  • The second notable element of this study concerns AltaVista. We were fortunate to capture an “AltaVista Dance”, that is, a major update of its database.
    Around September 21st, AltaVista added a very large number of new pages to its database. The sudden decrease in the average page age indicates the precise date on which this insertion took place.
    The fact that the average page age decreases by 30 days may indicate a monthly cycle. The second part of the study shows that these pages were mostly captured in July (two months earlier). It is thus understandable that the average page age after this update is still high (135 days).

  • Fast/Alltheweb integrates new pages in a different way. Updates are made by replacing parts of the global database with a “fresher” version. These updates, although substantial, do not modify the whole database: they seem to concern only 25% to 33% of its indexed pages.
    Alltheweb announced that it had a bigger index than Google just before its last major update.

  • Inktomi is the last search engine in our study. There have been no major insertions in this search engine since the beginning of the study. The last major inclusions were of pages captured in March and at the beginning of April 2003.
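Two of the observations above, the 125-day reference value and the sudden drop that marks a major update, can be sketched in a few lines of Python. This is only an illustration: the uniform-indexing model is the one described in the text, while the function names, the 20-day threshold and the sample series are hypothetical and are not the study's data.

```python
def uniform_model_average_age(window_days=250):
    """Average age when the same number of pages is indexed every day,
    published immediately and never removed, for ages 0..window_days."""
    ages = range(window_days + 1)
    return sum(ages) / len(ages)


def find_major_updates(daily_average_ages, min_drop_days=20):
    """Return the indexes of the days on which the average page age fell by
    at least min_drop_days compared with the previous day.

    The threshold is an arbitrary assumption, not a value from the study.
    """
    return [
        day
        for day in range(1, len(daily_average_ages))
        if daily_average_ages[day - 1] - daily_average_ages[day] >= min_drop_days
    ]


print(uniform_model_average_age())  # 125.0

# Hypothetical series: the age grows by one day per day, then drops by 30 days.
series = [162, 163, 164, 165, 135, 136, 137]
print(find_major_updates(series))  # [4]
```

With daily but incomplete updates, as described for Google, no single day would show such a drop, so a detector of this kind would return an empty list.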
