Dynamic characterization of a large Web graph

Abstract

The Web is extremely dynamic, as shown by the rapid and significant growth it has experienced in the last decade and by its continuous evolution through the creation and deletion of pages and hyperlinks. Consequently, analyzing the temporal evolution of the Web has become a crucial task that can provide search engines with valuable information for refining crawling policies, improving ranking models, or detecting spam. In this paper we study a temporal dataset [6] consisting of twelve 100M-page snapshots of the .uk domain. We analyze the data at the level of interconnection between hosts, studying the temporal evolution of 3 500 sites with respect to a number of topological properties, including degrees, number of degree supporters and eigenvector distributions. Our results show that a major fraction of the sites exhibit very stable behavior. However, a non-negligible percentage of hosts is characterized by increasing or decreasing evolution patterns. In-te

Keywords

Computer science, Hyperlink, Information retrieval, Novelty, Graph, Web page, World Wide Web, Web crawler
