Who links to whom: Mining linkage between Web sites

Bharat K, Chang B-W, Henzinger MH, Ruhl M. 2001. Who links to whom: Mining linkage between Web sites. 1st IEEE International Conference on Data Mining. ICDM: International Conference on Data Mining, 51–58.

Conference Paper | Published | English

Scopus indexed
Author
Bharat, K.; Chang, Bay-Wei; Henzinger, Monika (ISTA); Ruhl, M.
Abstract
Previous studies of the Web graph structure have focused on the graph structure at the level of individual pages. In actuality the Web is a hierarchically nested graph, with domains, hosts and Web sites introducing intermediate levels of affiliation and administrative control. To better understand the growth of the Web we need to understand its macro-structure, in terms of the linkage between Web sites. We approximate this by studying the graph of the linkage between hosts on the Web. This was done based on snapshots of the Web taken by Google in Oct 1999, Aug 2000 and Jun 2001. The connectivity between hosts is represented by a directed graph, with hosts as nodes and weighted edges representing the count of hyperlinks between pages on the corresponding hosts. We demonstrate how such a "hostgraph" can be used to study connectivity properties of hosts and domains over time, and discuss a modified "copy model" to explain observed link weight distributions as a function of subgraph size. We discuss changes in the Web over time in the size and connectivity of Web sites and country domains. We also describe a data mining application of the hostgraph: a related host finding algorithm which achieves a precision of 0.65 at rank 3.
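The hostgraph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_hostgraph`, the sample URLs, and the decision to skip links that stay on the same host are all assumptions made for illustration. It collapses page-level hyperlinks into directed host-to-host edges whose weight is the number of hyperlinks between pages on the two hosts.

```python
from collections import Counter
from urllib.parse import urlsplit


def build_hostgraph(page_links):
    """Collapse (source URL, target URL) pairs into weighted host-level edges.

    Returns a Counter mapping (source_host, target_host) to the number of
    page-level hyperlinks between the two hosts.
    """
    edges = Counter()
    for src_url, dst_url in page_links:
        src_host = urlsplit(src_url).hostname
        dst_host = urlsplit(dst_url).hostname
        # Assumption: ignore malformed URLs and links that stay on one host.
        if src_host is None or dst_host is None or src_host == dst_host:
            continue
        edges[(src_host, dst_host)] += 1
    return edges


# Hypothetical page-level links, for illustration only.
page_links = [
    ("http://a.example/index.html", "http://b.example/home.html"),
    ("http://a.example/about.html", "http://b.example/news.html"),
    ("http://a.example/index.html", "http://a.example/about.html"),
    ("http://b.example/home.html", "http://c.example/"),
]
hostgraph = build_hostgraph(page_links)
for (src, dst), weight in sorted(hostgraph.items()):
    print(f"{src} -> {dst} (weight {weight})")
```

In this toy input the two links from pages on `a.example` to pages on `b.example` merge into a single edge of weight 2, while the link within `a.example` is dropped under the stated assumption.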
Publishing Year
2001
Date Published
2001-12-01
Proceedings Title
1st IEEE International Conference on Data Mining
Publisher
Institute of Electrical and Electronics Engineers
Page
51-58
Conference
ICDM: International Conference on Data Mining
Conference Location
San Jose, CA, United States
Conference Date
2001-11-29 – 2001-12-02

Cite this

Bharat K, Chang B-W, Henzinger MH, Ruhl M. Who links to whom: Mining linkage between Web sites. In: 1st IEEE International Conference on Data Mining. Institute of Electrical and Electronics Engineers; 2001:51-58. doi:10.1109/ICDM.2001.989500
Bharat, K., Chang, B.-W., Henzinger, M. H., & Ruhl, M. (2001). Who links to whom: Mining linkage between Web sites. In 1st IEEE International Conference on Data Mining (pp. 51–58). San Jose, CA, United States: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICDM.2001.989500
Bharat, K., Bay-Wei Chang, Monika H Henzinger, and M. Ruhl. “Who Links to Whom: Mining Linkage between Web Sites.” In 1st IEEE International Conference on Data Mining, 51–58. Institute of Electrical and Electronics Engineers, 2001. https://doi.org/10.1109/ICDM.2001.989500.
K. Bharat, B.-W. Chang, M. H. Henzinger, and M. Ruhl, “Who links to whom: Mining linkage between Web sites,” in 1st IEEE International Conference on Data Mining, San Jose, CA, United States, 2001, pp. 51–58.
Bharat K, Chang B-W, Henzinger MH, Ruhl M. 2001. Who links to whom: Mining linkage between Web sites. 1st IEEE International Conference on Data Mining. ICDM: International Conference on Data Mining, 51–58.
Bharat, K., et al. “Who Links to Whom: Mining Linkage between Web Sites.” 1st IEEE International Conference on Data Mining, Institute of Electrical and Electronics Engineers, 2001, pp. 51–58, doi:10.1109/ICDM.2001.989500.
