---
_id: '11914'
abstract:
- lang: eng
  text: 'Previous studies of the Web graph structure have focused on the graph structure
    at the level of individual pages. In actuality the Web is a hierarchically nested
    graph, with domains, hosts and Web sites introducing intermediate levels of affiliation
    and administrative control. To better understand the growth of the Web we need
    to understand its macro-structure, in terms of the linkage between Web sites.
    We approximate this by studying the graph of the linkage between hosts on the
    Web. This was done based on snapshots of the Web taken by Google in Oct 1999,
    Aug 2000 and Jun 2001. The connectivity between hosts is represented by a directed
    graph, with hosts as nodes and weighted edges representing the count of hyperlinks
    between pages on the corresponding hosts. We demonstrate how such a "hostgraph"
    can be used to study connectivity properties of hosts and domains over time, and
    discuss a modified "copy model" to explain observed link weight distributions
    as a function of subgraph size. We discuss changes in the Web over time in the
    size and connectivity of Web sites and country domains. We also describe a data
    mining application of the hostgraph: a related host finding algorithm which achieves
    a precision of 0.65 at rank 3.'
article_processing_charge: No
author:
- first_name: K.
  full_name: Bharat, K.
  last_name: Bharat
- first_name: Bay-Wei
  full_name: Chang, Bay-Wei
  last_name: Chang
- first_name: Monika H
  full_name: Henzinger, Monika H
  id: 540c9bbd-f2de-11ec-812d-d04a5be85630
  last_name: Henzinger
  orcid: 0000-0002-5008-6530
- first_name: M.
  full_name: Ruhl, M.
  last_name: Ruhl
citation:
  ama: 'Bharat K, Chang B-W, Henzinger MH, Ruhl M. Who links to whom: Mining linkage
    between Web sites. In: <i>1st IEEE International Conference on Data Mining</i>.
    Institute of Electrical and Electronics Engineers; 2001:51-58. doi:<a href="https://doi.org/10.1109/ICDM.2001.989500">10.1109/ICDM.2001.989500</a>'
  apa: 'Bharat, K., Chang, B.-W., Henzinger, M. H., &#38; Ruhl, M. (2001). Who links
    to whom: Mining linkage between Web sites. In <i>1st IEEE International Conference
    on Data Mining</i> (pp. 51–58). San Jose, CA, United States: Institute of Electrical
    and Electronics Engineers. <a href="https://doi.org/10.1109/ICDM.2001.989500">https://doi.org/10.1109/ICDM.2001.989500</a>'
  chicago: 'Bharat, K., Bay-Wei Chang, Monika H Henzinger, and M. Ruhl. “Who Links
    to Whom: Mining Linkage between Web Sites.” In <i>1st IEEE International Conference
    on Data Mining</i>, 51–58. Institute of Electrical and Electronics Engineers,
    2001. <a href="https://doi.org/10.1109/ICDM.2001.989500">https://doi.org/10.1109/ICDM.2001.989500</a>.'
  ieee: 'K. Bharat, B.-W. Chang, M. H. Henzinger, and M. Ruhl, “Who links to whom:
    Mining linkage between Web sites,” in <i>1st IEEE International Conference on
    Data Mining</i>, San Jose, CA, United States, 2001, pp. 51–58.'
  ista: 'Bharat K, Chang B-W, Henzinger MH, Ruhl M. 2001. Who links to whom: Mining
    linkage between Web sites. 1st IEEE International Conference on Data Mining. ICDM:
    International Conference on Data Mining, 51–58.'
  mla: 'Bharat, K., et al. “Who Links to Whom: Mining Linkage between Web Sites.”
    <i>1st IEEE International Conference on Data Mining</i>, Institute of Electrical
    and Electronics Engineers, 2001, pp. 51–58, doi:<a href="https://doi.org/10.1109/ICDM.2001.989500">10.1109/ICDM.2001.989500</a>.'
  short: K. Bharat, B.-W. Chang, M.H. Henzinger, M. Ruhl, in:, 1st IEEE International
    Conference on Data Mining, Institute of Electrical and Electronics Engineers,
    2001, pp. 51–58.
conference:
  end_date: 2001-12-02
  location: San Jose, CA, United States
  name: 'ICDM: International Conference on Data Mining'
  start_date: 2001-11-29
date_created: 2022-08-18T07:12:46Z
date_published: 2001-12-01T00:00:00Z
date_updated: 2023-02-17T09:59:47Z
day: '01'
doi: 10.1109/ICDM.2001.989500
extern: '1'
language:
- iso: eng
month: '12'
oa_version: None
page: 51-58
publication: 1st IEEE International Conference on Data Mining
publication_identifier:
  isbn:
  - 0-7695-1119-8
  issn:
  - '15504786'
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'Who links to whom: Mining linkage between Web sites'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2001'
...
