  • iSeeker: Project Ideas
    [web-communities]

    No matter how fast and disorganised the web grows, there is still a tendency for resources which relate to common subjects to cluster together.
    A good example of this is the frequent chaining of web sites dedicated to the same subject through a shared navigation bar (normally placed at either the top or the bottom of each site’s main page). These chains are normally known as ‘web-rings’ and can be thought of as web-communities.
    Even when web-rings are not used, we frequently find good sites on some subject containing a page with links to other related sites on the web.
    This tendency for organised clustering could be extremely useful for any internet information retrieval tool.

    Even though present search tools can’t really understand what the user is searching for (only the addition of a good common-sense knowledge base will overcome this problem), they can still identify potential ‘web-communities’, thereby building up a useful level of abstraction.
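
    As a rough illustration of how potential web-communities might be identified from link structure alone, the sketch below groups pages into connected components of the link graph. This is only one possible heuristic and is an assumption, not a method stated in this project: links are treated as undirected, and the function name `find_communities` and the `.example` hosts are hypothetical.

```python
from collections import defaultdict


def find_communities(links):
    """Group pages into candidate communities.

    Assumption: a community is a connected component of the link graph,
    with every link treated as undirected.
    """
    neighbours = defaultdict(set)
    for source, target in links:
        neighbours[source].add(target)
        neighbours[target].add(source)

    seen = set()
    communities = []
    for page in sorted(neighbours):
        if page in seen:
            continue
        # Collect every page reachable from this one.
        component, stack = set(), [page]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        communities.append(component)
    return communities


# Hypothetical link data: (linking page, linked page).
links = [
    ("dolphins-fans.example", "nfl-news.example"),
    ("nfl-news.example", "miami-team.example"),
    ("marine-life.example", "ocean-mammals.example"),
]
communities = find_communities(links)
for members in communities:
    print(sorted(members))
```

    On real web data a single stray link would merge two unrelated clusters under this heuristic, so a practical tool would likely need a stricter criterion, such as a minimum number of mutual links.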

    A great advantage of this approach comes from combining it with user feedback.
    Whenever a user supplies feedback on a particular page, other pages belonging to the same web-community should be affected since they somehow relate to each other.

    For example, if a user searches for 'dolphins' without supplying any other clues, the search tool would probably find two main clusters of closely linked sites: the Miami Dolphins American football team sites, and sites related to the mammal. Say that, in this case, the user was indeed looking for the football team. He/she will give negative feedback to pages like National Geographic or Friends of the Earth, which will bring that web-community’s weight down, while positive feedback on pages related to the football team will raise the weight of that web-community and, with it, all of its member sites.
    Without knowing quite why, the user will see many irrelevant sites dropping in the ranking while the pages which interest him/her rise to the top of the list. The tool is doing all the background work, based on the web-communities and the feedback from the user.
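
    The feedback mechanism in the 'dolphins' example can be sketched as follows. The scoring rule (base score plus community weight), the function names, the numeric values and the `.example` hosts are all assumptions made for illustration; the project text does not fix any of them.

```python
def apply_feedback(community_weights, page_community, page, delta):
    """Shift the weight of the community that `page` belongs to."""
    community = page_community[page]
    community_weights[community] = community_weights.get(community, 0.0) + delta


def rank(base_scores, page_community, community_weights):
    """Order pages by base score plus their community's weight (assumed rule)."""
    def score(page):
        return base_scores[page] + community_weights.get(page_community[page], 0.0)
    return sorted(base_scores, key=score, reverse=True)


# Hypothetical search results for 'dolphins'.
page_community = {
    "mammal-facts.example": "mammal",
    "ocean-charity.example": "mammal",
    "team-site.example": "football",
    "fan-forum.example": "football",
}
base_scores = {
    "mammal-facts.example": 0.9,
    "ocean-charity.example": 0.8,
    "team-site.example": 0.7,
    "fan-forum.example": 0.6,
}

weights = {}
# The user wanted the football team: one negative and one positive vote.
apply_feedback(weights, page_community, "mammal-facts.example", -0.5)
apply_feedback(weights, page_community, "team-site.example", +0.5)

ranking = rank(base_scores, page_community, weights)
print(ranking)
```

    Only two pages received feedback, yet the other member of each community moves with them, which is the effect described above.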

    However, we shouldn’t take an approach where a page either belongs to a community or does not, as in a binary state (1/0 or on/off). Instead we should think of a page as having a degree of membership, based on the number of links to and from that page within the community. So a page which is linked to only one other site in the community will have a ‘weaker’ membership than a page which links to a number of other sites within the same community.
    This makes sense because pages with a very strong membership in the community should be more affected by the community’s weight (whether positive or negative) than a page which is barely a member.
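
    A minimal sketch of such a degree of membership is given below. It counts the links to or from a page that stay inside the community, as described above; scaling the count to the range 0..1 by the largest count in the community, and multiplying it by the community weight, are assumptions added for illustration, as are the function names and `.example` hosts.

```python
def membership_degrees(community, links):
    """Degree of membership per page, between 0.0 and 1.0.

    Counts links to or from each page that stay inside the community,
    then divides by the largest count (normalisation is an assumption).
    """
    counts = {page: 0 for page in community}
    for source, target in links:
        if source in community and target in community and source != target:
            counts[source] += 1
            counts[target] += 1
    largest = max(counts.values(), default=0)
    if largest == 0:
        return {page: 0.0 for page in community}
    return {page: count / largest for page, count in counts.items()}


def adjusted_score(base_score, membership, community_weight):
    """Strong members feel the community weight more than weak ones."""
    return base_score + membership * community_weight


community = {"hub.example", "fan-a.example", "fan-b.example", "fringe.example"}
links = [
    ("hub.example", "fan-a.example"),
    ("hub.example", "fan-b.example"),
    ("fan-a.example", "fan-b.example"),
    ("fringe.example", "hub.example"),
    ("fringe.example", "unrelated.example"),  # leaves the community, not counted
]

degrees = membership_degrees(community, links)
community_weight = -0.3  # the community received negative feedback
for page in sorted(community):
    print(page, round(adjusted_score(0.5, degrees[page], community_weight), 3))
```

    With a negative community weight, the heavily linked hub page loses the most score while the fringe page, which has a single link into the community, is barely affected.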