Bilkent University
Department of Computer Engineering
Alpay Erdem
M.S. in Computer Engineering
Supervisor: Assist. Prof. Dr. Attila Gursoy
With the increased use of the web, large volumes of click-stream data embedded in server logs have become available for revealing user access patterns, especially on specific web sites. Presenting web content efficiently through the link structure is very important for efficient use of a site. Web usage mining can be used to improve web site design by finding deficiencies of the site that are hidden in user access patterns.
Misleading web site design causes users to spend much more time reaching target pages, either by following redundant paths or by getting lost in cyberspace without ever finding them. Furthermore, the needs and interests of users change over time, which requires re-structuring of the web site. Web sites should therefore be updated according to user expectations: the most popular pages should be easily accessible, conceptually related pages should either be categorized close to each other or be linked, and misleading guidance that directs users to pages other than the target should be detected. However, merely finding frequent sequences is not sufficient for improving a web site, since the discovered frequent patterns cover both the interesting patterns used for reaching popular pages and the redundant patterns followed before reaching the target page(s). Frequent backward references embed knowledge of both redundant and related pages, depending on the degree of user interest in those pages.
In this thesis, we propose a web usage mining framework that offers re-design suggestions for web site improvement. In the usage preprocessing part of this framework, user navigation sessions are analyzed and both forward and backward references are obtained by taking cached pages into account. To interpret backward and forward references in terms of user interest, we also incorporate the time domain by determining the viewing time of each visited page. In the mining and interpretation part, frequent interesting and redundant patterns are discovered and interpreted in order to make popular pages more visible, link related pages, and report misleading guidance or categorization.
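The preprocessing idea described above can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a session is already available as an ordered list of (page, timestamp) pairs in which revisits are present (the thesis reconstructs cached revisits, which plain server logs would not contain), it treats any revisit of a page on the current path as a backward reference, and it approximates viewing time as the gap between consecutive requests. The names `split_references` and `view_times` are hypothetical.

```python
from typing import List, Optional, Tuple

Visit = Tuple[str, float]  # (page, request time in seconds); assumed input shape


def split_references(session: List[Visit]):
    """Split one session into maximal forward paths and backward references."""
    path: List[str] = []                    # current forward path from the entry page
    forward_paths: List[List[str]] = []     # forward paths ended by a backward move
    backward: List[Tuple[str, str]] = []    # (page left, page returned to)
    extending = False                       # True while the path is still growing

    for page, _ in session:
        if path and page == path[-1]:
            continue                        # reload of the same page: ignore
        if page in path:
            # Revisit of an earlier page: treat as a backward reference.
            if extending:
                forward_paths.append(list(path))
            backward.append((path[-1], page))
            path = path[: path.index(page) + 1]
            extending = False
        else:
            path.append(page)
            extending = True

    if extending and path:
        forward_paths.append(list(path))
    return forward_paths, backward


def view_times(session: List[Visit]) -> List[Tuple[str, Optional[float]]]:
    """Approximate viewing time per visit as the gap to the next request."""
    times: List[Tuple[str, Optional[float]]] = [
        (page, t_next - t) for (page, t), (_, t_next) in zip(session, session[1:])
    ]
    if session:
        times.append((session[-1][0], None))  # last page: viewing time unknown
    return times


session = [("A", 0), ("B", 10), ("C", 15), ("B", 40), ("D", 45), ("A", 100), ("E", 130)]
forward_paths, backward = split_references(session)
times = view_times(session)
```

In this example the user goes A, B, C, backs up to B, continues to D, backs up to A, and ends at E. A short viewing time on a page that precedes a backward move (here B before C, 5 seconds) is the kind of signal that could mark the page as a pass-through rather than a target, although the thresholds and interpretation rules would have to come from the thesis itself.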
DATE: March 27, 2002, Wednesday @ 16:00
PLACE: EA-409