Major Challenges In Web Analytics


Abstract

Analytics technologies that mine large amounts of structured and unstructured data for insights are becoming increasingly important to businesses. Current web analytics tools focus on e-commerce sites, where a visit is expected to end in a purchase. The behavior of users in e-learning environments, by contrast, is driven by information acquisition. In this paper we analyse an educational website, avatto.com, using two analytics tools: Google Analytics and StatCounter.

Keywords: Google Analytics, analytics for online education, StatCounter, online analytical tools

1. Introduction

Each time a resource hosted on a web server is requested, a record is written to the web server log. A resource can be an HTML file which is typically …

Main Challenges

Data processing in web analytics starts with determining unique visitors and visits.

3.1 Unique user identification

One major challenge in web analytics is identifying unique visitors. One method identifies them by IP address and User Agent [6]. An alternative is to use cookies, which Google Analytics and other web analytics tools rely on because IP addresses are not always unique to a user and may be shared by large groups or proxies. However, there are circumstances in which both methods (IP + User Agent, and cookies) are inaccurate.

3.2 Multiple IP addresses - Single Visitor

An individual who accesses the website from different locations or devices will have a different IP address (and, respectively, a different cookie ID) from visit to visit and will thus be counted more than once. This makes it difficult to track repeat visits by the same user.

3.3 Multiple User Agents - Single …

Moreover, cookies can be deleted or blocked. The most accurate solution is to identify individuals by their registered user account information, especially in e-learning, since most web-based educational systems require user authentication. This solution is the most realistic one, but it can be implemented only in an integrated system.

3.4 Visit/Session identification

Identifying visits accurately is not a trivial task, mainly because the HTTP protocol is stateless and connectionless. It is therefore virtually impossible to determine whether a user is still consulting the site, is visiting other sites, or has actually left the website. Moreover, some ISPs or privacy tools randomly assign each request from a user to one of several IP addresses. In these cases, although rare, a single server session can have multiple IP addresses. There are three main heuristics that are generally used to determine when a visit ends:

1) A temporal heuristic that restricts the duration of the entire visit to a predefined upper bound (usually accepted as 30 minutes) [9]
2) A temporal heuristic that limits the time spent on any page to a threshold value, accepted as 30 minutes according to [5],
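As an illustration only, the IP + User Agent visitor key from Section 3.1 and the two temporal heuristics listed above might be combined as in the following Python sketch. The record layout (the `ip`, `user_agent` and `time` fields) and the function names are assumptions made for this example; they do not come from the paper or from any particular analytics tool.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Both thresholds use the 30-minute values quoted in the text.
MAX_VISIT_DURATION = timedelta(minutes=30)  # heuristic 1: whole-visit bound
MAX_PAGE_STAY = timedelta(minutes=30)       # heuristic 2: per-page bound


def visitor_key(record):
    """Approximate a unique visitor by IP address plus User Agent.

    The field names are hypothetical; real log formats differ.
    """
    return (record["ip"], record["user_agent"])


def split_into_visits(records, max_visit=MAX_VISIT_DURATION,
                      max_stay=MAX_PAGE_STAY):
    """Group log records per visitor, then cut each group into visits.

    A new visit starts when the gap since the previous request exceeds
    max_stay, or when the request falls more than max_visit after the
    first request of the current visit.
    """
    by_visitor = defaultdict(list)
    for record in records:
        by_visitor[visitor_key(record)].append(record)

    visits = []
    for key, hits in by_visitor.items():
        hits.sort(key=lambda r: r["time"])
        current = [hits[0]]
        for hit in hits[1:]:
            gap = hit["time"] - current[-1]["time"]
            duration = hit["time"] - current[0]["time"]
            if gap > max_stay or duration > max_visit:
                visits.append((key, current))
                current = [hit]
            else:
                current.append(hit)
        visits.append((key, current))
    return visits


# Made-up sample data: two User Agents behind the same IP address.
log = [
    {"ip": "10.0.0.1", "user_agent": "A", "time": datetime(2024, 1, 1, 9, 0)},
    {"ip": "10.0.0.1", "user_agent": "A", "time": datetime(2024, 1, 1, 9, 10)},
    {"ip": "10.0.0.1", "user_agent": "A", "time": datetime(2024, 1, 1, 10, 0)},
    {"ip": "10.0.0.1", "user_agent": "B", "time": datetime(2024, 1, 1, 9, 5)},
]
visits = split_into_visits(log)
```

In this made-up log, the third request from User Agent A arrives 50 minutes after the second, so it opens a new visit, while User Agent B is treated as a separate visitor even though it shares the IP address. Note that when both thresholds are equal, as in the text, the whole-visit bound always triggers whenever the per-page bound does; the per-page check only matters if it is set lower.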

