Haimo Zhang, Shengdong Zhao

 

Measuring Web Page Revisitation in Tabbed Browsing

Take-away Messages

The conventional measure largely underestimates the revisitation rate under tabbed browsing. In our paper, we propose a revised method for measuring revisitation in tabbed browsing.

The effective revisitation rate (EffRev%) is calculated as the number of effective revisits (or repeated page viewings) (#EffRev) divided by the total number of page viewings (#View). The definitions of each term are listed below.

  • Number of page viewings (#View). The total number of loading-based and tab-switching-based page viewings.
  • Number of effective revisits (#EffRev). A subset of all page viewings. Effective revisits are viewings of previously viewed pages.
  • Effective revisitation rate (EffRev%). The formula #EffRev/#View calculates the effective revisitation rate based on page viewings.

image

Has the frequency of users’ repeated page visits changed in recent years?

We found four previous studies that explicitly addressed revisitation rate and present them chronologically alongside our study (table 1). The first three studies used non-tabbed browsers, while the last two studies used tabbed browsers. The study of Dubroy and Balakrishnan [2] is not included since it did not explicitly report a revisitation rate. Before the introduction of tabbed browsers, effective web revisitation activities were accurately measured as the loading of previously loaded pages. This was done in studies 1, 2 and 3, all of which reported revisitation rates above 50%.

Did the introduction of tabbed browsers change this rate?
Study 4 suggested that there is a change: revisitation rate dropped to 43.7%. However, this may be misleading since the conventional measurement for revisitation was used in tabbed browsing (at least 15 out of the 25 participants in study 4 used tabbed browsers). Our study shows that the effective revisitation rate has not dropped that dramatically (59.6% is lower than study 3’s 81%, but is comparable with study 1’s 61% and study 2’s 58%) even with the introduction of tabbed browsers. Due to lack of tab switching data from the previous studies, we are unable to calculate effective revisitation rates for study 4.

 

ABSTRACT

Browsing the web has been shown to be a highly recurrent activity. Extensive previous research has examined users’ revisitation behavior with the aim of optimizing the browsing experience. However, the conventional definition of revisitation, which only considers page loading activities, largely underestimates revisitation activities in tabbed browsers. Thus, we introduce a goal-oriented definition and a refined revisitation measurement based on page viewings in tabbed browsers. An empirical analysis of statistics taken from a client-side log study showed that although the overall revisitation rate remained relatively constant, tabbed browsing has introduced new behaviors, which should be addressed in the design of new browsing interfaces and features.

Author Keywords

Web revisitation, tabbed browsing, effective revisitation.

ACM Classification Keywords

H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

General Terms

Human factors

INTRODUCTION

Browsing the web has been shown to be a highly recurrent activity [8]. Extensive research conducted to understand user behavior and optimize the browsing experience has focused on “web page revisitation”, a term that refers to repeated visits to previously visited web pages [2,4,5,6,7,8].

Previous studies, most of them conducted between 1994 and 2000, largely involved non-tabbed browsers [1,5,8], and revisitation was defined as “the repeated loading of a web page as identified by its URL” [4,8].

Using this definition, the revisitation rate is calculated using the number of repeated page loading events (the difference between the total number of page loading events and the number of distinct URLs loaded, termed as the size of the URL vocabulary [8]) divided by the total number of page loading events.
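The conventional calculation described above can be sketched in a few lines. This is a minimal illustration rather than the logger used in any of the cited studies; the function name and the sample URL list are hypothetical.

```python
def conventional_revisitation_rate(loaded_urls):
    """Share of page loading events that repeat an already-loaded URL."""
    total_loads = len(loaded_urls)
    if total_loads == 0:
        return 0.0
    vocabulary_size = len(set(loaded_urls))       # number of distinct URLs loaded
    repeated_loads = total_loads - vocabulary_size
    return repeated_loads / total_loads


# Hypothetical log: five page loads over three distinct URLs.
loads = ["a.com", "b.com", "a.com", "c.com", "a.com"]
rate = conventional_revisitation_rate(loads)
print(f"{rate:.1%}")  # 40.0%
```

Note that the calculation looks only at loading events, so it cannot see whether a loaded page was ever displayed.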

In 2009, Dubroy and Balakrishnan [2] conducted a study on user behaviors in the Mozilla Firefox browser, highlighting the significance of tab usage in revisiting web pages. In 2010, Huang and White [3] investigated parallel browsing, specifically how web searching tasks were performed with multiple tabs. Both papers used the conventional definition of revisitation and did not point out how repeated page visits in non-tabbed browsers differ from those in tabbed browsers.

Scope and Definition of Terms

From the user’s perspective, the purpose of “visiting” a web page is to get information. Thus, we propose the term “effective revisitation” to describe the repetition of obtaining information from a web page as identified by its URL. Since the majority of web content is visual (rather than audio), we focus on revisitation of visual content in this paper. Note that the experience with tabbed browsers could be approximated with non-tabbed browsers by launching multiple browser instances (and thus multiple windows). For simplicity, we assume a single browser instance in this paper.

A clear line needs to be drawn between the definition and the measurement of revisitation. Our proposed definition explains what revisitation is, rather than how it is measured. The conventional definition is somewhat misleading as it is actually a measurement of revisitations in non-tabbed browsers.

To understand why the conventional measurement for revisitation is insufficient under the tabbed browsing paradigm, several terms first need to be clarified:

  • Focused tab or current tab. The current visible tab in a browser. In tabbed browsers, only the tab that is being displayed is visible. Non-tabbed browsers could be thought of as having only one tab, which is always in focus.
  • Background tabs. They only exist in a tabbed browser. They contain opened pages, but unlike the focused tab, their content is not visible. All tabs except the focused tab are background tabs.
  • Page loading. A page loading event is recorded whenever an HTTP request is sent to the server.

revisitation fig.1

 

  • Page viewing. A page is viewable when the tab containing it is in focus. Since it is difficult to know whether the user is actually viewing a web page unless eye-tracking mechanisms are used, we assume that the user is viewing a page whenever it is displayed in the focused tab.

Limitations of the Conventional Measurement for Revisitation

In non-tabbed browsers (which display only one web page at a time and can therefore be regarded as single-tabbed browsers), all web pages are loaded into the only tab, which is always in focus, so they are displayed whenever they are loaded. Therefore, the conventional measurement for revisitation is able to determine the number of page viewings through the concurrence of page loading events and page displaying events, given the assumption that page displaying equals page viewing.

In tabbed browsers, however, a page loading is not always a page displaying. The conventional measurement cannot be used to determine the number of page viewings since it would introduce two types of errors:

  • Over-count of revisitation activities. In tabbed browsers, a background tab can be closed without being viewed. This introduces three types of over-count errors that will be counted as revisitations:
    • Type 1. A previously loaded-but-not-viewed page is loaded again and viewed;
    • Type 2. A previously loaded-but-not-viewed page is loaded again but not viewed; and
    • Type 3. A previously loaded-and-viewed page is loaded again but not viewed.
    In all of these cases, the same page was loaded several times but viewed not more than once. They are considered revisitations in the conventional definition but should be excluded under the new definition.
  • Under-count of revisitation activities. When users switch to a tab to display its content, which has been viewed before, no additional loading events are triggered. The conventional definition does not consider this behavior a revisitation; our proposed definition does.
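The two error categories above can be made concrete with a small classifier over a chronological event log. This is only a sketch under stated assumptions: the event tuples ("load", tab_id, url, focused) and ("switch", tab_id) are a hypothetical format, tab closing is not modeled, and a page counts as "previously viewed" only if it had been displayed before the load in question.

```python
from collections import Counter


def classify_errors(events):
    """Count type 1-3 over-count errors and under-count errors.

    Assumed (hypothetical) event shapes, in chronological order:
      ("load", tab_id, url, focused)  -- url is loaded into tab_id
      ("switch", tab_id)              -- tab_id becomes the focused tab
    """
    # Pass 1: was each load ever displayed before its tab loaded another page?
    load_viewed = {}   # index of a load event -> True if it was ever displayed
    active_load = {}   # tab_id -> index of the load that tab currently holds
    for i, event in enumerate(events):
        if event[0] == "load":
            _, tab_id, _, focused = event
            load_viewed[i] = focused
            active_load[tab_id] = i
        elif event[0] == "switch" and event[1] in active_load:
            load_viewed[active_load[event[1]]] = True

    # Pass 2: replay the log, tracking which URLs are loaded or viewed so far.
    counts = Counter()
    loaded, viewed = set(), set()
    current_url = {}   # tab_id -> URL currently held by that tab
    for i, event in enumerate(events):
        if event[0] == "load":
            _, tab_id, url, focused = event
            current_url[tab_id] = url
            if url in loaded:                    # a conventional revisit
                if url not in viewed:            # previously loaded but not viewed
                    counts["type1" if load_viewed[i] else "type2"] += 1
                elif not load_viewed[i]:         # previously viewed, not viewed now
                    counts["type3"] += 1
            loaded.add(url)
            if focused:
                viewed.add(url)
        elif event[0] == "switch" and event[1] in current_url:
            url = current_url[event[1]]
            if url in viewed:                    # a revisit with no loading event
                counts["under"] += 1
            viewed.add(url)
    return counts


# Hypothetical log that triggers each error once.
events = [
    ("load", 1, "a", True),    # a: loaded and viewed
    ("load", 2, "b", False),   # b: loaded in the background, never viewed
    ("load", 2, "b", False),   # type 2: b loaded again, still not viewed
    ("load", 3, "b", True),    # type 1: b loaded again and viewed for the first time
    ("load", 4, "a", False),   # type 3: a (already viewed) loaded again, not viewed
    ("switch", 1),             # under-count: a is viewed again without a load
]
counts = classify_errors(events)
print(counts["type1"], counts["type2"], counts["type3"], counts["under"])  # 1 1 1 1
```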

Proposed Approach for Measuring Revisitations

To accurately measure revisitation in tabbed browsing, we propose to focus on page viewings instead of page loadings. There are two types of page viewing activities in tabbed browsers:

  • Loading-based page viewing. A web page (uniquely represented by its URL) that is loaded into the focused tab is considered viewed.
  • Tab-switching-based page viewing. A page in a background tab is considered viewed when the tab becomes the focused tab.

The effective revisitation rate (EffRev%) is calculated as the number of effective revisits (or repeated page viewings) (#EffRev) divided by the total number of page viewings (#View). The definitions of each term are listed below.

  • Number of page viewings (#View). The total number of loading-based and tab-switching-based page viewings.
  • Number of effective revisits (#EffRev). A subset of all page viewings. Effective revisits are viewings of previously viewed pages.
  • Effective revisitation rate (EffRev%). The formula #EffRev/#View calculates the effective revisitation rate based on page viewings.

Note that the difference between total number of page viewings and URL vocabulary size (number of distinct URL loads) does not equal the number of effective revisits, as some URLs could have been loaded but never viewed.
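As an illustration of the viewing-based measurement, the sketch below computes #View, #EffRev and EffRev% from a chronological event log. The event format is an assumption made for this example, not the format of the study's logger: ("load", tab_id, url, focused) for a page load and ("switch", tab_id) for a tab switch.

```python
def effective_revisitation(events):
    """Return (#View, #EffRev, EffRev%) for a chronological event list.

    Assumed (hypothetical) event shapes:
      ("load", tab_id, url, focused)  -- url is loaded into tab_id
      ("switch", tab_id)              -- tab_id becomes the focused tab
    """
    current_url = {}   # tab_id -> URL currently held by that tab
    viewed = set()     # URLs that have been viewed at least once
    views = 0
    eff_revisits = 0

    for event in events:
        if event[0] == "load":
            _, tab_id, url, focused = event
            current_url[tab_id] = url
            if not focused:
                continue               # background load: loaded but not viewed
        elif event[0] == "switch":
            url = current_url.get(event[1])
            if url is None:
                continue               # empty tab: nothing to view
        else:
            continue                   # ignore unknown event kinds

        views += 1                     # loading-based or tab-switching-based viewing
        if url in viewed:
            eff_revisits += 1          # viewing of a previously viewed page
        viewed.add(url)

    rate = eff_revisits / views if views else 0.0
    return views, eff_revisits, rate


# Hypothetical log: four loads, three distinct URLs, one page never viewed.
events = [
    ("load", 1, "news", True),    # first viewing of news
    ("load", 2, "mail", False),   # background load
    ("switch", 2),                # first viewing of mail
    ("switch", 1),                # effective revisit of news
    ("load", 3, "docs", False),   # loaded but never viewed
    ("load", 1, "mail", True),    # effective revisit of mail
]
views, eff_revisits, rate = effective_revisitation(events)
print(views, eff_revisits, f"{rate:.1%}")  # 4 2 50.0%
```

In this example the URL vocabulary has three entries, so the number of viewings minus the vocabulary size is 1, while the number of effective revisits is 2; the page that was loaded but never viewed accounts for the difference.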

The new definition and measurement prompted the following research questions:

  • Using the proposed viewing-based measurement, what is the revisitation rate under the tabbed browsing paradigm? How does it differ from the results using the conventional measurement?
  • How significant are the over-count and under-count errors when the conventional method is used to measure revisitations in tabbed browsers?
  • Has the frequency of users’ repeated page visits changed in recent years?
  • Are there changes in revisitation behaviors with the introduction of tabbed browsers?

To answer these questions, we carried out a one-month study of 20 participants and their browsing behaviors in tabbed browsers.

A STUDY OF REVISITATION BEHAVIOR

After examining the design of Dubroy and Balakrishnan’s study on tabbed browsing behavior [2], we found that both their logger and study procedure can be used for our research on revisitation. We adopted those and changed only the interview questions to target our topic.

Participants and Duration

Twenty participants (7 females, age range 23-26, mean 24.1) from the university community took part in the one-month study. All participants were adequately experienced with the Mozilla Firefox browser on Windows operating systems. The participants were instructed to use the browser in their normal manner, without any bias towards using its tab feature.

Results and Analysis

A total of 235,707 browser events were captured from the 20 participants over one month, among which there were 89,851 page loadings and 127,344 page viewings.

Using the proposed viewing-based measurement, what is the revisitation rate under the tabbed browsing paradigm? How does it differ from the results using the conventional measurement?

revisitation fig.2

The blue bar on the right end of figure 1 shows the overall conventional revisitation rate (39.3%, or 35,342 out of 89,851 loading events) based on page loadings, which was calculated using Tauscher and Greenberg’s method [8]. The red bar beside it shows the overall effective revisitation rate (59.6%, or 75,912 out of 127,344 page viewing events) using our proposed definition and measurement. The bars on the left of the figure show revisitation rates using the two measurements for each participant.

Our calculation shows that the conventional measurement largely underestimates the amount of revisitation activity under tabbed browsing (t(19) = 11.25, p < .001). If we break down all 75,912 effective revisitation events into the two types of revisitation (loading-based revisitation and tab-switching-based revisitation), the former comprises 53.0% (40,221 events) while the latter comprises 47.0% (35,691 events). This shows that tab-switching-based revisitation, which was neglected in previous studies, is about as frequent as loading-based revisitation. It reinforces the point raised by Dubroy and Balakrishnan [2] that tab switching should be considered an important means of revisitation.
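The reported percentages follow directly from the event counts given in this section. The short check below reproduces them; the variable names are ours, and the URL vocabulary size is not reported in the paper but is implied by the conventional definition (loading events minus repeated loads).

```python
# Event counts as reported in this section.
page_loadings = 89_851
conventional_revisits = 35_342
page_viewings = 127_344
effective_revisits = 75_912
loading_based_revisits = 40_221
tab_switching_revisits = 35_691

conventional_rate = conventional_revisits / page_loadings
effective_rate = effective_revisits / page_viewings
loading_share = loading_based_revisits / effective_revisits
switching_share = tab_switching_revisits / effective_revisits

# Implied by the conventional definition, not stated in the paper.
implied_url_vocabulary = page_loadings - conventional_revisits

print(f"{conventional_rate:.1%}")   # 39.3%
print(f"{effective_rate:.1%}")      # 59.6%
print(f"{loading_share:.1%}")       # 53.0%
print(f"{switching_share:.1%}")     # 47.0%
print(implied_url_vocabulary)       # 54509
```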

How significant are the over-count and under-count errors when the conventional method is used to measure revisitations in tabbed browsers?

There are a total of 4,135 over-count error events (11.7% of all conventional revisits). Of these, 3.9% (160) are type 1, 55.2% (2,283) are type 2 and 40.9% (1,692) are type 3.

There are 38,639 under-count error events (50.9% of all effective revisits), in which revisitation activities were done with tab switching alone.
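The error shares quoted above can be reproduced from the reported counts in the same way. The dictionary keys below are labels chosen for this sketch.

```python
# Counts as reported in this section.
over_count = {"type 1": 160, "type 2": 2_283, "type 3": 1_692}
conventional_revisits = 35_342
under_count = 38_639
effective_revisits = 75_912

over_total = sum(over_count.values())
over_share = over_total / conventional_revisits     # share of conventional revisits
under_share = under_count / effective_revisits      # share of effective revisits
type_shares = {name: n / over_total for name, n in over_count.items()}

print(over_total, f"{over_share:.1%}")   # 4135 11.7%
print(f"{under_share:.1%}")              # 50.9%
for name, share in type_shares.items():
    print(name, f"{share:.1%}")          # 3.9%, 55.2%, 40.9%
```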

These results show that the conventional measurement suffers from both over-count and under-count errors. The under-count errors far outnumber the over-count errors, with the overall effect that the conventional measurement largely underestimates revisitations in tabbed browsing.

Has the frequency of users’ repeated page visits changed in recent years?

We found four previous studies that explicitly addressed revisitation rate and present them chronologically alongside our study (figure 2). The study of Dubroy and Balakrishnan [2] is not included since it did not explicitly report a revisitation rate. The first three studies used non-tabbed browsers, while the last two studies used tabbed browsers.

Before the introduction of tabbed browsers, effective web revisitation activities were accurately measured as the loading of previously loaded pages. This was done in studies 1, 2 and 3, all of which have revisitation rates of above 50%.

Did the introduction of tabbed browsers change this rate? Study 4 suggested that there is a change: revisitation rate dropped to 43.7%. However, this may be misleading since the conventional measurement for revisitation was used in tabbed browsing (at least 15 out of the 25 participants in study 4 used tabbed browsers).

Our study shows that the effective revisitation rate has not dropped that dramatically (59.6% is lower than study 3’s 81%, but is comparable with study 1’s 61% and study 2’s 58%) even with the introduction of tabbed browsers.

Are there changes in revisitation behaviors with the introduction of tabbed browsers?

Yes. We observed several phenomena in revisitation behaviors that were not possible in non-tabbed browsers.

1)    Duplicate tabs. Users sometimes had two or more tabs containing the same page in a browser. Our findings show that 3.4% of all page loadings (3,085 out of 89,851) result in such duplicate tabs.

2)    Page wastage and revisiting previously unread pages. In a significant number of page loading events, the pages were loaded but never viewed (7,389 out of 89,851 page loadings, or 8.2%). Of these events, 33.1% (2,443 out of 7,389) were loadings of pages that were previously loaded but not viewed by the user (i.e. type 1 and type 2 over-count errors). Surprisingly, the majority of these loading events (2,283 out of 2,443) are type 2 over-count errors, which means that when previously loaded-but-not-viewed pages were loaded again, they very likely remained unviewed.

These behaviors seem irrational. Why would users duplicate tabs that contain the same content? Why would they load a web page but never view it, and why would they load it again?

The semi-structured interviews we conducted give insights into these behaviors. There are two main reasons for duplicate tabs:

1)    Users intend to compare different parts of the same page; and

2)    Users may lose track of the location of a previously opened page or simply do not bother to look for it when too many tabs are open. When this happens, they just load the page again instead of looking for it in the existing tabs.

There are also two main reasons for page wastage, unread pages and their revisitations:

1)    Users reopen tabs that they accidentally closed before they could view them; and

2)    Users revisit the audio content of a web page whose visual content does not need to be displayed in the focused tab.

DESIGN IMPLICATIONS

The findings of this paper have several implications. First, better tools need to be designed to support easy comparison and to reduce duplicate tabs. For example, when the browser detects that a duplicate tab is about to be created, it can either ask the user to switch to the existing one without opening a new one or provide a split screen of the existing tab to facilitate comparison.
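The duplicate-tab check suggested above could look roughly like the following. This is a hypothetical sketch, not an existing browser API: open tabs are modeled as a dictionary from tab id to URL, pages are matched by exact URL (as the paper identifies pages by URL), and the `allow_duplicate` flag stands in for the user choosing to open a second copy for comparison.

```python
def find_existing_tab(open_tabs, url):
    """Return the id of a tab already showing `url`, or None if there is none."""
    for tab_id, tab_url in open_tabs.items():
        if tab_url == url:
            return tab_id
    return None


def open_url(open_tabs, url, next_tab_id, allow_duplicate=False):
    """Switch to an existing tab for `url` if possible, otherwise open a new one."""
    existing = find_existing_tab(open_tabs, url)
    if existing is not None and not allow_duplicate:
        return ("switch", existing)      # reuse the tab instead of duplicating it
    open_tabs[next_tab_id] = url         # no match, or the user wants a second copy
    return ("open", next_tab_id)


# Hypothetical session with two open tabs.
tabs = {1: "a.com", 2: "b.com"}
first = open_url(tabs, "b.com", next_tab_id=3)    # page is already open in tab 2
second = open_url(tabs, "c.com", next_tab_id=3)   # new page, opens tab 3
third = open_url(tabs, "a.com", next_tab_id=4, allow_duplicate=True)
print(first, second, third)
```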

Second, browsers can be designed to combine loading-based tools and tab-switching-based tools to facilitate browsing and revisitations. Current loading-based revisitation tools include the back button, history and bookmarks, while the only tool for tab-switching-based revisitation is the tabbed interface. A user may open a link from a parent tab in a new tab, thus creating a child tab. With its existing design, the back button in the child tab is unavailable because the history stack in the child tab is empty. However, the user’s mental model of the browsing history continues from the parent tab. With a hybrid design of the back button that incorporates page viewing actions into the interaction, clicking the back button in the child tab would put the parent tab back in focus, making it consistent with the user’s mental model of browsing history.
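One way to picture the hybrid back button is a tab that remembers its parent. The sketch below is an illustration under our own assumptions (the `Tab` class, its fields and `hybrid_back` are hypothetical names): back navigates within the tab while it has earlier pages, and otherwise returns focus to the parent tab.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tab:
    name: str
    parent: Optional["Tab"] = None                      # tab this one was opened from
    history: List[str] = field(default_factory=list)    # last entry is the current page

    def navigate(self, url: str) -> None:
        self.history.append(url)

    def open_in_new_tab(self, url: str, name: str) -> "Tab":
        child = Tab(name=name, parent=self)
        child.navigate(url)
        return child


def hybrid_back(tab: Tab) -> Tab:
    """Return the tab that should be in focus after the back button is pressed."""
    if len(tab.history) > 1:
        tab.history.pop()        # ordinary loading-based back within the tab
        return tab
    if tab.parent is not None:
        return tab.parent        # no earlier page here: put the parent tab in focus
    return tab                   # nothing to go back to


# Hypothetical session: a search page opens a result in a child tab.
parent = Tab("parent")
parent.navigate("search")
child = parent.open_in_new_tab("result", name="child")

focus_after_back = hybrid_back(child)
print(focus_after_back.name)  # parent
```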

Third, as indicated by interview results, pages that consist of mostly audio content seem to be different from pages with mostly visual content since audio content of a page could be perceived without actually viewing that page in the browser. Designers may want to propose browser functions to manage these pages with audio content, such as a separate placeholder to access and manipulate the audio content of a page without showing the tab in its entirety. This can help reduce the need to open multiple tabs, which in turn reduces users’ cognitive load for tab management.

CONCLUSION

We propose a goal-oriented definition and measurement for revisitation under the tabbed browsing paradigm. Our client-side log study shows that the conventional measurement for revisitation largely underestimates revisitation activities in tabbed browsing. Although the overall revisitation rate has remained relatively steady over the years, tabbed browsing has introduced new behaviors. In the future, these need to be taken into account in studies of web page revisitation and when optimizing the browsing and revisitation experience in tabbed browsers.

REFERENCES

  1. Catledge, L.D., Pitkow, J.E. Characterizing browsing strategies in the World-Wide Web. In Proc. WWW 1995, ACM Press (1995), 1065-1073.
  2. Dubroy, P. and Balakrishnan, R. A study of tabbed browsing among Mozilla Firefox users. In Proc. CHI 2010, ACM Press (2010), 673-682.
  3. Huang, J. and White, R. Parallel Browsing Behavior on the Web. In Proc. HT 2010, ACM Press (2010).
  4. Mayer, M. Visualizing Web Sessions: Improving Web Browser History by a Better Understanding of Web Page Revisitation and a New Session- and Task-Based, Visual Web History Approach. PhD Dissertation, University of Hamburg, 28-29.
  5. McKenzie, B. and Cockburn, A. An empirical analysis of web page revisitation. In Proc. 34th Annual Hawaii International Conference on System Sciences (2001).
  6. Morrison, J. B., Pirolli, P., et al. A taxonomic analysis of what World Wide Web activities significantly impact people’s decisions and actions. Ext. Abstracts CHI 2001, ACM Press (2001), 163-164.
  7. Obendorf, H., Weinreich, H., et al. Web page revisitation revisited: implications of a long-term click-stream study of browser usage. In Proc. CHI 2007, ACM Press (2007), 597-606.
  8. Tauscher, L. and Greenberg, S. Revisitation patterns in World Wide Web navigation. In Proc. CHI 1997, ACM Press (1997), 399-406.

Written by Shengdong Zhao

Shen is an Associate Professor in the Computer Science Department, National University of Singapore (NUS). He is the founding director of the NUS-HCI Lab, specializing in research and innovation in the area of human computer interaction.