123 papers published in conference proceedings and journals in 2000 were submitted by IBM Research authors worldwide for consideration as 2000 Computer Science Best Papers. The following three were selected (in alphabetical order):
Graph Structure in the Web
Andrei Broder (Altavista), Ravi Kumar, Farzin Maghoul (Altavista), Prabhakar Raghavan (Verity), Sridhar Rajagopalan, Ramie Stata (Compaq), Andrew Tomkins, Janet Wiener (Compaq)
Published in 9th International World Wide Web Conference (WWW9)
This paper describes the largest study of the link structure of the Web to date culminating with the discovery of an unexpected global structure. The study is the first analysis of global Web structure, and is an order of magnitude larger than any previous connectivity-oriented study of the Web. Earlier studies of the Web had shown it to be a loose collection of pages all reachable from one another by clicking on a relatively small number of hyperlinks. This study shows that the chance of a path between two arbitrary pages is only 1 in 4. It shows that the Web is more accurately represented as a "bowtie"-shape picture, in which a surfer can travel from any page on the left side of the bowtie to all the pages in the middle, and from any page in the middle to all the pages on the right side, but not vice versa. The three regions (left, middle, and right) are surprisingly of about the same size. A fourth region, again of about the same size, contains "islands" of pages that are not known to the Web at large, and that do not link strongly into the better-known regions. The study additionally shows some important properties regarding the distribution of the in degree and out degree of Web pages, namely that they both obey power laws.
The results of the study have strong implications for anyone wishing to randomly sample a Web page as the page found will be biased by the starting point of the sampling process and the region in which the page is located. In addition, the discovered structure makes easier certain linear-algebraic computations on the Web graph used by Google and to be used by IBM Research's Web Fountain. This paper was selected as the best paper of WWW9. It has received significant coverage in the popular press, including articles in the New York Times, MSNBC, Nature, PC World and The Economist. The topic was the subject of numerous invited talks in workshops and conferences throughout the year.
Shopbots and Pricebots
Amy R. Greenwald (Brown) and Jeffrey O. Kephart
Published in Agent Mediated Electronic Commerce II
"Shopbots" are software agents that automatically gather and collate information from multiple online vendors about price and quality of consumer goods and services. This paper posits the rise of "pricebots", automated agents that employ price-setting algorithms in an attempt to maximize profits. It studies a simple economic model which helps characterize the likely impacts of a proliferation of shopbots and pricebots, in which some buyers select sellers at random and some use shopbots, and it studies several pricing algorithms that could be used by pricebots. It turns out that only a rather small fraction of the buying population needs to use shopbots before it becomes profitable for the sellers to use pricebots in return, suggesting that pricebots will be an important part of the future economic landscape. The results of simulation studies are presented which demonstrate the ubiquity of price wars, but that incorporating learning into the pricebot strategy greatly diminishes these price wars and seems to pave the way for practical pricebot algorithms. It is important for pricebots to know the prices of their competitors in order to set their own prices. The obvious way for them to do this is to use a shopbot. So, if shopbots and pricebots proliferate, sellers will be constantly bombarded with requests for price, and may cope by charging for this information to avoid overload. Both the idea of pricebots and this resulting overload are novel predictions that have been picked up subsequently by the research community.
This is likely to become a seminal paper. It coined the term and, more importantly, the idea of "pricebots". A number of research papers, citing this work, have investigated other aspects of pricebots, and the idea seems to be gaining in significance and is likely to be the basis for a great deal of work in automated price-setting in the future. Articles in the popular media, including the New York Times, Time Magazine and Wired, have mentioned pricebots and this work.
Transparently Obtaining Scalability for Java Applications on a Cluster
Yariv Aridor, Michael Factor, Avi Teperman, Tamar Eilam and Assaf Schuster (Technion)
Published in the Journal of Parallel and Distributed Computing
This paper describes cJVM, a clustered Java Virtual Machine developed by the authors, on which Java programs can be run without modifications. The significance of cJVM is that it maintains a single system image and achieves scalability of Java applications without using special hardware. The cJVM architecture relies on master and proxy nodes. Single system image is achieved by implementing proxy objects with the same internal representation as their masters and having all the proxy implementations coexist within a single class object. Scalability is achieved by three types of optimization techniques, cache-based optimizations, method invocation optimizations and object placement optimizations, which aim at enhancing locality, hence minimizing the amount of communication. The optimizations leverage the semantics of Java by tailoring the optimizations to the expected behavior of each Java data type and make use of data usage patterns, which are extracted from the Java bytecodes during class loading. The results of detailed experimental studies are presented which show that substantial performance increases can be obtained by combining the various optimization techniques. In particular, it is shown that caching techniques, employed at the level of classes, objects and individual object fields yield the largest optimizations when combined with invocation optimizations. Results obtained from running pBOB, a variant on the TPC-C benchmark, show 80% efficiency on a 4-node cluster.
An earlier much less complete conference version of this paper, cJVM: a Single System Image of a JVM on a Cluster by Aridor, Factor and Teperman, was selected as the best paper at the 28th IEEE International Conference on Parallel Processing in 1999.