Seminars

IBM Research welcomes members of the research community to our seminars. To ensure compliance with IBM security guidelines, we request you to contact the seminar host in advance. When you arrive at the Research lab, please provide the host's name to the receptionist.


Past seminars

The Surprising Versatility of Trace Tree Compilation: How We Set Out To Build A Better Just-In-Time Compiler And Suddenly Found Ourselves At The Center Of The Mozilla-Google Browser War
Prof. Michael Franz    On:  4-Jun-2009 10:00 AM - 11:30 AM
UC Irvine   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Calin Cascaval

Abstract:
Except for the fact that they operate at run-time and use smaller compilation units, just-in-time compilers have otherwise been quite similar to traditional batch compilers: they all first construct a model (graph) of a program's control flow, which is then traversed while generating code. My research group at UCI has been investigating an alternative technique in which the compiler never even attempts to build a complete control flow graph. Instead, our technique uses a novel (and patent pending) intermediate representation, the Trace Tree, which is constructed lazily on-demand while the program is simultaneously executed, incrementally compiled, and optimized. Our compilation technique is surprisingly competitive at much lower implementation complexity. Our academic prototype Java compiler has been able to attain a performance similar to commercial production compilers while using only about 1/7th of the memory footprint, 1/30th of the compile time, and 1/100th of the actual compiler size. Early experiments showed an even greater benefit for dynamically-typed languages such as JavaScript. Our academic experiments are now being validated in one of the largest "real world" trials imaginable. Mozilla recently selected our Trace Tree compiler as the new JavaScript engine for Firefox 3.1, with an expected 400 Million installations to come. Even in the current beta version, the Trace Tree compiler's JavaScript performance is a surprising 700% higher than that of FireFox's 3.0, and a staggering 15 times (1500%) faster than that of Internet Explorer. On the other hand, Google recently launched their Chrome browser, which uses a superbly well engineered but "traditional" control-flow based JavaScript compiler. As both browsers mature, the performance competition between them is likely to settle the question whether or not one should base future compilers on Trace Trees.

Speaker biography:
Michael Franz is a Professor of Computer Science in the Donald Bren School of Information and Computer Science at the University of California, Irvine (UCI) and has an additional courtesy appointment as a Professor of Electrical Engineering and Computer Science in the Henry Samueli School of Engineering. Franz received a Dr. sc. techn. degree in Computer Science (advised by Niklaus Wirth) and a Dipl. Informatik-Ing. degree, both from the Swiss Federal Institute of Technology, ETH Zurich.

Interprocedural Extended Static Checking
Domagoj Babic   On:  4-May-2009 10:00 AM - 11:30 AM
University of British Columbia   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Vijay Saraswat

Abstract:
Automatic software verification and bug finding have been a long-held goal in software engineering. Many techniques exist, trading off varying levels of automation, coverage, precision, and scalability. Extended Stating Checking (ESC), a combination of static checking and decision procedures, has emerged as a powerful technique for improving software reliability. The major limitation of the ESC paradigm is that it requires programmers to write tedious pre- and post-conditions for every function. One way around this restriction is to perform a whole-program interprocedural analysis. Unfortunately, previously known scalable interprocedural analyses are not precise enough for verification purposes. This talk presents a practical new technique for scalable interprocedural path-sensitive analysis, named structural abstraction, prototyped in the Calysto extended static checker. While interprocedural analysis eliminates the need for manual code annotation with pre- and post-conditions, it imposes a heavy burden on decision procedures because it requires checking the validity of large, complex logical formulas (i.e., verification conditions). The second topic of this talk is a novel approach to solving such large verification conditions. The presented approach gave a 100-fold improvement in performance over the best previously known approaches, significantly improving the usability of modern ESC tools. Thanks to these advancements, Calysto discovered dozens of bugs, completely automatically, in hundreds of thousands of lines of production, open-source applications, with a low rate of false error reports.

Speaker biography:
Domagoj Babic received his Dipl.Ing. in Electrical Engineering and M.Sc. in Computer Science from the Zagreb University (Faculty of Electrical Engineering and Computing) in 2001 and 2003. He received his Ph.D. in Computer Science in 2008 from the University of British Columbia. Domagoj was a recipient of Microsoft Research Graduate Fellowship in 2005. Domagoj's research focuses on software verification, tools for software engineering, and decision procedures. Lately, he is spending most of his free time researching verification and design of concurrent programs.

Measuring Perceptible Performance with Listener Latency Profiling
Prof. Matthias Hauswirth   On:  29-Apr-2009 10:00 AM - 11:30 PM
University of Lugano, Switzerland   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Peter Sweeney

Abstract:
Characterizing the performance of modern interactive applications is difficult for two main reasons. First, measuring the time from start to end of the application is pointless because that time is dominated by the user's think time, not by the execution time of the application. So, how should one characterize application performance? Second, modern interactive applications are often built on a framework, such as IBM's Eclipse or Sun's NetBeans, and consist of hundreds of plugins. Different users running these applications will install and use different sets of plugins. So, which plugins should one include when assessing the performance of an application? In this talk we address these two issues. We present an approach, listener latency profiling, to characterize the perceptible performance of interactive applications, and we describe an implementation, LiLa, which is lightweight enough so it can be deployed in production settings to gather the performance of the configurations installed by the actual users. While traditional profilers capture a profile that summarizes the total time spent in each method, a listener latency profiler identifies individual user-initiated operations with perceptible latency, and it does so on the level of listeners (also called observers) that form the basis of object-oriented interactive applications. LiLa, our lightweight listener latency profiler, produces accurate profiles while incurring minimal runtime overhead. Thanks to its low overhead it can be deployed in production settings to gather latency profiles in the field and report them back to the developers. We describe the different aspects of our approach, evaluate how they impact accuracy, and present case studies using our profiler on several complex object-oriented interactive applications.

Speaker biography:
Matthias Hauswirth is an assistant professor at the University of Lugano in Switzerland. Matthias is interested in approaches for measuring, understanding, and improving the performance of modern, complex systems. He is particularly intrigued by the intricate interactions between the different system layers, from the hardware, over the operating system, virtual machine, application frameworks, all the way to the applications and their users.

The Lively Kernel, and Others I Have Known
Dan Ingalls   On:  28-Apr-2009 02:00 PM - 03:00 PM
Distinguished Engineer   At:  Watson Research Center (Hawthorne), Room GN-F15
Sun Microsystems   Host:  Brent Hailpern

Abstract:
The first half of this talk will survey a number of the projects Dan has masterminded, including language design, virtual machine design, graphics and user interfaces. Each of these has involved a certain amount of thinking outside the box, fitting systems into a box, and in some cases programming with boxes, and all have turned out to be a lot of fun because of their aliveness. In the second part, Dan will demonstrate his latest work, the Lively Kernel, a rethinking of the Web as a universe of active objects. The Lively Kernel starts with JavaScript and browser graphics and builds a complete, reflective, self-supporting programming environment from scratch. This is a live web application with no download or installation required -- it is simply a web page. That's what Dan likes to do, but the idea is that if you can do that, you can build lots of other useful Internet applications in the same manner.

Speaker biography:
Dan Ingalls, a Distinguished Engineer at Sun Microsystems, is best known for his work on the Smalltalk programming environment, which revolutionized computing for both users and developers through human- computer interaction, the object-oriented paradigm, and development in integrated environments. He also invented pop-up menus and advanced the state of graphics with BitBlt and its variations that support rotation and antialiasing. For his noteworthy contributions, he has received the ACM Grace Murray Hopper Award and the ACM Software System Award. His most recent work takes these ideas to the World Wide Web in the Lively Kernel Project.

Finding and Exploiting Parallelism in Irregular Applications
Dr. Milind Kulkarni    On:  20-Apr-2009 10:00 AM - 11:30 AM
UT Austin   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  David Grove

Abstract:
With the advent of multicore processors, the challenge of increasing program performance has become one of parallelization. While much research over the past three decades has focused on parallelizing dense-array and matrix programs, far less attention has been paid to irregular programs, which operate over pointer-based data structures such as trees and graphs, where traditional approaches have largely failed to uncover significant amounts of parallelism. In this talk, I will show that irregular programs do, indeed, have large amounts of parallelism, and that this type of parallelism, which I call "amorphous data-parallelism," can be exploited efficiently and easily. I will describe the Galois system, which uses high-level abstractions to expose amorphous data-parallelism in sequential irregular programs, and uses semantic properties of these programs to perform automatic parallelization. I will then present a number of optimizations which allow programmers to exploit locality in pointer-based data structures to improve scalability and reduce overheads. I will show that the Galois approach is able to extract significant amounts of parallelism from amorphous data-parallel programs with little programmer effort.

Speaker biography:
Milind Kulkarni is a Postdoctoral Research Associate at the Institute for Computational Engineering and Sciences at the University of Texas at Austin. His research focuses on developing language features, compiler techniques and runtime systems that can be used to easily and efficiently parallelize applications for multicore processors. He received his Ph.D. in Computer Science from Cornell University in 2008, where he worked with Keshav Pingali.

Dynamically Fighting Bugs: Prevention, Detection, and Elimination
Shay Artzi   On:  16-Apr-2009 10:00 AM - 11:30 AM
   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Frank Tip

Abstract:
In this talk, I will present three techniques, each offering a solution to one of the tasks mentioned in the title (prevention, detection, and elimination of bugs), and each based on dynamic analysis or a combination of static and dynamic analysis. In addition, I will present a combined static and dynamic analysis for classifying the mutability of references. This mutability classification increases the effectiveness of the prevention technique and the performance of the elimination technique. I will spend the bulk of my time on the detection technique. This technique leverages dynamic testing and explicit state model-checking to automatically expose and localize crashes and malformed dynamically-generated Web pages in dynamic Web applications. These common problems seriously impact the usability of Web applications and are not addressed adequately by current tools for Web-page validation, which cannot handle the dynamically-generated pages that are ubiquitous on today's Internet. My technique generates tests automatically, runs the tests capturing logical constraints on inputs, and minimizes the conditions on the inputs to failing tests, so that the resulting bug reports are small and useful. The detection technique was implemented in Apollo, a tool that automates my technique. Apollo found 304 failures in six open source web applications.

Interprocedural Analysis and the Verification of Concurrent Programs
Akash Lal    On:  13-Apr-2009 10:00 AM - 11:30 AM
University of Wisconsin-Madison   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Olivier Tardieu

Abstract:
In the modern world, not only is software getting larger and more complex, it is also becoming pervasive in our daily lives. On the one hand, the advent of multi-core processors is pushing software towards becoming more concurrent. On the other hand, software is everywhere, inside nuclear reactors, space shuttles, cars, traffic signals, cell phones, etc. In meeting this demand for software, reliability will be a key factor, for which program verification plays an important role. My research addresses both of these challenges. First, I designed new techniques for the verification of concurrent programs. My thesis explores the problem of context-bounded verification, where only those concurrent behaviors that have a bounded number of context switches between threads are considered. We show that context-bounded verification provides a viable alternative to full verification because it can be very efficient and most bugs can be found in a few context switches. Second, along with colleagues at Wisconsin, I designed a model checker for machine code. In many scenarios, including those with malicious software, source code or debugging information may not be available. In those cases, it is necessary to have a tool that can understand machine code. My thesis shows how we went about systematically building an infrastructure for machine code for use inside a verification tool called McDash. We give evidence that McDash can understand the intricacies of low-level code, while competing well with source-level tools.

Abstraction Mechanisms for Modular Software Construction
Shan Shan Huang   On:  6-Apr-2009 09:30 AM - 11:00 AM
Georgia Tech   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Erik Altman

Abstract:
The complexity of modern software can be better managed through modular software construction, where software is built through the composition of modular, reusable pieces of code. Language abstraction mechanisms are key in writing reuseable code: procedural abstraction, developed in the 1960's, allows code to be reused over sets of concrete values; type abstraction, developed in the 1970's, allows code to be reused over sets of concrete types. In this talk, I introduce a possible next step in the development of abstraction mechanisms: structural abstraction. Structural abstraction supports an even higher level of reuse, by allowing code to abstract over the concrete structure of other code. I introduce structural abstraction through my techniques, ""static type conditions"" and ""morphing""; I demonstrate the benefits of these techniques through large case studies; and I show that they are the first to offer this higher level of reusability, while maintaining the high level safety guarantees of separate type-checking. I also briefly touch upon my work in raising the abstraction level used in programming hardware (e.g., FPGAs), from gates, wires, and registers to high-level Object-Orientation, with features such as data encapsulation, dynamic dispatch, and automatic memory management.

Speaker biography:
Shan Shan Huang is a Ph.D. candidate at the Georgia Institute of Technology, advised by Prof. Yannis Smaragdakis. Her research focuses on increasing programmer productivity by raising the level of abstraction offered by programming languages. Shan Shan is the recipient of the National Science Foundation Graduate Fellowship and the Intel Corporation Ph.D. Fellowship. She has conducted language design research as a intern at IBM T.J. Watson Research Center and at Sandia National Laboratory. She received her Bachelors of Science in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology.

Supporting Liquid Metal with a Flexible FPGA Environment
Prof. Krste Asanovic   On:  6-Apr-2009 01:00 PM - 02:30 PM
UC Berkeley   At:  Watson Research Center (Hawthorne), Room GNF15
   Host:  Rodric Rabbah

Abstract:
In this talk, I will describe work at UC Berkeley as part of the Liquid Metal OCR (Open Collaborative Research) project. The Liquid Metal project seeks to map code written in the LiME language to execute efficiently on a system with both FPGAs and traditional processors. The LiMe language is essentially a superset of Java, while FPGAs are traditionally programmed at the logic gate level. Bridging this large ""semantic gap"" requires a combination of extensive compiler transformations and an effective runtime system bridging both conventional processors and attached FPGAs. UC Berkeley has developed several generations of software and gateware layers that make it easier to use FPGAs to accelerate a range of computing tasks, including architectural simulation in the RAMP project. I will describe how we are adapting and extending these layers for use with LiMe, and also highlight several related developments at UC Berkeley in the areas of high-performance reconfigurable computing.

Speaker biography:
Krste Asanovic is an Associate Professor in the Computer Science Division of the EECS Department at the University of California at Berkeley. He received a B.A. Electrical Engineering and Information Sciences from Cambridge University in 1987 and a PhD in Computer Science from UC Berkeley in 1998. His main research areas are computer architecture, VLSI design, parallel programming, and operating system design. He was one of the original instigators of the multi-university RAMP project, which aims to make it easier to use FPGAs for architecture research. He helped found the Par Lab at UC Berkeley, which is tackling the design and programming of manycore architectures. He also leads the Architecture Group at the International Computer Science Institute, and holds a joint appointment with the Lawrence Berkeley National Laboratory. Previously at MIT, he led the SCALE group, investigating advanced architectures for energy-efficient high-performance computing.

Social Software Engineering: Individual Communication, Coordination and Congruence
Patrick Wagstrom    On:  30-Mar-2009 10:00 AM - 11:30 AM
CMU   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Clay Williams

Abstract:
Often utilizing formal definitions, new processes, and alternative programming paradigms designed to increase success, software engineering is not typically portrayed as collaborative social process. Yet, a growing number of researchers and practitioners recognize that social interactions amongst developers, firms and foundations are a key element of building and maintaining high quality software in an open collaborative ecosystem. This talk examines one method of understanding how social interactions, task assignments, and dependencies between tasks interact at a team and individual level and their relation to overall team productivity through the lens of socio-technical congruence (STC). I explain how this organizational level view can be modified to provide increased relevance for individual team members. These modifications have the benefit that they allow the team member or manager to differentiate between general communication and communication that addresses dependencies within the project. I further address the practical implications of building tools around STC by evaluating the sensitivity of such networks to noise and architectural changes.

Double-header talk: (1) Automatically Finding Patches Using Genetic Programming, (2) The Road Not Taken: Estimating Path Execution Frequency Statically
Wes Weimer and Ray Buse   On:  12-Mar-2009 10:00 AM - 11:30 AM
University of Virginia   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Mark Wegman

Abstract:
Talk 1 (Wes Weimer, ~20 minutes): Automatically Finding Patches Using Genetic Programming The automatic repair of programs has been a longstanding goal in software engineering, yet debugging remains a largely manual process. We introduce a fully automated method for locating and repairing bugs in software. The approach works on off-the-shelf legacy applications and does not require formal specifications, program annotations or special coding practices. Once a program fault is discovered, an extended form of genetic programming is used to evolve program variants until one is found that both retains required functionality and also avoids the defect in question. Standard test cases are used to exercise the fault and to encode program requirements. After a successful repair has been discovered, it is minimized using structural differencing algorithms and delta debugging. We describe the proposed method and report results from an initial set of experiments demonstrating that it can successfully repair a dozen different C programs totaling over 100,000 lines in under 200 seconds, on average. Talk 2 (Ray Buse, ~20 minutes): The Road Not Taken: Estimating Path Execution Frequency Statically A variety of compilers, static analyses, and testing frameworks rely heavily on path frequency information. Uses for such information range from optimizing transformations to bug finding. Path frequencies are typically obtained through profiling, but that approach is severely restricted: it requires running programs in an indicative environment, and on indicative test inputs. We present a descriptive statistical model of path frequency based on features that can be readily obtained from a program's source code. Our model is over 90% accurate with respect to several benchmarks, and is sufficient for selecting the 5% of paths that account for over half of a program's total runtime. We demonstrate our technique's robustness by measuring its performance as a static branch predictor, finding it to be more accurate than previous approaches on average. Finally, our qualitative analysis of the model provides insight into which source-level features indicate "hot paths."

Speaker biography:
Wes Weimer research interests: static and dynamic analyses, automatic program repair, (formal) specification mining, defect reporting and triage, program-level security, model checking, regression testing for web applications. Ray Buse research interests: large-scale program analyses, imprecise static analyses, software readability, automatic documentation inference (e.g., javadoc), statistical models, "tools that are useful for people in the trenches" (e.g., architect's workbench).

Compositional Shape Analysis by means of Bi-Abduction
Dr. Dino Distefano   On:  9-Mar-2009 10:00 AM - 11:30 AM
Queen Mary, University of London   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Eran Yahav

Abstract:
This talk describes a compositional shape analysis, where each procedure is analyzed independently of its callers. The analysis uses an abstract domain based on a restricted fragment of separation logic, and assigns a collection of Hoare triples to each procedure; the triples provide an over-approximation of data structure usage. Compositionality brings its usual benefits --increased potential to scale, ability to deal with unknown calling contexts, graceful way to deal with imprecision -- to shape analysis, for the first time. The analysis rests on a generalized form of abduction (inference of explanatory hypotheses) which we call bi-abduction. Bi-abduction displays abduction as a kind of inverse to the frame problem: it jointly infers anti-frames (missing portions of state) and frames (portions of state not touched by an operation), and is the basis of a new interprocedural analysis algorithm. We have implemented our analysis algorithm and we report case studies on smaller programs to evaluate the quality of discovered specifications, and larger programs (e.g., an entire Linux distribution) to test scalability and graceful imprecision.

Protecting Java from Native Code
Prof. Gang Tan   On:  12-Feb-2009 10:00 AM - 11:15 AM
Lehigh University   At:  Watson Research Center (Hawthorne), Room 1NF28
   Host:  Martin Hirzel

Abstract:
Most software systems are multilingual; that is, they consist of components developed in multiple programming languages. For example, Sun's Java Development Kit 1.6 (JDK) contains around 2 million lines of Java code as well as 800,000 lines of C/C++ code for implementing native methods. It is well known that native methods in Java defeat Java's guarantees of safety and security and are constant sources of software bugs and vulnerabilities. This was confirmed by our recent empirical security study, which identified O(100) bugs in the native code of JDK 1.6. In this talk, we present language-based techniques for ensuring safety and security of Java-C interoperation. The first system, SafeJNI, employs static and dynamic checks to enforce security policies on C code (such as protecting the integrity of the JVM heap). The second system, ILEA, enables existing Java analyzers to understand foreign C code, by automatically extracting an approximate Java specification from C code. Finally, we discuss some ongoing work, including binary rewriting for enforcing security policies on native code.

Speaker biography:
Gang Tan is an assistant professor of Computer Science and Engineering at Lehigh University. He received his B.E. degree in Computer Science from Tsinghua University in 1999, and his Ph.D. degree from Princeton University in 2005. His research interests are in the areas of programming languages and computer security.

Parrot VM: dynamic features for modern virtual machines
Allison Randal   On:  5-Feb-2009 10:00 AM - 11:30 AM
O'Reilly   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Michael Hind

Abstract:
Dynamic languages are a hot area in commercial and academic research today. Early R&D in modern dynamic languages (such as Perl, Ruby, Python, and PHP) was often marked by a conservative approach, taking dynamic features in tightly controlled doses to make them fit existing models developed for static languages. That attitude has begun to change in recent years, and the major commercial virtual machines are now seeking to support dynamic features. Parrot is unique in the field of virtual machines targeting modern dynamic languages. It incorporates an object-oriented assembly language, is register-based rather than stack-based, and employs continuations as the core means of flow control. It hosts a powerful suite of compiler tools tailored to dynamic languages and a next generation regular expression engine. This talk outlines the overall architecture of Parrot with emphasis on the features that set it apart.

Speaker biography:
Allison Randal is chief architect and lead developer of the open source project Parrot. In over 25 years as a programmer, she has developed everything from games to linguistic analysis tools, e-commerce websites, shipping fulfillment, compilers, and database replication systems, worked as a language designer, project manager, conference organizer, editor, and consultant, been president of an open source software foundation, written two books, and founded a tech publishing company.

Automated Fault Localization: Current Research and Future Directions
Prof. Mary Jean Harrold   On:  15-Jan-2009 10:00 AM - 11:30 AM
Georgia Tech   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Frank Tip

Abstract:
The high cost of locating faults in software motivates the development of techniques that assist in fault localization--the task of finding the places in the code that must be changed to fix the faults. Fault-localization techniques attempt to automate this task by modeling the software using runtime information, and using these models to compute the suspiciousness of parts of the software. The effectiveness of these techniques depends on the type of information gathered, the way in which that information is used to compute the suspiciousness, and the composition of the test suite that is used to produce the runtime information. We have developed techniques that gather information at runtime, such as simple coverage, and techniques that gather more complex information, such as dependencies and state. Our techniques then use this runtime information to build statistical models on which we compute suspiciousness. We have shown that our techniques are useful for locating both single and multiple faults in the software. We have studied empirically the effectiveness of these different types of runtime information on the fault localization, and shown that our techniques outperform other existing techniques. We have also investigated the way in which the composition of the test suite affects the effectiveness of the fault localization, and shown that the test-suite composition is an important factor in the fault localization.
In this talk, I will describe our fault-localization techniques and discuss the results of the empirical studies that demonstrate the effectiveness of the techniques relative to other techniques, the impact of using different types of coverage, and the implications of the test-suite composition. I will also discuss remaining problems in the area of fault localization and discuss next steps in assisting in debugging, such as fault comprehension, fault diagnosis, and fault repair.

Speaker biography:
Mary Jean Harrold is the ADVANCE Professor of Computing and the Associate Dean for Faculty Affairs in the College of Computing at Georgia Tech. She performs research in analysis and testing of large evolving software, in fault localization and failure identification using statistical analysis, machine learning, and visualization, and in monitoring deployed software to improve quality. She has received funding for her research from government agencies, such as NSF and NASA, and industries, such as Boeing Commercial Airplanes, Tata Consultancy Services, IBM, and Microsoft. She served on the editorial boards of ACM TOPLAS and TOSEM and JSTVR, served as program chair for the ACM ISSTA 2000, program co-chair of the ACM/IEEE ICSE 2001, and general chair for ACM SIGSOFT FSE 2008. She also serves as a member of the Board of Directors for the Computing Research Association (CRA). Professor Harrold actively works to increase the participation of women in computing. She is a member, and past co-chair, of CRA-W and a member of the Leadership Team of the National Center for Women and Information Technology (NCWIT). Professor Harrold received an NSF NYI Award and was named an ACM Fellow. She received the Ph.D. in computer science from the University of Pittsburgh.

A tag-based approach for the design and composition of information processing applications
Anand Ranganathan   On:  12-Jan-2009 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
This talk is based on a paper in the OOPSLA'08 Onward! track by Eric Bouillet, Mark Feblowitz, Zhen Liu, Anand Ranganathan, and Anton Riabov.
  In the realm of component-based software systems, pursuers of the holy grail of automated application composition face many significant challenges. In this paper we argue that, while the general problem of automated composition in response to high-level goal statements is indeed very difficult to solve, we can realize composition in a restricted context, supporting varying degrees of manual to automated assembly for specific types of applications. We propose a novel paradigm for composition in flow-based information processing systems, where application design and component development are facilitated by the pervasive use of faceted, tag-based descriptions of processing goals, of component capabilities, and of structural patterns of families of application. The facets and tags represent different dimensions of both data and processing, where each facet is modeled as a finite set of tags that are defined in a controlled folksonomy. All data flowing through the system, as well as the functional capabilities of components are described using tags. A customized AI planner is used to automatically build an application, in the form of a flow of components, given a high-level goal specification in the form of a set of tags. End-users use an automatically populated faceted search and navigation mechanism to construct these high-level goals. We also propose a novel software engineering methodology to design and develop a set of reusable, well-described components that can be assembled into a variety of applications. With examples from a case study in the Financial Services domain, we demonstrate that composition using a faceted, tag-based application design is not only possible, but also extremely useful in helping end-users create situational applications from a wide variety of available components.


Programming in Prose
Prof. Jerry L. Potter   On:  12-Jan-2009 01:00 PM - 02:15 PM
Colorado State University   At:  Watson Research Center (Yorktown), Room 20-001
   Host:  Craig Stunkel

Abstract:
As a major component of the CS community, IT deserves its own computing paradigm. Tasks such as spreadsheets, browsers, databases and data mining are very time consuming and require large volumes of data, fast I/O and benefit greatly from multi-core massive data parallelism, human direction and interaction. Yet the von Neumann/Fortran paradigm with its central processing unit bottleneck, which was developed in the 1950s for scientific and engineering applications with significantly different needs, still dominates. The prose model of computing is easy to use and understand. It supports user interaction with simple tabular data structures and natural language like commands. The model supports the use of data parallelism and can easily handle inputting, processing and outputting large volumes of data in real time without incurring intractable multi-processor scheduling and communication overheads. The pseudo natural language syntax, based on the use of multi-word identifiers, verbs and prepositions, has several advantages. First, the code is easy to read and is self documenting, making it easier to debug and therefore more efficient and reliable. Second, the commands can be verbalized so that Plain English can be used where keyboards and mice can not be used or are not otherwise needed, i.e. PDAs, cell phones, table PCs, etc. Third, programs do not need to be compiled off line so that programming is dynamic. That is, the user can generate Plain English commands as needed, based on the data being retrieved and processed. New sequences of commands can be saved and used to generate new commands, allowing the users to dynamically generate and update their own applications.

High-Level Low-Level Programming and Immix Garbage Collection
Prof. Steve Blackburn   On:  11-Dec-2008 01:00 PM - 02:30 PM
Australian National University   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  David Grove

Abstract:
I will present two short talks. The first explores high level languages for systems programming (VEE'09) and the second describes the Immix garbage collector (PLDI'08). The power of high level languages lies in their abstraction over hardware and software complexity, leading to greater security, better reliability and lower development costs. However, opaque abstractions are often show-stoppers for systems programmers, forcing them to either break the abstraction, or more often, simply give up and use a different language. My second short talk addresses the challenge of opening up a high-level language to allow practical low-level programming without forsaking integrity or performance. The contribution of this work is three-fold: 1) we draw together common threads in a diverse and piece-meal literature, 2) we identify a framework for extending high-level languages for low-level programming, and 3) we illustrate the power of this approach through two concrete case studies. The simple framework we describe allows a broad family of low-level extensions to high-level languages via just three core ideas: extending semantics via intrinsic methods, extending types via unboxing and architectural-width primitives, and controlling semantics via scoped semantic regimes. We develop these ideas through the context of a rich literature and substantial practical experience. We show that they provide the power necessary to implement substantial artifacts such as a high-performance virtual machine, while preserving the software engineering benefits of the host language. This is part of Daniel Frampton's PhD at ANU, and joint work with Dave Grove, Perry Cheng, Robin Garner, Eliot Moss and Sergey Salishev. A garbage collector can directly determine program performance by making a classic space-time tradeoff that seeks to provide space efficiency, fast reclamation, and mutator performance. The three canonical tracing garbage collectors: semi-space, mark-sweep, and mark-compact each sacrifice one objective. My first short talk describes a collector family, called Mark-Region, and introduces opportunistic defragmentation, which mixes copying and marking in a single pass. Combining both, we implement Immix, a novel high performance garbage collector that achieves all three performance objectives. We show that immix outperforms existing canonical algorithms, improving total application performance by 7 to 25% on average across 20 benchmarks. As the mature space in a generational collector, immix matches or beats a highly tuned generational collector, e.g. it improves jbb by 5%. These innovations and the identification of a new family of collectors open new opportunities for garbage collector design. This is joint work with Kathryn McKinley

Towards a Science of Parallel Programming: From Backus to Codd
Prof. Keshav Pingali   On:  11-Dec-2008 01:00 PM - 02:30 PM
UT Austin   At:  Watson Research Center (Yorktown), Room 20-001
   Host:  Calin Cascaval

Abstract:
When parallel programming started in the 70's and 80's, it was mostly art: languages such as functional and logic programming languages were designed and appreciated mainly for their elegance and beauty. More recently, parallel programming has become engineering: conventional languages like FORTRAN and C++ have been extended with constructs such as OpenMP, and we now spend our time benchmarking and tweaking large programs no one understands to obtain performance improvements of 5-10%. In spite of all this activity, we have few insights into how to write parallel programs to exploit the performance potential of multicore processors. To break this impasse, we need a science of parallel programming. In this talk, I will argue that this requires replacing our current world view, which we owe to an IBM Turing award winner, John Backus, with that of another IBM Turing award winner, Ted Codd. I will then introduce a concept called "amorphous data-parallelism" that provides a simple, unified picture of parallelism in a host of diverse applications ranging from mesh generation/refinement/partitioning to SAT solvers, maxflow algorithms, stencil computations and event-driven simulation. Finally, I will present a natural classification of these kinds of algorithms that provides insight into the structure of parallelism and locality in these algorithms, and into appropriate language and systems support for exploiting this parallelism.

Speaker biography:
Keshav Pingali is the W.A."Tex" Moncrief Chair of Computing in the Department of Computer Sciences at the University of Texas, Austin.

Practical Checking of Heap Assertions
Bard Bloom   On:  9-Dec-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room GN-K35
   

Abstract:
Unrestricted use of heap pointers complicates understanding software systems. Incidental and accidental pointer aliasing results in unexpected side effects of seemingly unrelated operations, and are a major source of system failures. It is difficult or impossible to test and debug such failures with existing tools, especially when programs are concurrent or the assertions may inspect the entire heap. In this paper, we present a practical solution that enables the programmer to check ownership, sharing, reachability and other heap properties during program runtime. Technically, we allow the programmer to add expressive heap assertions into her code. Our approach uses a specialized virtual machine that efficiently evaluates the assertions using the components of a parallel garbage collector. We have implemented our approach on top of a production virtual machine. In this paper we demonstrate its usefulness by describing numerous real-world usage scenarios in which we found expressive heap assertions to be valuable.

Dynamic Program Understanding
Prof. Steven P. Reiss    On:  5-Dec-2008 10:00 AM - 11:30 AM
Brown University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
We are interested in understanding the dynamic behavior of large-scale, long-running software systems. This talk will describe our current approaches to dynamic program understanding. Performance analysis, in the broadest sense, is a central issue in understanding the dynamic behavior of a system. Understanding the performance of server-based software that runs continually for long periods of time under widely varying loads presents a number of challenges. To address these issues we have developed a methodology consisting of a monitoring framework along with a set of specialized monitoring agents called proflets. The system gathers performance data with a guaranteed maximum overhead that is settable by the user by doing appropriate sampling and scheduling of the proflets. This makes the system suitable for use on production systems. The data is displayed dynamically through either a graphical or a web-based interface. Many server applications are event-based, with the bulk of the server execution listening for events and then processing them. Performance analysis of such applications should concentrate on event processing rather than on traditional performance measures. As part of our performance analysis tool we have developed a proflet that finds and then monitors event handlers. We are currently in the process of extending this to actually track the flow of events through the system. To fully understand the dynamics of a complex application, one needs to be able to ask and answer specific questions about the dynamics in terms of the application. Here we have developed a visualization system that lets the programmer define a application-specific dynamic visualization in terms of program events and an underlying data model and then view the result dynamically and historically. Our eventual goal is to combine these efforts into a comprehensive system that provides detailed dynamic analysis of just the parts of an application that are of interest to the programmer and does so in terms of the application and the expressed interests. We conclude by describing how this approach might work.

QVM: An Efficient Runtime for Detecting Defects in Deployed Systems
Matthew Arnold   On:  4-Dec-2008 01:00 PM - 02:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 
   

Abstract:
This talk is based on an OOPSLA 2008 paper by Matthew Arnold, Martin Vechev, and Eran Yahav. Coping with software defects that occur in the post-deployment stage is a challenging problem: bugs may occur only when the system uses a specific configuration and only under certain usage scenarios. Nevertheless, halting production systems until the bug is tracked and fixed is often impossible. Thus, developers have to try to reproduce the bug in laboratory conditions. Often the reproduction of the bug consists of the lion share of the debugging effort. In this paper we suggest an approach to address the aforementioned problem by using a specialized runtime environment (QVM, for Quality Virtual Machine). QVM efficiently detects defects by continuously monitoring the execution of the application in a production setting. QVM enables the efficient checking of violations of user-specied correctness properties, e.g., typestate safety properties, Java assertions, and heap properties pertaining to ownership. QVM is markedly different from existing techniques for continuous monitoring by using a novel overhead manager which enforces a user-specied overhead budget for quality checks. Existing tools for error detection in the field usually disrupt the operation of the deployed system. QVM, on the other hand, provides a balanced trade off between the cost of the monitoring process and the maintenance of sufficient accuracy for detecting defects. Specifically, the overhead cost of using QVM instead of a standard JVM is low enough to be acceptable in production environments. We implemented QVM on top of IBM's J9 Java Virtual Machine and used it to detect and fix various errors in real world applications.

Agile software development
Matt Ganis   On:  13-Nov-2008 01:00 PM - 03:00 PM
STSM   At:  Watson Research Center (Hawthorne), Room GN-F15
IBM SWG   Host:   Peter Santhanam

Abstract:
Agile software development -- embodied in techniques such as Extreme Programming (XP), Rational Unified Process (RUP), Open Unified Process (OpenUP), and Scrum -- are quickly becoming the dominant approach within IT organizations. This highly collaborative, iterative, and incremental approach favored by agile teams implies that all aspects of IT will need to change to accommodate and adopt this new paradigm. This workshop introduces participants to the fundamental principles, practices, and methodologies of agile software development. Some techniques will seem familiar to you and others will prove to be very provocative. Agile software development works and it is here to stay: Are you prepared?

Speaker biography:
Dr. Matthew Ganis is an IBM STSM and ibm.com site architect. Matt is the co-creator of the Agile@IBM community and was an early adopter of agile practices (XP) within IBM. Matt currently teaches the IBM Disciplined Agile Development class and has published numerous articles/papers on the use of Agile methods within ibm.com - both within it's traditional Web Development and the development/support of ibm.com's Second Life Island. Matt has been the co-chair and chair of the Academy of Technology's Agile Conferences for the past two years and is a Certified Scrum Master and Practitioner. Externally Matt serves on the editorial board of the International Journal of AGILE AND EXTREME SOFTWARE DEVELOPMENT and is a steering committee member of New York City's Agile Project Leadership Network (APLN) chapter.

Static checking of contracts in .NET via Abstract interpretation
Francesco Logozzo   On:  12-Nov-2008 10:00 AM - 11:30 AM
Microsoft Research   At:  Watson Research Center (Hawthorne), Room GN-K35
   Host:   Marco Pistoia

Abstract:
Managed contracts bring the advantages of design-by-contract programming to all .NET programming languages. Programming with contracts means writing pre-condition, post-conditions and object invariants. Contracts can be checked at runtime or at static time. After introducing the Managed contracts project, I will focus on the underlying static checker, Clousot. Clousot is an abstract interpretation-based analyzer designed to be efficient and automatic. It uses a combination of new abstract domains (Pentagons, Stripes, Subpolyhedra, ...), of new analysis techniques (partial backward analysis, iterative refinement, invariants on demand, ...) to achieve scalability without giving up performances. To reduce the annotation burden, and to produce a better user experience, Clousot can automatically infer pre-conditions and post-conditions. This is joint work with Mike Barnett and Manuel Fähndrich.

Speaker biography:
Francesco Logozzo is a researcher in the Programming Languages and Analysis group at Microsoft Research, Redmond. He is one of the main contributors of the Managed Tools project (http://research.microsoft.com/contracts/), mainly working on the static contract checker (Clousot). Before joining MSR, Francesco was a post-doctoral researcher at the ENS, Paris in the research group of Prof. Patrick Cousot. He graduated from Ecole Polytechnique, Palaiseau, France with a thesis entitle: "Modular static analysis of object-oriented programs". The advisor was Dr. Radhia Cousot.

The Human Side of Software Engineering
Dr. Janice Singer   On:  4-Nov-2008 02:30 PM - 04:00 PM
National Research Council Canada (NRC)   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Catalina Danis

Abstract:
Software Engineering is first and foremost a human endeavour. Yet we still know little (from a research perspective) about the human drivers, intentions, and barriers to producing good software. In this talk, Dr. Singer examines six aspects of the human side of software engineering. She looks at "the big picture," process, teamwork, communication, personality, and cognition. Each aspect is illustrated via video clips of interviews with software engineers.

Speaker biography:
Janice Singer is a Senior Research Officer in the Software Engineering Group of the National Research Council Canada (NRC). Dr. Singer's research lies at the boundaries of Human Computer Interaction, Computer Supported Cooperative Work, and Software Engineering. In these areas, Dr. Singer had conducted numerous qualitative and quantitative studies. Additionally, Dr. Singer advises others on appropriate study design. Dr. Singer is also an expert on research ethics and their application in software engineering contexts. She recently co-edited, "Guide to Advanced Empirical Software Engineering." Dr. Singer received her Ph.D. in Cognition and Learning from the Learning Research and Development Center of the University of Pittsburgh. Before coming to the NRC, she worked for Tektronix, IBM, and Xerox PARC.

Multicore programming models and their implementation challenges
Prof. Vivek Sarkar   On:  4-Nov-2008 10:00 AM - 11:30 AM
E.D. Butcher Professor of Computer Science   At:  Watson Research Center (Hawthorne), Room GN-F15
Rice University   Host:  Vijay Saraswat

Abstract:
The computer industry is at a major inflection point in its hardware roadmap due to the end of a decades-long trend of exponentially increasing clock frequencies. It is widely agreed that spatial parallelism in the form of multiple power-efficient cores must be exploited to compensate for this lack of frequency scaling. Unlike previous generations of hardware evolution, this shift towards multicore and manycore computing will have a profound impact on software. These software challenges are further compounded by the need to enable parallelism in workloads and application domains that have traditionally not had to worry about multiprocessor parallelism in the past. In this talk, we will focus on the programming problem for tightly coupled homogeneous and heterogeneous multicore processors. We present early experiences with the new Habanero Multicore Software Research project at Rice University (http://habanero.rice.edu) that encompasses work on programming models, compilers, runtimes, and concurrency libraries so as to enable portable software that can run unchanged on a range of homogeneous and heterogeneous multicore systems. The Habanero project takes a two-level approach to programming models, with a high-level model based on Intel Concurrent Collections for parallelism-oblivious domain experts , and a lower-level model based on the high productivity X10 language for parallelism-aware developers. We discuss compiler and runtime implementation challenges that must be overcome to enable mainstream applications to use these models on multicore systems. Solutions to some of these challenges are being addressed in collaboration with IBM Research as part of the Multicore Open Collaborative Research program.

Speaker biography:
Vivek Sarkar conducts research in programming languages, program analysis, compiler optimizations and virtual machines for parallel and high performance computer systems, and currently leads the Habanero Multicore Software Research project at Rice University (www.habanero.rice.edu). Prior to joining Rice, he was Senior Manager of Programming Technologies at IBM Research. His responsibilities at IBM included leading IBM's research efforts in programming model, tools, and productivity in the PERCS project during 2002- 2007 as part of the DARPA High Productivity Computing System program. His past projects include the X10 programming language, the Jikes Research Virtual Machine for the Java language, the ASTI optimizer used in IBM's XL Fortran product compilers, the PTRAN automatic parallelization system, and profile-directed partitioning and scheduling of Sisal programs. Vivek became a member of the IBM Academy of Technology in 1995, an ACM Distinguished Scientist in 2006, and the E.D. Butcher Professor of Computer Science at Rice University in 2007. He holds a B.Tech. degree from the Indian Institute of Technology, Kanpur, an M.S. degree from University of Wisconsin-Madison, and a Ph.D. from Stanford University. In 1997, he was on sabbatical as a visiting associate professor at MIT, where he was a founding member of the MIT RAW multicore project

Parallel Scheduling, Theory and Practice
Prof. Guy Blelloch    On:  3-Nov-2008 10:00 AM - 11:30 AM
CMU   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Vijay Saraswat

Abstract:
With the many levels of parallelism available on modern systems and the need for high-level parallel programming languages, developing effective dynamic scheduling algorithms is likely to be one of the biggest computational challenges of the next decade. Although different schedules might execute the exact same tasks, the schedule can have a huge effect on performance due to differences in overheads, locality, pipeline effectiveness, and memory latency. In this talk I will review the theoretical results from the past 20 years on dynamic schedulers and how well these results transfer to practice. This will include a review of work stealing scheduling, parallel depth-first (PDF) scheduling, and various hybrid approaches. I will then discuss some recent results on scheduling approaches that take advantage of both shared and distributed caches, as are present on multicore systems. Finally I will discuss some ongoing work on scheduling in the context of the X10 programming language, and some challenges for the future.

Engineering for Fine Grained Parallelism
Prof. Doug Lea   On:  3-Nov-2008 02:00 PM - 03:30 PM
SUNY Oswego   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Vijay Saraswat

Abstract:
Lightweight work-stealing frameworks are among the most promising means of exploiting multicores and multiprocessors. This talk will survey the basic approach, and discuss some of the challenges faced in engineering and extending functionality to capture a broader range of "everyday" parallel applications.

Chameleon: Adaptive Online Selection of Collections
Ohad Shaham   On:  31-Oct-2008 10:00 AM - 11:30 AM
Tel Aviv University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
Languages such as Java and C#, as well as scripting languages like Python, and Ruby, make extensive use of Collection classes. Different collection implementations have different characteristics in terms of the time required to perform certain operations, space utilization, and synchronization. Making an optimal choice of a collection implementation in a particular program usage is a nontrivial task. Choosing the right collection is even more challenging in the presence of concurrency due to the wide range of alternative implementations and their subtle differences. We present Chameleon, a tool that relieves the programmer from the burden of choosing an appropriate collection implementation for a particular usage point in the program. Chameleon collects on-the-fly information of how collections are used within a specific context, and automatically chooses the most suitable collection implementation available for a context. We have implemented our tool on top of the Java libraries provided by IBM's production JVM J9. We have also implemented a version of that works on top of QVM. Using conceptually simple QVM extensions we are able to significantly improve the performance of the tool. We have evaluated our tool over a small set of benchmarks. We show that Chameleon's choice of a collection is at least as good as the choice made by a typical Java programmer. (This is joint work with Martin Vechev and Eran Yahav)

Inferring Synchronization under Limited Observability
Greta Yorsh   On:  27-Oct-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
This paper addresses the problem of automatically inferring synchronization for concurrent programs. Given a program and a specification, we infer synchronization that avoids all interleavings violating the specification, but permits as many valid interleavings as possible. We let the user specify an upper bound on the cost of synchronization, which may limit the observability - what observations on program state can be made by the synchronization code. We present an algorithm that infers, under certain conditions, the maximally permissive synchronization for a given cost. We implemented a prototype of our approach and applied it to infer synchronization in a number of small programs. (joint work with Martin Vechev and Eran Yahav)

BROWSER WARS 2: Browser Competition And The Evolution Of The Open Web Platform
Rob O'Callahan   On:  24-Oct-2008 03:30 PM - 05:00 PM
Mozilla   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   David Bacon

Abstract:
Driven by a desire for influence over the Internet, Microsoft, Apple, Google and Mozilla are increasing their investments in Web browser development. This is having a positive effect on the design and implementation of Web standards to ease application development and dramatically broaden the scope of what Web apps can do. One key area of competition is Javascript performance; Apple, Google and Mozilla have all made dramatic improvements using a variety of compilation techniques. Other new features for the open Web include "worker thread" parallelism, support for running Web applications while offline, and lots of visual "bling". I'll talk about what's coming in the current and next generation of browsers, and how and why we're doing it. I'll also discuss the barriers and challenges that we're grappling with, especially in the area of security.

The birth of a Manycore Virtual Machine; A Renaissance update
David Ungar and Sam Adams    On:  24-Oct-2008 01:30 PM - 03:00 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Erik Altman

Abstract:
The Renaissance ER project began earlier this year with the twin goals of discovering breakthrough performance and programmability for future massively multicore or "manycore" processors. The first six months of the project has seen excellent progress towards those goals with the development of the first manycore virtual machine operating on 56 cores of a Tilera64 manycore processor. This talk will cover the development of the virtual machine to date along with the coevolution of measurement and visualization strategies for manycore VM development. Early manycore enhancements to the programming and debugging environment will also be demonstrated.

Speaker biography:
Sam's bio: Sam Adams works in the Programming Models and Tools group in Software Research and was one of IBM's first Distinguished Engineers. In his 15 years at IBM, Sam has helped found the first Object Technology Practice in the IBM Consulting Group, developed the first large scale object reuse library in IBM, pioneered Self-Configuring Systems which foreshadowed Autonomic Computing, co-authored the XML Technical Strategy for IBM that set our direction into Web Services, coined the term Service Oriented Architectures and the Publish-Find-Bind model, explored Artificial General Intelligence in the Joshua Blue project, pioneered End User Programming for Web Services and mashup technology, and co-led Research's Enterprise 2.0 initiative, exploring Web 2.0 in the enterprise. Since mid-2007 he has been focused on discovering new programming models for massively multicore systems and is a Principal Investigator on the Renaissance Exploratory Research project.

Tax-and-Spend: Democratic Scheduling for Real-time Garbage Collection
David P. Grove    On:  14-Oct-2008 01:30 PM - 02:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
This talk is based on an EMSOFT'08 paper by Joshua Auerbach, David F. Bacon, Ben Biron, Perry Cheng, Charlie Gracie, David Grove, Bill McCloskey, Aleks Micic, and Ryan Sciampacone. Real-time Garbage Collection (RTGC) has recently advanced to the point where it is being used in production for financial trading, military command-and-control, and telecommunications. However, among potential users of RTGC, there is enormous diversity in both application requirements and deployment environments. Previously described RTGCs tend to work well in a narrow band of possible environments, leading to fragile systems and limiting adoption of real-time garbage collection technology. This paper introduces a collector scheduling methodology called tax-and-spend and the collector design revisions needed to support it. Tax-and-spend provides a general mechanism which works well across a variety of application, machine, and operating system configurations. Tax-and-spend subsumes the predominant pre-existing RTGC scheduling techniques. It allows different policies to be applied in different contexts depending on the needs of the application. Virtual machines can co-exist compositionally on a single machine. We describe the implementation of our system, Metronome-TS, as an extension of the Metronome collector in IBM's Real-time J9 virtual machine product, and we evaluate it running on an 8-way SMP blade with a real-time Linux kernel. Compared to the stateof-the-art Metronome system on which it is based, implemented in the identical infrastructure, it achieves almost 3x shorter latencies, comparable utilization at a 2.5x shorter time window, and mean throughput improvements of 10-20%.

Speaker biography:
My primary research interests are in programming language design and implementation. I've done work in the analysis and optimization of object-oriented languages, virtual machine design and implementation, JIT compilation, online feedback-directed optimization, and garbage collection. Most of my current research is done in the context of the Metronome Project, which is making Java suitable for use in implementing large scale real-time systems. The Metronome garbage collector is now available as part of IBM's WebSphere Real Time product and TuningFork is available as an alphaworks download. For more details, see the project software page. I am a member of the Dynamic Optimization Group, which developed the optimizing compiler and adaptive optimization system for the Jalapeno virtual machine. In 2001, Jalapeno became Jikes RVM and was released as an open source project. Since 2001, I have been a member of the Jikes RVM core team and steering committee. I received my Ph.D. in Computer Science in October, 1998 from the University of Washington's Department of Computer Science and Engineering. While at UW, I was a member of the Cecil/Vortex project.

Behavioural Model Fusion: Merge, Composition and Verification
Shiva Nejati   On:  13-Oct-2008 09:45 AM - 11:00 AM
University of Toronto)   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Amit Paradkar

Abstract:
There is a rapidly growing interest in model-based development as a way to increase the level of abstraction and automation in software engineering. The ultimate goal of model-based development is to improve the software process by promoting the use of models as the primary artifacts of development, and to provide computer-supported technologies to transform models into running systems. Model-based development becomes particularly challenging in projects where developers have to handle multiple partial models of a system. Individual models may represent different system features, describe alternative perspectives on a single feature, or express ways in which features alter one another's structure or behaviour. We refer to the process of integrating a collection of partial models into a whole system as "model fusion". In this talk, I present my work on fusion of behavioural software models. In particular, I focus on the following two problems: (1) merging variant models of individual features with the goal of simplifying system maintenance, and (2) composing models of different features with the goal of identifying and resolving their undesirable interactions. I explain the theory behind the work, and demonstrate how our techniques can be applied for management and analysis of models from a telecommunication domain.

Speaker biography:
Shiva is a Ph.D. candidate in the Computer Science department of University of Toronto where she recently defended her thesis. Her thesis adviser is Prof. Marsha Chechik. She is a co-author of the paper on "Matching and Merging Statechart Specifications" which received one of the best paper awards at ICSE'2007.

Constrained Types for Object-Oriented Languages
Nate Nystrom   On:  13-Oct-2008 11:00 AM - 12:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
X10 is a modern object-oriented language designed for productivity and performance in concurrent and distributed systems. In this setting, dependent types offer significant opportunities for detecting design errors statically, documenting design decisions, eliminating costly run-time checks (e.g., for array bounds, null values), and improving the quality of generated code. We present the design and implementation of constrained types, a natural, simple, clean, and expressive extension to object-oriented programming: A type C{c} names a class or interface C and a constraint c on the immutable state of C and in-scope final variables. Constraints may also be associated with class definitions (representing class invariants) and with method and constructor definitions (representing preconditions). Dynamic casting is permitted. The system is parametric on the underlying constraint system: the compiler supports a simple equality-based constraint system but, in addition, supports extension with new constraint systems using compiler plugins.

The Java Module System, and its Problems
Rok Strnisa   On:  6-Oct-2008 10:00 AM - 11:30 AM
Cambridge, UK   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
The talk will outline the key ideas behind the upcoming Java Module System, the module system proposed for Java 7. The talk will also describe some of the problems with the proposal that our research has identified, and outline our ideas for removing them.

The new importance of language features in raising the abstraction bar in software engineering
Prof. Judith Bishop    On:  1-Oct-2008 10:00 AM - 11:30 AM
University of Pretoria, South Africa   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Satish Chandra

Abstract:
In the context of software engineering, abstraction is the means by which developers move from layer to layer in the realization of the solution to a large problem. For more than a decade, programming languages in wide industrial use have been at a fixed point, defined by Java and C++. This talk reports on the relevance of new language features in C# 3.0 and their applicability for raising the level of abstraction at which ordinary programmers can operate. These features include delegates, properties, extension methods and lambda expressions, most of which are not (yet) in the other major languages, as well as the finer details of reflection, generics and iterators. As a corpus for the investigation, we have used the classic design patterns. From their very inception, there was the anticipation that some of them would be superceded by new language features. Yet published implementations of classic patterns do not generally live up to this promise. Examples of new approaches to pattern implementations using the new language features will be given. A look at the efficiency of some alternative implementations will be presented, highlighting the visitor pattern. Finally we see how the concepts embodied in the new C# 3.0 features can be incorporated into Java by means of a small set of generic classes.

Speaker biography:
Judith Bishop is a professor of Computer Science at the University of Pretoria, South Africa. She previously worked at the universities of the Witwatersrand and Southampton, where she received her PhD in 1977, working on compilers for structured architectures. Her research interests are on the principles of adaptive software in a multi-lingual and mobile environment, in collaboration with Microsoft Research, local companies and collaborators in Germany and Italy. Her 14 books on programming and languages have been translated into six languages and are read worldwide. Judith is a proud South African with a visible presence abroad, serving on international editorial, programme and award committees, and organizing conferences and summer schools in South Africa aimed at keeping postgraduates involved in cutting edge research. On sabbaticals she has worked at Microsoft Research Cambridge, the SEI in Pittsburgh and t the Universities of Victoria, Karlsruhe, Berlin and Milan. She was elected chair of IFIP's working group on Software Implementation Technology for two terms, and is now chair of the World Computer Congress in 2008. In 2010, she will be bringing the ACM/IEEE International Conference on Software Engineering Conference to Cape Town. She has received many awards for excellence in and service to computer science, and most recently was elected a Fellow of the Royal Society of South Africa. In 2008, she received UP Leading Mind Medal in the University's Centenary Year.

Instrumentation made Easy: An Introduction to Tracematches
Pavel Avgustinov   On:  29-Sep-2008 10:00 AM - 11:30 AM
Oxford University   At:  Watson Research Center (Hawthorne), Room GNK35
   Host:  Frank Tip

Abstract:
It is very common in the area of fault-detection to instrument a base program with additional code that either detects or corrects errors. Conventionally, such instrumentation is either written by hand or in some form of the aspect-oriented programming paradigm, but both of these approaches tend to be error-prone and costly in terms of development effort. Tracematches are one of the more mature incarnations of "trace monitoring". They allow the developer to write a short, declarative specification of their concern, and generate efficient (and provably correct) instrumentation that checks the specification. This talk will give a basic introduction to tracematches, discuss some of the performance implications and look at their applicability to common problems.

Lowering the Bar for Precise Pointer Analysis
Ben Hardekopf    On:  22-Sep-2008 10:00 AM - 11:30 AM
UT Austin   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Stephen Fink

Abstract:
Pointer information is a fundamental prerequisite for many program analyses, including hot topics such as program verification, program comprehension, security analysis, and many more. This talk presents a set of new, highly-scalable pointer analysis algorithms that target two different levels of precision: flow-insensitive, inclusion-based pointer analysis (i.e., Andersen-style analysis) and flow-sensitive pointer analysis. The main part of the talk shows how to significantly improve the scalability of Andersen-style analysis by targeting two different notions of equivalence: pointer equivalence and location equivalence. These techniques yield an Andersen-style analysis that is over 4x faster and uses over 7x less memory than the previous state of the art. The remainder of the talk briefly covers the major challenges of flow-sensitive pointer analysis and details our approach to meeting those challenges. We report on the results of the first stage of our research, which yields a flow-sensitive algorithm that is 197x faster than the previous state of the art. We also give a glimpse into the next stage of our research, which is currently under development. This work promises to increase the scalability of flow-sensitive pointer analysis even further, perhaps even to millions of lines of code.

Speaker biography:
Ben Hardekopf received a BS in Computer Science and BSE in Electrical Engineering from Duke University in 1997. He received a Masters in Computer Science from SUNY at Utica/Rome in 2000 while serving as an active duty officer in the United States Air Force. He is currently in the Ph.D. program at The University of Texas at Austin and expects to graduate in May of 2009. His advisor is Calvin Lin.

Message-Passing Concurrency Models for Writing Parallel Programs
Prof. Martin Sulzmann   On:  19-Sep-2008 10:00 AM - 11:30 AM
ITU Copenhagen   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
In order to benefit from the additional performance (cores) of multi-core architectures, we have to write concurrent programs which can then be executed in parallel on these platforms. We review message-passing concurrency a la Erlang and Join calculus and illustrate its use via several programming examples. We report on our own recent work where we apply ideas from multi-set constraint rewriting to describe richer coordination patterns among messages and to support the parallel execution of Join patterns.

Speaker biography:
Martin Sulzmann, PhD, graduated from Yale University in 2000. He is currently an Associate Professor at the IT University of Copenhagen in the Programming, Logics and Semantics group. He held previous positions at the National University of Singapore and the University of Melbourne. His research interests are centered around programming language design, implementation and analysis, mainly of functional and constraint logic languages.

Efficient Sparse Matrix Vector Multiplication on GPUs
Muthu Baskaran   On:  18-Sep-2008 10:00 AM - 11:30 AM
Ohio State University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Rajesh Bordawekar

Abstract:
We are witnessing the emergence of Graphics Processor units (GPUs) as powerful massively parallel systems. Furthermore, the introduction of new APIs for general-purpose computations on GPUs, namely CUDA from NVIDIA and CTM from ATI, makes GPUs an attrative choice for high-performance numerical and scientific computing. Sparse matrix-vector multiply (SpMV) is one of the most important and heavily used kernels in scientific computing. However with indirect and irregular memory accesses resulting in more memory accesses per floating point operation, optimization of SpMV kernel is a significant challenge in any architecture. In this work, we examine the various challenges in developing a high-performance SpMV kernel on NVIDIA GPUs using CUDA and devise techniques to address them. We develop techniques that optimize memory accesses in memory-bound SpMV kernel by (1) effectively utilizing the various (low-latency) memories available in GPUs, (2) extracting optimal memory access pattern pertaining to the type of memory, and (3) reordering computation to exploit data reuse. We develop an optional inspector-executor module to preprocess and analyze the non-zero pattern in sparse matrices to guide the optimizations. We demonstrate the performance improvements achieved by our approach compared to existing state-of-the-art parallel SpMV implementaions on GPUs.

Experience with META / xWB -- Generating Modeling Workbenches
Joel Ossher    On:  12-Sep-2008 10:00 AM - 11:30 AM
UC Irvine, IBM Research summer intern   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Doug Kimelman

Abstract:
Metaworkbenches have taken strong hold in the enterprise architecture domain (e.g. Telelogic System Architect) and in the embedded systems domain (e.g. MetaCase). Feedback from IBM Rational customers (e.g. Pitney Bowes, Unisys, HSBC) indicates a strong desire for metaworkbenches in the mainstream requirements and IT architecture domain as well. Analysts and architects want tools to conform to their way of thinking about and conceiving of systems, rather than vice versa. We describe a summer project in which the metaworkbench technology underlying Architects' Workbench was used to define a workbench for Pitney Bowes, and to deploy it into early production use. The Pitney Bowes Workbench (""PBWB"") is aimed at requirements analysts and architects following an approach inspired by the Gottesdiener multi-modeling work, and in the future will be expanded to incorporate the Rozanski and Woods Viewpoints and Perspectives approach to architecture as well as SEI Quality Attribute Scenarios. We will show the workbench live, we will show snippets of its metamodel and modeling interface definition, and we will present a few initial architectural principles we have derived concerning workbench definition. Back in the day, when everyone made up their own language, there was YACC. For an age where everyone wants a workbench for the metamodel of their choice...... Discussion to follow :-) Joint work with: Doug Kimelman, Ian Simmonds

Static Specification Mining Using Automata-Based Abstractions
Eran Yahav   On:  10-Sep-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room  GN-F15
   

Abstract:
This talk is based on an ISSTA 2007 paper by Sharon Shoham, Eran Yahav, Steve Fink, and Marco Pistoia. The paper won a best paper award at ISSTA, and won an IBM Research Pat Goldberg best paper award. We present a novel approach to client-side mining of temporal API specifications based on static analysis. Specifically, we present an interprocedural analysis over a combined domain that abstracts both aliasing and event sequences for individual objects. The analysis uses a new family of automata-based abstractions to represent unbounded event sequences, designed to disambiguate distinct usage patterns and merge similar usage patterns. Additionally, our approach includes an algorithm that summarizes abstract traces based on automata clusters, and effectively rules out spurious behaviors. We show experimental results mining specifications from a number of Java clients and APIs. The results indicate that effective static analysis for client-side mining requires fairly precise treatment of aliasing and abstract event sequences. Based on the results, we conclude that static client-side specification mining shows promise as a complement or alternative to dynamic approaches.

Testing and Verification with Aspects
Prof. Shmuel Katz    On:  21-Aug-2008 10:00 AM - 11:30 AM
Computer Science Department, The Technion, Israel   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Harold Ossher

Abstract:
Aspects can be used to help in testing, debugging, and verifying systems, as well as in adding other types of functionality in an evolving system or for a product line. This talk will demonstrate typical uses of aspects for testing and verification of Object- Oriented systems. It will also consider the threats to reliability introduced by aspects themselves, and how verification techniques can be extended to treat systems with aspects, including their specification, modular verification, and analysis of possible interferences among aspects. A new Eclipse framework of analysis tools for aspect systems called the CAPE will be described, as a first step towards their integration into a development process for systems and product lines that use aspects.

Speaker biography:
Shmuel Katz received his Ph.D. in Computer Science from the Weizmann Institute of Science in 1976. He is a Professor in the Computer Science Department at the Technion, and founded the Systems and Software Development Laboratory there. His research interests include formal specification, verification methods, and software engineering, with over 70 papers in these areas. In recent years his work has centered on formal methods and design for aspect-oriented software development. He heads the Formal Methods Laboratory of the EU Network of Excellence on Aspect-Oriented Software Development, coordinating work on testing and formal methods for aspects.

The Clojure Programming Language
Rich Hickey   On:  14-Aug-2008 10:00 AM - 11:30 AM
Independent Software Designer   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Martin Hirzel

Abstract:
Customers and stakeholders have substantial investments in, and are comfortable with the performance, security and stability of, industry-standard platforms like the JVM and CLR. While Java and C# developers on those platforms may envy the succinctness, flexibility and productivity of dynamic languages, they have concerns about running on customer-approved infrastructure, access to their existing code base and libraries, and performance. In addition, they face ongoing problems dealing with concurrency using native threads and locking. Clojure is an effort in pragmatic dynamic language design in this context. It endeavors to be a general-purpose language suitable in those areas where Java is suitable. It reflects the reality that, for the concurrent programming future, pervasive, unmoderated mutation simply has to go. Clojure meets its goals by: embracing an industry-standard, open platform - the JVM; modernizing a venerable language - Lisp; fostering functional programming with immutable persistent data structures; and providing built-in concurrency support via software transactional memory and asynchronous agents. The result is robust, practical, and fast. This talk will focus on the motivations, mechanisms and experiences of the implementation of Clojure.

Speaker biography:
Rich Hickey, the author of Clojure, is an independent software designer, consultant and application architect with over 20 years of experience in all facets of software development. Rich has worked on scheduling systems, broadcast automation, audio analysis and fingerprinting, database design, yield management, exit poll systems, and machine listening, in a variety of languages.

Research to Product: Building a Billion Dollar Garbage Collector
David Bacon    On:  5-Aug-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room GN-F15
   

Abstract:
In this talk I will describe how a small long-term research project focusing on the fringes of a commoditized portion of the software industry turned into the key component of IBM's entry into a new business area bringing in hundreds of millions of dollars in revenue. I'll describe the genesis of the project and how we overcame different kinds of obstacles to bring the work to fruition: technical, organizational, competitive, and cultural. I'll also describe the key technical features and organizing research principles that led to the success of the project. Joint work with Joshua Auerbach, Perry Cheng, David Grove, and V.T. Rajan (A version of this talk was presented as the keynote at ISMM'08)

Program Analysis for Web Application Security
Prof. Zhendong Su   On:  4-Aug-2008 10:00 AM - 11:30 AM
UC Davis   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Marco Pistoia

Abstract:
Web applications enable much of today's online business including banking, shopping, university admissions, and various governmental activities. Anyone with a web browser can access them, and the data they manage typically has significant value both to the users and to the service providers. Thus, they are becoming increasing targets of attacks; since 2005, the most frequently reported classes of vulnerabilities are for web applications. These vulnerabilities arise because web applications' layers (client, server, and database) communicate via unstructured strings, and validating untrusted input is error-prone and introduces a challenging software engineering problem. In this talk, I will present a simple, yet general characterization of input validation-related attacks and a set of dynamic and static techniques to detect and prevent such attacks. I will also present empirical results to demonstrate that the techniques can prevent real-world attacks and detect previously unknown vulnerabilities in large web applications. I will conclude this talk by discussing some future challenges in this domain.

Speaker biography:
Zhendong Su is an Associate Professor in Computer Science at the University of California, Davis. His research interests span programming languages, software engineering, and computer security, focusing on developing practical techniques and tools for improving software reliability and security. He received his M.S. and Ph.D. degrees in Computer Science from UC Berkeley, and his B.S. degree in Computer Science and B.A. degree in Mathematics from UT Austin. He is the recipient of a Best Paper Award from the European Association for Programming Languages and Systems, an ACM SIGSOFT Distinguished Paper Award, an NSF CAREEER Award, and a College of Engineering Outstanding Junior Faculty Award at UC Davis.

A Report on a Survey and Study of Static Analysis Users
Nat Ayewah    On:  31-Jul-2008 10:00 AM - 11:30 AM
University of Maryland   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Martin Hirzel

Abstract:
As static analysis tools mature and attract more users, vendors and researchers have an increased interest in understanding how users interact with them, and how they impact the software development process. The FindBugs project has conducted a number of studies including online surveys, interviews and a preliminary controlled user study to better understand the practices, experiences and needs of its users. Through these studies we have learned that many users are interested in even low priority warnings, and some organizations are building custom solutions to more seamlessly and automatically integrate FindBugs into their software processes. We've also observed that developers can make decisions about the accuracy and severity of warnings fairly quickly and independent reviewers will generally reach the same conclusions about warnings.

Improving developer documentation with active guides
Barthélémy Dagenais   On:  24-Jul-2008 01:00 PM - 02:30 PM
McGill University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Harold Ossher

Abstract:
Learning how to use or extend a large framework is difficult and time-consuming. Framework developers often create recipes in the form of cookbooks and tutorials to help, but that process is time-consuming and error prone in itself, both for the documentation authors and the users. Our intuition (back in 2006) was that by shifting the focus of the developers from a recipe to a concern -- the key elements of the framework that are involved in an extension task -- we could improve both the creation and the usage of framework documentation. Following this intuition, we built two related tools, integrated with Eclipse. Mismar supports concern-oriented guides, which are quick and easy for developers to create and which actively assist users in performing the tasks they describe. XFinder automatically locates in a code base implementation examples of Mismar guides, as an added aid to users following the guides. In this talk I will describe and demonstrate both Mismar and XFinder, and present the results of a validation study of XFinder that we performed with six Eclipse committers. I will conclude by describing briefly some new work on concern-oriented documentation that we are just beginning.

Speaker biography:
Barthélémy Dagenais is a student at McGill University in Montréal and a part-time researcher at IBM T.J. Watson Research Center (through IBM Canada). He is currently completing his Master thesis under the supervision of Prof. Martin Robillard on the dual topic of framework evolution and static analysis of partial programs. At IBM, he works with Harold Ossher on easing the creation and usage of framework documentation through a concern-oriented toolset. Beside programming and writing papers, he can be found practicing Kung Fu. He will start his PhD at McGill in Fall 2008.

We have it easy, but do we have it right?
Prof. Amer Diwan   On:  23-Jul-2008 10:00 AM - 11:30 AM
http://www-plan.cs.colorado.edu/diwan/   At:  Watson Research Center (Hawthorne), Room 1S-F40
University of Colorado at Boulder   Host:  Peter Sweeney

Abstract:
To evaluate an innovation in computer systems, performance analysts measure execution time or other metrics using one or more standard workloads. The performance analyst may carefully minimize the amount of measurement instrumentation, control the environment in which measurement takes place, and repeat each measurement multiple times. Finally, the performance analyst may use statistical techniques to characterize the data. Unfortunately, even with such a responsible approach, the collected data may be misleading. This talk shows how easy it is to produce poor (and thus misleading) data for computer systems due to observer effect and measurement bias. Observer effect occurs if data collection perturbs the behavior of the system. Measurement bias occurs when a particular environment in which the measurement takes place favors some configurations over others. This talk demonstrates that observer effect and measurement bias have significant impact on performance and can lead to incorrect conclusions. These effects are large enough to easily mislead a performance analyst.

The Dataflow Interchange Format: A Language and Environment for Experimenting with DSP-Oriented
Prof. Shuvra Bhattacharyya   On:  22-Jul-2008 03:30 PM - 05:00 PM
UMD   At:  Watson Research Center (Hawthorne), Room 2NF28
   Host:  Rodric Rabbah

Abstract:
This talk provides an overview of the dataflow interchange format (DIF) project at the University of Maryland. DIF is a textual language for specifying mixed-grain dataflow representations of digital signal processing (DSP) applications. A major emphasis in DIF is support for working with and integrating different kinds of specialized dataflow models of computation and their associated analysis techniques. One way that DIF achieves this is by allowing designers to specify subgraphs of a design in terms of specific dataflow modeling techniques, such as synchronous, cyclo-static, and parameterized dataflow, through corresponding keywords in the language. DIF also incorporates a new dataflow model of computation called enable-invoke dataflow, which is geared towards high expressive power, functional simulation, rapid prototyping, and efficient refinement into more specialized dataflow models.

Speaker biography:
SHUVRA S. BHATTACHARYYA is a Professor in the Department of Electrical and Computer Engineering, University of Maryland at College Park. He holds a joint appointment in the University of Maryland Institute for Advanced Computer Studies (UMIACS), and an affiliate appointment in the Department of Computer Science. Dr. Bhattacharyya is coauthor or coeditor of five books and the author or coauthor of more than 100 refereed technical articles. His research interests include design and implementation of signal processing systems; biomedical circuits and systems; embedded software; and hardware/software co-design. He received the B.S. degree from the University of Wisconsin at Madison, and the Ph.D. degree from the University of California at Berkeley. Dr. Bhattacharyya has held industrial positions as a Researcher at the Hitachi America Semiconductor Research Laboratory (San Jose, California), and Compiler Developer at Kuck & Associates (Champaign, Illinois).

Shape Analysis Overview
Prof. Mooly Sagiv    On:  22-Jul-2008 10:00 AM - 11:00 AM
Tel Aviv University   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Eran Yahav

Abstract:
Shape analysis concerns the problem of determining “shape invariants” for programs that perform destructive updating on dynamically allocated storage. One way to conservatively solve the shape analysis problem is to iteratively compute sets of shape graphs program locations which describe the shape invariants. I will give an overview of the method and its applications in program understanding, compiler optimizations, and program verification.

Automatically proving that non-blocking concurrent programs make progress
Alexey Gotsman   On:  17-Jul-2008 10:00 AM - 11:30 AM
Cambridge, UK   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
Modern programs are often designed such that certain events must eventually happen (e.g., termination of callbacks, releases of locks, etc). Examples can be found in all classes of software, ranging from device drivers to high-level banking software. With the increased use of non-blocking concurrency, the difficulty of proving these properties is exacerbated. This talk will describe a new method and a tool for proving progress properties of non-blocking concurrent programs.

Speaker biography:
Alexey Gotsman will soon be finishing his PhD at Cambridge University. His research interests are in the area or formal verification, particularly, in logical foundations and practical tools for verifying concurrent software. During his PhD he has interned at Microsoft Research and Cadence Berkeley Labs.

Finding Bugs in Dynamic Web Applications
Shay Artzi   On:  16-Jul-2008 10:00 AM - 11:30 AM
MIT   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Frank Tip

Abstract:
Web script crashes and malformed dynamically-generated Web pages are common errors, and they seriously impact usability of Web applications. Current tools for Web-page validation cannot handle the dynamically-generated pages that are ubiquitous on today's Internet. In this work, we apply a dynamic test generation technique, based on combined concrete and symbolic execution, to the domain of dynamic Web applications. The technique generates tests automatically, and it minimizes the bug-inducing inputs to reduce duplication and to make the bug reports small and easy to understand and fix. Our tool Apollo implements the technique for PHP. Apollo generates test inputs for the Web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed a total of 214 bugs in 4 real PHP web applications.

Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
Rodric Rabbah   On:  14-Jul-2008 01:00 PM - 02:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
The paradigm shift in processor design from monolithic processors to multicore has renewed interest in programming models that facilitate parallelism. While multicores are here today, the future is likely to witness architectures that use reconfigurable fabrics (FPGAs) as coprocessors. FPGAs provide an unmatched ability to tailor their circuitry per application, leading to better performance at lower power. Unfortunately, the skills required to program FPGAs are beyond the expertise of skilled software programmers. This paper shows how to bridge the gap between programming software vs. hardware. We introduce Lime, a new Object-Oriented language that can be compiled for the JVM or into a synthesizable hardware description language. Lime extends Java with features that provide a way to carry OO concepts into efficient hardware. We detail an end-to-end system from the language down to hardware synthesis and demonstrate a Lime program running on both a conventional processor and in an FPGA.

A Trust Management Approach for Flexible Policy Management in Security-Typed Languages
Sruthi Bandhakavi   On:  10-Jul-2008 10:00 AM - 11:30 AM
University of Illinois, Urbana-Champagne   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Michael Burke

Abstract:
Early work on security-typed languages required that legal information flows be defined statically. More recently, techniques have been introduced that relax these assumptions and allow policies to change at run-time. For example, the Rx language uses a policy language based on RT, a trust management framework for representing authorization policies. While Rx made significant strides toward the goal of allowing policy updates in security-typed languages, in this talk we observe that certain design choices of Rx violate the privacy and autonomy requirements of principals in trust management systems, thus making decentralized control over information difficult. To address these problems, we propose RTI, a new security-typed language. In addition to avoiding prior pitfalls, RTI's most distinguishing characteristic is that it supports fine-grained specification of security for dynamic policy. We also provide a proof of noninterference for RTI.

Systematic Concurrency Testing using CHESS
Madan Musuvathi    On:  3-Jul-2008 10:00 AM - 11:30 AM
Microsoft Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
People always identify concurrency testing with stress testing, even when their fundamental goals differ. Stress testing evaluates the program under load, while concurrency testing aims for better thread interleaving coverage. While it is true that stress indirectly increases the variety of thread interleavings, it is far from sufficient and has unpredictable results. Stories are legend of the so-called ``Heisenbugs'' that rarely surface and are hard to reproduce. In this talk, I will argue for a first-class notion of concurrency testing and describe CHESS, a tool we have developed towards that end. A user of CHESS provides simple concurrency scenarios and CHESS uses model checking techniques to systematically enumerate all interleavings of these scenarios. CHESS employs various algorithms to reduce the search space, focus on potentially bug-yielding schedules, and provide sound quantifiable notions of coverage. Moreover, on finding an error CHESS has the capability to replay the erroneous interleaving, greatly simplifying the debugging process. CHESS has been integrated with various codebases inside Microsoft. It has found numerous bugs and helped reproduce many stress-test crashes. CHESS is also available for download at google(``systematic concurrency testing'').