Seminars

IBM Research welcomes members of the research community to our seminars. To ensure compliance with IBM security guidelines, we ask that you contact the seminar host in advance. When you arrive at the Research lab, please provide the host's name to the receptionist.


Upcoming Seminars

Dynamic Program Understanding
Prof. Steven P. Reiss    On:  5-Dec-2008 10:00 AM - 11:30 AM
Brown University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
We are interested in understanding the dynamic behavior of large-scale, long-running software systems. This talk will describe our current approaches to dynamic program understanding. Performance analysis, in the broadest sense, is a central issue in understanding the dynamic behavior of a system. Understanding the performance of server-based software that runs continually for long periods of time under widely varying loads presents a number of challenges. To address these issues we have developed a methodology consisting of a monitoring framework along with a set of specialized monitoring agents called proflets. The system gathers performance data with a guaranteed maximum overhead, settable by the user, by appropriately sampling and scheduling the proflets. This makes the system suitable for use on production systems. The data is displayed dynamically through either a graphical or a web-based interface. Many server applications are event-based, with the bulk of the server execution listening for events and then processing them. Performance analysis of such applications should concentrate on event processing rather than on traditional performance measures. As part of our performance analysis tool we have developed a proflet that finds and then monitors event handlers. We are currently in the process of extending this to actually track the flow of events through the system. To fully understand the dynamics of a complex application, one needs to be able to ask and answer specific questions about the dynamics in terms of the application. Here we have developed a visualization system that lets the programmer define an application-specific dynamic visualization in terms of program events and an underlying data model and then view the result dynamically and historically.
Our eventual goal is to combine these efforts into a comprehensive system that provides detailed dynamic analysis of just the parts of an application that are of interest to the programmer and does so in terms of the application and the expressed interests. We conclude by describing how this approach might work.

Practical Checking of Heap Assertions
Bard Bloom   On:  9-Dec-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room GN-K35
   

Abstract:
Unrestricted use of heap pointers complicates understanding software systems. Incidental and accidental pointer aliasing results in unexpected side effects of seemingly unrelated operations, and is a major source of system failures. It is difficult or impossible to test and debug such failures with existing tools, especially when programs are concurrent or the assertions may inspect the entire heap. In this paper, we present a practical solution that enables the programmer to check ownership, sharing, reachability and other heap properties at runtime. Technically, we allow the programmer to add expressive heap assertions into her code. Our approach uses a specialized virtual machine that efficiently evaluates the assertions using the components of a parallel garbage collector. We have implemented our approach on top of a production virtual machine. In this paper we demonstrate its usefulness by describing numerous real-world usage scenarios in which we found expressive heap assertions to be valuable.


Past Seminars

Agile software development
Matt Ganis   On:  13-Nov-2008 01:00 PM - 03:00 PM
STSM   At:  Watson Research Center (Hawthorne), Room GN-F15
IBM SWG   Host:   Peter Santhanam

Abstract:
Agile software development -- embodied in techniques such as Extreme Programming (XP), Rational Unified Process (RUP), Open Unified Process (OpenUP), and Scrum -- is quickly becoming the dominant approach within IT organizations. The highly collaborative, iterative, and incremental approach favored by agile teams implies that all aspects of IT will need to change to accommodate and adopt this new paradigm. This workshop introduces participants to the fundamental principles, practices, and methodologies of agile software development. Some techniques will seem familiar to you and others will prove to be very provocative. Agile software development works and it is here to stay: Are you prepared?

Speaker biography:
Dr. Matthew Ganis is an IBM STSM and ibm.com site architect. Matt is the co-creator of the Agile@IBM community and was an early adopter of agile practices (XP) within IBM. Matt currently teaches the IBM Disciplined Agile Development class and has published numerous articles/papers on the use of Agile methods within ibm.com - both within its traditional Web Development and the development/support of ibm.com's Second Life Island. Matt has been the co-chair and chair of the Academy of Technology's Agile Conferences for the past two years and is a Certified Scrum Master and Practitioner. Externally, Matt serves on the editorial board of the International Journal of AGILE AND EXTREME SOFTWARE DEVELOPMENT and is a steering committee member of New York City's Agile Project Leadership Network (APLN) chapter.

Static checking of contracts in .NET via Abstract interpretation
Francesco Logozzo   On:  12-Nov-2008 10:00 AM - 11:30 AM
Microsoft Research   At:  Watson Research Center (Hawthorne), Room GN-K35
   Host:   Marco Pistoia

Abstract:
Managed contracts bring the advantages of design-by-contract programming to all .NET programming languages. Programming with contracts means writing pre-conditions, post-conditions and object invariants. Contracts can be checked at runtime or statically. After introducing the Managed contracts project, I will focus on the underlying static checker, Clousot. Clousot is an abstract interpretation-based analyzer designed to be efficient and automatic. It uses a combination of new abstract domains (Pentagons, Stripes, Subpolyhedra, ...) and new analysis techniques (partial backward analysis, iterative refinement, invariants on demand, ...) to achieve scalability without giving up performance. To reduce the annotation burden, and to produce a better user experience, Clousot can automatically infer pre-conditions and post-conditions. This is joint work with Mike Barnett and Manuel Fähndrich.

Speaker biography:
Francesco Logozzo is a researcher in the Programming Languages and Analysis group at Microsoft Research, Redmond. He is one of the main contributors to the Managed Tools project (http://research.microsoft.com/contracts/), mainly working on the static contract checker (Clousot). Before joining MSR, Francesco was a post-doctoral researcher at the ENS, Paris in the research group of Prof. Patrick Cousot. He graduated from Ecole Polytechnique, Palaiseau, France with a thesis entitled "Modular static analysis of object-oriented programs". The advisor was Dr. Radhia Cousot.

The Human Side of Software Engineering
Dr. Janice Singer   On:  4-Nov-2008 02:30 PM - 04:00 PM
National Research Council Canada (NRC)   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Catalina Danis

Abstract:
Software Engineering is first and foremost a human endeavour. Yet we still know little (from a research perspective) about the human drivers, intentions, and barriers to producing good software. In this talk, Dr. Singer examines six aspects of the human side of software engineering. She looks at "the big picture," process, teamwork, communication, personality, and cognition. Each aspect is illustrated via video clips of interviews with software engineers.

Speaker biography:
Janice Singer is a Senior Research Officer in the Software Engineering Group of the National Research Council Canada (NRC). Dr. Singer's research lies at the boundaries of Human Computer Interaction, Computer Supported Cooperative Work, and Software Engineering. In these areas, Dr. Singer has conducted numerous qualitative and quantitative studies. Additionally, Dr. Singer advises others on appropriate study design. Dr. Singer is also an expert on research ethics and their application in software engineering contexts. She recently co-edited "Guide to Advanced Empirical Software Engineering." Dr. Singer received her Ph.D. in Cognition and Learning from the Learning Research and Development Center of the University of Pittsburgh. Before coming to the NRC, she worked for Tektronix, IBM, and Xerox PARC.

Multicore programming models and their implementation challenges
Prof. Vivek Sarkar   On:  4-Nov-2008 10:00 AM - 11:30 AM
E.D. Butcher Professor of Computer Science   At:  Watson Research Center (Hawthorne), Room GN-F15
Rice University   Host:  Vijay Saraswat

Abstract:
The computer industry is at a major inflection point in its hardware roadmap due to the end of a decades-long trend of exponentially increasing clock frequencies. It is widely agreed that spatial parallelism in the form of multiple power-efficient cores must be exploited to compensate for this lack of frequency scaling. Unlike previous generations of hardware evolution, this shift towards multicore and manycore computing will have a profound impact on software. These software challenges are further compounded by the need to enable parallelism in workloads and application domains that have traditionally not had to worry about multiprocessor parallelism. In this talk, we will focus on the programming problem for tightly coupled homogeneous and heterogeneous multicore processors. We present early experiences with the new Habanero Multicore Software Research project at Rice University (http://habanero.rice.edu) that encompasses work on programming models, compilers, runtimes, and concurrency libraries so as to enable portable software that can run unchanged on a range of homogeneous and heterogeneous multicore systems. The Habanero project takes a two-level approach to programming models, with a high-level model based on Intel Concurrent Collections for parallelism-oblivious domain experts, and a lower-level model based on the high productivity X10 language for parallelism-aware developers. We discuss compiler and runtime implementation challenges that must be overcome to enable mainstream applications to use these models on multicore systems. Some of these challenges are being addressed in collaboration with IBM Research as part of the Multicore Open Collaborative Research program.

Speaker biography:
Vivek Sarkar conducts research in programming languages, program analysis, compiler optimizations and virtual machines for parallel and high performance computer systems, and currently leads the Habanero Multicore Software Research project at Rice University (www.habanero.rice.edu). Prior to joining Rice, he was Senior Manager of Programming Technologies at IBM Research. His responsibilities at IBM included leading IBM's research efforts in programming models, tools, and productivity in the PERCS project during 2002-2007 as part of the DARPA High Productivity Computing System program. His past projects include the X10 programming language, the Jikes Research Virtual Machine for the Java language, the ASTI optimizer used in IBM's XL Fortran product compilers, the PTRAN automatic parallelization system, and profile-directed partitioning and scheduling of Sisal programs. Vivek became a member of the IBM Academy of Technology in 1995, an ACM Distinguished Scientist in 2006, and the E.D. Butcher Professor of Computer Science at Rice University in 2007. He holds a B.Tech. degree from the Indian Institute of Technology, Kanpur, an M.S. degree from University of Wisconsin-Madison, and a Ph.D. from Stanford University. In 1997, he was on sabbatical as a visiting associate professor at MIT, where he was a founding member of the MIT RAW multicore project.

Parallel Scheduling, Theory and Practice
Prof. Guy Blelloch    On:  3-Nov-2008 10:00 AM - 11:30 AM
CMU   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Vijay Saraswat

Abstract:
With the many levels of parallelism available on modern systems and the need for high-level parallel programming languages, developing effective dynamic scheduling algorithms is likely to be one of the biggest computational challenges of the next decade. Although different schedules might execute the exact same tasks, the schedule can have a huge effect on performance due to differences in overheads, locality, pipeline effectiveness, and memory latency. In this talk I will review the theoretical results from the past 20 years on dynamic schedulers and how well these results transfer to practice. This will include a review of work stealing scheduling, parallel depth-first (PDF) scheduling, and various hybrid approaches. I will then discuss some recent results on scheduling approaches that take advantage of both shared and distributed caches, as are present on multicore systems. Finally I will discuss some ongoing work on scheduling in the context of the X10 programming language, and some challenges for the future.

Engineering for Fine Grained Parallelism
Prof. Doug Lea   On:  3-Nov-2008 02:00 PM - 03:30 PM
SUNY Oswego   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Vijay Saraswat

Abstract:
Lightweight work-stealing frameworks are among the most promising means of exploiting multicores and multiprocessors. This talk will survey the basic approach, and discuss some of the challenges faced in engineering and extending functionality to capture a broader range of "everyday" parallel applications.

Chameleon: Adaptive Online Selection of Collections
Ohad Shaham   On:  31-Oct-2008 10:00 AM - 11:30 AM
Tel Aviv University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
Languages such as Java and C#, as well as scripting languages like Python and Ruby, make extensive use of Collection classes. Different collection implementations have different characteristics in terms of the time required to perform certain operations, space utilization, and synchronization. Making an optimal choice of a collection implementation in a particular program usage is a nontrivial task. Choosing the right collection is even more challenging in the presence of concurrency due to the wide range of alternative implementations and their subtle differences. We present Chameleon, a tool that relieves the programmer of the burden of choosing an appropriate collection implementation for a particular usage point in the program. Chameleon collects on-the-fly information on how collections are used within a specific context, and automatically chooses the most suitable collection implementation available for that context. We have implemented our tool on top of the Java libraries provided by IBM's production JVM J9. We have also implemented a version that works on top of QVM. Using conceptually simple QVM extensions we are able to significantly improve the performance of the tool. We have evaluated our tool over a small set of benchmarks. We show that Chameleon's choice of a collection is at least as good as the choice made by a typical Java programmer. (This is joint work with Martin Vechev and Eran Yahav.)

Inferring Synchronization under Limited Observability
Greta Yorsh   On:  27-Oct-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
This paper addresses the problem of automatically inferring synchronization for concurrent programs. Given a program and a specification, we infer synchronization that avoids all interleavings violating the specification, but permits as many valid interleavings as possible. We let the user specify an upper bound on the cost of synchronization, which may limit the observability - what observations on program state can be made by the synchronization code. We present an algorithm that infers, under certain conditions, the maximally permissive synchronization for a given cost. We implemented a prototype of our approach and applied it to infer synchronization in a number of small programs. (joint work with Martin Vechev and Eran Yahav)

BROWSER WARS 2: Browser Competition And The Evolution Of The Open Web Platform
Rob O'Callahan   On:  24-Oct-2008 03:30 PM - 05:00 PM
Mozilla   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   David Bacon

Abstract:
Driven by a desire for influence over the Internet, Microsoft, Apple, Google and Mozilla are increasing their investments in Web browser development. This is having a positive effect on the design and implementation of Web standards to ease application development and dramatically broaden the scope of what Web apps can do. One key area of competition is JavaScript performance; Apple, Google and Mozilla have all made dramatic improvements using a variety of compilation techniques. Other new features for the open Web include "worker thread" parallelism, support for running Web applications while offline, and lots of visual "bling". I'll talk about what's coming in the current and next generation of browsers, and how and why we're doing it. I'll also discuss the barriers and challenges that we're grappling with, especially in the area of security.

The birth of a Manycore Virtual Machine; A Renaissance update
David Ungar and Sam Adams    On:  24-Oct-2008 01:30 PM - 03:00 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Erik Altman

Abstract:
The Renaissance ER project began earlier this year with the twin goals of discovering breakthrough performance and programmability for future massively multicore or "manycore" processors. The first six months of the project have seen excellent progress towards those goals with the development of the first manycore virtual machine operating on 56 cores of a Tilera64 manycore processor. This talk will cover the development of the virtual machine to date along with the coevolution of measurement and visualization strategies for manycore VM development. Early manycore enhancements to the programming and debugging environment will also be demonstrated.

Speaker biography:
Sam's bio: Sam Adams works in the Programming Models and Tools group in Software Research and was one of IBM's first Distinguished Engineers. In his 15 years at IBM, Sam has helped found the first Object Technology Practice in the IBM Consulting Group, developed the first large scale object reuse library in IBM, pioneered Self-Configuring Systems which foreshadowed Autonomic Computing, co-authored the XML Technical Strategy for IBM that set our direction into Web Services, coined the term Service Oriented Architectures and the Publish-Find-Bind model, explored Artificial General Intelligence in the Joshua Blue project, pioneered End User Programming for Web Services and mashup technology, and co-led Research's Enterprise 2.0 initiative, exploring Web 2.0 in the enterprise. Since mid-2007 he has been focused on discovering new programming models for massively multicore systems and is a Principal Investigator on the Renaissance Exploratory Research project.

Tax-and-Spend: Democratic Scheduling for Real-time Garbage Collection
David P. Grove    On:  14-Oct-2008 01:30 PM - 02:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
This talk is based on an EMSOFT'08 paper by Joshua Auerbach, David F. Bacon, Ben Biron, Perry Cheng, Charlie Gracie, David Grove, Bill McCloskey, Aleks Micic, and Ryan Sciampacone. Real-time Garbage Collection (RTGC) has recently advanced to the point where it is being used in production for financial trading, military command-and-control, and telecommunications. However, among potential users of RTGC, there is enormous diversity in both application requirements and deployment environments. Previously described RTGCs tend to work well in a narrow band of possible environments, leading to fragile systems and limiting adoption of real-time garbage collection technology. This paper introduces a collector scheduling methodology called tax-and-spend and the collector design revisions needed to support it. Tax-and-spend provides a general mechanism which works well across a variety of application, machine, and operating system configurations. Tax-and-spend subsumes the predominant pre-existing RTGC scheduling techniques. It allows different policies to be applied in different contexts depending on the needs of the application. Virtual machines can co-exist compositionally on a single machine. We describe the implementation of our system, Metronome-TS, as an extension of the Metronome collector in IBM's Real-time J9 virtual machine product, and we evaluate it running on an 8-way SMP blade with a real-time Linux kernel. Compared to the state-of-the-art Metronome system on which it is based, implemented in the identical infrastructure, it achieves almost 3x shorter latencies, comparable utilization at a 2.5x shorter time window, and mean throughput improvements of 10-20%.

Speaker biography:
My primary research interests are in programming language design and implementation. I've done work in the analysis and optimization of object-oriented languages, virtual machine design and implementation, JIT compilation, online feedback-directed optimization, and garbage collection. Most of my current research is done in the context of the Metronome Project, which is making Java suitable for use in implementing large scale real-time systems. The Metronome garbage collector is now available as part of IBM's WebSphere Real Time product and TuningFork is available as an alphaworks download. For more details, see the project software page. I am a member of the Dynamic Optimization Group, which developed the optimizing compiler and adaptive optimization system for the Jalapeno virtual machine. In 2001, Jalapeno became Jikes RVM and was released as an open source project. Since 2001, I have been a member of the Jikes RVM core team and steering committee. I received my Ph.D. in Computer Science in October, 1998 from the University of Washington's Department of Computer Science and Engineering. While at UW, I was a member of the Cecil/Vortex project.

Behavioural Model Fusion: Merge, Composition and Verification
Shiva Nejati   On:  13-Oct-2008 09:45 AM - 11:00 AM
University of Toronto   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Amit Paradkar

Abstract:
There is a rapidly growing interest in model-based development as a way to increase the level of abstraction and automation in software engineering. The ultimate goal of model-based development is to improve the software process by promoting the use of models as the primary artifacts of development, and to provide computer-supported technologies to transform models into running systems. Model-based development becomes particularly challenging in projects where developers have to handle multiple partial models of a system. Individual models may represent different system features, describe alternative perspectives on a single feature, or express ways in which features alter one another's structure or behaviour. We refer to the process of integrating a collection of partial models into a whole system as "model fusion". In this talk, I present my work on fusion of behavioural software models. In particular, I focus on the following two problems: (1) merging variant models of individual features with the goal of simplifying system maintenance, and (2) composing models of different features with the goal of identifying and resolving their undesirable interactions. I explain the theory behind the work, and demonstrate how our techniques can be applied for management and analysis of models from a telecommunication domain.

Speaker biography:
Shiva is a Ph.D. candidate in the Computer Science department of University of Toronto where she recently defended her thesis. Her thesis adviser is Prof. Marsha Chechik. She is a co-author of the paper on "Matching and Merging Statechart Specifications" which received one of the best paper awards at ICSE'2007.

Constrained Types for Object-Oriented Languages
Nate Nystrom   On:  13-Oct-2008 11:00 AM - 12:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
X10 is a modern object-oriented language designed for productivity and performance in concurrent and distributed systems. In this setting, dependent types offer significant opportunities for detecting design errors statically, documenting design decisions, eliminating costly run-time checks (e.g., for array bounds, null values), and improving the quality of generated code. We present the design and implementation of constrained types, a natural, simple, clean, and expressive extension to object-oriented programming: A type C{c} names a class or interface C and a constraint c on the immutable state of C and in-scope final variables. Constraints may also be associated with class definitions (representing class invariants) and with method and constructor definitions (representing preconditions). Dynamic casting is permitted. The system is parametric on the underlying constraint system: the compiler supports a simple equality-based constraint system but, in addition, supports extension with new constraint systems using compiler plugins.

The Java Module System, and its Problems
Rok Strnisa   On:  6-Oct-2008 10:00 AM - 11:30 AM
Cambridge, UK   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
The talk will outline the key ideas behind the upcoming Java Module System, the module system proposed for Java 7. The talk will also describe some of the problems with the proposal that our research has identified, and outline our ideas for removing them.

The new importance of language features in raising the abstraction bar in software engineering
Prof. Judith Bishop    On:  1-Oct-2008 10:00 AM - 11:30 AM
University of Pretoria, South Africa   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Satish Chandra

Abstract:
In the context of software engineering, abstraction is the means by which developers move from layer to layer in the realization of the solution to a large problem. For more than a decade, programming languages in wide industrial use have been at a fixed point, defined by Java and C++. This talk reports on the relevance of new language features in C# 3.0 and their applicability for raising the level of abstraction at which ordinary programmers can operate. These features include delegates, properties, extension methods and lambda expressions, most of which are not (yet) in the other major languages, as well as the finer details of reflection, generics and iterators. As a corpus for the investigation, we have used the classic design patterns. From their very inception, there was the anticipation that some of them would be superseded by new language features. Yet published implementations of classic patterns do not generally live up to this promise. Examples of new approaches to pattern implementations using the new language features will be given. A look at the efficiency of some alternative implementations will be presented, highlighting the visitor pattern. Finally we see how the concepts embodied in the new C# 3.0 features can be incorporated into Java by means of a small set of generic classes.

Speaker biography:
Judith Bishop is a professor of Computer Science at the University of Pretoria, South Africa. She previously worked at the universities of the Witwatersrand and Southampton, where she received her PhD in 1977, working on compilers for structured architectures. Her research interests are on the principles of adaptive software in a multi-lingual and mobile environment, in collaboration with Microsoft Research, local companies and collaborators in Germany and Italy. Her 14 books on programming and languages have been translated into six languages and are read worldwide. Judith is a proud South African with a visible presence abroad, serving on international editorial, programme and award committees, and organizing conferences and summer schools in South Africa aimed at keeping postgraduates involved in cutting edge research. On sabbaticals she has worked at Microsoft Research Cambridge, the SEI in Pittsburgh and at the Universities of Victoria, Karlsruhe, Berlin and Milan. She was elected chair of IFIP's working group on Software Implementation Technology for two terms, and is now chair of the World Computer Congress in 2008. In 2010, she will be bringing the ACM/IEEE International Conference on Software Engineering to Cape Town. She has received many awards for excellence in and service to computer science, and most recently was elected a Fellow of the Royal Society of South Africa. In 2008, she received the UP Leading Mind Medal in the University's Centenary Year.

Instrumentation made Easy: An Introduction to Tracematches
Pavel Avgustinov   On:  29-Sep-2008 10:00 AM - 11:30 AM
Oxford University   At:  Watson Research Center (Hawthorne), Room GN-K35
   Host:  Frank Tip

Abstract:
It is very common in the area of fault-detection to instrument a base program with additional code that either detects or corrects errors. Conventionally, such instrumentation is either written by hand or in some form of the aspect-oriented programming paradigm, but both of these approaches tend to be error-prone and costly in terms of development effort. Tracematches are one of the more mature incarnations of "trace monitoring". They allow the developer to write a short, declarative specification of their concern, and generate efficient (and provably correct) instrumentation that checks the specification. This talk will give a basic introduction to tracematches, discuss some of the performance implications and look at their applicability to common problems.

Lowering the Bar for Precise Pointer Analysis
Ben Hardekopf    On:  22-Sep-2008 10:00 AM - 11:30 AM
UT Austin   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Stephen Fink

Abstract:
Pointer information is a fundamental prerequisite for many program analyses, including hot topics such as program verification, program comprehension, security analysis, and many more. This talk presents a set of new, highly-scalable pointer analysis algorithms that target two different levels of precision: flow-insensitive, inclusion-based pointer analysis (i.e., Andersen-style analysis) and flow-sensitive pointer analysis. The main part of the talk shows how to significantly improve the scalability of Andersen-style analysis by targeting two different notions of equivalence: pointer equivalence and location equivalence. These techniques yield an Andersen-style analysis that is over 4x faster and uses over 7x less memory than the previous state of the art. The remainder of the talk briefly covers the major challenges of flow-sensitive pointer analysis and details our approach to meeting those challenges. We report on the results of the first stage of our research, which yields a flow-sensitive algorithm that is 197x faster than the previous state of the art. We also give a glimpse into the next stage of our research, which is currently under development. This work promises to increase the scalability of flow-sensitive pointer analysis even further, perhaps even to millions of lines of code.
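
For readers unfamiliar with inclusion-based (Andersen-style) analysis, the following is a minimal sketch of the baseline idea only. It is a naive fixpoint iteration, not the optimized algorithms presented in the talk, and it uses none of the pointer-equivalence or location-equivalence techniques. The constraint encoding (tuples tagged "addr", "copy", "load", "store") and the function name andersen are hypothetical choices made for illustration.

```python
from collections import defaultdict

def andersen(constraints):
    """Naive flow-insensitive, inclusion-based points-to analysis.

    Each constraint is a (kind, lhs, rhs) tuple (hypothetical encoding):
      ("addr",  p, a)  models  p = &a   so  a is in pts(p)
      ("copy",  p, q)  models  p = q    so  pts(q) is a subset of pts(p)
      ("load",  p, q)  models  p = *q   so  pts(v) is a subset of pts(p) for v in pts(q)
      ("store", p, q)  models  *p = q   so  pts(q) is a subset of pts(v) for v in pts(p)
    """
    pts = defaultdict(set)
    changed = True
    while changed:  # repeat until no points-to set grows
        changed = False
        for kind, lhs, rhs in constraints:
            if kind == "addr":
                updates = [(lhs, {rhs})]
            elif kind == "copy":
                updates = [(lhs, pts[rhs])]
            elif kind == "load":
                updates = [(lhs, pts[v]) for v in list(pts[rhs])]
            elif kind == "store":
                updates = [(v, pts[rhs]) for v in list(pts[lhs])]
            else:
                raise ValueError(f"unknown constraint kind: {kind}")
            for target, incoming in updates:
                before = len(pts[target])
                pts[target] |= incoming
                if len(pts[target]) != before:
                    changed = True
    # Report only variables with a non-empty points-to set.
    return {var: set(s) for var, s in pts.items() if s}

# p = &a; q = &b; r = p; *r = q; s = *p
example = [
    ("addr", "p", "a"),
    ("addr", "q", "b"),
    ("copy", "r", "p"),
    ("store", "r", "q"),
    ("load", "s", "p"),
]
result = andersen(example)
```

Because the analysis is flow-insensitive, the order of the constraints does not affect the final result, only the number of passes needed to reach the fixpoint.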

Speaker biography:
Ben Hardekopf received a BS in Computer Science and BSE in Electrical Engineering from Duke University in 1997. He received a Masters in Computer Science from SUNY at Utica/Rome in 2000 while serving as an active duty officer in the United States Air Force. He is currently in the Ph.D. program at The University of Texas at Austin and expects to graduate in May of 2009. His advisor is Calvin Lin.

Message-Passing Concurrency Models for Writing Parallel Programs
Prof. Martin Sulzmann   On:  19-Sep-2008 10:00 AM - 11:30 AM
ITU Copenhagen   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  John Field

Abstract:
In order to benefit from the additional performance (cores) of multi-core architectures, we have to write concurrent programs which can then be executed in parallel on these platforms. We review message-passing concurrency a la Erlang and Join calculus and illustrate its use via several programming examples. We report on our own recent work where we apply ideas from multi-set constraint rewriting to describe richer coordination patterns among messages and to support the parallel execution of Join patterns.

Speaker biography:
Martin Sulzmann, PhD, graduated from Yale University in 2000. He is currently an Associate Professor at the IT University of Copenhagen in the Programming, Logics and Semantics group. He held previous positions at the National University of Singapore and the University of Melbourne. His research interests are centered around programming language design, implementation and analysis, mainly of functional and constraint logic languages.

Efficient Sparse Matrix Vector Multiplication on GPUs
Muthu Baskaran   On:  18-Sep-2008 10:00 AM - 11:30 AM
Ohio State University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:   Rajesh Bordawekar

Abstract:
We are witnessing the emergence of graphics processing units (GPUs) as powerful massively parallel systems. Furthermore, the introduction of new APIs for general-purpose computations on GPUs, namely CUDA from NVIDIA and CTM from ATI, makes GPUs an attractive choice for high-performance numerical and scientific computing. Sparse matrix-vector multiply (SpMV) is one of the most important and heavily used kernels in scientific computing. However, with indirect and irregular memory accesses resulting in more memory accesses per floating point operation, optimization of the SpMV kernel is a significant challenge in any architecture. In this work, we examine the various challenges in developing a high-performance SpMV kernel on NVIDIA GPUs using CUDA and devise techniques to address them. We develop techniques that optimize memory accesses in the memory-bound SpMV kernel by (1) effectively utilizing the various (low-latency) memories available in GPUs, (2) extracting the optimal memory access pattern pertaining to the type of memory, and (3) reordering computation to exploit data reuse. We develop an optional inspector-executor module to preprocess and analyze the non-zero pattern in sparse matrices to guide the optimizations. We demonstrate the performance improvements achieved by our approach compared to existing state-of-the-art parallel SpMV implementations on GPUs.
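To make the "indirect and irregular memory accesses" concrete, here is a plain sequential Python sketch of SpMV. It assumes the compressed sparse row (CSR) storage format, which the abstract does not name, and it is not the GPU kernel described in the talk; the function name `spmv_csr` is hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        total = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read through col_idx: an indirect, data-dependent access.
            total += values[k] * x[col_idx[k]]
        y[row] = total
    return y

# The 3x3 matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

Each multiply-add needs three loads (`values[k]`, `col_idx[k]`, and `x[col_idx[k]]`), which is why the kernel tends to be memory-bound.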

Experience with META / xWB -- Generating Modeling Workbenches
Joel Ossher    On:  12-Sep-2008 10:00 AM - 11:30 AM
UC Irvine, IBM Research summer intern   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Doug Kimelman

Abstract:
Metaworkbenches have taken strong hold in the enterprise architecture domain (e.g. Telelogic System Architect) and in the embedded systems domain (e.g. MetaCase). Feedback from IBM Rational customers (e.g. Pitney Bowes, Unisys, HSBC) indicates a strong desire for metaworkbenches in the mainstream requirements and IT architecture domain as well. Analysts and architects want tools to conform to their way of thinking about and conceiving of systems, rather than vice versa. We describe a summer project in which the metaworkbench technology underlying Architects' Workbench was used to define a workbench for Pitney Bowes, and to deploy it into early production use. The Pitney Bowes Workbench ("PBWB") is aimed at requirements analysts and architects following an approach inspired by the Gottesdiener multi-modeling work, and in the future will be expanded to incorporate the Rozanski and Woods Viewpoints and Perspectives approach to architecture as well as SEI Quality Attribute Scenarios. We will show the workbench live, we will show snippets of its metamodel and modeling interface definition, and we will present a few initial architectural principles we have derived concerning workbench definition. Back in the day, when everyone made up their own language, there was YACC. For an age where everyone wants a workbench for the metamodel of their choice...... Discussion to follow :-) Joint work with: Doug Kimelman, Ian Simmonds

Static Specification Mining Using Automata-Based Abstractions
Eran Yahav   On:  10-Sep-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room  GN-F15
   

Abstract:
This talk is based on an ISSTA 2007 paper by Sharon Shoham, Eran Yahav, Steve Fink, and Marco Pistoia. The paper won a best paper award at ISSTA, and won an IBM Research Pat Goldberg best paper award. We present a novel approach to client-side mining of temporal API specifications based on static analysis. Specifically, we present an interprocedural analysis over a combined domain that abstracts both aliasing and event sequences for individual objects. The analysis uses a new family of automata-based abstractions to represent unbounded event sequences, designed to disambiguate distinct usage patterns and merge similar usage patterns. Additionally, our approach includes an algorithm that summarizes abstract traces based on automata clusters, and effectively rules out spurious behaviors. We show experimental results mining specifications from a number of Java clients and APIs. The results indicate that effective static analysis for client-side mining requires fairly precise treatment of aliasing and abstract event sequences. Based on the results, we conclude that static client-side specification mining shows promise as a complement or alternative to dynamic approaches.

Testing and Verification with Aspects
Prof. Shmuel Katz    On:  21-Aug-2008 10:00 AM - 11:30 AM
Computer Science Department, The Technion, Israel   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Harold Ossher

Abstract:
Aspects can be used to help in testing, debugging, and verifying systems, as well as in adding other types of functionality in an evolving system or for a product line. This talk will demonstrate typical uses of aspects for testing and verification of Object-Oriented systems. It will also consider the threats to reliability introduced by aspects themselves, and how verification techniques can be extended to treat systems with aspects, including their specification, modular verification, and analysis of possible interferences among aspects. A new Eclipse framework of analysis tools for aspect systems called the CAPE will be described, as a first step towards their integration into a development process for systems and product lines that use aspects.

Speaker biography:
Shmuel Katz received his Ph.D. in Computer Science from the Weizmann Institute of Science in 1976. He is a Professor in the Computer Science Department at the Technion, and founded the Systems and Software Development Laboratory there. His research interests include formal specification, verification methods, and software engineering, with over 70 papers in these areas. In recent years his work has centered on formal methods and design for aspect-oriented software development. He heads the Formal Methods Laboratory of the EU Network of Excellence on Aspect-Oriented Software Development, coordinating work on testing and formal methods for aspects.

The Clojure Programming Language
Rich Hickey   On:  14-Aug-2008 10:00 AM - 11:30 AM
Independent Software Designer   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Martin Hirzel

Abstract:
Customers and stakeholders have substantial investments in, and are comfortable with the performance, security and stability of, industry-standard platforms like the JVM and CLR. While Java and C# developers on those platforms may envy the succinctness, flexibility and productivity of dynamic languages, they have concerns about running on customer-approved infrastructure, access to their existing code base and libraries, and performance. In addition, they face ongoing problems dealing with concurrency using native threads and locking. Clojure is an effort in pragmatic dynamic language design in this context. It endeavors to be a general-purpose language suitable in those areas where Java is suitable. It reflects the reality that, for the concurrent programming future, pervasive, unmoderated mutation simply has to go. Clojure meets its goals by: embracing an industry-standard, open platform - the JVM; modernizing a venerable language - Lisp; fostering functional programming with immutable persistent data structures; and providing built-in concurrency support via software transactional memory and asynchronous agents. The result is robust, practical, and fast. This talk will focus on the motivations, mechanisms and experiences of the implementation of Clojure.

Speaker biography:
Rich Hickey, the author of Clojure, is an independent software designer, consultant and application architect with over 20 years of experience in all facets of software development. Rich has worked on scheduling systems, broadcast automation, audio analysis and fingerprinting, database design, yield management, exit poll systems, and machine listening, in a variety of languages.

Research to Product: Building a Billion Dollar Garbage Collector
David Bacon    On:  5-Aug-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room GN-F15
   

Abstract:
In this talk I will describe how a small long-term research project focusing on the fringes of a commoditized portion of the software industry turned into the key component of IBM's entry into a new business area bringing in hundreds of millions of dollars in revenue. I'll describe the genesis of the project and how we overcame different kinds of obstacles to bring the work to fruition: technical, organizational, competitive, and cultural. I'll also describe the key technical features and organizing research principles that led to the success of the project. Joint work with Joshua Auerbach, Perry Cheng, David Grove, and V.T. Rajan. (A version of this talk was presented as the keynote at ISMM'08.)

Program Analysis for Web Application Security
Prof. Zhendong Su   On:  4-Aug-2008 10:00 AM - 11:30 AM
UC Davis   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Marco Pistoia

Abstract:
Web applications enable much of today's online business including banking, shopping, university admissions, and various governmental activities. Anyone with a web browser can access them, and the data they manage typically has significant value both to the users and to the service providers. Thus, they are increasingly becoming targets of attacks; since 2005, the most frequently reported classes of vulnerabilities are for web applications. These vulnerabilities arise because web applications' layers (client, server, and database) communicate via unstructured strings, and validating untrusted input is error-prone and introduces a challenging software engineering problem. In this talk, I will present a simple, yet general characterization of input validation-related attacks and a set of dynamic and static techniques to detect and prevent such attacks. I will also present empirical results to demonstrate that the techniques can prevent real-world attacks and detect previously unknown vulnerabilities in large web applications. I will conclude this talk by discussing some future challenges in this domain.

Speaker biography:
Zhendong Su is an Associate Professor in Computer Science at the University of California, Davis. His research interests span programming languages, software engineering, and computer security, focusing on developing practical techniques and tools for improving software reliability and security. He received his M.S. and Ph.D. degrees in Computer Science from UC Berkeley, and his B.S. degree in Computer Science and B.A. degree in Mathematics from UT Austin. He is the recipient of a Best Paper Award from the European Association for Programming Languages and Systems, an ACM SIGSOFT Distinguished Paper Award, an NSF CAREER Award, and a College of Engineering Outstanding Junior Faculty Award at UC Davis.

A Report on a Survey and Study of Static Analysis Users
Nat Ayewah    On:  31-Jul-2008 10:00 AM - 11:30 AM
University of Maryland   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Martin Hirzel

Abstract:
As static analysis tools mature and attract more users, vendors and researchers have an increased interest in understanding how users interact with them, and how they impact the software development process. The FindBugs project has conducted a number of studies including online surveys, interviews and a preliminary controlled user study to better understand the practices, experiences and needs of its users. Through these studies we have learned that many users are interested in even low priority warnings, and some organizations are building custom solutions to more seamlessly and automatically integrate FindBugs into their software processes. We've also observed that developers can make decisions about the accuracy and severity of warnings fairly quickly and independent reviewers will generally reach the same conclusions about warnings.

Improving developer documentation with active guides
Barthélémy Dagenais   On:  24-Jul-2008 01:00 PM - 02:30 PM
McGill University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Harold Ossher

Abstract:
Learning how to use or extend a large framework is difficult and time-consuming. Framework developers often create recipes in the form of cookbooks and tutorials to help, but that process is time-consuming and error-prone in itself, both for the documentation authors and the users. Our intuition (back in 2006) was that by shifting the focus of the developers from a recipe to a concern -- the key elements of the framework that are involved in an extension task -- we could improve both the creation and the usage of framework documentation. Following this intuition, we built two related tools, integrated with Eclipse. Mismar supports concern-oriented guides, which are quick and easy for developers to create and which actively assist users in performing the tasks they describe. XFinder automatically locates in a code base implementation examples of Mismar guides, as an added aid to users following the guides. In this talk I will describe and demonstrate both Mismar and XFinder, and present the results of a validation study of XFinder that we performed with six Eclipse committers. I will conclude by describing briefly some new work on concern-oriented documentation that we are just beginning.

Speaker biography:
Barthélémy Dagenais is a student at McGill University in Montréal and a part-time researcher at IBM T.J. Watson Research Center (through IBM Canada). He is currently completing his Master's thesis under the supervision of Prof. Martin Robillard on the dual topic of framework evolution and static analysis of partial programs. At IBM, he works with Harold Ossher on easing the creation and usage of framework documentation through a concern-oriented toolset. Besides programming and writing papers, he can be found practicing Kung Fu. He will start his PhD at McGill in Fall 2008.

We have it easy, but do we have it right?
Prof. Amer Diwan   On:  23-Jul-2008 10:00 AM - 11:30 AM
University of Colorado at Boulder (http://www-plan.cs.colorado.edu/diwan/)   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Peter Sweeney

Abstract:
To evaluate an innovation in computer systems, performance analysts measure execution time or other metrics using one or more standard workloads. The performance analyst may carefully minimize the amount of measurement instrumentation, control the environment in which measurement takes place, and repeat each measurement multiple times. Finally, the performance analyst may use statistical techniques to characterize the data. Unfortunately, even with such a responsible approach, the collected data may be misleading. This talk shows how easy it is to produce poor (and thus misleading) data for computer systems due to observer effect and measurement bias. Observer effect occurs if data collection perturbs the behavior of the system. Measurement bias occurs when a particular environment in which the measurement takes place favors some configurations over others. This talk demonstrates that observer effect and measurement bias have significant impact on performance and can lead to incorrect conclusions. These effects are large enough to easily mislead a performance analyst.

The Dataflow Interchange Format: A Language and Environment for Experimenting with DSP-Oriented
Prof. Shuvra Bhattacharyya   On:  22-Jul-2008 03:30 PM - 05:00 PM
UMD   At:  Watson Research Center (Hawthorne), Room 2NF28
   Host:  Rodric Rabbah

Abstract:
This talk provides an overview of the dataflow interchange format (DIF) project at the University of Maryland. DIF is a textual language for specifying mixed-grain dataflow representations of digital signal processing (DSP) applications. A major emphasis in DIF is support for working with and integrating different kinds of specialized dataflow models of computation and their associated analysis techniques. One way that DIF achieves this is by allowing designers to specify subgraphs of a design in terms of specific dataflow modeling techniques, such as synchronous, cyclo-static, and parameterized dataflow, through corresponding keywords in the language. DIF also incorporates a new dataflow model of computation called enable-invoke dataflow, which is geared towards high expressive power, functional simulation, rapid prototyping, and efficient refinement into more specialized dataflow models.

Speaker biography:
Shuvra S. Bhattacharyya is a Professor in the Department of Electrical and Computer Engineering, University of Maryland at College Park. He holds a joint appointment in the University of Maryland Institute for Advanced Computer Studies (UMIACS), and an affiliate appointment in the Department of Computer Science. Dr. Bhattacharyya is coauthor or coeditor of five books and the author or coauthor of more than 100 refereed technical articles. His research interests include design and implementation of signal processing systems; biomedical circuits and systems; embedded software; and hardware/software co-design. He received the B.S. degree from the University of Wisconsin at Madison, and the Ph.D. degree from the University of California at Berkeley. Dr. Bhattacharyya has held industrial positions as a Researcher at the Hitachi America Semiconductor Research Laboratory (San Jose, California), and Compiler Developer at Kuck & Associates (Champaign, Illinois).

Shape Analysis Overview
Prof. Mooly Sagiv    On:  22-Jul-2008 10:00 AM - 11:00 AM
Tel Aviv University   At:  Watson Research Center (Hawthorne), Room GN-F15
   Host:  Eran Yahav

Abstract:
Shape analysis concerns the problem of determining “shape invariants” for programs that perform destructive updating on dynamically allocated storage. One way to conservatively solve the shape analysis problem is to iteratively compute sets of shape graphs at program locations, which describe the shape invariants. I will give an overview of the method and its applications in program understanding, compiler optimizations, and program verification.

Automatically proving that non-blocking concurrent programs make progress
Alexey Gotsman   On:  17-Jul-2008 10:00 AM - 11:30 AM
Cambridge, UK   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
Modern programs are often designed such that certain events must eventually happen (e.g., termination of callbacks, releases of locks, etc). Examples can be found in all classes of software, ranging from device drivers to high-level banking software. With the increased use of non-blocking concurrency, the difficulty of proving these properties is exacerbated. This talk will describe a new method and a tool for proving progress properties of non-blocking concurrent programs.

Speaker biography:
Alexey Gotsman will soon be finishing his PhD at Cambridge University. His research interests are in the area of formal verification, particularly in logical foundations and practical tools for verifying concurrent software. During his PhD he has interned at Microsoft Research and Cadence Berkeley Labs.

Finding Bugs in Dynamic Web Applications
Shay Artzi   On:  16-Jul-2008 10:00 AM - 11:30 AM
MIT   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Frank Tip

Abstract:
Web script crashes and malformed dynamically-generated Web pages are common errors, and they seriously impact usability of Web applications. Current tools for Web-page validation cannot handle the dynamically-generated pages that are ubiquitous on today's Internet. In this work, we apply a dynamic test generation technique, based on combined concrete and symbolic execution, to the domain of dynamic Web applications. The technique generates tests automatically, and it minimizes the bug-inducing inputs to reduce duplication and to make the bug reports small and easy to understand and fix. Our tool Apollo implements the technique for PHP. Apollo generates test inputs for the Web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed a total of 214 bugs in 4 real PHP web applications.

Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
Rodric Rabbah   On:  14-Jul-2008 01:00 PM - 02:30 PM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
The paradigm shift in processor design from monolithic processors to multicore has renewed interest in programming models that facilitate parallelism. While multicores are here today, the future is likely to witness architectures that use reconfigurable fabrics (FPGAs) as coprocessors. FPGAs provide an unmatched ability to tailor their circuitry per application, leading to better performance at lower power. Unfortunately, the skills required to program FPGAs are beyond the expertise of skilled software programmers. This paper shows how to bridge the gap between programming software vs. hardware. We introduce Lime, a new Object-Oriented language that can be compiled for the JVM or into a synthesizable hardware description language. Lime extends Java with features that provide a way to carry OO concepts into efficient hardware. We detail an end-to-end system from the language down to hardware synthesis and demonstrate a Lime program running on both a conventional processor and in an FPGA.

A Trust Management Approach for Flexible Policy Management in Security-Typed Languages
Sruthi Bandhakavi   On:  10-Jul-2008 10:00 AM - 11:30 AM
University of Illinois, Urbana-Champaign   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Michael Burke

Abstract:
Early work on security-typed languages required that legal information flows be defined statically. More recently, techniques have been introduced that relax these assumptions and allow policies to change at run-time. For example, the Rx language uses a policy language based on RT, a trust management framework for representing authorization policies. While Rx made significant strides toward the goal of allowing policy updates in security-typed languages, in this talk we observe that certain design choices of Rx violate the privacy and autonomy requirements of principals in trust management systems, thus making decentralized control over information difficult. To address these problems, we propose RTI, a new security-typed language. In addition to avoiding prior pitfalls, RTI's most distinguishing characteristic is that it supports fine-grained specification of security for dynamic policy. We also provide a proof of noninterference for RTI.

Systematic Concurrency Testing using CHESS
Madan Musuvathi    On:  3-Jul-2008 10:00 AM - 11:30 AM
Microsoft Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Eran Yahav

Abstract:
People always identify concurrency testing with stress testing, even when their fundamental goals differ. Stress testing evaluates the program under load, while concurrency testing aims for better thread interleaving coverage. While it is true that stress indirectly increases the variety of thread interleavings, it is far from sufficient and has unpredictable results. Stories abound of the so-called "Heisenbugs" that rarely surface and are hard to reproduce. In this talk, I will argue for a first-class notion of concurrency testing and describe CHESS, a tool we have developed towards that end. A user of CHESS provides simple concurrency scenarios and CHESS uses model checking techniques to systematically enumerate all interleavings of these scenarios. CHESS employs various algorithms to reduce the search space, focus on potentially bug-yielding schedules, and provide sound quantifiable notions of coverage. Moreover, on finding an error CHESS has the capability to replay the erroneous interleaving, greatly simplifying the debugging process. CHESS has been integrated with various codebases inside Microsoft. It has found numerous bugs and helped reproduce many stress-test crashes. CHESS is also available for download at google("systematic concurrency testing").
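The idea of systematically enumerating interleavings, rather than hoping stress will expose them, can be illustrated with a toy Python sketch. It is not how CHESS works internally and uses no search-space reduction; the two-step "read then write" model of a counter increment and the names `interleavings` and `run` are assumptions made for illustration.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one schedule of (thread, op) steps on a shared counter."""
    counter = 0
    local = {}  # per-thread copy of the value it last read
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:  # "write": store the stale local value plus one
            counter = local[thread] + 1
    return counter

# Each thread performs a non-atomic increment: read, then write.
t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
schedules = list(interleavings(t1, t2))
buggy = [s for s in schedules if run(s) != 2]
```

Any schedule in `buggy` can be passed to `run` again to reproduce the lost update deterministically, which mirrors the replay capability mentioned in the abstract.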

Online Phase-Adaptive Data Layout Selection
Martin Hirzel   On:  1-Jul-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
This talk is based on an ECOOP'08 paper by Chengliang Zhang and Martin Hirzel. Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This talk introduces layout auditing, a technique that selects the best among a set of layouts online (while the program is running). Layout auditing randomly applies different layouts over time and observes their performance. As it becomes confident about which layout performs best, it selects that layout with higher probability. But if a phase shift causes a different layout to perform better, layout auditing learns the new best layout. We implemented our technique in a product Java virtual machine, using copying generational garbage collection to produce different layouts, and tested it on 20 long-running benchmarks and 4 hardware platforms. Given any combination of benchmark and platform, layout auditing consistently performs close to the best layout for that combination, without requiring offline training.
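The select-observe-adapt loop described above can be sketched in Python as follows. This is a simplified stand-in (an epsilon-greedy chooser with exponentially smoothed cost estimates), not the statistical procedure from the paper; the function `audit`, the parameters `explore` and `alpha`, and the synthetic `cost` function with a phase shift at round 200 are all assumptions made for illustration.

```python
import random

def audit(cost, layouts, rounds, rng, explore=0.1, alpha=0.3):
    """Choose one layout per round, favoring the lowest estimated cost."""
    est = {}      # smoothed cost estimate per layout
    history = []
    for t in range(rounds):
        untried = [name for name in layouts if name not in est]
        if untried:
            choice = untried[0]                # try every layout once
        elif rng.random() < explore:
            choice = rng.choice(layouts)       # occasional random re-check
        else:
            choice = min(layouts, key=lambda name: est[name])
        sample = cost(choice, t)
        if choice in est:
            # Smoothing lets old measurements fade after a phase shift.
            est[choice] += alpha * (sample - est[choice])
        else:
            est[choice] = sample
        history.append(choice)
    return history

def cost(layout, t):
    """Synthetic workload: layout A is best before round 200, B after."""
    if t < 200:
        return {"A": 1.0, "B": 2.0}[layout]
    return {"A": 3.0, "B": 2.0}[layout]

history = audit(cost, ["A", "B"], 400, random.Random(0))
```

In this sketch the chooser mostly picks layout A during the first phase and switches to layout B shortly after the shift, without any offline training.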

Universal Symbolic Execution and its Application to Likely Data Structure Invariant Generation
Yamini Kannan   On:  23-Jun-2008 10:00 AM - 11:30 AM
UC Berkeley   At:  Watson Research Center (Hawthorne), Room GN-K35
   Host:  Stephen Fink

Abstract:
Local data structure invariants are asserted over a bounded fragment of a data structure around a distinguished node M of the data structure. An example of such an invariant for a sorted doubly linked list is "for all nodes M of the list, if M != null and M.next != null, then M.next.prev = M and M.value <= M.next.value." It has been shown that such local invariants are both natural and sufficient for describing a large class of data structures. I will be presenting a technique, called Krystal, to infer likely local data structure invariants using a variant of symbolic execution, called universal symbolic execution. Universal symbolic execution is like traditional symbolic execution except that we create a fresh symbolic variable for every read of an lvalue that has no mapping in the symbolic state, rather than creating a symbolic variable only for inputs. This helps universal symbolic execution to symbolically track data flow for all memory locations along an execution even if input values do not flow directly into those memory locations. We have implemented our algorithm and applied it to several data structure implementations in Java. Our experimental results show that we can infer many interesting local invariants for these data structures.
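The example invariant quoted in the abstract can be written down directly as a check. The Python sketch below only tests the invariant on a concrete list; it does not infer it, and it involves no symbolic execution. The `Node` class and the helper names are hypothetical.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def build(values):
    """Build a doubly linked list from values and return its head."""
    head = prev = None
    for value in values:
        node = Node(value)
        if prev is None:
            head = node
        else:
            prev.next = node
            node.prev = prev
        prev = node
    return head

def local_invariant_holds(m):
    """If M and M.next exist: M.next.prev is M and M.value <= M.next.value."""
    if m is None or m.next is None:
        return True
    return m.next.prev is m and m.value <= m.next.value

def check_list(head):
    """Check the local invariant at every node M of the list."""
    node = head
    while node is not None:
        if not local_invariant_holds(node):
            return False
        node = node.next
    return True

sorted_ok = check_list(build([1, 2, 2, 5]))
unsorted_ok = check_list(build([3, 1]))
```

The check at each node looks only at the node and its immediate successor, which is what makes the invariant "local".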

Exploring Massively Multicore Programming Models: Introducing the Renaissance ER Project
Sam Adams and David Ungar   On:  4-Jun-2008 02:00 PM - 03:30 PM
   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Erik Altman

Abstract:
We as an industry have bet our collective future on multicore/manycore hardware architectures without knowing how to easily and efficiently write programs for these systems. The existing software stack has tightly co-evolved with the dramatic uni-processor performance improvements driven by the feature and frequency scaling described by Moore's Law. Without a radical re-examination of the programming models, development tools and runtime environments needed for these future systems, IBM stands in grave danger of losing its leadership in high performance computing and servers as well as losing the next generation of system and applications programmers to whatever successful models emerge from our competitors. The Renaissance Exploratory Research project is exploring new programming models for the massively multicore future. In this talk we will introduce the project's goals, philosophy and approach, including discussions about our initial experiences with the Tilera64 manycore processor. We will also present our ideas for a new class of virtual machine, a "multi-vm", discuss design issues and trade-offs, and demonstrate our first running virtual machine.

Speaker biography:
Sam's bio:
Sam Adams works in the Programming Models and Tools group in Software Research and was one of IBM's first Distinguished Engineers. In his 15 years at IBM, Sam has helped found the first Object Technology Practice in the IBM Consulting Group, developed the first large scale object reuse library in IBM, pioneered Self-Configuring Systems which foreshadowed Autonomic Computing, co-authored the XML Technical Strategy for IBM that set our direction into Web Services, coined the term Service Oriented Architectures and the Publish-Find-Bind model, explored Artificial General Intelligence in the Joshua Blue project, pioneered End User Programming for Web Services and mashup technology, and co-led Research's Enterprise 2.0 initiative, exploring Web 2.0 in the enterprise. Since mid-2007 he has been focused on discovering new programming models for massively multicore systems and is a Principal Investigator on the Renaissance Exploratory Research project.
David's bio:
David Ungar has long been fascinated by programming paradigms that can change the way people think, novel implementation techniques that make new languages feasible, and user interfaces that vanish. With Dr. Randall B. Smith at PARC, he designed a simple yet powerful prototype-based object-oriented programming language called "Self." As an Assistant Professor at Stanford, David and his students developed new compilation techniques and heap structures for pure object-oriented programming languages. Rejoining Dr. Smith at Sun Microsystems Laboratories, David co-led a project to create a complete programming environment for Self. The implementation techniques developed for Self have been harnessed for Sun's HotSpot Java™ Virtual Machine. David's Klein project explored metacircularity in pursuit of simpler, more malleable high-performance virtual machines and better development environments for them. David's doctoral research was performed at the University of California at Berkeley with David Patterson, and concerned the development of a RISC for Smalltalk. The dissertation was published by the MIT Press as an ACM Distinguished Dissertation. It introduced a fast automatic storage reclamation algorithm, Generation Scavenging, which has since influenced many production systems, and isolated those architectural features that significantly improved performance. David Ungar is an ACM Distinguished Engineer, and two of his papers have been recognized as having been among the most influential in their respective fields: one was the original paper on the Self language, the other was on the application of cartoon animation techniques to improve the legibility of user interfaces. Since 2007, David has been working in IBM Research, where he has added a facility for collaboration to Tuning Fork, and is now part of the Renaissance project, which has given him a bad case of Tilera fever.


Declarative Object Identity Using Relation Types
Mandana Vaziri   On:  27-May-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room  1S-F40
   

Abstract:
This talk is based on an ECOOP'07 paper by Mandana Vaziri, Frank Tip, Stephen Fink, and Julian Dolby. Object-oriented languages define the identity of an object to be an address-based object identifier. The programmer may customize the notion of object identity by overriding the equals() and hashCode() methods following a specified contract. This customization often introduces latent errors, since the contract is unenforced and at times impossible to satisfy, and its implementation requires tedious and error-prone boilerplate code. Relation types are a programming model in which object identity is defined declaratively, obviating the need for equals() and hashCode() methods. This entails a stricter contract: identity never changes during an execution. We formalize the model as an adaptation of Featherweight Java, and implement it by extending Java with relation types. Experiments on a set of Java programs show that the majority of classes that override equals() can be refactored into relation types, and that most of the remainder are buggy or fragile.
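The hazard described in this abstract can be reproduced in a few lines. The sketch below is a Python analogue, with `__eq__` and `__hash__` standing in for Java's equals() and hashCode(). The class names `MutablePoint` and `Point` are hypothetical, and the frozen dataclass only approximates the idea of declaratively defined identity that never changes during an execution. It is not the relation types extension from the paper.

```python
from dataclasses import dataclass, FrozenInstanceError


class MutablePoint:
    """Hand-written identity: equality and hash are derived from mutable fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, MutablePoint) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


@dataclass(frozen=True)
class Point:
    """Identity declared once through the field list; fields cannot be reassigned."""
    x: int
    y: int


def lost_after_mutation():
    """Return True if mutating a stored object makes the set lose track of it."""
    p = MutablePoint(1, 2)
    seen = {p}
    p.x = 10  # identity changes while the object sits in a hash-based set
    return p not in seen


def frozen_rejects_mutation():
    """Return True if the declarative version refuses to change its identity."""
    p = Point(1, 2)
    try:
        p.x = 10
    except FrozenInstanceError:
        return True
    return False
```

In the hand-written version nothing stops the fields behind the hash from changing, which is the kind of latent error the abstract attributes to an unenforced contract. The frozen version rules that out by construction.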

Static Deadlock Detection for the SHIM Concurrent Language
Nalini Vasudevan   On:  23-May-2008 10:00 AM - 11:30 AM
Columbia University   At:  Watson Research Center (Hawthorne), Room 1S-F40
   Host:  Olivier Tardieu

Abstract:
Concurrent programming languages are becoming mandatory with the advent of multi-core processors. Two major concerns in any concurrent program are data races and deadlocks. Both are potentially subtle bugs that can be caused by non-deterministic scheduling choices in most concurrent formalisms. As an alternative, the SHIM concurrent language guarantees the absence of data races by eschewing shared memory, but a SHIM program may still deadlock if it violates a communication protocol. We present a model-checking-based static deadlock detection technique for the SHIM language. Although SHIM is asynchronous, its semantics allow us to model it synchronously without losing precision, greatly reducing the state space that must be explored. This, plus the obvious division between control and data in SHIM programs, makes it easy to construct concise abstractions. Experimentally, we find our procedure runs in only a few seconds for modest-sized programs, making it practical to use as part of a compilation chain.

A Taste of Erlang
Bard Bloom   On:  21-May-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1S-F40
   

Abstract:
Sequentially, Erlang is a simple functional language of concrete data structures and pattern matching, not far from pure Scheme. With a few distributed constructs and a few good libraries, it's a slick and powerful tool for reliable distributed computing. It's getting popular: major companies are starting to use it for high-performance message services. Come see what it's about.

Satisfiability Modulo Theories
Prof. Clark Barrett   On:  19-May-2008 10:00 AM - 11:30 AM
Assistant Professor   At:  Watson Research Center (Hawthorne), Room GN-K35
NYU   Host:   Satish Chandra

Abstract:
The Satisfiability Modulo Theories (SMT) problem is that of checking the satisfiability of first-order formulas with respect to some logical theory T of interest. This talk will start with some motivation and applications of SMT. I will briefly review the theoretical framework for SMT and then discuss some of the main issues involved in creating a practical implementation, in particular the need for and implementation of fast Boolean reasoning. The talk will also cover some of our recent work such as the implementation of non-convex theories and strategies for quantifier instantiation in the context of SMT.

Speaker biography:
Clark Barrett received his bachelor's degree in Mathematics, Computer Science, and Electrical Engineering from Brigham Young University in 1995. He received his PhD from Stanford University under David Dill and joined the faculty of New York University in the Fall of 2002. Professor Barrett is the co-author of a number of SMT systems, including the Stanford Validity Checker (SVC), and its successor, the Cooperating Validity Checker (CVC). His current research includes work on the latest version of CVC, called CVC3, as well as applications of CVC3 to hardware and software verification.

Fortran Development Tools: Providing a RoadMap for Application Development on Advanced Computer Architectures
Dr. Craig E. Rasmussen   On:  8-May-2008 10:00 AM - 11:30 AM
Los Alamos National Labs   At:  Watson Research Center (Hawthorne), Room GNF15
   Host:  Greg Watson

Abstract:
The Fortran Development Tools (FDT) project is dedicated to delivering productivity-enhancing and program correctness tools to the scientific application developer. The FDT project also provides an important vehicle for source-to-source transformations, enabling research in the mapping of high-level language constructs onto advanced computer architectures like RoadRunner. The goal of this research is to allow the programmer to "write once," while relying on the source-to-source compiler to target the disparate variety of computer architectures that are available today. We describe initial results from this auto-vectorizing and auto-parallelizing, "data-parallel" Fortran compiler showing a factor of ten speedup on the IBM Cell/B.E. processor.

Speaker biography:
Craig E. Rasmussen, Ph.D., is a staff member of the Advanced Computing Laboratory (ACL) at Los Alamos National Laboratory. He has an extensive publication record in space plasma physics, medical physics, and computational and computer sciences. His current research interests include studying ways in which computer languages and programming environments can improve productivity in scientific computing. As a member of the Common Component Architecture (CCA) forum, he has worked to make component technology easier to use by developing the Dune CCA/Python framework and experimenting with it as a rapid prototyping environment for scientific computing. As a member of the J3 Fortran standards body, he has worked on the Fortran Bind(C) interoperability standard and on other ways to improve the parallel expressiveness and performance of the Fortran language. At the University of Michigan, Craig was lead programmer on the Upper Atmospheric Research Collaboratory (UARC), which enabled scientists to view, steer, and collaborate remotely over data gathered in real time from instruments located at Sondrestrom, Greenland. The UARC project was inducted into the Smithsonian Institution's Permanent Research Collection on Information Technology Innovation. He is currently working on a source-to-source compiler for Fortran 2003.

Fighting concurrency bugs
Shan Lu    On:  30-Apr-2008 10:00 AM - 11:30 AM
University of Illinois at Urbana-Champaign   At:  Watson Research Center (Hawthorne), Room 1SF40
   Host:  Evelyn Duesterwald

Abstract:
Driven by the hardware shift to multi-core architectures, concurrency is being brought into mainstream software development. Unfortunately, concurrent programs are prone to concurrency bugs, because of the inherent complexity of concurrency and the sequential thinking habits of programmers. The non-deterministic nature of concurrency bugs also causes developers a lot of trouble. Improving the reliability of concurrent programs is a critical and urgent task. This talk presents recent work on understanding, detecting, and exposing concurrency bugs. I will first present two techniques that detect concurrency bugs from the angle of programmers' synchronization intentions. The first one is AVIO (technology transfer to Intel). AVIO automatically infers programmers' atomicity intentions from correct execution. It detects violations of the inferred intentions and reports concurrency bugs. The second technique is MUVI. MUVI automatically infers variable correlations from the source code and detects unsynchronized concurrent accesses to correlated variables. During this talk, I will also briefly discuss some findings from our study of the characteristics of real-world concurrency bugs. These findings have motivated the above bug detection work and inspired us to improve concurrent program testing.

Flexible Task Graphs: A Unified Restricted Thread Programming Model for Java
Jesper Honig Spring   On:  29-Apr-2008 10:00 AM - 11:30 AM
EPFL   At:  Watson Research Center (Hawthorne), Room  1S-F40
   Host:  Josh Auerbach

Abstract:
The disadvantages of unconstrained shared-memory multi-threading in Java have given rise to a variety of language extensions that place restrictions on how threads allocate, share, and communicate memory, leading to order-of-magnitude reductions in latency and jitter. Examples of such extensions include Eventrons (PLDI '06) and Exotasks (LCTES '07) from IBM Research, and Reflexes (VEE '07) and StreamFlex (OOPSLA '07) from Purdue University/EPFL. Each of these models makes different trade-offs with respect to expressiveness, efficiency, enforcement, and latency, and thus no single model is best suited for all application scenarios. A common motivation for restricted thread programming models is real-time behavior. Recent advances in real-time garbage collection algorithms have reduced GC-related latency to around 1ms, but some applications have latency and throughput requirements that go beyond what can be achieved with current real-time garbage collection algorithms. For applications with sub-millisecond latency requirements, any synchronous interaction between real-time code and the GC or a time-oblivious task will cause a deadline miss. In this talk, we will present Flexible Task Graphs (Flexotasks), a research collaboration between IBM Research, Purdue University, and EPFL. Flexotasks provide a restricted thread programming model that unifies the four previous models and allows different isolation policies and mechanisms to be combined in an orthogonal manner, making it possible to use the best-suited combination to meet a specific application's needs. An article describing the work was accepted at LCTES '08 and will be presented at the conference in July. Note, however, that this is not a preparatory talk.

Challenges and Approaches in Providing Quality of Service in Chip Multi-Processor Systems
Prof. Yan Solihin    On:  28-Apr-2008 10:00 AM - 11:30 AM
Associate Professor   At:  Watson Research Center (Hawthorne), Room GNF15
NCSU   Host:   Erik Altman

Abstract:
The talk consists of two parts. The first part discusses how the trend toward increasing the number of cores on a chip in multicore architectures has produced new challenges in achieving scalable performance. We examine bandwidth as a source of scalability bottlenecks and show the relationship between cache size, bandwidth requirement, and the number of cores on a chip. We project the effectiveness of various bandwidth reduction techniques on improving the scalability of multicore designs. In the second part, we look at a problem related to the impact of sharing fine-grain platform resources among cores, for example the lowest-level cache. We will show how different applications are affected by cache sharing. In particular, we will highlight the types and severity of pathological performance cases that can arise when applications run together on different cores but share the lowest-level cache. The trends in enterprise IT toward service-oriented computing, server consolidation, and virtual computing point to a future in which workloads are becoming increasingly diverse in terms of performance, reliability, and availability requirements. In this environment, it is desirable to have microarchitecture and software support that can provide a guarantee of a certain level of performance (Quality of Service or QoS). We will present a framework for multicore architectures to fully provide QoS. We found that in addition to the ability to partition platform resources, a full QoS framework also needs an appropriate way to specify a QoS target, and an admission control policy that accepts jobs only when their QoS targets can be satisfied. We also found that providing strict QoS often leads to a significant reduction in throughput due to resource fragmentation. We will show throughput optimization techniques that include: (1) exploiting various QoS execution modes, and (2) a microarchitecture technique that steals excess resources from a job while still meeting its QoS target.
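A minimal sketch of the admission-control idea, assuming for illustration that the only partitioned resource is a shared cache with a fixed number of ways and that each job states its QoS target as the number of ways it needs. The class name `AdmissionController` and this resource model are hypothetical and are not taken from the framework presented in the talk.

```python
class AdmissionController:
    """Admit a job only if its requested cache ways can be reserved."""

    def __init__(self, total_ways):
        self.total_ways = total_ways
        self.reserved = {}  # job name -> number of ways reserved for it

    def free_ways(self):
        """Number of ways not currently reserved by any admitted job."""
        return self.total_ways - sum(self.reserved.values())

    def admit(self, job, ways_needed):
        """Reserve ways for `job` and return True, or return False if the target cannot be met."""
        if job in self.reserved or ways_needed > self.free_ways():
            return False
        self.reserved[job] = ways_needed
        return True

    def finish(self, job):
        """Release the ways held by a completed job."""
        self.reserved.pop(job, None)
```

With 16 ways, admitting a job that needs 10 leaves 6 free, so a second job that needs 8 is rejected until the first one finishes.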

Speaker biography:
Yan Solihin is an associate professor in the Department of Electrical and Computer Engineering at North Carolina State University. He obtained his PhD from the University of Illinois at Urbana-Champaign in 2002. He is a recipient of a 2006 IBM Faculty Partnership Award and a 2004 NSF Faculty Early Career Award. He has graduated two PhD students and is currently advising eight PhD students. His research interests include parallel programming and parallel computer architectures, performance modeling, and architecture support for software reliability and computer security.

Program Synthesis by Sketching
Armando Solar-Lezama   On:  21-Apr-2008 10:00 AM - 11:30 AM
UC Berkeley   At:  Watson Research Center (Hawthorne), Room 1SF40
   Host:  Vijay Saraswat

Abstract:
For over thirty years, software synthesis has promised to automate the chore of writing programs. But only recently have the power of modern computers and the growing maturity of verification technology combined to make practical synthesis possible. One of the biggest challenges for practical synthesis is to establish a synergy between the synthesizer and the programmer. There is potential for synergy because, while the synthesizer needs human insight to produce acceptable implementations, programmers actually want control over the implementation strategy, so they want to be able to guide the synthesis process. Thus, both the programmer and the synthesizer benefit when programmers are allowed to provide insight in a natural way. Sketching is my answer to the challenges of practical synthesis. It is a form of synthesis whose key novelty is the use of partial programs (sketches) to communicate insight to the synthesizer. The talk will describe sketching as implemented in the SKETCH language, and the innovations in inductive synthesis that made sketching possible. The talk will also describe our experience using SKETCH to synthesize complex implementations of ciphers, scientific codes, and even concurrent lock-free data structures.
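As a rough illustration of a partial program with a hole, the following Python sketch fills a single integer hole by brute-force enumeration against a reference specification. The function names (`spec`, `sketch`, `synthesize`), the hole range, and the test inputs are all assumptions made for this example. It does not use the SKETCH language and does not reflect how the SKETCH synthesizer searches.

```python
def spec(x):
    """Reference behaviour the programmer wants: multiply by 5."""
    return x * 5


def sketch(x, hole):
    """Partial program: the strategy (shift, then add) is fixed; the shift amount is a hole."""
    return (x << hole) + x


def synthesize(inputs, hole_range=range(8)):
    """Return the first hole value for which the sketch matches the spec on all inputs, else None."""
    for hole in hole_range:
        if all(sketch(x, hole) == spec(x) for x in inputs):
            return hole
    return None
```

Here the programmer supplies the insight (use a shift and an add) and the search supplies the detail (shift by 2). Agreement is only checked on the given inputs, so this toy version says nothing about inputs outside that set.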

Cartesian computations and the high cost of moving data
Larry Carter   On:  15-Apr-2008 01:00 PM - 02:30 PM
Emeritus   At:  Watson Research Center (Hawthorne), Room GN-F15 (backup: 1S-F40)
UCSD   Host:   Bowen Alpern

Abstract:
In this talk, we identify and analyze a class of algorithms that includes many familiar and important scientific computations. A "2-D Cartesian computation" is characterized by having two very large data structures, A and B (perhaps A is the input and B the output), and for each suitably chosen chunk of A and chunk of B, there is a chunk of computation that must be performed. When neither A nor B fits in the fast memory of a computer, the time (or energy) needed to move bits between cores, chips, nodes and levels of the memory hierarchy can dominate the computation. Static Partitioning, Tiling, Inspector/Executor strategies, and Bucketizing are some well-known programming techniques that reduce data movement. We present a methodology that, for many Cartesian computations, allows one to decide which is the best of these techniques. Our results elegantly relate three orthogonal aspects of a computer -- computation speed, memory capacity, and communication or memory bandwidth -- and show that different techniques are needed at different levels of architectural granularity.

Speaker biography:
Larry Carter worked as a Research Staff Member and manager at the Watson Research Center for nearly 20 years in the areas of probabilistic algorithms, compilers, VLSI testing, and high-performance computation. From 1994 to 2004, Dr. Carter was a professor in the Computer Science and Engineering Department of the University of California at San Diego. Between 1996 and 2000, he served as Vice Chair and then Chair of the department. His current research interests include scientific computation, performance programming, parallel computation, and computer architecture. Prof. Carter is a Senior Fellow at the San Diego Supercomputer Center, a Fellow of the IEEE, and a Professor Emeritus at UCSD.

A Constraint Solver for Software Engineering: Finding Models and Cores of Large Relational Specifications
Emina Torlak   On:  14-Apr-2008 10:00 AM - 11:30 AM
MIT   At:  Watson Research Center (Hawthorne), Room GNF15
   Host:  Mandana Vaziri

Abstract:
Relational logic is an attractive candidate for a software description language, because both the design and implementation of software often involve reasoning about relational structures, whether in the problem domain (organizational structure, for example), in the high level design (architectural configurations, for example) or in low level code (graphs and linked lists). Until recently, however, frameworks for solving relational constraints (such as Alloy3) have had limited applicability. While powerful enough to analyze relatively small, hand-crafted models of software systems, current frameworks perform poorly on large and automatically generated specifications. In this talk, I will describe Kodkod, an efficient constraint solver for relational logic, with recent applications to design analysis, code checking, test-case generation, and declarative configuration. The Kodkod system includes a finite model finder and a minimal unsatisfiable core extractor, both based on SAT solving technology. I will present the key ideas and contributions behind these analyses, discuss how they compare to existing approaches, and conclude with an overview of future work.

Speaker biography:
Emina Torlak is a Ph.D. candidate in Computer Science at MIT. Her main research interests are in software engineering and lightweight formal methods. She is currently working on a scalable relational engine with applications to design analysis, code checking, test-case generation, and declarative configuration. Emina received her B.S. and M.Eng. in Computer Science from MIT.

Synthesis of highly concurrent data structures
Martin Vechev   On:  10-Apr-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
Highly concurrent algorithms are difficult to design, test, and verify. This motivates the need for systematic construction techniques. Unfortunately, existing universal construction techniques are not applicable for creating efficient practical algorithms. This leads to a myriad of problems, of which the following are only a small sample: repeated effort per algorithm (both construction and verification), suboptimal and incorrect solutions, inability to differentiate inherent complexity from implementation complexity, and difficulties in adapting algorithms to new architectures. In this talk I will present some of the latest work we have done towards addressing these problems. I will show how, starting from a sequential implementation, we construct a space of concurrent data structure algorithms, representing various design choices with different tradeoffs. Some of the algorithms we discover are novel and of practical value. In particular, one of the algorithms uses only the compare-and-swap (CAS) synchronization primitive, and provides a wait-free contains() operation. The current construction process combines manual steps that correspond to high-level insights with automatic exploration of implementation details. We have implemented the exploration procedure in a new tool called Paraglider.

Intel Architecture Memory Ordering
Rick Hudson   On:  3-Apr-2008 01:30 PM - 02:30 PM
Intel Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   Host:  Maged Michael

Abstract:
Intel recently published more precise memory ordering principles for the IA32 and Intel Architecture 64 (aka x86) processors. This talk discusses the key principles embodied in this memory ordering and explains some of the software-driven motivation behind them. Along the way we discuss issues such as publication safety and how to use the principles to implement the memory models found in high-level programming languages. The presentation is aimed at developers of concurrent shared-memory software and will provide a presentation of the principles as well as guidance on how to reason about them. This is joint work with Bratin Saha and many others both inside and outside Intel.

Speaker biography:
Richard L. Hudson is best known for his work in memory management including the invention of both the Train Algorithm and the Sapphire Algorithm. Richard joined Intel in 1998 where he has worked on memory management, concurrency, synchronization, and memory model related issues. He went to Shortridge, holds a B.A. degree from Hampshire College and an M.S. degree from the University of Massachusetts.

All Patterns Great and Small
Jason Smith   On:  2-Apr-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
Design patterns have proven to be a useful tool for describing and thinking about software design issues. Their abstract nature makes them appropriate for conceptual noodling, but difficult to work with on a concrete basis. In this talk I will describe a class of design patterns, the Elemental Design Patterns, that describe the lowest level of design issues available in programming, and illustrate how they form the basic language for design. In addition, I will discuss how these EDPs have a simple formal basis which makes them statically detectable in source code, as demonstrated by the System for Pattern Query and Recognition. SPQR is a semantic decompiler that efficiently detects instances of high-level design patterns in source code without human intervention by using EDPs as a core language, and applying simple inference techniques. By treating design patterns not as abstractions far removed from the foundations of programming language theory, but as a natural consequence of them, new opportunities for round-trip engineering and governance appear.

Reasoning about Software in the Presence of Transient Faults
Frances Perry   On:  24-Mar-2008 10:30 AM - 12:00 PM
Princeton University   At:  Watson Research Center (Hawthorne), Room GN-K35
   Host:  John Field

Abstract:
A transient hardware fault occurs when an energetic particle strikes a transistor, causing it to change state. Although transient faults do not permanently damage the hardware, they may corrupt computations by altering stored values and signal transfers. Existing solutions can detect transient faults by duplicating computations and comparing the results; however, these solutions lack any formal reasoning about their behavior. In this talk, I will show how to use low-level type systems to cleanly express invariants about redundant computations and to formally reason about code behavior, even when execution may be affected by transient faults. In particular, I will present a typed assembly language named TAL_FT and use it to prove that well-typed programs will always detect any single fault before the fault causes a change in the program output.
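The duplicate-and-compare approach mentioned in this abstract can be sketched in Python as follows. This is only an illustration under assumed names: `checked`, `FaultDetected`, and the simulated bit flip are hypothetical. It is not TAL_FT and offers none of its typing guarantees.

```python
class FaultDetected(Exception):
    """Raised when two redundant computations disagree."""


def checked(compute, *args, inject_fault=False):
    """Run `compute` twice and release the result only if both copies agree.

    `inject_fault` simulates a transient fault by flipping the lowest bit
    of the second copy's integer result.
    """
    first = compute(*args)
    second = compute(*args)
    if inject_fault:
        second ^= 1  # simulated single-bit upset in one copy
    if first != second:
        raise FaultDetected("redundant copies disagree")
    return first


def dot(a, b):
    """Integer dot product used as the protected computation."""
    return sum(x * y for x, y in zip(a, b))
```

The comparison happens before the value is returned, which mirrors the goal of detecting a single fault before it can change the program output.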

Towers of Multicore: Concurrency, Synchronization, and Locality
Vugranam Sreedhar   On:  17-Mar-2008 10:00 AM - 11:30 AM
IBM Research   At:  Watson Research Center (Hawthorne), Room 1SF40
   

Abstract:
Transistor growth in 2011 will reach over 32 billion transistors. Massive multicore chips with over 500 cores will soon become commodity processors. Killer applications continue to challenge massive multicore processing. In this talk I will present three Towers of Multicore: Concurrency, Synchronization, and Locality. To fully harness the power of massive multicore we have to climb the three towers in a careful and holistic manner. For the past four years I have worked with the CAPSL group led by Prof. Guang Gao to address some of the challenges of massive multicore. One important observation is that computer architects, system software designers and application scientists must work closely together to address Concurrency, Synchronization, and Locality challenges in order to improve performance and scalability of large-scale applications (both regular and irregular applications). I will present some of our results in addressing these three challenges.

Intelligent speculation for pipelined multithreading
Neil Vachharajani   On:  13-Mar-2008 10:00 AM - 11:30 AM
PhD Student   At:  Watson Research Center (Hawthorne), Room 1SF40
Princeton University   Host:   Erik Altman

Abstract:
In recent years, microprocessor manufacturers have shifted their focus from single-core to multicore processors. To avoid burdening programmers with the responsibility of parallelizing their applications, some researchers have advocated automatic thread extraction. Within the scientific computing domain automatic parallelization techniques have been successful, but in the general purpose computing domain few, if any, techniques have achieved comparable success. Despite this, recent progress hints at mechanisms to unlock parallelism from general purpose applications. In particular, two promising proposals exist in the literature. The first, a group of techniques loosely classified as thread-level speculation (TLS), attempts to adapt techniques successful in the scientific domain, such as DOALL and DOACROSS parallelization, to the general purpose domain by using speculation to overcome complex control flow and data access patterns not easily analyzed statically. The second, a non-speculative technique called Decoupled Software Pipelining, partitions loops into long-running, fine-grained threads organized into a pipeline (pipelined multithreading or PMT). DSWP effectively extends the reach of conventional software pipelining to codes with complex control flow and variable latency operations. Unfortunately, both techniques suffer key limitations. TLS techniques either suffer from over speculation, in an attempt to speculatively transform a loop into a DOALL loop, or realize little parallelism in practice because DOACROSS parallelization puts core-to-core communication latency on the critical path. DSWP avoids these pitfalls with its pipeline organization and decoupled execution using inter-core communication queues. However, its non-speculative nature and restrictions needed to ensure a pipeline organization prevent DSWP from achieving balanced parallelism on many key application loops. 
In this talk, I present two key contributions that advance the state of automatic parallelization of general purpose applications. First, I propose extending pipelined multithreaded execution with intelligent speculation. Rather than speculating all loop-carried dependences to transform loops into DOALL loops, I propose speculating only key predictable dependences that inhibit balanced, pipelined execution. I will present results from our automatic compiler transformation, Speculative DSWP, demonstrating the efficacy of this technique. Second, to support decoupled speculative execution, I will describe an extension to a multi-core architecture's memory subsystem allowing it to support memory versioning. The proposed memory systems resemble those present in TLS architectures, but provide efficient execution in the presence of large transactions, many simultaneous outstanding transactions, and eager data forwarding between uncommitted transactions. In addition to supporting usage patterns exhibited by speculative pipelined multithreading, the proposed memory system facilitates existing and future speculative threading techniques.

Speaker biography:
Neil Vachharajani is a Ph.D. student in the Department of Computer Science at Princeton University. His research in