Moving from blogspot to TypePad

I'm moving this blog to TypePad, which isn't free like Blogspot, but has a nice feature that allows you to categorize posts. The tricky part, of course, will be selecting the right categories, but categories make a blog much easier to navigate. For example, maybe you are only interested in posts about user experience, or security, or even the non-sequitur posts.

It was pretty painless to import all of the posts to date, including comments, and even to categorize all of the posts so it's already possible to filter on particular topics.

The new home for this blog is here:



European e-Infrastructures Roadmap

I'm participating in a European Commission workshop at the Finnish IT Center for Science (CSC) in Helsinki, focused on the European e-Infrastructure Roadmap. (e-Infrastructure is the European analog to the NSF term Cyberinfrastructure.)

I gave a presentation on TeraGrid, including a discussion of cyberinfrastructure and related initiatives at NSF and DOE.

Kyriakos Baxevanidis (from the European Commission) gave an overview of the new Seventh Framework (funding) Programme, FP7.

Timo Skyttä (from Nokia) gave an interesting talk about identity management and the Liberty Alliance. He pointed out that both Liberty and the Shibboleth project have adopted SAML 2.0, an XML-based set of specifications for exchanging authorization and authentication information. Skyttä showed a number of case studies, including the US E-Authentication initiative.

Klaus Ullman (from DANTE) gave an overview of the future of research and education networking in Europe. DANTE's current infrastructure is called GEANT2, a high-speed backbone network that interconnects 30+ national research and education networks (NRENs). There are some nice downloadable PDF maps at the DANTE website. Ullman also covered DANTE's forecasts of user requirements and how these shape the network deployment plans.

Presentations from this workshop can be downloaded from the workshop agenda page.

(CSC is actually in Espoo, just outside of Helsinki)

Non-sequitur of the week: Saving energy. Today I read in Argonne's daily newsletter that Argonne employees are encouraged to swap out incandescent light bulbs for compact fluorescent light bulbs (CFLs). We had already done this for many of the lights in our home, but the following statement caught my eye: "If every household in the United States replaced one light bulb with an Energy Star qualified compact fluorescent light bulb (CFL), it would prevent enough pollution to equal removing one million cars from the road." Cool. Go for it - replace two!


Making TeraGrid More Accessible

During the past year one of our priorities has been to give users better tools for learning about, and interacting with, TeraGrid resources and services. Diana Diehl and Tim Gumpto (SDSC) have led efforts to make the TeraGrid website easier to navigate while deepening the content. Eric Roberts and Maytal Dahan (TACC) are rolling out an updated version of the TeraGrid User Portal this week with some important new functions. And we have an experimental "crowdsourcing" site where we are inviting the TeraGrid community to collaborate on a next-generation TeraGrid website and watering hole.

The challenge for our team over the next few months will be to bring these three important efforts together into a coherent set of functions - this is a significant focus of our user services working group.


Cyberinfrastructure User Advisory Committee (CUAC)

A few months ago the TeraGrid leadership team worked with NSF to create the Cyberinfrastructure User Advisory Committee, or CUAC. The CUAC is composed of twelve end users - consumers - of cyberinfrastructure. We were delighted to find advisors from each and every one of NSF's science directorates, and we had our first meeting in June. We are beginning to pull together the first CUAC report, documenting a series of small-group discussions (the CUAC is loosely organized into three subgroups) held over the past few months. Several themes strike me about this draft (which will eventually be posted at the CUAC website).

One topic that comes up repeatedly is the need for more training and education on how to use the individual TeraGrid resources, as well as how to use them together, for example in a workflow. Having harnessed TeraGrid, many users are also looking for training and help in analyzing and visualizing their data. Simply put, we need to look hard at how we can increase our training and education offerings.

CUAC members also recommended that we look carefully at the barriers faced by interested potential users, before they even become users. From the point of view of a scientist considering writing his or her first proposal for a TeraGrid allocation, it would be useful to understand the chances that their research can be accelerated by TeraGrid, and the chances that their proposal will result in an allocation. We do have development allocations, or DAC awards, that are very straightforward to propose, so much of this is also a matter of better communication with potential users. (Actually, the DAC process has been wildly successful, with nearly 250 awards granted already this calendar year!)

In a nutshell, communication, training and education are clearly high-priority items for TeraGrid to address in the coming months.

Non-sequitur of the week: Louisville Slugger Baseball Bats. After my daughter's cross country meet tomorrow we plan to head for the Louisville Slugger Museum and Factory, which I'm told is a very fun tour, particularly if you are a baseball fan. Speaking of baseball - and it is that time of year - I'm pulling for a twenty-year-anniversary world championship this year, which may give you an idea of my baseball leanings. :-)


Exploring VMs

After posting a few days ago some thoughts on virtual machine technology, and spending lots of time thinking about how to leverage commercial services such as Amazon's EC2, I was talking with Kate Keahey from Argonne, who has been working in this area for a while. She gave me a very nice summary of work in the Globus Alliance that I thought would be worth sharing here:

One of the advantages of using virtual machines is the ability to easily and efficiently deploy desired software environments encapsulated in a VM image. This allows resource users to configure the virtual machine images themselves and deploy them on a VM-enabled platform made available by a resource provider. Another feature of interest is that VM tools offer capabilities allowing a resource provider to guarantee the delivery of specific resource quotas (in terms of memory, CPU%, disk, bandwidth, etc.) to a VM -- this facilitates implementing sharing and accounting between different clients. The Globus Virtual Workspaces project leverages these capabilities to provide such controlled sharing and configuration independence (see a recent paper). The configuration and performance isolation implemented by virtual machines enables a division of labor between resource provider and consumer which has the potential to significantly contribute to the growth and scalability of Grids.
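That division of labor - the user supplies a configured image, the provider enforces resource guarantees - can be sketched in a few lines. This is a toy model for illustration only; the class and method names are mine, not the actual Virtual Workspaces API:

```python
from dataclasses import dataclass

@dataclass
class Quota:
    """Resource guarantees a provider promises to a single VM."""
    memory_mb: int
    cpu_percent: int
    disk_gb: int

@dataclass
class WorkspaceRequest:
    """What a user hands to a VM-enabled resource provider: their own
    pre-configured image plus the quota they need guaranteed."""
    image_url: str
    quota: Quota

class Provider:
    """Toy resource provider that admits requests while capacity lasts."""
    def __init__(self, total_memory_mb: int):
        self.free_memory_mb = total_memory_mb
        self.running = []

    def deploy(self, req: WorkspaceRequest) -> bool:
        # Admission control: accept the VM only if the memory guarantee
        # can still be honored alongside already-running workspaces.
        if req.quota.memory_mb > self.free_memory_mb:
            return False
        self.free_memory_mb -= req.quota.memory_mb
        self.running.append(req)
        return True

provider = Provider(total_memory_mb=8192)
ok = provider.deploy(WorkspaceRequest("http://example.org/my-env.img",
                                      Quota(memory_mb=4096, cpu_percent=50, disk_gb=20)))
print(ok, provider.free_memory_mb)  # True 4096
```

The point of the sketch is the interface boundary: nothing in the provider's admission logic needs to know what software is inside the image, which is exactly the configuration independence described above.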

The advantages of using virtual machines in Grid and generally distributed computing are still emerging as new hypervisor capabilities and new requirements emerge. The VTDC06 workshop, co-hosted with SC06 this year, brings together the virtualization and distributed computing communities to discuss the potential of virtualization in resource management, scheduling, security and service hosting.


Grid Interoperation (Now?)

About a year ago many of us involved in major grid initiatives and facilities realized that there were many pair-wise discussions about interoperation, and a set of emerging "common themes" to these discussions. This quest for interoperation is driven by two strong needs. First, there are many research teams with collaborators located in different countries, and/or on different continents, with access to multiple grid facilities. How do we help them work together, when that often involves using grid resources in multiple grid facilities? A second driver is the practical and technical desire to adopt working solutions from others rather than reinventing them.

Leaders from nine major Grid initiatives met in November 2005 to band together to drive interoperation (pardon the acronyms): TeraGrid (US), OSG (US), DEISA (Europe), NGS (UK), NAREGI (Japan), K*Grid (Korea), PRAGMA (Pacific Rim), APAC-Grid (Australia), and EGEE (Europe).

During a half-day discussion this group identified four areas where the current state of technology, with some coordination on our part, could begin to support interoperation. We formed several task-forces to develop interoperation plans in the areas of:

- Information services
- Job submission
- Data movement
- Authorization
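To give a flavor of what the job-submission piece of this work involves in practice: much of it boils down to translating one common job description into each grid's native format, so a user (or portal) writes the description once. A toy sketch, where the field names and output formats are illustrative rather than any task force's actual specification:

```python
def to_pbs(job: dict) -> str:
    """Render a generic job description as a PBS-style batch script."""
    return "\n".join([
        f"#PBS -l nodes={job['nodes']}",
        f"#PBS -l walltime={job['walltime']}",
        job["executable"],
    ])

def to_jsdl_like(job: dict) -> str:
    """Render the same description as a minimal JSDL-flavored XML snippet,
    the kind of neutral format a cross-grid submission service might accept."""
    return (f"<JobDefinition><Application>"
            f"<Executable>{job['executable']}</Executable>"
            f"</Application><Resources>"
            f"<TotalResourceCount>{job['nodes']}</TotalResourceCount>"
            f"</Resources></JobDefinition>")

# One description, two target grids.
job = {"executable": "/home/user/sim", "nodes": 4, "walltime": "01:00:00"}
print(to_pbs(job))
print(to_jsdl_like(job))
```

Information services, data movement, and authorization each have an analogous translation-or-common-interface story, which is why the four areas were tackled as separate task forces.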

The PRAGMA folks also took the lead in identifying several early-adopter applications to drive these four areas, and we set up an operations task force to capture that experience. Plans in these areas were presented at the Athens GGF meeting in February, and eleven more grid projects joined us (I won't try to list them here in this already acronym-rich post). A tremendous amount of work was done early this year, and we held updates on progress at the Tokyo (May) and Washington, DC (September) GGF meetings.

You can find details on this progress, constantly being updated and expanded as we move forward, at the Grid Interoperation Now (GIN) wiki hosted at the GGF site.

The next steps for this group involve expanding the applications effort to bring in at least another dozen science teams interested in testing what we have put in place and driving it forward. The GIN effort is completely open, and we are always looking for more people to help out - head over to the site and jump right in!


Predicting the (near) Future

This past week we had a quarterly TeraGrid management meeting in Austin at the University of Texas, home of TACC. One of the discussions we had was regarding the growing number of computational resources available to users, and the need to help them sort through the options. A key question for a user is "if I submit my job to this particular TeraGrid machine, and I'd like it to run in the next n minutes, what is the likelihood that it will run in that time?"

It turns out that there are several tools that can give the user such a prediction, albeit not with 100% certainty, based on the state and history of the queue in question. Rich Wolski (UC Santa Barbara) and his Network Weather Service (NWS) project have been doing nice work in this area for quite a while. Rich's work with the VGrADS project has brought us a very nice tool that you can see demonstrated at his demo website.
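To make the idea concrete, the crudest possible version of such a prediction is an empirical probability computed from past queue waits. This is a stand-in of my own for illustration - NWS and friends use far more sophisticated statistics on queue state and history:

```python
def chance_of_starting_within(past_waits_min, n_minutes):
    """Empirical probability, from historical queue waits (in minutes),
    that a newly submitted job starts within n_minutes.

    A deliberately naive sketch: the fraction of past jobs whose wait
    was at most n_minutes. Real predictors condition on job size,
    queue, current load, and more.
    """
    if not past_waits_min:
        return None  # no history, no prediction
    hits = sum(1 for w in past_waits_min if w <= n_minutes)
    return hits / len(past_waits_min)

# Made-up wait history for one machine's queue, in minutes.
past_waits = [5, 12, 30, 45, 7, 90, 3, 22, 60, 15]
print(chance_of_starting_within(past_waits, 30))  # 0.7
```

Run this per machine and you get exactly the kind of "where will my job run soonest?" comparison across systems that we want to surface in the user portal.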

We are working to get this capability embedded in the TeraGrid User Portal and after I sent email to our Science Gateways mailing list I found that many of them were already in the process of making this tool available. For a scientist trying to get work done, it will be a wonderful thing to be able to look across the now more than 20 major computational systems in TeraGrid and get a sense for where his or her job will run soonest!

Non-sequitur of the week: Apple Widgets. Well this is really a non-sequitur if you're not a Mac person, but I am and I had been interested in what was involved in writing Dashboard widgets. I downloaded the Apple Developer kits and there were some nice examples in there. Since Rich Wolski sent me a tarball of the bqp (ok, not a total non-sequitur, see above) command line utilities I decided to try to make a very simple widget, building on one of Apple's examples. It took about an hour to figure out the basics, and was kinda fun (I called it "AskRich"). It assumes you put the NWS command line tools in /usr/local/bin on your Mac and executes a hard-coded query, but it's a start. Perhaps next weekend I'll learn how to let the widget user select the options... (if you are interested in seeing the widget, send me email with the word widget in the subject line).


{Amazon, Google, eBay, Microsoft...}.EDU

I've been having many discussions with people from the Research & Education community - TeraGrid Science Gateway providers, individual users, computer center directors, etc. - regarding the notion of taking advantage of some new and interesting storage and computing web services such as Amazon's S3 and EC2. Google, Microsoft, eBay, and others are surely going to provide new web services in this space. Further, anyone paying moderate attention will also see that technology provider companies (IBM, EMC, Platform, Univa, etc.) are introducing powerful building blocks aimed at building service-oriented systems (e.g. "Grids"). Some (especially end users!) respond with enthusiasm - and some folks have responded along the lines of "we can do it ourselves cheaper" or "performance isn't good enough."

I think these responses are true to some extent, but they also ignore some important factors. The first is Moore's Law. Today's price is irrelevant - prices based on technology (like disk or CPU or bandwidth) get cheaper, rapidly, over time. (Imagine if the $100,000 price tag on a visualization workstation twenty years ago had stopped us from developing imaging tools...) What we have typically done in this community is to ask what the computational environment of the future will look like, and we design and plan around the future - not the present. That's how you invent the future rather than just reacting to change as it hits you.

The second is mistaking oranges for apples, and thus doing an apples to oranges comparison. Take Amazon S3. It's way, way more expensive than buying a disk drive, especially if you already operate a large computing facility. But is it the same? Not if your computing server does not provide a web services interface! Does it matter? Only if your users want a web services interface, or if you want to develop a workflow, or other sophisticated capability with web services. Many users I've spoken with say they do!

Let's look at an example. If you don't already run a storage service, what's the best way to share something like a 5 TB data collection with colleagues spread around the Internet? To set up a server with 5 TB of disk and a sensible backup system (if you care about that; otherwise the calculations change), you'll pay about the same as the storage cost of putting the data in S3 for three years. The open question is data transfer - if you're sharing the 5 TB with thousands of users you may be better off hosting it yourself due to the S3 I/O charges. But if you're sharing with a small community, with modest needs in terms of moving data out and in, then S3 is likely much cheaper than rolling your own - unless your system administration staff work for free.
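Here's the back-of-the-envelope arithmetic behind that claim. The per-GB prices below are assumptions I'm plugging in for illustration, not quoted Amazon rates, so rerun the numbers with current pricing before drawing conclusions:

```python
# Assumed inputs - adjust to taste.
DATA_GB = 5 * 1024               # a ~5 TB collection
MONTHS = 36                      # three-year horizon
S3_STORAGE_PER_GB_MONTH = 0.15   # assumed storage price, $/GB-month
S3_TRANSFER_PER_GB = 0.20        # assumed egress price, $/GB

# Cost of simply keeping the data in S3 for three years.
storage_cost = DATA_GB * S3_STORAGE_PER_GB_MONTH * MONTHS

def total_cost(readers: int, fraction_read: float) -> float:
    """Storage plus egress, assuming each reader pulls some fraction
    of the collection over the three years."""
    egress = readers * fraction_read * DATA_GB * S3_TRANSFER_PER_GB
    return storage_cost + egress

print(round(storage_cost))           # 27648 - compare to server + backup + admin
print(round(total_cost(10, 1.0)))    # 37888 - small community: storage dominates
print(round(total_cost(1000, 1.0)))  # large community: egress dominates by far
```

The crossover is visible immediately: with ten collaborators the transfer charges are a modest fraction of the bill, while with a thousand they swamp the storage cost, which is exactly why a heavily downloaded collection may be better self-hosted.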

I believe that TeraGrid and similar initiatives must seriously investigate what a partnership might look like with (web/grid) "service providers." While these services do not address the requirements of users who need multiple Teraflops of computing or tens of Terabytes of storage, they just may offer something for the many people who want to share smaller amounts of data, or have intermittent needs for rapidly accessible, modest computing power.

TeraGrid is focused, rightly, on providing for Petascale computational, storage, and data analysis services. For the Gigascale stuff, perhaps we should think about a new type of "resource provider" - Amazon.edu?

(at Austin)