Chisel: Computer Human Interaction & Software Engineering Lab

How Information Visualization Novices Construct Visualizations

Lars' Blog - Thu, 2010-07-22 17:22
Visualization for the masses is a topic that has gained a lot of traction in the InfoVis community in recent years, e.g. in projects such as IBM ManyEyes. The goal is to enable a wide user population to leverage information visualization technology to understand large amounts of data. This could potentially help them make more informed decisions, and is especially promising as more and more data becomes available (see open data). However, there are still many challenges that need to be addressed before visualization for the masses can become a reality, ranging from limited visual literacy to insufficient tool support.

Together with Melanie Tory and Margaret-Anne Storey, I investigated how information visualization novices construct visualizations in a laboratory setting. Our research paper "How Information Visualization Novices Construct Visualizations" was accepted for presentation at IEEE InfoVis 2010.

Here is the abstract of our paper:

It remains challenging for information visualization novices to rapidly construct visualizations during exploratory data analysis. We conducted an exploratory laboratory study in which information visualization novices explored fictitious sales data by communicating visualization specifications to a human mediator, who rapidly constructed the visualizations using commercial visualization software.

We found that three activities were central to the iterative visualization construction process: data attribute selection, visual template selection, and visual mapping specification. The major barriers faced by the participants were translating questions into data attributes, designing visual mappings, and interpreting the visualizations. Partial specification was common, and the participants used simple heuristics and preferred visualizations they were already familiar with, such as bar, line and pie charts.

From our observations, we derived abstract models that describe barriers in the data exploration process and uncovered how information visualization novices think about visualization specifications. Our findings support the need for tools that suggest potential visualizations and support iterative refinement, that provide explanations and help with learning, and that are tightly integrated into tool support for the overall visual analytics process.

Download Technical Report
Categories: News

Spark TextArea with Line Numbers

Chris's Flex Blog - Fri, 2010-07-16 10:46
Here is a skin that you can use on the Spark TextArea class to show line numbers down the left side of the text.
The line numbers use the same size font as the TextArea.

It is very simple to use: just set the skinClass property in MXML like this:
<s:TextArea skinClass="flex.utils.spark.TextAreaLineNumbersSkin"/>

If you want a horizontal scroll bar on the TextArea, then set lineBreak="explicit" and the horizontal scrollbar will appear.

Here is an example of it in action (right click to view source).


It would be very easy to modify this skin to make it resizable by adding a resize handle in the bottom right corner. Look at the source code from my previous blog post on Resizable Controls, specifically the flex.utils.spark.resize.ResizableTextAreaSkin class, which uses a custom skin for the Scroller that adds the resize handle.
Categories: News

Flex 4 Spark Resizable Controls

Chris's Flex Blog - Mon, 2010-06-28 16:48
Please go here for Flex 3 Resizable Containers.

I've created a bunch of skins for many of the common Spark components that allow them to be resized. Each of these skins contains a resizeHandle that, when dragged, allows the control to be resized. There are two resize handle classes that you can use; the default is called flex.utils.spark.resize.ResizeHandleLines. You can replace every occurrence of that class with flex.utils.spark.resize.ResizeHandleDots if you prefer.

Here is a list of resize skins:

With the exception of the ResizableLabel class, all the others are Skins, and as such can be used very simply by setting the skinClass="flex.utils.spark.resize.___Skin" property to the appropriate skin.

Another option is to create a CSS style for ALL spark.components.Scroller classes to use the flex.utils.spark.resize.ResizableScrollerSkin class like this:
<fx:Style>
@namespace s "library://ns.adobe.com/flex/spark";
@namespace mx "library://ns.adobe.com/flex/mx";
@namespace spark "flex.utils.spark.*";
@namespace resize "flex.utils.spark.resize.*";

/* Make all Scrollers use the resizable scroller skin. */
s|Scroller {
  skin-class: ClassReference("flex.utils.spark.resize.ResizableScrollerSkin");
}
</fx:Style>

** Note that I've renamed the Flex3 package flex.utils.ui.resize.* to the new Flex4/Spark package name flex.utils.spark.resize.*.

The most used skin is the ResizableScrollerSkin; it is used on TextAreas, Lists, DataGrids, Trees, ComboBoxes, DropDownLists, and anything else that uses a Scroller component. It works by using a skin for the Scroller that adds the resize handle and uses custom HScrollBar and VScrollBar classes, which leave room for the resize handle (the simplest way I could think to do it). Each of the resizable skins uses the ResizeManager class to handle the mouse events and resize the appropriate control.

The resizable ComboBox and DropDownList skins are slightly different in that they both save the size of the drop-down list, since it gets destroyed and re-created each time. They also set popUpWidthMatchesAnchorWidth="false" after resizing, since the width no longer matches the anchor.

I've also added support for restricting the resize to only the vertical or horizontal direction. There are several ways you can do this. You can set a style on the resize component:
.resizePanel {
  resize-direction: vertical; /* or horizontal */
}
Or you can call a static method on the ResizeManager class:
ResizeManager.setResizeDirection(resizePanel, "vertical"); // or "horizontal"
Or, if you can access the ResizeManager instance (usually stored in the skin class), you can set the resizeDirection property on the manager like this:
resizeManager.resizeDirection = "vertical"; // or "horizontal"
There are constants defined in the ResizeManager class for "vertical", "horizontal", and "both" (default).

Here is an example of most of the skins, view-source enabled.
Categories: News

Diver in linux

Del's Blog - Fri, 2010-06-25 15:11
Some people may be having difficulty with Diver on various Linux distributions. You may be getting the following message when trying to launch a trace.


And you may see this in the console:


Error occurred during initialization of VM
Could not find agent library in absolute path: /[path-to-plugin]/libsketch_linux32.so


Now, you may be surprised to find out that these errors may have nothing to do with open ports or whether or not the file listed exists. This is not Diver's fault. The client can't find an open port because the Java virtual machine couldn't start and open one. The virtual machine couldn't start, not because the listed library is missing, but because you might not have some of its dependencies installed on your system. Unfortunately, the Java VM sometimes gives some pretty uninformative error messages.

Diver depends on the C++ Boost libraries to do socket communication and multi-threading. It also uses Boost to do multi-platform file system manipulation. Many distributions of Linux come with Boost installed. Some do not. If you have a Debian-based version of Linux, installing the required Boost libraries is easy:

$ sudo apt-get install libboost-iostreams1.40.0 libboost-date-time1.40.0 libboost-filesystem1.40.0 libboost-system1.40.0 libboost-thread1.40.0

If you are running a Red Hat distro, you should be able to use a similar yum command. I hope that this all works for everyone.
Categories: News

Top 10 Eclipse Helios Features

Ian's Blog - Tue, 2010-06-22 23:11

Two weeks ago I asked you to think about high quality software that has been consistently delivered on-time. Think about software that is used by millions of people world-wide, built by hundreds of developers, free to use and open to everybody and anybody. Think about software that spans domains, runs on the smallest of devices and powers the world’s largest enterprises.

Any ideas? Yes, I’m talking about Eclipse, and the next release, Helios, has arrived. (For an ultra-fast download try our Amazon Cloudfront mirrors.) While everyone seems to enjoy kicking off new software projects, specifying requirements and designing the perfect system, only to have it fizzle out, Eclipse is Different. Eclipse Delivers.

For the past 2 weeks I’ve been counting down the Top 10 Features of Helios that I’m most excited about:

10. Resource Improvements
9. Feature based configurations
8. Improvements to API Tools
7. Java IDE Improvements
6. Target Platform Improvements
5. p2 API and the b3 Aggregator
4. MarketPlace Client
3. EMF, Riena and RAP integration
2. Git Support at Eclipse

And my number 1 feature of the Helios release is: Xtext, Version 1.0.

For those of you who haven’t heard of Xtext, Xtext is a programming language framework. Xtext bridges the gap between grammars, models and programming language tool support. Using Xtext you can create a powerful environment for your own DSL (domain specific language) or full fledged general purpose programming language.

There are a number of important features that make this such a powerful toolkit, including generated editors that support code folding:

[Image: folding]

styled content providers:

[Image: styledText]

quick fix support:

[Image: QuickFixNew]

quick outline view, and more:

[Image: QuickOutline]

There are also a number of tools to help you create Xtext grammars, such as Grammar Content Assist:

[Image: grammar content assist]

Xtext also supports project builders and can even derive a grammar from an Ecore model.

I’ve been following Xtext for close to 4 years now (from its origins at openArchitectureWare and through the Textual Modeling Framework proposal), and it’s great to see this excellent tool declare its 1.0 release. Xtext has also received much-deserved praise for its outstanding website and large collection of getting-started material, and the team even won the Eclipse Community Award for most Innovative Eclipse Project at EclipseCon this year.

Great work Michael Clay, Sven Efftinge, Moritz Eysholdt, Dennis Huebner, Jan Koehnlein, Sebastian Zarnekow, Heiko Behrens, Peter Friese and Knut Wannheden.

Throughout this series I’ve tried to cover a variety of different Eclipse projects, but this list is far from complete. Please feel free to leave a comment with your favourite Eclipse Helios feature. Or better yet, why not write an article about it?

Categories: News

Git Support, Top Eclipse Helios Feature #2

Ian's Blog - Mon, 2010-06-21 21:16

Only 1 more day until Eclipse Helios is released and we are down to my Top 2 features.

Over the life of Eclipse (Jeff McAffer tells me that he’s been working on Eclipse since 1999) a lot has changed. Eclipse started its life inside OTI/IBM. In November 2001 the Eclipse Consortium was announced and Eclipse was released as ‘Open Source’. For the next few years Eclipse grew, but was still mostly supported by a few large companies. New projects were proposed, new committers came on board, and Eclipse became the dominant player in the IDE space. But as the popularity of Eclipse grew, so did its diversification. Then in April 2010, David Carver noticed that the number of active individual committers (those not associated with any particular company) was tied with IBM for the top spot.

[Image: Committers]

What does all this mean and what does this have to do with the Eclipse Helios release? Well, as Eclipse continues to diversify, the Eclipse foundation will need a software revision control system that supports this diversification. The Eclipse Helios release marks the beginning of this transformation. Number 2 on my Top 10 List is: Git Support at Eclipse.

Three important components make up the Git support at Eclipse: JGit, EGit and the Git Infrastructure. JGit is a pure Java library implementation of the Git version control system. JGit is licensed under the EDL and has a number of users, including the NetBeans Git support.

EGit is the Eclipse tooling, and is built on JGit. There is currently support for a number of Git features:

[Image: Egitmenu 0.8.0]

History view:

[Image: Egit 0.8 history view]

Repository View:

[Image: Egitrepositoriesview]

Patch Support:

[Image: PatchContextMenu]

The JGit / EGit team has excellent documentation and there is some great information on Git in general.  Git is being worked on by Matthias Sohn, Shawn Pearce, Chris Aniszczyk, Mathias Kinzler, Stefan Lay, Robin Rosenberg and Christian Halstrick.  However, a really big thank-you goes out to the past (and present) committer reps for bringing Git to Eclipse.  The initial Git contribution provided a number of unique licensing challenges that required unanimous approval from the Eclipse board of directors.  Git at Eclipse would not have been possible without their hard work.

In addition to the tool support, Eclipse.org has rolled out Git infrastructure for the community to make use of. There are Git mirrors for Eclipse projects and even Git repositories that some projects have started to migrate to. The big thank-you goes out to Denis Roy and Wayne Beaton for this. Git really is the future of Eclipse, and if all goes as planned, Git will be on my Top 10 List again next year.

Categories: News

EMF, Riena and RAP integration, Top Eclipse Helios Feature #3

Ian's Blog - Sun, 2010-06-20 22:43

Well, here we are: it’s release week. Eclipse 3.6 (Helios) will be available on Wednesday June 23rd. It also means that I’m into my Top 3 features for this year’s release. For the past 7 days I’ve been presenting some of the New and Noteworthy features of this year’s release.

Number 3 on my Top 10 list is EMF, Riena and RAP integration.

I’ll be the first person to admit that when I first heard about the Rich Ajax Platform (RAP) I didn’t get it. I assumed RAP was about re-creating the Eclipse UI in a browser. I, like many others, quickly realized that this is not the point of RAP. RAP brings the Eclipse programming model (Jobs API, JFace content providers, SWT API, Stacks, Forms, Selection Providers, etc.) to the browser. If you appreciate the Eclipse programming model, and more importantly, if you have invested in the Eclipse programming model, then RAP is your best friend.

Of course you *can* re-create the Eclipse UI in the browser:

[Image: rap workbench]

but this likely is not what you want to do. Instead, you want to reuse your existing software and theme it for a rich web experience.

[Image: dashboards screenshot]

The concept of reusing your hard work across multiple mediums is known as Single Sourcing. And it’s not just about the web; the new RAP protocol (not part of Helios) will open up a whole new world such as RAP on the iPad.

There are a number of notable new RAP features in Helios, including Opaque menus:

[Image: opacity]

Drag and Drop:

[Image: dnd]

New Themes:

[Image: fancyDesign]

Cheatsheet support:

[Image: cheatsheets]

Control Decorations:

[Image: ControlDecoration]

and Graphics context support:

[Image: gc2]

For these features, kudos goes out to Ralf Sternberg, Holger Staudacher, Tim Buschtoens, Ruediger Herrmann, Austin Riddle, Ivan Furnadjiev and Benjamin Muskalla.

While the new RAP features are incredible, RAP demonstrates the real power of Helios: cross-product integration. Other Eclipse projects are starting to target RAP as a runtime. In particular, Elias Volanakis has extended the Riena framework to make it work with RAP. You can now use the powerful Ridgets on the web.

[Image: riena on rap]

Finally, Kenn Hussey has extended the EMF Framework to target the Rich Ajax Platform as well as the RCP Platform.

[Image: Rapemfproperties]

[Image: emf rap]

Thanks everyone!

Categories: News

MarketPlace Client, Top Eclipse Helios Feature #4

Ian's Blog - Fri, 2010-06-18 12:34

As most of you know, Eclipse Helios will be released next week.  For regular readers of my blog (and PlanetEclipse.org), you know that I’ve been counting down some of the new features available in this release. During this series I have received comments (both in the comment fields, and on places like twitter) that essentially read: I really like Eclipse except it doesn’t have an editor for XYZ. Or, when I get the following package, it has feature ABC which I don’t want. Obviously we can’t please all the people all of the time.

It was feedback like this that inspired Feature Number 4 on my Top 10 List: The Eclipse MarketPlace Client.

As we all know, Eclipse is much more than a Java IDE. In fact, Eclipse is an entire eco-system with thousands of plug-ins. Some of these plug-ins are packaged with the different Eclipse downloads. Other plug-ins are available as projects at eclipse.org. However, there are also thousands of plug-ins that are not hosted at Eclipse. Some of these are commercial tools developed for enterprise customers. Others (like one of my favourites, the vi plugin) have a small cost associated to help pay for the developers’ time. Finally, there is a large assortment of plug-ins available from a variety of other hosting sites. Finding and installing these components has always been a challenge, but with the Helios release this will all change.

The MarketPlace Client (MPC) makes it easy to browse and install 3rd party components. Available under Eclipse -> Help, this new feature should make it much easier to find the tools you need.

[Image: mpc]

Also, unlike other ‘famous’ markets (or app stores), the Eclipse MarketPlace Client is pluggable and open, meaning vendors are free to create custom marketplaces for their particular needs. Helios currently ships with 2 marketplaces, one from Eclipse.org, and the Yoxos Market hosted by EclipseSource. The two markets are slightly different in that the Eclipse MarketPlace lists plug-ins for a variety of Eclipse versions, while the Yoxos MarketPlace is a curated repository of Helios-related content.

[Image: mpc2]

It was the great work from David Green and Steffen Pingel that brought us this feature. Nathan Gervais from the Eclipse Foundation did the server side work, while Ian Skerrett was the point person behind all of this.

In addition to the MarketPlace, the Eclipse Foundation, with the help of Google, has launched Eclipse Labs.

Eclipse Labs is a community of open source projects that build technology based on the Eclipse platform. It provides the infrastructure services typically required by open source projects, such as code repositories, bug tracking, project web sites/wiki. Eclipse Labs is hosted by Google Code Project Hosting, so it will be very familiar to developers already using Google Code Project Hosting.

Combining the marketplace with Eclipse Labs will make it much easier for developers to create, publish and distribute their products to the community.

Categories: News

Awesomeness in Helios

Del's Blog - Fri, 2010-06-18 12:00
There was a recent post by Wayne about how Eclipse is an IDE Platform. For all practical purposes, that is the way that I use it. But what is really awesome about Eclipse as an IDE platform is that it is an IDE platform that just keeps improving.

I have to admit that I honestly don't keep up with the latest improvements to the Eclipse IDE. The reason is simply practical. I work on making tools for doing research on IDEs. I would like my current project, Diver, to be as compatible as is reasonable so that I can decrease the barrier to adoption. If I work with the "latest and greatest" version of Eclipse, I am just too tempted to use the "latest and greatest" features of Eclipse in my own tools, which means that people running the current release may not be able to work with what I am building. It's really just a matter of discipline.

But now that Helios is about to be released I've started to use it and I find it totally awesome. Ian Bull has been faithful to his top ten list of the newest and greatest features of Eclipse. I don't know if I will do the same, but I think I might report on the great things about Eclipse as I discover them.

So far, the number 1 through 10 greatest feature of Helios for me has been its new integration with the Eclipse Marketplace (and the Yoxos Marketplace catalog). I just recently had to change computers, which meant that I also had to reinstall Eclipse. Normally, I would use Yoxos to create my own custom distribution of Eclipse, but I wanted to try Helios and all its cool new features. I use a number of 3rd party tools that aren't available in the standard Eclipse p2 repositories. For example, I work in academia, which means that I write papers. I use LaTeX for writing papers, which is a surprisingly similar process to writing software. So, I use Eclipse to write text as well as software. It used to be a total pain to search for my LaTeX plug-in, add its p2 repository to my list of "Available Software Sites", and install. That is no longer the case. Now, I can just load up my Eclipse Marketplace client and do a quick search for "LaTeX":


And there you go. No hassle, no fuss, I just install and that's it. It's beautiful.

As an aside (and a shameless plug), you can do the same thing with my project "Diver". Just load your Eclipse Marketplace client. A quick search for "reverse engineering" will point you straight to my favorite Eclipse tool ;-) :

Categories: News

p2 API and the b3 Aggregator, Top Eclipse Helios Feature #5

Ian's Blog - Thu, 2010-06-17 12:44

The official Helios release is less than 1 week away, and we are now into the Top 5 Features that I’m most excited about. Over the past week I’ve been highlighting some of the upcoming features of the Eclipse Helios release. These features include: improvements to the Java Development Tools, Plug-in Development Environment, API Tools and the Eclipse Platform. Number 5 on my Top 10 List is: p2 API and the b3 Aggregator.

On Monday I discussed the importance of API when it comes to Eclipse projects. The p2 team has been working on the API for almost 3 years now and when Helios is released the p2 API will be official. What does this mean? It means you can build provisioning solutions around p2 without worry that the entire system will change from under you. In fact, I’ve been on both sides of the p2 fence: helping to define the API and then building the new Yoxos Launcher and Yoxos Enterprise solutions, using this technology.  If you are building a system that needs SelfUpdate, Install, Uninstall and RollBack, and you have anything more complicated than a few static dependencies, you should really consider p2.

Here are some of the API highlights:

1. Support for multiple agents: This means you can manage multiple applications using a single controller. Once you create (or acquire) the agent, you can acquire agent services for: computing provisioning plans, working with metadata, working with artifacts, performing installs, etc… We make heavy use of this in Yoxos since our systems can both update themselves and manage your Eclipse installs.

2. A new approach to Queries: Querying metadata is an essential part of any provisioning system and p2 now supports both a p2 Query Language and a simple QueryUtil class to create the most common queries.

3. Java 5 generics: No, we did not just leave all the Java 1.4 people behind and finally decide to move to Java 5; rather, p2 now uses generics and down-compiles to Java 1.4 for backwards compatibility. This is a huge step forward for all the Java 5+ developers out there.

4. The operations API: The saying “Make easy things easy and hard things possible” has been on our minds as we designed the p2 API. While p2 has a very powerful planner (an award-winning planner, I should add), the idea of crafting provisioning plans and executing these plans on an engine in order to affect a profile is, quite frankly, complicated! Using the operations API you can easily invoke common “operations” like update this item, or install this other thing. For an idea of what’s involved, please see our help documentation.

5. Real API: You will notice that we dropped provisional from many of our package names. Feel free to browse the p2 API Java docs.

There are also a number of improvements to the API to make things more consistent.

Thanks go out to the entire p2 team for all the hard work (and heated discussions :-) ). In particular, John Arthorne, DJ Houghton, Thomas Hallgren, Susan McCourt, Daniel Le Berre, Simon Kaegi, Andrew Niefer, Henrik Lindberg, Matthew Piggott, Tom Watson and Pascal Rapicault.

In addition to the API, Steffen Pingel and Susan McCourt have worked on a new Discovery UI which can be used to provide a branded presentation of a p2 repository. Tools like Mylyn use this UI to make it easy for users to install Mylyn Connectors.

[Image: connector discovery]

Finally, there are other projects around Eclipse.org that make working with p2 a little easier. PDE/Build, b3 and the newly proposed Tycho project make it possible to build p2 repositories. However, one project is the real workhorse behind the Helios release: the Buckminster / b3 aggregator. The aggregator combines repositories from various sources into a new aggregated p2 repository. The aggregator has a UI component and can also be run headless (i.e., you can aggregate p2 repositories as part of your build process). Also, you can use the aggregator to get a detailed view of what’s in a p2 repository. For more information on this impressive tool, check out their wiki page.

[Image: B3 aggregator sample 1]

In addition to creating aggregated p2 repositories, the b3 aggregator can produce Maven repositories.

Kudos for this work goes out to Thomas Hallgren, Henrik Lindberg, Filip Hrbek and Karel Brezina.

Categories: News

Target Platform Improvements, Top Eclipse Helios Feature #6

Ian's Blog - Wed, 2010-06-16 07:07

There are three large groups of artifacts that play a key role while writing software. There are the tools you use, the code you write and the libraries you depend on. There is a large body of research studying the cognitive support provided by software development tools. There are also a number of tool-centric development models. Facilities like Yoxos and the Eclipse Market Place help you manage these tool chains.

Regarding source code management, there’s an endless debate over which tools, technologies and techniques we should use. In fact, most university curricula spend a great deal of time on how to best architect, design, document, write and manage source code.

However, when it comes to the management of your 3rd party libraries — the code you need but you don’t write — this is very much an ad hoc process. Finding dependencies, including them on your build path, finding the corresponding source, determining (and locating) which version you need, etc… is mostly a manual process:

  1. Figure out what jar you need (Apache Commons Collections, for example)
  2. Use Google to search for the jar
  3. Add the jar to your path
  4. Run
  5. Look at the errors
    • Did you have the right version?
    • Did you miss any dependencies?
  6. Figure out what else you need to find
  7. GOTO 2

Lucky for us as Eclipse developers, PDE’s Target Platform and Target Definitions make this process effortless. You can define and share your dependencies with your team. If you are missing a dependency, it can be automatically provisioned and placed on your build-path.

Eclipse 3.6 is hitting the shelves (or at least the download mirrors) in 1 week, and to celebrate this release I’ve been counting down the Top 10 features I’m most excited about. Number 6 on my list is the Improvements to Target Platform Management.

In Eclipse 3.6 you will be able to search repositories and quickly add components from these repositories to your target platform (Ctrl+Shift+Alt+A).

[Image: add to target]

In addition to this, a new quickfix allows you to search repositories for a missing import package and have a bundle supplying the package added to your target.

[Image: hover quick fix]

Finally, one of the biggest headaches for release engineers is collecting all these bundles that constitute your target. There is now a new export wizard that will export all the bundles in your target to a single directory. The tool will also generate a p2 repository. This repository can then be used in your build as a repoBaseLocation.

[Image: export target]

A big thanks goes out to Chris Aniszczyk and his army of Minions for this work ;-) .

Categories: News

Java IDE Improvements, Top Eclipse Helios Feature #7

Ian's Blog - Tue, 2010-06-15 12:38

As Eclipse committers, we spend lots of time emphasizing that Eclipse is not just an Integrated Development Environment. Eclipse is a framework, a tooling platform, a collection of run-time technologies, an eco-system, etc… However, at the end of the day, an IDE is the primary use of Eclipse for many people.

As we approach the next major release of the Eclipse platform, Helios, I’ve been counting down the features I’m most excited about. Number 7 on my list is the Enhancements to Eclipse as an IDE. These are features that will make your life easier as a developer (many of these features are Java specific, but not all).

The Java Development Team has released a number of new code formatter options:

[Image: codeformatter]

While these are cool, the most exciting one (in my opinion) is the ability to disable formatting for certain code blocks:

[Image: formatter disabling enabling tags preference]

[Image: formatter disabling enabling tags formatted]

There are even a number of improvements to comment formatting.
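To make the disable/enable tags a little more concrete, here is a minimal sketch. It assumes the default tag strings `@formatter:off` and `@formatter:on`, and that the Off/On tags option has been switched on in the formatter profile (it is off unless you enable it). The class and field names are made up for illustration.

```java
// Hypothetical example class; only the two tag comments are Eclipse-specific.
class FormatterTagsDemo {

    // @formatter:off
    // Hand-aligned 3x3 identity matrix. With the tags enabled, the formatter
    // skips everything between the "off" and "on" comments.
    static final int[][] IDENTITY = {
        { 1, 0, 0 },
        { 0, 1, 0 },
        { 0, 0, 1 },
    };
    // @formatter:on

    // Code after the "on" tag is formatted normally again.
    static int trace(int[][] matrix) {
        int sum = 0;
        for (int i = 0; i < matrix.length; i++) {
            sum += matrix[i][i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(trace(IDENTITY)); // prints 3
    }
}
```

The tags are ordinary comments, so the code compiles the same way whether or not the formatter option is enabled.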

As well as code formatting, the JDT team has introduced some new capabilities including: a breakpoint details pane

[Image: breakpoint details]

object instance counts

[Image: instance counts]

and static analysis improvements:

[Image: unused object allocation]

Huge kudos go to the very active JDT team, including: Jayaprakash Arthanareeswaran, Deepak Azad, Frederic Fusier, Walter Harley, Ayushman Jain, Satyam Kandula, Markus Keller, Dani Megert, Kim Moir, Michael Rennie, Srikanth Sankaran, Olivier Thomann, Raksha Vasisht, Curtis Windatt and Darin Wright. Over the next year the JDT team will be focusing on Java 7 support. If you are interested in helping with this effort, why not get involved?

In addition to Java specific enhancements, the Eclipse Platform team has been working on general IDE improvements.  One feature that really caught my eye was improved patch support.  Last year the Platform team improved the Java Compare Editor. However, these changes did not extend to the apply patch wizard.  As of Eclipse 3.6 this doesn’t matter because you can now use the synchronize perspective to apply patches:

[Screenshot: apply patch in sync view preference]

[Screenshot: ignore leading segments option]

This makes patch review a much easier process, especially since you can now apply a patch directly from a URL:

[Screenshot: apply patch using URL]

The Platform team (especially Tomasz Zarna and Szymon Brandys) deserve the credit for this work.  Thanks everyone for making my life as a Java Developer easier.

Categories: News

Improvements to API Tools, Top Eclipse Helios Feature #8

Ian's Blog - Mon, 2010-06-14 11:01

I’ve been thinking a lot lately about what defines an Eclipse project. Not in the literal sense (a project hosted at eclipse.org that follows the EDP), but rather: what technical qualities do all Eclipse projects share?

Years ago the answer was simple: extensible IDEs. More recently Eclipse was defined as a tooling platform (for everything and nothing in particular), but when you start to look at where Eclipse projects are being used (Eclipse RT and modelling projects in particular), you realize that ‘a tooling platform’ doesn’t begin to cover the spectrum.

Even eclipse.org has got out of the business of defining Eclipse. And while defining ‘Eclipse’ might not even be possible, there are a few technical qualities that all Eclipse projects share:

1. OSGi Based (most projects produce OSGi bundles)
2. A strong commitment to API
3. Meaningful versions

It’s the last two points that I want to focus on here.

Most Eclipse projects define their versions based on API. We don’t use version sequences like 3.0, 3.1, 95, 2000, XP, 7. Instead, Eclipse projects define their versions such that a change in a version number indicates API compatibility (or incompatibility). There was an excellent tutorial at EclipseCon on this topic, and I believe that the connection between API and versioning is so important that it should be part of the undergraduate curriculum for software engineering. Like many software engineering activities, managing the relationship between your API and version number is challenging, but it can be aided through tool support. Lucky for us there is an entire Eclipse component dedicated to API tooling.
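To make the link between version numbers and API compatibility concrete, here is a minimal sketch in Python. It assumes the major.minor.micro scheme that OSGi bundles commonly follow, where a major bump signals a breaking change, a minor bump signals a compatible API addition, and a micro bump signals a fix with no API change. The function names are made up for illustration and are not part of any Eclipse tooling.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    micro: int


def parse(text: str) -> Version:
    """Parse 'major.minor.micro', ignoring any trailing qualifier segment."""
    parts = text.split(".")
    # Pad short versions such as '3.6' with zeros, then keep three segments.
    major, minor, micro = (int(p) for p in (parts + ["0", "0"])[:3])
    return Version(major, minor, micro)


def describe_change(old: str, new: str) -> str:
    """Say what a version change promises under the assumed convention."""
    a, b = parse(old), parse(new)
    if b.major != a.major:
        return "breaking"
    if b.minor != a.minor:
        return "compatible API addition"
    if b.micro != a.micro:
        return "bug fix only"
    return "no change"


def can_consume(built_against: str, available: str) -> bool:
    """True if a consumer built against one version can use the available one."""
    a, b = parse(built_against), parse(available)
    return b.major == a.major and (b.minor, b.micro) >= (a.minor, a.micro)


print(describe_change("1.4.2", "2.0.0"))
print(can_consume("3.5.0", "3.6.0"))
```

The point of the convention is that a consumer can decide from the numbers alone whether an upgrade is safe, which is exactly the kind of check that tooling can automate.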

As we approach the Helios release, I’ve been counting down the Top 10 Features I’m most excited about. Number 8 on my list is Improved API tooling support.

API tooling has been included in Eclipse for a number of years now. When you enable API tooling as a consumer of an API, you can identify when you are using methods you shouldn’t be:

[Screenshot: API tooling for consumers]

Producers of APIs can use the tooling to help identify when they have ‘broken’ API:

[Screenshot: API tooling for producers]

While these features have been around for a number of years now, there are some noteworthy additions to API tooling. There is now a launch configuration which can be used to track API usage and generate HTML reports. This allows producers (and consumers) to see who is accessing non-API packages / classes / methods:

[Screenshot: API usage launch configuration]

[Screenshot: API usage scan report]

While this is valuable information, the most exciting advancement in API tooling is the new Migration Report. Using the migration report, we can determine whether our bundle(s) can be (easily) migrated to a new version. For example, if the API producer decided to change the signature of their apiMethod to include a new parameter (and they properly released version 2.0 of their bundle), we could run the migration report as a consumer. Doing so would uncover any migration issues. In particular, on Line 10 in the doSomething method, we invoke a method that no longer exists.

[Screenshot: migration report]

More information on the migration tasks can be found on the Eclipse help system.  Thanks to Chris Aniszczyk, Michael Rennie, Darin Wright and Olivier Thomann for this work.

Note:  If I missed someone in the kudos, please let me know. I do my best to track down who worked on each feature :-) .

Categories: News

Congress 2010 — An interdisciplinary experience in Montreal

Christoph's Blog - Mon, 2010-06-14 10:11

As suggested by the outside member on my PhD committee — Ray Siemens from the Department of English at UVic — I attended a day of Congress 2010 in Montreal in early June.

Congress 2010 refers to the Congress of the Humanities and Social Sciences, an event that is held once a year at a Canadian university. The Congress is the premier destination for Canada’s scholarly community in the Humanities and Social Sciences. Congress 2010 in Montreal at Concordia University had about 9,000 attendees.

The main reason I attended was a panel discussion between Pierre Levy and Alan Liu, two of the leading minds in the field of new technologies and their impact on society. It was a bilingual conversation (truly Canadian, with simultaneous translation) called “Collective Intelligence or Silicon Cage?: Digital culture in the 21st century”. Levy’s point was basically that digital media can help us understand our knowledge as a society because it works as a mirror of collective intelligence, while Liu warned that we run the risk of monotony and singularity, and that everything converges towards the same idea if we don’t have several institutions in between individual and universal. I was fortunate enough to have lunch with both panelists, and could discuss some of my research with them. Collective intelligence and emergent knowledge structures in software development are closely related.

It was very stimulating and inspiring to attend an event from a different discipline, and I found it also really interesting to see how these events are organized in other disciplines. Some good ideas that we might be able to adapt for Software Engineering venues:

  • Use a mix of panel discussions and paper presentations, to foster a more interactive environment. We have controversial issues in Software Engineering as well.
  • Produce YouTube clips with highlights from every day such as this one. It’s a great way to keep people in the loop who are unable to attend, and it captures the spirit of the event.
  • Choose a university campus as venue, especially one that’s right downtown. It was great to be fully immersed in Montreal during the lunch breaks, with a huge selection of lunch places.
  • Use a different pricing model. I paid a total of 15 dollars to attend Congress 2010.
  • Be interdisciplinary. Meeting researchers from other disciplines can be very inspiring. It forces us to focus on the essence of our work, gives us the opportunity to find a broader perspective, and can lead to great ideas. When it comes to related work, I’m thinking 15th century now…

Categories: News

CrossFit Scoring - Alternative Perspective

Sean's Blog - Sun, 2010-06-13 09:48
This is my third foray into the world of CrossFit data analysis, this time looking at scoring. Recently, over at the CrossFit Games website, there was an article about Scoring CrossFit Competitions as well as one talking about Scoring Technology. I participated in a number of conversations in the comments section with regards to how competitions are scored. This post is going to be a bit of a summary of those conversations as well as a description of a new extension to the CrossFit Data Explorer that allows people to explore different scoring schemes for any given competition (i.e. alternative perspectives).

For those that are unaware, there are a variety of approaches to scoring competitions. I am not going to go into detail about these different metrics as they are described in both the articles I linked to above. The main thing to keep in mind is that each scoring system has certain "flaws" and depending on how an event is scored, there can be different outcomes.

I wanted to explore some of these different scoring schemes as well as allow others to play with these different metrics. To do this, I added a new tab to the CrossFit Data Explorer called "Scoring tables". Check out the screenshot below displaying data from the Men's Northwest Regional.


This view shows a table of the athletes, their results for each event, and it is sorted by the overall placement. On the left-hand side, you can choose how to score the athletes. The currently available scoring options are:

  • Rank-based - each athlete receives points based on their ranking in a workout, 1pt = 1st, 2pts = 2nd, ..., 50pts = 50th. The athlete with the lowest overall score wins.
  • Proportional - Athletes receive scores based on their relative performance to the top scoring athlete in an event. Highest overall score wins.
  • Lowest converted points (LCP) - Timed workouts receive 1 point per second, all other results are subtracted from this score. Lowest overall point total wins.
  • Standard score - Athlete receives points based on the distance between their score and the population mean. Highest score wins.
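The four schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the CrossFit Data Explorer: the athletes and results are made up, "proportional" is read as the ratio to the best result in each event, LCP is read literally from the description above (seconds added, other results subtracted), and the standard score uses the population standard deviation with the sign flipped for timed events.

```python
from statistics import mean, pstdev

# Hypothetical results: event -> (lower_is_better, {athlete: result}).
# The run is in seconds, the lift is in pounds.
RESULTS = {
    "run": (True, {"Ann": 600.0, "Bea": 630.0, "Cat": 720.0}),
    "lift": (False, {"Ann": 150.0, "Bea": 200.0, "Cat": 175.0}),
}


def rank_based(results):
    """1 point for 1st, 2 for 2nd, ...; lowest total wins (ties not handled)."""
    totals = {}
    for lower_is_better, scores in results.values():
        ordered = sorted(scores, key=scores.get, reverse=not lower_is_better)
        for place, athlete in enumerate(ordered, start=1):
            totals[athlete] = totals.get(athlete, 0) + place
    return totals


def proportional(results):
    """Score relative to the best result in each event; highest total wins."""
    totals = {}
    for lower_is_better, scores in results.values():
        best = min(scores.values()) if lower_is_better else max(scores.values())
        for athlete, value in scores.items():
            share = best / value if lower_is_better else value / best
            totals[athlete] = totals.get(athlete, 0.0) + share
    return totals


def lowest_converted_points(results):
    """Add seconds for timed events, subtract other results; lowest wins."""
    totals = {}
    for lower_is_better, scores in results.values():
        for athlete, value in scores.items():
            signed = value if lower_is_better else -value
            totals[athlete] = totals.get(athlete, 0.0) + signed
    return totals


def standard_score(results):
    """Sum of z-scores, sign-flipped for timed events; highest total wins.

    Assumes every event has some spread, otherwise sigma would be zero.
    """
    totals = {}
    for lower_is_better, scores in results.values():
        mu, sigma = mean(scores.values()), pstdev(scores.values())
        for athlete, value in scores.items():
            z = (value - mu) / sigma
            totals[athlete] = totals.get(athlete, 0.0) + (-z if lower_is_better else z)
    return totals


print(rank_based(RESULTS))  # {'Ann': 4, 'Bea': 3, 'Cat': 5}
```

Note how LCP simply adds raw numbers from different events, so an event measured on a large scale can swamp the others.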

The other feature supported by this new scoring component is that you can filter workouts from consideration in the ranking process. For example, if you thought, "Damn, if that stupid deadlift workout hadn't been in the regional, I would have made it." Well, now you can verify whether that's true :-).

Ok, time to dig into comparing some scores. Keeping with the Men's Northwest Regional, if we re-rank the athletes based on the four supported scoring systems, we get the following results (the top three athletes from this region made the Games).

System        First            Second           Third
Rank-based    Chris Spealler   Jerome Perryman  Eric O'Connor
Proportional  Chris Spealler   Jerome Perryman  Eric O'Connor
LCP           Jerome Perryman  Jordan Holland   Christopher Dunkin
Standard      Chris Spealler   Jerome Perryman  Jordan Holland

As we can see, there's a certain amount of fluctuation in the results, especially for the LCP metric. In fact, with LCP, Eric O'Connor finishes 5th and Chris Spealler comes in 10th! (See screenshot below).


The major reason for this is the way the scoring system is designed. Each workout is scored completely independently from all other workouts, so a particular workout can completely dominate the final value. For example, coming in first in workout #1 at this regional gives you a score of 159 while coming in first in the second workout gives you a score of 10,908. That's a massive difference!

In fact, if we look at the correlation between the LCP ranking and the ranking based purely on the second workout result, there's a strong correlation of 0.45 (p < 0.01). In comparison to the actual official ranking results, the correlation between workout #2's ranking and the official ranking is only 0.37 (p < 0.01).
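For readers who want to try this kind of comparison themselves, here is a small Python sketch of a rank correlation. It assumes Spearman's coefficient on tie-free rankings, which is one common choice for comparing two rankings; the post does not say which coefficient was used, and the athletes and ranks below are invented. It does not compute a p-value.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two tie-free rankings of the same athletes."""
    athletes = list(rank_a)
    n = len(athletes)
    # Sum of squared differences between each athlete's two ranks.
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in athletes)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks: overall placement versus placement in a single workout.
overall = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
workout = {"A": 2, "B": 1, "C": 3, "D": 5, "E": 4}

print(spearman_rho(overall, workout))
```

A value near 1 means the single workout orders the athletes almost the same way as the overall ranking, which is the sign of one workout dominating the final result.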

One really nice thing about the proportional, LCP, and standard score is that they "reward" people for being exceptional. For example, consider the Canada Men's Regional results. Erik Szakaly completed the Wall-ball/Pull-ups workout more than a minute ahead of the next fastest athlete. Check out the picture below, look at how far Erik's result is separated from all other athletes. Obviously this guy eats wall-balls for breakfast.


He received a low score of 1 point at the regional, but in the standard score system he would have received 2.17 points, almost a full point ahead of the next closest score (see below). This is a fairly significant margin in this scoring system.



Perhaps some kind of hybrid scoring system is needed. I think as CrossFit evolves, so will the scoring metrics, just as we see with other sports.

As fun as it is to play with all this data, there are a few things to keep in mind. First is that switching the scoring system and re-ranking athletes assumes that the athletes would perform the same regardless of the scoring system. However, that's quite possibly not true. For example, Garth Prouse won the Canadian Regional run quite easily and received 1 point for coming in first (rank-based system). However, if he knew that he was being scored based on one of the other three systems, he may have pushed his pace harder to give himself a larger lead.

Another assumption that is made is that the people that designed the workouts would include the same set of workouts regardless of the scoring system used. This may not be the case. For example, in the Northwest regional, perhaps the workout designers would have scored workout #2 differently if they were planning on using an LCP system so as to not bias the result in favor of performances in this workout.

The other thing to be aware of is that my application has to make certain assumptions. It assumes that all workouts with a time-based result are ranked lowest to highest, and all other workouts go highest to lowest. In the Canadian Regional, the run did not have times recorded, so including this event in the score comparison leads to misleading results.
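The ranking-direction assumption can be sketched roughly as follows, in Python. This is an illustration and not the application's actual code: the time format (m:ss or h:mm:ss), the helper names and the sample results are all assumptions.

```python
import re

# Assumed time format: m:ss or h:mm:ss.
TIME_PATTERN = re.compile(r"^\d+(:\d{1,2}){1,2}$")


def to_seconds(text):
    """Convert 'm:ss' or 'h:mm:ss' to a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def rank_event(results):
    """Rank athletes in one event: ascending for times, descending otherwise."""
    timed = all(TIME_PATTERN.match(value) for value in results.values())
    if timed:
        key, reverse = (lambda name: to_seconds(results[name])), False
    else:
        key, reverse = (lambda name: float(results[name])), True
    ordered = sorted(results, key=key, reverse=reverse)
    return {name: place for place, name in enumerate(ordered, start=1)}


# Hypothetical results for a timed workout and a max-weight workout.
print(rank_event({"Ann": "12:30", "Bea": "11:05"}))
print(rank_event({"Ann": "150", "Bea": "200"}))
```

An event whose results are recorded as placements rather than times would not look time-based to a check like this and would be ranked highest to lowest, which is presumably the kind of misleading result described above.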
Categories: News

Feature based configurations, Top Eclipse Helios Feature #9

Ian's Blog - Fri, 2010-06-11 10:43

Yesterday I asked you to think about high-quality software that has been consistently delivered on-time for eight straight years. To make this quiz more challenging, this software should be installed on millions of users’ desktops.

As we approach the next major release of Eclipse — dubbed Helios — I’m counting down the top 10 features I’m most excited about. Yesterday’s feature was the improvements to Resources. Number 9 on my list comes to us courtesy of the Plug-in Development Team: Feature based targets and launch configurations.

I’ve read many articles over the years that compare IDEs. These articles often critique the support for particular programming languages. For example, some IDEs support Clojure or Haskell better than others, and if you use these languages then you should use the tool that best supports your workflow. For me, one of the most important IDE features is OSGi tooling. I spend my days writing bundles, crafting configurations, wiring together services and deploying OSGi based applications. Nothing compares to the OSGi tooling provided by Eclipse. The OSGi tooling is part of the Plug-in Development Environment. The reason it’s not called the OSGi Development Environment is strictly historical (these guys were doing OSGi tooling before it was cool).

Feature based targets and launch configurations make configuring your OSGi applications much more manageable. Features provide developers a way to ‘group’ bundles together and talk about things at a higher level. Instead of talking about the p2 engine, and p2 director, and p2 core, and all other bundles that ‘make-up’ p2, we can group these together into a feature: the p2 feature. For those of you who wish to craft OSGi applications that include p2, you can simply use the p2 feature (instead of worrying about the particular bundles that constitute this feature). The real thank-you for this feature goes out to Ankur Sharma and Curtis Windatt.

[Screenshot: feature based launch configuration]

[Screenshot: feature based target]

This will become more and more important as Eclipse continues to grow in the Run-time space.

Categories: News

Resource Improvements, Top Eclipse Helios Feature #10

Ian's Blog - Thu, 2010-06-10 14:51

Pop quiz: Can you name any ‘high quality‘ software that has been consistently delivered on-time, for 8 years in a row?

Yes folks it’s that time of year again! It’s time for the Eclipse Release Train to start revving up its engine for another high quality release. Unlike other software — which gets delayed, bumped or put-on-the-back-burner — Eclipse just simply, delivers.

The official Eclipse release — named Helios — is not arriving until June 23rd, but in keeping with tradition, I thought I would help count down the next 10 business days with the Top 10 Helios features I’m most excited about.

As many of you know, Eclipse is no longer Just a Java IDE. Eclipse is a tooling platform, Eclipse is a runtime stack, Eclipse is a set of world class IDEs, Eclipse is an eco-system, Eclipse is like family. Before I start to count down my favourite features, I should state that I only use a small subset of Eclipse. There are some really exciting things happening in the C/C++ development tools. The SWT team has added some fantastic new MacOS and Windows Vista support.

[Screenshot: overlay text]

[Screenshot: progress]

The BIRT and Web Tools teams continually pump out great code. While all these projects are doing cool things, sadly, I don’t get to use them on a day-to-day basis. Because of this I can’t talk about them here. As I’ve said in the past, this really is My Top 10 List. So please, if you disagree with me, write your own Helios Review (you might even win a prize).

Number 10 on my list is Resource Improvements. This includes everything from virtual folders to file permission management.

[Screenshot: virtual folder]

[Screenshot: file attributes UI]

As well, a number of enhancements have been added to the Open Resource dialog.

[Screenshot: Open Resource dialog with relative paths]

Thanks go out to the Resource Team, and in particular Serge Beauchamp, for this.

In addition to the improvements to resources, one of the oldest feature requests related to files has finally been fixed: Bug 4922 (yes, a 4 digit bug number). Eclipse now has the ability to open a file from the command line (and have it open in an existing running instance of Eclipse). While this may seem like a trivial enhancement request that all IDEs should support, the API behind this feature is what makes it so interesting. As RCP developers, you can make use of this API to load data files into other running processes. As Eclipse programmers you can double click Java files and have them open in Eclipse. And like everything in Eclipse, this works across platforms. If you’re interested in the technical details, check out Andrew’s Blog. Huge kudos go out to Andrew Niefer, Kevin Barnes, and Oleg Besedin.

Categories: News

ICSE 2010 highlights

Christoph's Blog - Tue, 2010-06-08 09:58

Now that the papers from ICSE 2010 are available in the ACM digital library (Volume 1, Volume 2, workshops), it’s time for a blog post about my personal highlights from ICSE 2010 in Cape Town, South Africa, in May 2010. This is of course very subjective, and it follows the tradition that Jorge Aranda started last year.

SUITE

For me, ICSE 2010 started off with SUITE, the 2nd International Workshop on Search-Driven Development organized by Sushil Bajracharya, Adrian Kuhn, Joel Ossher and Yunwen Ye.

I was pleasantly surprised by how well the very discussion-focused format of SUITE worked. The paper presentations were short (5 minutes) and they were all done in the morning. That left quite a bit of time for discussion in the morning, and even more time for discussion in the afternoon. These discussions didn’t seem overly regulated, and it was great to see how topics emerged.

Topics of the workshop ranged from API search and immediate search in the IDE to dynamic filtering and Semantic Web. As discussion topics for the afternoon we selected IDE integration, developer needs, and the creation of a reference collection for SUITE researchers. It was great to see that developer needs turned out to be the most popular topic — a first sign that the ICSE community is focusing on human aspects more and more.

FlexiTools

FlexiTools, the workshop on Flexible Modeling Tools, organized by Harold Ossher, André van der Hoek, Margaret-Anne Storey, John Grundy and Rachel Bellamy, covers a really interesting area. The workshop addresses the problem that formal modeling tools (such as UML diagram editors) and more informal but flexible, free-form approaches (such as white boards or office tools) have complementary strengths and weaknesses. The goal of this workshop is to develop flexible modeling tools that have the advantages of both approaches.

The workshop was structured into 3 paper sessions and a concluding discussion session. The day started with requirements for flexible modeling tools, focusing on support for creative work, support for incremental work and changing conditions, support for alternatives, and support for capturing the evolution of models. In the following session on “Unstructured to Structured”, we discussed how unstructured informal models could incrementally be transformed into structured formal models. Several tools such as BITKit were demoed in the afternoon session on Tool Infrastructure.

MSR

Due to several double-bookings, I wasn’t able to attend all of the working conference on Mining Software Repositories, but I did manage to present our MSR challenge paper on bug lifetimes in FreeBSD.

My personal highlight of MSR was Michele Lanza’s keynote “The Visual Terminator”. His brilliant slides are online on slideshare.net. The keynote was about software visualization, a term he defined as “The use of computer graphics to understand software”. After telling the stories behind tools such as CodeCrawler and CodeCity, he argued that it is time to rethink software, that software is more than text, and that visualization is the key. While I didn’t agree with all the points he made (I don’t think “empirical validation of visualizations is suicide”), the keynote had everything that a good keynote should have: an outsider’s perspective (visualization is not the core topic of MSR), a great speaker, and some provocative insights. My favorite quote: “Academic research is like Formula 1: driving around in circles, wasting gasoline … but it generates spin-off values!”

Web2SE

Our workshop on Web 2.0 for Software Engineering went really well. Two presentations by Sue Black outlining the results of surveys on the use of social media and Web 2.0 in software development set the stage for the rest of the workshop and also provided insights into her own use of social media, in particular twitter. The following two sessions were more tool focused — from tagging and commit messages to Codebook, mashups and wikis. Three topics were chosen by the participants for the concluding panel discussion: Information overflow, Privacy and Ethics, and the potential move of the IDE to the browser. The detailed notes from the discussion are available here.

To address information overflow, several solutions such as generating less information, generating summaries, voting, context-sensitive labeling, automated categorization, interaction mining and interruption management were discussed.

The discussion regarding privacy and ethics started with the example of the use of twitter at conferences, in particular the question whether it is ethical to quote other individuals such as keynote speakers in conference related tweets. We moved on to discuss our ethical obligations — both moral and legal — before reading communication channels from Open Source projects. A general problem identified in this discussion is that we do not have good metaphors for privacy in social media. The metaphors of filing cabinets and public art projects were suggested.

Looking at projects such as Mozilla Bespin and Heroku, there seems to be a trend of moving the IDE into the browser. First of all, it needs to be noted that neither the use of Web 2.0 mechanisms nor the storage of data “in the cloud” implies that the IDE has to be in the browser. Nonetheless, despite challenges such as concurrency, editor speed and naming schemes, there are reasons why the IDE could move to the browser: accessibility, collaboration, data integration, the same configuration for everybody and superficial reasons such as “the browser is the future”. To conclude the discussion, we had a vote among the workshop participants asking if IDEs should move to the browser. 10 voted yes, 4.5 voted no, and there was 1 maybe.

In the spirit of the workshop topic, we used Web 2.0 tools throughout the workshop to take notes collaboratively. While our Google Waves suffered from low bandwidth, twitter with the #web2se hashtag was quite active.

Doctoral Symposium

Due to the overlap with Web2SE, I wasn’t able to attend the doctoral symposium, and only went there to present my paper on emergent knowledge structures.

Main Conference

The main conference started off with a video welcome address by Desmond Tutu, and an excellent keynote by Clem Sunter. Without notes or slides, he gave a highly entertaining lecture on scenario planning with regard to South Africa and the world. He used the metaphors of foxes and hedgehogs to demonstrate different approaches to dealing with business decisions. According to Clem, foxes embrace uncertainty and change their mind when they realize that there’s something better out there. They reach the optimal decisions through their knowledge of the system as a whole. Hedgehogs on the other hand simplify life around one idea, more or less disregarding everything else. Clem did a great job relating these ideas to current events and leaders. His website mindofafox.com has more details.

The presentations of our paper on Awareness 2.0 and our NIER paper on tags went well. The highlights from other paper presentations included Andy Begel’s presentation on Codebook. The room was very crowded when he talked about the survey they did at Microsoft, which revealed that engineers need better ways to find connections between each other. The Codebook framework addresses this issue.

In his talk about Supporting Developers with Natural Language Queries, Michael Würsch presented a framework that is able to process guided-input natural language queries that resemble plain English. The approach is based on an OWL ontology and the Semantic Web. The next paper in the same session addressed a similar problem, focusing on questions that require the integration of different kinds of project information. Thomas Fritz and Gail Murphy started by identifying 78 questions that developers want to ask but for which support is missing. They introduced an information fragment model along with a prototype implementation that was evaluated with positive results.

Rachel Bellamy gave a great presentation on the paper Moving into a New Software Project Landscape. They conducted a grounded theory study with 18 newcomers across 18 projects and found a wide range of interesting things, such as the three primary factors that impact the integration of newcomers: early experimentation, internalizing structures and cultures, and progress validation.

Thomas Fritz’s presentation of their distinguished paper A Degree-of-Knowledge Model to Capture Source Code Familiarity started off with a cartoon clip that did a great job at outlining the idea of the paper: if several people collaborate on writing a paper, one author’s degree of knowledge of the paper will increase when this author edits the paper, and it will decrease as soon as another author edits the paper. Transferring that idea to source code, they showed that the degree-of-knowledge model can provide better results than existing approaches.

Cape Town and surroundings

Cape Town turned out to be a great place for a conference: Spectacular scenery, amazing food and wildlife not far away. I posted my best pictures here (public facebook album).

Categories: News

More CrossFit Data Analysis - This time you can play too!

Sean's Blog - Fri, 2010-06-04 15:06
Don't care about my post and just want the good stuff? Follow this link: CrossFit Data Explorer

My last blog post was quite easily my most popular post to date (although my Searchy-type problems post did receive a fair amount of attention). One former student of mine explained that my entry had, "the most practical findings for average people. Most of your posts even I have trouble understanding … you need a PhD".

Ok, so my blog isn't always for the faint of heart :-). Ah well. However, this time I'm going mainstream again. Well, not quite mainstream, but as mainstream as CrossFit is :-).

Due in part to the level of interest my last post generated as well as my own curiosity, I spent some more time looking at the CrossFit Games data. I've had some requests to analyze the women's data from the Canada Regional in the same way I analyzed the men's data. I haven't done that yet, but I think what I have done is pretty cool. And, more importantly, anyone can now interact and play with the data to discover their own interesting trends and results!

I decided to build a visual tool for interacting with the various data available on the CrossFit Games website. I started out building it as a standalone Java application, but that seemed soooo 1998. So, part way through development I abandoned the Java application in favor of a Web 2.0-style application, equipped with all the latest buzzword technologies (i.e. AJAX, jQuery, JSON, etc.).

I've made the application available online here, so feel free to play with it.

There are three different visualizations available that allow you to compare athletes based on different event modalities as well as athlete rankings across various events. It should work with any URL that resolves to an overall results page (example page: Men's Canadian Regional). I've pre-populated a drop down with all the regional results so you can select items from there or paste in an appropriate URL. The Men's Canadian Regional data is loaded by default. The application may not work with Internet Explorer, so I suggest Firefox, Safari or Chrome.

In the Compare events tab, you can set the graph axes to show results from different events. For example, using the default data, we can set the x-axis to be the first event, the 6.7 KM run, and set the y-axis to be the second event, the snatch complex. Comparing athlete values across these two events allows you to explore an athlete's strength versus endurance. If you look at the screenshot below, the tooltip shows that Garth Prouse was 1st in the run, but 30th in the snatch.


One interesting thing we can see in this view is how widely distributed the athletes are for various events. Consider the screenshot below where I plotted the overall placement versus the time for the double-under/burpee workout. Everyone is pretty closely clustered, but I've marked two distinct outliers. I'm guessing these two individuals must have struggled with double-under technique. Even though these two scores appear to be outliers, if you read my last post, there was little statistically significant difference between these athletes on this particular workout. In contrast, there is a lot of variance in the distribution for this workout when you inspect the women's data (second screenshot below).




In the Compare athlete rankings tab, you can inspect an athlete's ranking in each event as well as their overall ranking. For example, in the screenshot below I've selected only the top 6 athletes from the default data. We can see that three of these athletes (Erik, Nate, and Dan), for the most part, were pretty consistent across all events. On the other hand, DJ Wickham has an outlier on the run, while Garth and Michael have an outlier on the snatch complex event.


Finally, in the Event rank comparison tab, you can compare an athlete's ranking in a specific event versus their overall placement. This lets you see how closely an athlete's ranking in one event tracks how they did after completing all events. The screenshot below shows all athletes and their rank in the run versus their overall ranking. We see that Cam's ranking went from 25th in the run to 34th overall, while Jason Fleming went from 10th to 50th. In the other direction, DJ Wickham went from 38th to 6th overall.


For those with technical expertise or those just curious, here is how the application works. The client sends the URL for the overall results of a sectional or regional competition to the server-side code. The server requests whatever URL is provided, gets the HTML contents, and parses out the values of the overall results table using the PHP Simple HTML DOM Parser. I load this into a simple data structure (a hashtable of hashtables), which describes the athletes, the events, and all the various results. This information gets encoded as JSON and sent back to the client (front-end).
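To make the "hashtable of hashtables" idea concrete, here is a minimal sketch in Python rather than the PHP the application actually uses. The key names ("overall", "rank", "score") and the build_results helper are my own assumptions for illustration, not the real layout; the only data used is the one result quoted in this post.

```python
import json


def build_results(rows):
    """Build a dict of dicts keyed by athlete, then by event.

    rows is an iterable of (athlete, event, rank, score) tuples, standing
    in for the values parsed out of the overall results table.
    The nesting and key names are assumptions, not the app's real layout.
    """
    results = {}
    for athlete, event, rank, score in rows:
        results.setdefault(athlete, {})[event] = {"rank": rank, "score": score}
    return results


# Only a value mentioned in the post; everything else is left out on purpose.
rows = [("Rogers, Dan", "overall", 1, "37")]

# Encode the nested structure as JSON text, ready to send to the client.
payload = json.dumps(build_results(rows))
```

Keying by athlete first makes it cheap to pull out everything for one person, which is roughly what the per-athlete graphs need.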

On the client, I convert the JSON text into a JavaScript object/associative array. Then, based on whatever tab is selected, the data is processed into a data series for each athlete. I use Flot to handle the rendering of the graphs. All the interactive behavior and some of the UI is built using jQuery.
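The per-athlete processing step might look roughly like the sketch below. It is written in Python for brevity even though the real client code is JavaScript, and the to_series function, the "run" and "snatch" event keys, and the nested layout are all hypothetical. The output shape (a label plus a list of [x, y] pairs) follows the series format Flot accepts.

```python
def to_series(results, x_event, y_event, field="rank"):
    """Turn nested results into one Flot-style series per athlete.

    Each series has a label and a list of [x, y] points. Athletes missing
    either event are skipped. Names and layout here are assumptions.
    """
    series = []
    for athlete, events in results.items():
        if x_event in events and y_event in events:
            point = [events[x_event][field], events[y_event][field]]
            series.append({"label": athlete, "data": [point]})
    return series


# The one pair of rankings quoted in the post: 1st in the run, 30th in the snatch.
results = {"Garth Prouse": {"run": {"rank": 1}, "snatch": {"rank": 30}}}
compare_events = to_series(results, "run", "snatch")
```

Giving each athlete their own series is what lets the tooltip name the athlete under the cursor.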

The data is split into results for each event and overall placement. For each of these, the results are split into rankings and actual scores. For example, in the default dataset, athlete "Rogers, Dan", has both an overall rank of first as well as an overall placement score of 37. I do some simple things like recognize times, which for plotting purposes are converted into seconds.
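Recognizing times could be done along these lines. This is only a sketch in Python (the application does this in its own code), and it assumes times look like "mm:ss" or "h:mm:ss", which may not match every format on the results pages. The score_to_number name is made up for this example.

```python
import re

# Assumed time formats: "mm:ss" or "h:mm:ss". Anything else is not a time.
TIME_PATTERN = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})$")


def score_to_number(score):
    """Convert a score string into a number that can be plotted.

    Times become seconds, plain numbers become floats, and anything
    unrecognized returns None so the caller can skip it.
    """
    text = score.strip()
    match = TIME_PATTERN.match(text)
    if match:
        hours, minutes, seconds = match.groups()
        return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    try:
        return float(text)
    except ValueError:
        return None
```

For example, score_to_number("27:15") gives 1635 seconds, while a plain score such as "37" passes through as 37.0.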

Please post any questions, suggestions, or general comments. Please let me know if you discover anything interesting :-).
Categories: News

Eclipse Labs: To Migrate or Not to Migrate?

Del's Blog - Thu, 2010-06-03 17:01
I'm really happy about the recent creation of the new Eclipse Labs. While I would absolutely love to have my Diver project as an official Eclipse project in incubation, I'm afraid that the process is too heavy-weight. I work in research which often requires a very malleable programming schedule... I don't think that I would be able to hit the release targets of Eclipse. Plus the IP process can be a little long. My friend and colleague Ian Bull was able to make his Zest project an official Eclipse project (it is now part of GMF/draw2D) while he was doing his Ph.D... he did have a little programming help from some of the other members of our lab ;-). I'm a Masters student, though: the program is too short to wait for my project to be approved by the committers.

So, Eclipse Labs is exactly perfect for what I need. It has the benefit of the Eclipse brand, but it doesn't carry the baggage of official Eclipse projects. It's too bad that it wasn't available six months ago when I was making Diver a tool open to the public. The best option I had at the time was SourceForge, which has been quite good, really.

So, now I'm left with the question of whether or not to migrate Diver to the Eclipse Labs hosting. I would really like Diver to have a closer tie to the Eclipse brand. But, I would lose a lot of work that was put into creating the project content for Diver over at SourceForge. Most specifically, I would lose the hard work that I put into creating the web site. Google Code has its wiki, which is OK, but it isn't as rich as what I've been able to make with my web page.

So: to migrate, or not to migrate? I don't know. Is there any real benefit?
Categories: News