Who’s on the O’Reilly Open Source Radar?

From O’Reilly Radar:

Greenplum – Scott Yara. At the heart of most web 2.0 applications is the management of big data. Greenplum’s massively parallel Postgres database is the highest performance open source database around.

Hyperic – Javier Soltero. If operations is advantage, Hyperic would like to help more companies find that advantage. They’re also building a Web 2.0 ecosystem in which the software gets smarter the more people use it.

Django – Adrian Holovaty. Like Rails, Django is a case where the application was kept proprietary, but the framework used to build it was released to the world. Is this the new model for how to open source a web application?

DabbleDB – Avi Bryant. OK. This thing is built with Smalltalk. What else do you need to know? It’s a web-based database, potentially asymmetric competition for Access and FileMaker, but it’s also a data multiplexer that can help to give people more control over their own data.

Alfresco – Matt Asay. There are a lot of open source content management systems, but Alfresco is the one that’s targeted where the money is, and that has built the robust data store to meet the needs of big companies.

On the centrality of Excel

I have a very specific use case in mind for Enterprise Web 2.0: a business analyst mailing around spreadsheets. I’m sure it’s consistent from company to company.

There are a lot of people who are skilled Excel users who can do sophisticated analysis and reporting but who are cruelly (cruelly!) limited by their available tools. It would be fantastically useful for them to be able to call external services using something simple and familiar like Excel formulas.

A lot of times, people use Excel *just* for presentation, which is absurd, of course, but they know how to do it. HTML would be better, but it’s not as widely known. Most people I work with have never touched HTML and don’t even know to look at “view source” in their browser.

Have you ever tried to move Excel tables onto the web? Someone who went to the trouble of creating an Excel export to HTML tables would be doing everyone a great favor.

Then all you have to do is add some all-important eye candy and publish it. Instead of emailing around zipped files of multi-tabbed Excel extracts from Oracle databases, you could have a URL, perhaps with an RSS feed, that would give me the three numbers that I really need: revenue, utilization, and pipeline for my practice.
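To make the "URL with an RSS feed" idea concrete, here is a minimal sketch of what generating such a feed might look like. Everything specific in it is an assumption for illustration: the function name `build_metrics_feed`, the intranet URL, and the figures themselves are all made up, and a real version would pull the numbers from wherever they actually live.

```python
import xml.etree.ElementTree as ET


def build_metrics_feed(practice, metrics, link):
    """Return an RSS 2.0 document (as a string) with one <item> per metric."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{practice} weekly numbers"
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Key figures for the practice"
    for name, value in metrics.items():
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{name}: {value}"
        ET.SubElement(item, "link").text = f"{link}#{name}"
    return ET.tostring(rss, encoding="unicode")


# Placeholder figures and URL, invented purely for illustration.
feed = build_metrics_feed(
    "Example practice",
    {"revenue": "1200000", "utilization": "0.78", "pipeline": "3400000"},
    "http://intranet.example.com/practice",
)
print(feed)
```

Anyone with a feed reader could then subscribe to the three numbers instead of unzipping an attachment to find them.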

Enterprise Web 2.0 Use Case

Our hero, Kevin Whatnot, is a business analyst at Kossar’s, a tier-one bialy manufacturer. He’s responsible for maintaining the sales forecast and the factory production and utilization information, among other tasks.

He gets this information from a mix of sources: Kossar’s Siebel SFA (sales force automation) system, a homebrewed SQL Server-based inventory system, and a partially implemented Manugistics package for supply chain. And email. And lots and lots of phone calls.

Every Thursday, he hosts two long conference calls: in the morning, for sales managers to give their updated sales forecasts for each of their regions, and in the afternoon, for factory managers to give their updated production forecasts. He crunches all of this information in a complicated Excel spreadsheet that he mails out first thing on Friday morning and then again on Wednesdays, with updates from the various systems, in preparation for the big Thursday calls.

Now, Kevin is a skilled Excel user; he’s capable of doing sophisticated analysis in Excel, but he tends to use it for *everything* including tasks that it’s not ideally suited for, such as presentation. The sets of spreadsheets that Kevin emails around each week are each multi-tabbed affairs, sent as zipped attachments to save on space, but which do not themselves do a lot of calculation. Excel largely serves as a presentation layer for Kevin’s massively complex data.

He’s more of a user than a superuser for the other enterprise apps he deals with: Manugistics, for example, is mostly a mystery to him but he needs a couple of numbers from it each week and he knows how to navigate the system to find them. Likewise with the homebrewed database application; Kevin suspects that no one understands it completely anymore. He gets around inside of it like he drives around Boston; lacking signposts or a navigational system he can comprehend, he relies on landmarks and prior experience. And he hopes that he doesn’t make a wrong turn.

His victims, the sales and factory managers, are likewise cruelly (cruelly!) limited by their available tools. They understand Excel, perhaps not as well as Kevin but they probably have a better sense of the business implications of his number-crunching.

Google Spreadsheets

So Bob Hull, who doesn’t give a rat’s ass about Web 2.0, sends out this email today to a couple of other spreadsheet nerds in my practice:

Hey,

Been trying out the web-based shared spreadsheet beta at google. I’m not sure how fully compatible it is with .XLS formats, but it is a great tool for collaborating remotely (e.g., when Nadine and I were working on the [client] business cases, which weren’t full of advanced functions, it would have helped a lot).

If you haven’t tried – do

I’m sure there are some security issues – although it is PW protected.

The specific case he refers to is illustrative in a couple of ways. We were working on a Linux migration strategy for a big Novell customer, including doing financial analysis of several scenarios (e.g., Oracle/Solaris -> Oracle RAC/Linux, Windows -> virtualized Windows on VMware or Xen). So that’s cool and everything. But we were also working in different locations (Bob’s wife was hospitalized during the project, so he had to leave to care for her, but he foolishly kept working remotely) and with terrible collaboration tools. It was really a case of the slow boat in the convoy determining the speed for everyone.

On that project, as Bob indicates, we really could have used Google’s spreadsheet, or one of the alternatives that are out there.

And, by the way, I’ve been testing it and it seems to import .xls files – as long as they’re fairly simple – with no problem at all. So you can’t bring in a multi-tabbed, pivot-table-rich charting dashboard, but the single-tab discounted cash flow analysis comes across fine. And you could reasonably argue that doing DCF is what a spreadsheet ought to be doing, not running your company.

So what?

The product itself is interesting, in that it’s a viable alternative right off the bat to Excel and OpenOffice. But, as I’ve said before about Writely (which Google has subsequently acquired, so they have a word processor and a spreadsheet, and check out S5 for presentations), these web-based office tools have built-in advantages. Bob highlighted the collaboration aspect, but there are others. For me, foremost among them is format. I don’t know how long .doc is going to be around, but I can bet you that .html is more viable.

Google’s spreadsheet is the first good way I’ve seen to go from .xls to .html, which is not trivial. A lot of spreadsheets, including at Novell, get mailed around principally as presentation layers: columns and rows of numbers. You can do this in HTML, of course, but that’s not in the toolkit for most Excel users. There’s an Export to HTML option in Excel (which is generally very good about importing and exporting), but it generates ‘orribly non-standard HTML. So if I want to generate a web page from my Excel spreadsheet so that I can share it with my teammates, currently the best option that I know of is Google Spreadsheets. Or maybe MS SharePoint.
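For a sense of how little is involved in the "columns and rows of numbers" case, here is a rough sketch that turns a sheet into a plain HTML table. It assumes the sheet has already been saved as CSV (it does not read .xls directly), and the function name `csv_to_html_table` and the sample data are made up for illustration.

```python
import csv
import io
from html import escape


def csv_to_html_table(csv_text):
    """Convert CSV text into a plain HTML table; the first row is the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    lines = ["<table>", "  <thead>", "    <tr>"]
    # escape() keeps characters such as & and < from breaking the markup.
    lines += [f"      <th>{escape(cell)}</th>" for cell in header]
    lines += ["    </tr>", "  </thead>", "  <tbody>"]
    for row in body:
        lines.append("    <tr>")
        lines += [f"      <td>{escape(cell)}</td>" for cell in row]
        lines.append("    </tr>")
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)


# Invented sample data, standing in for one tab of a real spreadsheet.
sample = "Region,Forecast\nNortheast,100\nR&D <pilot>,25\n"
table_html = csv_to_html_table(sample)
print(table_html)
```

This only covers the presentation-layer case. Formulas, charts, and multiple tabs are exactly the parts that make the real conversion non-trivial.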

Also, if Google plays their hand correctly, it will get better quickly on the basis of the community around it. Do you need a function to convert pre-July 11, 1998 Thai baht to dollars? Maybe someone had that itch once and scratched it and released it into the wild, so not only are there the Excel functions from Microsoft (=sum(b1:b15)) but also community ones (=oldbhat(value, target_currency)).
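One way to picture that is as a shared registry of functions that formulas look up by name. The sketch below is purely hypothetical and says nothing about how Google actually does or will do it: `register`, `call`, and `make_oldbhat` are invented names, and the 0.5 rate is a dummy value, not a real exchange rate.

```python
FUNCTIONS = {}


def register(name, func):
    """Add a function to the shared registry under a case-insensitive name."""
    FUNCTIONS[name.lower()] = func


def call(name, *args):
    """Look up a function by name, as a formula like =name(args) might."""
    return FUNCTIONS[name.lower()](*args)


def make_oldbhat(rates):
    """Build a converter from a caller-supplied {currency: rate} table."""
    def oldbhat(value, target_currency):
        return value * rates[target_currency]
    return oldbhat


# A built-in, the kind of function the vendor ships.
register("sum", lambda *values: sum(values))

# A community contribution. The rate is a dummy placeholder, not real data.
register("oldbhat", make_oldbhat({"USD": 0.5}))

print(call("SUM", 1, 2, 3))
print(call("oldbhat", 200, "USD"))
```

What matters here is that the built-in and the contributed function sit side by side in the same namespace, so whoever writes the formula never has to care who wrote the function.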