Oracle Glassfish Now Supports Jython and Django

Oracle -- as you know -- plans on purchasing Sun and all their Java-licious technology. This includes the open source Glassfish application server, a free competitor to Weblogic, which Oracle obtained in the BEA acquisition... and both competed with OC4J, which was Oracle's application server prior to 2008.

I -- along with everybody else -- am very curious to see how all this plays out... It certainly appears that OC4J has lost favor, and Weblogic stole the show... but now Oracle "owns" an open-source alternative to Weblogic as well. So which one should you choose? Naturally, this depends a lot on what out-of-the-box features and integrations you need... But if I were a developer creating a new application from scratch, I'd probably go with Glassfish. Besides being open source, it will soon have built-in support for the JRuby/Rails and Jython/Django web frameworks. To me, that says the people behind Glassfish really "get it" when it comes to delivering web frameworks that make developers more productive...

According to Vivek Pandey's blog, the latest preview release of Glassfish v3:

  1. Provides GlassFish v3 connector and deployer as OSGi module. Which means that deployment of a Python application will trigger Jython Container code.
  2. Wire up the HTTP request and response at very low level by implementing a GrizzlyAdapter, hence resulting in better runtime performance and scalability using grizzly scalable NIO framework.
  3. WSGI (Web Server Gateway Interface) is a Python standard to wire a Web Server to Python web frameworks such as Django or TurboGears etc. Jython Container implements WSGI interface and so it would be pretty easy to add support for various Python web frameworks. Currently, we have Django and we will have others such as TurboGears, Pylons etc.
  4. Currently Jython Container is available thru GlassFish v3 Update Tool. In the future it may appear with GlassFish v3 core distribution.
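If you haven't seen WSGI before, here is a minimal sketch of the interface that item 3 talks about. This is plain Python 3 for illustration, not the Jython Container's actual code; the names `application` and `call_app` and the greeting text are my own assumptions.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI app: receives the request environ, returns an iterable of bytes."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from WSGI, you asked for %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Play the part of the web server: build an environ, call the app, collect the reply."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a real server would provide
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(application, "/polls/"))
```

Any container that can build the `environ` dictionary and supply a `start_response` callable can host a callable like this, which is presumably why one WSGI implementation is enough to support several frameworks.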

His blog also has step-by-step instructions about how to enable Jython and Django... with luck, this will be rolled into the final release, so these steps will be easier.

I'm also curious to see what Jake and the AppsLabs boys might think about Glassfish... those guys are building some of Oracle's most "social" applications, and they are big JRuby/Rails fans. I'm more of a Python/Django guy myself. I've said many times that if I were to rewrite the Oracle Content Server from scratch, I'd probably pick Django as the core framework... But Django in a Java container??? That's even better! Quick coding, easy modifications, plus the reliability of Java.

But that's just for my needs... others may prefer the "Weblogic way" for different reasons.

Looking Forward To The Weekend...

Man, this has been a hectic few weeks... I just launched one site for a client. It went smoothly, but it was a lot of work and late nights. I've been spending so much time writing documentation that I lost the will to blog. Unfortunate... considering what happened this week.

I'm talking specifically about the highly disruptive Google I/O conference. It looks like Google Wave is going to be huge... it will no doubt set the standard for web-based email collaboration.

I'm happy that it's using XMPP instead of HTTP behind the scenes... this is a great idea, since XMPP is a high-end instant messaging protocol, whereas HTTP is a freaking dinosaur. I'm hoping this push will mean that browsers might natively support XMPP in the near future... Imagine that! Being able to get data -- like RSS Feeds, new email messages, and bundles of web sites -- pushed to you when they change, instead of having to poll the web site a bazillion times... or use awkward and obtuse asynchronous JavaScript. This technology choice has caused a few folks to predict the downfall of HTTP.

Nothing would make me happier than the death of HTTP, but it's not happening yet... As others have noted, Google Wave is still very dependent on HTTP... it only uses XMPP for server-to-server communication. The web browser still has to poll the server for more data. Although, I'd wager that once this takes off and Google servers are swamped, they might sneak XMPP into Google Gears and use that instead.

It looks like Wave will be easy to integrate with, and it's all open-source... You don't need to host it at Google; you can install their server, or just implement the protocol. This is good, considering how many enterprises might want to make Microsoft Exchange more Wave-y. I have a couple of ideas for Wave plug-ins... but I have to wait until Google gets me a user account for testing :-(

Oh well... It's probably for the best. I could probably use one less distraction this month...

How ONE Bad Business Process Doomed GM

A friend of mine was on a flight a few years back with a Business Process Management (BPM) expert, who had done a lot of consulting for various car companies... the fellow told an interesting tale about how one single bad business metric made him swear off GM cars forever... and it just might be a major reason for their downfall.

The business process seemed innocuous enough at first glance... GM wanted to control costs on their auto parts. So, the process said to test every auto part, and look for the most expensive parts that were the most reliable. Next, ask suppliers for a cheaper version of that part. Sure, the cheaper version of that part might fail more often... but so what? These are the parts that were severely over-engineered. You don't need space shuttle quality parts in a minivan... so what's the harm?

Notice the problem? If not, don't worry... neither did the millionaires who ran GM.

Now, consider how a rival car company dealt with the same problem. They would also run tests on every car part. They would also keep metrics on which car part failed the least, and which failed the most. However, they did very different things with the data. The rival company took the parts that failed the most often, and either demanded higher quality versions, or searched for a new vendor. Sometimes these changes would increase the price of the part, sometimes they would decrease it.

Now do you see the problem? Each business process had an overall side effect on the quality of the cars produced. As rival companies made their cars more and more reliable, GM was making theirs less and less reliable! Instead of focusing cost cutting on the overall finished product, they decided to tie cost cutting directly to making lower quality cars!
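To make the difference concrete, here is a toy sketch of the two selection rules. Everything in it is an assumption for illustration: the part names, costs, and failure rates are made up, and so is the guess that a cheaper part fails twice as often while an improved part fails half as often.

```python
from collections import namedtuple

Part = namedtuple("Part", ["name", "cost", "failure_rate"])

# Made-up parts and numbers, purely for illustration.
PARTS = [
    Part("alternator", 120.0, 0.01),
    Part("door latch", 15.0, 0.08),
    Part("fuel pump", 90.0, 0.02),
    Part("window motor", 40.0, 0.10),
    Part("wiper blade", 8.0, 0.03),
]


def cheapen_targets(parts, count=2):
    """First process: among the most reliable half, pick the priciest parts to make cheaper."""
    most_reliable = sorted(parts, key=lambda p: p.failure_rate)[: len(parts) // 2]
    return sorted(most_reliable, key=lambda p: p.cost, reverse=True)[:count]


def improve_targets(parts, count=2):
    """Second process: pick the parts that fail most often and demand better ones."""
    return sorted(parts, key=lambda p: p.failure_rate, reverse=True)[:count]


def with_failure_rates_scaled(parts, targets, factor):
    """Return a new parts list where the targeted parts fail `factor` times as often."""
    names = {p.name for p in targets}
    return [
        p._replace(failure_rate=p.failure_rate * factor) if p.name in names else p
        for p in parts
    ]


def failures_per_car(parts):
    """Crude quality metric: the sum of the per-part failure rates."""
    return sum(p.failure_rate for p in parts)


baseline = failures_per_car(PARTS)
# Assumed effect sizes: cheaper parts fail 2x as often, improved parts half as often.
cheaper = failures_per_car(with_failure_rates_scaled(PARTS, cheapen_targets(PARTS), 2.0))
sturdier = failures_per_car(with_failure_rates_scaled(PARTS, improve_targets(PARTS), 0.5))

if __name__ == "__main__":
    print("baseline:", baseline, "cheapen the best:", cheaper, "fix the worst:", sturdier)
```

The exact numbers don't matter. The point is that the first rule can only push the failure total up, and the second can only push it down, no matter how many rounds you run.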

After realizing this, and being completely unable to get anybody at GM to change their metrics, the gentleman decided to swear off GM cars forever...

Peter Senge warned about similar problems almost 20 years ago in his famous book The Fifth Discipline. Side effects, negative feedback loops, and simple delays in cause-and-effect can wreak havoc on any business process you put together. No matter what your metric, there is always a case where a "good" result is a very bad thing... The key is to try to predict how that could be possible... otherwise, by doing your job better and better, you just might be dooming your company to mediocrity.

Naturally, it's probably unfair to blame this all on one single business process. There are also the armies of people asleep at the switch who should have done something to correct it. Unfortunately, the folks who design the business processes are usually unable or unwilling to accept this harsh reality... and if they are politically powerful, bad processes remain. This is why we need "nice" tools that help gather information about these negative consequences that are outside of the model, and make it clear that it's time to change...

That's one of the stated goals of Enterprise 2.0 tools... but even they can't help you unless you first try to build up trust and camaraderie in your company. Only then is it easy for people to accept the harsh reality.

Tweeting The Mona Lisa

As many of you know, Twitter limits you to only 140 characters in each "tweet." That doesn't sound like much... but if you try you can cram a good deal of data in those 140 characters! In fact, Quasimondo figured out how to tweet a pretty good version of the Mona Lisa!

The technique was pretty clever: Tweet in Chinese! Twitter allows you to use UTF-8 characters, which means if you pick a language with a lot of possible characters -- like Chinese -- you can encode a great deal of data into one single character. If properly encoded, you can cram 210 bytes of data into 140 Chinese characters, which works out to 12 bits per character.
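Here is a rough sketch of how that kind of packing could work. It is not Quasimondo's actual encoding, just one way to do it under my own assumptions: each character carries 12 bits, mapped onto the first 4,096 code points of the CJK Unified Ideographs block starting at U+4E00. The names `pack` and `unpack` are hypothetical.

```python
BASE = 0x4E00        # start of the CJK Unified Ideographs block
BITS_PER_CHAR = 12   # 140 characters * 12 bits = 1680 bits = 210 bytes


def pack(data):
    """Pack bytes into CJK characters, 12 bits at a time."""
    total_bits = len(data) * 8
    n_chars = -(-total_bits // BITS_PER_CHAR)  # ceiling division
    bits = int.from_bytes(data, "big")
    # Pad with zero bits on the right so the length is a multiple of 12.
    bits <<= n_chars * BITS_PER_CHAR - total_bits
    chars = []
    for i in range(n_chars):
        shift = (n_chars - 1 - i) * BITS_PER_CHAR
        chars.append(chr(BASE + ((bits >> shift) & 0xFFF)))
    return "".join(chars)


def unpack(text, n_bytes):
    """Reverse of pack(); n_bytes says how many bytes to recover."""
    bits = 0
    for ch in text:
        bits = (bits << BITS_PER_CHAR) | (ord(ch) - BASE)
    # Drop the padding bits that pack() added on the right.
    bits >>= len(text) * BITS_PER_CHAR - n_bytes * 8
    return bits.to_bytes(n_bytes, "big")


if __name__ == "__main__":
    payload = bytes(range(210))
    tweet = pack(payload)
    print(len(tweet), unpack(tweet, 210) == payload)
```

With 12 bits per character there is room to spare, since the block holds far more than 4,096 ideographs. A denser scheme could squeeze in more, at the cost of a fiddlier mapping.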

So, the guy came up with a way to sketch the Mona Lisa in about 200 bytes, then encoded it into 140 Chinese characters. You can see the results below, which look pretty cool. The English translation is a tad odd, however:

The whip is war
that easily comes
framing a wild mountain.

Hello, you in the closet,
singing--posing carved peaks
of sound understanding.

Upon a kitchen altar
visit a prostitute--
an ugly woman saint--
who decoys.

Particularly
lonesome mountain valley,
your treasury: a dumb corpse and
funeral car, idle choke open.

Reclassification:
exactly what you would call nervous.
Well, do not suggest recalcitrance
those who donated sad.

The smell of a rugged frame
strikes cement block once.

Where you?
Cape. Cylinder. Cry.

Interesting... It makes me wonder what the Tao Te Ching would "look" like... It also makes me wonder what kind of word salad we would get if we "translated" corporate logos into Chinese...

Petition to Update The HTML on Example.com

You want to see a web site that's still stuck in 1994? Check out example.com.

This site -- along with its cousins example.org and example.net -- is a reserved second-level domain, set aside by the Internet Engineering Task Force (IETF) in RFC 2606. It's not available for sale, because it's used quite frequently in technical documentation.

Unfortunately... I don't think it's been updated for well over a decade... and I think this site is in dire need of a facelift. I don't mean that it needs to be prettier -- although that would be a plus -- but it's also slightly broken on the technical side... and the text is relatively unhelpful. Below are my primary complaints:

Unhelpful Text

The people who click on links that contain example.com are probably new to the internet, new to designing network systems, or new to the concept of web standards and RFCs. This site is a huge opportunity to educate people about why this site is reserved, why people use it in documentation, and the importance of web standards. It needn't be much... maybe some friendly text, and a link to the Wikipedia page about the site.

Invalid HTML

I'm not kidding! There are two lines of text, and the W3C validator spots three markup errors! They mix uppercase and lowercase in the HTML elements, and there's no DOCTYPE. That's just plain lazy...

Poor Error Handling

Most sites these days have a "smart" error page if the page you are looking for cannot be found. Say for example you wanted to demonstrate how URLs will look on your web site, you might make a link that looks like this:

Clicking on that link would naturally bring you to an unhelpful error page on example.com. I'd argue that a nicer error page would show you what URL you clicked, and then give the standard boilerplate about how that site was only an example.

I Have to Say... It's Kind of Ugly

OK, so I downplayed this initially... but, man, is this page ugly. I have no problem with simple web sites, but this just looks slapdash. Anybody who knows HTML could do it 100x better, but still keep it simple and clean.

The Solution?

I don't know what the best solution would be... however I think it would be a good idea for the IETF to have a redesign contest. Maybe something totally minimalistic that will still render perfectly in an old Netscape browser, but still looks attractive, and has more helpful text. Then they could open up voting on which one to choose. I'd suggest posting each one to Reddit/Programming, then count which one gets the most "up" votes.

I put together a web petition to encourage the IETF to have a redesign contest... if you think it's a good idea as well, please sign it!

Wolfram Alpha... yawn...

Continuing on my anti-semantic-web rants... I feel obligated to note that I expect very little of use to come out of the latest pony in this show: Wolfram Alpha. There were the obvious insanely overly-optimistic reviews that said it's a search engine that will change the universe forever!!! A few other folks were cautiously optimistic that it might have limited long-term value... I think Spiegel Online had the best summary:

Clever presentation, but a weak database: The soon-to-be-launched Wolfram Alpha search engine is already being touted as the "Google killer." SPIEGEL ONLINE has tested a preliminary version. The conclusion: It knows a lot about aspirin, a little about culture -- and it thinks German Chancellor Angela Merkel's political party is an airport.

It's not really a "Google Killer." It's not even a search engine per se... I'd describe it as a slightly smarter almanac... and it's going to take about 1000 full time employees just to keep it vaguely useful. In general, the more clever your code gets, the more likely it is to go off the deep end and give you very very bad data.

Personally, if the inventors dial back their claims, this might have limited use... if not, it will probably languish like pretty much every similar hunk of software in the past... (anybody out there remember Cyc?)

Blog Silence...

Sorry for the lack of blogging, folks... Last week was IOUG Collaborate, and I was usually indisposed. For those who didn't make it, you can check out my presentation on Slideshare. I gave my talk on A Pragmatic Strategy for Oracle ECM, as well as my Top 10 Ways To Integrate With Oracle ECM.

Billy put up some of his talks as well. The How To Be A Rock Star with ECM talk was well received... although the slides don't quite capture the whole presentation.

Overall, I was pleased with the turnout... I was kind of bummed out that there wasn't a bigger Oracle ACE presence there. I saw Dan Norris a few times, but there wasn't an 'official' ACE briefing. Oh well... I guess I'll need to wait for Oracle Open World in the fall. With all the new Sun customers and partners, that place is going to be chaos.

Year In Review...

It's been three years since this blog's inaugural post... so I thought I'd reflect on the past. Unlike most folks who take stock on January first, my blog's official "end of fiscal year" is April 29th. So I'll take the opportunity to have new year's in April, and recount my most popular blog posts of the past 12 months:

  1. Google Owes Me A Pony: (11,264 hits) my take on how the new economy makes Google a much more powerful player than most people realize.
  2. The Idiot Test: (8,480 hits) a fun game that tests your ability to spot the obvious.
  3. How to make Vodka Infusions: (6,923 hits) a perennial favorite
  4. Empathy Versus Sympathy: (4,932 hits) one that I'm very proud of... as one commenter said, "Dude, you just summed up marriage in one easy little post!"
  5. How many hits does your site really get: (4,422 hits) my take on how to analyze data from page popularity metrics (like this!)
  6. The Official Guide To The Idiot Test: (3,467 hits) the cheat sheet for the above-mentioned game
  7. Why Does Vista Suck: (3,370 hits) a question for the ages...
  8. Tell IKEA to bring back the Coolest Desk Ever: (3,168 hits) my petition to bring back the favorite desk of Lifehackers worldwide
  9. Why Developers Love BAFFLINGLY Complex Code: (2,995 hits) a take on the psychology of why code will always be complex and brittle
  10. I'm Sorry I Invented Object Oriented Programming: (2,339 hits) the title says it all...

Overall, I had 180,516 page views over the year (according to Google Analytics)... so the top 10 articles were about 28% of my traffic. Naturally, several of these posts were written in 2007 or 2006, but they are still going strong! Only four of the top 10 were written in the past 12 months...

I've lately been trying to focus on "evergreen content", which means blog posts that will be relevant for multiple years. Such posts take longer to write, and too much emphasis on them can sometimes prevent you from updating your blog as frequently as you should... but on the plus side, your posts from three years ago will still be hot spots on the web ;-)

Time For The IOUG 2009 Conference!

If you are attending the IOUG Collaborate conference this year, you might want to check out my talks:

These are both repeats of the ones I gave at Open World 2008 a few months back... although the "Top 10 Ways" have changed a little bit since the introduction of the RIDC connector... I'm planning on something completely different for Oracle Open World this year. ;-)

I'm also doing a book signing after my "Pragmatic Strategy" talk... It will be at 2:30pm on Monday, at the bookstore. The bookstore is in the middle of level 2, outside the entrance to the exhibit hall. If you'd like a signature for either book, swing on by!

Unfortunately, there aren't any plans for the Oracle ACEs to get together... although I'm pretty sure that Dan Norris and others will be attending.

The Bucket List

Pie is feeling reflective... so he tagged a few of us with the question what do you want to do before you die?

I actually covered this back in 2007 in a post about The Buried Life. In case you don't know, The Buried Life is a show about 4 college guys who made a list of 100 "what to do before we die" tasks. Then, after making the list, they asked themselves what the heck are we waiting for?!?! So they rented a motor home, and toured Canada one summer doing as many things as they could. And every time they scratched something off their list, they tried to help a stranger fulfill their dreams as well.

I realized with some surprise that I had already done about a quarter of the things on their list... including:

  • skydiving,
  • hot air ballooning,
  • learning to play a musical instrument,
  • swimming with sharks,
  • catching something and eating it (not a shark), and
  • destroying a computer

That last one can be sooooooooooo fulfilling...

So, what's left? Pie wants us to pick one "bucket item" and explain it. I'm not sure if this means it has to be #1 on the list, or just the one you plan on doing next... In my case, I think they might be the same thing:

I want to invent some physical device that is both popular, and practical.

In my high school and college days I was a bit of a tinkerer... more with electronics than anything. However, my job these days is writing software, which is a tad less fulfilling than inventing a device with actual physical dimensions... I have a few ideas for devices that could save fuel, save money, and even save lives, but I haven't set aside the time to properly implement them. Michelle and I are planning on buying a house soon, and one of my requirements is to have a shed where I could experiment. I'm also going to try to dedicate 10% of my work week to tinkering. Hopefully, it's only a matter of time before I invent something useful, or I blow up my shed. One or the other...

I had several others that didn't make the #1 spot:

  • Get a degree in economics: since clearly nobody else in the world seems to be using theirs...
  • Write a book NOT about computers: I have many other book ideas that I think are great, and I've even started outlining them... but Michelle only wants me to write one book every three years. Apparently, deadlines make me grumpy.
  • Build a school: in some poor part of the country, or the world. I think it should be based on the KIPP methodology with a strong emphasis on Non-Violent Communication as well. That's a big project that I probably can't get around to for a few years... but I really want to do it some day.
  • Travel lots and lots and lots: my list includes Greece and Turkey because I'm a history nut... Australia because the people there are just plain cool... Galapagos and Antarctica because I'm a science nerd... and Japan because they make 80% of the world's weirdest stuff. I don't know if I can do it all, but I'll certainly try.

I don't feel like tagging anybody else with this meme... These kinds of reflections can be a bit of a bummer for some folks, so I won't subject them to it. However, if you're inspired to write your own "bucket list" because of this post, leave a link in the comments, and I'll retro-tag you ;-)

Oracle Buys Sun: Insert your own "Java Garbage Collector" Pun

In case you haven't heard, Oracle bought Sun... after Sun was teased by IBM and watched its stock price plummet, Oracle began talks with Sun last Thursday about a possible acquisition...

If you were surprised, don't feel bad... Neither IBM nor Microsoft had a clue this was going to happen.

First thoughts... holy crap! Oracle sure saved Sun from becoming a part of the IBM beast... and now Oracle (more or less) owns Java, and has access to all those developers who maintain it. This is win-win for them both, in my opinion. Sun gets most of their revenue from hardware, which Oracle avoided doing for decades, so overall there's not much overlap in product offerings -- unlike last year's BEA acquisition.

The hardware-software blend is a compelling story... Imagine getting all your Oracle applications and databases pre-installed on a hardware appliance! Not bad... You could even get one of them data centers in a box, slap a bunch of Coherence nodes on each, and have a plug-and-play "cloud computer" of your very own.

Second thoughts... how the heck is the software integration plan going to work? Sun helps direct a lot of open source projects... including JRuby, Open Office, and the MySQL database... not to mention the OpenSSO identity management solution, and the GlassFish portal/enterprise service bus/web stack. The last two are award winning open-source competitors to existing Oracle Fusion Middleware products. Oracle now owns at least 5 portals, and at least 4 identity management solutions... unlike past acquisitions, existing Oracle product lines are going to have to justify themselves against free competitors. I can foresee a lot of uneasy conversations along the lines of:

So, Product Manager Bob... I notice that your team costs the company a lot of money, but your product line isn't even as profitable as the stuff we give away for free... Can you help me out with the logic here?

There are a lot of open source developers shaking in their boots over this... but I'm being cautiously optimistic. Oracle can't "kill" MySQL: there are too many "forked" versions of MySQL already, and any one of them could thrive if Oracle tried to cripple the major player. Likely they will simply try to profit from those who choose to use a bargain brand database. Case in point, Oracle could sell them their InnoDB product, which allows MySQL to actually perform transactions.

Middleware is the big question mark... but with a huge injection of open source developers, products, and ideas, I'm again cautiously optimistic that -- after an inevitable shake-up -- the Middleware offerings would improve tremendously.

And Open World 2009 is going to be a lot more crowded...

The Semantic Web Versus The Fallacies Of Distributed Computing

Back in the early days of the web, Peter Deutsch from Sun penned a classic list: The Fallacies of Distributed Computing. Peter took a long, hard look at dozens of networked systems that failed, and realized that almost every failure made one or more catastrophic assumptions:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn't change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

Any time you make an assumption along the lines of the fallacies above, your project will almost certainly fail. These fallacies are best explained in an article by Arnon Rotem-Gal-Oz, but today I will focus on fallacy #5: Topology doesn't change, and how the semantic web will fail partially because its creators made this fatal assumption.

As I mentioned before, proponents of the "Semantic Web" are trying to dial down their more grandiose claims, and focus on items with more concrete value. The term that Tim Berners-Lee is using these days is Linked Data. The core idea is to encourage people to put highly structured data on the web, and not just unstructured HTML documents, so the data is easier for machines to read and understand.

ummmmm.... ok...

Funny thing, people have been doing this for decades. Tons of folks make structured data available as "scrapable" HTML tables, as formatted XML files, or even as plain ol' Comma Separated Value (CSV) files that you can open in Excel. Not to mention the dozens of open web services and APIs... allowing you to do anything from checking stock quotes to doing a Google Maps mashup. There really is nothing groundbreaking here... and I find it painfully disingenuous for somebody to claim that such an obvious step was "their magic idea."

Well, not so fast... in an attempt to breathe relevance back into the "Semantic Web," Tim claims that "Real Linked Data" needs to follow three basic rules:

  1. URLs should not just go to documents, but structured data describing what the document is about: people, places, products, events, etc.
  2. The data should be important and meaningful, and should be in some kind of standard format.
  3. The returned structured data has relationships to other kinds of structured data. If a person was born in Germany, the data about that user should contain a link to the data about Germany.

OK... so your data has to not only be in a standard format... but it needs links to other data objects in standard formats. And this is exactly where they fail to heed the warnings about the fallacies of distributed computing! Your topology will always change... not only physical topology, but also the logical topology.

Or, more succinctly, what the heck is the URL to Germany?!?!?

Look... links break all the time. People move servers. People shut down servers. People go out of business. People start charging for access to their data. People upgrade their architecture, and choose a different logical hierarchy for their data. Companies get acquired, or go out of business. Countries merge, or get conquered. Embarrassing content is removed from the web. Therefore, if you use links for identifiers, don't expect your identifiers to work for very long. You will need to spend a lot of time and energy maintaining broken links, when quite frankly you could do quite fine without them in the first place.

An identifier says what something is. A link says where you can find it. These concepts should be kept absolutely separated. It's a bad bad bad bad bad idea to blend the "where" with the "what" into one single identifier... even the much touted Dereferenceable URIs won't cut it, especially from a long-term data maintenance perspective... because the data they dereference to might no longer be there!

So, where does that leave us? Exactly where we are. There are plenty of ways to create a system of globally unique IDs, whether you are a major standards body, or a small company with your own product numbers. But we shouldn't use brittle links... we should use scoped identifiers instead. We need a simple, terse way to describe what something is, that in no way, shape, or form looks like a URL. The identifier is the "what." We need a secondary web service -- like Google -- to tell us the most likely "where." At most, data pages should contain a link to a "suggested web service" to translate the "what" into the "where." Of course... that web service might not exist in 5 years, so proceed with caution.

For example, we could use something similar to Java package names to make sure anybody with a DNS name can create their own identifier... For example, there's a perfectly good ISO standard for country names. So you tell me, which is a better identifier for Germany?

  • http://en.wikipedia.org/wiki/Germany
  • http://de.wikipedia.org/wiki/Deutschland
  • http://linkeddata.openlinksw.com/about/Germany#this
  • http://dbpedia.org/resource/Germany
  • org.iso.3166-1.de

I don't know... Openlinksw.com and DBPedia might not be around in 3 years, and data is supposed to be permanent. Wikipedia will probably be around for a while, but should it go to the English page or the German page? The ISO 3166 identifier may not be clickable, but at least it works for both German and English speakers! Also, if you remove the dots and Google it, the first hit gives you exactly the info you need. Plus, these ISO codes will exist forever, even if the ISO itself gets overrun by self-aware semantic web agents.
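To show what I mean by keeping the "what" apart from the "where", here is a small sketch. The `Resolver` class, its method names, and the way the identifier is split are all hypothetical; this is one possible shape for such a lookup service, not an existing standard or library.

```python
from collections import namedtuple
from urllib.parse import urlparse

ScopedId = namedtuple("ScopedId", ["scope", "local"])


def parse_scoped_id(identifier):
    """Split 'org.iso.3166-1.de' into scope 'org.iso.3166-1' and local part 'de'."""
    scope, sep, local = identifier.rpartition(".")
    if not sep or not scope or not local:
        raise ValueError("not a scoped identifier: %r" % identifier)
    return ScopedId(scope, local)


class Resolver:
    """Hypothetical lookup service: maps the 'what' to zero or more 'where's."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        """Record one more place where data about this identifier can be found."""
        self._locations.setdefault(parse_scoped_id(identifier), []).append(url)

    def forget_host(self, host):
        """Simulate a site going away: drop every URL served from that host."""
        for key, urls in self._locations.items():
            self._locations[key] = [u for u in urls if urlparse(u).netloc != host]

    def resolve(self, identifier):
        """Return the currently known locations; the identifier itself never changes."""
        return list(self._locations.get(parse_scoped_id(identifier), []))


if __name__ == "__main__":
    resolver = Resolver()
    resolver.register("org.iso.3166-1.de", "http://dbpedia.org/resource/Germany")
    resolver.register("org.iso.3166-1.de", "http://en.wikipedia.org/wiki/Germany")
    resolver.forget_host("dbpedia.org")
    print(resolver.resolve("org.iso.3166-1.de"))
```

When a host disappears, only the resolver's table needs fixing. Every document that stored `org.iso.3166-1.de` stays valid as-is.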

I just can't shake the feeling that using links for identifiers leads to a false sense of reliability. Your identifiers are some of the most important parts of your data: they should be something globally unique and permanent... and the web is anything but permanent.

Let's accept the fact that the topology will change, create a system of globally unique identifiers that are independent of topology, and go from there.

Never Outsource What You Don't Understand

Experts can be dangerous... not because they don't know what they are doing; but because you don't know when they don't know what they are doing. And if you are unable to notice this, then you will likely lose a lot of money...

Case in point, there was a recent neurobiology study on how the act of listening to "experts" actually makes your brain shut down!

In the study, Berns' team hooked 24 college students to brain scanners as they contemplated swapping a guaranteed payment for a chance at a higher lottery payout. Sometimes the students made the decision on their own. At other times they received written advice from Charles Noussair, an Emory University economist who advises the U.S. Federal Reserve... The advice was extremely conservative, often urging students to accept tiny guaranteed payouts rather than playing a lottery with great odds and a high payout. But students tended to follow his advice regardless of the situation, especially when it was bad. When thinking for themselves, students showed activity in their anterior cingulate cortex and dorsolateral prefrontal cortex — brain regions associated with making decisions and calculating probabilities. When given advice from Noussair, activity in those regions flat lined.

Whoa... simply listening to "experts" makes your brain less able to calculate risks and make decisions... what's worse, the more counter-intuitive the advice, the less the brain functioned! This should be a wake-up call to anybody who uses experts frequently...

To be clear, I use experts all the time... but I feel uneasy when I rely on experts. Yes, I understand electronics, auto repair, and accounting, but I still prefer to use outside experts because it saves me time. I never want to engage an outside expert on something I don't understand -- especially personal finance -- I prefer taking a crash course on it so I can easily spot those so-called "experts" who actually don't know what they are doing. Only after I gain that skill, do I feel comfortable listening to experts.

Well... isn't it a bit odd for me -- a software consultant -- to bash outsourcing? Not really... because I try hard to never approach projects with the attitude of an "expert." I prefer to approach it as an "educator." I try to help people understand the whole problem, the possible solutions, and potential risks. There is no "right way" to do software, there are only ways that in the past have helped us avoid failure... So my greatest skill is helping my clients avoid failure, but only with their knowledge and support will I be able to make them truly successful.

In contrast, an "expert" can only tell you what you want, and then give it to you... whether or not that is actually what you need.

Scripting Oracle UCM With Jython

Sometimes when I'm working on a big-ish project, I need to quickly whip out a script to alter items in the content server. The old-school way to do this would be to use the IdcCommand application... other folks might prefer a Java application written with the J2EE connectors in the Content Integration Suite (CIS), or maybe even SOAP... but my preference would be to do it all in a scripting language. In particular, Jython.

Jython is a Java implementation of the Python programming language... which is my favorite language these days. Jython did stagnate for many years, stuck on Python 2.2, and more than a little buggy... but the project is alive and kicking and just released Jython 2.5 beta 3, which I recommend you use. I'd wager that the Jython project was revived partly because of envy about the rise of Ruby and JRuby. Whatever the reason, I'm always happy to have new code to play with.

You can invoke any Java libraries in Jython, so naturally you could use SOAP or CIS to make administrative scripts. However, I think the majority of people would prefer a new-ish Java connector for Oracle UCM: the Remote IntraDoc Client (RIDC). In contrast with both CIS and SOAP, the RIDC connector is very lightweight, very fast, and very simple to use. There's no WSDL or J2EE bloat at all; RIDC is just a "Plain Old Java Object" wrapper around UCM web services... so it's very easy to embed in a Java application.

To get started, download the most recent version of the Content Integration Suite from Oracle. This ZIP file contains two folders: one for the new RIDC connector, and one for the standard CIS connector. I'd suggest you take a look at the "ridc-developer-guide.pdf" before you go any further. The samples and JavaDocs are also very useful, but you can peruse them later.

Next, download Jython 2.5b3, and run the installer.

Next, make a folder to contain your UCM Jython scripts. Copy the "jython" launcher file from its install directory to this directory. On Windows, this file is named "jython.bat". Also copy the RIDC library "oracle-ridc-client-10g.jar" to this folder.

Next, edit your copy of the Jython launcher file to make sure the Java classpath includes the RIDC library. You can set this near where they set JAVA_HOME at the top. On Windows, you would edit "jython.bat" and add this:

        set CLASSPATH=%CLASSPATH%;D:\FOOBAR\oracle-ridc-client-10g.jar

On Unix, you would edit the "jython" text file, and add something like this:

        CLASSPATH=$CLASSPATH:/FOOBAR/oracle-ridc-client-10g.jar

That's it! Now just run "jython" on the command line, and you'll get an interactive shell where you can load Java classes, and use them. Loading them is fairly similar to how you load libraries in Python. For example, the script below will load the RIDC libraries, connect to the content server, run a search, and dump out the results:

        # import the Oracle UCM libraries into Python
        from oracle.stellent.ridc import IdcClientManager
        from oracle.stellent.ridc import IdcContext

        # create the manager and client objects
        manager = IdcClientManager()
        client = manager.createClient("idc://localhost:4444")
        userContext = IdcContext("sysadmin", "idc")

        # prepare to run a search
        binder = client.createBinder()
        binder.putLocal("IdcService", "GET_SEARCH_RESULTS")
        binder.putLocal("QueryText", "")
        binder.putLocal("ResultCount", "20")

        # get the response
        response = client.sendRequest(userContext, binder)
        responseBinder = response.getResponseAsBinder()

        # dump out the response data
        localData = responseBinder.getLocalData()
        print("LocalData:")
        for key in localData:
            print(key + " = " + localData.get(key))

        # dump out every result set, one row at a time
        for name in responseBinder.getResultSetNames():
            print("\nResult Set '" + str(name) + "':")
            rset = responseBinder.getResultSet(name)
            fields = rset.getFields()
            rows = rset.getRows()
            for rowItem in rows:
                for fieldItem in fields:
                    print("\t" + str(fieldItem.getName()) + " = " + str(rowItem.get(fieldItem.getName())))
                print("\t----------------")

Remember: whitespace is significant in Python, so watch your indentations...

You can easily expand on this to create scripts to run archives, batch update metadata fields, resubmit items to the indexer, or run them through a converter to generate PDFs or HTML. Also, there are multiple ways you can set up the security if you don't want to send the password with every request, or if you want to use SSL instead of clear-text sockets. See the RIDC documentation for examples.
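To give a flavor of the batch-update idea, here is a minimal sketch along the lines of the search script above. It assumes the standard UPDATE_DOCINFO service, which needs at least dID and dDocName (and on many versions also wants dRevLabel and dSecurityGroup, so check the services reference for your release). The helper names build_update_params, fill_binder, and batch_update are hypothetical, just for illustration; they only use the createBinder, putLocal, sendRequest, and getResponseAsBinder calls shown earlier.

```python
def build_update_params(doc_id, doc_name, fields):
    """Collect the local parameters for one metadata update in a plain dict."""
    params = {
        "IdcService": "UPDATE_DOCINFO",  # assumed service name; verify for your version
        "dID": str(doc_id),
        "dDocName": doc_name,
    }
    params.update(fields)  # e.g. {"xComments": "reviewed"}
    return params


def fill_binder(binder, params):
    """Copy every parameter into an RIDC-style binder via putLocal."""
    for key in sorted(params):
        binder.putLocal(key, params[key])
    return binder


def batch_update(client, user_context, items, fields):
    """Send one update request per (dID, dDocName) pair; return the updated names."""
    updated = []
    for doc_id, doc_name in items:
        binder = fill_binder(client.createBinder(),
                             build_update_params(doc_id, doc_name, fields))
        response = client.sendRequest(user_context, binder)
        response.getResponseAsBinder()  # consume the response, as in the search script
        updated.append(doc_name)
    return updated
```

In a Jython session you would pass in the client and userContext objects created earlier, plus a list of (dID, dDocName) pairs pulled from a search result. The field name xComments is only an example; substitute whatever metadata fields your server defines.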

Enjoy!

Popularity of the Web Considered Harmful

In a recent TED Talk, Tim Berners-Lee laid out his next vision for the world wide web... something he likes to call "Linked Data." Instead of putting fairly unstructured documents on the web, we should also put highly structured raw data on the web. This data would have relationships with other pieces of data on the web, and these relationships would be described by having data files "link" to each other with URLs.

This sounds similar to his previous vision, which he called the "semantic web," but the "linked data" web sounds a bit more practical. This change of focus is good, because as I covered before, a "true" semantic web is at best impractical, and at worst impossible. However, just as before, I really don't think he's thought this one through...

The talk is up on the TED conference page if you'd like to see it. As is typical of all his speeches, the first 5 minutes is him tooting his own horn...

  • Ever heard of the web? Yeah, I did that.
  • I I I.
  • Me me me.
  • The grass roots movement helped, but let's keep talking about me.
  • I also invented shoes.

I'll address his idea of Linked Data next week -- preview: I don't think it will work -- but I first need to get this off my chest. No single person toiled in obscurity and "invented the web." I really wish he would stop making this claim, and stop fostering this "web worship" about how the entire internet should be the web... because it's actually hurting innovation.

Let's be clear: Tim took one guy's invention (hypertext) and combined it with another guy's invention (file sharing) by extending another guy's invention (network protocols). Most of the cool ideas in hypertext -- two-way links, and managing broken links -- were too hard to do over a network, so he just punted and said 404! In addition, the entire system would have languished in obscurity without yet another guy's invention (the web browser). There are many people more important than Tim who laid the groundwork for what we now call "the web," and he just makes himself look foolish and petty for not giving them credit. Tim's HTTP protocol was just a natural extension of other people's inventions that were truly innovative.

Now, Tim did invent the URL -- which is cool, but again, hardly innovative. Anybody who has seen an email address would be familiar with the utility of a "uniform resource identifier." And as I noted before, URLs are kind of backwards, so it's not like he totally nailed the problem.

As Ken says... anybody who claims to have "invented the web" is delusional. It would be exactly like if a guy 2000 years ago asked: "wouldn't it be great if we could get lots of water from the lake, to the center of the town?" And then claimed to have invented the aqueduct.

As Alec says... the early 90s was an amazing time for software. There was so much computing power in the hands of so many people, all of whom understood the importance of making data transfer easier for the average person... Every data transfer protocol was slightly better than the last, and more kept coming every day. It was only a matter of time until some minor improvement on existing protocols was simple enough to create a critical mass of adoption. The web was one such system... along with email and instant messaging.

Case in point: any geek graduate of the University of Minnesota would know that the Gopher hyperlinking protocol pre-dated HTTP by several years. It was based on FTP, and the Gopher client had easily clickable links to other Gopher documents. It failed to gain popularity because it imposed a rigid file format and folder structure... plus Minnesota shot themselves in the foot by demanding royalty fees from other Universities just when HTTP became available. So HTTP exploded in popularity, while Gopher stagnated and never improved.

But the popularity of the web is a double-edged sword. Sure, it helps people collaborate and communicate, enabling faster innovation in business. But ironically, the popularity of the web is hurting new innovation on the internet itself. Too much attention is paid to it, and better protocols get little attention... and the process for modifying HTTP is so damn political, good luck making it better.

For example... most companies love to firewall everything they can, so people can't run interesting file sharing applications. It wasn't always like this... because data transfer was less common, network guys used to run all kinds of things that synced data and transferred files. But, as the web became much more popular, threats became more common, and network security was overwhelmed. They started blocking applications with firewalls, and emails with ZIP attachments just to lessen their workload... But they couldn't possibly block the web! So they left it open.

This is a false sense of security, because people will figure ways around it. It's standard hacker handbook stuff: just channel all your data through port 80, and limp along with the limitations. These are the folks who can tunnel NFS through DNS... they'll find their way through the web to get at your data.

What else could possibly explain the existence of WebDAV, CalDAV, RSS, SOAP, and REST? They sure as hell aren't the best way for two machines to communicate... not by a long shot. And they certainly open up new attack vectors... but people use them because port 80 is never blocked by the firewall, and they are making the best of the situation. As Bruce Schneier said, "SOAP is designed as a firewall friendly protocol, which is like a skull friendly bullet." If it weren't for the popularity of the web, maybe people would think harder about solving the secure server-to-server communication problem... but now we're stuck.

All this "web worship" is nothing more than the fallacy of assuming something is good just because it's popular. Yes, the web is good... but not because of the technology; it's good because of how people use it to share information... and frankly, if Tim never invented the web, no big loss; we'd probably be using something much better instead... but now we're stuck. We can call it Web 2.0 to make you feel better, but it's nowhere near the overhaul of web protocols that is so badly needed... It's a bundle of work-arounds that Microsoft and Netscape and open source developers bolted on to Web 1.0 to make it suck less... and now it too is reaching a critical mass. Lucky us: we'll be stuck with that as well.

What would this "better protocol" be like? Well... it would probably be able to transfer large files reliably. Imagine that! It would also be able to transfer lots and lots of little files without round-trip latency issues. It would also support streaming media. It would have built-in distributed identity management. It would also support some kind of messaging, so instead of "pulling" a site's RSS feed a million times per day, you'd get "pushed" an alert when something changes. Maybe it would have some "quality of service" options. Most importantly, it would allow bandwidth sharing for small sites with popular content, to improve the reach of large niche data.

All these technologies already exist in popular protocols... but they are not in "the web." All of these technologies are likewise critical for anything like Tim's "Linked Data" vision to be even remotely practical. All things being equal, the web is almost certainly the WORST way to achieve a giant system of linked data. Just because you can do it over the web, that doesn't mean you should. But again... we're stuck with the web... so we'll probably have to limp along, as always. Developers are accustomed to legacy systems... we'll make it work somehow.

Now that I've gotten that out of my system, I'll be able to do a more objective analysis of "Linked Data" next week.

caption contest...

[image: wtf_pictures-macho-macho-t1]

words fail me... other than: the internet has EVERYTHING.

Real-World ECM Success Stories

On my recent book tour, I presented some real-world examples of successful UCM strategies. It included some tips and warnings that Andy and I used to help us write the book... and shared some hard return-on-investment numbers from existing UCM clients. I uploaded the presentation to Slideshare, for those of you who were interested... or the lazy can just view it below:

I spent some time talking about the basics... what problems does ECM solve? What causes initiatives to fail? How do you define and measure success? And what are some tips for ensuring success? This is more strategy than technical, so hopefully everybody on your ECM team will get something useful out of it.

The hard numbers for cost savings came from the Survive or Thrive With UCM talks that Oracle has been touting recently. There's a lot of good information in those webcasts, which I won't repeat here. I'll just strongly encourage you to check them out.

Naturally... the people who had the best success to report were those who were the most disciplined in taking metrics. How much less paper are we printing? How much time is saved because the process is now automated? How much easier is it for employees / customers / partners to find the information they need? How much faster can you deploy new web sites? How much faster can you perform comprehensive audits? How much extra revenue can we credit to the system? How can you prove value to your boss?

If you don't ask the hard questions, and make measurements before and after, it will be difficult to ever quantify success...

Does Twitter Mean The End Of TinyUrl.com?

Face it, folks... the 140 character limit that Twitter imposes really starts to hurt when you try to pass around URLs. Previously, most folks used tinyurl.com to make their URLs short and sweet... but lately, some Twits have determined that even tiny URLs waste precious character space, and have been demanding even tinier URLs. I mean, look at this link to my site:

http://tinyurl.com/bonkw4

Twenty-five characters. What a bit pig... although I love the hidden message bonk w4... especially now that it's tax season.

As a result, many Twits have decided to go for a system that makes even shorter URLs. Here are a few URL shortening sites I've found lying about:

  • bit.ly: Using this, the link to my home page is now compressed to: http://bit.ly/Vh0e3. Wow! That's saving six whole letters! Just enough to throw in one last ROTFL into your tweet...
  • is.gd: not to be outdone... 'Is Good' shaves off eight whole characters, making it http://is.gd/ndwf.
  • tinyarro.ws: UNGODLY short URLs! Check out http://➡.ws/웕. This makes links twelve characters shorter than tinyurl.com, making it easily the tiniest URL I've ever seen! Hot damn! It's using some nifty tricks with UTF8 encoded URLs, the arrow symbol, and what I believe to be the Chinese symbol for "awesome." It's all technically valid in a URL, but it won't work on all screens, all web pages, nor all programs.
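If you want to check the character math yourself, here is a quick sketch. It counts Unicode characters (code points), which is an assumption about how a tweet gets measured; a byte count of the UTF8 encoded form would come out larger for the tinyarro.ws link. The savings helper and the variable names are just made up for this example.

```python
# -*- coding: utf-8 -*-
# Baseline and shortened links taken from the list above.
BASELINE = u"http://tinyurl.com/bonkw4"
SHORT_URLS = {
    "bit.ly": u"http://bit.ly/Vh0e3",
    "is.gd": u"http://is.gd/ndwf",
    "tinyarro.ws": u"http://➡.ws/웕",
}


def savings(baseline, url):
    """Return (length of url, characters saved versus the baseline)."""
    return len(url), len(baseline) - len(url)


print("tinyurl.com: " + str(len(BASELINE)) + " characters")
for name in sorted(SHORT_URLS):
    length, saved = savings(BASELINE, SHORT_URLS[name])
    print(name + ": " + str(length) + " characters, saves " + str(saved))
```

Run under Python 3, the lengths come out to 25 for tinyurl.com, then 19, 17, and 13, which matches the six, eight, and twelve characters saved.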

There are many more URL shorteners available... if you really insist on being original. Personally, I think is.gd will probably muscle out tinyurl.com, at least amongst the Twitter crowd... and tinyarro.ws might cause more than a few folks to argue about the practical limitations of encoded URLs, especially where security and phishing are concerns...

There is another possibility... sites may begin making their own tiny urls. Imagine if CNN did their own tiny urls, so you only had to send something like cnn.com/t/123. Just ignore the http:// prefix, since many apps these days know that if a string has .com and a few slashes, it's probably a link... so they just slap on the implied http:// prefix automatically.
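That "looks like a link" guess is easy enough to sketch. The pattern below is my own rough assumption, not how any particular app does it: a dotted host name ending in one of a few common suffixes, plus an optional path. The function name add_implied_scheme is hypothetical too.

```python
import re

# Assumed heuristic: dotted host, a common suffix, then an optional path.
BARE_LINK = re.compile(r"^(?:[a-z0-9-]+\.)+(?:com|org|net)(?:/\S*)?$", re.IGNORECASE)


def add_implied_scheme(text):
    """Prefix http:// when the text looks like a bare link; otherwise return it unchanged."""
    if text.startswith(("http://", "https://")):
        return text  # already has a scheme
    if BARE_LINK.match(text):
        return "http://" + text
    return text


print(add_implied_scheme("cnn.com/t/123"))
```

A real client would need a much longer suffix list and some care with trailing punctuation, but this covers the cnn.com/t/123 style of link.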

So what do you think? Will Twitter kill tinyurl.com? Will web sites start offering pre-made tiny URLs? Which URL shortener do you use?

Oracle Enterprise 2.0 Podcast, Parts 1, 2, and 3

Last week Bob Rhubart interviewed Billy Cripe, Vince Salvato and myself about Enterprise 2.0. Bob will be releasing it in three chunks, which you can download with the links below:

  • PART ONE: we began by discussing the importance of "social media," both in and out of the enterprise. We also touched on the importance of a Web 2.0 infrastructure to enable the "right kind" of collaboration. I spent a tiny bit of time discussing the importance of social search, in contrast to typical enterprise search, but not in the depth that I would have liked...
  • PART TWO: this talk was more about collaboration. Bob begins with the question: isn't Enterprise 2.0 just "anarchyware?" Meaning, it might do such a good job of "flattening" the corporate structure that it leads to poor decisions and chaos. We dealt with how you can avoid anarchy, sometimes with a slow adoption that your corporate culture can tolerate, and sometimes by putting extra seat belts in these systems. The best enterprise 2.0 architectures should be natural extensions of systems that allow effective committee-based decisions... believe it or not, there are several good ways a big committee can make decisions, although it takes a bit of discipline. I also challenged Friedman's assertion that "The World Is Flat" with a rant about how The World Is Spiky. Random collaboration is pretty much just noise; true innovation will only occur when you get the right people to collaborate...
  • PART THREE: finally, we get talking about the digital natives: the latest generation of workers versus the baby boomers. The former will love the latest E 2.0 systems, because they will help them expand their influence, whereas the latter will hate changing their habits and sharing their knowledge so close to retirement... which is a shame, because that is exactly what businesses need. Luckily, systems like Facebook and Twitter are becoming so fun, that there's still hope to bring similar systems into the enterprise. We also discuss why architects should care at all about Enterprise 2.0.

It was fun to put these together... and thanks a lot to Bob for editing all of our ramblings into easy to follow chunks! Feel free to comment on these podcasts below...

Great New Site: Oracle ECM Alerts

If you watched Michelle's Oracle ECM Community Call on March 10 -- or you spotted one of the leaks in the twitscape -- you would have heard about the new site for Oracle ECM announcements: ECM Alerts.

She worked really hard to put this together and promote it, and already its Google Rank is impressive...

The goal behind the blog is to give a forum for Oracle ECM Product Managers to announce the latest news about each of their products. It will contain product release information, integrations, samples, and general how-tos for most of the products in the Universal Content Management suite. The site makes a good use of tags and categories, so you can subscribe to only the product announcements that matter for you. And because everything is piped through Feedburner, you can subscribe to alerts by email or RSS.

The product managers seem to like the idea, and already there are a good number of product alerts. I'd wager that it will take a few more weeks to get everybody on board with this... after which it will likely be the best place to get Oracle ECM Announcements.

For existing ECM customers who were used to the Stellent customer newsletters, this will be a welcome addition.
