Finally! Oracle Gets Approval for Sun Acquisition!

Well, that took long enough! The European Union finally approved the merger... so now it's official that Oracle owns Sun (and Java!). Oracle is having a press conference about their Sun strategy this Wednesday, January 27, at 9am Pacific Time. They covered a lot of this at Open World 2009... but now I guess it will be official.

They probably won't announce the inevitable layoffs at this talk... although some speculate that 50% of Sun's workforce is redundant after the merger. I'd expect them to talk big about the Exadata V2 hardware... maybe something about pre-packaged Oracle "appliances." It would be cool to have a database in a box, or something akin to the Google Search Appliance for their secure enterprise search.

I'd also expect some talk about virtualization... ever since the BEA acquisition, Oracle has owned a very interesting virtualization solution. I'm not talking about their Linux VM; I'm talking about their Java VM. Instead of making a virtual machine of operating systems running J2EE application servers, BEA had a solution that virtualized just the application server without an operating system. Leaner, meaner, fewer security holes, and much easier to maintain. Oracle has kept pretty quiet about this technology... I'd expect it to be touted a bit more.

What's your feeling? Does anybody out there think Oracle will make an earth-shaking announcement? Does anybody think their strategy will be significantly different from what they strongly hinted at last year?

Book Review: The Drunkard's Walk

This book is -- hands down -- the best book on probability and randomness written for the general public.

I am typically disappointed with pop-science books written for the general public... they usually don't present enough data for me to make up my own mind about their conclusions... and when they do present data, they almost never provide enough details to determine whether or not their results are "statistically significant." In other words, how do they know that their "supporting data" isn't just a great big coincidence???

This book bucks that trend big time, and the results are very impressive.

The author wrote this like a history book about the field of statistics, and how it evolved (slowly) over the years. But, every time the author introduces a new concept in statistics, he also shares real-world situations where people made terrible mistakes because they didn't understand these basic principles. It highlights very well that in general, people are terrible at recognizing randomness, and thus will always be controlled by it!

Randomness is very unsettling to people, so we have a tendency to give order and purpose to the world... People frequently see patterns where they just don't exist. One great example of this, discovered fairly recently, is called "regression towards the mean." The author's example was as follows:

A psychologist was visiting a group of Air Force instructors after World War 2 to help them design a new training program. He was telling the instructors that positive feedback was much more effective at getting people to learn than negative feedback. At which point, an instructor jumped up to yell at the psychologist for talking hogwash.

"When my students have a good day and I praise them, the next day they slack off and don't do as well. But, if they do badly and I yell at them, well, the next day they do MUCH better. So don't tell me this 'positive feedback' garbage works, because it doesn't!"

Surprisingly, both the psychologist and the instructor were right! Regression towards the mean says that, on average, people perform at the average of their abilities, so an unusually good or bad day tends to be followed by a more ordinary one. It doesn't sound monumental, but people forget it sometimes. If you flip a coin enough times, eventually you'll get 100 heads in a row. Likewise, if you do a task enough times, eventually you're going to have a 100-item winning streak that is entirely random luck... and people will incorrectly assume that it's because of greater competency. Sure, you need to be competent enough to perform the task, and you need to do the task very frequently... but after that, any winning streaks are probably just dumb luck.

It's the same reason why some mutual fund managers do better than others, but only for a few years... it's why Roger Maris beat Babe Ruth's home run record, and then very few records after that... it's why some CEOs do amazingly well at one company, but then crash and burn when put in charge of another. You need only be as talented as the average to have a great winning streak...
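The instructor's complaint is easy to reproduce with a toy simulation. This is only a sketch under my own assumptions, not anything from the book: every student has exactly the same skill (the number 70 is made up), each day's score is that skill plus random luck, and nobody gets any feedback at all. The function name and the score scale are hypothetical.

```python
import random

def simulate_two_days(num_students=10000, seed=1):
    """Score identical students on two days and compare the extremes of day one."""
    rng = random.Random(seed)
    skill = 70.0  # assumption: every student has the same underlying ability
    luck = 10.0   # assumption: day-to-day luck is Gaussian noise of this size

    day1 = [skill + rng.gauss(0, luck) for _ in range(num_students)]
    day2 = [skill + rng.gauss(0, luck) for _ in range(num_students)]

    # Rank students by their day-one score, then pick the bottom and top 10%.
    ranked = sorted(range(num_students), key=lambda i: day1[i])
    tenth = num_students // 10
    worst, best = ranked[:tenth], ranked[-tenth:]

    def average(scores, students):
        return sum(scores[i] for i in students) / len(students)

    return {
        "praised_day1": average(day1, best),
        "praised_day2": average(day2, best),
        "yelled_at_day1": average(day1, worst),
        "yelled_at_day2": average(day2, worst),
    }

if __name__ == "__main__":
    for label, score in simulate_two_days().items():
        print(f"{label}: {score:.1f}")
```

The "praised" group slips back on day two and the "yelled at" group improves, even though the simulation contains no praise and no yelling. Luck alone produces the pattern the instructor thought he was causing.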

My other favorite section was when the author covered false positives on medical tests. Most people -- most doctors even -- aren't good enough with statistics to understand when medical tests lead you astray. Towards the end of the book -- after a highly readable introduction to statistical theory -- he presents the following question:

  • assume breast cancer is present in 0.8% of the population
  • assume a mammogram says a healthy person has cancer 7% of the time (false positive)
  • if a mammogram says you have cancer, what are the odds you actually have cancer?

Most people would assume that the answer is 93%, since there is a 7% false positive rate. When the question was put to a group of doctors, the average answer was actually around 70%. But the correct answer is much different: if you test positive for cancer, there is only a 10% chance you actually have cancer!!!

How is this possible? Do the math... out of 1000 people getting a mammogram, 8 will have breast cancer (0.8% incidence), and be told so. However, because of the false positive rate of 7%, roughly another 70 people will be told that they have cancer, when in fact they don't! That's 78 people who test positive for cancer, but only 8 actually have it... therefore, if you test positive for breast cancer, there's only about a 10% chance you actually need to worry. In fact, even if you get 6 positive mammograms in a row, you still have better than a 50/50 chance of being totally healthy! Medical tests for rare diseases are fraught with this kind of problem, and it's a shame doctors aren't better at telling their patients the true odds.
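For the skeptics, here is the same arithmetic as a few lines of Python. It is a minimal sketch that carries over one assumption from the walkthrough above: the test catches every real case of cancer (a sensitivity of 100%). The function and parameter names are mine, not the book's.

```python
def chance_of_disease_given_positive(prevalence, false_positive_rate, sensitivity=1.0):
    """Bayes' rule for one positive test result.

    sensitivity=1.0 is an assumption: it means every sick person tests positive.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

if __name__ == "__main__":
    odds = chance_of_disease_given_positive(prevalence=0.008, false_positive_rate=0.07)
    print(f"{odds:.1%}")  # prints 10.3%
```

Try changing the prevalence: the rarer the disease, the more the false positives swamp the real ones.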

Overall, I would recommend this book to everybody. It is really easy to read, and it is chock-full of examples where very smart people made very bad decisions... simply because they didn't understand how randomness rules our lives.

AIIM Launches a Request For Proposal (RFP) Template for ECM

Ah, the dreaded RFP... the big giant document that you hope asks all the right questions from an unbiased point of view... difficult to write, and difficult to respond to. I myself prefer proof-of-concepts and bake-offs to RFPs... but RFPs are pretty good at weeding out the people and products you don't want.

Well, good news for folks in the ECM universe! The AIIM organization has drafted an ECM RFP Template that you can use to generate your RFPs. It's $79, but probably worth it if you are new to ECM and need to know what questions to ask. Experienced ECM professionals might not need it, but it's probably still worth checking out to see if the template suggests asking questions that you don't...

If you have feedback, they appear to have a discussion thread on Information Zen to help make it better.

My Latest Podcast on Oracle Technology: ECM, Collaboration, and Enterprise 2.0

Recently Brian Dirking interviewed me for an Oracle Authors Podcast on content management, collaboration, enterprise 2.0, and other topics of interest... You can download the MP3 and listen at your leisure. Questions include:

  • What is "infoglut?"
  • Is it fair to say most people fall short of true enterprise-wide content management?
  • How do you see "Social Media" affecting ECM?
  • What's the difference between content management, collaboration, and enterprise 2.0?
  • What's the difference between a "process worker" and a "knowledge worker?"
  • Where is ECM going in the next 5 years?

If you've read my blog or my books, I'm sure you know that I have a fairly strong opinion on many of those topics... at one point the producer had to interrupt one of my rants because we were running short on studio time. ;-)

Top 10 New Year's Resolutions for ECM

A lot of folks are doing end-of-year predictions about what will happen in 2010 in the Enterprise Content Management universe. In general I'm not a huge fan of making predictions on the future of technology... the easiest way to predict the future of technology is to build it. So instead of countering their predictions with mine, I thought I'd share a list of ten New Year's resolutions for ECM geeks:

1) Test Your Disaster Recovery Strategy!

Yes, you probably have a decent backup strategy... but are you sure??? When was the last time you tested it? If you haven't tested your disaster recovery strategy, then you don't have one. What if your server melts? How long would it take to recover? What if your existing backups are corrupted? What if your database gets hacked and somebody deletes all your tables? Test your existing what-if scenarios... and then add one more to the list!

2) Install Necessary Patches

Are your security patches up to date? Or is there some annoying little bug that's driving you nuts, which might be fixed in a newer version? It's probably a good time to take stock of where you are, and where you'd like to be... Oracle Metalink has some pretty good advice on How To Maintain UCM and How To Maintain Site Studio. After doing the minimum, think a bit about where you'd like it to go next.

3) Learn About At Least One New ECM Feature or Technology

ECM is a fast changing field... do you know as much as you need to know about records management? How about the new features in Site Studio 10gr4? Have any new connectors been released that might make integrating ECM into your systems easier or more useful? How much do you know about Web 2.0 and Enterprise 2.0? Make a commitment to read a book or at least some blogs about something new in the ECM universe, and how it can benefit you.

4) Calculate Return-On-Investment

Some ROI is based on fairly hard-cost numbers that are easy to calculate... How much less printing and shipping did you have to do this year? Did you save money on warehouse space by scanning documents instead of keeping paper copies? Were you able to lower call-center volume with a self-service web site? Were you able to save on legal costs because your system was easier to audit?

Other kinds of ROI are harder to calculate... for example, how much time did you used to spend looking for documents, compared to now? Were you able to more effectively collaborate? Were you able to avoid problems and spot new opportunities because you had more information at your fingertips? These kinds of calculations might have to rely on soft numbers, and some end-user surveys.

5) Retire Outdated Systems

The primary value of ECM is that you can use it as a central repository for all your content... but all that value is wasted if you keep those old systems around. Commit yourself to retiring at least one outdated system. Go for the low-hanging fruit: something with useful information, that is difficult to use, and easy to replace.

6) Determine What Content Is Popular

It is always a good idea to keep statistics on what content is popular... not only does it help you determine what information is useful to your audience, it's also a great way to encourage user adoption. If you knew that your content had a below-average popularity amongst your peers, you might take some more care to make your content easier to understand, and easier to find. In other words... once rankings are public, people use less jargon, and better metadata.

It's also a good way to determine what content needs to be updated... if a one year old document is extremely popular, you might want to kick off a workflow to get the original author to make a new version.
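Here is roughly what that check could look like. This is just a sketch with made-up inputs: the access log is a plain list of document names, the release dates live in a dictionary, and the thresholds are arbitrary. None of these names come from Oracle UCM; a real system would pull the numbers from its own usage reports.

```python
from collections import Counter
from datetime import date

def stale_but_popular(access_log, release_dates, today, min_hits=100, max_age_days=365):
    """Return documents that are both heavily read and at least max_age_days old."""
    hits = Counter(access_log)  # document name -> number of views
    return sorted(
        doc
        for doc, count in hits.items()
        if count >= min_hits and (today - release_dates[doc]).days >= max_age_days
    )

if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    log = ["travel-policy"] * 3 + ["old-memo"] + ["new-handbook"] * 3
    released = {
        "travel-policy": date(2008, 6, 1),
        "old-memo": date(2008, 1, 1),
        "new-handbook": date(2009, 11, 1),
    }
    print(stale_but_popular(log, released, today=date(2010, 1, 4), min_hits=3))
```

Anything on the resulting list is a candidate for a "please review this" workflow.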

7) Perform a General Audit Of Your Repository

Run a few performance tests on your site... spot check your users to make sure their security credentials are not too generous... see if you can simplify your workflows so they are faster... check your repository to see which metadata fields are always left as the default (a good sign that nobody uses them)... see if you can simplify your security and metadata model a bit...

8) Run Formal Usability Tests

There are a lot of great ideas on usability tests in Don't Make Me Think... but my favorite is also the most simple:

  • Come up with 10 or so common use cases for your system: why would people use it?
  • Collect 10 novice users who have never had any training on your system
  • Ask them to perform these 10 simple tasks, and don't give them any guidance
  • Videotape them
  • Force your developers and administrators to watch every minute of the tape!

Trust me... there are few things more painful to a developer than watching people click the wrong button... it will haunt them in their sleep until they make the system easier to use. Especially if you threaten to make them watch it every day until it's fixed.

9) Documentation!

Admit it: there is a little bit of black magic in your setup. Some customization you wrote, some script you hacked together, some configuration flag that nobody else knows about... Commit yourself to documenting at least three features of your solution that would be difficult for people to figure out on their own. And then -- of course -- check it into your ECM system!

10) Give Back To The Community

Got an idea for an ECM blog post? Maybe a nice presentation topic for a local user group? How about some quick tips and tricks that you can share on the Oracle ECM forums and mailing lists? Then please share! At the very least, show up to local user groups and network with your fellow ECM practitioners... ARMA and IOUG both have local groups worth checking out.

Software is like a lot of creative endeavors: the very best always create more value than they take. Which is a good New Year's Resolution, no matter who you are ;-)

Get iPhone-Like Animations with jQuery

I just came across jQTouch, which appears to be a version of the popular jQuery JavaScript library for the iPhone. I've been playing around with it, and it's pretty dang sweet... It gives you a whole bunch of iPhone-like forms and animation, but it's all HTML5 and CSS. This means you only need to know HTML in order to create a beautiful iPhone app interface.

Not only that... but once you make your interface, it will also work on the Google Phone, the Palm Pre, and anything else with a browser that supports the HTML5 specification! Technically, HTML5 is still a work-in-progress... but using jQTouch means you can target the two fastest growing smart phones -- the iPhone and the Google Phone -- with one single HTML-like code base. Even better... you no longer have to struggle with the iPhone App Store to get new versions of your product to your customers.

I kind of figured mobile phone development would eventually become just mobile HTML5 platforms... it's nice to see the jQuery folks leading the charge.

UPDATE: most of my review above was based on testing with the iPhone and the iPod Touch. Both were great platforms for jQTouch. However, I have also tested with the latest and greatest Google 'Droid phone... and on the 'Droid, jQTouch was horribly unusable. I'm not sure why... it could be because the iPhone has some nifty hardware acceleration for the kinds of animation that jQTouch wants to do, whereas the 'Droid has some pretty awful lag. The forms are mostly OK, as are the demo apps, but anything with animations is god-awful.

Not sure if this is jQuery's fault or the Droid's fault... however if I were Google I'd work very closely with the jQuery folks to make this all happen... because then it would be a heck of a lot easier to make slick looking UIs.

The Best Tech Of The Decade!

O'Reilly Radar came out with their Best And Worst Technology of the Decade, which is a pretty good read. Here's their list:

The Best!!!

  • AJAX
  • Twitter
  • Ubiquitous WiFi
  • Smart Phones
  • The Do-It-Yourself Maker culture
  • Mainstream Open Source

I mostly agree, but seriously... Twitter??? You're putting Twitter up there? You're going to give it a greater importance than YouTube, Wikipedia, and Facebook? The author concedes that most tweets about what celebrities have for dinner are kind of silly... but:

"the real power of the 140 character dynamo is that it has brought about a resurgence of real web logging. The most useful tweets consist of a Tiny URL and a little bit of context."

I don't quite agree... Frankly, that problem was solved a lot better by Del.icio.us: post a link, describe it, tag it, and share it! They had a huge head start on this kind of movement... they just needed a bit more pizzazz to make it easier and more fun to use. Unfortunately, Del.icio.us was bought out by Yahoo, who can't run a Taco Truck let alone manage a global folksonomy.

Twitter is the killer app for Public Relations people... Just like LinkedIn was the killer app for recruiters. Twitter will be "hot" regardless of how useful it is, because it helps PR people do their jobs. PR people are very skilled at getting you to talk about what PR people want you to talk about... and PR people want you to talk about Twitter.

Although... I can't say I mind. A couple of keywords in 140 characters is much easier to digest than the typical press release... and a lot more genuine.

The one other thing that Twitter does well is breaking news... Such as the Iran election last year. However, there is a huge signal-to-noise ratio problem. It is trivially simple to run a bunch of Zombie Tweeters to spam the twitscape with phony URLs after events break.

The Worst!!!

  • SOAP
  • Intellectual Property Wars
  • The Cult of Scrum
  • Ubiquitous Work

Hrm... an O'Reilly geek blasting SOAP... how unusual ;-)

I agree somewhat... the IP wars are crippling software innovation, simply because geeks are much faster than lawyers, and don't like any restrictions. Scrum can become problematic, but the same holds true for any rigid software methodology. Also, if you hate always being at work, then you had better ditch your smart phone ;-)

I do agree that some vendors oversell SOAP... and all are guilty of piling on yet-another-web-services-specification. However, it's important to remember that none of this is inherent to the goals of SOAP: it's just that one vendor had a wacky idea, implemented it, and then called it the "standard" way to do something.

SOAP worship has partly fueled the rise of JSON, but JSON is a backlash to obtuse XML formats in general. It's a much better way to describe human readable data, but it still has problems with multipart messages and binary files. In other words, you still need some kind of standard on top of JSON to describe complex messages between systems.
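To make the binary problem concrete: JSON strings can only hold text, so if you want to ship a file you have to agree on some convention for wrapping it. The sketch below uses Base64 inside a small envelope. The field names ("filename", "encoding", "data") are something I made up for illustration, which is exactly the point: both sides have to agree on them, and that agreement is a mini-standard layered on top of JSON.

```python
import base64
import json

def pack_file(filename, payload):
    """Wrap raw bytes in a JSON envelope (hypothetical field names)."""
    envelope = {
        "filename": filename,
        "encoding": "base64",
        "data": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(envelope)

def unpack_file(message):
    """Reverse pack_file: return the filename and the original bytes."""
    envelope = json.loads(message)
    if envelope["encoding"] != "base64":
        raise ValueError("unsupported encoding: " + envelope["encoding"])
    return envelope["filename"], base64.b64decode(envelope["data"])

if __name__ == "__main__":
    message = pack_file("scan.tif", bytes([0, 255, 16, 128]))
    print(message)
    print(unpack_file(message))
```

It works, but Base64 inflates the payload by about a third, and nothing in JSON itself tells the receiver what the envelope means.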

It doesn't matter whether you use ReST or SOAP... people will still feel a desire to come up with a "standard" way of describing messages between systems. For example, ReST kind of falls apart when you want to do batch processing over XMPP instead of HTTP... so it's only a matter of time before somebody comes up with a "standard" way to do it. Probably Google, since Google Wave relies heavily on XMPP.

The evil here is not in trying to adopt standards... it's in the inevitable tendency for people to believe the "standard" way is the "right" way. Software just doesn't work like that in the long term. Using a standard means that your software doesn't change... and stable software is the same thing as dead software.

Why CMIS Won't Be A "Real" Standard

As mentioned by pie guy, the Content Management Interoperability Services (CMIS) standard has reached version 1.0 status... and is open for public comment. As I mentioned before, I'm a fan of CMIS, and I think it is a decent start at making content management systems more interoperable... especially for folks creating vertical applications on ECM systems.

For example... Let's say you want to make a killer application for scanned medical records. Your content needs are pretty basic: just a big image with some metadata. You might need external engines for workflow and identity management, but the content problem is simple. In this case, a good idea is to code your application in middleware, and use CMIS as a content storage interface.

However, I'd like to make one point very clear: if you think that CMIS will turn into a "standard," get used to disappointment.

Now... why would I say such a thing?

It could be because the past decade has seen a half dozen Enterprise Content Management standards come and go -- ODMA, SPI, WebDAV, JSR170, JSR283, etc... so I might just be a skeptical curmudgeon who won't cut anybody slack and is adopting a "wait and see" attitude.

It could be because ECM is a marketing term; not a specification... so every vendor does something fundamentally differently. Some of the big points can be addressed by a spec... but no matter how hard we try, those fundamental differences can never be included in a standard. All abstractions are leaky, and all attempts to hide complexity ultimately fail when you attempt anything interesting...

Or, it could be because those precise differences are exactly why a customer chooses one ECM vendor over another... They didn't just spend a ton of money on an ECM system just so you could treat it like a big hashtable... Even if a customer demands that their system supports CMIS, that doesn't mean they will actually use it. Support for CMIS more than anything represents a commitment to interoperability... and that you can use it for content migration.

But the real reason I say CMIS will never be a true "standard," is because Microsoft is involved.

Microsoft has a long, long, long history of saying they will follow a standard, when in fact all they are trying to do is force everybody to do it "their way." While true believers try to religiously follow the spec, Microsoft will do whatever makes sense for their product direction... and then say to everybody "you want interoperability? You'd better do it my way. Ha!!!"

Now, this isn't always a bad thing. When Microsoft's Internet Explorer went their own way with HTML, some of their ideas were horrible... but others -- like innerHTML and AJAX -- forced the concepts of dynamic HTML on the public. Likewise, some of the LDAP extensions they put into Active Directory made pretty damn good sense... although their extensions to Kerberos encryption make me skittish, especially since we're not allowed to view the source code.

Well, how should we think about CMIS? If you want to avoid lots of pain and heartache, don't think of CMIS as a standard; think of it as a contract signed by Microsoft, that they might change at any time. When Microsoft pushed the WebDAV standard, they made sure that common Microsoft products -- Word, Excel, Windows -- followed most of the specification. This does not mean that you have to follow the specification to the letter... just follow it enough so that you can integrate more easily with Microsoft products.

Naturally... Microsoft will probably find all kinds of limitations to the CMIS spec later on. This could be because there's a gap in the spec, the spec is limited in some real-world situation, or they just flat out don't care anymore. If history is any judge, that means their next move will be to violate the spec. While spec purists at IBM/OpenText/Documentum complain, Microsoft will happily make Word 2012 do something completely different... and break interoperability.

Expect it my friends...

So... for that company making vertical applications on top of ECM, my advice is this:

  • CMIS is a good start, but a true ECM standard will always be a work in progress
  • Expect Microsoft to follow most of the spec
  • Expect Microsoft to break the spec in both wonderful and horrific ways
  • Expect to spend a lot of extra time finding the magic voodoo to make Microsoft work
  • Expect half of your feature requests to be outside of what CMIS can do

This advice is partly mine... and partly the battle wounds from Oracle/Stellent developers who worked on making WebDAV work properly...

No Wonder ECM Confuses People, Part 2

Back in January, I blogged about how the Wikipedia entry for Enterprise Content Management was a bit thin... it was tagged as unclear, poor grammar, and in need of expert cleanup. Well, I checked again, and it appears to have gotten worse since I last checked:

Now in addition to being confusing, unclear, and grammatically incorrect... it's also using peacock terms, and is now written like an advertisement.

I'm not sure what to think of this... Is it a turf war between the marketing departments of the big firms? Is it that nobody outside of marketing cares to explain it to a layman, and they can't help speaking in marketing-ese? Personally, I've avoided writing anything there because I know my biases, and was hoping that a neutral expert -- like AIIM -- would take ownership of this page... or maybe some up-and-coming blogger who wants to make a name for himself.

New Oracle UCM Webcasts

I just got word about two new Oracle UCM webcasts next week, and thought I'd share!

The first one is on Paperless Personnel Processes... try saying that 5 times fast! If you are interested in making your HR processes involve less paper, this webcast should have lots of good tips and tricks for those of you who have PeopleSoft and would like to integrate it with Oracle UCM. It's next Tuesday, Nov. 17th, 10 a.m. PT/1 p.m. ET.

The second one is on Enterprise Document Management. It will offer tips and tricks for paperless order management, asset management, and accounts payable. If you are an E-Business Suite customer, I would highly recommend this one. It's next Wednesday, Nov. 18th, 10 a.m. PT/1 p.m. ET.

These are live webcasts, and I don't know if they will be recorded. So register, watch, and grill the presenters with tough questions ;-)

Oracle UCM Security: Challenges and Best Practices

I recently gave a security talk at the Minnesota Stellent User's Group... Stellent of course being the old name for Oracle Universal Content Management. I uploaded it to Slideshare, and embedded it below:

This talk is a variation on a talk I gave at Crescendo a few years back... it covers the security risks and vulnerabilities inside Oracle UCM, and countermeasures to prevent break-ins. This talk is not a how-to for integrating LDAP, Active Directory or Single Sign On... rather it's intended to be an introduction to cross site scripting, SQL injection, and other common web application attack vectors. It's a bit scary for a while, but then it tells you how to prevent attacks.
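For a taste of the countermeasures, here is a generic sketch of the two most common ones: bound parameters to stop SQL injection, and output escaping to stop cross site scripting. To be clear, this is plain Python using the standard sqlite3 and html modules; it is not code from the talk and not how Oracle UCM does it internally. The table and function names are hypothetical.

```python
import html
import sqlite3

def find_documents(conn, title):
    """Look up documents by title using a bound parameter.

    The ? placeholder keeps user input as data, so it is never parsed as SQL.
    """
    cursor = conn.execute("SELECT id, title FROM documents WHERE title = ?", (title,))
    return cursor.fetchall()

def render_title(title):
    """Escape user-supplied text before placing it in an HTML page."""
    return "<h1>" + html.escape(title) + "</h1>"

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO documents (title) VALUES (?)", [("Budget",), ("Roadmap",)])

    # A classic injection attempt matches nothing, because it is treated as a literal title.
    print(find_documents(conn, "x' OR '1'='1"))
    print(find_documents(conn, "Budget"))
    print(render_title("<script>alert(1)</script>"))
```

The same two habits, never build SQL by gluing strings together and never echo user input without escaping it, apply in whatever language your customizations are written in.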

Enjoy! And don't be evil...

Free 10gr3 Component: Add Tool Tips to Metadata Fields

I recently got a question from a customer about how to add tool tips to metadata fields. Like if you had a field named "Comments," you could float your mouse over that field, and you'd see a small popup with a description of that field. I said, no problem, just set this flag in your config.cfg, or as a side effect to a profile rule:

xComments:description=Comments about this content item

No different than the isHidden or isInfoOnly flags. Unfortunately, it didn't work...

I thought that was built into the core, because I distinctly remember making that feature myself. Or more correctly, I made a component called ProfileExtras which added a whole bunch of useful features to the 7.5 Profiles functionality... including this. I thought I rolled that into the core for the 10gr3 release, but I left Stellent before Oracle released UCM 10gr3...

I thought about telling the customer how to do it... but I realized it would take about as much time to do it myself as it would to describe to somebody else how to do it... So I whipped it up, and put it in the Bezzotech Library:

  • Tool Tips : A simple component that adds tool tips to metadata fields, so contributors know what to put in the fields.

Hopefully others find this useful as well...

Gartner Sued for $132 Million for Saying Mean Things

OK, this is just nutty...

A tiny Silicon Valley software vendor is taking on mighty Gartner, one of the technology industry's largest and most influential market research and consulting companies. The battle is playing out in a San Jose federal courtroom, where ZL Technologies is asking for $132 million in damages (plus even more in a punitive judgment), saying the research outfit damaged its prospects by ranking it in the bottom segment of its closely watched Magic Quadrant report. The MQ divides technology providers into different classes, with the bottom segment essentially forming a "do not buy" recommendation.

Blogger reactions are varied... but I agree that this is a pretty silly lawsuit.

ZL Technologies makes an email archiving product, and Gartner is not impressed with it... so in their opinion they call it a "niche" market player. Since in the US we have a little thing called the First Amendment, this suit should just be thrown out. Unless Gartner is guilty of some kind of fraud... but I doubt it. They're too big of a firm to take that risk.

Besides... calling a product "niche" is hardly an insult. Stellent was once "niche", then "visionary," and after many many years it made it to "leader". "Niche" hardly means "do not buy," it simply means that the product might not be suitable for some industries, or some uses. In order to be a "leader", you need an innovative product with a good strategy, and a large enough organization to ensure the product will be around for a while (and not gobbled up and shut down by Open Text). Even if you have the best technology in the world, if you don't have a future vision, and the ability to grow your business, you're going to be called "niche."

I disagree with Gartner frequently -- mainly because they focus a bit too much on the "ability to execute" angle, and they do tend to ignore open source a lot... but this lawsuit is just ridiculous.

Oracle ECM: Rated a "Leader" In Gartner Magic Quadrant

Oracle UCM rated highly in Gartner's latest mystic grid magic quadrant. They were ranked as leaders, along with IBM/FileNet, EMC, Microsoft, and OpenText. You can see the article for yourself, to compare Oracle versus the rest of the leaders. Here's what they say that's positive:

Oracle has been expanding its ECM market footprint while building content management functionality into its enterprise business applications. Oracle ECM Suite includes document management, WCM, records management, imaging and process management. Though it does have transactional content management functionality — including synergies with its own ERP and CRM applications — Oracle is widely considered to be more of a collaborative and contextual content vendor.

Strengths

  1. Oracle Universal Content Management (UCM) is a mature, well-integrated product suite that provides "productized" integrations with Oracle applications. With Oracle Fusion Middleware, it has integration with a broad set of complementary technologies, such as BPM, BI, portals and enterprise search.
  2. The size and capabilities of Oracle's sales force, product development and support organizations give it significant opportunities to grow its content management business and increase its market share. Sales incentives for performance in these areas appear to be helping to build momentum.
  3. Customer loyalty is high — Oracle customers often ask first whether Oracle already offers, or will soon deliver, products in any particular ECM component category.

And now for the inevitable negatives...

Cautions

  1. Although Oracle has presented a cohesive vision for content as part of infrastructure, it has been less clear in its vision for collaboration and social software offerings (Oracle WebCenter and Beehive, for example) and their ties to UCM. All were developed using Oracle Fusion Middleware, but integration is still needed, and Web 2.0 capabilities across the products need to be rationalized. Oracle's next-generation portal product, WebCenter, is promising but immature, and its collaboration product, Beehive, is a work in progress with no clear ties to content management at a time when most ECM vendors are adding richer collaboration and support for Web 2.0 in their core platforms.
  2. Other vendors are commonly chosen for invoice automation and ERP integration in preference to Oracle's products. Oracle intends to close this gap, but its delivery of a compelling imaging and process management solution is late.
  3. Oracle's customers don't consider that it has created the same sense of community around content and collaboration as several other leading ECM vendors have.

The complaint about imaging and process management being late is apt... however, I think that Oracle's IPM solution will dominate Oracle shops soon, especially when combined with Oracle's Business Intelligence solutions. The complaint about community kind of hits me hard, because I know a lot of people who have tried for years to build a solid community... Hopefully this "ding" will ensure these teams have the resources they need to get things done.

Regarding point number one, I'm actually nonplussed. A coherent "collaboration" strategy would be nice... but frankly, I'm hesitant for Oracle to jump on the E 2.0 / Web 2.0 / Collab 2.0 bandwagon and be nothing but a follower. I have yet to see anybody put together a true "collaboration" tool that doesn't immediately devolve into co-blaberation. At the moment, I'm thinking that Oracle could do something completely different... They could blend together the identity management tools from Sun and Oblix, the analytics tools from Siebel, and the standards (email, ECM, enterprise search, social software) to create something fundamentally different than what anybody else has.

Or at least, so I hope...

Print-On-Demand Google Books

Last week, Google announced a partnership where they would be able to print any book on demand that is in the Google Books library. They can do this by using the Espresso Book Machine, which can print out any book in about 4 minutes! Printed, cut, bound, and even with a full color cover.

This sounds pretty cool to me... and I'm curious how it will affect book stores, as well as new authors...

The Espresso Book Machine folks bill their system as an "ATM For Books," whereby you put money in, get a book out, and the author gets a small royalty. This makes the bookselling model highly distributed, and really lowers the barrier to entry for new authors.

I think the analogy is a bit inaccurate, tho... an ATM takes 10 seconds, whereas book printing takes 4 minutes. You'll need to set up dedicated systems where people can print out their books online, then go pick them up when finished. I could see this working really well at pharmacies, and maybe even at coffee shops. Sick of reading The Onion and City Pages whilst sipping your latte? Just print out some Bukowski and chill with the other hipsters...

Of course... browsing the library can be a bit tedious... the review system for Google Books is nowhere near as good as Amazon. Therefore, browsing for books that you might enjoy will be hard. However, since the entire book is indexed, it should be easier to find books when you know exactly what is in them... which is great for researchers looking for out-of-print books.

Since my first book is available on Google Books, you might be able to print out your own copy once it's no longer available.

Open World 2009, Day 1

Open World opened officially today... but I got there early for the "soft opening," including the briefing for my fellow Oracle ACE Directors. We had a surprise Q&A visit from Thomas Kurian himself. If I had known, I would have surely had a much bigger list of questions for him! Nevertheless, I learned quite a bit about Oracle's future product strategy. I can't share what I learned until after the conference, tho... they are planning a few announcements.

We kept trying to extract some info on the future of Sun product lines... but the Oracle folks were very tight-lipped about it. The European Union has not yet approved the merger -- mainly because of MySQL -- so they can't say a thing about it yet.

Some interesting news I'd like to highlight:

  • The Social Schedule Builder for Open World: my friend Chris Bucchere integrated his popular conference schedule builder with Oracle Mix... so if you have a Mix account you can use this to organize your conference. If you find Oracle's default Schedule Builder to be too clunky, check out this one. And since it was released a full 2 days before the conference, it's perfect for procrastinators.
  • Try Out Amazon Web Services For Free: provided you're at the conference, and you show up at the Fusion Middleware Lounge on Floor 3 of Moscone West. Some Amazon folks should be there to give you a quick tutorial, and let you test it out for free.
  • Oracle is giving away 400 copies of my book. If you registered for the Information Overload add on, you get an electronic copy of my latest book. Not sure if that's a good sign for book sales or not...

I'll be heading to a few more sessions and user groups today... and I'm sure I'll have some updates after the main keynote.

UPDATE: the Sunday keynote just ended... and since Oracle was nice enough to give me press credentials, I thought I should post my thoughts ASAP. They were still pretty hush-hush about what the acquisition will mean. The three big questions are:

  1. What will happen to MySQL?
  2. What will happen to Sun's hardware division?
  3. What will happen to Java?

That first question was the big one... it's probably the main reason why the EU has not yet approved the merger. Well, Scott McNealy made the obvious point that MySQL doesn't compete with Oracle; it competes with Microsoft SQL Server. Also, Oracle acquired two other open-source databases -- Sleepycat and Innobase -- and has increased R&D for them. Larry Ellison himself said Oracle promises to spend more resources on MySQL than Sun does right now. Given Oracle's past history with Open Source databases, I'm prone to trust Larry on this one. They'll likely use it as a wedge to get some of Microsoft's business when a company doesn't need Oracle's performance.

Oracle also seems to be committed to expanding Sun's hardware division. IBM tried to use the tiresome "Fear, Uncertainty, and Doubt" to scare existing Sun customers into dumping SPARC in favor of IBM hardware... but I don't think it will work. The new stuff they showed off -- like the 4 Terabyte F5100 FLASH memory array -- was really innovative. McNealy said you can get 4x I/O throughput by just bolting this on to existing storage infrastructure... not to mention ultra-low power consumption, and a much more compact footprint compared to IBM's stuff. Larry even issued a challenge: if you are an IBM hardware customer, and Oracle can't make your system run TWICE as fast on Sun hardware, they will give you $10 million. IBM was explicitly invited to try.

End of the day, Sun's hardware is better than IBM, IBM is Oracle's new enemy, and Larry likes to win. Ain't no way that stuff is going away...

Regarding Java, I don't think there was ever a question there... Oracle is heavily invested in Java, and is a big contributor. They are going to keep that thing going as long as they can. James Gosling himself was up on stage, saying he looked forward to the acquisition... because then he'd finally be working for a software company!

Har...

Overall, I think that was a really good way to soothe Sun customers, Open Source advocates, the EU, and Java Bunnies everywhere.

Oracle Open World, 2009

I'm off to Open World! I came early this year, because Oracle is doing the ACE Director briefing on Friday. That's always a bit tense for me: sneak previews on cool technology that I'm not allowed to blog about! Alas, I'll survive... It will be nice to see all the other Oracle ACEs again, like Sten, Lonneke and Chris. I already bumped into Jason Jones at the airport.

For the first time, I'm not presenting anything this year. I had planned a few talks on security and Site Studio 10gr4, but this summer was busier than normal, and I couldn't put them together in time for the deadline. Kind of a bummer, but no big deal: I'll just present them at Collaborate 2010, or the local Minnesota Stellent Users' Group.

I don't know what I'll be able to share after my briefing today, but I'll do what I can. Also, if you are heading to Open World, and you'd like to meet up, send me an email!

Site Studio Performance Tuning: Now Posted

In case you missed my talk last month... IOUG has posted the full video of my Site Studio Performance Tuning Webcast. This was an hour-long talk containing tips and tricks for making your web sites faster. Only half of it is specific to Site Studio or Oracle UCM: I also share tips on making general HTML pages faster, which should apply no matter what kind of system you use.

As usual... my presentation is available for download from Slideshare, if you'd like a copy... Although this one lacks the panache of the video version.

PS: sorry that it's in WMV format... I had no control over that...

The Best Fairy Tale Ever

Once upon a time, a guy asked a girl, "Will you marry me?"

The girl said, "NO!"

And the guy lived happily ever after and rode motorcycles and went fishing and hunting and played golf a lot and drank beer and scotch and had money in the bank and left the toilet seat up and farted whenever he wanted.

The End!

(Hat tip, Shelia and Michelle...)

The Deep, Dark, Secret Origin Of Oracle UCM's Security Model

On a recent blog post about Oracle UCM -- Should Oracle Be On Your Web Content Management Short List? -- CMS Watch analyst Kas Thomas commented that he thought Oracle's security model was a bit spooky. He admitted that this may be because he didn't know enough about it: his concern stemmed from an overly stern warning in Oracle's documentation.

Alan Baer from Oracle soothed his fears and said that the documentation needed a bit of work... The documentation mentioned that changing the security model might cause data loss, which is in no way true. It should say that changing the security model might cause the perception of data loss, when in fact the repository is perfectly fine... the problem is that when you make some kinds of changes to the security model, you'll need to update the security settings of all your users so they can access their content.

Nevertheless, I thought it might be a good idea to explain why Oracle UCM's security model is how it is...

Back in the mid 1990s when UCM was first designed, it had a very basic security model. It was the first web-based content management system, so we were initially happy just to get stuff online! But immediately after that first milestone, the team had to make a tough decision on how to design the security model. We needed to get it right, because we would probably be stuck with it for a long time.

  1. Should it be a clone of other content management systems, which had access-control lists?
  2. Should it be a clone of the unix file permissions, with directory and file based ownership?
  3. Or, should it be something completely different?

As with many things, the dev team went with door number 3...

Unix file permissions were simply not flexible enough to manage documents that were "owned" by multiple people and teams. The directory model was compelling, but we needed something more.

Access Control Lists (ACLs) are certainly powerful and flexible, because you store who (Bob, Joe) gets what rights (read, delete) to which documents. The ACLs are set by the content contributors when they submit content. However, ACLs are horribly slow and impossible to administer. For example, I as an administrator have very little control over how you as a user set up your access control lists. Let's say some kinds of content are so important that I want Bob to always have access, but Joe never gets access. If Bob gets to set the ACLs on check-in, then there's a risk he gives Joe access. It's tough to solve this problem in any real way without a bazillion rules and exceptions that are impossible to maintain or audit.

Instead, the team decided to design their security model with seven primary parts:

  • SECURITY GROUPS are like a classification of a piece of content. Think: restricted, classified, secret, top secret, etc. As Jay mentioned in the comments, these are groups of content items, and not groups of users.
  • ACCOUNTS are like the directory location of where a content item resides in a security hierarchy. Think: HR, R&D, London offices, London HR, etc. These are typically department-oriented, but it's also easy to make cross-departmental task-specific accounts for special projects.
  • DOCUMENTS are required to have one and only one security group. Accounts are optional. This information is stored with the metadata of the document (title, author, creation date, etc.) in the database.
  • PERMISSIONS are rules about what kind of access is available to a document. You could have read-access-to-Top-Secret-documents, or delete-access-to-HR-documents. If the document is in an account, then the user's access is the intersection of account and group permissions. For example, if you had read access to the Top Secret group, and read access to only the HR account, you'd be able to read Top-Secret-HR content. However, you would not see Top-Secret-R&D content.
  • ROLES are collections of security group permissions, so that they are easier to administer. For example, a contributor role would probably have read and write access to certain kinds of documents, whereas the admin role would have total control over all documents. Change the role, and you change the rights of all users with that role.
  • USERS are given roles, which grants them different kinds of access to different kinds of documents. They can also be granted account access.
  • SERVICES are defined with a hard-coded access level. So a "search" service would require "read" access to a document, otherwise it won't be displayed to the user. A "contribution" service would require that the user have "write" access to the specific group and account, otherwise you will get an access denied error.
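
To make the interplay of those seven parts a bit more concrete, here is a minimal Python sketch. It is not UCM's actual code: the class names, the `Access` flags, and the prefix-matching rule for hierarchical accounts (with a "/" separator) are all assumptions made for illustration. It only shows the idea described above: roles bundle security group permissions, users get roles plus account access, and a document that sits in an account is checked against the intersection of the two.

```python
from dataclasses import dataclass, field
from enum import IntFlag
from typing import Dict, List, Optional


class Access(IntFlag):
    """Hypothetical permission bits; the real product's levels may differ."""
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 4


@dataclass(frozen=True)
class Document:
    title: str
    security_group: str            # required: exactly one per document
    account: Optional[str] = None  # optional


@dataclass
class Role:
    name: str
    # security group name -> permissions this role grants on that group
    group_permissions: Dict[str, Access] = field(default_factory=dict)


@dataclass
class User:
    name: str
    roles: List[Role] = field(default_factory=list)
    # account name -> permissions granted directly to the user
    account_permissions: Dict[str, Access] = field(default_factory=dict)

    def group_access(self, group: str) -> Access:
        """Union of what all of the user's roles grant on one security group."""
        access = Access.NONE
        for role in self.roles:
            access |= role.group_permissions.get(group, Access.NONE)
        return access

    def account_access(self, account: str) -> Access:
        """Assumed rule: a grant on 'London' also covers 'London/HR'."""
        access = Access.NONE
        for prefix, granted in self.account_permissions.items():
            if account == prefix or account.startswith(prefix + "/"):
                access |= granted
        return access


def effective_access(user: User, doc: Document) -> Access:
    """Group permissions, narrowed by account permissions if an account is set."""
    access = user.group_access(doc.security_group)
    if doc.account is not None:
        access &= user.account_access(doc.account)  # the intersection
    return access


def can(user: User, doc: Document, required: Access) -> bool:
    """What a service with a hard-coded access level would ask."""
    return (effective_access(user, doc) & required) == required


# The Top-Secret-HR example from the list above.
reader = Role("top-secret-reader", {"TopSecret": Access.READ})
analyst = User("bob", roles=[reader], account_permissions={"HR": Access.READ})
hr_doc = Document("Salary bands", "TopSecret", account="HR")
rnd_doc = Document("Prototype specs", "TopSecret", account="R&D")
memo = Document("General memo", "TopSecret")  # no account at all
```

With that setup, `can(analyst, hr_doc, Access.READ)` is true and `can(analyst, rnd_doc, Access.READ)` is false, which matches the Top-Secret-HR versus Top-Secret-R&D case. Note that the check never touches a per-document list of people, which is the whole point.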

This kind of security model has many advantages... firstly, it is easy to maintain. Just give a user a collection of roles, and say what department they are in, and then they should have access to all the content needed to do their job. It works very well with how LDAP and Active Directory grant "permissions" to users. That's why it is usually a minimal amount of effort to integrate Oracle UCM with existing identity management solutions.

Secondly, this model scales very well. It is very, very fast to determine if a user has rights to perform a specific action, even if you need to do a security check on thousands of content items. For example, when somebody searches for "documents with 'foo' in the title," all the content server needs to do is append a security clause to the query. For a "guest" user, the query becomes "documents with 'foo' in the title AND in the security group 'Public'." Simple, scalable, and fast.
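
Here is a rough sketch of that "append a security clause" trick, using Python's built-in `sqlite3` module so it runs anywhere. The `documents` table, its columns, and the `secured_where` and `search` helpers are hypothetical stand-ins; the real content server has its own schema and search engine syntax. The point is only that the security check becomes one extra condition on the query, rather than a per-document lookup.

```python
import sqlite3


def secured_where(base_where, base_params, readable_groups):
    """Append a security clause limiting results to groups the user can read."""
    if not readable_groups:
        # No readable groups at all: make sure the query matches nothing.
        return f"({base_where}) AND 1 = 0", list(base_params)
    placeholders = ", ".join("?" for _ in readable_groups)
    where = f"({base_where}) AND security_group IN ({placeholders})"
    return where, list(base_params) + list(readable_groups)


def search(conn, term, readable_groups):
    """Find titles containing `term`, filtered by the caller's security groups."""
    where, params = secured_where("title LIKE ?", [f"%{term}%"], readable_groups)
    sql = f"SELECT title FROM documents WHERE {where} ORDER BY title"
    return [row[0] for row in conn.execute(sql, params)]


# Tiny in-memory repository for demonstration purposes only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (title TEXT, security_group TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [
        ("foo for everyone", "Public"),
        ("foo launch codes", "TopSecret"),
        ("bar chart", "Public"),
    ],
)

guest_results = search(conn, "foo", ["Public"])
cleared_results = search(conn, "foo", ["Public", "TopSecret"])
```

The guest only ever sees the Public "foo" document, while a user whose roles include the TopSecret group sees both. Either way it is a single query, no matter how many content items are in the repository.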

There are, of course, dozens of ways to enhance this model with add-on components... The optional "Collaboration Server" add-on includes ACLs, along with the obligatory documentation on how ACLs don't scale as well as the standard security model... The optional "Need To Know" component opens up the security a bit to let people see some parts of a content item, but not all. For example, they could see the title and date of the "Hydrogen Bomb Blueprints" document, but they would not be able to download the document. The "Records Management" component adds a whole bunch of new permissions, such as "create record" and "freeze record." I've written some even weirder customizations before... they aren't much effort, and are very solid.

I asked Sam White if he could do it all over again, would he do it the same? For the most part, he said yes. Although he'd probably change the terminology a bit -- "classification" instead of "role," "directory" instead of "account." In other words, he'd make it follow the LDAP terminology and conventions as closely as possible... so it would be even easier to administer.

I do think it is a testament to the skills of the UCM team that the security model so closely mirrors how LDAP security is organized... considering LDAP was designed over many years by an international team of highly experienced security nerds. I'm also happy when it gets the "thumbs-up" from very smart, very paranoid, federal government agencies...
