People have a tendency to behave differently when they have insurance... they are a tad more careless than they should be, because it's suddenly "somebody else's responsibility" to pay when things go wrong.
Auto insurance? If somebody scratches your car the insurance company has to pay for the paint job... even if you yourself scratched your paint job a dozen times prior to that. Theft insurance? Maybe you care less if somebody steals your 3-year-old computer, because then you get a new one. Health insurance? Well, then you might demand a CAT scan for every headache, an MRI for a sprained ankle, and expensive drugs instead of just taking a walk once in a while...
If "somebody else" is paying the repair costs, people tend to stop taking care of their stuff... This is what economists call moral hazard, and it's a common reason why insurance is more expensive than it should be...
As we move towards health care reform in this country, a lot of people are concerned about this kind of behavior becoming more widespread. We need some kind of system that prods people into being more responsible with their health, but it cannot be coercion, nor can it be preachy nagging. This is the paradox about us: Americans love telling people what to do, but we hate being told what to do.
My solution? Bribe people to stay healthy. It sounds silly, but similar projects have shown promise for different kinds of insurance...
Take a classic case of unemployment insurance. In general -- and barring a widespread economic downturn like today -- most unemployed people find work within the first 2 months of being unemployed... even though they are receiving unemployment benefits. After this, there really aren't many people who get jobs in the 3rd, 4th, or 5th month of unemployment. That's because unemployment insurance lasts 6 months. At the last moment -- right about when the free money dries up -- there's another huge surge of people getting jobs.
In the 1990s, there was something called the Illinois Reemployment Bonus Experiment, where unemployed people got a bonus for getting a job within 60 days. Instead of waiting around for 6 months, most people worked hard to get jobs in 60 days, just to get that bonus. About half of them applied for different jobs with their previous employer. Overall, this decreased the costs of insurance, because they didn't have to pay the extra benefits for the other months. Critics say it could use some more fine-tuning -- many people quit their new jobs right after they had held them long enough to qualify for the bonus. Nonetheless, they proved their point, and saved a lot of money, despite the cheaters.
Why not try similar experiments with health care? How about instead of wasting money on preachy public service announcements that never work, you give $500 to anybody who quits smoking? How about a $1000 bonus for marathon runners? How about if your health care costs are significantly below average, and yet you still qualify as "healthy" in an annual physical, you get a little bling? How about on your income tax return, you can get a deduction of $300 minus your blood cholesterol?
Naturally, I'm not a doctor, nor an insurance guru, nor a biostatistician... and my friends who are experts seem divided on whether this will work. There are problems with setting the right "bribe," catching cheaters, general fairness, and a feeling that genuinely sick people shouldn't be doubly punished. All valid points, but I'm not talking about individuals; I'm talking about the aggregate.
All I know is that if we have universal health care -- in ANY form -- there will be no direct economic incentive to stay healthy. Given how generally unhealthy Americans are already, and how we like to overspend on doctors, that's a recipe for big financial problems. I know people "should" just stay healthy because it's the "right thing to do," but we also all "should" eat 5 servings of veggies per day. We don't, because the payoff is too vague.
But what if your health insurance provider gave you $500, if you could prove you ate broccoli every day?
I would bet anything that a lot more people would suddenly become more interested in their health...
It's been far too long since my last trip to the mother country... If I were there, I could have been part of this fun at the Liverpool Street station:
FULL DISCLOSURE: T-Mobile is my provider... I like them fine, but I hate cell phones in general. Full story here.
This is a pretty good visual analysis of the credit crisis... it's 10 minutes long, but it explains everything better than anything else I've seen:
Like playing hot-potato with a timebomb... Overall, this is very good, but it is missing answers to the following:
- What amount of leverage did banks have in the past? The 100-to-1 ratio of credit-to-cash is a bit of an exaggeration... Investment banks and lending banks used to be legally separate entities. Lending banks were careful with cash (10-to-1 leverage), while investment banks were free-wheelers (30-to-1 leverage). When Clinton repealed the Glass-Steagall requirement that they be separate, dangerous levels of leverage were inevitable... but it never got as bad as 100-to-1.
- Why don't people understand that homes are HORRIBLE investments? The common wisdom that being a homeowner was a good idea only worked because of the demographic shift of the baby boomer generation. Suddenly there were a bunch of well-off families that wanted a home, and demand exceeded supply. This drove prices up, and fooled people into thinking that home ownership was a good investment. Home ownership in America will never be that great of an idea... unless we all have tons of kids, we import a lot of immigrants, or your local area has an equivalent demographic shift of new people.
- Why did Greenspan steadfastly refuse to raise interest rates, when the economy was clearly overheated? He himself said Wall Street demand for bad mortgages drove the financial crisis... He could have stopped all of this if he raised interest rates in 2004. Why did Greenspan do absolutely nothing useful when he was in charge? A bowl of soup would have done a better job...
- Where were the credit ratings agencies in all this? The only reason sane investors purchased these crappy bundles of sub-prime mortgages was because folks like Moody's and S&P claimed sub-prime mortgages were "safe." If it weren't for the idiotic ratings agencies, everybody would have known it was a time bomb, there wouldn't be much demand for them, and we wouldn't be in this mess. These folks should be fired, and legally barred from doing any job that requires math.
- What is the size of the "credit default swap" market? The last estimate was about $44 trillion: more than all the money in the world! The only reason Wall Street used such odd terms was because if they called it "insurance," then it would be regulated. Some government official would have audited their books years ago, and said "hang on a minute, you don't have enough cash to repay your obligations if something REALLY bad happens. You can't sell this anymore!" In effect, folks like AIG made "promises" to repay people if their stock market investments lost value... but naturally, AIG didn't have $44 trillion in cash to cover all their promises... selling insurance when you cannot pay claims is FRAUD, thus every executive at AIG belongs in jail.
- Finally... why do all sub-prime mortgage owners smoke cigarettes???
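Those leverage ratios are worth working through with actual numbers. A minimal sketch (the percentages are illustrative, not historical figures): the higher the assets-to-equity leverage, the smaller the asset decline needed to wipe out a bank entirely.

```python
def equity_after_drop(leverage, drop):
    """Fraction of starting equity remaining after the bank's
    assets fall by `drop`, given an assets-to-equity `leverage`."""
    assets = leverage        # e.g. $30 of assets...
    equity = 1.0             # ...per $1 of equity
    loss = assets * drop     # asset losses come straight out of equity
    return round((equity - loss) / equity, 6)

# A 2% asset decline at a lending bank's 10-to-1 leverage: painful, survivable
print(equity_after_drop(10, 0.02))   # 0.8 -> 80% of equity left

# The same 2% decline at an investment bank's 30-to-1: 60% of equity is gone
print(equity_after_drop(30, 0.02))   # 0.4 -> 40% of equity left

# At 30-to-1, a mere ~3.3% drop (1/30) wipes out the equity entirely
print(equity_after_drop(30, 1 / 30))
```

That's why 30-to-1 was already dangerous territory, even without the 100-to-1 exaggeration.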
At the QCon convention in London this year, they have an interesting track: Historically Bad Software Ideas. I love the concept... put together a list of ideas that seemed good at the time, and some people probably found useful. However, in the end they were Really Bad Ideas that cost billions of dollars.
The historically bad ideas are as follows:
- Remote Procedure Calls: including things like DCOM, CORBA, EJBs, XML-RPC, and SOAP... all of which are fundamentally flawed. Integrating remote systems will always be tricky, and you aren't doing anybody any favors by trying to make a remote system seem local. We would all be better off if developers were forced to memorize The Fallacies of Distributed Computing, and then deal with reality. Look out, REST, you're about to be ruined as well if you're not careful... Or so says Steve Vinoski, the author of many books and articles on RPC.
- J2EE: Enterprise Java filled an important niche when developers wanted to flee from proprietary systems and APIs into some kind of standard middleware. But, as those Java systems became slaves to committee-driven standards that focused more on selling software than empowering developers, the developers again fled to open standards, open source, and Plain Old Java Objects... Or so says Rod Johnson, the inventor of The Spring Framework
- The Null Pointer: multiple, multiple, so many reasons why this is a billion-dollar mistake... If compilers never allowed you to compile pointers that might be null, how many problems could have been avoided? Several billion dollars' worth, at least... Or so says Tony Hoare, the inventor of the null pointer.
- Architectures that ignore multi-core, parallel, virtualized environments: Not so much a billion dollar mistake... but a billion dollar wasted opportunity. Developers have to stop thinking about serial processes on one box, and start thinking in terms of a massively parallel cloud... Or so says Oracle Vice President Cameron Purdy, from the Coherence team (a truly impressive product, IMHO).
- Software Standards: My personal pet peeve is the requirement that everything -- and I mean everything -- be a "standard." Obviously a good standard is always superior to a good API... but a good API beats a mediocre standard any day of the week! Too often the process involved in creating a "standard" is premature, political, overly rigid, and just plain awful at serving the needs of the end users... Or so says Paul Downey, who has worked on OASIS, WS-I, and W3C standards.
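Hoare's complaint is easy to demonstrate. A minimal sketch in Python, whose `None` plays the same role as the null pointer: nothing stops you from passing a "nullable" value around, so the crash surfaces far from its cause. Making the possibility explicit in the type signature (as null-free languages force you to) pushes the check to where it belongs. The `find_user` function and its data are my own invention for illustration.

```python
from typing import Optional

def find_user(user_id: int) -> Optional[str]:
    """Return a username, or None if the id is unknown.
    The Optional in the signature is the honest version of a
    pointer that 'might be null' -- callers must handle both cases."""
    users = {1: "alice", 2: "bob"}
    return users.get(user_id)

def greet(user_id: int) -> str:
    name = find_user(user_id)
    if name is None:   # the check a null-free type system would force on you
        return "Hello, stranger!"
    return f"Hello, {name}!"

print(greet(1))    # Hello, alice!
print(greet(99))   # Hello, stranger!
```

Skip the `None` check and `greet(99)` would happily build the string "Hello, None!" -- or blow up later, somewhere unrelated. Multiply that by every pointer dereference ever written, and the billion-dollar figure starts to look conservative.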
Naturally, some of these talks are going after some sacred cows... I'm sure a number of my readers are fans of SOAP, J2EE, and Standards, so I anticipate these talks might trigger some controversy in the enterprise software world. So what do you think? Which "Historically Bad Idea" might be getting a bad rap? Which additions would you make to this list?
The White House just launched their latest Democracy 2.0 web site: Recovery.gov. It helps you get up-to-date info about how your stimulus money is being spent. It's pretty slick, although it appears to be down right now. It's running the open source Drupal content management system... which is the same CMS I use to run my own blog.
As Alex noted in the comments, they are using a customization of theme recovery_v3, and they appear to have re-written a lot of the components from scratch. Might they contribute their customizations back to the Drupal community?
Another stimulus-related web site you should check out is StimulusWatch.org, which lets citizens vote on prospective city projects that might get some of the federal money. These are not yet approved by the federal government, so voice your opinion before it's too late!
Naturally, I would have gotten a warm fuzzy if Recovery.gov used Oracle ECM, but I'm just jazzed that they are using version control at all! I'd like to take this to the next level, and force Congress to use something like Subversion to write legislation... Then we'd know exactly who to blame for specific bills ;-)
The question is, how do we make enterprise search better? Some people complain that enterprise search should behave more like Google search, which I vehemently disagree with, for one primary reason: enterprise search is a FUNDAMENTALLY different problem than internet search. Here are some examples:
The internet search problem is like this:
- Heavily linked pages, which can be analyzed for "relevance" and "importance"
- Spam is a constant problem
- People don't want you to monitor their behavior
- People obsess about their Google PageRank
- People obsess about their hit count
- People aren't looking for the answer, they are looking for an answer
The whole problem reminds me of a scene from The Zero Effect:
Now, a few words on looking for things. When you look for something specific, your chances of finding it are very bad... because of all things in the world, you only want one of them. When you look for anything at all, your chances of finding it are very good... because of all the things in the world, you're sure to find some of them.
Internet search is like looking for anything at all... whereas enterprise search is like looking for something specific:
- People don't want general information; they want the 100% definitive answer
- The trust level is usually higher between co-workers than between random web surfers... or at least it should be. Otherwise, you've got bigger problems than information management.
- You know exactly who is running the search
- You know exactly what department they are in, and what content they are likely to need
- You know exactly their previous search history, possibly even their favorite "tags"
- Spam is minimal, or non-existent
- Content uses few, if any, hyperlinks to help determine relevance
- People usually write content out of obligation, and don't usually take care to make it easy for their audience to understand
Trying to solve both problems with the same exact tool will only lead to frustration...
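Those differences suggest a different ranking formula: since the enterprise knows exactly who is searching, it can weight results by the searcher's context instead of by hyperlinks. A minimal sketch of the idea -- the field names and weights here are my own invention, not any product's API:

```python
def score(doc, user, text_relevance):
    """Boost a document's raw text-match score using what the
    enterprise already knows about the searcher: their department,
    and the tags they've favored in past searches."""
    boost = 1.0
    if doc["department"] == user["department"]:
        boost += 0.5   # content from the searcher's own department
    if set(doc["tags"]) & set(user["favorite_tags"]):
        boost += 0.3   # overlaps with tags they've used before
    return text_relevance * boost

user = {"department": "engineering", "favorite_tags": ["java", "search"]}
doc = {"department": "engineering", "tags": ["java"]}

# Same words on the page, but ranked higher for *this* user
print(score(doc, user, 10.0))   # 18.0
```

An internet search engine can't do this without creepy levels of tracking... but inside the firewall, the directory server already knows all of it.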
Now... Solving this problem with social tools is a much easier, and arguably better, approach. People usually don't want to know the answer; people usually want to know who knows the answer. This is an observation as old as Mooers' Law (1959) about information management:
“An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have it.”
Fifty years later, and folks still don't quite seem to get it... The average user does not want to read enterprise content! They don't read documentation on the subject, nor do they read books on the subject, nor do they read blogs on the subject... In general, people don't care to actually learn anything new; they just want the quick answer that lets them move on and get back to their normal job. Most people look for information so they can perform some kind of task, and then they'll be more than happy to forget that information afterward. It's a rare individual who learns for the sake of knowledge... These folks are sometimes called Mavens, and everybody wants to be connected with these Mavens so they can do their jobs better. As a result, these Mavens will always be overwhelmed with phone calls, emails, and meeting invites.
As those mediums became flooded, some of your resources fled to other places -- like Twitter, or Facebook, or enterprise social software -- and forced would-be connectors to follow. This constant movement (or hiding) helps a bit... but it's only a matter of time before those mediums get flooded as well, and the noise overwhelms the signal.
In order to truly solve the enterprise search problem, you need to first understand why people may choose to never use enterprise search, no matter how good it is... then try to bring them back into the fold with socially enabled enterprise search tools. Don't just help people find information; help them find somebody who understands what the information means. Connecting people with mere words can easily backfire, and might actually make these people a burden on society. Instead, connect them with real, live humans who are eager to teach the knowledge being sought. At the same time, you need to work hard to protect these Mavens, so they don't flee your system in favor of another.
This is a problem that Google's search engine cannot solve -- mainly for privacy and trust reasons -- but it is 100% solvable in the enterprise. I'm just wondering why so few have done it...
There's a great developer site out there called 99 Bottles Of Beer. It shows you how to output the lyrics of the oh-so-annoying camp song in well over 1000 different programming languages.
Whoa... 1000 languages, you say? Yes, there are well over 1000 known programming languages, but please keep in mind how developers think. Most of these languages are clunky, impractical, or intentionally impossible to use. These are sometimes called esoteric languages, or even Turing tarpits. Here are some of my favorite bizarre programming languages:
- Whitespace: no letters, no numbers, no symbols... the only valid syntax is tab, space, and carriage returns.
- LOLCODE: the syntax looks like something you'd see on a LOL cats poster. I HAZ A BEERZ ITZ 99! IM IN YR LOOP! IZ VAR LIEK 0? KTHXBYE!
- Piet: just damn pixels on a screen... no letters even!
- Cow: instead of letters, numbers, and symbols, you only get moOmOOmoOmOoOOM.
- Brainf**k: trust me... you do NOT want to maintain code written in this language...
Kidding aside, there's a pretty good argument that learning how to print out 99 bottles of beer is a useful exercise when learning a new language. You need to learn the syntax of variables, conditionals, text output, and loops. Not to mention the fact that every language has nuances that can sometimes help you to further minimize your code base without sacrificing clarity... there's probably a dozen ways to write it in each language, each with different benefits.
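For comparison, here's one sketch of the same exercise in Python -- just one of the dozen possible approaches, but it exercises exactly those basics: a loop, a conditional for the singular "bottle", and string output.

```python
def bottles(n: int) -> str:
    """'99 bottles', '1 bottle', or 'no more bottles'."""
    if n == 0:
        return "no more bottles"
    return f"{n} bottle{'' if n == 1 else 's'}"

lines = []
for n in range(99, 0, -1):                       # count down from 99
    lines.append(f"{bottles(n)} of beer on the wall, {bottles(n)} of beer!")
    lines.append("Take one down, pass it around,")
    lines.append(f"{bottles(n - 1)} of beer on the wall!")

print("\n".join(lines[:3]))   # print just the first verse
```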
So -- seeing how Oracle UCM was being left out -- I submitted the below code to their site. 99 Bottles of Beer, in IdocScript:
<$numBottles = "99", bottleStr = " bottles "$>
<$loopwhile (numBottles > 0)$>
  <$verse = numBottles & bottleStr & "of beer on the wall,\n" &
            numBottles & bottleStr & "of beer!\n" &
            "Take one down, pass it around,\n"$>
  <$numBottles = numBottles - 1$>
  <$if numBottles > 0$>
    <$if numBottles == 1$>
      <$bottleStr = " bottle "$>
    <$endif$>
    <$verse = verse & numBottles & bottleStr & "of beer on the wall!\n"$>
  <$else$>
    <$verse = verse & "no more bottles of beer on the wall!\n"$>
  <$endif$>
  <$verse$>
<$endloop$>
Naturally, there are multiple ways to do this... you could use resource includes, localization strings, result sets, etc. But that's part of the fun of learning a new language. I'll leave it as an exercise for my audience to make it better.
This is a pretty concise presentation about the stock market... it puts the current downward swing into perspective (Hat Tip: Get Rich Slowly)
One of the biggest challenges in social networks is keeping them updated. When you first log in, it's a blank slate, and you have to find all your friends and make connections to them. This is a bit of a pain, so sites like Facebook and LinkedIn allow you to import your email address book. They then data-mine the address book to see who you know that might already be in the network, which helps you make lots of connections quickly.
Ignoring the obvious security and privacy concerns, there are still two big problems with this:
- These systems find connections, but they ignore the strength and quality of those connections.
- You have to constantly import your address book if you keep making new friends.
In my latest book, I give some practical advice about how Content Management fits in with social software and Enterprise 2.0 initiatives... One of the ideas that I like to drive home is that not all connections are equal, and it takes a lot of effort to keep quality information in your social software systems. Who is connected to whom? Which connections are genuine? And who is just a "link mooch" who is spamming people with "friend" requests just to ratchet up his ranking?
That latter one is particularly problematic on LinkedIn... It's littered with sub-par recruiters who send friend request spam so they can get something from you... but they never care to do anything for you.
Luckily, in the enterprise these problems can be solved relatively easily: data mine your email archives for who is connected to whom! By monitoring a host of statistics on who emails whom, about what, and when, you have a tremendously powerful tool for building social maps. You can determine who is connected to whom, who is an expert on which subject, and where the structural holes are in your enterprise. And you never need to maintain your connections! Any time you send a message to a friend, your social map is automatically rebuilt for you!
In order to do so, you'll need to run some data mining tools to find answers to the following questions:
- Who do you send emails to? These are the people you claim to be connected to.
- Does this person reply to your emails? If so, the connection is mutual.
- How often do you email? A one-time email is probably not a connection, but a weekly email might be a strong connection.
- How long does it take them to reply to you? A faster reply usually means your communications get priority with them, and they feel a stronger connection to you.
- How long do you take to reply to them? Again, a faster reply from you means that their communications get priority from you, meaning you feel a strong connection as well.
- Do you answer emails about a topic, or just forward them along? Just because you are the "point man" for Java questions, that doesn't mean you "know" Java... but it probably means you "know who knows" Java, which is sometimes even better.
- Does one person usually do all of the initiation of new emails? If so, then this might be a lopsided friendship, or it might just mean that one person has more free time.
- What are the topics of conversation? In reality, the more often you discuss work, the weaker the connection! If you also discuss gossip, news, current events, sports, movies, family, or trivia, then you probably have a stronger connection. The more topics you discuss, the more likely you are to be close friends.
- What is the flow of email from one department to another? If it's peer-to-peer, then these departments are comfortable sharing information. If it always goes through the chain of command, then these departments are socially isolated, and probably unlikely to trust each other.
- Who do you email outside the company? If an employee in the marketing department emailed a friend who works at the company Ravenna, and your sales person is trying to connect with somebody at Ravenna, then these two employees might want to connect.
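The questions above boil down to a scoring function over message metadata. A minimal sketch, assuming each message is reduced to a (sender, recipient, timestamp, topic) tuple -- the weights are arbitrary illustrations to show the shape of the analysis, not tuned values from any real product:

```python
from collections import defaultdict

def connection_strength(messages):
    """Score each pair of people using the signals described above:
    reciprocity (do both sides write?), frequency (how often?),
    and breadth of topics (work-only chatter is a weaker tie)."""
    sent = defaultdict(int)     # (a, b) -> count of messages a sent to b
    topics = defaultdict(set)   # unordered pair -> set of topics discussed
    for sender, recipient, _ts, topic in messages:
        sent[(sender, recipient)] += 1
        topics[frozenset((sender, recipient))].add(topic)

    scores = {}
    for (a, b), count in sent.items():
        replies = sent.get((b, a), 0)
        if replies == 0:
            continue            # one-way traffic isn't a mutual connection
        pair = frozenset((a, b))
        # reciprocated volume, boosted by how many topics the pair shares
        scores[pair] = min(count, replies) * len(topics[pair])
    return scores

msgs = [
    ("bex", "andy", 1, "work"), ("andy", "bex", 2, "work"),
    ("bex", "andy", 3, "sports"), ("andy", "bex", 4, "sports"),
    ("bex", "recruiter", 5, "work"),   # never answered -> no connection
]
print(connection_strength(msgs))
```

Add reply-latency and cross-department flags to the tuple and the same loop answers the rest of the questions... the point is that the raw material is already sitting in the mail archive.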
Unfortunately, many employers have a policy against using company email for personal communications. Ironically, this policy could hurt the employer in the long run, because analyzing the violations of that policy is frequently the best way to determine who is well connected in your company! So, before you deploy any social software in the enterprise, encourage your employees to goof off via email (within reason), and set up some technology to data-mine your email archives (like Oracle Universal Online Archive, or something similar). Then keep tuning your map based on the email messages people send.
That will help you hit the ground running with enterprise social software...
UPDATE: This book tour has been rescheduled for March 17th-19th.
Well, it's not really a book tour... but Andy and I will be visiting 3 cities for roundtable discussions on "Pragmatic Content Management". Oracle is organizing the whole shindig, and space will be limited... Andy will be giving a talk on Pragmatic ECM strategy, then I will present on implementation advice. Then there will be a 30-minute roundtable discussion, and we'll wrap it up before lunch.
For more specific information, please read the official invitation from Oracle. Here are the cities and dates:
- Cincinnati: Tuesday, March 17, 2009
- Memphis: Wednesday, March 18, 2009
- Houston: Thursday, March 19, 2009
If you want a book signed, please register and drop by!
The boys over at InfoVark tagged me a few weeks back, trying to revive the "why do you blog?" meme. I'll oblige, mainly because I've wanted to write something along these lines for a while.
Why Do I Blog?
This is actually my fourth blog... I tried to get into it before, but it never worked out. I was too busy, I didn't have enough to say, it just didn't feel right. I started this site back in 2006 so I'd have a landing page for my first book. I mentioned bexhuff.com several times in the book so people could come to my site, download the sample code, ask questions, and find links to other ECM resources.
One problem... by the time I finished the book and sent it off to the publishers, I still hadn't launched my blog yet!
So... with panic mode setting in... I decided to force myself to write a lot of content before the book hit the shelves. I wrote some good articles, some crappy ones, but I just kept on writing. Writing writing writing! When I thought I wrote enough, I wrote some more, and saved them for later publishing.
Oddly enough, that trigger was what it took for me to finally enjoy blogging. I also noticed that the more I blogged, the better my writing became. These days, I blog for three main reasons:
- To keep my writing and communication skills sharp.
- To draw attention to events/articles on the web that deserve commentary.
- To inject my contrarian opinions into technical matters that my readers might find interesting.
That seems to be a good formula... Google Analytics says I got 170,000 pageviews in 2008... despite virtually zero self-promotion, and no guest bloggers... Not bad for somebody who also works 60 hours per week, runs his own company, writes books, manages an 18-unit condo, and travels ;-)
What Do I Blog?
Initially the topics were a tad scattered... lifehacks, technology, and all that good stuff. These days I try to keep it to software -- specifically in the information management realm -- and connections between it and other topics. I also have occasional posts on science, communication theory, alternative energy, economics, and general half-baked ideas I have... but I try to keep those to one per week.
How Do I Blog?
My blogging technique varies...
If I'm blogging just to keep my writing skills sharp, I'll take a complex subject and do my best to explain it clearly. One of my heroes there is the Nobel Prize-winning physicist Richard Feynman, who firmly believed that if you cannot explain a concept to the average college freshman, then either you're a rotten communicator, or you don't understand the concept very well. I strongly believe that this is true... so when I want to wrap my head around a tricky subject, I try to explain it to the "average educated person." Sometimes I succeed, sometimes I fail...
If I'm blogging to draw attention to recent events or articles, then I usually start by trawling the web. I like Digg and Reddit... sometimes I just take a look at what was tagged on Delicious in the past 5 seconds. If something leaps out at me, and I think it's appropriate for my readers, I'll mention it. I also follow a lot of B-grade and C-grade bloggers to see if they have penned any original prose. I try to blog twice per week, so I use this technique the most often.
If I'm feeling like writing something contrary to mainstream opinion, then my process is very methodical... it might take days, weeks, or even months to write a post, depending on how strongly held the mainstream opinion is. I usually have a half dozen such posts in my head at any one time, waiting for the right moment. I covered my technique in an earlier post: Five Ways To Move Beyond Conventional Wisdom, so I won't bore everybody by repeating the five steps here. I rarely win friends with contrarian posts, but I do voice objections that need to be heard.
I suppose I'll keep this in the Oracle universe, and tag the following people:
- Billy Cripe
- Dan Norris
- Matt Topper (because he's the new guy at The Apps Lab)
- Eddie Awad
- Chris Bucchere
Have at it, boys!
The W3C -- my absolutely positively most favorite standards body ever -- has just come up with an XML namespace for emotions! I must say that I fully support this specification... who on earth would ever want to type something as confusing and ambiguous as this:

:-)
When we can do The Right Thing™ and use XML instead:
<emotionml xmlns="http://www.w3.org/2008/11/emotionml">
  <emotion>
    <category set="everydayEmotions" name="Amusement" />
    <intensity value="0.7" />
  </emotion>
</emotionml>
Ugh... If this had been released on April 1st instead of November 20th, I would have been amused. But now I'm just plain sad. As Wearehug said, "It is becoming increasingly difficult to distinguish W3C specs from Onion articles."
(Hat Tip Aristotle)
I'm a power hater. I don't hate often, but when I do, I do it with gusto. So I have to say, this pile of vaporware called "The Semantic Web" is really starting to tick me off...
I'm not sure why, but recently it seems to be rearing its ugly head again in the information management industry, and wooing new potential victims (like Yahoo). I think it's trying to ride the coattails of Web 2.0 -- particularly folksonomies and microformats. Nevertheless, I feel the need to expose it as the massive waste of time, energy, and brainpower that it is. People should stay focused on the very solvable problem of context, and thoroughly avoid the pipe dreams about semantics. Keep it simple, and you'll be much happier.
First, let's review what the "Semantic Web" is supposed to be... A semantic web is a system that understands the meaning of web pages, and not merely the words on the page. It's about embedding information in your pages so computers can understand what things are, and how they are related. Such a beast would have tremendous value:
"I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize." -- Tim Berners-Lee, Director of the W3C, 1999
Gee. A future where human thought is irrelevant. How fun.
First, notice that this quote was from 1999. It's been ten years since Timmy complained that the semantic web was taking too long to materialize. So what has the W3C got to show for their decade of effort? A bunch of bloated XML formats that nobody uses... because we apparently needed more of those. By way of comparison, Timmy released the first web server on August 6, 1991... Within 3 years there were 4 public search engines, a solid web browser, and a million web pages. If there was actually any value in the "Semantic Web," why hasn't it emerged some time in the past 18 years?
I believe the problem is that Timmy is blinded by a vision and he can't let go... I hate to put it this way, but when compared against all other software pioneers, Timmy's kind of a one-trick pony. He invented the HTTP protocol and the web server, and he continues to milk that for new awards every year... while never acknowledging the fact that the web's true turning point was when Marc Andreessen invented the Mosaic Web Browser. I'm positive Timmy's a lot smarter than I, but he seems stuck in a loop that his ego won't let him get out of.
The past 10,000 years of civilization have taught us the same things over and over: machines cannot replace people, they can only make people more productive by automating the mundane. Once machines become capable of solving the "hard problems," some wacky human goes off and finds even harder problems that machines can't solve alone... which then creates demand for humans to solve that next problem, or to build a new kind of machine to do so.
Seriously... this is all just basic economics...
Computers can only do what they are told; they never "understand" anything. There will always be a noticeable gap between how a computer works, and how a human thinks. All software programs are based on symbol manipulation, which is a far cry from processing a semantically rich paragraph about the meaning of data. Well... isn't it possible to create a software program that uses symbol manipulation to "understand" semantics? Mathematicians, psychologists, and philosophers say "hell no..."
The Chinese Room thought experiment pretty clearly demonstrates that a symbol manipulation machine can never achieve true "human" intelligence. This is not to imply human brains are the only way to go... merely that if your goal is to mimic a human, you're out of luck. Even worse, Gödel's Incompleteness Theorems prove that all sufficiently powerful systems of formal logic (mathematics, software, algorithms, etc.) are fundamentally limited: if consistent, they contain true statements they cannot prove, and if inconsistent, they will happily "prove" false ones. Clearly, there are fundamental limits to what computers can do, one of which is to understand "meaning".
Therefore, even in theory, a true "semantic web" is impossible...
Well... who the hell cares about philosophical purity, anyway? There are many artificial intelligence experts working on the semantic web, and they rightly observe that the system doesn't have to be equivalent to human intelligence... As long as the system behaves as if it has human intelligence, that's good enough. This is pretty much the Turing Test for artificial intelligence: if a human judge interacts with a machine, and the judge believes he is interacting with a real live human, then the machine has passed the test. This is what some call "weak" artificial intelligence.
Essentially, if it walks like a duck, and talks like a duck, then it's a duck...
Fair enough... So, since we can't give birth to true AI, we'll get a jumble of smaller systems that together might behave like a real, live human. Or at least a duck. This means a lot of hardware, a lot of software, a lot of data entry, and a lot of maintenance. Ideally these systems would be little "agents" that search for knowledge on the web, and "learn" on their own... but there will always be a need for human intervention and sanity checks to make sure the "smart agents" are functioning properly.
That raises the question, how much human effort is involved in maintaining a system that behaves like a "weak" semantic web? Is the extra effort worth it when compared to a blend of simpler tools and manual processes?
Unfortunately, we don't have the data to answer this question. Nobody can say, because nobody has gotten even close to building a "weak" semantic web with much breadth... As late as 2006, Timmy himself admitted, "This simple idea, however, remains largely unrealized." Some people have seen success with highly specialized information management problems that had strict vocabularies. However, I'd wager that they would have had equivalent success with simpler tools: a controlled thesaurus, embedded metadata, a search engine, or pretty much any relational database in existence. That ain't rocket science, and each alternative is older than the web itself...
Now... to get the "weak semantic web" we'll need to scale up from one highly specialized problem to the entire internet... which yields a bewildering series of problems:
- Who gets to tag their web pages with metadata about what the page is "about"?
- What about SPAM? There's a damn good reason why search engines in the 90s began to ignore the "keywords" meta tag.
- Who will maintain the billions of data structures necessary to explain everything on the web?
- What about novices? Bad metadata and bad structures dilute the entire system, so each one of those billion formats will require years of negotiation between experts.
- Who gets to "kick out" bad metadata pages, to prevent pollution of the semantic web?
- What about vandals? I could get you de-ranked and de-listed if you fail to observe all ten billion rules.
- Who gets to absorb web pages to extract the knowledge?
- What about copyrights? Your "smart agent" could be a "derivative work," so some of the best content may remain hidden.
- Who gets to track behavior to validate the semantic model?
- What about privacy? If my clicks help you sell to others, I should be compensated.
- Will we require people to share analytical data so the semantic web can grow?
- What about incentives? Nobody using the web for commerce will share, unless there's a clear profit path.
I'm sorry... but you're fighting basic human nature if you expect all this to happen... my feeling is that for most "real world" problems, a "semantic web" is far from the most practical solution.
So, where does this leave us? We're not hopeless, we're just misguided. We need to come back down to earth, and be reasonable about what is and is not feasible. I'd prefer that people worked towards the much more reachable goal of context sensitivity: make systems that gather a little more information about a user's behavior, who they are, what they view, and how they organize it. This is just a blend of identity management, metadata management, context management, and web trend analysis. That ain't rocket science... And don't think for one second that you can replace humans with technology: instead, focus on making tools that allow humans to do their jobs better.
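A context-sensitive system doesn't need grand ontologies; even a toy view-tracker captures useful signal. Here's a minimal sketch (the class and method names are my own invention, not any real product's API):

```python
from collections import Counter, defaultdict

class ContextTracker:
    """Record which topics each user views, then report their top interests."""

    def __init__(self):
        # user -> Counter mapping topic tag -> view count
        self.views = defaultdict(Counter)

    def record_view(self, user, tags):
        self.views[user].update(tags)

    def top_interests(self, user, n=3):
        return [tag for tag, _ in self.views[user].most_common(n)]

tracker = ContextTracker()
tracker.record_view("alice", ["python", "search"])
tracker.record_view("alice", ["python", "metadata"])
print(tracker.top_interests("alice"))  # 'python' first -- viewed twice
```

No reasoning engine, no RDF; just counting what people actually look at, which is often enough to make a tool feel "smart."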
Of course, if the Semantic Web goes away, then I'll need to find something else to power hate. I'm open to suggestions...
In the early days of computer science, people discovered what later came to be called "Conway's Law":
Any organization that designs a system (defined more broadly here than just information systems) will inevitably produce a design whose structure is a copy of the organization's communication structure.
In other words, let's say you are designing a complex system -- an auto manufacturing plant, a new financial market, a hospital, the World Health Organization, or a large software solution -- the efficiency of the end result will always be limited by how efficiently the committee communicates. Let's say two segments of your system need to communicate with each other... but the two designers of those segments were unable to communicate effectively with each other. The end result will invariably be a system where those two segments cannot exchange important information properly. If I have to run an idea by my boss before handing it off to my peer in another department, then I'll almost always design a system that uses those same paths for sending important messages... whether or not it's the optimal approach.
This helps explain why large companies love Enterprise Service Buses, but small companies think they are the spawn of the devil... neither is correct; both opinions derive from the communication structure in their respective organizations.
This goes beyond the obvious communication problems between silos and corporate fiefdoms... even the physical components you design will inevitably mirror your ability (or inability) to communicate. From Wikipedia:
Consider a large system S that the government wants to build. The government hires company X to build system S. Say company X has three engineering groups, E1, E2, and E3 that participate in the project. Conway's law suggests that it is likely that the resultant system will consist of 3 major subsystems (S1, S2, S3), each built by one of the engineering groups. More importantly, the resultant interfaces between the subsystems (S1-S2, S1-S3, etc) will reflect the quality and nature of the real-world interpersonal communications between the respective engineering groups (E1-E2, E1-E3, etc).
Another example: Consider a two-person team of software engineers, A and B. Say A designs and codes a software class X. Later, the team discovers that class X needs some new features. If A adds the features, A is likely to simply expand X to include the new features. If B adds the new features, B may be afraid of breaking X, and so instead will create a new derived class X2 that inherits X's features, and puts the new features in X2. So, in this example, the final design is a reflection of who implemented the functionality.
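The two-engineer example translates almost directly into code. A hypothetical sketch (the feature itself is just a placeholder):

```python
class X:
    """Engineer A's original class."""

    def existing_feature(self):
        return "original behavior"


class X2(X):
    """Engineer B's addition: rather than risk breaking A's code, B derives
    a subclass. The class hierarchy now records a social fact (B didn't feel
    safe editing X), not a technical requirement."""

    def new_feature(self):
        return "new behavior"


widget = X2()
print(widget.existing_feature())  # inherited from A's class
print(widget.new_feature())       # isolated in B's subclass
```

Six months later, a maintainer will wonder why X2 exists at all -- and the honest answer is "because A and B didn't talk."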
How do you avoid becoming a similar statistic? Simple: be flexible.
The more flexible you are when making the design, the more flexible you are to adopt new ideas and new ways of communicating, the more likely you are to create a useful product. For those who looooooooooove process: what you need is a process for injecting flexibility into your system when metrics demonstrate a communication problem.
The number one task of any business is to make money. The number two task is to improve inter-departmental communication. After that, all problems can be solved.
I've always said that the most important skill a technical person can possess is the ability to communicate... you might not have a remarkable impact on any one feature, but you'll be better positioned to understand the whole problem, and the whole solution. Talk with your peers, and make sure that the lines of communication are 100% open across divisions... especially divisions that hate each other. Make sure people feel connected, and that they can trust the opinions and needs of others.
Only then will a committee be able to design a system less dysfunctional than itself...
There are a lot of non-techie skills that make you a better software developer... I've found that when trying to debug people's problems, you run into a lot of situations where you are reading off DNS names that sound almost exactly the same: "Did you say 'dee zee cee zee one,' or 'dee cee zee zee one,' or 'dee zee zee zee one,' or..."
You get the picture...
So, one of my new year's resolutions was to memorize the phonetic alphabet. This is the code that the military uses to help prevent confusion when dealing with pass codes and acronyms.
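If memorizing is slow going, a few lines of code can do the spelling for you. A quick sketch using the NATO phonetic alphabet:

```python
# Official NATO spellings ("Alfa" and "Juliett" are intentional).
NATO = {
    "A": "Alfa", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliett",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "Xray", "Y": "Yankee",
    "Z": "Zulu",
}

def spell(name):
    """Spell a hostname letter by letter; digits and dashes pass through."""
    return " ".join(NATO.get(ch.upper(), ch) for ch in name)

print(spell("dzcz1"))  # Delta Zulu Charlie Zulu 1
```

No more "dee zee cee zee" twenty-questions over a bad phone line.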
Of course... if you start using these you might want to warn people... otherwise your audience might wonder why Romeo and Juliet are drinking a Kilo of Whiskey in Quebec...
So, what non-techie skills do you find helpful?
Yikes... Confusing, unclear, and cluttered since July of 2007... Not quite a ringing endorsement from the "crowd," eh?
The Wikipedia article for the Association for Information and Image Management isn't any better... at least Stellent's tiny tiny page is excusable since it doesn't exist as a company anymore. Considering the fact that folks like IBM, Oracle, EMC, and Microsoft all have product suites in this industry -- and considering how all of them tout blogs and wikis -- you'd think that somebody would have cleaned up Wikipedia by now.
I guess we all have better things to do...
Personally, I find this a refreshing reminder that the "semantic web" will NOT save you. Unless you do the hard work of creating new business processes around new information management technology, you'll just be cluttering your enterprise with ever more outdated, useless, and false data.
Cordell sent me an interesting article about how IT certification is becoming less important. Some bloggers -- like James -- believe IT certifications could have value if they just raised their standards a bit... but I'm not so sure. You used to be able to take the average tech-inclined person, send him through a training course, and then get him a decent job in IT. Not so much these days, and it's not because of the ailing economy. Here are some other reasons:
- Certifications are Vendor-centric: they should instead be solution-centric, or more like a mentorship program
- Certifications Have a Short Life Cycle: significantly shorter than a college degree
- Certifications Are Not Real-World Oriented: they are brain dumps which present technology you may never use
- Certifications Have Been Devalued: some are just high-tech diploma mills.
- No Oversight Body: who gets to say who is certified to train database management? Oracle? Microsoft? Both? Neither?
- Degree vs. Certification vs. Experience: with experience and a degree, why on earth would you need a certification?
- HR People Are Not In Touch with the Real World: and nowhere is this more true than in IT
- Budget Cuts: no more training dollars from big companies, so certification companies are desperate for bodies
- Glut of Certified People: anybody can get one, so everybody does get one
- No One Knows Which Certifications Matter: some are very tough to pass, others have a 100% pass rate
The fundamental problem is that it is unbelievably difficult to determine how good an IT person will be based on a piece of paper.
Folks on The Daily WTF and Joel On Software have debated endlessly over the proper mix of education, experience, and certification... each has benefits... but most employers prefer college degrees to certifications.
However, this raises another question... since all education loses value over time, why would an employer prefer a candidate with a 5-year-old college degree over one with a 6-month-old certification?
Probably because people have a general idea of the quality of education that is possible at a well-known college. They can look up the name in any number of guides that rank college programs, and get some level of third-party validation. Also, you never quite know where a new employee's true talents may lie... Most of what I learned for my Computer Science degree is outdated, and not frequently relevant to my job... However, those were not the only courses I took in college. I took dozens of non-computer courses that helped me be meticulous when experimenting, write more clearly, think more abstractly, and better visualize complex integrated systems. These courses helped me develop true skills and talents, as opposed to just filling my head with stale knowledge.
Personally... I feel that a college degree means you can learn, experience means you've made the typical rookie mistakes, and certifications/conference attendance means you're dedicated to continuing your education. Of course, none of these demonstrate that your knowledge/skills/talents will be of any practical use to your employer... so it's always a risk.
I've been reading a lot about economics and finance lately... Retirement planning gets a lot more complex when you run your own business! In any event, I've learned several things that made me highly skeptical about commonly held advice about retirement savings plans. In particular, I now believe that nobody should ever invest in a Roth IRA. This probably goes against what a lot of financial planners say, but I have my reasons.
Why? First, let's go over the differences:
- Traditional IRA: This is a pretty good deal... these let you purchase mutual funds of stocks and bonds, and take a tax deduction when doing so. Your money grows tax-deferred while it's in the fund. At age 59.5, you can take money out of the fund without penalty, and you pay federal taxes on it as income.
- Roth IRA: This is a relatively new idea... identical to the Traditional IRA, except in two ways. First, you cannot take a tax deduction when you put money into it. Second, since you paid taxes up-front, you can take money out of the fund and not pay taxes on it! Wow, sounds pretty good, huh?
For example... let's assume some dude named Bob Lemonjello is 30 years old, and puts $5,000 per year into an IRA -- the current maximum Bob can contribute. Assuming a reasonable 8% annual return over the next 30 years, that yields a total of about $610,000 by retirement. If Bob did this as a traditional IRA, that $5,000 would be tax-deductible every year... saving him about $50,000 in taxes before he retires. Not bad... but when Bob takes money out of his IRA, it will be taxed as income... so the government will probably get about $150,000 of his nest egg.
If Bob instead did this as a Roth, he wouldn't get a tax deduction, so he'd wind up paying roughly an extra $50,000 in taxes during his working years... but then he has $610,000 of tax-free cash! Woo hoo! The government can't touch a dime of that! (And no, he can't cheat by funding a traditional IRA and rolling it over to a Roth right before retirement: a Roth conversion is taxed as ordinary income on the entire amount converted, so Bob would owe taxes on the full $610,000, not just his contributions.)
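Bob's numbers are easy to sanity-check. A quick sketch, assuming $5,000 contributed at the start of each year and a flat 8% return:

```python
def future_value(contribution, rate, years):
    """Future value of equal contributions made at the start of each year."""
    balance = 0.0
    for _ in range(years):
        balance = (balance + contribution) * (1 + rate)
    return balance

nest_egg = future_value(5_000, 0.08, 30)
print(f"${nest_egg:,.0f}")  # roughly $612,000
```

Compound growth does most of the work here: Bob only contributes $150,000 of that nest egg out of pocket.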
Bwa ha ha ha ha!!! Bob is free... FREEEEEEEEEE!!!
I have one question: does anybody actually believe that a future US government would let Bob keep his Roth money without paying any taxes on it? Does anybody actually believe that the US government will never change the tax laws, and will sit idly by, not demanding a piece of that easy money?
Reality time: Roth IRAs and Roth 401Ks are amazing tax-free investments that have become wildly popular among people in every tax bracket... which is exactly why future governments will not keep their promises.
Let me remind you... until 1983, Social Security benefits were considered tax-free income... then Ronald Reagan signed a law that made up to half of a recipient's benefits taxable! Bill Clinton later boosted it, so that up to 85% of a recipient's benefits can be subject to income tax. Face facts... when a government wants money, it will find clever ways to tax you. They will be called "Roth Withdrawal Fees," or "Conditional Rollover Fees," or just plain "We got all the guns! Gimme Gimme Gimme!"
The entire benefit of the Roth IRA rests on the belief that the government won't change the tax laws. I, for one, have zero faith that the government will keep its promises about the Roth. If you want the sure thing, go for a Traditional IRA: an immediate tax deduction at exactly the moment when you are in a high tax bracket, along with tax-deferred growth. You'll pay taxes when you take money out, but in retirement you'll almost certainly be in a lower tax bracket.
So what do you think? Will the US Government keep its promises? If the tax laws change, will a Roth IRA be worse than a Traditional IRA?
You've probably heard about the technique of Rick Rolling... it's basically the web version of the oh-so-mature "made you look" game. You tell people that a link goes to some interesting info, when in fact the link goes to a YouTube video of Rick Astley singing "Never Gonna Give You Up." It's also led to the trend of live Rick Rolling, where you trick somebody into looking at the lyrics of the song... like what happened during the 2008 Vice Presidential Debates.
Well, now people are so suspicious of YouTube links that they won't click on them anymore. So the answer is to raise the bar a little. My technique is to use open redirects from legitimate websites to hide links to YouTube!
For example... see the link below to Yelp.com? Where do you think it goes? Cut and paste it into a browser URL to see where it actually goes:
It looks like a link to Yelp.com, which is a restaurant review site... but with a little URL magic, you can force Yelp to annoy people. Naturally, once Yelp catches wind of this, they will shut down the open redirect pretty fast, so you have to keep looking for more. The technique is pretty simple:
- Find a large/important site that links frequently to small/unimportant sites... such sites usually have open redirects.
- Poke around and see if you can spot any URLs that look like they might be redirects... the URLs might have parameters like url=http://example.com, redirect=example.com, or something similar.
- Copy one of these redirect URLs into your address bar
- In the site URL, replace the redirect URL parameter with a Rick Rolling URL -- such as http://www.youtube.com/watch?v=Yu_moia-oVI -- and see if the site redirects to YouTube.
- For advanced Rick-Rolling, you might want to disguise the link to YouTube by URL encoding it, so the destination isn't obvious in the address bar.
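The obfuscation step is trivial to script. A sketch that percent-encodes every byte of the target (the `url=` parameter name and `example.com` host are placeholders -- real redirect parameters vary by site):

```python
def obfuscate(url):
    """Percent-encode every byte so the destination is unreadable at a glance.
    Servers decode %XX escapes in query parameters, so the redirect still works."""
    return "".join("%{:02X}".format(b) for b in url.encode("utf-8"))

rick = "http://www.youtube.com/watch?v=Yu_moia-oVI"
# 'url' is a hypothetical parameter name; substitute whatever the site uses.
print("http://example.com/redirect?url=" + obfuscate(rick))
```

The resulting link shows only the legitimate site's domain followed by a wall of `%XX` escapes -- nothing that screams "YouTube."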
You may now Rick-Roll with impunity...
Why do these open redirects exist? Simple: to fight link SPAM. This problem was big on Amazon.com, because at first they allowed people to submit links in comments. That meant folks could link back to SPAM sites from Amazon.com. Bad enough on its own, but when Google noticed that Amazon linked to a site, that site's page rank and "relevance" would increase... meaning those awful SPAM sites would rank higher in Google search results. There were many proposals to combat this problem... but the only one that completely solves it is to route every outbound link through a redirect on Amazon.com itself.
This does help the battle against SPAM, but unless you do it right, it's a major security hole... people see a link that goes to Amazon.com, click on it, and get hijacked to an evil site. The URLs look completely legit, and they bypass most SPAM/SCAM filters. These are particularly useful for phishers trying to steal bank account numbers, credit card numbers, and the like. Back in 2006 I found these security holes on Google, Amazon, MSN, and AOL. I alerted them all to the bug; some of them fixed it... however, more sites make this same error every day. I'm hoping that broadcasting this technique to Rick Rollers might do some good... that way, Rick Rollers will find these security holes on new sites before hackers, crackers, and phishers do.
Basically, I'm betting that the annoying outnumber the evil... Let's hope I'm right...