07.01.08
Clueless CIOs
Clueless magazine writes about clueless CIOs. Does anyone out there know what a CIO actually does?
A trail of random code
I knew it was only a matter of time before a pedantic database supporter would chime in that mapreduce is not the greatest thing since sliced bread.
Mapreduce is very cool, and the last people likely to understand this are database programmers.
What is mapreduce?
Mapreduce stems from functional programming languages such as Scheme. At the University of Illinois, I thought the CS program did students a disservice by teaching them Scheme. Apparently not. The simplest mapreduce function I can think of is counting words.
every good boy does good
Let’s create a map function that simply places a one (1) next to each word and sorts the resulting pairs.
(boy, 1)
(does, 1)
(every, 1)
(good, 1)
(good, 1)
Next, let’s create a reduce function that adds up the numbers for repeated keys.
(boy, 1)
(does, 1)
(every, 1)
(good, 2)
That’s mapreduce. If you had 1 trillion words to count, mapreduce becomes more useful. The data can start on many nodes, then return counted and sorted to many nodes. Even better, numbers are not necessary here. URLs can be used for both keys and values, such as ("www.amazon.com", "ebay.com"). Using this model, reverse links can be counted, Monte Carlo simulations can run, and the result is PageRank. There are many uses. This does not include every coding problem, but it opens many doors for formerly difficult problems.
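The word count walked through above can be sketched on a single machine in a few lines of JavaScript. This is only an illustration: the names mapWords and reduceCounts are made up here, and a real mapreduce system such as Hadoop would spread both steps across many nodes.

```javascript
// Map step: emit a (word, 1) pair for every word, then sort by key.
function mapWords(text) {
    return text.split(/\s+/)
        .filter(function (word) { return word.length > 0; })
        .map(function (word) { return [word, 1]; })
        .sort(function (a, b) {
            return a[0] < b[0] ? -1 : (a[0] > b[0] ? 1 : 0);
        });
}

// Reduce step: add up the values of adjacent pairs that share a key.
// This relies on the map output already being sorted.
function reduceCounts(pairs) {
    var result = [];
    pairs.forEach(function (pair) {
        var last = result[result.length - 1];
        if (last && last[0] === pair[0]) {
            last[1] += pair[1];
        } else {
            result.push([pair[0], pair[1]]);
        }
    });
    return result;
}

var counts = reduceCounts(mapWords("every good boy does good"));
// counts is [["boy", 1], ["does", 1], ["every", 1], ["good", 2]]
```

Because the reduce step only ever looks at neighboring pairs with the same key, each key could in principle be handed to a different node.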
Why would database people not understand this?
Database folks appear to be in a rut. They are overly concerned with optimization. They create the index, continue to add to the index, and retrieve data very quickly using the index. What mapreduce does appears wasteful. It creates the index once, cannot add to the index, and throws away all of the work each time it is run. I think someone in the Hadoop world should solve this problem of throwing away the index, but that’s only an optimization. Optimizations do not count where a paradigm shift occurred. Now from the article.
The database community has learned the following three lessons from the 40 years that have unfolded since IBM first released IMS in 1968.
* Schemas are good.
* Separation of the schema from the application is good.
* High-level access languages are good.
Are schemas good? This implies that data should be strongly typed. Now we are getting into the strongly-typed argument that seems never to be won between Java and C vs. Perl and PHP. My guess by extension is that Perl and PHP are bad and Hadoop is bad as well.
Schemas and applications could be separated, but that makes sense only in a database world. In mapreduce, the programmer is given control over his data. It’s called freedom. This also sounds too much like MVC arguments. Yes, databases and MVC save money in some cases, but in other cases, they just hold back creativity and development.
High-level languages are good. I agree. In a way, mapreduce programs are written in a high-level language in all cases. The loop that you think you write in Java or C is actually torn apart and run on many nodes. It only appears that the loop runs on one machine, yet it runs on many machines in tiny pieces.
Some of you out there should try a little mapreduce programming and see how it screws with your mind. It’s wonderful to feel different about a loop. I feel just as good doing this as when I first learned SQL.
I enjoyed 10 Absolute “Nos!” for Freelancers. This appears to be sound advice. As a counter to this sage advice, I give you
10 Absolute “No!s” for coders
1. Can you comment your code? No!
People ask me all the time if I can comment my code. I just won’t do it. Commenting my code simply invites lesser programmers to come along and steal my job. Remember this, commenting your code will only get you fired.
2. Can you use this/that design pattern? No!
Design patterns merely sell books and create bloated code. The answer is no. Too many devs rely on a singleton that really should have been a global. Why does no one praise global variables anymore? They are all scared by the design pattern mafia.
3. Can you stop using a global variable called temp4 and be more descriptive? No!
I’m old school. If you can’t remember all of your variables and what they do, you should not be writing code. Pure and simple, you should just sit down and remember this stuff.
4. Can you use this/that language for this project? No!
Assembly is good enough for Steve Gibson. It’s good enough for me. People who do not code in assembly regularly simply lose touch with reality quickly. Then people ask why I even need an assembler. Why not machine code? Machine code is never ever portable!
5. Can you get me some coffee? No!
People think because you are not a high-paid, suit wearing consultant, you have to do everything for them. I stop at coffee. I will get you water if I am getting a cup, but not coffee. It’s too demeaning.
6. Can you take a shower? No!
I’ve been here for 22 hours. I don’t have time to take a friggin shower.
7. Can you write down how the config node works for the rest of us? No!
See 1 above.
8. Can you fix your buffer overflow or memory leak? No!
I can find many better things to do than fix a memory leak that probably came from someone else’s library.
9. Can you write this as a reusable library? No.
Why would anyone ever want to use my code?
10. Can you write your code on more than one line? No!
There is a reason for semicolons to exist. They separate code. If you can’t read my code, you should get emacs or some IDE to format it for you.
Strangely, one programmer once worked with me who said no to maybe 8 of these requests. I was flabbergasted. Here was a real-life Bartleby who refused to do everything that comes naturally to most coders. I have seen multiple functions written on one line. That’s fine, but very hard to debug. I’m still not sure what to do with my temp3 and temp4 variables. Maybe they mean something. For the record, I never asked another coder to get me coffee.
The crazy thing about Hibernate is that it replaces the easiest part of the database, SQL. Hibernate sounds impressive, with its Spring framework and ease of use, but something is missing. It’s scalability.

Hibernate was never designed to scale. It was designed to be easy. What it does not do is federate databases, remap to new machines, migrate data and help you optimize your database. It falls into the usual Java mistake of first, placing all visible components in the programming language as Objects, then preventing you from using the best features of various databases. It corrupts MVC by placing the model in the controller and makes it more difficult to scale the databases separately.
What happens when you need to add a second database on a separate IP? A second session factory is necessary, which is essentially a second connection. If you try a single find on multiple databases, good luck. I read that multiple databases can be managed by a program called Castle, which is .Net only. I know there are other projects (like shards), but these will only be opaque solutions on top of Hibernate. Multiple databases make cross-db joins almost worthless. At that point, you lose the joy of Hibernate.
And what is the joy of Hibernate? Let me explain my travails.
Day 1:
I downloaded hibernate3.jar and its brethren. I launched quickly into the tutorial. The tutorial said it will take 3 days to jump fully into Hibernate. I thought, “That’s stupid. I can do this in a couple of hours.”
Day 3:
I finally finished my tutorial. And that was just the tutorial. It took quite a while to find all of the jars. The tutorial had errors. It claimed I could use “native” for the autoincremented column. I had to use “increment” instead. In the old days, a simple, useless SQL error would complain about your syntax. A quick trip to the MySQL docs solved this issue. With Hibernate, there is almost no way to debug it, because instead of a command line, you receive a stack trace for every little issue. I’m sure stack traces are loved by many Java fanboys, but really, won’t SYNTAX ERROR do?
After trying out the completely useless Java DB server, which never saved my persistent data, I switched easily to MySQL server. I hunted for my data. And hunted. In MySQL, my EVENTS table ended up in the worst possible database, “mysql”. Hibernate shows no respect! The mysql DB is for system information, not your stupid data. This is not good.
mysql> show tables;
+---------------------------+
| Tables_in_mysql           |
+---------------------------+
| EVENTS                    |
| columns_priv              |
| db                        |
Now here’s the killer issue. When my company grows, and I need 5 or 50 clusters, will Hibernate help me? No. A separate connection factory is necessary for each database, and each IP. All of the ease of use goes away when I add a second IP.
What does Hibernate replace?
SELECT * FROM events WHERE name='David Kellogg'
I could have written that in my sleep. Adding a column is not hard. SQL makes that easy. Mapping and casting in your code is not hard. What is so hard that it requires Hibernate? Not much really.
The real issue is that Hibernate is a framework. A framework is like a Minnesota bridge. When one beam collapses, the whole bridge falls down. Once you change one item beyond what the framework allows, the framework becomes brittle, and collapses under the weight of ease of use.

I love Rails, I think. During the 3 hours and 3 minutes I spent with r0r, I began to anticipate, then love, then loathe this simple application. What I expected to be a fun 30 minutes turned into a grueling 3 hours, including a crying baby and a nearly crying back-end developer.
Yes, Ruby on Rails made my baby cry.
I decided to run Rails before deciding its fate in my coding repertoire. I think Terry rejected Rails based on far too objective reasons of speed and reported scalability. As a sensitive guy, I can appreciate the subjective love people have for Rails. Since Rails is an Ogre, and an Ogre is made of onions, I peeled this onion to see its insides. Here is my minute-by-minute diary.
8:36 fink install Ruby. The assault on Rails has begun. Will it be easy? I don’t know a lick about Ruby. By the way, a ruby is a sapphire doped with chromium. When even a green laser shines on a ruby, it shines with a beautiful red.
As you can see, I had stars in my eyes.
8:38 Ruby is compiling. Soon this wonderful package will be mine.
8:39 ‘gcc’ running. ‘as’ running. The excitement is building!
8:40 I will skip a straight-off download. Instead I will use ‘gem’.
Little did I know this ‘gem’ would be a 3 hour ordeal.
8:44 gem: command not found
8:46 fink install gem unsuccessful.
8:50 Looking at README. Looks super easy.
That’s the README of gem, and no, it is not super-easy. That was a dumb thing to say. What I did not know was that the very latest gem required a self-update before it could be used. Otherwise, it spews errors, such as command '<' not found. Syntax error. Grrrr!
8:52 dirs for models and views. Fools!
This is really stupid. There’s this pattern-like thangy that requires a model, a view, and a controller for every stupid web application. It’s a dumb idea, because it reduces problems sometimes, but always separates documentation of an app into 3 places. What a waste. Some stupid prof probably came up with that MVC thing.
8:56 Ethan just woke up.
9:01 Gave Ethan Mylicon. I hope it helps.
9:15 Ethan back to his mom.
Rails made my son cry. I hope DHH feels sorry.
9:17 ruby setup.rb
9:18 Oops. sudo make me a sandwich.
9:19 Compiling gem. Couldn’t they just write the installer in Perl?
The gem thing is no gem. Gem is the retarted-ruby-app-installer. It’s even slower than YUM, and makes me wonder how fast and scalable their CPAN-like server is.
9:24 ERROR: While executing gem … (Gem::GemNotFoundException)
Could not find rails (> 0) in any repository
Really? Rails not in any repository? Maybe I should just self-update gem first. That totally makes sense, since I just downloaded the latest gem!
10:43 Gave Ethan to Rebecca for his final nightly feeding
10:48 Updating gems. Why does this take so long?
11:42 Checking my Twitter. A PHP guy is following. He’ll be sorry after he finds how I love rails!
11:44 alias rails_hello_world='rails hello && cd hello && ./script/generate controller welcome hello && echo "Hello World" > app/views/welcome/hello.rhtml && ./script/server -d && firefox 0.0.0.0:3000/welcome/hello'
Rails works! My server is alive! Actually, I do love Rails at this point, but it will all come crashing down in just 4 minutes.
11:48 Time for a stress test.
11:52 100 r0r welcome pages in 13 seconds
11:55 effects.js does not download until prototype.js is done
11:56 apache loads 100 in 5 seconds same Welcome page
00:00 Firebug says 1.82 seconds to load Welcome page in r0r
The truth is free. Rails is sooooo slow. 2 seconds to load a 7kB page is really sad. Prototype.js has to load before effects.js can load. This is a truly horrible app. Apache runs 2.6 times faster in my stress test. I’m sure there are r0r wienies out there that will complain that I didn’t load this or that customary app to accelerate my pitiful speed, but really, out of box, Ruby on Rails is a failure. I’m back to PHP and Apache.
12:52 Now I’ve had 52 minutes to reflect on this application.
Rails suffers from trying too hard. Yes, it is easy to install Rails, despite my wandering through the forest for 3 hours. Its problems of speed and inflexibility show right away. Maybe no one but I noticed the Javascript loading order.
If you wanted to correct something in a framework, you would have to rewrite the framework, which destroys the whole point of coding inside the framework. A framework like Rails is like a dictatorship. PHP and library-oriented languages are all about freedom. Perl is freedom. Linux is freedom. Maybe Ruby is freedom. With this MVC trash, Rails is like a straitjacket. Dictatorships and straitjackets serve their purpose, but not in my coding universe.

First, what is a closure?
A closure is a function that keeps access to the local variables of the scope in which it was created, even after the outer function has returned. Those variables stay at the values they held inside that scope. It’s pretty neat and handy when you see what closures can do. Here’s an example of the town of Tinyville, population 5. A new resident moves in, but the census starts before he signs a lease.
var population = 5;
function fun(count) {
    // count is captured here, at the moment fun() is called
    return function() {
        alert("count is " + count);
    };
}
setTimeout(fun(population), 1000);
population++;
document.writeln(population);
Result:
6
Alert: 5
The six is written to the screen, but the old value of the population is saved by the setTimeout and the closure function, fun.
Closures are so important that reading the wrong tutorial will doom the AJAX writer. Here’s an example from Apple’s AJAX tutorial,
req.onreadystatechange = processReqChange; // DOOMED!!
req is a global variable. That’s really bad, since the chance to use a closure has passed and no more than one request at a time may occur. So much for asynchronous. One alternative is to tag each outbound URL, such as "http://localhost/ajax.php?population=5". That will save the state, but that’s not as good as saving whole objects or sets of local variables. The MDC tutorial at Mozilla gets it right. Here’s the crucial line with my population variable injected.
httpRequest.population = 25;
httpRequest.onreadystatechange = function() { alertContents(httpRequest); } // BETTER
Why is this better?
You can tack on any variable to httpRequest and it will be separate from other requests. For instance "httpRequest.population=5" is a valid way to preserve needed variables. Later the population is used.
if (httpRequest.status == 200) {
    alert(httpRequest.responseText + "population is " + httpRequest.population);
} else {
    alert('There was a problem with the request.');
}
One final reason to use closures is that AJAX is asynchronous, so you are not guaranteed to receive responses in the same order as the requests were dispatched. This makes using a closure especially valuable.
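Here is a small sketch of that last point. It does not talk to a real server: fakeRequest is a made-up stand-in that just remembers each callback, so the replies can be delivered out of order on purpose. The town names and the askAbout helper are likewise invented for illustration.

```javascript
// Hypothetical stand-in for an AJAX call: it records the callback
// instead of contacting a server, so replies can arrive in any order.
var pending = [];
function fakeRequest(url, callback) {
    pending.push({ url: url, callback: callback });
}

var log = [];
function askAbout(town, population) {
    // The callback closes over town and population for this call only.
    fakeRequest("ajax.php?town=" + town, function (responseText) {
        log.push(town + " (" + population + "): " + responseText);
    });
}

askAbout("Tinyville", 5);
askAbout("Smallburg", 25);

// Deliver the replies in reverse order, as a busy network might.
pending.reverse().forEach(function (request) {
    request.callback("reply for " + request.url);
});
// log[0] is "Smallburg (25): reply for ajax.php?town=Smallburg"
// log[1] is "Tinyville (5): reply for ajax.php?town=Tinyville"
```

Each reply still lands next to the right town and population, even though the second request was answered first. With a single global, the later request would have overwritten the earlier one.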
I just joined myBlogLog. I hope it works well. So far, I cannot see the widget. Maybe it will appear soon.
OK. It appears to have worked. See the sidebar on the right. Nice.
Terry Chay made an interesting point about Yahoo management. It appears the Yahoo techie has it bad. But the Yahoo engineer looks down on the Ebay manager. The Ebay manager only belittles the Ebay engineer, but that’s the end of the line. There is nothing lower than an Ebay engineer. I’m not talking talent. That’s just the way the world is.
Ebay makes gobs of money. God bless them. Making money covers up all sorts of engineering and managerial problems. Google is a good example. Internally Google is a mess, but they have two products that work. That raises its managers’ score.
This is all too confusing. The whole process of superiority needs to be formalized. Here is an accurate hierarchy of who pecks whom in Silicon Valley. It has little to do with grunt engineering talent, only grisly managerial cluelessness.
This SBU chart is inspired by Luke.

As you can see clearly by the chart, Facebook engineers are managed by superior beings. Yahoo is in the middle of the pack, only because there are much stranger places to work. Ebay (excluding Paypal) is unfortunately the butt of all jokes. Just say “train”, and the Valley engineer will chuckle. Train you ask?
A train contains a fixed number of seats. An engineer has the privilege to sit in one of these seats. At some point, the eBay train will leave the station. Your train might leave at 3 am. You as an eBay engineer must stay up until 3 am to check in your working code to CVS. But the 2 am engineer already broke your libraries, so you must stay up until dawn to fix his problems. The train is so efficient that toolies must throw their bodies under the cogs of the Juggernaut to please the managerial gods.
Apple could not make the list due to secrecy. I only met one Apple employee who would divulge what he did. He works for Safari, which means, if you look in the upper-right-hand corner of your browser, he works for Google.
The n-squared catastrophe (Metcalfe’s Law) occurs while messaging many other nodes in a cluster. I have known this math since 5th grade at West Elementary.
Nodes  Connections
1      0
2      1
3      3
4      6
50     1225
This is also a game of connect the dots. How many lines are required to connect all dots to all others? n*(n-1)/2.
If there are 4 students in a class they need to pass 6 pairs of notes to each other to communicate. With 50 students, 1225 pairs of notes must trade hands. This is getting out of hand. Soon, instead of performing important tasks, the entire school day is taken up by note-passing. Surely there is a better way.
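For anyone who wants to check the note-passing numbers, here is a quick sketch of the formula. The function name pairCount is mine, not from any library.

```javascript
// Number of distinct pairs among n nodes: n*(n-1)/2.
function pairCount(n) {
    return n * (n - 1) / 2;
}

var table = [1, 2, 3, 4, 50].map(function (n) {
    return [n, pairCount(n)];
});
// table is [[1, 0], [2, 1], [3, 3], [4, 6], [50, 1225]]
```

Doubling the class from 50 to 100 students roughly quadruples the note pairs, from 1225 to 4950.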
Computer programmers are slow to learn this lesson. This is mainly a lesson of a multi-node setup. Most coders like to think of complex problems in terms of a single application. A multi-node setup is never like a single application due to the above scaling law. Consider the code.
int count = 0; // a global
That’s all. It’s a global. In a single application, you would not dream of attempting to sync multiple variables, count1, count2, etc. Why would you do this with multiple nodes? The obvious choice in an application is to create a global or static variable. The obvious choice in a multi-node setup is to use a single centralized database. The n-squared catastrophe will kill you otherwise.
Dave
Javascript continuations are intriguing for AJAX (remote scripting) applications. Here I will show you how to do it.
First, what is a continuation? A continuation saves the current state of a program and allows the coder to restart it at any time. A good example (though a poorly scalable one) is a login. Once you attempt to click on a link that requires a login, you go to a separate login page, then proceed to the intended page. If you used a continuation on the server side, you could save the current state of the user, then return him to that state as if the login jump had not occurred. This often presents the constraint that the user must return to the same server, a usual no-no for medium-sized startups. It also wastes memory or a database call that should be used for something else.
On the client side, though, the browser and user remain in place. Client-side continuations still use up memory, but here we stick it to the user, not to your poor server. We can use continuations to make AJAX coding easier. The callback is implied. All calls to the server appear to receive data instantly. Then the code can act on the data on the next line.
Though not natively implemented, continuations can be created using Javascript’s own flexibility. It turns out that JS knows what function it is running, can save the call stack through closures, and has access to the source code it is invoking.
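The last ingredient, access to the source code, can be seen in isolation. The sketch below uses a throwaway function called sample and a helper called tailOf, both invented for this illustration. It assumes the engine returns the function's source text as written, which is what the browsers I tried do.

```javascript
// Hypothetical sample function, used only to show source access.
function sample() {
    var first = 1;
    if (!first) {
        return;
    }
    var second = 2;
}

// Strip everything up to and including the early exit and the
// closing brace of its "if", leaving only the tail of the function.
function tailOf(fn) {
    var source = fn.toString()
        .replace(/(.|\r|\n)*return;.*(\n|\r)*.*}/, "");
    return "{ " + source;
}

var tail = tailOf(sample);
// tail now holds a block containing only "var second = 2;"
```

The tail is just a string holding a block of code, so it can be handed to eval later, inside whatever closure should supply its variables.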
This turns out to be a fun experiment. Can we save the entire environment, then pick up where we left off? We do below with a bit of PHP and Javascript. The tail of the function with the continuation is saved. It all appears synchronous, even when it is not.
Thanks to Steve Yen for inspiration on this.
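<?php
// Answer AJAX calls with a small object literal, then stop before the HTML.
if (isset($_GET['AJAX']) && $_GET['AJAX'] == 'true') {
    if (isset($_GET['function']) && $_GET['function'] == 'date') {
        echo "{'date': '".date('h:i:s A')."'}";
    }
    exit();
}
?>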
<html>
<head>
<title>Continuations</title>
</head>
<body>
<form method="GET" action="#Oops" onsubmit="get_remote_date(); return false;">
<input type="submit" value="Get Server Time">
</form>
<div id="changeme">
Text to change.
</div>
<script language="javascript">
/*
* Called by pressing the button.
* The tail of function and context are saved in the
* closure function 'tail_fun'
*
*/
function get_remote_date() {
    // Capture the source of everything after the early exit below.
    var tail = get_tail(arguments);
    // Evaluating the tail inside this closure supplies response_text.
    var tail_fun = function(response_text) { eval(tail); };
    var response_text = remote_request(tail_fun,
        "continuation.php?AJAX=true&function=date");
    if(!response_text) {
        return;
    }
    eval("response_obj = "+response_text);
    var changeme = document.getElementById("changeme");
    changeme.innerHTML = response_obj.date;
}
/*
*
* This accepts the callback function and
* makes the correct request.
* 'handle_success()' then checks whether
* data is ready and calls the callback.
*
*/
function remote_request(callback_fun, url) {
    var req = false;
    if (window.XMLHttpRequest) {
        req = new XMLHttpRequest();
    } else {
        req = new ActiveXObject("Microsoft.XMLHTTP");
    }
    if (!req) {
        return false;
    }
    req.onreadystatechange = function() {
        handle_success(req, callback_fun);
    };
    req.open('GET', url, true);
    req.send(null);
    return req.responseText;
}
/*
*
* handle_success takes the XHR object
* and the callback to be used upon reply.
*
*/
function handle_success(req, callback_fun) {
    if (req.readyState == 4) {
        if (req.status == 200) {
            callback_fun(req.responseText);
        } else {
            alert('There was a problem with the request.');
        }
    }
}
/*
*
* get_tail() finds the function name
* and parses the function to find the
* last lines after the return. Evaling
* this function allows the continuation.
*
*/
function get_tail(args) {
    // args.callee is the calling function; it is unavailable in strict mode.
    var source = args.callee.toString().
        replace(/(.|\r|\n)*return;.*(\n|\r)*.*}/,"");
    return "{ " + source;
}
</script>
</body>
</html>