Friday, January 13, 2006

Use what you need...

When I first moved to California, I worked for a company called Natural Language Incorporated. During my tenure there, I got saddled with the unenviable task of getting NLI's product, which the customers knew as NLI Natural Language, and which we knew as "swan" (System Without A Name), to connect to IBM's SQL/DS product on VM/CMS.

VM/CMS is kind of the MS-DOS of IBM 370 operating systems. VM stands for Virtual Machine; CMS stands for Cambridge Monitor System, because it was developed at IBM's Cambridge Scientific Center in Cambridge, Massachusetts. CMS is like MS-DOS in the sense that you get a whole machine to yourself, and you can do whatever you want there, within the confines of your little virtual universe, just like an IBM PC running MS-DOS.

VM makes the big IBM 370 into a bunch of little machines. VM and CMS are actually completely independent - in theory I guess you could run CMS right on the bare metal, if the bare metal looked like the kind of machine that CMS was originally designed to run on. But CMS running on the bare metal would be boring, because CMS is like MS-DOS - it doesn't do very much.

Anyway, by the time I got my hands on VM/CMS and SQL/DS, the worm had turned. I turned 41 today, so I'm kind of old, but VM/CMS comes from the geek generation that preceded mine, when geeks mostly wore suits or white lab coats and were referred to as poindexters. So to me, VM/CMS was an ancient, hoary remnant of a past age (I'm told it was cool when it was first invented).

I could tell many tales of my experience with this hoary remnant. But today's tale has to do with what virtual machines do. Think of a machine (by which I mean a computer) as a universe in which a computer program runs. The machine is the set of all memory locations and devices (a device is something like a disk drive, or a printer, or a video display) that the program can access.

Back in the early days of home computers, you got the whole real machine to yourself. This is called running on the bare metal. Your program would be using the entire computer. Now, one thing you need to know about computers is that they do a single unit of work for every tick of the clock (yes, Virginia, I am oversimplifying). This means that every time the computer's clock ticks (which happened about two million times a second on an Apple ][, for example), something would be done. Always. Whether there was something to do or not.

When you're writing a program on the bare metal, and you want to talk to a device, you do something called polling. Devices are how you communicate with the outside world. The outside world isn't always doing something. Humans, for instance, do not do two million (admittedly tiny) operations per second. So most of the time, what a personal computer does is to wait for its human to type something. I type about a hundred words a minute, which at five or so characters per word is roughly eight characters per second.

So this means that if the computer is waiting for me to type, it spends nearly all of its time asking the question "has that putz typed anything yet (htptay)?", and then "ah! wonder of wonders! the human typed 'a'. Let me make a note of that. (lmmanot)" So during the course of a minute's typing, the computer asks the htptay question perhaps 119,999,500 times, and does lmmanot 500 times.
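For the curious, here is a minimal sketch of what a polling loop looks like, written in Python rather than anything an Apple ][ would run. The keyboard is faked with a dictionary of arrival times, and the function name and tick counts are made up for illustration; the point is only how lopsided the two counters end up.

```python
def poll_keyboard(total_ticks, key_arrivals):
    """Simulate a bare-metal polling loop, one check per clock tick.

    total_ticks:  how many clock ticks to simulate.
    key_arrivals: dict mapping a tick number to the character typed then
                  (a stand-in for a real keyboard device).
    Returns (typed_text, htptay_count, lmmanot_count).
    """
    typed = []
    htptay = 0   # "has that putz typed anything yet?" answered "no"
    lmmanot = 0  # "let me make a note of that"
    for tick in range(total_ticks):
        key = key_arrivals.get(tick)  # poll the pretend keyboard
        if key is None:
            htptay += 1               # nothing there; the tick is wasted
        else:
            typed.append(key)
            lmmanot += 1
    return "".join(typed), htptay, lmmanot


# Two keystrokes in a million ticks.
text, htptay, lmmanot = poll_keyboard(1_000_000, {10: "a", 500_000: "b"})
print(text, htptay, lmmanot)
```

Almost every trip around the loop finds nothing to do, which is the whole problem.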

So that's what you do when you're running on the bare metal. As I say, it's called polling. However, a virtual machine is not bare metal. It's a simulation of bare metal. The reason you make a virtual machine is that you want more than one universe. Remember, on a system like MS-DOS, you can only ever do one thing at once. IBM 370 computers used to be really expensive. I mean, millions of bucks, back when that was real money, not the down payment on a house in Silicon Valley.

So a 370 is usually shared amongst many users. And this usually works just fine, because most of the time, the users aren't doing anything. The computer is just waiting for them to hit a key. So with virtual machines, you partition the bare metal into slivers, called timeslices, and you partition the disk drive into virtual disk drives, each of which is a slice of the disk. To a program running on one of these virtual machines, it feels just like you're running on the bare metal, except that time moves jerkily, because you're only getting slices of it, not the whole loaf.

So the additional complication here is that sometimes when the user types something, it turns out that the computer needs to actually *think*. Thinking takes time. On the bare metal, you just think as fast as you can until you're done thinking, and then you proudly present your results to the user. On a virtual machine, you think as hard as you can given the slices of time that the computer gives you.

But remember, computers are stupid. They just do what you tell them. Polling for input looks just as much like thinking as solving Fermat's last theorem, to a computer. So imagine a bunch of little virtual machines, all of them polling for input most of the time, but thinking some of the time. The virtual machine manager doesn't know which ones are polling and which ones are thinking, so it gives equal time to all, whatever they are doing.

This is Bad. Really Bad. Because it means that time that could be spent solving Fermat's last theorem is instead being spent asking htptay over and over and over again, millions of times.

Now, call them poindexters all you want, but the guys in the white lab coats at Cambridge were no dummies. It was obvious to them that polling was a bad idea. Why give a timeslice to a program that's just waiting? Why not have it so that when the program starts to wait, it gives up the rest of its timeslice, and never gets another timeslice until its wait is over? Sure enough, this is what they did.
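Here is a toy comparison of the two policies, as a sketch only. It is not how VM actually schedules anything; the function names, the "waiting"/"thinking" labels, and the slice counts are all invented for illustration, and a real machine would flip between the two states over time.

```python
def equal_time(machines, total_slices):
    """Hand out timeslices round-robin to every machine, busy or not."""
    names = list(machines)
    granted = {name: 0 for name in names}
    for i in range(total_slices):
        granted[names[i % len(names)]] += 1
    return granted


def skip_waiters(machines, total_slices):
    """Hand out timeslices round-robin, but only to machines that are thinking."""
    runnable = [name for name, state in machines.items() if state == "thinking"]
    granted = {name: 0 for name in machines}
    for i in range(total_slices):
        if not runnable:
            break  # everyone is waiting, so the slices simply go unused
        granted[runnable[i % len(runnable)]] += 1
    return granted


# Three machines waiting on their humans, one actually thinking.
machines = {"vm1": "waiting", "vm2": "waiting", "vm3": "waiting", "vm4": "thinking"}
print(equal_time(machines, 1000))    # the thinker gets a quarter of the slices
print(skip_waiters(machines, 1000))  # the thinker gets all of them
```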

So on VM/CMS, most of the time most of the virtual machines aren't getting any timeslices at all, because they aren't busy. They're just waiting. By the way, sometimes geeks (and poindexters too, I guess) refer to htptay as busy waiting. Busy waiting is bad. The opposite of busy waiting, where you don't get a timeslice until the human types something, is called event-driven I/O (I/O is short for Input/Output; recording what someone types is an example of input, and displaying something on the screen is an example of output).
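On a modern system you can see the difference with a few lines of Python. This is a sketch, not real device I/O: a `threading.Event` stands in for "the human typed something", and the function names and the 0.2 second delay are arbitrary choices of mine. One thread busy waits, the other blocks until the event fires.

```python
import threading
import time


def busy_wait(flag, results):
    """Spin, asking over and over whether the flag is set (htptay)."""
    checks = 0
    while not flag.is_set():
        checks += 1
    results["busy_checks"] = checks


def event_wait(flag, results):
    """Block until the flag is set; the thread is not scheduled meanwhile."""
    flag.wait()
    results["event_checks"] = 1  # it only had to look once, after being woken


def demo(delay=0.2):
    flag = threading.Event()
    results = {}
    threads = [
        threading.Thread(target=busy_wait, args=(flag, results)),
        threading.Thread(target=event_wait, args=(flag, results)),
    ]
    for t in threads:
        t.start()
    time.sleep(delay)  # the "human" takes a while to hit a key
    flag.set()         # the key press arrives
    for t in threads:
        t.join()
    return results


print(demo())
```

The exact number of busy checks depends on your machine, but it will be large, and every one of them was CPU time spent learning nothing.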

So now picture a multimillion-dollar IBM 370, running VM/CMS, and picture a little program running under its own little virtual machine, written by an Apple ][ programmer. Apple ][s are bare metal. If you want to wait for something to happen, you just busy wait. So the Apple ][ programmer does the same thing on the VM/CMS machine, because he's never heard of event-driven I/O.

The next thing you need to know about IBM mainframes, which I guess is pretty obvious, is that someone has to pay for them. So IBM mainframes come with elaborate accounting systems. What these systems typically count is (1) how much time you use, and (2) how big your virtual disk is. You're charged by the second for CPU time, and a single second is not cheap. Imagine what 3600 seconds (an hour) would cost.
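To get a feel for what the accountant sees, here is a rough sketch that measures CPU time for a busy wait and for a wait that sleeps. It uses Python's `time.process_time()` as a stand-in for mainframe CPU accounting, which is my assumption and obviously not what a 370 used; the function names and the 0.2 second duration are also just for illustration.

```python
import time


def busy_wait(duration):
    """Wait by checking the clock over and over until the deadline passes."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        pass


def sleeping_wait(duration):
    """Wait by asking the operating system to wake us up later."""
    time.sleep(duration)


def cpu_seconds(wait_fn, duration):
    """Return the CPU time this process burned while running wait_fn."""
    start = time.process_time()
    wait_fn(duration)
    return time.process_time() - start


busy = cpu_seconds(busy_wait, 0.2)
asleep = cpu_seconds(sleeping_wait, 0.2)
print(f"busy wait:     {busy:.3f} CPU seconds")
print(f"sleeping wait: {asleep:.3f} CPU seconds")
```

Both waits take the same amount of wall-clock time, but only one of them runs up the bill.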

The good news in this story, for me, was that after I made a bit of a botch of the SQL/DS project, because I had no experience with mainframes, they hired a mainframe guy. The mainframe guy was the one who wrote the busy wait loop. The other good news is that the company that was selling us our little virtual machine was understanding about what happened, and didn't charge us a hundred thousand dollars for the CPU time we accidentally wasted.

Fast forward to 2006. If you have a computer, you've probably noticed that the fan doesn't always go at the same speed. This can be distracting, which is why you've probably noticed it. The deal is that it actually costs a certain amount of energy to tick the clock. When a computer is idle (and most computers are idle most of the time), you don't want to tick the clock. It wastes battery. A lot of battery. So the way computers work nowadays is that if every program is waiting for an I/O event, meaning that nobody wants a timeslice, you slow the CPU clock way down. When something happens, you speed it back up again. If the clock is going full bore, the fan is going fast. If the clock is barely ticking, the fan slows down and maybe even stops.

So why am I writing this long article about busy loops and virtual machines, which nobody probably cares about anymore? Because the programmers at Sony apparently don't know about event-driven I/O. I have a program running on my laptop, called Download Taxi (which is a cute name, albeit somewhat puzzling). It's doing a download, and instead of doing event-driven I/O, it's busy-waiting. And this reminded me of the busy-waiting VM/CMS program.

I like to think the person who made it is an old Apple ][ programmer like me. CPU time on my computer costs a lot less than on a 370, so nobody's out much money. And actually, the download just finished as I started typing this paragraph, so I've happily used up the time typing. Since typing requires so little of the computer's CPU time, it all worked out pretty well in the end.

And maybe someone will benefit in some small way from this description of the old days of computers, when we all programmed in BAL and it was uphill both ways to the computer room, and 1600bpi magtapes were too adventurous for most people, who preferred 800bpi. That's bits per inch. With a magnifying glass and some magnetic tape developer, you could actually see the bits. Wanna know how many bits per inch they're squeezing onto your 500G disk drive, sonny? But I digress.


Blogger Jym said...

=v= Well, happy birthday! You're 30 years younger than Elvis.

Friday, January 13, 2006 2:04:00 PM  
Blogger Ted Lemon said...

Yeah, well, and Elvis is living in a nursing home, using a walker, and scaring off bloodsucking demons for a living. This is supposed to comfort me?


Friday, January 13, 2006 9:15:00 PM  
Anonymous Stephen said...

Happy Birthday, Ted :-)

Saturday, January 14, 2006 6:28:00 AM  
Blogger Ted Lemon said...

Thanks, SJ! I assume it's SJ... :')

Tuesday, January 17, 2006 10:39:00 AM  
