In June 2008 I announced I was writing a book. The working title is 'Living in the Cloud'. It is a book intended to help people understand computers, the internet and other technology they use every day. I plan to have the book finished by the end of 2008, and will post excerpts as I write them for feedback and criticism. The following is one such excerpt:
Character Study
So how do all those ones and zeroes translate into words on a computer screen?
By the 1960s, the Morse Code telegraph system, developed over 100 years previously, had evolved considerably. A system of Telex machines and other teleprinting devices was in full swing, transmitting news, government information and personal messages across countries and the globe. All of these devices used the binary system in the form of paper tape punched with holes.
It wasn't Morse Code being used by these machines, however. As telegraphic transmission became more common, people had been looking for ways to automate the sending and receiving of telegraphs. This led to the development of new code lists, which included binary sequences not only for the usual A-Z, 0-9 and various punctuation marks, but also special sequences that would tell the machines reading them to start or stop sending, begin a new line, start a new page or carry out various other behaviours.
In 1963 one such new code was introduced and was quickly adopted as the standard system. Called the American Standard Code for Information Interchange, or ASCII, it assigned a character (a letter, number, symbol or instruction) to each binary number from 0-127, making a total of 128 possible characters. 95 of these were printable characters such as letters, numbers, punctuation marks and symbols, while the other 33 were non-printable instructions called control characters.
Click here to view an ASCII chart which shows the 128 different characters and their binary values.
In the binary system, each one or zero is referred to as a Binary digIT, which is in turn referred to as a bit. To cover all the numbers from 0-127, each character in the ASCII chart needed up to seven ones and zeros (7 bits) to write. At the time, perforated tape came in sections which allowed for up to 8 holes to be punched (8 bits), which meant there was always a spare bit left over at the beginning of each section of tape. This spare bit was sometimes used to detect any errors in the transmission, called a parity bit, or was sometimes simply set to 0 and ignored.
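To make the idea concrete, here is a minimal sketch in Python of how a character becomes a 7-bit pattern. The function name to_seven_bits is made up for this illustration; ord and format are standard Python built-ins.

```python
def to_seven_bits(character):
    """Return the 7-bit binary pattern for a single ASCII character."""
    code = ord(character)  # the character's number in the chart, e.g. 'A' is 65
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")  # write it in binary, padded to 7 digits


# Show each character, its number and its 7-bit pattern.
for ch in "Hi!":
    print(ch, ord(ch), to_seven_bits(ch))
# H 72 1001000
# i 105 1101001
# ! 33 0100001
```

Every pattern is exactly seven digits long, which is why an 8-hole section of tape always had one hole position to spare.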
Parity bits work by counting the number of 1's in the sequence. If there is an odd number of 1's, the parity bit is set to 1. If there is an even number of 1's, the parity bit is set to 0. Either way, the full 8 bits end up containing an even number of 1's. This is called even parity, and it makes it possible for the computer to detect an error in the sequence, because if one of the bits is written or transmitted incorrectly the count will no longer come out even.
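Here is a small sketch of even parity in Python, assuming the spare bit sits at the front of the 7 data bits as described above. The function names add_even_parity and looks_valid are invented for this example.

```python
def add_even_parity(seven_bits):
    """Put a parity bit in front so the total number of 1s is even."""
    ones = seven_bits.count("1")
    parity = "1" if ones % 2 == 1 else "0"  # odd count of 1s needs one more
    return parity + seven_bits


def looks_valid(eight_bits):
    """A received byte passes the check if it holds an even number of 1s."""
    return eight_bits.count("1") % 2 == 0


sent = add_even_parity("1000001")  # the letter A, which has two 1s
print(sent)                        # 01000001
print(looks_valid(sent))           # True
print(looks_valid("01000011"))     # False: one bit was flipped on the way
```

One limitation worth knowing: if two bits are flipped at once, the count comes out even again and the error slips through unnoticed.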
Each sequence of 8 bits read by the computer is called a byte, perhaps because that was the most the computer could 'chew' through at any one time. Science has unravelled many mysteries, but the sense of humour possessed by engineers is not one of them. In any case, this meant 1 character was also equal to 1 byte. A word with six letters and a full stop after it would take 7 bytes to write, and two five-letter words with a space between them would take 11 bytes to write. It has already taken tens of thousands of characters to produce the portion of this book you have read so far, which means tens of thousands of bytes. As you can see, they add up fairly quickly.
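The two examples above can be checked with a few lines of Python. This is only a sketch: bytes_needed is a hypothetical helper, and the sample words are my own.

```python
def bytes_needed(text):
    """Count the bytes needed to store text at one byte per ASCII character."""
    return len(text.encode("ascii"))


print(bytes_needed("coffee."))      # six letters and a full stop: 7
print(bytes_needed("hello world"))  # two five-letter words and a space: 11
```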
Thankfully we have names for all these big numbers and no doubt you're familiar with several of them already. For starters, 1024 bytes is called a kilobyte. You may be wondering, as have many people, why, if kilo means 1000, a kilobyte is 1024 bytes. It's because of the difference between binary and decimal counting. In decimal or Base 10 counting, 1000 = 10x10x10, or 10³. But in binary, which is Base 2, the first power to reach the thousand mark is 2¹⁰ or 2x2x2x2x2x2x2x2x2x2, which equals 1024. It may not sound like much, but it's an important difference, especially as the numbers get a lot bigger.
And they do. Here is a handy table of how it all works:
8 bits = 1 byte (B)
1,024 bytes = 1 kilobyte (KB)
1,024 kilobytes = 1 megabyte (MB)
1,024 megabytes = 1 gigabyte (GB)
1,024 gigabytes = 1 terabyte (TB)
1,024 terabytes = 1 petabyte (PB)
It goes on, but that should do to get you started. As for those 24s I mentioned earlier, consider this. 1 gigabyte = 1,073,741,824 bytes. That's a difference of over 73 million bytes compared to calculating using 1000 instead of 1024. Speaking of which, have you ever looked at a brand new 250 gigabyte hard drive, only to discover the computer says it has a capacity of 232.83GB, paused briefly to curse those snake oil salesmen at the computer shop and then forgotten all about it? Well perhaps not, but if you have, it's because of exactly the phenomenon we just talked about.
Manufacturers of hard drives work on multiples of 1000 bytes when calculating the capacity of their products, whereas most operating systems use the binary method of 1024, which explains the discrepancy. It's actually quite a heated debate in certain circles, but if that's what passes for entertainment over there, you can be sure they're not circles you want to be part of. Let's move on.
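If you'd like to check the sums yourself, here is a rough sketch in Python. The names BINARY_GIGABYTE, DECIMAL_GIGABYTE and reported_gigabytes are my own, and it assumes the operating system simply divides the drive's byte count by 1024 three times.

```python
BINARY_GIGABYTE = 1024 ** 3   # what most operating systems call a gigabyte
DECIMAL_GIGABYTE = 1000 ** 3  # what the label on the box calls a gigabyte

print(BINARY_GIGABYTE)                     # 1073741824
print(BINARY_GIGABYTE - DECIMAL_GIGABYTE)  # 73741824


def reported_gigabytes(advertised_gigabytes):
    """Convert the advertised size into what the computer would display."""
    total_bytes = advertised_gigabytes * DECIMAL_GIGABYTE
    return total_bytes / BINARY_GIGABYTE


print(round(reported_gigabytes(250), 2))   # 232.83
```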
ASCII is still in use today, although over time it was extended to include more possibilities such as mathematical symbols and foreign language characters. Eventually it began to be phased out in favour of Unicode, which can take up to 4 bytes to write each character, depending on the character and how it is encoded, but allows for just about every conceivable character in any language. In a rare case of visionary thinking, it was also decided that the first 128 characters in Unicode be identical to those of ASCII to prevent confusion and compatibility problems.
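As a rough illustration, the sketch below counts the bytes a few characters take in two common ways of writing Unicode down, known as UTF-8 and UTF-32. Neither name appears above, and the helper byte_lengths is invented for this example. The characters are written as escape codes so the example behaves the same wherever it is copied.

```python
def byte_lengths(character):
    """Return (bytes in UTF-8, bytes in UTF-32) for a single character."""
    utf8 = len(character.encode("utf-8"))
    utf32 = len(character.encode("utf-32-be"))  # "be" only fixes the byte order
    return utf8, utf32


samples = {
    "letter A": "A",
    "e with acute accent": "\u00e9",
    "euro sign": "\u20ac",
    "smiley face": "\U0001F600",
}

for name, character in samples.items():
    print(name, byte_lengths(character))
# letter A (1, 4)
# e with acute accent (2, 4)
# euro sign (3, 4)
# smiley face (4, 4)

# The first 128 Unicode characters keep their ASCII numbers.
print(ord("A"))                                   # 65
print("A".encode("utf-8") == "A".encode("ascii"))  # True
```

UTF-32 spends 4 bytes on everything, while UTF-8 spends a single byte on the old ASCII characters and more only when it has to.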
In the days of perforated tape, using up to four times as much effort to produce the same result was far too expensive a proposition to be considered, but that was before computer technology took a giant leap forward with the advent of the microprocessor. Essentially a complex collection of transistors and connectors built into a piece of silicon, the microprocessor delivered a way to process information much faster and more efficiently than ever before. In 1974, after a few earlier designs had already made the rounds, an electronics company based in Santa Clara, California, released a new microprocessor called the 8080 which would lay the foundation for computers as we know them today. That company was Intel, and the Santa Clara Valley around it, home to a growing number of silicon chip makers, had by then become known as Silicon Valley.
Just like the perforated tape, Intel's 8080 microprocessor was capable of handling 8 bits at a time. What made it such an amazing breakthrough was the sheer speed with which it could process these bytes of information. The tape devices at the time were handling around 9 or 10 8-bit sections of tape per second, an impressive feat in itself. But Intel's 8-bit microprocessor could run around 2 million cycles every second, thus signalling the end of the electromechanical era and throwing open the door to the digital computing revolution.
