NEWTON, Ask A Scientist!
Hard Drive Field Separation
Name: Dan P.
Status: other
Age: 40s
Location: NY
Date: N/A 


Question:
You have an excellent answer to the question of "How does a computers hard drive work...store and send data?" (see http://www.newton.dep.anl.gov/newton/askasci/1995/compsci/CSI36.HTM) but could you explain further...

What separates the magnetic field so that the hard drive knows the difference between a magnetic field that is one bit in size and a magnetic field that is 100 bits long? Are there spaces between each magnetic field? If so how can it tell the difference between a non-magnetic field that is one bit in size and a non-magnetic field that is 100 bits long?



Replies:
That is a good question. What really happens on a hard disk is a little more complicated than the simple magnetized-bit picture.

First, the read/write heads are not designed to measure the actual polarity of the magnetic fields; they sense magnetic flux reversals. So when the data are read and written, the magnetic fields are set up such that the reversals themselves carry the information.

Second, simply writing "ones" and "zeroes" would give errors for long strings of the same number. Many modulation schemes have been devised to get around this problem. For example, a simple but obsolete scheme is modified frequency modulation (MFM). If "R" is a reversal and "N" is no reversal, then in this scheme a 1 is encoded as NR, a 0 preceded by a 0 is encoded as RN, and a 0 preceded by a 1 is encoded as NN. You can see that it is impossible to have a long stretch with no reversals at all. The drawback is that this system requires some extra reversals. Other methods for efficiently using the reversals to code data are possible.
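
The MFM rules described above can be tried out in a few lines of Python. This is only a sketch of the three rules as stated in this reply, not drive firmware; the function names and the assumption that the bit before the first one is a 0 are made up for illustration.

```python
def mfm_encode(bits, prev=0):
    """Encode data bits as MFM symbols: 'R' = reversal, 'N' = no reversal.

    prev is the data bit assumed to come before the first one
    (an assumption for this sketch; it defaults to 0).
    """
    out = []
    for b in bits:
        if b == 1:
            out.append("NR")      # a 1 is always NR
        elif prev == 0:
            out.append("RN")      # a 0 preceded by a 0
        else:
            out.append("NN")      # a 0 preceded by a 1
        prev = b
    return "".join(out)


def mfm_decode(symbols):
    """Recover the data bits: the second symbol of each pair is R only for a 1."""
    return [1 if symbols[i + 1] == "R" else 0
            for i in range(0, len(symbols), 2)]


def longest_gap(symbols):
    """Length of the longest stretch of 'N' (no reversal) in the stream."""
    return max(len(run) for run in symbols.split("R"))


print(mfm_encode([1, 0, 0, 1]))            # NRNNRNNR
print(longest_gap(mfm_encode([0] * 100)))  # 1
```

Even a hundred zeroes in a row produce a reversal in every pair, so the reader never has to count a long empty stretch. The longest gap these rules can produce is three N symbols, for the data pattern 1, 0, 1.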

All hard disks make errors in reading data. To eliminate errors, extra error-correction bits are written. These bits do not contain data; rather, they contain information about the data that can be used to correct any problems encountered trying to access the real data bits. There are many types of error detection and correction, and all of them involve writing some extra information to the disk as a way of checking whether the original data are okay. As much as 25% of the information that is written (or as little as a few percent) is for error correction. "Reed-Solomon" codes are popular. These codes do not just write some extra "check" bits. In Reed-Solomon codes, an entire block of data is processed with complicated mathematics and written. When the data are read, they are reprocessed (again with complicated mathematics). It is then possible to detect and fix (with more complicated mathematics) any errors that occur.

Reed-Solomon codes are used in other types of digital communication and disks too. By using these codes, it is possible to reconstruct data that is ruined, as long as the damage is not too extensive. In a CD-ROM, for example, a scratch might destroy a string of data thousands of bits long. Yet the error correction routines can recreate all of the thousands of bits perfectly!

More information can be found on the Internet if you do a search on "hard disk" and "error correction."

Bob Erck


There is a long set of stories on that exact subject. And the encoding of a single digital bit these days is usually more complicated than a magnetic bump with an empty moat around it. If you had that, then a digital zero bit would be "no bump, with an empty moat around it", i.e., nothing, and then your simple disk-readback electronics would definitely make mistakes trying to tell the difference between 99 and 100 zero-bits in a row. The speed of the read-coil flying by the magnetic surface of the disk cannot always be held perfectly steady.

Another big disadvantage of having moats around your data bits is that the moat takes up 3 times more space than the data bit itself. Every manufacturer would prefer to invent some squiggly method of making their hard-disks hold twice as much data using the same magnetic platters and probes. Almost every paying purchaser of hard-disks prefers this, too: 2 to 4 times the storage for the same money.

"Clock recovery" is the engineering term that refers to the job of having simple electronics figure out where the bits were, and the boundaries between them. "The clock" is a hypothetical square-wave which has one cycle, one positive pulse, for every digital bit whether it's one or zero. Every serial receiver must generate a clock signal, correctly synchronized with the data bits, so all its subsequent computer processing can effortlessly know where the bits are and not make mistakes.

Reading a hard-disk is not too different from receiving Morse code by radio, or listening to a computer-modem over a telephone line. In all these cases of serial-stream communication, once you "read the wave", you have a plot of semi-analog voltage vs time. No "clock channel" signal comes in parallel with it; that would double the number of wires or frequencies or space needed. The semi-analog signal consists solely of highs (say +0.9 V), lows (say -0.9 V), transitions (+0.9 -> -0.9 or vice versa), and some superimposed noise (say 0.1 V rms). Transitions are usually somewhat rounded, but they must start and finish in less than, say, half the width of a bit.

The only brisk way to mark a spot in the analog signal time-line is to have a transition there, and repeat it only at certain regular intervals. Our electronics can easily adjust the frequency of a clock oscillator to synchronize with these transitions. Once there is synchronization, every time we see a transition we know where the next 5 or 10 bits will be. [note, below] So we can afford to skip as many of the next 5 transitions as we like.

Now we invent a simple encoding:

Every time we include the transition, this is a digital bit zero.
Every time we skip a transition, this is a digital bit one.

With this system we have the problem that we cannot ever send digital data with more than 5 ones in a row. We cannot have that; people need to be able to send a million ones in a row, if that is what their data has in it. But it is not a big deal to fix. You can have 4 bits of arbitrary data followed by a forced zero (a transition) as every fifth bit. Simple electronics can easily take out this 5th bit and pass on the original data.
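
Here is a minimal sketch of that fix in Python, using the convention above that a transition reads as 0 and a skipped transition reads as 1. The function names and the list-of-bits representation are just for illustration; real hardware does this in simple logic, not software.

```python
def stuff(bits, group=4):
    """After every `group` data bits, insert a forced transition (a 0)."""
    out = []
    for count, b in enumerate(bits, start=1):
        out.append(b)
        if count % group == 0:
            out.append(0)  # the extra fifth bit: always a transition
    return out


def unstuff(coded, group=4):
    """Throw away every (group + 1)-th bit to get the original data back."""
    return [b for i, b in enumerate(coded) if (i + 1) % (group + 1) != 0]


def longest_run_of_ones(bits):
    """Longest stretch of skipped transitions in the written stream."""
    best = run = 0
    for b in bits:
        run = run + 1 if b == 1 else 0
        best = max(best, run)
    return best


data = [1] * 1000                  # a long string of ones
written = stuff(data)
print(longest_run_of_ones(written))  # 4
print(unstuff(written) == data)      # True
```

However many ones the user sends, the written stream never skips more than 4 transitions in a row, at the cost of one extra bit for every four bits of data.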

For hard-disk drives they have invented a look-up table which "wastes" even less space than the 20% fraction in the example above. On the input side of the table is a list of n-bit codes with all the "forbidden values" omitted. On the output are simple counting index-numbers for those values. There probably needs to be a power-of-two number of those, so that bunches of output bits can be jammed together to make the final, reconstituted data stream.
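
A toy version of such a table can be built in Python. Everything specific here is an assumption chosen for illustration, not what any real drive uses: 5-bit codes, at most 1 leading and 2 trailing skipped transitions per code (so no more than 3 skips in a row even across code boundaries), and codes and data handled as strings of "0" and "1" with 1 meaning a skipped transition, as above.

```python
def leading_ones(code):
    return len(code) - len(code.lstrip("1"))


def trailing_ones(code):
    return len(code) - len(code.rstrip("1"))


def longest_run_of_ones(code):
    return max(len(run) for run in code.split("0"))


def build_table(n=5, max_lead=1, max_trail=2):
    """List the n-bit codes that are not forbidden, cut to a power of two."""
    limit = max_lead + max_trail
    allowed = [c for c in (format(v, f"0{n}b") for v in range(2 ** n))
               if leading_ones(c) <= max_lead
               and trailing_ones(c) <= max_trail
               and longest_run_of_ones(c) <= limit]
    k = len(allowed).bit_length() - 1   # data bits carried by each code
    return allowed[:2 ** k], k


def encode(data, table, k):
    """Replace each group of k data bits by the code at that index."""
    return "".join(table[int(data[i:i + k], 2)]
                   for i in range(0, len(data), k))


def decode(coded, table, k):
    """Look each code up in the table and emit its index as k data bits."""
    n = len(table[0])
    index = {code: i for i, code in enumerate(table)}
    return "".join(format(index[coded[i:i + n]], f"0{k}b")
                   for i in range(0, len(coded), n))


table, k = build_table()
data = "1111" * 50 + "00001010"
coded = encode(data, table, k)
print(len(table), k)                 # 16 4
print(longest_run_of_ones(coded))
print(decode(coded, table, k) == data)  # True
```

With these toy settings, 21 of the 32 possible 5-bit codes survive, and 16 of them are kept, so each code carries 4 data bits. That is the same 20% overhead as the simple example, so the sketch only shows the mechanism; the savings come from choosing longer codes and a constraint matched to the electronics.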

So you can see there is basically no space between the bits in one row on a hard drive, and that the user's precious 1's and 0's are never exactly what is really written on the media. If you scattered (hypothetical) iron dust on a disk and took a photo, you'd initially have a hard time seeing where you wrote your name, even if you knew your ASCII codes for the alphabet, and were looking at the right spot.

The adjacent rows are pretty crammed together, too. Much of the noise in the semi-analog wave form, mentioned above, consists of fringe fields from adjacent rows. Perhaps some disk-drives have built-in intervals where they position their magnetic probes to read the rows. But some actually start reading while moving the head between the rows, and when the check-values in the data read correctly, that is where the row is, and where the probe is held. Another option, which uses simpler hardware, is having one little ray of staggered rows, consuming a narrow pie-slice of the disk from center to rim. If the probe reads the value of the left-staggered row, it is moved a little bit rightwards, and vice versa. Equilibrium happens when it reads "left" as often as "right", on a short-term average. This way the rows of data (tracks) require no significant spacing between them.

Placing these track-markers, and then filling the rest of the disk with transitions at nice, regular intervals, is what you know as "formatting" a disk. The long, slow kind of formatting.

I am sure there are more complications I have not mentioned here. Many of them I know nothing about. But this gives you the flavor of the game. Using roughly this much knowledge a hobbyist might be able to re-invent a decent disk-writing scheme for their own use, and run it on old hardware. It might be only 30% less dense than what the pros use.

Jim Swenson




NEWTON is an electronic community for Science, Math, and Computer Science K-12 Educators, sponsored and operated by Argonne National Laboratory's Educational Programs, Andrew Skipor, Ph.D., Head of Educational Programs.

For assistance with NEWTON contact a System Operator (help@newton.dep.anl.gov), or at Argonne's Educational Programs

NEWTON AND ASK A SCIENTIST
Educational Programs
Building 360
9700 S. Cass Ave.
Argonne, Illinois
60439-4845, USA
Update: June 2012