Hard Drive Field Separation
Name: Dan P.
You have an excellent answer to the question of "How does
a computer's hard drive work...store and send data?" (see your
earlier answer). Could you explain further...
What separates the magnetic field so that the hard drive knows the
difference between a magnetic field that is one bit in size and a magnetic
field that is 100 bits long? Are there spaces between each magnetic
field? If so, how can it tell the difference between a non-magnetic field
that is one bit in size and one that is 100 bits long?
That is a good question. What really happens on a hard disk is a little
more complicated than the simple magnetized-bit picture.
First, the read/write heads are not designed to measure the actual polarity
of the magnetic fields; instead, they sense magnetic flux reversals. So when
the data are written and read, the magnetic fields are set up such that the
reversals themselves carry the information.
Second, simply writing "ones" and "zeroes" would give errors for long
strings of the same number. Many modulation schemes have been devised to
get around this problem. For example, a simple but obsolete scheme is
modified frequency modulation (MFM). If "R" is a reversal and "N" is no
reversal, then in this scheme a 1 is encoded as NR, a 0 preceded by a 0 is
encoded as RN, and a 0 preceded by a 1 is encoded as NN. You can see that
this makes it impossible to have a long stretch with no reversals. The
drawback is that this system requires some extra reversals. Other methods
for efficiently using the reversals to encode data are possible.
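The MFM rules just described are simple enough to write out. A minimal sketch in Python (assuming, arbitrarily, that the stream starts as if after a 0):

```python
def mfm_encode(bits):
    """Encode a string of '0'/'1' data bits into R/N flux symbols."""
    out = []
    prev = '0'                    # assume the stream starts after a 0
    for b in bits:
        if b == '1':
            out.append('NR')      # a 1 is always NR
        elif prev == '0':
            out.append('RN')      # a 0 preceded by a 0 is RN
        else:
            out.append('NN')      # a 0 preceded by a 1 is NN
        prev = b
    return ''.join(out)

print(mfm_encode('0000'))         # RNRNRNRN -- reversals keep coming
print(mfm_encode('1111'))         # NRNRNRNR -- even for all-ones data
```

However long the run of identical data bits, a reversal appears at least every few symbol positions, so the read electronics never lose count.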
All hard disks make errors in reading data. To eliminate errors, extra
error-correction bits are written. These bits do not contain data; rather,
they contain information about the data that can be used to correct any
problems encountered when trying to access the real data bits. There are many
types of error detection and correction, and all of them involve writing some
extra information to the disk as a way of checking whether the original data
are okay. As much as 25% of the information that is written (or as little as
a few percent) is for error correction. "Reed-Solomon" codes are popular.
These codes do not just write some extra "check" bits. In Reed-Solomon
codes, an entire block of data is processed with complicated mathematics and
written. When the data are read, they are reprocessed (again with
complicated mathematics). It is then possible to detect and fix (with more
complicated mathematics) any errors that occur.
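Reed-Solomon mathematics is beyond a short example, so as a much simpler stand-in here is a Hamming(7,4) code: 4 data bits plus 3 check bits, enough to locate and repair any single flipped bit. (Hamming codes are not what hard disks actually use; this only illustrates the check-bit idea.)

```python
def hamming74_encode(d):
    """d: list of 4 data bits -> list of 7 code bits (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4             # check bit covering positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4             # check bit covering positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4             # check bit covering positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """c: list of 7 received bits -> corrected list of 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 0 = no error, else the bad position
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
word = hamming74_encode(data)
word[5] ^= 1                          # simulate one corrupted bit
print(hamming74_decode(word) == data) # True: the error is found and fixed
```

Reed-Solomon works on whole symbols rather than single bits, and can fix many errors per block, but the shape of the trade is the same: extra written information buys the ability to reconstruct damaged data.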
Reed-Solomon codes are used in other types of digital communication and
disks too. By using these codes, it is possible to reconstruct data that is
ruined, as long as the damage is not too extensive. On a CD-ROM, for example,
a scratch might destroy a string of data thousands of bits long. Yet the
error-correction routines can recreate all of those thousands of bits.
More information can be found on the Internet if you do a search on "hard
disk" and "error correction."
There is a long set of stories on that exact subject.
And the encoding of a single digital bit these days is usually more
complicated than a magnetic bump with an empty moat around it.
If you had that, then a digital zero bit would be "no bump, with an empty
moat around it", i.e., nothing,
and then your simple disk-readback electronics would definitely make
mistakes trying to tell the difference between 99 and 100 zero-bits in a row.
The speed of the read-coil flying by the magnetic surface of the disk
cannot always be held perfectly steady.
Another big disadvantage of having moats around your data bits is that they
take up three times more space than the data bits themselves.
Every manufacturer would prefer to invent some squiggly method of making
their hard-disks hold twice as much data using the same magnetic platters.
Almost every paying purchaser of hard-disks prefers this, too: 2 to 4
times the storage for the same money.
"Clock recovery" is the engineering term that refers to the job of having
simple electronics figure out where the bits were, and the boundaries
"The clock" is a hypothetical square-wave which has one cycle, one
positive pulse, for every digital bit whether it's one or zero.
Every serial receiver must generate a clock signal, correctly synchronized
with the data bits,
so that all its subsequent computer processing can effortlessly know where
the bits are and not make mistakes.
Reading a hard-disk is not too different than receiving Morse code by
radio, or listening to a computer-modem over a telephone line.
In all these cases of serial-stream communication, once you "read the
wave", you have a plot of semi-analog voltage vs time.
No "clock channel" signal comes in parallel with it, that would double the
number of wires or frequencies or space needed.
The semi-analog signal consists solely of highs (say +0.9 V), lows (say
-0.9 V), transitions (+0.9 to -0.9, or vice versa), and some superimposed
noise. Transitions are usually somewhat rounded, but they must start and
finish in less than, say, half the width of a bit.
The only brisk way to mark a spot in the analog signal time-line is to
have a transition there, and repeat it only at certain regular intervals.
Our electronics can easily adjust the frequency of a clock oscillator to
synchronize with these transitions.
Once there is synchronization, every time we see a transition we know
where the next 5 or 10 bits will be. [note, below]
So we can afford to skip as many of the next 5 transitions as we like.
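The gap between one transition and the next is what the electronics actually measure. A toy sketch of the idea (the adaptive clock oscillator itself is omitted; here the bit period is assumed known, and each gap is simply rounded to a whole number of bit cells):

```python
def gaps_to_cells(times, bit_period):
    """Round each transition-to-transition gap to a count of bit cells."""
    return [round((b - a) / bit_period)
            for a, b in zip(times, times[1:])]

# transitions nominally 1.0 time-unit apart, with some timing wobble
times = [0.00, 1.03, 2.98, 6.05, 7.02]
print(gaps_to_cells(times, bit_period=1.0))   # [1, 2, 3, 1]
```

As long as the wobble stays under half a bit cell per gap, the rounding absorbs it, which is why the disk's speed need not be held perfectly steady.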
Now we invent a simple encoding:
Every time we include the transition, this is a digital bit zero.
Every time we skip a transition, this is a digital bit one.
With this system we have the problem that we cannot ever send digital data
with more than 5 ones in a row.
We cannot have that; people need to be able to send a million ones in a
row, if that is what their data has in it.
But it is not a big deal to fix. You can have 4 bits of arbitrary data
followed by a forced zero, a transition, as every fifth bit.
Simple electronics can easily take out this 5th bit and pass on the 4 data bits.
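That fix can be sketched as a toy bit-stuffer (the 4-data-bits-then-one-forced-transition layout is the example from the text; recall that a transition encodes a zero):

```python
def stuff_encode(bits):
    """Insert a forced 0 (a guaranteed transition) as every 5th bit."""
    out = []
    for i, b in enumerate(bits):
        out.append(b)
        if i % 4 == 3:            # after every 4 data bits...
            out.append('0')       # ...force a transition
    return ''.join(out)

def stuff_decode(channel):
    """Drop every 5th bit; keep the 4 data bits in between."""
    return ''.join(b for i, b in enumerate(channel) if i % 5 != 4)

data = '1' * 12                   # even a long run of ones is fine now
coded = stuff_encode(data)
print(coded)                      # 111101111011110
assert stuff_decode(coded) == data
```

No matter what the user writes, the channel never carries more than 4 ones (missed transitions) in a row, at a cost of 1 extra channel bit in every 5.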
For hard-disk drives, they have invented look-up tables which "waste"
even less space than the 20% fraction in the example above.
On the input side of the table is a list of n-bit codes with all the
"forbidden" values omitted.
On the output are simple counting index-numbers for those values.
There probably needs to be a power-of-two number of those, so that bunches
of output bits can be jammed together to make the final, reconstituted data stream.
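A hedged sketch of that table-building idea, with made-up parameters (7-bit channel codes, at most 4 missed transitions in a row); a real drive's code differs and must also guard runs that cross codeword boundaries:

```python
from itertools import product

N = 7              # channel codeword length (illustrative)
RUN_LIMIT = 4      # at most 4 ones (missed transitions) in a row

# enumerate every N-bit code, omitting the forbidden values
codes = [c for c in (''.join(p) for p in product('01', repeat=N))
         if '1' * (RUN_LIMIT + 1) not in c]

# keep a power-of-two number of them, so whole groups of data bits
# can be mapped straight onto index numbers
usable = 1 << (len(codes).bit_length() - 1)
table = {i: codes[i] for i in range(usable)}

print(len(codes), usable)   # 120 64
```

With these numbers, 64 = 2**6 surviving codes means each 7-bit codeword carries 6 data bits: about 14% of the channel bits are overhead, instead of the 20% of the one-in-five scheme.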
So you can see there is basically no space between the bits in one row on
a hard drive.
And that the user's precious 1's and 0's are never exactly what is really
written on the media.
If you scattered (hypothetical) iron dust on a disk and took a photo,
you'd initially have a hard time seeing where you wrote your name,
even if you knew your ASCII codes for the alphabet, and were looking at
the right spot.
The adjacent rows are pretty crammed together, too.
Much of the noise in the semi-analog wave form, mentioned above, consists
of fringe fields from adjacent rows.
Perhaps some disk-drives have built-in intervals where they position their
magnetic probes to read the rows.
But some actually start reading while moving the head between the rows,
and when the check-values in the data read correctly,
that is where the row is, and where the probe is held.
Another option, which uses simpler hardware, is having one little ray of
track-markers consuming a narrow pie-slice of the disk from center to rim.
If the probe reads the value of the left-staggered row, it is moved a
little bit rightwards, and vice versa.
Equilibrium happens when it reads "left" as often as "right", on average.
This way the rows of data (tracks) require no significant spacing between them.
Placing these track-markers, and then filling the rest of the disk with
transitions at nice, regular intervals,
is what you know as "formatting" a disk. The long, slow kind of formatting.
I am sure there are more complications I have not mentioned here.
Many of them I know nothing about.
But this gives you the flavor of the game.
Using roughly this much knowledge a hobbyist might be able to re-invent a
decent disk-writing scheme for his own use,
and run it on old hardware.
It might be only 30% less dense than what the pros use.
Update: June 2012