Sectors and Clusters

A sector is the smallest physical storage unit on the disk and is almost always 512 bytes in size. The size is a power of 2 (2 to the power of 9) because computers work in binary, which has only two states: on and off.

Each disk sector is labelled using the factory track-positioning data. Sector identification data is written to the area immediately before the contents of the sector and identifies the starting address of the sector.

The optimal method of storing a file on a disk is in a contiguous series, i.e. with all of its data stored end-to-end in a single run. As many files are larger than 512 bytes, it is up to the file system to allocate sectors to store the file’s data. For example, if the file size is 800 bytes, two 512-byte sectors are allocated for the file. In this example a cluster is the same size as a sector, so these two sectors holding 800 bytes of data are called two clusters.

They are called clusters because the space is reserved for the data contents. This process protects the stored data from being over-written. Later, if data is appended to the file and its size grows to 1600 bytes, another two clusters are allocated, storing the entire file within four clusters.
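The allocation arithmetic in this example can be sketched in a few lines of Python. This is only an illustration, assuming one cluster equals one 512-byte sector as in the text; the function name clusters_needed is made up for this sketch and does not belong to any real file system.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def clusters_needed(file_size, cluster_size=SECTOR_SIZE):
    """Return the number of whole clusters needed to hold file_size bytes."""
    if file_size <= 0:
        return 0  # assumption for this sketch: an empty file uses no data clusters
    # Round up: a partly filled cluster is still fully reserved for the file.
    return (file_size + cluster_size - 1) // cluster_size


print(clusters_needed(800))   # 2 clusters for the 800-byte file
print(clusters_needed(1600))  # 4 clusters once the file grows to 1600 bytes
```

The rounding up is the important part: the last cluster is reserved in full even when the file only fills part of it.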

If contiguous clusters (clusters that are adjacent to each other on the disk) are not available, the second two clusters may be written elsewhere on the disk, either within the same cylinder or on a different cylinder – wherever the file system finds two clusters available. A file stored in this non-contiguous manner is considered to be fragmented. Fragmentation can slow down system performance if the file system must direct the drive heads to several different addresses to find all the data in the file you want to read. The extra time the heads need to travel to each of these addresses causes a delay before the entire file is retrieved.
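One way to picture fragmentation is to count how many separate contiguous runs a file occupies. The sketch below assumes a file is described by an ordered list of cluster numbers; both the representation and the cluster numbers are hypothetical and chosen only for illustration.

```python
def count_fragments(clusters):
    """Count the contiguous runs in a file's ordered list of cluster numbers."""
    if not clusters:
        return 0
    fragments = 1
    for previous, current in zip(clusters, clusters[1:]):
        # A new fragment starts wherever the next cluster is not adjacent.
        if current != previous + 1:
            fragments += 1
    return fragments


contiguous_file = [100, 101, 102, 103]   # four adjacent clusters
fragmented_file = [100, 101, 250, 251]   # second pair stored elsewhere

print(count_fragments(contiguous_file))  # 1
print(count_fragments(fragmented_file))  # 2
```

Each additional fragment means at least one more place the heads have to be sent before the whole file has been read.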

Cluster size can be changed to optimize file storage. A larger cluster size reduces the potential for fragmentation, but increases the likelihood that clusters will have unused space. Using clusters larger than one sector also reduces the amount of disk space needed to store the information about the used and unused areas on the disk, because there are fewer clusters to keep track of.
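The trade-off can be illustrated with a rough calculation of how many clusters a file needs and how much of the last cluster is left unused. The sketch assumes 512-byte sectors and reuses the 1600-byte file from the earlier example; the function name and the cluster sizes tried are assumptions made for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector


def allocation(file_size, sectors_per_cluster):
    """Return (clusters used, unused bytes) for a file of file_size bytes."""
    cluster_size = SECTOR_SIZE * sectors_per_cluster
    clusters = (file_size + cluster_size - 1) // cluster_size  # round up
    unused = clusters * cluster_size - file_size
    return clusters, unused


for sectors_per_cluster in (1, 4, 8):
    clusters, unused = allocation(1600, sectors_per_cluster)
    print(f"{sectors_per_cluster} sector(s) per cluster: "
          f"{clusters} cluster(s), {unused} bytes unused")
```

With one sector per cluster the file occupies four clusters that could end up scattered; with eight sectors per cluster it fits in a single cluster but leaves far more space unused.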

Most disks used in personal computers today rotate at a constant angular velocity. The tracks near the outside of the disk are less densely populated with data than the tracks near the center of the disk. Thus, a fixed amount of data can be read in a constant period of time, even though the disk surface moves faster on the tracks located farther from the center of the disk.
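The difference in surface speed follows from the relation speed = 2 x pi x radius x revolutions per second. The sketch below works this out for an inner and an outer track; the rotation rate and both radii are assumed values picked for illustration, not figures from any particular drive.

```python
import math


def surface_speed(radius_m, rpm):
    """Linear speed in m/s of the disk surface passing under the head."""
    revolutions_per_second = rpm / 60.0
    return 2 * math.pi * radius_m * revolutions_per_second


RPM = 7200               # assumed rotation rate
INNER_RADIUS_M = 0.020   # assumed innermost track radius (20 mm)
OUTER_RADIUS_M = 0.045   # assumed outermost track radius (45 mm)

inner = surface_speed(INNER_RADIUS_M, RPM)
outer = surface_speed(OUTER_RADIUS_M, RPM)

print(f"inner track: {inner:.1f} m/s")
print(f"outer track: {outer:.1f} m/s")
print(f"outer/inner ratio: {outer / inner:.2f}")
```

Because the angular velocity is the same everywhere, the speed ratio is simply the ratio of the radii, which is why the outer tracks must hold data less densely if every track is to deliver the same amount of data per revolution.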

Modern disks reserve one side of one platter for track positioning information, which is written to the disk at the factory during disk assembly. It is not available to the operating system. The disk controller uses this information to fine-tune the head locations when the heads move to another location on the disk. When a side contains the track positioning information, that side cannot be used for data. Thus, a disk assembly containing two platters has three sides that are available for data.

Hard disk interfaces
Hard disks also come in several flavors such as IDE (actually ATA), SCSI and SATA, as do optical drives. ATA is the most common interface used today. SCSI disks can usually be found on servers.

Integrated Drive Electronics, more commonly called by its acronym IDE, is an interface for hard drives. IDE is a marketing term; the real standard is called ATA. EIDE (Enhanced IDE), or ATA-2, was developed later; it increased transfer speed and added 32-bit transactions and DMA support.

ATA stands for Advanced Technology Attachment. The term ATA is commonly used interchangeably with IDE. The older and more common parallel ATA (P-ATA) is currently being replaced by serial ATA (SATA).

Most PCs have two IDE controllers on the motherboard. One IDE controller can support two devices, so four storage devices is usually the maximum. The parallel ATA interface uses ribbon cables with 40-pin connectors to connect the hard drives to the motherboard. The cable usually has three connectors. One of these is connected to the motherboard and the other two are left for hard drives. If two hard drives are connected to the same controller, one must be defined as master and the other as slave. This is done with jumpers.

ATA-2 is the real standard for what is widely known as EIDE. ATA-2 introduced higher-speed data transfer modes: PIO Modes 3 and 4 plus Multiword DMA Modes 1 and 2. These modes allow the ATA interface to run data transfers at up to about 16 MB/s.

Serial ATA, also known as SATA or S-ATA, is a bus used to communicate between the CPU and internal storage devices such as hard drives and optical drives. It is designed to eventually replace the ATA (also known as IDE) bus. Traditional ATA is beginning to be referred to as Parallel ATA, P-ATA, or PATA to avoid confusion.

The main difference between SATA and PATA is in the cabling. SATA sends data serially rather than in parallel (hence the difference in names), and it does away with both the master/slave relationship of PATA and PATA’s ungainly ribbon cables. Instead, SATA has much slimmer and easier-to-manage cables, which enable better airflow through cases. The connectors are keyed, which prevents them from being plugged in upside down. Truly native SATA drives will also have different power connectors.

A third advantage of SATA is hot-plugging, i.e. connecting or removing a drive while the system is running.

Currently, SATA has a transfer rate of 150 MB/s, which is only 17 MB/s more than the 133 MB/s of standard PATA. However, with the introduction of SATA II, this is expected to go up to 300 MB/s, with 600 MB/s being released sometime around 2007. The faster bus isn’t expected to affect performance in the short term, since hard drive performance is usually bottlenecked by the moving parts of the drive.

During the transitional period before true native SATA drives are released, most SATA drives actually have onboard PATA controllers, which connect to SATA through a bridge. This generally causes a 30-50% performance drop. Also, PATA power connectors are still being used.
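To put the bus rates quoted above into perspective, the sketch below estimates how long a transfer would take if the bus were the only limit. The 700 MB file size is an assumed example, and the results are best-case figures only, since the drive mechanics are usually the real bottleneck.

```python
# Peak bus rates in MB/s, taken from the figures quoted in the text.
BUS_RATES_MB_S = {
    "PATA": 133,
    "SATA": 150,
    "SATA II": 300,
    "SATA 600 MB/s": 600,
}


def transfer_seconds(size_mb, rate_mb_s):
    """Best-case transfer time in seconds when only the bus limits throughput."""
    return size_mb / rate_mb_s


FILE_SIZE_MB = 700  # assumed example size, roughly one CD image

for name, rate in BUS_RATES_MB_S.items():
    seconds = transfer_seconds(FILE_SIZE_MB, rate)
    print(f"{name}: {seconds:.2f} s")
```

The gap between PATA and first-generation SATA comes out at well under a second for a file of this size, which matches the point that the faster bus makes little practical difference at first.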

DMA (Direct Memory Access) is a function of the memory bus in the computer that lets connected devices like hard disks transfer data to the memory without the intervention of the CPU, thus speeding up the transfer. This is superior to PIO (Programmed I/O), where the CPU itself has to move the data.

There are two distinct types of direct memory access, DMA and bus mastering DMA. The plain DMA relies on the DMA controller on the motherboard to grab the system bus and transfer the data. In bus mastering DMA all this is done by the logic on the interface card itself. Bus mastering allows the hard disk and memory to work without relying on the old DMA controller built into the system, or needing any support from the CPU.

USB (Universal Serial Bus) is a hardware bus that uses a serial protocol. It is used by many different hardware devices and is supported by most computers and mainboards. It was originally developed by a group of companies including Compaq, Intel, NEC and Microsoft. It allows many devices to be connected to the bus at the same time; the theoretical maximum is 127 devices. The maximum data transfer bandwidth of USB 1.1 is 12 Mbit/s, while USB 2.0 supports 480 Mbit/s.

FireWire is a lesser-known alternative to USB that, in its time, was better than USB for media-related tasks. USB 2.0 has since brought significant improvements, most notably more bandwidth.

SCSI stands for Small Computer System Interface and is pronounced “scuzzy”. It is a specification for a hardware interface for connecting devices such as hard disks and scanners to a computer.

Most PCs have an ATA (IDE) bus instead of SCSI for connecting internal hard disks. SCSI is seen more often in servers, as it tends to be faster and more reliable (though more expensive). Another advantage of a SCSI controller is that it requires only one IRQ and can usually handle at least 7 devices, whereas an ATA controller can handle only 2.

Typically, you put a SCSI card in your computer, and then connect internal hard disks with a ribbon cable to some connector on the card. Also, the card will have an external connector which you might also be using simultaneously.
