Backup Encyclopedia

       
 

 
   
Data  
  Data can be defined as a graphical or textual representation of facts (organised into various data structures), concepts, numbers, letters, symbols, or instructions used for communication or processing into information.
  Side note: When you push the "Save" button and write your data to the hard disk drive, you expect it to be returned correctly when you open the file at a later date.  The actual specification for this expectation of data integrity is the unrecoverable read error rate.  This is typically in the range of one (1) bit in error for every 10^13 to 10^15 bits read.  Every part and function of the hard disk drive is essential for achieving this level of data integrity.
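To put the quoted error rate into perspective, the following minimal Python sketch converts it into the average amount of data read per unrecoverable bit error. The function names are illustrative only, and the calculation assumes that errors are spread evenly over the bits read, which real drives need not obey.

```python
def bytes_read_per_error(bits_per_error: float) -> float:
    """Average number of bytes read for each unrecoverable bit error."""
    return bits_per_error / 8


def expected_errors(bytes_read: float, bits_per_error: float) -> float:
    """Expected number of unrecoverable bit errors when reading bytes_read bytes."""
    return bytes_read * 8 / bits_per_error


if __name__ == "__main__":
    for exponent in (13, 14, 15):
        rate = 10.0 ** exponent
        terabytes = bytes_read_per_error(rate) / 1e12
        print(f"1 error per 10^{exponent} bits: about {terabytes:.2f} TB read per error")
```

Under these assumptions, a rate of one error per 10^13 bits corresponds to roughly 1.25 TB read per error, and one per 10^15 bits to roughly 125 TB.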
  Files, whether they represent text, a database, photographs, songs, movies, web pages, executable programs, or anything else, are stored as a series of sectors.  A sector is a physical location on the hard disk drive that is designated to store (most commonly) 512 bytes of user data.  Because of the encoding overhead and the requirements of the detection algorithms, about 600 bytes are actually stored in a sector.
  Sectors have traditionally been uniquely identified in a hard disk drive by cylinder, head, and sector (CHS) co-ordinates.  The "head" number indicates on which surface the sector is located.  The "cylinder" number identifies the specific concentric track on that surface where the sector can be found.  And the "sector" number indicates which of the hundreds of sectors on the track contains the data that is sought.
  The question to ask is "How does the hard disk drive know where your file is?"  The truth of the matter is that it does not.  That is the job of the operating system.  The operating system keeps track of which logical blocks on which hard disk drive contain a file.  For convenience, consider a logical block to be a data sector, although each block could also point to several consecutive sectors.  The operating system will request a logical block from the hard disk drive, for example block number 1,635,324.  The hard disk drive must map this logical block location into a physical block (CHS) location, for example cylinder 5,000 on head 1 at sector 452.  There are fast algorithms for computing this; however, the interesting complication arises when the usual physical location for a logical block has a defect that prevents it from reliably storing data.
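As an illustration of the logical-to-physical mapping, here is a minimal Python sketch of the classic fixed-geometry conversion between a logical block address and CHS co-ordinates. The geometry used (2 heads, 500 sectors per track) is a hypothetical assumption; it ignores zoning and defect mapping, so it will not reproduce the cylinder, head, and sector figures quoted above or the translation of any real drive.

```python
from typing import NamedTuple


class CHS(NamedTuple):
    cylinder: int
    head: int
    sector: int  # sectors are conventionally numbered from 1


def lba_to_chs(lba: int, heads: int, sectors_per_track: int) -> CHS:
    """Convert a logical block address to CHS for a fixed, defect-free geometry."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return CHS(cylinder, head, sector_index + 1)


def chs_to_lba(chs: CHS, heads: int, sectors_per_track: int) -> int:
    """Inverse of lba_to_chs for the same geometry."""
    return (chs.cylinder * heads + chs.head) * sectors_per_track + (chs.sector - 1)


if __name__ == "__main__":
    HEADS, SECTORS_PER_TRACK = 2, 500  # hypothetical geometry
    location = lba_to_chs(1_635_324, HEADS, SECTORS_PER_TRACK)
    print(location)
    print(chs_to_lba(location, HEADS, SECTORS_PER_TRACK))
```

The round trip through both functions returns the original block number, which is a convenient sanity check for any translation scheme.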
  Such locations are found and mapped out during the manufacturing process.  There are also provisions for doing this check and re-mapping when the hard disk drive is in use in the field.  The hard disk drive has many spare sectors and even spare tracks to be used as replacements for defective sectors.  This is transparent to the operating system under normal operation.  The hard disk drive accepts the logical block address and performs the logical-to-physical translation itself.  This varies from hard disk drive-to-hard disk drive, reflecting the mapping-out of defects found during the hard disk drive's surface scan self-test.
  In the field, i.e., in a computer, the hard disk drive may acquire additional defects due to corrosion, handling, or other causes.  These are typically identified in a table of exceptions (sometimes called the P-list and the G-list, for primary defects and grown defects, respectively).  This table, the table of parameters, and the firmware are typically stored on the hard disk drive itself in the outermost tracks.  These tracks are referred to as the system area, maintenance tracks, diskware, negative cylinders, etc.  However, some hard disk drive models store the table in non-volatile memory on the printed circuit board.  Clearly this table of exceptions is uniquely linked to the media in a particular hard disk drive.  The table for one hard disk drive will not, in general, be the same for the media from another one.
  Up until the 1980s, hard disk drives typically had the same number of sectors on each track.  However, because the tracks are concentric, the circumference of a track at the outer radius of a platter (called the OD, for outer diameter) is much larger than the circumference of a track at the ID (inner diameter).  With a fixed number of sectors per track, the linear bit density (BPI, bits per inch, which defines how many bits can be written onto one linear inch of a track) therefore varies with radius: it is highest on the innermost track and falls towards the outermost tracks, so the longer outer tracks hold less data than they have the potential to store.  Zoned bit recording packs more data onto the outer tracks to exploit their length.  Nevertheless, it is still the case that the outermost tracks contain somewhat less data than they could potentially store.
  To maximize the amount of data that can be stored, each hard disk drive’s surface is divided into groups of adjacent tracks called zones.  There are 8 to 32 (or more) zones per surface.  From the ID to the OD, each zone is written with a higher frequency to counteract the bit spacing growth caused by the higher linear velocities at the larger radii.  The BPI still drops slightly across each zone.  While zoning makes better use of the storage capacity of the hard disk drive, it also means that many unique optimisation settings must be determined for each surface during manufacturing.
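The effect of zoning can be sketched numerically. The Python example below builds a hypothetical zone table in which every track of a zone holds as many sectors as the zone's innermost track can carry at a chosen maximum BPI. The radii, the BPI value, and the function names are assumptions made for illustration; only the figure of about 600 stored bytes per sector comes from the text above.

```python
import math

STORED_BITS_PER_SECTOR = 600 * 8  # about 600 bytes are stored per 512-byte user sector


def sectors_per_track(radius_in: float, max_bpi: float) -> int:
    """Whole sectors that fit on a track of the given radius at the given linear density."""
    track_bits = 2 * math.pi * radius_in * max_bpi
    return int(track_bits // STORED_BITS_PER_SECTOR)


def zone_table(inner_radius: float, outer_radius: float, zones: int, max_bpi: float):
    """Return (zone, innermost radius, sectors per track) for equally wide zones.

    The innermost track of each zone limits the whole zone, because that is
    where the bits are packed most tightly.
    """
    width = (outer_radius - inner_radius) / zones
    table = []
    for zone in range(zones):
        zone_inner = inner_radius + zone * width
        table.append((zone, zone_inner, sectors_per_track(zone_inner, max_bpi)))
    return table


if __name__ == "__main__":
    # Hypothetical surface: 0.8 in to 1.8 in radius, 16 zones, 500,000 bits per inch.
    for zone, radius, sectors in zone_table(0.8, 1.8, 16, 500_000):
        print(f"zone {zone:2d}: inner radius {radius:.3f} in, {sectors} sectors per track")
```

Without zoning, every track would be limited to the sector count of the innermost zone, which is why the zoned layout stores more in total.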
  The user's file is likely to be stored across many sectors.  These sectors may be spread across different tracks in different zones and even across different hard disk drive surfaces.  Furthermore, the same logical blocks may be mapped into different physical sectors on two hard disk drives depending on the unique distribution of defects on each hard disk drive.
  A track of data may be less than 10 micro-inches in width.  The hard disk drive must find this track within a few thousandths of a second and follow the repeatable and random fluctuations of the track to less than one millionth of an inch.  The servo positioning system makes this possible by using a sophisticated feedback control algorithm that controls the fast seeking and precise track following.
  For the best performance, the servo system requires a very accurate measurement of the head's position relative to the track.  Each hard disk drive’s surface is divided into data sectors and servo wedges.  The servo wedges are arc-shaped regions that extend from the ID to the OD.  They contain a unique magnetic pattern that provides a reference to the centre of the track.  The servo pattern is typically written at a much lower BPI than the data and its frequency is constant across the hard disk drive.  It is not zoned, meaning that the BPI is lower at the OD.  In other words, the servo pattern is shorter near the ID and longer near the OD, giving a wedge shape.  There are typically 50 to 200 evenly spaced servo wedges per revolution.  This embedded servo information is on each hard disk drive’s surface.  The servo field begins with a single frequency pattern for establishing timing and amplitude references.  A sync pattern indicates the beginning of the encoded cylinder number (or "track ID").  This is followed by three to six bursts of single frequency magnetic transitions.  These bursts provide accurate position information, relative to the track centre.  The first two bursts are typically called the "A burst" and the "B burst".  When the head is exactly on track centre, it will get a certain amount of signal from the A burst and then an equal amount from the B burst.  The relative amount (amplitude or energy) of each burst signal provides a precise measure of the head's position relative to the track centre.  Because the servo information is written before any tracks of data, the servo bursts actually define the centre of the data tracks.  The track ID indicates which track centre.  The servo system also identifies each sector.  It does this by maintaining synchronisation with the "first" servo wedge in a revolution and timing from there to indicate the beginning and end of each data sector on the track.  This timing relationship changes from zone to zone, but the wedge-to-wedge servo timing remains constant.
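The way the A and B bursts yield a position measurement can be sketched as follows. The normalised difference used here is one common textbook form of a position error signal; it is an assumption for illustration rather than the formula of any particular drive, and the function name is hypothetical.

```python
def position_error(a_amplitude: float, b_amplitude: float) -> float:
    """Normalised position error from the A and B burst amplitudes.

    Returns 0.0 when the head is on track centre (equal amplitudes), a
    positive value when the A burst dominates and a negative value when the
    B burst dominates. Dividing by the sum makes the result independent of
    the overall signal gain.
    """
    total = a_amplitude + b_amplitude
    if total <= 0:
        raise ValueError("burst amplitudes must sum to a positive value")
    return (a_amplitude - b_amplitude) / total


if __name__ == "__main__":
    print(position_error(1.0, 1.0))  # head on track centre
    print(position_error(1.5, 0.5))  # head displaced towards the A burst
```

A feedback controller would drive this value towards zero while track following.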
  Every data sector is a sequence of binary ones (1s) and zeros (0s), stored as a pattern of magnetic transitions.  A magnetic transition is a change from a north-facing magnet to a south-facing magnet or vice versa.  These are sometimes called "north-north transitions" and "south-south transitions," which stresses their "polarity" differences.  The GMR head and its amplifier respond with a voltage pulse for each transition that is read.  The polarity of the pulse indicates the transition's polarity.  The read gate is generated based on timing offsets from the rotational synchronisation generated by the servo system.  As stated above, the timing offsets vary from zone to zone.  Detection of a sector begins with the read gate's assertion and ends with its de-assertion.  The detection of data is equivalent to the detection of the presence or absence of the pulses, and their polarity.  However, detection must take place in a noisy environment, so mistakes can be made.  Furthermore, the read back signal can be distorted in many ways, for example by slightly off-track placement of the head.  At high BPI the pulses overlap, which causes pulse position shifting known as intersymbol interference (ISI).  This makes identifying the data sequence especially difficult.  Today’s hard disk drives use variations and extensions, such as extended PRML (EPRML), of partial-response maximum-likelihood (PRML) sequence detection in order to correctly detect data in such environments.  A controller using PRML or its successor EPRML (the standard hard disk drive method of data decoding) employs sophisticated digital signal sampling, processing, and detection algorithms to shape the analogue data stream read from the hard disk drive’s surface by the read/write head (the partial-response component) and then determine the likeliest sequence of bits it represents (the maximum-likelihood component).  EPRML has replaced PRML, using more advanced algorithms and signal-processing circuits.  The benefit of using EPRML is that the linear component of areal density can be increased further without a detrimental effect on the error rate.  In time, even more sophisticated techniques, such as iterative detection, will likely be employed.  For good error rate performance, it is necessary to establish the proper gain for each sector and lock the detection process to the precise frequency and phase of the read back waveform.
  This places three specific requirements on the stored data.
  1. Every data sector must start with a single-frequency sequence of transitions.  This is usually called the preamble and is about 10 to 15 bytes long.  The preamble makes it much easier to establish the proper gain and timing synchronisation for the sector.  Every servo field also starts with a single frequency preamble for the same reason.
  2. It is possible that the beginning of the user's data might look just like the repetitive pattern of the preamble.  To precisely indicate the end of the preamble a unique, easily identifiable transition sequence called the sync mark, or frame sync, is written in between the preamble and the user's data.  The sync mark is typically 2 to 6 bytes long and may be written in two locations in case the first sync mark is missed or damaged.
  3. After the sync mark is found, gain and timing lock must be maintained throughout the user's data that follows.  In order to ensure this, it must contain pulses at least every two to three bytes so that gain and timing locks can be adjusted.  For example, if the user stored an all-zeros pattern there would be no transitions to generate pulses to use to maintain synchronisation.  For this reason, the user’s data is run-length limited (RLL) encoded before being written to the hard disk drive.  This can expand the amount of data that must be written by about one percent to as much as 12.5%, depending on the RLL code used.
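The idea behind run-length limiting can be illustrated with modified frequency modulation (MFM), an older and very simple RLL (1,3) code. It is used here only because it fits in a few lines; it doubles the number of written bits, whereas the codes described above add only about 1% to 12.5%. In the encoded stream a 1 stands for a magnetic transition, and the function names are illustrative.

```python
def mfm_encode(data_bits, previous_bit=0):
    """Encode data bits with MFM: each data bit becomes a (clock, data) pair.

    The clock bit is 1 only between two consecutive 0 data bits, which
    guarantees transitions even when the user data is all zeros.
    """
    encoded = []
    for bit in data_bits:
        clock = 1 if (previous_bit == 0 and bit == 0) else 0
        encoded.extend([clock, bit])
        previous_bit = bit
    return encoded


def longest_zero_run(bits):
    """Length of the longest run of consecutive 0s (no transitions)."""
    longest = current = 0
    for bit in bits:
        current = current + 1 if bit == 0 else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    all_zeros = [0] * 8
    print(longest_zero_run(all_zeros))              # raw data: 8 bit times without a transition
    print(longest_zero_run(mfm_encode(all_zeros)))  # encoded: never more than 3
```

With this code an all-zeros pattern still produces a transition every other bit time, so gain and timing lock can be maintained.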

  Inside a modern hard disk drive, the user’s data is encoded about five times before being written to the hard disk drive.  This is done to 1) ensure that no incorrect data is provided to the user, 2) correct as many of the errors that may occur in detection as possible, and 3) improve the quality of detection by improving timing recovery and by mitigating the effects of certain error-prone patterns.  Because of these levels of encoding (with a coding overhead of about 6% and higher for RLL), the user's data itself is not written to the hard disk drive.  Instead it is the encoded user data that is stored.
  Currently, most hard disk drives combine RLL codes with a parity check code. This typically adds one or two bits to the RLL code overhead. The benefit of adding this small amount of parity is that the dominant errors made by the detector can be identified and corrected with a small increase in circuitry and code overhead.
  At this time the best that can be achieved towards the goal of eliminating the unrecoverable read error rate is with error correction coding, sometimes referred to as error correction code or error correcting circuits (ECC).
  The efficient Reed-Solomon algorithm for ECC, named after the researchers who discovered the general technique and in general use today, is employed for various computing and communications media, e.g., magnetic storage such as hard disk drives, optical storage, high-speed modems, and data transmission channels.  Compared to other ECC algorithms, it requires the least amount of overhead, is easier to decode, and can detect and correct large numbers of missing data bits.  The amount of correction needed depends on the medium.  For example, an optical disc (which uses CIRC and CIRC7) has a higher error rate than a magnetic disk.  Magnetic tape, consisting of a loop of flexible celluloid-like material, has a higher error rate than magnetic disk.  Fibre optic communications cable and semiconductor memory have a low error rate.
  ECC calculates parity bytes for the user’s data, which provide structured redundancy (the bits do not contain actual data, instead these bits contain information about the data that can be utilised to correct problems encountered while trying to access the actual user data bits) that can be used during decoding to detect and correct errors. The ECC encoded user data is what is scrambled and RLL encoded. Typically, Reed-Solomon encoding is used because of its good burst error correction capability and the economy of its implementation. Bursts of errors occur because a scratch or other small mark corrupts a group of consecutive bits. It is not uncommon to have the ECC capability to correct over 200 bit errors in a sector.
  When data is written to a sector, the appropriate ECC codes are generated and stored in the bits reserved for them.  When the sector is read back, the user data, combined with the ECC bits, tells the controller whether any errors occurred during the read.  Errors that can be corrected using the redundant information are corrected before the data is passed to the rest of the system.  The system can also tell when there is too much damage to the data to correct, and will issue an error notification.  Today’s hard disk drives use sophisticated firmware in which ECC is implemented as part of the overall error management protocols.  This is all done "on the fly" without user intervention, and with no performance degradation even when errors are corrected.  This is because errors must be detected and corrected at the same rate at which data is read from the sector, which requires digital logic (a hardware implementation directed by the hard disk drive firmware).  Software detection and correction would be inadequate.
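The write-then-check cycle can be demonstrated with a much simpler code than Reed-Solomon. The Python sketch below uses a Hamming(7,4) code, which is swapped in purely for brevity: it protects four data bits with three parity bits and corrects any single bit error. It is not the code used in hard disk drives, and the function names are illustrative.

```python
def hamming74_encode(data):
    """Encode four data bits as the 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, corrected_position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # the syndrome names the bad position (1-based)
    if position:
        c[position - 1] ^= 1  # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[5] ^= 1  # simulate a single read error in position 6
    print(hamming74_decode(word))  # ([1, 0, 1, 1], 6)
```

The parity bits carry no user data of their own; they only describe the data bits, which is what allows the decoder to locate and repair the damaged position.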
  However, ECC is not infallible; it can fail in two ways. One way is that there are too many errors in a sector to correct. This is an unrecoverable read error (also called a hard error). The other way ECC fails is much more pernicious.
  If there are a few more errors in a sector than the ECC can correct, and they occur in a certain way, it is possible that the ECC decoder "corrects" the data to the wrong values and delivers incorrect data without reporting an error.  This is disastrous in financial transactions, for example.  The probability of miscorrection, also called the probability of data corruption, is not commonly specified on hard disk drive data sheets.  To ensure that it is very unlikely that miscorrected data will be delivered, the ECC encoded data is often “wrapped” with a CRC (cyclic redundancy check) code.  This has a very strong capability to detect errors, but is not used for correction.  This provides the final check that the data is correctly delivered back to the computer over the interface.
  Note: To get the most benefit from zoning, sometimes data sectors are “split” across servo wedges.  The second part of a split sector must also start with a preamble and a sync mark.  The detected data sequences from both portions are concatenated and the decoding and descrambling proceed as usual.
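The CRC "wrapping" described above can be sketched with the CRC-32 function from Python's standard `zlib` module. The 4-byte trailer layout and the function names are assumptions chosen for illustration; drives use their own CRC polynomials and sector formats.

```python
import zlib


def wrap_with_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (hypothetical trailer layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unwrap_and_check(frame: bytes) -> bytes:
    """Return the payload if its CRC matches, otherwise raise ValueError.

    The CRC only detects damage; it makes no attempt to correct it.
    """
    payload, stored = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise ValueError("CRC mismatch: data is corrupt or was miscorrected")
    return payload


if __name__ == "__main__":
    frame = wrap_with_crc(b"sector payload")
    print(unwrap_and_check(frame))
    damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit
    try:
        unwrap_and_check(damaged)
    except ValueError as error:
        print(error)
```

If an ECC decoder hands back plausible but wrong data, the CRC comparison is what exposes the discrepancy before the data leaves the drive.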
  After assembling the components, drive manufacturers burn-in every hard disk drive. Depending on the quality of the drive and the demands of its intended market, the burn-in procedure may take about an hour or more than a day. The drives may go through testing for seek performance, power consumption, data handling, interface compliance, shock and vibration performance, temperature and power extremes, surface scanning for defects, noise measurements, etc.
  It is also during this time that the hard disk drive's parameters for detection, data organisation, and positioning are determined.  These optimised parameters are typically saved to a table that is stored on the extreme outer tracks (“negative cylinders”) of the hard disk drive.  Some manufacturers store duplicate copies of this table on each surface.  In a modern hard disk drive, self-servo-writing may be employed.  This means the hard disk drive's servo pattern is written during burn-in by the hard disk drive itself.  This gives incredible flexibility for determining a BPI/TPI combination that provides the desired capacity at the most robust performance point for a particular head/media pair, which is sometimes referred to as adaptive formatting.  TPI (tracks per inch) is the track density: it defines how many tracks are recorded per linear inch measured along the radius of the platter.  It is also necessary to measure various physical parameters of the head, including the offset between the head's write element and read element, resistance, temperature, pulse asymmetry, etc.  Various head parameters must be optimised, such as the current for writing, the current for read biasing, and the write pre-compensation that is needed to partially linearise the read back signal.  After the BPI/TPI, zoning, writing parameters, and reading parameters are determined, the detection parameters are optimised.  These must be determined for every zone of every surface of every hard disk drive.  A 6-surface hard disk drive with 16 zones requires 96 groups of channel optimisation settings (6 × 16) to be stored in the parameters table.  These channel settings include equalisation and noise-whitening filter coefficients; gain, timing, and adaptation parameters; detection target; RLL code selection; and so forth.  Similar settings must also be stored for detecting the servo wedge information.  With almost every new generation, a hard disk drive parameter that was fixed becomes variable.  This new variable must then be optimised, which leads to the “hypertuning” that occurs routinely in modern hard disk drives.
  See: Allocation Unit (or Cluster), Areal Density, Backup (Why Should I?), Checksum (or Verification), CHS (Cylinder, Head, Sector), Cluster (or Allocation Unit), Data Loss & Data Recovery, Disc & Disk, Sector, and Unrecoverable Read Error (or Hard Error).

 
   
Data Loss & Data Recovery  
 It has been estimated by a leading data recovery company that approximately 300,000 hard disk drives are sent to data recovery companies worldwide each year. About one half to three-quarters have some issue with reading. This might be sporadic reading, a high number of errors, excessive retries, and so forth. About one third to one half are completely unreadable. That is, they are not recognised by the host, do not spin up, do not send back any data, etc. It is estimated that approximately 260 million new hard disk drives were shipped in 2003.
  However, most hard disk drives stop working because of mundane reasons. For example:

  1. Failure of solder traces, electronic components, or connectors on the printed circuit board (PCB).
  2. Exceeding a S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) threshold.
  3. Damaged or corrupted firmware.
  4. Uncorrected bug in factory firmware.
  5. Damage to “system areas” of the hard disk drive that are used for calibration, testing, storing firmware and parameters tables.
  6. Spindle or voice-coil motor failure (e.g., short circuit, open circuit).
  7. Seized bearings.
  8. Breakdown of bearing grease.
  9. Disk shift or misalignment.
  10. Head damage.
  11. Overheating.
  12. And many others.

  The cost of a recovery ranges from US$500 to US$2,500, depending on the filesystem (servers, RAID arrays (multi-hard disk drive storage subsystems), and UNIX are more expensive), the type of repair, the time required, the type of data to retrieve (text is easier to recover than a corrupted database), and the probability of success and therefore of payment. Some extreme cases, including those requiring onsite support, can cost tens of thousands of dollars. Worldwide, it is estimated that the market for data recovery in 2005 was over US$100 million.
  The term “data recovery” refers to accessing logically damaged and/or physically damaged media, specifically from hard disk drives, to obtain files or blocks that have no functioning backups – or are themselves backups. Although the techniques for recovery from logical damage are interesting and challenging, they are more closely related to the operating system and the software programs used to create the data, not the hard disk drive itself.
  There is a general perception (a misconception) that data recovery companies have "magic machines" for retrieving data in almost any situation.  The reality is far less glamorous.
  The most sophisticated state-of-the-art physical techniques that are commercially successful for recovering data from failed hardware can all be described as “part replacement” conducted within laboratory clean room conditions. To achieve high data density and high manufacturing yields, modern hard disk drives are “hyper-tuned” in the factory so that their data layout, zone frequencies, and various channel settings are optimised for each head, surface, and zone. This greatly complicates part replacement because a transplanted headstack, for example, no longer matches the servo, pre-amp, and read channel parameters that were optimised for the original headstack. It is predicted that data recovery will be more important in the future as hard disk drives are exposed to more extreme mobile environments. Hard disk drive manufacturers may be able to differentiate themselves from their competition by designing for recoverability. Note: Data is being stored and lost at a geometric rate.
  Furthermore, the need for data recovery is expected to grow. This is not because drives are being built more poorly. Instead, the networked lifestyle and the expectation that all data is always available, even when mobile, will put massive amounts of information in more vulnerable places. For example, as hard disk drives continue to enter new non-traditional markets such as automobiles, cell phones, navigation, and personal mobile entertainment devices, hard disk drives will experience more and larger extremes of temperature, humidity, shock, vibration, and neglect. “Neglect” in the sense that hard disk drives employed in such uses are less likely to be backed up as often as those connected to home PCs or corporate data centres. In addition, the super-paramagnetic effect, which causes bits to decay with time, temperature, and external magnetic field, e.g., from adjacent-track writing, also appears to be causing some recent data losses. The longer-term unknown effects of the current change from longitudinal to perpendicular recording may also result in additional unexpected failure modes.
  A typical hard disk drive’s boot-up procedure begins at power-on with electronic subsystem self-checks; then spindle motor spin-up; headstack unlatching or unloading; initial acquisition of servo wedge timing; seeking to the system area; and reading in its information for additional boot procedures and/or needed parameters and firmware. Clearly if the system area information is corrupted, the hard disk drive is not likely to function. For this reason, many manufacturers store multiple copies of the system area information, often one (or more) per surface. Furthermore, if a module of this information is corrupt in one copy, a good copy of the module, i.e., one with a valid checksum, from another area can be used. The system area may become corrupted due to malfunctioning circuits, firmware bugs, exceeding the operational shock specifications of the hard disk drive, or position system errors. Another, more common, reason for system area corruption is a loss of power during an update of the system area itself. This might occur when system logs are being updated or when the G-list is being changed. The G-list, or grown defect list, holds information about the location of defects that have been found in the field during hard disk drive operation. The G-list is typically used for sector swapping, or sector reallocation. Related to this is the P-list, or primary defect list, that stores the location of media defects that were found during manufacturing. This is typically used for sector slipping and is not updated in the field.
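The difference between sector slipping (P-list) and sector reallocation (G-list) can be sketched as below. The flat physical sector numbering, the spare sector number, and the function names are hypothetical simplifications; a real translator also accounts for zones, head and track skew, and the layout of the spare areas.

```python
from bisect import bisect_right


def slip_past_factory_defects(lba, p_list):
    """Sector slipping: skip every factory defect at or before the target.

    p_list holds the physical sector numbers found defective during
    manufacturing. Each defect pushes all later logical blocks one sector on.
    """
    defects = sorted(p_list)
    physical = lba
    while True:
        skipped = bisect_right(defects, physical)
        if lba + skipped == physical:
            return physical
        physical = lba + skipped


def translate(lba, p_list, g_list):
    """Apply slipping, then swap any grown defect for its spare sector.

    g_list maps a physical sector that failed in the field to the spare
    sector that now holds its data.
    """
    physical = slip_past_factory_defects(lba, p_list)
    return g_list.get(physical, physical)


if __name__ == "__main__":
    p_list = [2, 3]      # physical sectors 2 and 3 were bad at the factory
    g_list = {5: 1000}   # physical sector 5 failed later; data moved to spare 1000
    for lba in range(5):
        print(lba, "->", translate(lba, p_list, g_list))
```

Slipping keeps logically consecutive blocks physically consecutive, whereas a reallocated sector forces a seek to the spare area, which is one reason grown defects can slow a drive down.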
  For some hard disk drive models, the system area contains only a small amount of information, such as a unique hard disk drive serial number, the P-list and G-list, often a “translator” that converts between logical and physical block addresses including the effects of head and track skewing, S.M.A.R.T. data, and a (possibly encrypted) hard disk drive password.
  Some hard disk drive models have larger system areas, which may span tens of tracks. This typically indicates that a hard disk drive employs hyper-tuning.
  Their system areas contain all of the information listed above, plus some or all of the following: program overlays (executable code) for seldom-used functions or functions subject to revision; hard disk drive-specific tables such as Repeatable/Repetitive Runout (RRO) compensation, writer/reader offsets, data rates, zone table, many read channel parameter settings, gains, bias currents, and servo parameters; test routines; calibration routines; factory defaults; system logs; and extensive details about hard disk drive components.
  Data recovery is difficult now, and is getting even more so. Hyper-tuning that simultaneously enables higher data density and higher yields causes the data recovery industry’s traditional hardware repair method of part replacement to fail in more hard disk drive models. While some hard disk drive models currently have recovery success rates above 90%, and others above 60%, an increasing number have practically no chance of recovery for most part replacements.
  Part replacement has historically been successful for data recovery about 40% to 60% of the time.  Claimed data recovery success rates are much higher. While they may, in fact, approach 100% for some hard disk drive models, for other models the success rate is near zero. Drive-independent data recovery methods are needed now to read these hard disk drives. Furthermore, as the data density of hard disk drives continues to increase the number of unrecoverable hard disk drives is expected to grow.
  The reason for this lack of successful recovery can be traced to the methods hard disk drive manufacturers must employ to achieve both high data density and high production yields.  Specifically, current hard disk drives are "hyper-tuned" in the factory to optimise the performance of each section of each hard disk drive.  The data format, head, disk, electronics, and firmware parameters are all optimised together.  This means that it is less likely that a head stack or electronics board or parameter tables from one hard disk drive – even of the same model – will work well when used as a replacement in a failed hard disk drive.
  A SteelEye Technology, Inc. market survey (June 2006) shows that despite both growing industry-wide adoption of formalised business continuity (BC) plans (73%) and a startling percentage of organisations that have needed to invoke those plans (45%), many organisations still fall short in their preparedness for an IT disaster. Despite the narrow window for outages, from 4 hours (32% report an outage of over four hours as being disastrous) through to 48 hours, 19% of organisations have no plan whatsoever for assuring business continuity, with enterprises most often blaming cost as the key barrier. However, BC plans need not be reserved for large budget organisations; there are many affordable options. When considering how many organisations actually use their BC plan and that any company without one could be just a day or two from a 'fatal issue,' the real costs for business continuity assurance begin to look minuscule.
  It is imperative therefore to have a sound backup solution and recovery strategy.
  Cautionary note: There are many reputable professional data recovery companies. However, it is difficult for the end-user, whose hard disk drive (or other storage media) contains important and pressing information that might be successfully recovered, to determine which company to trust. Furthermore, the end-user will not know if his failed hard disk drive is one of the models for which the company has a 90% recovery rate or a near-zero rate. A reputable data recovery company will explain the difficulties involved in recovery and will inform the user as to whether a good success rate with the hard disk drive is possible. Furthermore, even if the past success rate is good for a hard disk drive model, the user’s hard disk drive may be damaged in such a way that recovery is not possible. There is always a chance that the data cannot be recovered. When critical data is on the line, be sure that the data is unrecoverable because of the hard disk drive and not because of a lack of skill at the data recovery company that you chose.
  See: Backup (why should I?), Data, Defect Map, and Open File.

 
   
Default  
Settings at startup or reset by the computer’s software and attached devices that remain operational unless changed by the user. Moreover, the term is used in software to describe any action the computer or program takes on its own with embedded values.  
   
Defect Map  
A list of unusable sectors and tracks coded onto a hard disk drive during low-level formatting.
See: Data, Data Loss & Recovery, and Dynamic Cluster Remapping.
 
   
Defragmentation  
The process of rewriting parts of a file to contiguous (adjacent, touching at the edge or boundary, in one piece) sectors on a hard disk drive to increase the speed of access and retrieval. When files are updated, the operating system tends to save these updates in the largest contiguous space on the disk, which is often in a different location from the other parts of the file. When files are fragmented, the operating system must search the disk each time the file is opened to find all of its parts, which slows down response time, increases wear and tear, and consequently raises the likelihood of hard disk drive failure.
  It is recommended that file fragmentation be kept to a minimum. Limited fragmentation maintains the optimum performance of NTFS volumes. Moreover, if data recovery becomes necessary, using a hard disk drive hex editor will be less protracted and time consuming because each file part will follow from the last in an ordered manner.
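  As a rough illustration of what "fragmented" means, the sketch below counts how many separate runs of consecutive clusters a file occupies. The cluster numbers and the function names `count_fragments` and `defragment` are hypothetical and are not taken from any Windows tool; a real defragmenter must also find free space and move data safely.

```python
def count_fragments(clusters):
    """Count runs of consecutive cluster numbers in a file's cluster list."""
    if not clusters:
        return 0
    fragments = 1
    for prev, cur in zip(clusters, clusters[1:]):
        if cur != prev + 1:      # a jump means a new fragment starts here
            fragments += 1
    return fragments


def defragment(clusters, first_free):
    """Return a contiguous layout starting at first_free (assumed free run)."""
    return list(range(first_free, first_free + len(clusters)))


fragmented = [10, 11, 12, 40, 41, 7]     # three separate runs on disk
compact = defragment(fragmented, 100)    # one run: clusters 100..105
print(count_fragments(fragmented), count_fragments(compact))
```

  A file stored as a single run can be read in order without extra seeks, which is also why a contiguous file is easier to follow in a hex editor.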
 
   
Device Drivers (location - systemroot\System32\Drivers - Windows NT-based operating system system startup file #9)  
A device driver or software driver is a specific type of computer software, typically developed to allow interaction with hardware devices such as a keyboard, mouse, or video adapter. The bulk of the Windows operating system’s device driver code consists of loadable kernel-mode modules (typically with the .sys extension) that share the same protected memory space and interface with the I/O Manager and the relevant hardware using system routines and internal routines. Device drivers in Windows do not manipulate hardware directly; instead they call functions in the hardware abstraction layer (HAL) to write output to or retrieve input from a physical device or network.
  The key design goal of device drivers is abstraction: a mechanism and practice to reduce and factor out details, to focus on a few concepts at a time, which are continually improved upon.
  Computers and their operating systems cannot be expected to know how to control every device, both now and in the future. To solve this problem, operating systems essentially dictate how every type of device should be controlled. The function of the device driver is then to translate these operating system mandated function calls into device specific calls. In theory a new device, which is controlled in a new manner, should function correctly if a suitable driver is available. This new driver will ensure that the device appears to operate as usual from the operating system’s point of view. Depending on the specific computer architecture, drivers can be 8-, 16-, 32-, and more recently, 64-bit.
  Writing a device driver is considered a challenge and a skill in most cases, as it requires an in-depth understanding of how a given platform functions, both at the hardware and the software level. Because many device drivers execute in kernel mode, software bugs often have much more damaging effects on the system. This is in contrast to most types of user-level software running under modern operating systems, which can be stopped without greatly affecting the rest of the system. Even drivers executing in user mode can crash a system if the device being controlled is erroneously programmed. These factors make it more difficult and dangerous to diagnose problems. The engineers most likely to write device drivers work for the companies that develop the hardware, because they have more complete access to information about the design of their hardware than most. Moreover, it was traditionally considered in the hardware manufacturer’s interest to guarantee that their clients would be able to use their hardware in an optimum way.
  Typically, a device driver constitutes an interface for communicating with the device through the specific computer bus or communications subsystem to which the hardware is connected, e.g., local busses such as the CPU and physical memory busses, PCI (Peripheral Component Interconnect), AGP (Accelerated Graphics Port), PCI-Express (Peripheral Component Interconnect Express), USB (Universal Serial Bus; 1.1 or 2.0 (High-speed USB)), and FireWire (i.LINK or IEEE 1394a/b); mass storage interfaces such as serial ATA (SATA), SCSI (pronounced scuzzy) and IDE (each employing its own external bus system); and human interface devices. A typical machine has approximately five different busses supporting various devices. The driver provides commands to and/or receives data from the device on one end and, on the other end, presents the requisite interfaces to the operating system and software applications. Although often called a driver for short, it is a specialised, hardware-dependent computer program that is also operating system specific. It enables another program, typically an operating system, application software package or computer program running under the operating system kernel, to interact transparently with a hardware device, and it usually provides the interrupt handling required for any asynchronous, time-dependent hardware interfacing. All devices are seen by user mode code as file objects in the I/O Manager, though to the I/O Manager itself the devices are seen as device objects, which it defines as either file, device or driver objects. Kernel mode drivers exist in three levels: highest level drivers, intermediate drivers and low level drivers. The highest level drivers, such as file system drivers for FAT and NTFS, rely on intermediate drivers. Intermediate drivers consist of function drivers, that is, the main driver for a device, optionally sandwiched between lower and higher level filter drivers. 
The function driver then relies on a bus driver or a driver that services a bus controller, adapter, or bridge, which can have an optional bus filter driver that sits between itself and the function driver. Intermediate drivers rely on the lowest level drivers to function. The Windows Driver Model (WDM) exists in the intermediate layer. The lowest level drivers are either legacy Windows NT device drivers that control a device directly or can be a PnP (Plug and Play) hardware bus. These lower level drivers directly control hardware and do not rely on any other drivers.
  Windows 2000 added support for PnP, Power Options, and an extension to the Windows NT driver model called the Windows Driver Model (WDM). From the perspective of WDM, there are three types of drivers: bus driver, function driver and filter driver. A bus driver services a bus controller, adapter, bridge, or any device that has child devices. Bus drivers are required drivers generally supplied by Microsoft; each type of bus on a system, e.g., PCI, AGP or USB as mentioned, has one driver. A function driver is the main device driver and provides the operational interface for the device. It is a required driver unless the device is used raw. A function driver is by definition the driver that knows the most about a particular device, and it is usually the only driver that can access device-specific registers. A filter driver is used to add functionality to a device (or existing driver) or to modify I/O requests or responses from other drivers (and is often used to fix hardware that provides incorrect information about its hardware resource requirements). Filter drivers are optional and can exist in any number, placed above or below a function driver and above a bus driver.
  In a WDM driver environment, no single driver controls all aspects of a device: a bus driver is concerned with reporting the devices on its bus to the PnP Manager, while a function driver manipulates the device. In most cases, lower-level filter drivers modify the behaviour of device hardware, e.g., if a device reports to its bus driver that it requires four I/O ports when it actually requires 16 I/O ports, a lower-level device-specific filter driver could intercept the list of hardware resources reported by the bus driver to the PnP Manager, and update the count of I/O ports. Upper-level filter drivers usually provide added-value features for a device, e.g., upper-level device filter drivers for a keyboard can enforce additional security checks.
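  The I/O port example can be pictured with a small simulation. This is only a conceptual sketch in Python, not Windows driver code: the class names `BusDriver` and `LowerFilterDriver` and the method `query_resources` are hypothetical stand-ins for the layering idea, in which a filter sits above the bus driver and corrects what is passed upwards.

```python
class BusDriver:
    """Hypothetical bus driver: passes on what the device claims to need."""

    def query_resources(self):
        return {"io_ports": 4}   # the device under-reports its requirement


class LowerFilterDriver:
    """Hypothetical lower-level filter: corrects the reported resources."""

    def __init__(self, lower, actual_io_ports):
        self.lower = lower                    # the driver beneath this filter
        self.actual_io_ports = actual_io_ports

    def query_resources(self):
        resources = dict(self.lower.query_resources())
        resources["io_ports"] = self.actual_io_ports   # fix the count
        return resources


stack = LowerFilterDriver(BusDriver(), actual_io_ports=16)
print(stack.query_resources())   # what the layer above the filter would see
```

  The layer above the filter only ever sees the corrected list, so neither the bus driver nor the function driver has to change.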
  For Windows XP/Server 2003, Microsoft is attempting to address the system instability caused by poorly written device drivers by creating a new framework for driver development known as Windows Driver Foundation (WDF). This includes the User Mode Driver Framework (UMDF), which encourages development of certain types of drivers, primarily those that implement a message-based protocol for communicating with their devices, as user mode drivers. If such drivers malfunction they will not cause system instability. The Kernel Mode Driver Framework (KMDF) model continues to allow development of kernel-mode device drivers, but attempts to provide standard implementations of functions that are well known to cause problems, including cancellation of I/O operations, power management, and PnP device support. For example, Microsoft works closely with graphics device driver engineers, as these device drivers are crucial, run in kernel mode and would otherwise have a devastating effect on the operating system’s stability. Microsoft has introduced digitally certified drivers. These drivers have been authenticated to be compatible with, say, Windows XP and Server 2003 and in particular the new Microsoft operating system Windows Vista, which is intended to install only digitally certified drivers. The main reason that most drivers are not digitally certified is that Microsoft charges a fee, which many software developers feel is unnecessary or financially prohibitive. Therefore, non-digitally signed drivers are flagged as digitally unsigned yet may be perfectly robust and sound to install. However, this may not always be the case.
  File system drivers are Windows drivers that accept file-oriented I/O requests and translate them into I/O requests bound for a particular device. There are many other kinds of driver besides.
  A class driver is a type of hardware device driver that can operate a large number of different devices of a broadly similar type. Manufacturers that make their devices compatible with a standardised protocol will be able to take advantage of a class device driver, e.g., CD-ROM and USB devices commonly use class device drivers.
  A class driver can also be used in some operating systems as a base or ancestor class for specific drivers which need to have slightly different or extended functionality, but which can take advantage of the majority of the functionality provided by the class driver. This concept is a key aspect of object oriented programming, and by extending it to drivers, it makes it much easier for a hardware vendor to provide driver support for their product than having to write a driver from scratch.
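  The base-class idea can be sketched with ordinary inheritance. The classes `CdRomClassDriver` and `VendorCdRomDriver` and their methods are hypothetical names chosen for illustration and do not correspond to any real driver interface.

```python
class CdRomClassDriver:
    """Hypothetical class driver: generic behaviour for similar devices."""

    sector_size = 2048

    def read_sector(self, lba):
        return f"generic read of sector {lba} ({self.sector_size} bytes)"

    def eject(self):
        return "generic eject"


class VendorCdRomDriver(CdRomClassDriver):
    """Hypothetical vendor driver: overrides only what differs."""

    def eject(self):
        return "vendor-specific unlock, then " + super().eject()


drive = VendorCdRomDriver()
print(drive.read_sector(16))   # inherited unchanged from the class driver
print(drive.eject())           # extended by the vendor
```

  The vendor writes one method; everything else is reused from the class driver.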
  Note: Various components of the Windows kernel and several core device drivers are instrumented, using base mechanisms that Windows provides to kernel-mode components, to record trace data of their operation for use in system troubleshooting.
  See: Hal.dll (location - systemroot\System32 - Windows NT-based operating system system startup file #7), and Windows Kernel (Lower Windows Executive).
 
   
Device Object  
A data structure that represents a physical, logical, or virtual device on the system and describes its characteristics, such as the alignment it requires for buffers and the location of the device queue to hold incoming I/O request packets.  
   
Directory  
Strictly speaking a file directory is an information source or area of disk that simply stores an index of filenames given to the files saved on the disk and serves as a table of contents for those files – that is, a collection of filenames (along with their file reference) organised in a particular manner for quick and easy access. To create a directory, NTFS indexes the filename attributes of the files in a directory.
  A directory contains data that identifies the name of a file (the filename), the size, the attributes (system, hidden, read-only, and so on), the date and time of creation, and a pointer to the location of the file. Each entry in a directory is 32 bytes long. Windows refers to subdirectories (directories beneath the root directory) as folders.
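  A 32-byte entry of this kind can be decoded with a fixed-layout record. The sketch below assumes the classic FAT short directory entry layout (8-character name, 3-character extension, attribute byte, timestamps, starting cluster, size); the entry above does not say which filesystem it describes, so treat the layout as an assumption. The function name `parse_entry` and the sample values are made up for the example.

```python
import struct

# Assumed layout: classic FAT short directory entry, little-endian, 32 bytes.
ENTRY = struct.Struct("<8s3sBBBHHHHHHHI")


def parse_entry(raw):
    """Decode one 32-byte directory entry into a dictionary."""
    (name, ext, attr, _reserved, _ctime_tenths, _ctime, _cdate, _adate,
     cluster_hi, _wtime, _wdate, cluster_lo, size) = ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "read_only": bool(attr & 0x01),
        "hidden": bool(attr & 0x02),
        "system": bool(attr & 0x04),
        "directory": bool(attr & 0x10),
        "first_cluster": (cluster_hi << 16) | cluster_lo,  # pointer to data
        "size": size,
    }


# A made-up read-only file README.TXT, 1234 bytes, starting at cluster 5.
sample = ENTRY.pack(b"README  ", b"TXT", 0x01, 0, 0,
                    0, 0, 0, 0, 0, 0, 5, 1234)
entry = parse_entry(sample)
print(ENTRY.size, entry["name"], entry["ext"], entry["size"])
```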
  See: Filename, File Reference, and Master File Table (MFT).
 
   
Disc & Disk  
 
   
Dismount  
To remove a removable tape or disc from a drive. Dismount also describes what NTFS does when it finally disengages access to a volume; in this context, to dismount means to take the volume out of use.  
   
Drive  
A drive is a mechanical device that manipulates data storage media.
  See: Disc & Disk.
 
   
Driver Object  
A data structure that represents an individual driver in the system and records, for the I/O Manager (responsible for communication within an information processing system), the address of each of the driver’s dispatch routines (entry points).  
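  Conceptually, a driver object acts as a table from request types to entry points. The sketch below is a loose Python analogy only; the class `DriverObject`, its methods, and the request names are hypothetical and do not mirror the real Windows structure.

```python
class DriverObject:
    """Hypothetical stand-in: maps request types to dispatch routines."""

    def __init__(self, name):
        self.name = name
        self.dispatch = {}            # request type -> entry point

    def register(self, request_type, routine):
        self.dispatch[request_type] = routine

    def handle(self, request_type, *args):
        routine = self.dispatch.get(request_type)
        if routine is None:
            return "unsupported request"
        return routine(*args)


drv = DriverObject("example")
drv.register("read", lambda block: f"read block {block}")
print(drv.handle("read", 7))
print(drv.handle("write", 7))
```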
   
Dynamic Cluster Remapping  
An automatic and dynamic recovery technique used when a Windows NT family operating system returns a bad sector error to NTFS. NTFS dynamically replaces (remaps/reallocates) the cluster containing the bad sector: it assigns the cluster with the bad sector/block to its bad-cluster file and allocates a new cluster so that the data can be copied to it, preserving data integrity where the data is still available. If the error occurs during a read from a bad hard disk drive sector, the read operation fails and the data in the allocated cluster becomes inaccessible; NTFS returns a read error to the calling program, which is left to respond to the data loss. The bad cluster will not be used in future allocations. If the error occurs during a write, NTFS remaps the cluster before writing the data to the new cluster, and thus loses no data and generates no error. The same recovery procedures are followed if the filesystem’s data is stored in a sector that goes bad.
  However, cluster remapping is not a backup alternative. Once errors are detected, the hard disk drive must be monitored closely and replaced if the bad sector detection list grows. For a Windows NT family operating system this type of error is displayed in the System Log of Event Viewer.
  This data recovery and dynamic bad-cluster remapping is an especially useful feature for file servers and fault-tolerant systems or for any application that cannot afford to lose data.
  FAT uses a form of cluster remapping, but only when the volume is initially formatted. If a bad sector occurs in a FAT volume after it is formatted, data stored within the associated cluster can be permanently lost. This is one advantage of using NTFS over FAT.
  In addition to the recovery features mentioned, NTFS uses redundant storage for vital filesystem information so that if a sector on the hard disk drive goes bad, NTFS can still access the volume’s critical filesystem data. This redundancy of filesystem data contrasts with the on-hard disk drive structures of both types of FAT filesystem (FAT16 and FAT32). On these systems, if a read error occurs in one of these critical sectors an entire volume can be lost.
  If some of the filesystem’s control structures reside in the bad sector, an entire file or group of files, or an entire disk, can be lost. At best, some data in the affected file (often, all the data in the file beyond the bad sector) is lost. Moreover, the FAT filesystem is likely to reallocate the bad sectors to the same or another file on the volume, causing problems to resurface.
  To conclude, the benefits of NTFS cluster remapping are that bad sectors/blocks in a file can be fixed without harm to the file (or harm to the filesystem) and that the bad cluster will not be reallocated to the same or another file.
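  The read and write behaviour described in this entry can be pictured with a toy model. This is a simplified sketch and not how NTFS is implemented: the class `Volume`, its fields, and the lowest-free-cluster allocation policy are assumptions made for the example.

```python
class Volume:
    """Toy volume: clusters are numbered; some are physically bad."""

    def __init__(self, size, bad=()):
        self.data = {}                    # cluster number -> stored payload
        self.free = set(range(size))      # clusters available for allocation
        self.physically_bad = set(bad)    # clusters that fail on access
        self.bad_cluster_file = set()     # clusters retired from use

    def _retire(self, cluster):
        self.bad_cluster_file.add(cluster)
        self.free.discard(cluster)        # never allocated again

    def write(self, cluster, payload):
        """Write payload; remap first if the target cluster is bad."""
        while cluster in self.physically_bad:
            self._retire(cluster)
            cluster = min(self.free)      # allocate a replacement cluster
        self.free.discard(cluster)
        self.data[cluster] = payload
        return cluster                    # where the data actually landed

    def read(self, cluster):
        """Read a cluster; a bad cluster is retired and the read fails."""
        if cluster in self.physically_bad:
            self._retire(cluster)
            raise IOError("read error: data in the bad cluster is lost")
        return self.data[cluster]


vol = Volume(8, bad={3})
landed = vol.write(3, b"payload")         # remapped, no error, no data lost
print(landed, vol.read(landed), sorted(vol.bad_cluster_file))
```

  In this model a write to a bad cluster succeeds on a replacement cluster, while a read from a bad cluster raises an error, and in both cases the bad cluster is kept out of future allocations.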
    Chkdsk is less than ideal for removing bad sectors from use.  Chkdsk can take considerable time to find and recover bad sectors.
  Because of the importance of the Master Boot Record (MBR) and Extended Boot Record (EBR) sectors, run hard disk drive scanning tools, e.g., chkdsk, regularly, and regularly back up all critical data to protect against losing access to a volume or an entire hard disk drive.  
See: Bad Block or Bad Sector, Bad Cluster, Bad-Cluster File, Check Disk (chkdsk.exe), Data, Data Loss & Data Recovery, NTFS (New Technology Filesystem), S.M.A.R.T. (Self-Monitoring Analysis and Reporting Technology), and Recoverable Filesystem.

 
   
Dynamic Disk  
Only Windows NT family operating systems can access physical dynamic disks, which implement a more flexible partitioning scheme than that of basic disks. The fundamental difference between a basic and a dynamic disk is that dynamic disks support the creation of new multi-partition volumes. Multi-partition volumes give dynamic disks features that basic disks lack, such as volumes that span multiple disks and performance, sizing, and reliability features not supported by simple volumes. Windows manages all disks as basic disks unless the user manually creates dynamic disks or converts existing basic disks to dynamic disks. Unless the user requires the multi-partition functionality of dynamic disks, Windows NT family operating systems use basic disks as a matter of course. Dynamic disks use a hidden database to track information about dynamic volumes on the hard disk drive and on the other dynamic disks in the computer. Basic disks can be converted (upgraded) to dynamic disks by using the Windows Disk Management (WDM) MMC snap-in or the DiskPart (diskpart.exe) command-line tool; the DmDiag (dmdiag.exe) command-line tool displays the location and layout of dynamic disks (Master Boot Record (MBR)) or dynamic volumes. A basic disk converted to a dynamic disk no longer contains basic disk volumes, because dynamic disks do not use partitions or logical DOS drives; the existing volumes become dynamic volumes.
  The MBR, partition table and boot sectors of individual partitions allow an operating system to boot a computer and mount individual volumes. For a Windows NT family operating system, an additional way of describing partitions is known as Logical Disk Manager (LDM) partitioning. The LDM is used when partitioning goes beyond what Windows calls basic disks. For basic disks (basic/standard partitioning) there is an MBR, a partition table within the MBR, and a boot sector for each volume. For dynamic disks (advanced partitioning) there is the MBR and LDM only.
  As mentioned, dynamic disks are partitioned using LDM partitioning. The LDM subsystem in Windows, which consists of user-mode and device driver components, oversees dynamic disks. LDM provides a Windows NT family operating system with more robust partitioning and multi-partition volume capabilities. A major difference between LDM’s partitioning and MBR and GPT partitioning is that LDM maintains one unified database that stores partitioning information for all the dynamic disks on a system – including multi-partition-volume configuration.
See: Basic Disk, Dynamic Volumes, Logical Disk Manager (LDM) Database, Mount Point, and Recoverable Filesystem.

 
   
Dynamic Volume  
  If a system fails, a recoverable filesystem such as NTFS is designed to reconstruct the disk volume structure and will not affect the correctness or integrity of the database. Consequently, filesystem structures can be repaired to a consistent state with no loss of file or directory structure information (file data can be lost as this is secondary). Because data integrity is not guaranteed and is secondary to correctness or integrity of the database after a system failure, protection is provided through data redundancy.  Data redundancy for user files is implemented through the Windows layered driver model, which provides fault-tolerant disk support.
  A dynamic volume resides on a dynamic disk. Windows supports five types of dynamic volumes: simple, spanned, striped, mirrored, and RAID-5. Note: Dynamic disks are the disk format in Windows necessary for creating the multi-partition volumes described below.  A dynamic volume is formatted by a filesystem, such as FAT or NTFS, and has a drive letter.

  • Simple:

  A simple volume can consist of a single region on a hard disk drive or multiple regions of the same hard disk drive that are linked together. A simple volume can be extended within the same hard disk drive or onto additional hard disk drives. When a simple volume is extended across multiple hard disk drives, it becomes a spanned volume. Simple volumes can only be created on dynamic disks. A simple volume is not fault tolerant, but simple volumes can be mirrored to create a mirrored volume.

  • Spanned:

  A spanned volume is a single logical volume composed of a maximum of 32 free partitions on one or more hard disk drives. The Windows Disk Management (WDM) MMC snap-in combines the partitions into a spanned volume, which can then be formatted for any of the Windows filesystems.
  A spanned volume is useful for consolidating small areas of free disk space into one larger volume or for creating a single large volume out of two or more small hard disk drives. If the spanned volume has been formatted in NTFS, it can be extended to include additional free areas or additional hard disk drives without affecting the data already stored on the volume. The extensibility is one of the main benefits of describing all data on an NTFS volume as a file. NTFS can dynamically increase the size of the logical volume because the bitmap that records the allocation status of the volume is just another file – the bitmap file ($Bitmap). The bitmap file can be extended to include any space added to the volume. Dynamically extending the FAT volume, on the other hand, would require FAT itself to be extended, which would dislocate everything on the hard disk drive. Extending it onto additional dynamic disks can increase the size of the spanned volume. Spanned volumes are not fault tolerant and cannot be mirrored. Spanned volumes can only be created on dynamic disks.
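  Spanning amounts to laying the member regions end to end. The sketch below maps an offset within the logical volume to a member region and an offset within it; the function name `locate_spanned` and the region sizes are hypothetical.

```python
def locate_spanned(offset, extents):
    """Map a volume offset to (extent index, offset within that extent).

    extents is a list of region sizes, in the order they were joined.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    for index, size in enumerate(extents):
        if offset < size:
            return index, offset
        offset -= size                 # skip past this region
    raise ValueError("offset beyond end of volume")


extents = [100, 50, 200]               # three regions, 350 units in total
print(locate_spanned(120, extents))    # falls in the second region

extents.append(300)                    # extend the volume with a new region
print(locate_spanned(400, extents))    # existing offsets are unaffected
```

  Extending the volume only appends a region to the list, which is why data already stored keeps its place.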

  • Striped (RAID level 0 or RAID-0):

  A striped volume is a series of up to 32 partitions, one partition per disk, that are combined into a single logical volume across all configured hard disk drives in the RAID storage subsystem. Note: A partition in a striped volume need not span an entire hard disk drive; the only restriction is that the partitions on each hard disk drive be the same size.
  To a filesystem, this striped volume appears to be a single volume, but the storage of the data is optimised and retrieval times are improved because data is normally distributed evenly among the hard disk drives. Stripes thus increase the probability that multiple pending read and write operations will be bound for different hard disk drives. Because data is distributed evenly, all the hard disk drives can be accessed simultaneously. Latency time for disk I/O is often reduced, particularly on heavily loaded systems, providing the best performance of any RAID (multi-hard disk drive storage subsystem) level.
    Spanned volumes make managing disk volumes more convenient, and striped volumes spread the I/O load over multiple hard disk drives. These two volume-management features do not, however, provide the ability to recover data if a hard disk drive fails. For data recovery, the use of mirrored volumes, RAID-5 volumes, and sector sparing is necessary.
  Data in a striped volume is allocated alternately and evenly (in stripes) across the hard disk drives. Striped volumes offer the best performance of all the dynamic volume types that are available in Windows, but they do not provide fault tolerance. If a hard disk drive in a striped volume fails, the data in the entire volume is lost. Striped volumes cannot be mirrored or extended. Striped volumes can only be created on dynamic disks.
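  The alternating allocation can be sketched as round-robin placement. The function name `locate_striped` is hypothetical, and the sketch counts in whole stripe units rather than bytes to keep the arithmetic visible.

```python
def locate_striped(unit, disks):
    """Map a stripe-unit number to (disk index, row on that disk)."""
    return unit % disks, unit // disks


# Six consecutive stripe units on a three-disk striped volume.
for unit in range(6):
    print(unit, locate_striped(unit, 3))
```

  Consecutive units land on different disks, so several pending requests can often be serviced by different drives at the same time.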

  • Mirror (RAID level 1 or RAID-1):

  In a mirrored volume (a fault-tolerant volume), the contents of a partition on one hard disk drive are duplicated in an equivalent partition on another hard disk drive (concurrent copies are made). For example, when a program writes to drive C, the same data is written to the same location on the mirrored partition. If the first hard disk drive or any of the data on its C partition becomes unreadable due to hardware or software failure, the mirrored data is automatically accessed. A mirror volume can be formatted for any of the Windows-supported filesystems. The filesystem drivers remain independent and are not affected by any mirroring activity.
  Mirrored volumes can aid I/O throughput on heavily loaded systems. When I/O activity is high, read operations can be balanced between the primary partition and the mirrored partition. Two read operations can proceed simultaneously and therefore, in theory, finish in half the time they would otherwise have taken on a non-mirrored setup. When a file is modified, both partitions of the mirror set must be written, but disk writes are done asynchronously, so the performance of user-mode programs is generally not affected by the extra disk update.
  Mirrored volumes are the only multi-partition volume type supported for system and boot volumes. The reason for this is that the Windows boot code, including the Master Boot Record (MBR) code and Ntldr, does not have the sophistication required to understand multi-partition volumes. Mirrored volumes are the exception because the boot code treats them as simple volumes, reading from the half of the mirror marked as the boot system drive in the MBR-style partition table. As the boot code does not modify the disk, it can safely ignore the other half of the mirror.
  In summary, a mirrored volume is a fault-tolerant volume that duplicates data on two hard disk drives. It provides data redundancy by using two identical volumes, which are mirrors, to duplicate the information contained on the volume. The two mirrors are always located on different hard disk drives. If one of the hard disk drives fails, the data on the failed hard disk drive becomes unavailable, but the system continues to operate using the mirror on the remaining hard disk drive. Mirrored volumes can only be created on dynamic disks.
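  The duplicate-on-write, fall-back-on-read behaviour can be modelled in a few lines. The class `MirroredVolume` and its fields are hypothetical; real mirroring happens below the filesystem and also has to resynchronise a replaced disk, which this sketch ignores.

```python
class MirroredVolume:
    """Toy mirror: every write goes to both halves; reads use a healthy half."""

    def __init__(self):
        self.halves = [{}, {}]          # block number -> data, one per disk
        self.failed = [False, False]

    def write(self, block, data):
        for index, half in enumerate(self.halves):
            if not self.failed[index]:
                half[block] = data      # same data, same location, each disk

    def read(self, block):
        for index, half in enumerate(self.halves):
            if not self.failed[index]:
                return half[block]
        raise IOError("both halves of the mirror have failed")


mirror = MirroredVolume()
mirror.write(0, b"ledger")
mirror.failed[0] = True                 # simulate losing the first disk
print(mirror.read(0))                   # served from the surviving half
```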

  • RAID-5 (RAID level 5 or striped volumes with parity):

  A RAID-5 volume is a fault-tolerant variant of a regular striped volume. RAID-5 volumes are also known as striped volumes with parity because they are based on the striping approach taken by striped volumes. Fault tolerance is achieved by reserving the equivalent of one hard disk drive for storing parity for each stripe. At least three hard disk drives (more specifically, three same-sized partitions on three hard disk drives) are required to create a RAID-5 volume. For example, the parity for stripe one is stored on hard disk drive one. It contains a byte-for-byte logical sum (XOR) of the first stripe on hard disk drives two and three. The parity for stripe two is stored on hard disk drive two, and the parity for stripe three is stored on hard disk drive three. Rotating the parity across the hard disk drives in this way is an I/O optimisation technique. Each time data is written to a hard disk drive, the parity bytes corresponding to the modified bytes must be recalculated and rewritten. If the parity were always written to the same hard disk drive, that hard disk drive would be continually busy and could become an I/O bottleneck.
  Recovering a failed hard disk drive in a RAID-5 volume relies on a simple arithmetic principle: in an equation with n variables, if the values of n - 1 of the variables are known, the value of the missing variable can be determined. Because the parity is the XOR of the data, the missing data is reconstructed by XORing the parity with the data that remains.
  Parity is a calculated value that is used to reconstruct data after a failure. If a portion of the hard disk drive fails, Windows re-creates the data that was on the failed portion from the remaining data and parity information together for reconstruction. RAID-5 volumes can only be created on dynamic disks. RAID stands for Redundant Array of Independent or Inexpensive Disks and is a multi-hard disk drive storage subsystem for data replication.
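  The parity arithmetic can be tried out directly. The sketch below computes XOR parity for one stripe on a three-disk layout and rebuilds a lost member from the other two; the helper names `xor_bytes` and `parity_disk` and the sample bytes are made up for the example, and disks are numbered from zero rather than from one.

```python
from functools import reduce


def xor_bytes(*chunks):
    """Byte-for-byte XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def parity_disk(stripe, disks):
    """Rotating parity: stripe 0 on disk 0, stripe 1 on disk 1, and so on."""
    return stripe % disks


data_a = b"\x0f\xf0"                    # stripe data on one disk
data_b = b"\x33\x55"                    # stripe data on another disk
parity = xor_bytes(data_a, data_b)      # stored on the third disk

# The disk holding data_a fails: rebuild it from parity and data_b.
rebuilt = xor_bytes(parity, data_b)
print(parity.hex(), rebuilt == data_a)
print([parity_disk(stripe, 3) for stripe in range(4)])
```

  XOR is its own inverse, so the same operation that produces the parity also recovers any single missing member of the stripe.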
See: Basic Input/Output system (BIOS), Dynamic Disk, Logical Disk Manager (LDM) Database, Master Boot Record (MBR), MBR Disk, and NTFS (New Technology Filesystem).

 
 
   
   
 
Copyright© Genie-Soft Corporation 2001-2007. All rights reserved.