Seek time is improving slower than the transfer rate.
Typically is 90’s a drive would be of 1 GB and transfer speed would be 4.5 MB/s
Thus time taken to read the drive = Total Memory / Speed
= 1000 / 4.5 ~ 222 sec ~ 4 min.
Typically scenario now-a-days is 1TB memory and transfer speed of 100 MB/s
Thus time taken to read the drive = Total memory / speed
= 1,000,000 / 100 ~ 10,000 sec ~ 166 min
= ~2.8 hr.
Advantage of Parallelism
Suppose that the 1 TB data is distributed over 50 nodes on a cluster with 1/50 th of the data on each node.
The read time will be 1/50 th of the 1 TB red would be close to 3.5 min.
Hadoop maintains replicas of the copies and this failures do not effect the integrity of data.