Schema on Write vs. Schema on Read

Schema on Write (RDBMS) | Schema on Read (Hadoop)
The schema must be created before any data can be loaded. | Data is simply copied to the file store; no transformation is needed.
An explicit load operation has to take place, which transforms the data into the database's internal structure. | A SerDe (Serializer/Deserializer) is applied at read time to extract the required columns (late binding).
New columns must be added explicitly before data for those columns can be loaded into the database. | New data can start flowing at any time and will appear retroactively once the SerDe is updated to parse it.
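
The contrast in the comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for the RDBMS, a list of JSON strings stands in for files in the file store, and read_with_serde is a hypothetical helper playing the role of a SerDe. None of these names come from Hadoop or Hive.

```python
import json
import sqlite3

# Hypothetical raw records, stored exactly as they arrived (no transformation).
RAW_LINES = [
    '{"id": 1, "name": "ann"}',
    '{"id": 2, "name": "bob", "country": "DE"}',  # a new field starts flowing
]


def schema_on_write(lines):
    """Create the schema first, then run an explicit load that transforms rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    for line in lines:
        rec = json.loads(line)
        # Fields the schema does not know about are dropped at load time.
        db.execute("INSERT INTO users VALUES (?, ?)", (rec["id"], rec["name"]))
    return db


def read_with_serde(lines, columns):
    """Schema on read: parse at query time and extract only the requested columns."""
    return [tuple(json.loads(line).get(c) for c in columns) for line in lines]


db = schema_on_write(RAW_LINES)
written = db.execute("SELECT * FROM users ORDER BY id").fetchall()

# Adding the column explicitly does not bring back values that were never loaded.
db.execute("ALTER TABLE users ADD COLUMN country TEXT")
after_alter = db.execute("SELECT country FROM users ORDER BY id").fetchall()

# Widening the column list at read time surfaces the field retroactively.
old_view = read_with_serde(RAW_LINES, ["id", "name"])
new_view = read_with_serde(RAW_LINES, ["id", "name", "country"])
```

In this sketch the schema-on-write side needs both an explicit ALTER TABLE and a reload before "country" becomes usable, while the schema-on-read side only needs the reader to ask for the extra column, at the cost of parsing every record on each read.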

Schema on Write:
Read is fast
Standards / Governance


Schema on Read:
Load is fast
Flexibility / Agility

