1. The retention period of records in Kafka is configurable.
  2. The default retention period is 7 days. Retention can also be set per topic, so each topic in the cluster can have its own retention period.
  3. The broker-wide default is set in the server.properties file of the Apache Kafka distribution.
    The property is log.retention.hours=168 (168 hours = 7 days). An individual topic can override it with the topic-level retention.ms setting.
  4. Let's say the retention period is one day. In that case a record becomes eligible for deletion once it is more than one day old, which frees up disk space in the cluster.
  5. Regardless of whether all the consumers have consumed a message, the record stays in the cluster until the retention period expires.
  6. Kafka's performance is effectively constant with respect to data size, so holding data for a long time is not an issue.
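
The retention rule in the points above can be illustrated with a small Python sketch. This is a simplified model, not Kafka's actual implementation: the names `Record`, `retention_ms`, and `apply_retention` are hypothetical, and a real broker removes whole log segments rather than individual records. It only shows that expiry depends on a record's age, not on whether it has been consumed.

```python
from dataclasses import dataclass

MS_PER_HOUR = 60 * 60 * 1000


def retention_ms(hours: int) -> int:
    """Convert a retention period in hours to milliseconds."""
    return hours * MS_PER_HOUR


@dataclass
class Record:
    # Hypothetical stand-in for a Kafka record; not a Kafka client class.
    offset: int
    timestamp_ms: int       # time the record was written
    consumed: bool = False  # tracked only to show it has no effect on retention


def apply_retention(records, now_ms, retention_period_ms):
    """Keep the records whose age does not exceed the retention period.

    Simplification: real brokers delete whole log segments, not single records.
    """
    return [r for r in records if now_ms - r.timestamp_ms <= retention_period_ms]


one_day = retention_ms(24)
now = retention_ms(48)  # pretend "now" is 48 hours after time zero

log = [
    Record(offset=0, timestamp_ms=0, consumed=True),                 # 48 h old, consumed
    Record(offset=1, timestamp_ms=retention_ms(12)),                 # 36 h old, not consumed
    Record(offset=2, timestamp_ms=retention_ms(30), consumed=True),  # 18 h old, consumed
    Record(offset=3, timestamp_ms=retention_ms(47)),                 # 1 h old, not consumed
]

kept = apply_retention(log, now, one_day)
print([r.offset for r in kept])  # [2, 3]
print(retention_ms(168))         # 604800000, i.e. the 7-day default in milliseconds
```

Offset 1 is dropped even though no consumer read it, and offset 2 is kept even though it was already consumed, which matches point 5: only the record's age against the retention period matters.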