Erasure coding

Normal Reads (No Data Loss): You don’t need to read every fragment, only the minimum number required to reconstruct the data. For example:
- If the coding scheme is (6+3) (6 data fragments + 3 parity fragments), you only need 6 fragments (any combination of data and parity) to reconstruct the object.
- This reduces network overhead and improves performance during normal operations.
Recovery Reads (Data Loss): If some fragments are lost (e.g., due to node failure), erasure coding uses the remaining fragments (data + parity) to mathematically reconstruct the missing ones.

Fragment Encoding:
- During storage, the original data is split into data fragments (e.g., D1, D2, D3, …), and parity fragments (P1, P2, …) are computed from them.
- Parity fragments are created using mathematical algorithms like Reed-Solomon coding.
- Example: For a (6+3) setup, you store 6 data fragments and 3 parity fragments, distributed across 9 nodes.
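A minimal sketch of the encoding step, assuming a toy Reed-Solomon-style code: each stripe of 6 data symbols is treated as the values of a polynomial at x = 0..5, and the 3 parity symbols are that polynomial's values at x = 6..8. The prime field GF(257) and the names `encode` and `interpolate_at` are illustrative assumptions, not how any particular storage system implements it; real systems usually work in GF(2^8) with optimized libraries.

```python
# Toy (6+3) encoder. Assumption: arithmetic modulo the prime 257 instead of
# the GF(2^8) arithmetic that production Reed-Solomon libraries use.
P = 257        # prime modulus; every byte value 0..255 is a field element
K, M = 6, 3    # data and parity fragment counts from the (6+3) example


def interpolate_at(points, x):
    """Value at x of the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes):
    """Split data into K data fragments and compute M parity fragments."""
    padded = data + bytes(-len(data) % K)  # zero-pad to a multiple of K
    # Fragment i holds bytes i, i+K, i+2K, ... (one symbol per stripe).
    data_frags = [list(padded[i::K]) for i in range(K)]
    parity_frags = [[] for _ in range(M)]
    for stripe in zip(*data_frags):
        points = list(enumerate(stripe))  # data symbol i sits at x = i
        for p in range(M):
            # Parity symbols are lists of ints because a value may be 256.
            parity_frags[p].append(interpolate_at(points, K + p))
    return data_frags, parity_frags
```

Calling `encode(b"hello erasure coding")` yields 6 data fragments and 3 parity fragments of equal length, which could then be placed on 9 different nodes.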
Data Loss Detection:
- When a fragment is missing (e.g., due to a node failure), the system detects the absence during a read request.
Reconstruction:
- The system retrieves enough surviving fragments (e.g., any 6 of the original 9) and runs the erasure coding algorithm over them, parity included, to reconstruct the missing data.
- Example:
  - Missing: D2
  - Retrieve: D1, D3, D4, D5, D6, P1
  - Use D1, D3–D6, and P1 to calculate the value of D2.
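The example of recovering D2 from D1, D3–D6, and P1 can be sketched for a single stripe. This assumes a toy polynomial code over the prime field GF(257) (a stand-in for real Reed-Solomon over GF(2^8)); the helper name `interpolate_at` and the sample bytes are hypothetical.

```python
# Toy recovery of one lost data symbol. Assumption: the six data symbols are
# a polynomial's values at x = 0..5 and P1 is its value at x = 6, all mod 257.
P = 257


def interpolate_at(points, x):
    """Lagrange interpolation mod P: value at x of the polynomial through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


stripe = list(b"Hello!")                         # D1..D6 at x = 0..5
p1 = interpolate_at(list(enumerate(stripe)), 6)  # P1 at x = 6

# D2 (x = 1) is lost; the six survivors are D1, D3, D4, D5, D6, and P1.
survivors = [(x, y) for x, y in enumerate(stripe) if x != 1] + [(6, p1)]
d2 = interpolate_at(survivors, 1)
assert d2 == stripe[1]  # the recovered symbol matches the original D2
```

Any six of the nine symbols determine the same polynomial, which is why it does not matter whether the survivors are data or parity.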

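Mathematical Redundancy: Parity fragments allow reconstruction even with multiple missing fragments. With Reed-Solomon coding, a (k+m) scheme tolerates the loss of any m fragments, so (6+3) survives up to 3.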
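Lower Storage Overhead: Instead of storing full copies of the object (as in replication), you store only the parity fragments on top of the data, reducing redundancy costs. A (6+3) scheme stores 1.5× the original size, versus 3× for three-way replication.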
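The overhead comparison is simple arithmetic. The sketch below uses three-way replication as the comparison point, which is an assumption chosen for illustration; the function name is hypothetical.

```python
def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for a (k+m) erasure code."""
    return (k + m) / k


ec_overhead = storage_overhead(6, 3)  # (6+3): 9 fragments for 6 fragments of data
replication_overhead = 3.0            # assumption: three full copies of the object

print(f"(6+3) erasure coding: {ec_overhead:.1f}x, tolerates 3 lost fragments")
print(f"3-way replication:    {replication_overhead:.1f}x, tolerates 2 lost copies")
```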

Detect missing fragments.
Fetch sufficient remaining fragments (data + parity).
Apply mathematical reconstruction (e.g., Reed-Solomon) to recover the missing fragments.
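The first two steps above (detect what is missing, then pick enough survivors) can be sketched as a read planner. The fragment numbering (0..5 for data, 6..8 for parity) and the name `plan_read` are assumptions made for illustration; the decode step itself is left to the erasure coding library.

```python
K, M = 6, 3  # (6+3) scheme: fragment ids 0..5 are data, 6..8 are parity


def plan_read(available):
    """Return (fragment ids to fetch, whether decoding is needed).

    `available` is the set of fragment ids whose nodes responded.
    """
    data_ids = [i for i in range(K) if i in available]
    if len(data_ids) == K:
        # Normal read: all data fragments are present, no reconstruction.
        return data_ids, False
    # Recovery read: top up with parity fragments until K fragments are chosen.
    parity_ids = [i for i in range(K, K + M) if i in available]
    chosen = (data_ids + parity_ids)[:K]
    if len(chosen) < K:
        raise ValueError(f"only {len(chosen)} fragments available, need {K}")
    return chosen, True


print(plan_read(set(range(9))))            # all nodes healthy
print(plan_read({0, 2, 3, 4, 5, 6, 7, 8})) # fragment 1 (D2) is missing
```

Preferring data fragments keeps the common path free of decoding work; parity is fetched only when a data fragment is unavailable.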
