Methodology

How this archive works.

kakao.io, the Open Cacao Index, is an independent, non-commercial open-data project about cacao as a crop: the tree Theobroma cacao, its growing regions, and the people and practices behind it. It is a reference archive, not a shop, a directory, or a promotional site.

The unit of data

The basic record in this archive is the origin: a defined cacao-producing place, which may be a country, a region, or a more specific area. Each origin is given a stable identifier that does not change even if its name, description, or classification is later revised. Stable IDs let records be cited, linked, and corrected over time without breaking references.
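As a rough illustration of why a stable identifier helps, the sketch below keeps an origin's ID separate from its display name, so a rename leaves existing references intact. The Origin class, its field names, the rename helper, and the ID format "org-0001" are all hypothetical and are not the archive's actual schema.

```python
from dataclasses import dataclass, replace


# Hypothetical record shape; the archive's real field names may differ.
@dataclass(frozen=True)
class Origin:
    id: str     # stable identifier, never reassigned
    name: str   # display name, may be revised later
    level: str  # "country", "region", or "area"


def rename(origin: Origin, new_name: str) -> Origin:
    """Return a revised copy with the same ID and a new name."""
    return replace(origin, name=new_name)


original = Origin(id="org-0001", name="Example Valley", level="region")
revised = rename(original, "Example River Valley")

# A reference stored by ID still resolves after the rename.
references = {"org-0001": "cited in a hypothetical report"}
print(revised.id in references)
```

Keying references by ID rather than by name is what allows a name, description, or classification to be corrected without touching anything that points at the record.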

Classification

Origins are described using the modern genetic-cluster framework rather than the older Criollo / Forastero / Trinitario model. Where a traditional or commercial term is useful, it is recorded as such and clearly marked as historical or trade vocabulary, not as a genetic fact.
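One way to keep genetic classification and trade vocabulary apart is to store them in separate fields, with every traditional term carrying an explicit label. The sketch below assumes this layout; the class names, field names, and the cluster value "example-cluster" are placeholders, not the archive's real schema or data.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical structure; names and values are placeholders.
@dataclass
class TermNote:
    term: str
    kind: str  # "historical" or "trade"; never treated as genetic


@dataclass
class Classification:
    genetic_cluster: str
    vocabulary: List[TermNote] = field(default_factory=list)


def describe(c: Classification) -> str:
    """Render the cluster first, then any labelled vocabulary."""
    text = f"cluster: {c.genetic_cluster}"
    if not c.vocabulary:
        return text
    notes = ", ".join(f"{t.term} ({t.kind} term)" for t in c.vocabulary)
    return f"{text}; recorded vocabulary: {notes}"


entry = Classification(
    genetic_cluster="example-cluster",
    vocabulary=[TermNote(term="Forastero", kind="historical")],
)
print(describe(entry))
```

Because the label travels with the term, a reader or a script cannot mistake a trade name for a statement about genetics.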

Status of the data

Every record in this archive is currently a draft pending source verification. Entries may be incomplete, provisional, or wrong, and they should not yet be treated as authoritative. As records are checked against published sources, their status will be updated.

Format and licence

The data is kept as plain JSON so it is easy to read, validate, and reuse. It is intended for release under the Creative Commons Attribution 4.0 (CC BY 4.0) licence, allowing reuse with attribution.
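Plain JSON can be read and checked with nothing beyond a standard library. The following is a minimal sketch using Python's json module; the record, its keys ("id", "name", "level", "status"), and the check_record helper are assumptions made for illustration and do not describe the archive's published schema.

```python
import json

# Hypothetical record; keys and values are illustrative only.
RAW = """
{
  "id": "org-0001",
  "name": "Example Valley",
  "level": "region",
  "status": "draft"
}
"""

REQUIRED_KEYS = {"id", "name", "level", "status"}
ALLOWED_LEVELS = {"country", "region", "area"}


def check_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    missing = REQUIRED_KEYS - set(record)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if record.get("level") not in ALLOWED_LEVELS:
        problems.append(f"unknown level: {record.get('level')!r}")
    return problems


record = json.loads(RAW)
print(check_record(record))
```

A check like this only confirms the shape of a record. It says nothing about whether the content has been verified against sources.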

Corrections

Corrections and additions are welcome. This is an early-stage archive maintained in the open, and review from people with field, trade, or research experience is valued.