For researchers, journalists, and educators
How to use this data
Open Cacao Index is an independent, non-commercial open-data archive of 124 cacao origins across 47 producing countries, with explicit provenance status and source citations on every record.
License — CC BY-NC-SA 4.0
Use freely for research, journalism, and education. Attribution required (see "How to cite" below). Derivative works must use the same licence. See full licence terms.
How to cite
Academic (recommended)
Open Cacao Index contributors (2026). Open Cacao Index (Version 0.10.0) [Dataset]. https://kakao.io
BibTeX
@dataset{open_cacao_index_2026, author = {{Open Cacao Index contributors}}, title = {Open Cacao Index}, year = {2026}, version = {0.10.0}, publisher = {Open Cacao Index}, url = {https://kakao.io}}
Journalism / informal
Open Cacao Index, kakao.io
Download
- Dataset (JSON) — 124 origin records
- Country overviews (JSON) — 47 countries
- Glossary (JSON) — 44 terms
- Origin record JSON Schema
- Cacao genetics JSON Schema (draft)
- Sitemap (XML) · RSS feed · llms.txt
Data schema
Each origin record has a stable id, plus name, country, genetic_groups, traditional_class, and optional coordinates, elevation_m, harvest_season, fermentation, drying, flavor_profile, summary, certifications, status, sources, updated_at. See the JSON Schema for the canonical definition.
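As a quick sanity check before working with the records, the field list above can be turned into a small helper. This is a minimal sketch, not the canonical definition: the split into core and other fields is an assumption read from the wording above (status and sources may well be present on every record), the helper names are hypothetical, and the sample record is a made-up placeholder. The JSON Schema remains the authority.

```python
# Field names come from the schema description above. The core/other split is an
# assumption based on that wording; the JSON Schema decides what is truly required.
CORE_FIELDS = ("id", "name", "country", "genetic_groups", "traditional_class")
OTHER_FIELDS = (
    "coordinates", "elevation_m", "harvest_season", "fermentation", "drying",
    "flavor_profile", "summary", "certifications", "status", "sources", "updated_at",
)


def missing_core(record):
    """Return the core field names that are absent from a record."""
    return [field for field in CORE_FIELDS if field not in record]


def unknown_fields(record):
    """Return record keys that appear in neither field list, sorted by name."""
    known = set(CORE_FIELDS) | set(OTHER_FIELDS)
    return sorted(key for key in record if key not in known)


# Made-up placeholder record, for illustration only (not real dataset content).
sample = {
    "id": "example-origin",
    "name": "Example Origin",
    "country": "Exampleland",
    "genetic_groups": ["Nacional"],
    "extra_note": "not in the field list",
}

print(missing_core(sample))    # ['traditional_class']
print(unknown_fields(sample))  # ['extra_note']
```

For strict validation, a JSON Schema validator run against the published schema would be the more reliable route; this helper only catches missing or unexpected keys.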
Example — count by genetic cluster (shell + jq)
curl -s https://kakao.io/data/cacao-origins.json | jq '.origins | map(.genetic_groups[]) | group_by(.) | map({(.[0]): length}) | add'
Example — Python
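import json
import urllib.request

# Fetch the full dataset and print every origin tagged with the Nacional genetic group.
with urllib.request.urlopen("https://kakao.io/data/cacao-origins.json") as resp:
    data = json.load(resp)
for o in data["origins"]:
    if "Nacional" in o.get("genetic_groups", []):
        print(o["id"], o["name"], o["country"])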
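The jq count shown earlier can also be written in Python. The sketch below assumes the same top-level "origins" list with a genetic_groups array per record. The function name and the sample records are hypothetical placeholders, and the group labels in the sample are illustrative rather than taken from the dataset.

```python
from collections import Counter


def count_by_genetic_group(origins):
    """Count how many records list each genetic group."""
    counts = Counter()
    for origin in origins:
        # A record that lists several groups contributes one count to each.
        counts.update(origin.get("genetic_groups", []))
    return dict(counts)


# Made-up placeholder records; real use would pass data["origins"] instead.
sample_origins = [
    {"id": "example-a", "genetic_groups": ["Nacional"]},
    {"id": "example-b", "genetic_groups": ["Nacional", "unknown"]},
    {"id": "example-c"},
]

print(count_by_genetic_group(sample_origins))  # {'Nacional': 2, 'unknown': 1}
```

Two small differences from the jq version: a record without genetic_groups is counted as having none rather than raising an error, and the keys come out in first-seen order rather than sorted.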
Provenance status
Each record carries a status:
- draft: unverified
- reviewed: corroborated against cited sources
- verified: cross-checked against primary sources (reserved)
Conservative genetic assignments (admixture / unknown) are preferred over confident guesses. Approximate coordinates carry coordinates.approx: true.
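Users who want only corroborated records, or only precisely located ones, can filter on these fields. The following is a minimal sketch under stated assumptions: status is a plain string on each record, approx is a boolean key inside the coordinates object, and a missing approx flag means the coordinates are not approximate. The function names and sample records are hypothetical.

```python
def by_status(origins, status):
    """Return the records whose status equals the given value."""
    return [o for o in origins if o.get("status") == status]


def has_exact_coordinates(origin):
    """True when coordinates exist and are not flagged as approximate."""
    coords = origin.get("coordinates")
    if coords is None:
        return False
    # Assumption: a missing 'approx' key means the coordinates are not approximate.
    return not coords.get("approx", False)


# Made-up placeholder records; other coordinate keys are omitted on purpose.
sample_origins = [
    {"id": "example-a", "status": "draft", "coordinates": {"approx": True}},
    {"id": "example-b", "status": "reviewed", "coordinates": {"approx": False}},
    {"id": "example-c", "status": "reviewed"},
]

reviewed = by_status(sample_origins, "reviewed")
print([o["id"] for o in reviewed])  # ['example-b', 'example-c']
print([o["id"] for o in sample_origins if has_exact_coordinates(o)])  # ['example-b']
```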
Commercial use
Republishing in paid products or commercial integration requires a separate licence.
Errors and contributions
Corrections welcome. Submit the origin ID, the correction, and a citation.