Engineering Case Study

From Crawler to Culinary Intelligence.

We built an extensive crawler to gather 20,000+ recipes, then layered intelligent APIs to structure, analyze, and unlock that data for the modern culinary world.

The Foundation

Order from Entropy.

Our raw database holds terabytes of unstructured culinary text. To make it useful, we architected a distributed processing pipeline.

The system pulls raw records from our archives, normalizes inconsistent units (for example, converting "cups" to grams, which requires a per-ingredient density), and stabilizes the schema. It turns static text into a queryable dataset ready for application use.
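A minimal sketch of the unit-normalization idea, not the pipeline's actual code. The tables, their approximate values, and the names `toGrams` and `lookupNumber` are assumptions made for illustration. The point it shows is that a volume unit can only become grams once the ingredient's density is known.

```typescript
// Hypothetical sketch: convert a volume measure to grams.
// All table values are approximate and for illustration only.
const ML_PER_UNIT: Record<string, number> = {
  cup: 236.588, // US customary cup
  tbsp: 14.787,
  tsp: 4.929,
};

// Assumed densities in grams per milliliter.
const GRAMS_PER_ML: Record<string, number> = {
  water: 1.0,
  "olive oil": 0.91,
};

// Own-property lookup so that keys like "toString" are not matched by accident.
function lookupNumber(table: Record<string, number>, key: string): number | null {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : null;
}

function toGrams(quantity: number, unit: string, ingredient: string): number | null {
  const ml = lookupNumber(ML_PER_UNIT, unit);
  const density = lookupNumber(GRAMS_PER_ML, ingredient);
  // Return null rather than guessing when the unit or ingredient is unknown.
  if (ml === null || density === null) return null;
  return quantity * ml * density;
}
```

Under these assumed values, `toGrams(1, "cup", "olive oil")` comes out at roughly 215 g, while an unknown unit or ingredient yields `null` so the record can be routed for review instead of silently mis-converted.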

20k+ Processed
99.8% Accuracy
40ms Latency
processor_node_04 — zsh
~ queue:start --batch=2104
FETCH job_id: 8821 READY
├─ Loading raw JSON blob...
├─ Detected 14 ingredients
├─ Detected 6 steps
└─ Image optimization... Done
ENRICH nutrition_engine.ts
├─ Mapping to USDA DB...
├─ Calculating macros...
Output generated.
FINAL OUTPUT
Neapolitan Pizza

ID: 8821 45m
Verified
Calories 860 kcal
Carbohydrates 102g
Protein 34g

The Challenge: Beyond Raw Data

With a robust API delivering 20,000+ recipes, the next frontier was understanding them. Raw ingredient lists, even with images, are static. We needed to extract scientific insights from everyday cooking.

  • Text to Nutrition: Translating "1/2 cup almond flour" into accurate macro and micronutrient profiles, considering preparation methods and precise quantities.
  • Allergen Precision: Identifying hidden allergens in ambiguous ingredient names or processed foods, crucial for user safety and compliance.
  • Real-time Analysis: Delivering nutritional breakdowns fast enough for interactive applications, without adding noticeable latency for the user.
Problem: Unanalyzed Recipe

Recipe ID: 12345

Title: Mediterranean Salad

Ingredients: "Lettuce, tomatoes, cucumber, feta cheese, olive oil..."

Nutrition: Unknown

Allergens: Undetected

Data Value Gap

A library of recipes, however vast, provides limited value without deeper analysis.

The Solution: The Nutritional API Layer

Building upon our robust recipe data API, we developed a sophisticated Nutritional API that transforms raw ingredients into actionable scientific data.

1. NLP Ingredient Parsing

Our custom NLP engine parses each ingredient line into four fields: Quantity, Unit, Ingredient Name, and Preparation. A string such as "2 lg eggs" becomes structured data.
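To make the four target fields concrete, here is a rough sketch that uses simple tokenizing in place of the NLP engine. It is a stand-in, not the engine itself. The `ParsedIngredient` shape, the small `KNOWN_UNITS` vocabulary, and the choice to put the size abbreviation "lg" in the unit slot are all assumptions.

```typescript
interface ParsedIngredient {
  quantity: number | null;
  unit: string | null;
  name: string;
  preparation: string | null;
}

// Small assumed vocabulary; "lg" (a size) is kept in the unit slot for simplicity.
const KNOWN_UNITS: string[] = ["cup", "cups", "tbsp", "tsp", "g", "lg", "large"];

function parseQuantity(token: string): number | null {
  // Accepts "2", "0.5", or a simple fraction such as "1/2".
  const frac = /^(\d+)\/(\d+)$/.exec(token);
  if (frac) {
    const denom = Number(frac[2]);
    return denom === 0 ? null : Number(frac[1]) / denom;
  }
  return /^\d+(\.\d+)?$/.test(token) ? Number(token) : null;
}

function parseIngredient(line: string): ParsedIngredient {
  // Text after the first comma is treated as the preparation note.
  const commaAt = line.indexOf(",");
  const head = commaAt === -1 ? line : line.slice(0, commaAt);
  const prep = commaAt === -1 ? "" : line.slice(commaAt + 1).trim();
  const tokens = head.trim().split(/\s+/);

  let quantity: number | null = null;
  let unit: string | null = null;
  // Only consume a leading token when something remains to serve as the name.
  if (tokens.length > 1) {
    quantity = parseQuantity(tokens[0]);
    if (quantity !== null) tokens.shift();
  }
  if (tokens.length > 1 && KNOWN_UNITS.indexOf(tokens[0].toLowerCase()) !== -1) {
    unit = tokens.shift()!.toLowerCase();
  }
  return { quantity, unit, name: tokens.join(" "), preparation: prep === "" ? null : prep };
}
```

With this sketch, "1/2 cup almond flour" yields quantity 0.5, unit "cup", and name "almond flour", and "1 cup onions, chopped" additionally yields the preparation "chopped". Free-form lines such as "salt to taste" are exactly where rule-based parsing of this kind falls short and a trained model earns its keep.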

2. USDA Micro-Mapping

Parsed ingredients are cross-referenced with the USDA FoodData Central database, using fuzzy matching and vector similarity to map each one to the right scientific nutritional profile.
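The vector-similarity half of that mapping can be sketched as a cosine-similarity search over candidate entries. This is an illustration under assumptions: the `FoodCandidate` shape, the `bestMatch` function, and the `minScore` threshold are hypothetical, and the embedding vectors are assumed to come from a model that is not shown here.

```typescript
interface FoodCandidate {
  description: string; // a food description from the reference database
  vector: number[];    // embedding produced elsewhere
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the highest-scoring candidate, or null if none reaches minScore.
function bestMatch(
  query: number[],
  candidates: FoodCandidate[],
  minScore: number
): FoodCandidate | null {
  let best: FoodCandidate | null = null;
  let bestScore = minScore;
  for (const candidate of candidates) {
    const score = cosineSimilarity(query, candidate.vector);
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Returning `null` below the threshold matters as much as finding the best match: an ingredient with no confident mapping is better flagged for review than assigned the nutrition profile of the nearest unrelated food.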

3. Dynamic Analysis

All nutritional data, including macro/micro splits and comprehensive allergen flags, is then instantly available via our API, providing a dynamic layer of intelligence over our recipe database.
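As a simplified picture of what an allergen flag involves, the sketch below matches whole words in ingredient names against a keyword table. The table and the `flagAllergens` function are assumptions for illustration and are far from exhaustive. Keyword matching alone will not catch allergens hidden in processed foods.

```typescript
// Assumed keyword-to-allergen pairs; a real table would be much larger.
const ALLERGEN_KEYWORDS: Array<[string, string]> = [
  ["feta", "milk"],
  ["cheese", "milk"],
  ["parmesan", "milk"],
  ["almond", "tree nuts"],
  ["almonds", "tree nuts"],
  ["egg", "eggs"],
  ["eggs", "eggs"],
];

function flagAllergens(ingredientNames: string[]): string[] {
  const found: string[] = [];
  for (const name of ingredientNames) {
    // Compare whole words so that "eggplant" is not flagged as egg.
    const words = name.toLowerCase().split(/[^a-z]+/);
    for (const [keyword, allergen] of ALLERGEN_KEYWORDS) {
      if (words.indexOf(keyword) !== -1 && found.indexOf(allergen) === -1) {
        found.push(allergen);
      }
    }
  }
  return found.sort();
}
```

Run against the Mediterranean Salad ingredients shown earlier (lettuce, tomatoes, cucumber, feta cheese, olive oil), this flags milk because of the feta.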

Hardship #1: "The Salt Problem"

Sodium content varies wildly. Kosher salt and table salt have different densities, so the same teaspoon can carry very different amounts of sodium. A parser error here could label a dish "Heart Healthy" when it's dangerous.

Fix: Implemented specific density lookups for 15 varieties of salt, defaulting to the most conservative (highest sodium) estimate when ambiguous to ensure safety.
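The conservative-default rule can be sketched as follows. The three-entry table and its gram weights are rough illustrative values, not our 15-variety lookup, and `sodiumMg` is a hypothetical name. The sodium fraction reflects the fact that sodium makes up roughly 39.3% of sodium chloride by mass.

```typescript
// Approximate grams per teaspoon; illustrative values only.
const SALT_GRAMS_PER_TSP: Record<string, number> = {
  "table salt": 6.0,
  "kosher salt": 3.0,
  "flaky sea salt": 2.5,
};

// Sodium is about 39.3% of sodium chloride by mass.
const SODIUM_FRACTION = 0.393;

function sodiumMg(teaspoons: number, saltName: string): number {
  const key = saltName.toLowerCase().trim();
  let gramsPerTsp = 0;
  if (Object.prototype.hasOwnProperty.call(SALT_GRAMS_PER_TSP, key)) {
    gramsPerTsp = SALT_GRAMS_PER_TSP[key];
  } else {
    // Ambiguous or unknown variety: assume the densest salt in the table,
    // which gives the highest sodium estimate.
    for (const name in SALT_GRAMS_PER_TSP) {
      gramsPerTsp = Math.max(gramsPerTsp, SALT_GRAMS_PER_TSP[name]);
    }
  }
  return teaspoons * gramsPerTsp * SODIUM_FRACTION * 1000;
}
```

With these assumed values, a bare "salt" is treated like table salt, so the estimate errs on the high side rather than understating sodium.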

Hardship #2: Image Sync

Our extensive crawling often resulted in broken image links or low-res thumbnails, degrading the visual experience of our API.

Fix: Built a tiered fallback system. If the Open Graph (OG) image fails, we query a verified image API, and as a last resort, generate a high-quality placeholder based on the recipe category (e.g., "Soup"). This ensures our API always delivers rich media.
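The tiering logic can be sketched as an ordered list of sources with a guaranteed final placeholder. Everything named here (`ImageSource`, `resolveImage`, `placeholderFor`, and the placeholder path scheme) is hypothetical, and the sources are written as synchronous functions to keep the example short. A real implementation would make network calls and be asynchronous.

```typescript
// Each tier returns a usable image URL, or null if it has nothing to offer.
type ImageSource = (recipeId: number) => string | null;

function placeholderFor(category: string): string {
  // Hypothetical placeholder path derived from the recipe category.
  const slug = category.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return "/placeholders/" + slug + ".png";
}

function resolveImage(recipeId: number, category: string, sources: ImageSource[]): string {
  for (const source of sources) {
    let url: string | null = null;
    try {
      url = source(recipeId);
    } catch (err) {
      url = null; // a failing tier falls through to the next one
    }
    if (url !== null && url !== "") return url;
  }
  // Last resort: always return something renderable.
  return placeholderFor(category);
}
```

Because the function always returns a string, callers never have to handle a missing image. The order of `sources` encodes the preference, for example the OG image lookup first and the verified image API second.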

Roadblocks & Breakthroughs

Building Recipe Base wasn't a straight line. We encountered edge cases that broke our initial models across both recipe acquisition and nutritional analysis.

One significant hurdle was fuzzy matching. Users type "Parm", "Parmesan", or "Parmigiano-Reggiano" for the same ingredient, and exact string matching failed on these variants.

"We had to implement a custom embedding model to understand that 'EVOO' and 'Extra Virgin Olive Oil' are semantically identical in a culinary context."

This attention to detail allows us to achieve >98% accuracy in nutritional labeling, far surpassing standard "keyword" searches, making our recipe database truly intelligent.

Capabilities: Recipes & Nutrition

Two powerful APIs working in synergy to provide comprehensive culinary data.

The Recipe API

Access our curated database of 20,000+ recipes, complete with high-resolution images, ingredients, and instructions, all easily consumable via our API.

20,000+ Items | High-Res Images | Structured Data

The Nutrition API

Go beyond calories. Our API calculates full macro/micronutrient profiles, generates allergen flags, and audits compliance for any ingredient or recipe in real time.

12 mg Iron
800 mg Calcium
90 mg Vit C

Explore the Culinary Data APIs

Dive into our extensive recipe database or analyze any ingredient with our nutrition engine.