Hybrid Data Sources (JSON + database)

Hybrid sources let you query JSON files as if they were database tables, and join them directly with database entities using familiar ObjectQuel syntax.

explanation

Querying JSON Sources

JSON sources support filtering, sorting, and property selection:

// Filter and sort
$results = $entityManager->executeQuery("
    range of product is json_source('catalog/products.json')
    retrieve (product) where product.category = :category
    sort by product.price asc
", ['category' => 'electronics']);

// Select specific properties
$results = $entityManager->executeQuery("
    range of product is json_source('data/products.json')
    retrieve (product.name, product.price) where product.id = :productId
", ['productId' => 15]);

JSON File Structure

JSON files must contain an array of objects with a consistent structure:

[
  { "id": 1, "name": "Wireless Headphones", "price": 99.99, "category": "electronics", "inStock": true,  "tags": ["wireless", "audio", "bluetooth"] },
  { "id": 2, "name": "Gaming Mouse",        "price": 59.99, "category": "electronics", "inStock": false, "tags": ["gaming", "precision", "rgb"] },
  { "id": 3, "name": "Coffee Mug",          "price": 12.99, "category": "kitchen",     "inStock": true,  "tags": ["ceramic", "dishwasher-safe"] }
]

JSONPath Prefiltering

The second parameter of json_source() accepts a JSONPath expression (powered by softcreatr/jsonpath), filtering the data before it reaches the query processor. Use it to reduce memory usage on large files, or to navigate into nested structures that don't start with a flat array:

// Only error entries are loaded into memory
$results = $entityManager->executeQuery("
    range of log is json_source('logs/massive_log.json', '$[?(@.level == \"ERROR\")]')
    retrieve (log) where log.timestamp > :since
", ['since' => '2025-06-01']);

// Navigate into a nested structure
$results = $entityManager->executeQuery("
    range of product is json_source('catalog/products.json', '$.products[?(@.category == \"electronics\")]')
    retrieve (product) where product.price > :minPrice
", ['minPrice' => 50.00]);

// Flatten nested arrays
$results = $entityManager->executeQuery("
    range of order is json_source('orders/daily.json', '$.orders[*].items[*]')
    retrieve (order) where order.quantity > :minQty
", ['minQty' => 2]);

JSONPath Expression Reference

Expression Description
$.products[*] All items in a named array
$[?(@.status == "active")] Filter by field value
$.data.items[?(@.price > 100)] Filter on a nested field
$[?(@.tags[*] == "featured")] Filter by array contents
$.results[0:10] Slice a range of results
$..items[?(@.inStock)] Recursive search with filter

Combining Database and JSON Sources

JSON sources can be joined with database entities. The join type is inferred automatically from the WHERE clause — you never declare it explicitly.

WHERE clause Join type Effect
No condition linking the two ranges Cross Cartesian product
Join condition only Left All database rows returned; unmatched JSON fields are null
Join condition + scalar filter on a JSON field, or IS NOT NULL on a JSON field Inner Only database rows with a matching JSON record are returned
// Left join: unmatched orders still appear, with null product fields
$results = $entityManager->executeQuery("
    range of order is App\\Entity\\OrderEntity
    range of product is json_source('external/product_catalog.json')
    retrieve (order, product) where order.productId = product.id
    and order.status = :status
    sort by order.orderDate desc
", ['status' => 'pending']);

// Inner join: scalar filter on JSON field — only orders with an in-stock product
$results = $entityManager->executeQuery("
    range of order is App\\Entity\\OrderEntity
    range of product is json_source('external/product_catalog.json')
    retrieve (order, product) where order.productId = product.id
    and product.inStock = true
");

// Inner join: IS NOT NULL — only orders where a product record exists
$results = $entityManager->executeQuery("
    range of order is App\\Entity\\OrderEntity
    range of product is json_source('external/product_catalog.json')
    retrieve (order, product) where order.productId = product.id
    and product.name is not null
");

// Multiple JSON sources
$results = $entityManager->executeQuery("
    range of user is App\\Entity\\UserEntity
    range of profile is json_source('profiles/user_profiles.json')
    range of settings is json_source('settings/app_settings.json')
    retrieve (user, profile.displayName, settings.colorScheme)
    where user.userId = profile.user_id
    and profile.theme = settings.theme_name
    and user.active = true
");
Note: The via syntax for defining relationships is not supported for JSON sources.

Pattern Matching and Array Queries

JSON sources support ObjectQuel's pattern matching and array membership operators:

// Wildcard pattern matching
$results = $entityManager->executeQuery("
    range of log is json_source('logs/access_logs.json')
    retrieve (log) where log.userAgent = 'Mozilla*'
    and log.responseCode = 200
");

// Value within an array field
$results = $entityManager->executeQuery("
    range of product is json_source('data/products.json')
    retrieve (product) where 'wireless' in product.tags
");

Best Practices

  • Keep files focused: Aim for files under 10MB; larger datasets should be imported into database tables with proper indexes
  • Prefilter with JSONPath: Always prefilter large files or nested structures to avoid loading unnecessary data into memory
  • Index the join key: The database field used as the join key should be indexed, as the SQL stage runs first and its result set drives the correlation pass
  • Consistent schema: Maintain a uniform object structure within each JSON file
  • Validate paths: Sanitize file path inputs to prevent directory traversal attacks
  • Version control your data: Include JSON data files in your backup and version control systems