Hybrid Data Sources (JSON + database)
Hybrid sources let you query JSON files as if they were database tables, and join them directly with database entities using familiar ObjectQuel syntax.
Querying JSON Sources
JSON sources support filtering, sorting, and property selection:
// Filter and sort
$results = $entityManager->executeQuery("
range of product is json_source('catalog/products.json')
retrieve (product) where product.category = :category
sort by product.price asc
", ['category' => 'electronics']);
// Select specific properties
$results = $entityManager->executeQuery("
range of product is json_source('data/products.json')
retrieve (product.name, product.price) where product.id = :productId
", ['productId' => 15]);
JSON File Structure
JSON files must contain an array of objects with a consistent structure:
[
{ "id": 1, "name": "Wireless Headphones", "price": 99.99, "category": "electronics", "inStock": true, "tags": ["wireless", "audio", "bluetooth"] },
{ "id": 2, "name": "Gaming Mouse", "price": 59.99, "category": "electronics", "inStock": false, "tags": ["gaming", "precision", "rgb"] },
{ "id": 3, "name": "Coffee Mug", "price": 12.99, "category": "kitchen", "inStock": true, "tags": ["ceramic", "dishwasher-safe"] }
]
JSONPath Prefiltering
The second parameter of json_source() accepts a JSONPath expression (powered by softcreatr/jsonpath), filtering the data before it reaches the query processor. Use it to reduce memory usage on large files, or to navigate into nested structures that don't start with a flat array:
// Only error entries are loaded into memory
$results = $entityManager->executeQuery("
range of log is json_source('logs/massive_log.json', '$[?(@.level == \"ERROR\")]')
retrieve (log) where log.timestamp > :since
", ['since' => '2025-06-01']);
// Navigate into a nested structure
$results = $entityManager->executeQuery("
range of product is json_source('catalog/products.json', '$.products[?(@.category == \"electronics\")]')
retrieve (product) where product.price > :minPrice
", ['minPrice' => 50.00]);
// Flatten nested arrays
$results = $entityManager->executeQuery("
range of order is json_source('orders/daily.json', '$.orders[*].items[*]')
retrieve (order) where order.quantity > :minQty
", ['minQty' => 2]);
JSONPath Expression Reference
| Expression | Description |
|---|---|
$.products[*] |
All items in a named array |
$[?(@.status == "active")] |
Filter by field value |
$.data.items[?(@.price > 100)] |
Filter on a nested field |
$[?(@.tags[*] == "featured")] |
Filter by array contents |
$.results[0:10] |
Slice a range of results |
$..items[?(@.inStock)] |
Recursive search with filter |
Combining Database and JSON Sources
JSON sources can be joined with database entities. The join type is inferred automatically from the WHERE clause — you never declare it explicitly.
| WHERE clause | Join type | Effect |
|---|---|---|
| No condition linking the two ranges | Cross | Cartesian product |
| Join condition only | Left | All database rows returned; unmatched JSON fields are null |
Join condition + scalar filter on a JSON field, or IS NOT NULL on a JSON field |
Inner | Only database rows with a matching JSON record are returned |
// Left join: unmatched orders still appear, with null product fields
$results = $entityManager->executeQuery("
range of order is App\\Entity\\OrderEntity
range of product is json_source('external/product_catalog.json')
retrieve (order, product) where order.productId = product.id
and order.status = :status
sort by order.orderDate desc
", ['status' => 'pending']);
// Inner join: scalar filter on JSON field — only orders with an in-stock product
$results = $entityManager->executeQuery("
range of order is App\\Entity\\OrderEntity
range of product is json_source('external/product_catalog.json')
retrieve (order, product) where order.productId = product.id
and product.inStock = true
");
// Inner join: IS NOT NULL — only orders where a product record exists
$results = $entityManager->executeQuery("
range of order is App\\Entity\\OrderEntity
range of product is json_source('external/product_catalog.json')
retrieve (order, product) where order.productId = product.id
and product.name is not null
");
// Multiple JSON sources
$results = $entityManager->executeQuery("
range of user is App\\Entity\\UserEntity
range of profile is json_source('profiles/user_profiles.json')
range of settings is json_source('settings/app_settings.json')
retrieve (user, profile.displayName, settings.colorScheme)
where user.userId = profile.user_id
and profile.theme = settings.theme_name
and user.active = true
");
via syntax for defining relationships is not supported for JSON sources.
Pattern Matching and Array Queries
JSON sources support ObjectQuel's pattern matching and array membership operators:
// Wildcard pattern matching
$results = $entityManager->executeQuery("
range of log is json_source('logs/access_logs.json')
retrieve (log) where log.userAgent = 'Mozilla*'
and log.responseCode = 200
");
// Value within an array field
$results = $entityManager->executeQuery("
range of product is json_source('data/products.json')
retrieve (product) where 'wireless' in product.tags
");
Best Practices
- Keep files focused: Aim for files under 10MB; larger datasets should be imported into database tables with proper indexes
- Prefilter with JSONPath: Always prefilter large files or nested structures to avoid loading unnecessary data into memory
- Index the join key: The database field used as the join key should be indexed, as the SQL stage runs first and its result set drives the correlation pass
- Consistent schema: Maintain a uniform object structure within each JSON file
- Validate paths: Sanitize file path inputs to prevent directory traversal attacks
- Version control your data: Include JSON data files in your backup and version control systems