Financial Data Aggregation Platform
A sophisticated financial data aggregation platform for the precious metals industry, intelligently combining data from 10 external providers to deliver stable, accurate pricing across 10 commodities and 14 currencies whilst gracefully handling provider failures.
The Business Challenge
E-commerce businesses selling precious metals require accurate pricing data to operate, but relying on a single data provider creates a critical single point of failure. Individual providers can experience downtime, provide stale data, or return incorrect values. Rate limits, authentication failures, and inconsistent data formats compound the problem.
The platform needed to aggregate data from 6 commodity providers and 4 currency providers, supporting 10 precious metals including gold, silver, platinum, palladium, and specialty metals like rhodium, tellurium, and gallium.
Core Requirements
- Reliability through redundancy: Continue operating when individual providers fail
- Data quality over availability: Use trust factors to prefer accurate data from reliable sources
- Flexible pricing rules: Support premiums, day-based adjustments, time-limited promotions, and emergency overrides
- Historical tracking: Maintain detailed historical data with 15-minute granularity
- Gap filling: Automatically fill missing data points during provider outages or market closures
Individual data providers experience downtime, rate limits, and data quality issues. By aggregating multiple providers with trust factors (1-5 scale), the system can prefer data from the most reliable available source whilst maintaining availability even when high-trust providers fail.
Technical Architecture
The system implements a layered architecture with clear separation between command layer (cron jobs), service layer (business logic), repository layer (data access), and entity layer (domain models). At its core is a factory-based provider architecture with common interface contracts enabling parallel HTTP requests and graceful error handling.
Multi-Provider Aggregation
Each data provider has a corresponding factory class that creates configured provider instances. All providers implement a common interface with clear separation between request and response handling, enabling all providers to initiate requests before any begin processing responses.
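A minimal sketch of what such a contract might look like, with illustrative interface and method names rather than the platform's actual API:

```php
<?php

// Illustrative contract: each provider separates request initiation from
// response handling, so every HTTP request can be dispatched before any
// response is parsed. Names are hypothetical.
interface CommodityPriceFeedInterface
{
    /** Dispatch the HTTP request without blocking on the response. */
    public function initiateRequest(): void;

    /** Parse the response and return prices keyed by commodity code. */
    public function handleResponse(): array;

    /** Trust factor on a 1-5 scale, used for priority selection. */
    public function getTrustFactor(): int;
}

// Each provider has a corresponding factory that builds a configured instance.
interface FeedFactoryInterface
{
    public function create(): CommodityPriceFeedInterface;
}
```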
The system handles provider failures gracefully using isolated error handling. Each provider's request/response cycle is wrapped in exception handling to prevent one failing provider from affecting others.
Provider failures are isolated using try-catch blocks around handleResponse(). When a provider throws an exception, the system logs a critical error but continues processing other providers. This ensures one failing provider does not bring down the entire data collection process.
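A simplified sketch of that isolation pattern, assuming a PSR-3 logger and the illustrative interface above:

```php
<?php

use Psr\Log\LoggerInterface;

// Simplified aggregation loop: every feed dispatches its request first,
// then responses are handled individually so one failure cannot abort
// the whole collection run. Names are illustrative.
function collectPrices(array $feeds, LoggerInterface $logger): array
{
    // Phase 1: initiate all requests before processing any response.
    foreach ($feeds as $feed) {
        $feed->initiateRequest();
    }

    // Phase 2: handle responses with per-provider exception isolation.
    $results = [];
    foreach ($feeds as $name => $feed) {
        try {
            $results[$name] = $feed->handleResponse();
        } catch (\Throwable $e) {
            $logger->critical('Provider failed', [
                'provider'  => $name,
                'exception' => $e->getMessage(),
            ]);
            // Continue with the remaining providers.
        }
    }

    return $results;
}
```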
Trust Factor Priority System
Rather than averaging all provider responses, the system uses values exclusively from the highest-trust provider that returned valid data. This prevents low-quality data from polluting the final price calculation. The system falls back to lower-trust providers only when higher-trust sources fail.
The trust factor priority system differs fundamentally from standard averaging approaches used by other aggregation systems. Understanding this difference is key to appreciating the system's data quality guarantees.
Averaging all provider responses seems intuitive but introduces a critical flaw: a provider returning stale or incorrect data pulls the final price toward that bad value. For example, if four providers return £1,800/oz and one returns £1,600/oz (stale data from yesterday), the average becomes £1,760/oz, an inaccurate result even though four of the five providers were correct.
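A minimal sketch of the highest-trust-wins selection, with an assumed shape for the per-provider response data:

```php
<?php

// Illustrative selection: sort valid responses by trust factor (5 = most
// trusted) and take the price from the first provider that returned a
// valid value for the commodity. No averaging is performed.
function selectPrice(string $commodity, array $responsesByFeed): ?string
{
    // $responsesByFeed: [feedName => ['trust' => int, 'prices' => [commodity => string]]]
    uasort($responsesByFeed, fn (array $a, array $b) => $b['trust'] <=> $a['trust']);

    foreach ($responsesByFeed as $response) {
        if (isset($response['prices'][$commodity])) {
            return $response['prices'][$commodity]; // highest-trust valid value wins
        }
    }

    return null; // no provider returned a valid value
}
```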
Flexible Premium Rules Engine
E-commerce businesses need to apply various adjustments to base commodity prices: standard premiums (markup percentages or fixed amounts), day-specific adjustments (weekend pricing, reduced staff days), time-limited promotions or seasonal adjustments, and emergency price overrides (market volatility, supply issues).
The system implements a chain-of-responsibility pattern for rule processing with four distinct rule types: basic premium rules (always-active adjustments), day-based rules (specific days of the week), time-based rules (date ranges), and override rules (complete replacement). Rules stack appropriately with clear precedence.
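A sketch of how such a rule chain might look; the class names, method signatures, and percentage handling below are assumptions rather than the platform's real implementation:

```php
<?php

// Illustrative chain-of-responsibility: each rule decides whether it applies
// and adjusts (or replaces) the running price. Names are hypothetical.
interface PriceRuleInterface
{
    public function supports(\DateTimeImmutable $now): bool;

    /** @param string $price decimal string, e.g. "1800.00" */
    public function apply(string $price): string;
}

final class DayBasedRule implements PriceRuleInterface
{
    public function __construct(
        private string $dayName,        // e.g. "Saturday"
        private string $premiumPercent, // e.g. "2.5"
    ) {}

    public function supports(\DateTimeImmutable $now): bool
    {
        return $now->format('l') === $this->dayName;
    }

    public function apply(string $price): string
    {
        // price * (1 + premium/100), using bcmath for precision
        $factor = bcadd('1', bcdiv($this->premiumPercent, '100', 6), 6);
        return bcmul($price, $factor, 4);
    }
}

// The engine walks the chain in precedence order; an override rule's apply()
// would simply return its fixed price regardless of the input.
function applyRules(string $basePrice, array $rules, \DateTimeImmutable $now): string
{
    foreach ($rules as $rule) {
        if ($rule->supports($now)) {
            $basePrice = $rule->apply($basePrice);
        }
    }

    return $basePrice;
}
```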
Precision Financial Arithmetic
Floating-point arithmetic introduces rounding errors that compound over thousands of transactions. The system uses Money PHP for money objects and bcmath functions for arbitrary-precision decimal calculations, ensuring accuracy to the cent.
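A small illustration of the difference, assuming the moneyphp/money package and the bcmath extension are installed:

```php
<?php

use Money\Currency;
use Money\Money;

// Native floats accumulate rounding error:
var_dump(0.1 + 0.2 === 0.3);                 // bool(false)

// bcmath works on decimal strings with an explicit scale:
var_dump(bcadd('0.1', '0.2', 2) === '0.30'); // bool(true)

// Money objects store amounts in minor units (pence), so a price of
// £1,800.00 is represented as the integer string "180000".
$price   = new Money('180000', new Currency('GBP'));
$premium = $price->multiply('0.025');        // 2.5% premium
echo $price->add($premium)->getAmount();     // "184500" => £1,845.00
```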
The rules engine also validates configuration to prevent errors. Custom validators ensure business rules remain consistent and conflict-free.
Time-based rules use custom Symfony validators to prevent conflicting rules. The TimeBasedRulesCanNotOverlap constraint ensures no two time-based rules for the same commodity/currency overlap in their date ranges.
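A sketch of how a class-level Symfony constraint in the spirit of TimeBasedRulesCanNotOverlap might be wired; the repository interface and property accessors are assumptions:

```php
<?php

use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\ConstraintValidator;

// Hypothetical repository contract used by the validator.
interface TimeBasedRuleRepository
{
    /** @return array existing rules overlapping the given range */
    public function findOverlapping(string $commodity, string $currency, \DateTimeImmutable $start, \DateTimeImmutable $end): array;
}

#[\Attribute(\Attribute::TARGET_CLASS)]
class TimeBasedRulesCanNotOverlap extends Constraint
{
    public string $message = 'A time-based rule for this commodity and currency already covers part of this date range.';

    public function getTargets(): string
    {
        return self::CLASS_CONSTRAINT;
    }
}

class TimeBasedRulesCanNotOverlapValidator extends ConstraintValidator
{
    public function __construct(private TimeBasedRuleRepository $repository) {}

    public function validate(mixed $rule, Constraint $constraint): void
    {
        // Overlap exists when an existing rule for the same commodity/currency
        // starts before this rule ends and ends after this rule starts.
        $overlapping = $this->repository->findOverlapping(
            $rule->getCommodity(),
            $rule->getCurrency(),
            $rule->getStartDate(),
            $rule->getEndDate(),
        );

        if ($overlapping !== []) {
            $this->context->buildViolation($constraint->message)->addViolation();
        }
    }
}
```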
Rule entities use JOINED table inheritance, enabling shared base functionality whilst allowing each rule type to have specific fields (day name for day-based rules, start/end dates for time-based rules).
Historic Data Management
The system maintains detailed historical price data with 15-minute granularity. Provider failures or market closures create gaps that need filling. Different commodities may need different timestamp normalisation. Historical queries must be efficient for charting and analysis.
Dual Table Architecture
The system maintains two distinct data tables: LatestCommodityPrice tracks current values per provider (one record per provider-commodity pair, updated in place), whilst HistoricCommodityPrice stores aggregated values with timestamp uniqueness (one record per commodity-timestamp pair, never updated).
A single table approach would require date filtering for “latest” queries and create index conflicts between provider+commodity (for latest) and commodity+timestamp (for historic). Separating concerns prevents these conflicts and simplifies queries.
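A sketch of how the two entities and their unique constraints might be mapped with Doctrine attributes; field names and column types are assumptions:

```php
<?php

use Doctrine\ORM\Mapping as ORM;

// Latest prices: one row per provider-commodity pair, updated in place.
#[ORM\Entity]
#[ORM\Table(name: 'latest_commodity_price')]
#[ORM\UniqueConstraint(name: 'uniq_provider_commodity', columns: ['provider', 'commodity'])]
class LatestCommodityPrice
{
    #[ORM\Id, ORM\GeneratedValue, ORM\Column]
    private ?int $id = null;

    #[ORM\Column] private string $provider;
    #[ORM\Column] private string $commodity;
    #[ORM\Column(type: 'decimal', precision: 12, scale: 4)] private string $price;
    #[ORM\Column(type: 'datetime_immutable')] private \DateTimeImmutable $updatedAt;
}

// Historic prices: one row per commodity-timestamp pair, never updated.
#[ORM\Entity]
#[ORM\Table(name: 'historic_commodity_price')]
#[ORM\UniqueConstraint(name: 'uniq_commodity_recorded_at', columns: ['commodity', 'recorded_at'])]
class HistoricCommodityPrice
{
    #[ORM\Id, ORM\GeneratedValue, ORM\Column]
    private ?int $id = null;

    #[ORM\Column] private string $commodity;
    #[ORM\Column(type: 'decimal', precision: 12, scale: 4)] private string $price;
    #[ORM\Column(name: 'recorded_at', type: 'datetime_immutable')] private \DateTimeImmutable $recordedAt;
}
```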
Timestamp Normalisation
Timestamps are normalised to 15-minute boundaries before storage. A price arriving at 14:23 becomes 14:15. This enables database unique constraints on commodity+timestamp, direct chart queries without transformation, and straightforward gap detection.
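A minimal example of that normalisation:

```php
<?php

// Round a timestamp down to the previous 15-minute boundary, so 14:23
// becomes 14:15 and 14:00 stays 14:00.
function normaliseToQuarterHour(\DateTimeImmutable $timestamp): \DateTimeImmutable
{
    $minute = (int) $timestamp->format('i');
    $normalisedMinute = intdiv($minute, 15) * 15;

    return $timestamp->setTime(
        (int) $timestamp->format('H'),
        $normalisedMinute,
        0
    );
}

// Example: 2024-03-07 14:23:42 -> 2024-03-07 14:15:00
echo normaliseToQuarterHour(new \DateTimeImmutable('2024-03-07 14:23:42'))
    ->format('Y-m-d H:i:s');
```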
Gap Filling Service
When providers fail or markets close, the system fills gaps using the last known good value. For each missing timestamp, the service looks up the most recent historic price and replicates it forward to maintain continuous data coverage.
The gap filling service generates 96 records per day (24 hours × 4 quarters). For each 15-minute interval, it checks if a historic record exists. If not, it uses the last known price from earlier in the day (or the previous day if necessary).
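A simplified sketch of that daily pass; the storage callables stand in for the real repository methods:

```php
<?php

// Gap-filling pass for one commodity and one day: walk the 96 quarter-hour
// slots and, where no historic record exists, carry the last known price
// forward. Callables are placeholders for the actual repository operations.
function fillGaps(
    \DateTimeImmutable $dayStart,    // midnight UTC for the day being filled
    callable $hasRecordAt,           // fn(DateTimeImmutable): bool
    callable $lastKnownPriceBefore,  // fn(DateTimeImmutable): ?string
    callable $storeRecord,           // fn(DateTimeImmutable, string): void
): void {
    for ($slot = 0; $slot < 96; $slot++) {         // 24 hours x 4 quarters
        $timestamp = $dayStart->add(new \DateInterval('PT' . ($slot * 15) . 'M'));

        if ($hasRecordAt($timestamp)) {
            continue;                               // real data exists, nothing to do
        }

        $price = $lastKnownPriceBefore($timestamp); // earlier today, or the previous day
        if ($price !== null) {
            $storeRecord($timestamp, $price);       // replicate the last good value forward
        }
    }
}
```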
Graceful Degradation & Reliability
The system implements several patterns to handle real-world operational challenges: weekend and market closure handling, timezone conversion, stale data detection, and extensive logging for observability.
Weekend and Market Closure Handling
Commodity markets close on weekends (typically from Friday evening until Sunday evening). The system must handle this gracefully without flooding logs with errors. Day-aware error handling logs informational messages during expected downtime periods rather than critical errors.
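An illustrative sketch of day-aware logging, assuming a PSR-3 logger; the weekend check is simplified to whole days rather than the Friday-evening-to-Sunday-evening window:

```php
<?php

use Psr\Log\LoggerInterface;

// A missing price on a Saturday or Sunday is expected (markets are closed),
// so it is logged at info level rather than critical.
function logMissingPrice(LoggerInterface $logger, string $commodity, \DateTimeImmutable $now): void
{
    $isWeekend = in_array($now->format('l'), ['Saturday', 'Sunday'], true);

    if ($isWeekend) {
        $logger->info('No price received during expected market closure', [
            'commodity' => $commodity,
            'day'       => $now->format('l'),
        ]);

        return;
    }

    $logger->critical('No price received during market hours', [
        'commodity' => $commodity,
    ]);
}
```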
Another operational challenge involves handling timezone differences across providers. Not all providers report timestamps in UTC, requiring explicit timezone conversion before storage.
One major provider reports timestamps in British Summer Time (BST) during summer months. All internal storage uses UTC. The system explicitly parses timestamps in the provider's local timezone (Europe/London), then converts to UTC before storage.
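A minimal example of that conversion using PHP's built-in date handling:

```php
<?php

// The provider reports timestamps in UK local time (BST in summer, GMT in
// winter); parsing in Europe/London and converting to UTC handles the
// daylight-saving offset automatically.
$raw = '2024-07-15 14:30:00'; // as reported by the provider (BST)

$local = new \DateTimeImmutable($raw, new \DateTimeZone('Europe/London'));
$utc   = $local->setTimezone(new \DateTimeZone('UTC'));

echo $utc->format('Y-m-d H:i:s'); // 2024-07-15 13:30:00
```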
Configuration Management
Provider credentials and endpoints are managed through environment variables, with an abstract configuration base class that validates required settings. This ensures missing configuration is caught immediately at application startup rather than failing during scheduled data collection.
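A sketch of the fail-fast pattern; the class and environment variable names are illustrative:

```php
<?php

// Abstract configuration base class that throws when a required environment
// variable is missing, surfacing the problem at startup rather than mid-run.
abstract class AbstractProviderConfiguration
{
    /** @var array<string, string> */
    protected array $values = [];

    /** @return string[] names of environment variables this provider requires */
    abstract protected function requiredVariables(): array;

    public function __construct()
    {
        foreach ($this->requiredVariables() as $name) {
            $value = $_ENV[$name] ?? getenv($name);

            if ($value === false || $value === '') {
                throw new \RuntimeException(sprintf('Missing required configuration: %s', $name));
            }

            $this->values[$name] = $value;
        }
    }
}

final class ExampleProviderConfiguration extends AbstractProviderConfiguration
{
    protected function requiredVariables(): array
    {
        return ['EXAMPLE_PROVIDER_API_KEY', 'EXAMPLE_PROVIDER_BASE_URL'];
    }

    public function apiKey(): string
    {
        return $this->values['EXAMPLE_PROVIDER_API_KEY'];
    }
}
```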
The system implements extensive logging at every layer: provider request/response cycles, trust factor selection, rule application (with before/after prices), gap filling operations, and duplicate detection. This creates a complete audit trail for financial calculations.
Measurable Outcomes
The platform delivers stable, accurate pricing data whilst gracefully handling provider failures. The architecture choices result in a maintainable, testable codebase with clear separation of concerns.
Code Quality Metrics
- 296 source files - Well-organised codebase with clear layering
- 299 test files - Near 1:1 test-to-code ratio providing confidence in financial calculations
- 11,175 lines of code - Application code (excluding tests and vendor dependencies)
- 20 database migrations - Schema evolution spanning 5 years of development
- 27 API endpoints - RESTful API for consuming applications
The system aggregates data from 6 commodity providers and 4 currency providers, supporting 10 precious metals (gold, silver, platinum, palladium, rhodium, tellurium, ruthenium, rhenium, indium, gallium) across 14 international currencies (USD, EUR, GBP, AUD, JPY, CAD, SGD, MYR, AED, CHF, HKD, CLP, BRL, MXN).
Architectural Benefits
- Resilience: System continues operating when individual providers fail, with trust factor priority ensuring data quality
- Extensibility: New providers can be added by implementing factory and feed interfaces, no changes to core logic required
- Correctness: Precision arithmetic with bcmath functions and thorough testing ensure financial-grade accuracy
- Observability: Detailed logging tracks which providers succeed/fail and which trust level was used for each calculation
- Maintainability: Clear separation of concerns between layers with well-defined interfaces and near 1:1 test coverage
The trust factor priority system is fundamentally different from standard averaging approaches. Rather than treating all providers equally, it recognises that data quality varies and explicitly prioritises reliable sources over mere availability.
The dual storage model (latest + historic) prevents index conflicts and optimises for different access patterns. Timestamp normalisation at storage time simplifies querying and enables database-level duplicate prevention.
Industry Applications
The patterns demonstrated in this case study are applicable across multiple industries requiring multi-source data aggregation, precision calculations, and graceful degradation.