SQL + Cassandra data storage.
Hybrid relational + time-series storage in one platform - structured MySQL tables alongside high-performance Cassandra, without bolting two databases together.

CAPABILITY OVERVIEW
One platform, two databases - no compromise.
Rayven combines MySQL for structured relational data with Apache Cassandra for time-series + event data in a single, unified storage layer.
No ETL pipeline between them. No separate data warehouse. No query tool switch.
Structured records, real-time events, workflow payloads + calculated metrics all coexist in the same platform - governed, queryable + instantly available to dashboards, workflows, AI models + external systems.
Inbound triggers include:
- All workflow execution data (auto-stored in Cassandra by UID + timestamp)
- Primary + Secondary Table records (MySQL)
- Streaming data from IoT, APIs + Kinesis (Cassandra)
- File-ingested + parsed data
- Calculated + aggregated metric outputs
Outbound triggers include:
- Real-time data for dashboards + reports
- Queryable datasets for API endpoints
- Training datasets for ML models
- Raw exports via Node Export + CSV download
- Data feeds for external systems via output nodes

KEY CAPABILITIES
What SQL + Cassandra Data Storage gives you.
MySQL for structured relational data
All Primary Tables, Secondary Tables + entity metadata are stored in MySQL. Supports relational queries, joins + structured data management. Familiar SQL patterns, managed entirely within Rayven - no separate database administration required.
Cassandra for time-series + event data
All workflow payload data is automatically stored in Cassandra, indexed by UID, Node ID + Timestamp. Optimised for high-frequency writes, horizontal scalability + low-latency time-series reads - purpose-built for IoT telemetry, event streams + operational logs at any scale.
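To make the indexing idea concrete, here is a minimal in-memory sketch of a store partitioned by (UID, Node ID) and ordered by timestamp within each partition. It is an illustration only, not Rayven's actual schema or API: the class `TimeSeriesStore`, its method names and the sample identifiers are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import datetime, timedelta, timezone


class TimeSeriesStore:
    """Hypothetical stand-in: rows grouped by (uid, node_id), sorted by timestamp."""

    def __init__(self):
        # One sorted list of timestamps and one parallel list of payloads per partition.
        self._times = defaultdict(list)
        self._payloads = defaultdict(list)

    def write(self, uid, node_id, ts, payload):
        key = (uid, node_id)
        # Insert at the sorted position so reads never need to re-sort.
        i = bisect_right(self._times[key], ts)
        self._times[key].insert(i, ts)
        self._payloads[key].insert(i, payload)

    def read_range(self, uid, node_id, start, end):
        """Return (timestamp, payload) pairs with start <= timestamp < end."""
        key = (uid, node_id)
        times = self._times[key]
        lo = bisect_left(times, start)
        hi = bisect_left(times, end)
        return list(zip(times[lo:hi], self._payloads[key][lo:hi]))


store = TimeSeriesStore()
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
# Readings arrive out of order; the store keeps them in time order.
for minute, temp in [(2, 71.5), (0, 70.1), (1, 70.8)]:
    store.write("pump-17", "node-3", t0 + timedelta(minutes=minute), {"temp_c": temp})

window = store.read_range("pump-17", "node-3", t0, t0 + timedelta(minutes=2))
```

Because each partition is kept in time order, a range read touches only one partition and one contiguous slice of it, which is the access pattern time-series stores are built around.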
Hybrid architecture - unified access
MySQL and Cassandra coexist within the same platform layer. Dashboard widgets, workflow nodes + API endpoints query both via the same interface - no tool switching, no cross-database ETL, no schema synchronisation overhead.
Azure-hosted, geo-redundant
Both databases are hosted on Microsoft Azure with geo-redundant storage. Automated daily backups are retained for 60 days, weekly backups for one year + monthly backups for three years. No single point of failure. Consistent performance regardless of geographic distribution.
Automated backup + retention policies
Daily backups run automatically with 60-day retention by default. Weekly + monthly retention extends historical coverage for compliance + audit. Data Repository node settings configure data retention per workflow.
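The tiered retention windows can be read as a simple rule, sketched below. The 60-day, one-year and three-year windows come from the text; everything else is an assumption made for illustration: that the weekly copy is the one taken on Sunday, that the monthly copy is the one taken on the 1st, and that a year means 365 days. The function name `is_retained` is hypothetical.

```python
from datetime import date

DAILY_DAYS = 60          # daily backups: 60-day retention (from the text)
WEEKLY_DAYS = 365        # weekly backups: one year (assumed to mean 365 days)
MONTHLY_DAYS = 3 * 365   # monthly backups: three years (assumed to mean 1095 days)


def is_retained(backup_day: date, today: date) -> bool:
    """Return True if a backup taken on backup_day would still be kept today."""
    age = (today - backup_day).days
    if age < 0:
        return False  # a backup from the future cannot exist
    if age <= DAILY_DAYS:
        return True
    # Assumption: the Sunday backup is promoted to the weekly tier.
    if backup_day.weekday() == 6 and age <= WEEKLY_DAYS:
        return True
    # Assumption: the backup taken on the 1st is promoted to the monthly tier.
    if backup_day.day == 1 and age <= MONTHLY_DAYS:
        return True
    return False


today = date(2024, 6, 30)
recent = is_retained(date(2024, 6, 1), today)       # inside the 60-day window
old_tuesday = is_retained(date(2024, 3, 5), today)  # older than 60 days, not weekly or monthly
old_sunday = is_retained(date(2024, 3, 3), today)   # kept by the weekly tier
```

Confirm which backups are actually promoted to the weekly and monthly tiers with your service agreement; the promotion days used here are placeholders.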
Instant data availability
Data written to either database is immediately queryable. No refresh cycle, no replication delay, no batch sync. Dashboard auto-refresh operates on live data. API endpoints return current state. AI models train on up-to-date datasets.
HOW IT CONNECTS: EXPLAINER
Where SQL + Cassandra Data Storage fits in the Rayven Platform stack.
The storage layer underpins every other capability in the platform - all data from the Integration Layer is stored here before any downstream use.
- Data from the Integration Layer flows into MySQL (structured records) or Cassandra (event + time-series data) immediately on ingestion.
- The Execution Layer reads from both stores for workflow logic, AI + automation.
- The Presentation Layer queries both for dashboards, reports + custom interfaces.
- API endpoints expose data from either store to external systems on demand.
Data storage is invisible to end users - it simply ensures data is always there, always current, always accessible.
USE CASES
How SQL + Cassandra Data Storage gets used.
Hybrid data model for an industrial AI platform
A mining operator stores asset registry records in MySQL Primary Tables and continuous sensor telemetry in Cassandra indexed per-asset + timestamp. Workflow logic queries MySQL for asset metadata and Cassandra for the latest sensor values within the same execution chain - no cross-database orchestration required.
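As a rough sketch of what that execution chain does logically, the example below looks up asset metadata in a relational-style table and attaches the latest reading from a time-ordered series. Plain dictionaries stand in for MySQL and Cassandra; the asset IDs, field names and the function `enrich_latest` are hypothetical and do not represent Rayven's node API.

```python
from datetime import datetime, timezone

# Stand-in for a MySQL Primary Table: one row per asset.
asset_registry = {
    "truck-01": {"site": "North Pit", "model": "HT-400"},
    "truck-02": {"site": "South Pit", "model": "HT-400"},
}

# Stand-in for Cassandra telemetry: readings per asset UID, oldest first.
telemetry = {
    "truck-01": [
        (datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), {"engine_temp_c": 88.0}),
        (datetime(2024, 1, 1, 8, 5, tzinfo=timezone.utc), {"engine_temp_c": 93.5}),
    ],
    "truck-02": [],
}


def enrich_latest(uid):
    """Merge an asset's registry row with its most recent sensor reading."""
    meta = asset_registry.get(uid)
    if meta is None:
        raise KeyError(f"unknown asset: {uid}")
    readings = telemetry.get(uid, [])
    # Series are stored oldest first, so the last element is the latest reading.
    latest_ts, latest = readings[-1] if readings else (None, {})
    return {"uid": uid, **meta, "timestamp": latest_ts, **latest}


result = enrich_latest("truck-01")
no_data = enrich_latest("truck-02")
```

The merged record carries both the structured metadata and the current sensor value, so downstream logic works with a single payload.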

Financial services BI platform without a data warehouse
A financial services firm stores customer profiles + contract records in MySQL Secondary Tables and transaction event logs in Cassandra. Live dashboard widgets query both simultaneously - structured customer data alongside real-time transaction metrics. No data warehouse, no ETL delay.

Partner-operated multi-client platform on shared infrastructure
An MSP runs multiple clients on a single Rayven instance. Client entity records are stored per-Label in MySQL. Client event data is stored per-UID in Cassandra with Label-based access control. Both stores are isolated by Label - clients access only their own data. One infrastructure, full data governance.
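Label-based isolation amounts to filtering every read by the Labels a user is entitled to. The sketch below shows that rule on a small in-memory dataset; the record layout, label names and the function `visible_to` are hypothetical, and the platform's actual enforcement mechanism is not described in this page.

```python
# Hypothetical records from two clients sharing one instance.
records = [
    {"label": "client-a", "uid": "meter-1", "value": 12.4},
    {"label": "client-b", "uid": "meter-9", "value": 7.1},
    {"label": "client-a", "uid": "meter-2", "value": 15.0},
]


def visible_to(user_labels, rows):
    """Return only the rows whose Label is in the user's allowed set."""
    allowed = set(user_labels)
    return [row for row in rows if row["label"] in allowed]


client_a_view = visible_to(["client-a"], records)
```

Applying the same filter to both stores is what keeps one client's entity records and event data out of another client's view.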

Rayven SQL + Cassandra Data Storage FAQs:
Why does Rayven use both MySQL and Cassandra?
MySQL handles structured relational data - entity records, configuration tables, reference data. Cassandra handles high-frequency time-series writes - sensor readings, telemetry, event logs - where MySQL would bottleneck under volume. Each database is used for what it does best. See the full Data Layer.
When does data go into Cassandra vs MySQL?
IoT sensor readings, streaming events and any timestamped series data go into Cassandra. Structured entity records, operational logs with schema joins, and form submission records go into MySQL (Primary and Secondary Tables). See Unified Data Tables.
Can both databases be queried in the same workflow?
Yes. Workflow nodes can pull from Cassandra (time-series reads) and MySQL (table rows) simultaneously and merge the results for downstream transformation, dashboarding or AI model input. Explore Data Transformation.
How is data retention managed across both databases?
MySQL data retention is managed via scheduled deletion or archival workflows. Cassandra TTL (Time To Live) settings auto-expire records after a configured period. Retention windows are configurable per data type and business requirement. See Data Management.
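In Cassandra itself, a TTL is expressed in seconds and can be set per write or as a table default. The sketch below models only the visible behaviour, that a record stops being returned once its TTL has elapsed, using an in-memory list. The class `TtlTable` and its methods are hypothetical, and how Rayven exposes TTL configuration is not shown here.

```python
from datetime import datetime, timedelta, timezone


class TtlTable:
    """Hypothetical in-memory model of records that expire after a TTL."""

    def __init__(self, default_ttl_seconds):
        self.default_ttl = timedelta(seconds=default_ttl_seconds)
        self._rows = []  # list of (expires_at, record)

    def insert(self, record, now, ttl_seconds=None):
        # A per-write TTL overrides the table default.
        ttl = self.default_ttl if ttl_seconds is None else timedelta(seconds=ttl_seconds)
        self._rows.append((now + ttl, record))

    def select(self, now):
        # Expired rows are simply not returned.
        return [record for expires_at, record in self._rows if expires_at > now]


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
table = TtlTable(default_ttl_seconds=3600)
table.insert({"v": 1}, now=t0)                  # expires after one hour
table.insert({"v": 2}, now=t0, ttl_seconds=60)  # expires after one minute

after_two_minutes = table.select(t0 + timedelta(minutes=2))
after_two_hours = table.select(t0 + timedelta(hours=2))
```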
Is my data backed up?
Yes. Both MySQL and Cassandra are backed up automatically: daily backups with 60-day retention by default, plus weekly backups retained for one year and monthly backups retained for three years. Contact us for backup and recovery SLA details.
Can I access Rayven databases directly via SQL?
Read-only SQL access to MySQL is available for authorised integrations, enabling external BI tools to query Rayven data directly. For Cassandra, data is accessed via API endpoints or the Rayven workflow layer. See API Endpoints.
How does Cassandra scale for high-frequency IoT deployments?
Cassandra is designed for horizontal scaling - adding nodes increases write and read throughput approximately linearly. Rayven manages this scaling automatically. Thousands of assets writing simultaneously at high frequency is a standard deployment pattern. See IoT Devices + Protocols.
Are there limits on how much data can be stored?
Storage limits depend on your platform tier. Cassandra and MySQL storage scale with your contract. For high-volume storage requirements, contact us to confirm the appropriate configuration.
Can historical data be exported for use in external systems?
Yes. Any stored dataset - MySQL tables or Cassandra series - can be exposed via authenticated API endpoints or written to FTP/S3 via output nodes. Data is not locked into the platform. See the Integration Layer.
What happens to stored data if the platform is offline for maintenance?
Scheduled maintenance windows are communicated in advance. All stored data persists through maintenance periods. Data arriving during downtime is queued and processed on recovery, depending on integration type. Contact us for SLA details.
Also in the Data Layer:
Unified Data Tables
Structured Primary + Secondary Tables for entity records, metadata + relational data alongside Cassandra time-series.
Data Management
Configure retention policies, inspect workflow payloads, export raw data + manage data lifecycle across the platform.
Data Transformation
JavaScript, Advanced Function + Combine Data nodes for schema mapping, enrichment + normalisation within workflow processing chains.
File Parsing
Ingest + parse files from FTP, S3 + manual uploads into structured, real-time data available to workflows and AI models.
Calculation + Aggregation
Sum, average, count + aggregate across UID or Label over any defined time window - at the point of processing.
AI Models + Training
Train Python ML models on Cassandra time-series data + deploy predictions as real-time workflow steps.
Real-time Data Processing
Sub-second ingestion + processing of live sensor, device + event data with built-in deduplication + schema validation.
Join the Shift
Discover the easy way to do something new.
Book a free 30-minute assessment with our team and we'll scope your project, needs + what a solution might look like.