File parsing.

Extract structured, actionable data from any file type - PDFs, Word docs, CSVs, emails + more - automatically, at the point of ingestion.

CAPABILITY OVERVIEW

Turn files into data, automatically.

Rayven parses any file format at the point of ingestion - converting unstructured content into structured, workflow-ready data without manual extraction or pre-processing.

AI-powered extraction handles complex documents like PDFs and Word files. Regex + mapping rules handle structured formats like CSV and XML.

Extracted data flows directly into storage, workflow logic, dashboards + external systems - no separate parsing tool, no staging step, no manual data entry.

Inbound triggers include:

  • PDF documents

  • Word documents (.docx)

  • CSV + spreadsheet files

  • XML files

  • Email + attachments

  • Images (via AI extraction)

  • JSON files with nested structures

Outbound triggers include:

  • Structured JSON payloads for workflow processing

  • Clean field values for storage in Primary/Secondary Tables

  • Extracted data for dashboards, AI models or external APIs

KEY CAPABILITIES

What File Parsing gives you.

AI-powered document extraction

Pass PDFs, Word documents + text files to an LLM connector node (OpenAI, Claude, Gemini + others) for structured data extraction. The AI reads the document and returns configured fields as structured JSON - no manual template mapping required.

CSV + structured file parsing

Ingest CSV, XML + JSON files from FTP, SFTP or S3 and parse field values into workflow payloads automatically. Configure column mapping, data type conversion + validation rules to ensure structured output regardless of input format variation.
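
As an illustration of what column mapping, type conversion + validation involve, here is a minimal Python sketch. It is not Rayven's implementation or configuration format: the `COLUMN_MAP` table, the `parse_csv` function and the sample column names are all hypothetical, chosen only to show the idea of mapping source columns to a standard schema and setting aside rows that fail conversion.

```python
import csv
import io

# Hypothetical mapping: source column name -> (target field, converter).
COLUMN_MAP = {
    "Store ID": ("store_id", str),
    "Sale Date": ("sale_date", str),
    "Total": ("total", float),
}


def parse_csv(text):
    """Return (rows, errors) after mapping columns and converting types."""
    rows, errors = [], []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, record in enumerate(reader, start=2):
        try:
            row = {
                target: convert(record[source].strip())
                for source, (target, convert) in COLUMN_MAP.items()
            }
        except (KeyError, ValueError, AttributeError) as exc:
            # Missing column, unconvertible value or short row.
            errors.append((line_no, repr(exc)))
            continue
        rows.append(row)
    return rows, errors


sample = "Store ID,Sale Date,Total\nS1,2024-01-05,120.50\nS2,2024-01-05,n/a\n"
rows, errors = parse_csv(sample)
```

In this sketch the first data row is converted and kept, while the second is recorded in `errors` with its line number because `n/a` is not a valid number.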

Email + attachment processing

Process emails and extract data from attached files automatically on receipt. Structured content from email bodies or attachments flows into workflows - useful for invoice processing, report ingestion + document-triggered automation.

Regex + validation rules

Apply regex patterns, field validation rules + mapping logic to incoming file data. Validate field formats on ingestion, flag anomalies + reject or flag records failing quality checks before parsed data reaches storage or downstream processing.
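
The idea of validating field formats and flagging failing records can be sketched in a few lines of Python. This is an assumption-laden illustration, not Rayven's rule syntax: the `RULES` table, its patterns and the `validate` and `route` functions are hypothetical.

```python
import re

# Hypothetical rules: field name -> pattern the whole value must match.
RULES = {
    "invoice_number": re.compile(r"INV-\d{6}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "total": re.compile(r"\d+(\.\d{2})?"),
}


def validate(record):
    """Return the names of fields that are missing or malformed."""
    failures = []
    for field, pattern in RULES.items():
        value = record.get(field)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            failures.append(field)
    return failures


def route(records):
    """Split records into accepted ones and flagged ones with their failures."""
    accepted, flagged = [], []
    for record in records:
        failures = validate(record)
        if failures:
            flagged.append((record, failures))
        else:
            accepted.append(record)
    return accepted, flagged


good = {"invoice_number": "INV-000123", "email": "ap@example.com", "total": "99.00"}
bad = {"invoice_number": "123", "email": "ap@example.com", "total": "99.0"}
accepted, flagged = route([good, bad])
```

Using `fullmatch` rather than `search` means a value passes only if the entire string fits the pattern, which is usually what a format check intends.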

Extract JSON Key node

Extract specific values from nested JSON structures within a workflow. Supports deep nesting, wildcard key selection + array handling. Used when ingested files contain complex JSON with required data buried in nested objects or arrays.
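
To make deep nesting, wildcard selection + array handling concrete, here is a small Python sketch of path-based extraction. The dotted path syntax and the `extract` function are assumptions made for illustration; the Extract JSON Key node's actual syntax may differ.

```python
import json


def extract(data, path):
    """Return every value matching a dotted path; '*' matches all keys or items."""
    current = [data]
    for part in path.split("."):
        matched = []
        for node in current:
            if part == "*":
                if isinstance(node, dict):
                    matched.extend(node.values())
                elif isinstance(node, list):
                    matched.extend(node)
            elif isinstance(node, dict) and part in node:
                matched.append(node[part])
            elif isinstance(node, list) and part.isdigit() and int(part) < len(node):
                # A numeric path part indexes into an array.
                matched.append(node[int(part)])
        current = matched
    return current


payload = json.loads(
    '{"order": {"id": 7, "lines": ['
    '{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}}'
)
skus = extract(payload, "order.lines.*.sku")
second_qty = extract(payload, "order.lines.1.qty")
```

The function always returns a list, so a path that matches nothing yields an empty list rather than raising an error.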

Merged file + real-time data pipelines

Combine parsed file data with real-time streams in the same workflow. Merge uploaded file data with time-series data, API responses or Primary Table records - for example, combining a daily CSV report with live sensor readings for unified analysis.

HOW IT CONNECTS: EXPLAINER

Where File Parsing fits in the Rayven Platform stack.

File parsing nodes sit in the Data Layer, processing file content after ingestion from the Integration Layer.

  • Files arrive via FTP, SFTP, S3 or manual upload from the Integration Layer.

  • Parsing nodes extract, validate + structure file content within the workflow.

  • Structured output writes to MySQL or Cassandra for storage.

  • The Execution Layer uses parsed data for workflow logic, AI processing + automated actions.

  • The Presentation Layer surfaces parsed data in dashboards + reports.

USE CASES

How File Parsing gets used.

Automated invoice processing

Supplier invoices arrive as PDFs in an SFTP folder. A Rayven workflow picks up each file and passes it to a Claude node, which extracts supplier name, invoice number, line items + total as structured fields. Extracted fields write to a Secondary Table and trigger an approval workflow - no manual data entry required.

Daily report ingestion for a retail BI platform

Store managers upload daily sales CSV reports to an S3 bucket. A Rayven workflow ingests each file, maps columns to a standard schema, aggregates by store Label + writes results to a Primary Table. A live dashboard surfaces consolidated sales data within 30 seconds of upload.

Partner building a document processing pipeline for a legal firm

An MSP uses Rayven's AI document extraction to build a contract review pipeline for a legal client. Contracts uploaded to a portal are parsed by a Claude node, which extracts key clauses as structured fields + flags them for review if specific conditions are met. The pipeline is delivered as the partner's own product.

Rayven File Parsing FAQs:

What file types does Rayven parse?

CSV, JSON, XML, plain text, binary, and compressed formats (.zip, .gz). Configurable character encoding handles non-standard sets. Proprietary or non-standard structures can be handled via the JavaScript Node or Advanced Function Node. See Data Transformation.

How does Rayven ingest files for parsing?

Files are ingested via FTP, SFTP, S3, manual upload through a Rayven form, or HTTP POST. The ingestion method is set as the workflow trigger node. See File Uploads.

Can Rayven parse files with variable structures?

Yes. When file schemas vary, the JavaScript Node or Advanced Function Node handles dynamic structure detection and field extraction. This supports legacy report exports where column order or naming is inconsistent. Explore Data Transformation.

How does CSV parsing handle headers?

Rayven's CSV parser can detect headers from the first row or use a manually defined column mapping. Multi-row header structures, quoted fields and custom delimiters are all configurable per ingestion node. See Data Layer configuration options.
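
The difference between detecting headers from the first row and supplying a column mapping manually can be shown with Python's standard `csv` module. This is a generic sketch, not Rayven's parser; `read_rows` and the sample data are hypothetical, and multi-row headers are not covered.

```python
import csv
import io


def read_rows(text, delimiter=",", fieldnames=None):
    """Parse delimited text; take headers from row one unless fieldnames is given."""
    reader = csv.DictReader(
        io.StringIO(text), fieldnames=fieldnames, delimiter=delimiter
    )
    return list(reader)


# Header row present, semicolon delimiter, quoted fields containing the
# delimiter and an escaped double quote.
with_header = 'name;note\n"Acme; Ltd";"said ""ok"""\n'
detected = read_rows(with_header, delimiter=";")

# No header row: the column names are supplied manually.
without_header = "Acme|42\n"
mapped = read_rows(without_header, delimiter="|", fieldnames=["name", "count"])
```

Quoting matters here because the first field contains the delimiter itself; without quote handling it would be split into two columns.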

Can parsed data feed an AI model directly?

Yes. Parsed file content - including extracted text from documents - can feed directly into an AI/LLM node for classification, extraction or summarisation within the same workflow. See AI Models + Training.

Is there a file size limit for parsing?

There is no hard size limit in workflow configuration. Performance on very large files depends on the complexity of downstream parsing and transformation logic. Contact us for high-volume file processing requirements.

Can Rayven parse multiple files in a single workflow run?

Yes. File ingestion nodes can poll a directory and process all new files found in a single execution cycle. Each file is parsed and passed through the workflow independently within the same run. Explore the Execution Layer.

How are parsing errors handled?

The Error Handler and Conditional Filter nodes route parse failures to alternative paths - flagging the file for manual review, triggering an alert or storing the raw file without transformation. See Notifications + Alerts.

Can parsed data write directly to a database table?

Yes. Parsed and transformed data can be written to any Rayven Primary or Secondary Table via Push Row nodes. This makes file-based batch ingestion feed the same unified data structure as real-time sources. See Unified Data Tables.

Does Rayven parse data inside compressed archives?

Yes. The file ingestion node can decompress .zip and .gz files and extract individual files for parsing. The structure within the archive is flattened and each file processed through the workflow pipeline. See the File Uploads page.
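
For readers who want to see what decompressing and flattening an archive means in practice, here is a minimal Python sketch using the standard library. The `unpack` function is hypothetical and is not how the Rayven ingestion node is implemented. In particular, the collision behaviour (a later file with the same name overwrites an earlier one) is an assumption of this sketch.

```python
import gzip
import io
import posixpath
import zipfile


def unpack(name, data):
    """Return {flat_file_name: bytes} for a .zip or .gz payload, else the file itself."""
    if name.endswith(".zip"):
        files = {}
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Flatten: keep only the file name, dropping directories.
                # Assumption: same-named files overwrite earlier ones.
                files[posixpath.basename(info.filename)] = archive.read(info)
        return files
    if name.endswith(".gz"):
        # A .gz payload holds a single file; strip the suffix for its name.
        return {name[:-3]: gzip.decompress(data)}
    return {name: data}


# Build a small archive in memory to demonstrate flattening.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("reports/2024/a.csv", "x,y\n1,2\n")
    archive.writestr("b.csv", "q\n")
unpacked = unpack("batch.zip", buffer.getvalue())
```

Each entry in the returned dictionary could then be handed to a parser independently, mirroring the per-file processing described above.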
