Technology
Platform Architecture
Three interconnected systems work together to structure the Indian regulatory and financial landscape: Banyan (regulatory knowledge engine), Bija (AI orchestrator), and BeatSync (market event correlation).
Banyan
Regulatory Knowledge Engine
Multi-Regulator Ingestion
Continuous monitoring of RBI, SEBI, IRDAI, PFRDA, IFSCA, NABARD, MCA, and IBBI. Ingests circulars, master directions, gazette notifications, and public filings through a multi-stage state machine pipeline (PENDING → FETCHING → CLASSIFYING → EXTRACTING → SYNTHESIZING → BINDING → EMBEDDING → STORING → COMPLETE).
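As an illustration only, the stage sequence above can be modeled as a strictly linear state machine. This is a minimal sketch, not the platform's implementation: the `Stage` enum mirrors the stage names listed above, while `next_stage`, `run_pipeline`, and the `handlers` mapping are hypothetical names, and the sketch assumes no retry or failure states.

```python
from enum import Enum


class Stage(Enum):
    # Stage names copied from the pipeline description; order matters.
    PENDING = "PENDING"
    FETCHING = "FETCHING"
    CLASSIFYING = "CLASSIFYING"
    EXTRACTING = "EXTRACTING"
    SYNTHESIZING = "SYNTHESIZING"
    BINDING = "BINDING"
    EMBEDDING = "EMBEDDING"
    STORING = "STORING"
    COMPLETE = "COMPLETE"


# Enum members iterate in definition order, which gives the linear sequence.
ORDER = list(Stage)


def next_stage(stage):
    """Return the stage that follows `stage`; COMPLETE is terminal."""
    idx = ORDER.index(stage)
    if idx == len(ORDER) - 1:
        raise ValueError("COMPLETE is a terminal stage")
    return ORDER[idx + 1]


def run_pipeline(doc, handlers):
    """Advance a document from PENDING to COMPLETE.

    `handlers` (hypothetical) maps a Stage to a function that takes the
    document and returns the updated document. Stages without a handler
    are passed through unchanged.
    """
    stage = Stage.PENDING
    history = [stage]
    while stage is not Stage.COMPLETE:
        stage = next_stage(stage)
        handler = handlers.get(stage)
        if handler is not None:
            doc = handler(doc)
        history.append(stage)
    return doc, history


if __name__ == "__main__":
    handlers = {Stage.FETCHING: lambda d: {**d, "fetched": True}}
    doc, history = run_pipeline({"id": "circular-1"}, handlers)
    print(doc, [s.value for s in history])
```

Keeping the stage order in a single enum means an invalid jump (for example from FETCHING straight to STORING) cannot be expressed through `next_stage`.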
Semantic Extraction
Docling-powered PDF extraction on GPU, followed by semantic chunking and late-chunking embeddings. Hybrid extractor combines structural parsing with LLM-powered fact extraction to identify obligations, entities, and regulatory intent.
Hybrid Search
Multi-strategy retrieval combining dense vector search (Milvus), sparse lexical matching, and knowledge graph traversal (Neo4j). Reciprocal Rank Fusion merges the ranked lists from each strategy into a single relevance ranking over searchable passages.
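To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over three ranked lists. Each passage scores the sum of 1 / (k + rank) across the lists it appears in. The function name, the passage ids, and the constant k = 60 (a commonly used default) are assumptions for illustration; the source does not state which parameters the platform uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage ids into one ranking.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


if __name__ == "__main__":
    dense = ["p3", "p1", "p2"]   # hypothetical vector search results
    sparse = ["p1", "p4", "p3"]  # hypothetical lexical results
    graph = ["p1", "p3"]         # hypothetical graph traversal results
    print(reciprocal_rank_fusion([dense, sparse, graph]))
    # -> ['p1', 'p3', 'p4', 'p2']
```

Because RRF uses only rank positions, the scores from the vector, lexical, and graph retrievers do not need to be on a comparable scale.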
Temporal Analytics
Time-series tracking of regulatory changes, document complexity trends, and ingestion metrics. Historical analysis of how regulatory requirements evolve across entities and regulatory domains.
Bija
AI Agent Orchestrator
Autonomous AI orchestration layer that coordinates LLM workers (Claude, Gemini, GPT, local models), routes tasks based on complexity, and maintains system health across the mesh. Features include a multi-provider LLM queue with priority routing, automated log classification and incident dispatch, and self-healing service management.
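One way to picture a priority queue with complexity-based routing is sketched below. Everything in it is hypothetical: the `Task` and `LLMQueue` classes, the tier names ("local", "mid-tier", "frontier"), and the complexity thresholds are illustrative assumptions, not Bija's actual design.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                      # lower number = served first
    seq: int                           # insertion counter; keeps FIFO order within a priority
    prompt: str = field(compare=False)
    complexity: float = field(compare=False)  # assumed to be in [0, 1]


def route(complexity):
    """Pick a worker tier from task complexity (thresholds are assumptions)."""
    if complexity < 0.3:
        return "local"
    if complexity < 0.7:
        return "mid-tier"
    return "frontier"


class LLMQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, prompt, priority, complexity):
        task = Task(priority, next(self._counter), prompt, complexity)
        heapq.heappush(self._heap, task)

    def pop(self):
        """Return the most urgent task together with its target tier."""
        task = heapq.heappop(self._heap)
        return task, route(task.complexity)

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = LLMQueue()
    q.submit("summarize circular", priority=2, complexity=0.2)
    q.submit("classify incident log", priority=0, complexity=0.5)
    q.submit("extract obligations", priority=2, complexity=0.9)
    while len(q):
        task, tier = q.pop()
        print(task.priority, tier, task.prompt)
```

The insertion counter keeps equal-priority tasks in submission order, so ordering never falls back to comparing prompts.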
BeatSync
Market Event Correlation
Real-time market event correlation that connects regulatory changes to market impact across NSE-listed instruments. GPU-accelerated ML pipelines handle pattern recognition, and QuestDB provides time-series storage for cross-referencing market movements with regulatory events.
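As a rough illustration of cross-referencing a regulatory event with market data, the sketch below computes the simple return of one instrument over a fixed window after an event timestamp. This is an assumed, simplified stand-in for whatever analysis BeatSync performs: the function name, the in-memory lists (instead of QuestDB queries), and the sample values are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def window_return(times, prices, event_time, window):
    """Simple return from the first tick at or after `event_time`
    to the last tick at or before `event_time + window`.

    times: sorted list of numeric timestamps (e.g. epoch seconds).
    prices: list of prices aligned with `times`.
    Returns None when the window holds fewer than two ticks.
    """
    start = bisect_left(times, event_time)
    end = bisect_right(times, event_time + window) - 1
    if start >= len(times) or end <= start:
        return None
    return prices[end] / prices[start] - 1.0


if __name__ == "__main__":
    # Hypothetical ticks for one instrument.
    times = [0, 10, 20, 30, 40]
    prices = [100.0, 100.0, 102.0, 105.0, 104.0]
    # Hypothetical circular published at t=10, with a 20-second window.
    print(window_return(times, prices, event_time=10, window=20))
```

A fuller event study would compare this figure against a baseline such as the index return over the same window; that step is omitted here.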
Infrastructure Stack
Milvus
Vector database for high-dimensional embeddings and semantic search
QuestDB
Time-series database for temporal analytics, metrics, and log ingestion
Neo4j
Graph database for entity-regulation knowledge graph and lineage
MinIO
S3-compatible object store for document PDFs and extraction artifacts
GPU Compute
NVIDIA T4/L4 GPUs for JAX pipelines, Docling extraction, and embeddings
Redis
In-memory cache, LLM task queue, and inter-service message broker