Research & Development

At DISCOS, we build foundational distributed infrastructure for next-generation AI systems. Rooted in distributed systems, our research spans mathematical modeling, system design, algorithmic innovation, practical implementation, and real-world deployment. Our work also advances core AI infrastructure for interdisciplinary applications, including Blockchains (e.g., decentralized FL and AI alignment), Cloud computing (e.g., cloud-based agentic data governance), and Database management systems (e.g., vector databases and ML gradient stores). Our goal is to enable high-performance, highly available, and highly scalable intelligent systems.

Research themes: DS for AI; B·C·D for AI (AI Infrastructure, Distributed Systems)
Keywords: Blockchains, Cloud Computing, Databases, Multi-Agent Consensus, Federated Learning, ML Influence Estimation, Distributed Agentic Systems, Data Governance, Vector Database, Fault Tolerance, Data Consistency

Cloud-Based Agentic Data Governance

Our work on cloud DBMS focuses on designing and implementing scalable, reliable, and cost-efficient cloud data management infrastructure. We study centralized data governance, ontology-driven DBMS, and ontology-guided DB agents to support heterogeneous, distributed, and data-intensive workloads. Our research topics include, but are not limited to:

  • Edge-Cloud DBMS for cost-efficient data ingestion and governance
  • Data quality and lean processing
  • Ontology-led agentic DBMS

We actively collaborate with Airbus Canada to develop next-generation Cloud DBMS with agentic technologies for Industry 4.0.

Digital Data Strategy for Airbus A220

Vector Database (VDB)

VDBs are becoming a foundational component of AI-driven data systems, enabling efficient similarity search over high-dimensional representations (typically generated by LLMs). Our research focuses on system-level challenges, including:

  • ANN indexing and similarity search for high accuracy and low latency across multi-modal data
  • Distributed indexing and query processing, supporting sharding, replication, and parallel search at the scale of billions of vectors
  • Cloud-native VDB architectures that provide elasticity, fault tolerance, and cost efficiency

In addition, we design LLM-centric middleware systems, including key-value caching and memory management layers, for large-scale AI inference and retrieval-augmented generation (RAG) pipelines.
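
For readers new to the area, the similarity search mentioned above can be illustrated with an exact (brute-force) top-k search over cosine similarity. This is only a baseline sketch: ANN indexes trade a little accuracy to avoid scanning every vector, and nothing here reflects our own systems. The function names `cosine_similarity` and `top_k` are hypothetical and chosen for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the indices of the k stored vectors most similar to the query.

    Exact search: every stored vector is scored, so cost grows linearly
    with the collection size. ANN indexes exist to avoid this full scan.
    """
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real embeddings have hundreds or thousands of dimensions.
    stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k([1.0, 0.1], stored, k=2))
```

At billions of vectors this linear scan is impractical, which is why indexing, sharding, and parallel search become the central system-level questions.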

Distributed AI Infrastructure

AI infrastructure is becoming increasingly distributed, especially for large-scale inference services. Foundational distributed systems principles, including consensus algorithms, data consistency, coordination, and fault tolerance, are now facing new demands from increasingly complex AI systems. The quality of distributed AI infrastructure directly affects system performance, service quality, availability, and cost efficiency. Our major research topics include:

  • Foundational algorithms and architectures for distributed systems, including consensus protocols and fault tolerance, such as our work on Escape [ICDCS’22], Prosecutor [Middleware’21], PrestigeBFT [ICDE’24], and Cabinet [VLDB’25].
  • Multi-agent systems, including multi-agent coordination and alignment, to enable groups of AI agents to make fault-tolerant, trustworthy, and coordinated decisions.
  • Federated learning, particularly asynchronous FL training and distributed inference.

Our work aims to advance distributed AI infrastructure and services that are performant, scalable, highly available, and robust.
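
As background for the consensus and fault-tolerance topics above, the following sketch computes the textbook fault thresholds and quorum sizes for a cluster of n replicas: crash-fault-tolerant protocols need n ≥ 2f + 1, and Byzantine-fault-tolerant protocols need n ≥ 3f + 1. It illustrates the classic bounds only and is not a description of Escape, Prosecutor, PrestigeBFT, or Cabinet; all function names are hypothetical.

```python
def max_crash_faults(n):
    """Largest f such that n >= 2f + 1 (crash-fault-tolerant consensus)."""
    return (n - 1) // 2


def max_byzantine_faults(n):
    """Largest f such that n >= 3f + 1 (Byzantine-fault-tolerant consensus)."""
    return (n - 1) // 3


def majority_quorum(n):
    """Smallest group size such that any two groups share at least one replica."""
    return n // 2 + 1


def byzantine_quorum(n):
    """Smallest group size such that any two groups share at least f + 1 replicas,
    so that at least one shared replica is non-faulty. Equals 2f + 1 when n = 3f + 1."""
    f = max_byzantine_faults(n)
    return (n + f) // 2 + 1


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(n, max_crash_faults(n), majority_quorum(n),
              max_byzantine_faults(n), byzantine_quorum(n))
```

The intersection property is what lets a later quorum observe any decision made by an earlier one, which is the basis for both data consistency and fault tolerance in these protocols.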

GenAI Regulation and Data Governance

With the rapid adoption of GenAI systems, training and inference pipelines increasingly rely on large volumes of copyrighted, often proprietary, data, yet current systems lack transparent and enforceable governance mechanisms. Our research explores centralized and decentralized solutions that enable trustworthy, auditable, and incentive-compatible data governance in GenAI systems, including but not limited to the following:

  • Measurable training influence, quantifying how individual data inputs contribute to model training and inference outcomes
  • Decentralized revenue-sharing mechanisms, enabling fair compensation between GenAI service providers and owners of copyrighted data
  • Emerging legal and policy frameworks for GenAI data governance, bridging system design with regulatory requirements

We aim to establish system-level foundations for responsible and sustainable GenAI, where data usage, value creation, and incentives are aligned across technical, economic, and societal dimensions. Read our position paper [ICML'26] for more details.
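
To make the idea of measurable training influence concrete, here is a minimal leave-one-out sketch: retrain without each training point and record how the loss on a test point changes. This is a generic textbook notion of influence, not the method from our position paper, and the "model" is deliberately trivial (it predicts the mean of its training data). The names `fit_mean`, `squared_loss`, and `loo_influence` are hypothetical.

```python
def fit_mean(train):
    """Toy 'model': predict the mean of the training values."""
    return sum(train) / len(train)


def squared_loss(prediction, target):
    return (prediction - target) ** 2


def loo_influence(train, test_point):
    """Leave-one-out influence of each training value on the test loss.

    score[i] = loss(model trained without point i) - loss(model trained on all).
    A positive score means removing the point hurts, so the point was helpful.
    Assumes len(train) >= 2 so that every retrained model has data.
    """
    base = squared_loss(fit_mean(train), test_point)
    scores = []
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]
        scores.append(squared_loss(fit_mean(rest), test_point) - base)
    return scores


if __name__ == "__main__":
    # The outlier 9.0 pulls the mean away from the test value 2.0,
    # so it receives a negative score.
    print(loo_influence([1.0, 2.0, 9.0], test_point=2.0))
```

Retraining once per data point does not scale to GenAI models, which is one reason efficient influence estimation is an open systems problem.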

Our Sponsors and Collaborators

Airbus
IBM
T-RIZE Group
CRIAQ
Mitacs
NSERC / CRSNG
Concordia University — Engineering and Computer Science
University of Toronto
École de technologie supérieure (ÉTS)