User Guide Documentation
1. What is WaveflowDB?
WaveflowDB is a deep-search, full-corpus vector database built for AI agents. It supports deep enterprise search, conversational AI, document processing, and vector storage. This guide walks through each feature, configuration option, and best practice so you can get the most out of your data when building AI agents with WaveflowDB.
2. Understanding WaveflowDB’s Features
Before diving into cluster creation, it's important to understand how WaveflowDB is structured:
- Get Started: An interactive guide that directs new users through the initial setup process.
- Dashboard: Monitor resources, track usage, and manage your account efficiently.
- Cluster: Your dedicated infrastructure layer that provides compute resources.
- Database: A logically isolated data container within a cluster for organizing different projects.
- Data Universe: Your central hub for uploading, processing, and managing the source files (PDFs, DOCX, etc.) that form the foundation of your knowledge base.
- AI Assistant: RAG-based conversational AI interfaces built on your data.
- Data Explorer: Your data exploration tool. Use it to search and verify the files within your Data Universe before building an assistant.
- Manage Access: The team collaboration centre for inviting users and managing permissions across your workspace.
- API Endpoints: Where you create API keys. A key is both an authentication credential and an access control mechanism, allowing your systems to securely interact with clusters, databases, and AI assistant endpoints.
3. Get Started
The "Get Started" page is the primary landing page for all users upon their first login to WaveflowDB. Its core purpose is to serve as an interactive, guided onboarding tool that directs new users through the essential steps required to achieve their first successful outcome: creating a cluster. This page is designed specifically to solve the "empty state" problem, preventing user confusion and providing a clear, actionable path from account creation to platform value. It acts as a temporary, task-oriented homepage until the user has provisioned their initial set of resources.

3.1 Cluster Creation Process
Step 1: Navigate to the Cluster page under Knowledge Base and locate the Create Cluster button.
Step 2: Select Region: Choose the AWS region closest to your users for optimal performance. Currently only one region is provided.
Step 3: Name Your Cluster: Provide a memorable, descriptive name (e.g., "ProductionCluster", "Dev-Environment"). Make sure the name does not contain any special characters.
Step 4: Select Pod Type: Choose W1, W2, or W3 based on your requirements (see 4.1.2 for specifications).
Step 5: Review Configuration: Confirm your settings before deployment.
Step 6: Create: Click create and wait for provisioning.
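
Step 3 asks for a name without special characters, but the guide does not spell out the exact rule. The sketch below assumes that letters, digits, and hyphens are accepted (the hyphen because the example "Dev-Environment" uses one). `is_valid_cluster_name` is a hypothetical helper for checking names before you type them in, not part of WaveflowDB.

```python
import re

# Assumption: only letters, digits, and hyphens are allowed in a cluster name.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the whole name consists of the assumed allowed characters."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

If the console rejects hyphens in practice, remove the `-` from the pattern.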
4. Knowledge Base
4.1 Cluster
A Cluster is your private, high-performance infrastructure that serves as the foundation for all your vector databases. Think of it as your dedicated compute environment that processes, stores, and serves your vectorized data.
4.1.1 Getting to Cluster After Creation
Created clusters are listed on the Cluster page under Knowledge Base, with columns for configuration, provider, status, and action. The Action column lets you pause a cluster when it is not in use and resume it later. Use the Create Cluster button to create a new cluster.

4.1.2 Cluster Pod Types and Specifications
Pod Type | Instance Type | CPU | Memory | Recommended Use Cases |
---|---|---|---|---|
W1 | t2.small | 1 vCPU | 2 GB | Development/testing, small docs (<1000), POCs, experimentation |
W2 | t2.large | 2 vCPUs | 8 GB | Production, medium docs (1k–10k), 10–50 users, business ops |
W3 | t2.2xlarge | 8 vCPUs | 32 GB | Enterprise, large docs (10k+), 50+ users, mission-critical, real-time apps |
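
The thresholds in the table can be turned into a small sizing helper. This is a sketch only: `recommend_pod_type` is a hypothetical function, and the handling of the boundaries (exactly 10,000 documents goes to W3, exactly 50 users stays on W2) is an assumption, since the table leaves them ambiguous.

```python
def recommend_pod_type(document_count: int, concurrent_users: int = 1) -> str:
    """Suggest a pod type from document and user counts, following the table above."""
    if document_count < 0 or concurrent_users < 0:
        raise ValueError("counts must be non-negative")
    # W3: large corpora (10k+ documents) or more than 50 users.
    if document_count >= 10_000 or concurrent_users > 50:
        return "W3"
    # W2: medium corpora (1k to 10k documents) or 10 to 50 users.
    if document_count >= 1_000 or concurrent_users >= 10:
        return "W2"
    # W1: development, testing, and small document sets.
    return "W1"
```

For example, `recommend_pod_type(5_000, 20)` suggests W2, while a 100-document corpus serving 60 users is pushed to W3 by the user count alone.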
4.1.3 Cloud Provider Support
Provider | Status | Notes |
---|---|---|
AWS | Available | Full support with all pod types |
Google Cloud (GCP) | Coming Soon | |
Microsoft Azure | Coming Soon | |
4.1.4 Cluster Management Best Practices
- Naming Convention: Use clear, descriptive names that indicate purpose and environment
- Resource Planning: Start with W1 for testing, scale up as needed
- Region Selection: Choose regions close to your primary user base
4.2 Database
A Database is an isolated container within your cluster that organizes your data by project, department, or use case. Multiple databases can exist within a single cluster, each with its own configuration and access controls.

4.2.1 Embedding Models
Model | Status | Dimensions | Best For |
---|---|---|---|
all-MiniLM-L6-v2 | Available | 384 | General purpose text, fast, balanced |
sentence-transformers/all-mpnet-base-v2 | Coming Soon | 768 | High-quality embeddings, complex docs |
text-embedding-ada-002 | Coming Soon | 1536 | OpenAI compatibility, premium quality |
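
The Dimensions column has a direct effect on storage. As a rough illustration, the sketch below estimates the raw size of the vectors alone, assuming each dimension is stored as a 32-bit float. That storage format is an assumption, not something the guide states, and index overhead and original files are not counted. `raw_vector_bytes` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: one 32-bit float per dimension

# Dimensions taken from the embedding model table above.
MODEL_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
    "text-embedding-ada-002": 1536,
}


def raw_vector_bytes(model: str, vector_count: int) -> int:
    """Estimate bytes needed to hold vector_count embeddings from the given model."""
    if vector_count < 0:
        raise ValueError("vector_count must be non-negative")
    return MODEL_DIMENSIONS[model] * BYTES_PER_FLOAT32 * vector_count
```

Under this assumption, doubling the dimensions doubles the raw vector storage for the same number of chunks.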
4.2.2 Vector Configuration
Setting | Current Options | Coming Soon | Description |
---|---|---|---|
Vector Type | Dense | Sparse, Hybrid | Type of vector representation |
Distance Metric | Cosine | Dot Product, Euclidean | How similarity is calculated |
Dimensions | 384 | 512, 768, 1024, 2048 | Vector size (higher = more precise) |
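
To make the Cosine distance metric concrete, here is the standard textbook definition in plain Python. It illustrates how similarity between two dense vectors is calculated in general; it is not WaveflowDB's internal implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because the result depends only on direction, scaling a vector does not change its cosine similarity to another vector. This is the main practical difference from the dot product and Euclidean metrics listed as coming soon.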
4.2.3 Database Creation Process
- Step 1: Click Create Database
- Step 2: Select Cluster
- Step 3: Configure Settings
  - Database Name: Choose a unique, descriptive name
  - Embedding Model: Select from the available options (the current version, v0.4, offers all-MiniLM-L6-v2)
  - Vector Type: Choose Dense (the only current option; sparse vectors are on the roadmap)
  - Distance Metric: Select Cosine (the only current option; Euclidean distance is on the roadmap)
  - Dimensions: Set to 384 (the only current option; higher dimensions are on the roadmap)
- Step 4: Review and Create
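
The settings in Step 3 can be captured in a small checklist before you open the creation form. The sketch below is illustrative only: `DatabaseSettings` is a hypothetical container, not a WaveflowDB type, and the allowed values are simply the options this guide lists as currently available.

```python
from dataclasses import dataclass

# Options the guide lists as currently available; everything else is on the roadmap.
AVAILABLE_MODELS = {"all-MiniLM-L6-v2"}
AVAILABLE_VECTOR_TYPES = {"Dense"}
AVAILABLE_METRICS = {"Cosine"}
AVAILABLE_DIMENSIONS = {384}


@dataclass(frozen=True)
class DatabaseSettings:
    """Hypothetical record of the choices made in Step 3."""
    name: str
    embedding_model: str = "all-MiniLM-L6-v2"
    vector_type: str = "Dense"
    distance_metric: str = "Cosine"
    dimensions: int = 384

    def problems(self) -> list[str]:
        """Return a list of human-readable issues; empty means the settings look fine."""
        issues = []
        if not self.name.strip():
            issues.append("database name must not be empty")
        if self.embedding_model not in AVAILABLE_MODELS:
            issues.append(f"embedding model {self.embedding_model!r} is not available yet")
        if self.vector_type not in AVAILABLE_VECTOR_TYPES:
            issues.append(f"vector type {self.vector_type!r} is not available yet")
        if self.distance_metric not in AVAILABLE_METRICS:
            issues.append(f"distance metric {self.distance_metric!r} is not available yet")
        if self.dimensions not in AVAILABLE_DIMENSIONS:
            issues.append(f"{self.dimensions} dimensions are not available yet")
        return issues
```

With the defaults, `DatabaseSettings("support-docs").problems()` returns an empty list; asking for 768 dimensions returns a single issue.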
4.2.4 Database Organization Strategies
- By Name: Sort alphabetically for quick navigation
- By Last Updated: Show the most recently updated first
- By Document: Sort by document count
- By Status: Filter by active/inactive
5. API Endpoints
An API key in WaveflowDB is a unique, secret token issued to your account that governs access to your resources. It functions as both a critical authentication credential and a powerful access control mechanism.
5.1 Creating API Key
- Step 1: Navigate to API Endpoints page
- Step 2: Click Add API key
- Step 3: Name it and click Create
- Step 4: API key is shown on screen
5.2 API Key Capabilities
Capability | Functionality |
---|---|
Upload | Upload files via API |
Fetch | Query data and get relevant results |
View | See existing data |
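
This guide does not document the request format, so the following is only a sketch of how a key might accompany a Fetch call. The base URL, the `/fetch` path, the JSON field names, the bearer-token header, and the `WAVEFLOWDB_API_KEY` environment variable are all hypothetical placeholders; substitute the values shown on your API Endpoints page. The request is built but never sent.

```python
import json
import os
import urllib.request

# Hypothetical placeholders: the guide does not document URLs, paths, or header names.
BASE_URL = "https://api.example.com/v1"
API_KEY_ENV_VAR = "WAVEFLOWDB_API_KEY"


def load_api_key() -> str:
    """Read the key from the environment so it never lands in source control."""
    key = os.environ.get(API_KEY_ENV_VAR, "")
    if not key:
        raise RuntimeError(f"Set the {API_KEY_ENV_VAR} environment variable first")
    return key


def build_fetch_request(api_key: str, database: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the Fetch capability."""
    payload = json.dumps({"database": database, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/fetch",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed bearer-token scheme
            "Content-Type": "application/json",
        },
    )
```

Keeping the key in an environment variable rather than in code makes it easier to rotate and harder to leak.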
5.3 API Endpoint Features (Coming Soon)
- Resource-Specific Keys: You will be able to bind an API key to a specific Database or Cluster. This provides granular control, perfect for multi-tenant applications where you need to guarantee strict data isolation between different projects or customers.
- Activity Monitoring: The API key dashboard will display a "Last Used" timestamp for each key, helping you identify and safely remove old or inactive keys from your account.
- Usage Quotas per Key: You will be able to assign specific rate limits or usage quotas (e.g., max queries per day) to individual API keys. This feature will provide fine-grained cost control and help prevent abuse.
6. AI Assistant
An AI Assistant is a RAG-based conversational interface that can intelligently answer questions, provide summaries, and engage in natural language conversations based on the documents you've uploaded to your databases.
6.1 Key Components
Component | Description | Purpose |
---|---|---|
Knowledge Base | Your uploaded documents | Source of truth |
Retrieval System | Vector search engine | Find relevant content |
Language Model | AI conversation engine | Generate responses |
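
The three components fit together in a retrieve-then-generate loop. The sketch below shows the general RAG pattern with toy two-dimensional vectors and a caller-supplied `generate` function standing in for the language model. All names are hypothetical, and this is not how WaveflowDB is implemented internally.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Vector, chunks: list[tuple[str, Vector]], top_k: int = 2) -> list[str]:
    """Retrieval system: return the top_k chunk texts most similar to the query."""
    ranked = sorted(chunks, key=lambda item: _cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved knowledge-base chunks with the user's question."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def answer(question: str, query_vec: Vector, chunks: list[tuple[str, Vector]],
           generate: Callable[[str], str], top_k: int = 2) -> str:
    """Language model step: pass the grounded prompt to the supplied generator."""
    return generate(build_prompt(question, retrieve(query_vec, chunks, top_k)))
```

Limiting the prompt to the retrieved chunks is what keeps the assistant's answers tied to your uploaded documents.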
6.2 Create your AI Assistant
- Step 1: Navigate to AI Assistant page
- Step 2: Click Create Assistant
- Step 3: Configure Your Assistant
  - Name: Give it a memorable, descriptive name
  - Assistant Type: Define its primary function (e.g., "Customer Support Assistant", "Sales Enablement Assistant")
  - Description: Explain what it does
  - Database Selection: Choose which database it can access
- Step 4: Upload Additional Files (optional)
- Step 5: Test and Deploy
- Step 6: View the assistants list
- Step 7: Click the desired assistant to open it in the playground and interact with it
6.3 Assistant Configuration Examples
Use Case | Name | Type | Description |
---|---|---|---|
Customer Support | SupportBot | Customer Service | Answers product/policy questions |
Employee Onboarding | OnboardingGuide | HR Assistant | Helps navigate company policies |
API Documentation | DevHelper | Developer Assistant | Provides code examples & guidance |
Research Assistant | ResearchAI | Research Analyst | Synthesizes info from research papers |
6.4 Best Practices for AI Assistants
- Use clear naming for assistants
- Limit scope per assistant
- Regularly update the underlying data so answers stay current
- Test with real user queries
7. Data Universe
The Data Universe is your centralized command centre for managing and uploading files. It provides a robust interface for bringing your documents into WaveflowDB, where they're automatically processed, vectorized, and made searchable.

7.1 Supported File Formats
Format | Extension |
---|---|
CSV | .csv |
Python | .py |
JSON | .json |
Word Documents | .docx |
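
Before a batch upload, it can help to separate files by extension. The sketch below checks filenames against the extensions in the table above; `is_supported` and `split_by_support` are hypothetical local helpers, and the check is case-insensitive by assumption.

```python
from pathlib import Path

# Extensions taken from the supported formats table above.
SUPPORTED_EXTENSIONS = {".csv", ".py", ".json", ".docx"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the supported formats table."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Partition filenames into (supported, unsupported), preserving order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```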
7.2 File Upload Process
- Navigate to Data Universe
- Select Target Database
- Upload Files:
  - Drag & Drop
  - Browse
  - Batch Upload
- Confirmation on completion
7.3 File Processing Pipeline
- Step 1: Validates file format and size
- Step 2: Extracts text and metadata
- Step 3: Chunks large documents
- Step 4: Generates embeddings
- Step 5: Indexes content
- Step 6: Stores original + vectors
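
Step 3 (chunking) is the least self-explanatory stage of the pipeline. The guide does not say which chunking strategy WaveflowDB uses, so the sketch below shows one common approach as an assumption: fixed-size character windows with a small overlap, so that a sentence cut at a boundary still appears whole in at least one chunk. `chunk_text` and its default sizes are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded (Step 4) and indexed (Step 5) separately, which is why very large documents are split before embedding.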
7.4 Upload Best Practices
Practice | Why It Matters | Example |
---|---|---|
Organize files before upload | Easier management & search | Group related docs |
Use descriptive filenames | Better search results | Q4-2024-Sales-Report.pdf |
Check file quality | Better extraction results | Ensure PDFs are text-based |
8. Data Explorer
Data Explorer is your data exploration tool that allows you to inspect, verify, and manage the files you've uploaded to any database.

8.1 Data Explorer Features
Feature | Description | Use Case |
---|---|---|
File Listing | View all files in DB | Verify uploads |
Search by Name | Find by filename | Locate docs |
Filter by Type | Show formats | Focus on PDFs, CSVs |
View Metadata | Upload date, size, status | Troubleshooting |
8.2 How to Use Data Explorer
- Step 1: Access Data Explorer
- Step 2: Select Database
- Step 3: Browse or Search
  - Browse: Scroll through all files
  - Search: Use the search bar to find specific files
  - Filter: Apply filters by file type or upload date
- Step 4: View File Details
- Step 5: Verify Status is “Processed”
8.3 Troubleshooting Common Issues
Issue | Possible Cause | Solution |
---|---|---|
File not appearing | Still processing | Wait & refresh |
Processing failed | Unsupported or corrupted | Re-upload supported format |
Empty content | Image-only PDF | Convert to text PDF |
Partial processing | Large file timeout | Split into smaller files |
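
For the "Partial processing" case, one way to split a large CSV before re-uploading is sketched below. It uses only the Python standard library and repeats the header row in every part so that each part remains a valid CSV on its own. `split_csv_text` is a hypothetical helper, and a suitable `rows_per_part` depends on limits this guide does not state.

```python
import csv
import io


def split_csv_text(csv_text: str, rows_per_part: int) -> list[str]:
    """Split CSV text into parts of at most rows_per_part data rows, each with the header.

    A header-only or empty input yields no parts.
    """
    if rows_per_part <= 0:
        raise ValueError("rows_per_part must be positive")
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    parts = []
    for start in range(0, len(data), rows_per_part):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(data[start:start + rows_per_part])
        parts.append(buffer.getvalue())
    return parts
```

Write each returned string to its own `.csv` file and upload the parts as a batch.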
9. Dashboard
The Dashboard is your central command centre, providing a real-time overview of your WaveflowDB account.
- Account-Wide Summary: total DBs, files, storage used
- Detailed Breakdown: database names, sizes, file counts

10. Manage Access
Manage Access is your team collaboration hub where you can invite team members, assign roles, and control access.
10.1 Access Control Features
Feature | Description | Benefit |
---|---|---|
User Invitations | Send email invites | Easy onboarding |
Role-Based Access | Different permissions | Secure collaboration |
10.2 User Roles and Permissions
Role | Permissions | Best For |
---|---|---|
Owner/Admin | Full access, manage users, billing | Team leads, IT admins |
Member | DB create/modify, upload files, create assistants | Content managers, devs |
Viewer | View DBs, assistants, download, search | End users, researchers |
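
The role table can be read as a simple lookup from role to allowed actions. The sketch below mirrors the table literally; the action identifiers (such as `upload_files`) are made up for illustration, and `is_allowed` is a hypothetical helper rather than a WaveflowDB API.

```python
# Hypothetical action names; the guide describes permissions only in prose.
MEMBER_ACTIONS = frozenset({"create_database", "modify_database", "upload_files", "create_assistants"})
VIEWER_ACTIONS = frozenset({"view_databases", "view_assistants", "download", "search"})
ADMIN_ONLY_ACTIONS = frozenset({"manage_users", "billing"})

ROLE_PERMISSIONS = {
    # "Full access" is modelled as the union of every action listed in the table.
    "Owner/Admin": MEMBER_ACTIONS | VIEWER_ACTIONS | ADMIN_ONLY_ACTIONS,
    "Member": MEMBER_ACTIONS,
    "Viewer": VIEWER_ACTIONS,
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the table grants the action to the role; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())
```

Denying unknown roles by default keeps the check safe if a role name is mistyped.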
10.3 How to Invite Team Members
- Step 1: Navigate to Manage Access
- Step 2: Click the Add Member button
- Step 3: Enter Details:
  - Email Address: Provide the team member's email
  - Database Access: Select which databases they can access
  - Access Level: Choose the role to assign (see 10.2)
- Step 4: Send Invitation
Edit and Manage Access