Plugin Architecture

Pluggable provider system for extensible data sources in Viben.

Overview

Viben's pluggable architecture lets you extend its data sources through two kinds of providers:

Built-in Providers: Core academic/research data sources in backend/browse-mcp
Plugin Providers: Third-party extensions in backend/plugins/*

The system uses stevedore for dynamic plugin discovery and loading via Python entry points.

Architecture Components

1. Provider Hierarchy

All data sources follow a hierarchical naming convention:

provider/source_name

Examples:

browse-mcp/arxiv - Built-in arXiv searcher
browse-mcp/pubmed - Built-in PubMed searcher
context7/web - Context7 plugin web searcher
social-media/twitter - Social media plugin Twitter searcher
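
The `provider/source_name` convention can be split and validated with a few lines of Python. The sketch below is illustrative only: `SourceId` and `parse_source_id` are hypothetical names, not part of browse-mcp, and the rule of exactly one slash is an assumption drawn from the examples above.

```python
from typing import NamedTuple


class SourceId(NamedTuple):
    """A parsed `provider/source_name` identifier (hypothetical helper type)."""
    provider: str
    source: str


def parse_source_id(identifier: str) -> SourceId:
    """Split an identifier such as 'browse-mcp/arxiv' into its two parts."""
    provider, sep, source = identifier.partition("/")
    # Require exactly one slash with non-empty text on both sides.
    if not sep or not provider or not source or "/" in source:
        raise ValueError(f"expected 'provider/source_name', got {identifier!r}")
    return SourceId(provider, source)


print(parse_source_id("browse-mcp/arxiv"))
```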

2. Provider Registry

Location: provider.index.json (root directory)

This JSON file catalogs all available data sources by category:

{
  "providers": {
    "academic": {
      "name": "Academic Sources",
      "description": "Research paper databases and preprint servers",
      "sources": {
        "arxiv": {
          "name": "arXiv",
          "description": "Open access preprint repository",
          "apiKey": "none",
          "documentation": "https://arxiv.org/help/api"
        }
      }
    }
  }
}

Categories:

academic - Research databases (arXiv, PubMed, Semantic Scholar, etc.)
publisher - Publisher-specific sources (IEEE, Springer, ScienceDirect, etc.)
institutional - University repositories (CORE, ResearchGate, etc.)
web - General web sources (Google Scholar, Sci-Hub, etc.)
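
To see how the registry structure can be consumed, the following sketch lists the source keys under each category. It assumes only the `providers` and `sources` keys shown in the JSON above; `sources_by_category` is a hypothetical helper, and the inline sample stands in for the real `provider.index.json`.

```python
import json
from typing import Dict, List


def sources_by_category(registry_text: str) -> Dict[str, List[str]]:
    """Map each category key to the sorted source keys listed under it."""
    registry = json.loads(registry_text)
    return {
        category: sorted(entry.get("sources", {}))
        for category, entry in registry.get("providers", {}).items()
    }


# Inline sample mirroring the structure shown above; a real caller would
# read provider.index.json from the repository root instead.
SAMPLE = """
{"providers": {"academic": {"name": "Academic Sources",
  "sources": {"arxiv": {"name": "arXiv", "apiKey": "none"}}}}}
"""
print(sources_by_category(SAMPLE))
```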

Built-in Providers

Location

backend/browse-mcp/browse_mcp/sources/

Available Sources (20+)

Source	Description	API Key
arxiv.py	arXiv preprint server	None
pubmed.py	PubMed/MEDLINE database	None
pmc.py	PubMed Central full-text	None
biorxiv.py	bioRxiv preprint server	None
medrxiv.py	medRxiv preprint server	None
semantic.py	Semantic Scholar API	Optional
core.py	CORE aggregator	Optional
crossref.py	Crossref metadata	None
iacr.py	IACR cryptology eprints	None
acm.py	ACM Digital Library	Optional
ieee.py	IEEE Xplore	Required
sciencedirect.py	ScienceDirect	Required
springer.py	SpringerLink	Required
scopus.py	Scopus	Required
google_scholar.py	Google Scholar	None
jstor.py	JSTOR	Required
researchgate.py	ResearchGate	None
wos.py	Web of Science	Required
sci_hub.py	Sci-Hub	None
hub.py	Generic hub searcher	None

Implementation Pattern

All built-in searchers inherit from BaseSearcher and implement two methods, search and get_paper_details:

from typing import Dict, List

from browse_mcp.base import BaseSearcher

class ArxivSearcher(BaseSearcher):
    def search(self, query: str, **kwargs) -> List[Dict]:
        """Execute search and return results."""
        pass

    def get_paper_details(self, paper_id: str) -> Dict:
        """Fetch detailed metadata for a paper."""
        pass
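
The real `BaseSearcher` lives in `browse_mcp/base.py` and is not reproduced here. As a rough mental model, a base class exposing this interface could look like the sketch below. Everything in it is an assumption inferred from how `BaseSearcher` is used elsewhere in this document (the `name` constructor argument, the `logger` attribute); the use of `abc` and the class names are illustrative.

```python
import logging
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchBaseSearcher(ABC):
    """Illustrative stand-in for BaseSearcher; not the actual browse-mcp class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.logger = logging.getLogger(f"searcher.{name}")

    @abstractmethod
    def search(self, query: str, **kwargs) -> List[Dict]:
        """Execute a search and return a list of result records."""

    @abstractmethod
    def get_paper_details(self, paper_id: str) -> Dict:
        """Fetch detailed metadata for one paper."""


class EchoSearcher(SketchBaseSearcher):
    """Toy subclass that returns canned data instead of calling an API."""

    def __init__(self) -> None:
        super().__init__(name="echo")

    def search(self, query: str, **kwargs) -> List[Dict]:
        return [{"id": "1", "title": f"Result for {query}"}]

    def get_paper_details(self, paper_id: str) -> Dict:
        return {"id": paper_id, "title": "Example"}
```

Declaring both methods abstract means a subclass that forgets one of them fails at instantiation time rather than at the first search.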

Plugin Providers

Location

backend/plugins/
├── browse-mcp-plugin-context7/
│   ├── pyproject.toml
│   ├── README.md
│   ├── CHANGELOG.md
│   └── browse_mcp_plugin_context7/
│       └── searcher.py
└── browse-mcp-plugin-social-media/
    ├── pyproject.toml
    ├── README.md
    ├── CHANGELOG.md
    └── browse_mcp_plugin_social_media/
        └── searcher.py

Plugin Discovery Mechanism

Plugins register their searchers via entry points in pyproject.toml:

[tool.poetry.plugins."browse_mcp.searchers"]
context7_web = "browse_mcp_plugin_context7.searcher:Context7Searcher"
twitter = "browse_mcp_plugin_social_media.twitter:TwitterSearcher"
linkedin = "browse_mcp_plugin_social_media.linkedin:LinkedInSearcher"

Entry Point Namespace: browse_mcp.searchers
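
Entry points are ordinary package metadata, so the namespace can be inspected with the standard library alone. The sketch below lists what is registered under a group without importing any plugin code. `list_entry_points` is a hypothetical helper; the branch on `select` is there because the return type of `entry_points()` changed in Python 3.10.

```python
from importlib.metadata import entry_points
from typing import Dict


def list_entry_points(group: str = "browse_mcp.searchers") -> Dict[str, str]:
    """Return {entry point name: 'module:attr' target} for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# Prints an empty dict when no plugin is installed in this environment.
print(list_entry_points())
```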

Loading Mechanism

The plugin system uses stevedore to discover and load entry points:

from stevedore import extension

def load_plugins():
    """Load all registered searcher plugins."""
    mgr = extension.ExtensionManager(
        namespace='browse_mcp.searchers',
        invoke_on_load=True,
    )
    return {ext.name: ext.obj for ext in mgr}

Reference: backend/browse-mcp/browse_mcp/plugin.py
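
With `invoke_on_load=True`, each entry point's class is imported and instantiated while the manager is being built. To make that step concrete, here is a stdlib-only sketch of the same idea that also records failures per plugin. It is not how stevedore is implemented and not the code in `plugin.py`; `load_searchers` is a hypothetical helper, and the demo entry points target standard-library names purely so the snippet runs anywhere.

```python
from importlib.metadata import EntryPoint
from typing import Any, Dict, Iterable, Tuple


def load_searchers(
    eps: Iterable[EntryPoint],
) -> Tuple[Dict[str, Any], Dict[str, Exception]]:
    """Import and instantiate each entry point, isolating failures."""
    loaded: Dict[str, Any] = {}
    failed: Dict[str, Exception] = {}
    for ep in eps:
        try:
            cls = ep.load()          # import the module, resolve the attribute
            loaded[ep.name] = cls()  # instantiate with no arguments
        except Exception as exc:     # one bad plugin must not stop the rest
            failed[ep.name] = exc
    return loaded, failed


# Demo with hand-built entry points (stand-ins for real plugins).
demo = [
    EntryPoint(name="ok", value="collections:OrderedDict", group="demo"),
    EntryPoint(name="broken", value="no_such_module_xyz:Missing", group="demo"),
]
loaded, failed = load_searchers(demo)
print(sorted(loaded), sorted(failed))
```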

Creating a New Plugin

1. Package Structure

Create a new package following the naming convention:

backend/plugins/browse-mcp-plugin-{name}/
├── pyproject.toml
├── README.md
├── CHANGELOG.md
├── browse_mcp_plugin_{name}/
│   ├── __init__.py
│   └── searcher.py
└── dist/

2. Implement Searcher

Create a searcher class that inherits from BaseSearcher:

from typing import Dict, List

from browse_mcp.base import BaseSearcher


class MySearcher(BaseSearcher):
    """Custom data source searcher."""

    def __init__(self):
        super().__init__(name="my_source")

    def search(self, query: str, **kwargs) -> List[Dict]:
        """Search implementation."""
        results: List[Dict] = []
        # Your search logic here: append one dict per hit to results
        return results

    def get_paper_details(self, paper_id: str) -> Dict:
        """Fetch paper details."""
        details: Dict = {}
        # Your detail fetching logic: fill details with the paper's metadata
        return details

3. Register Entry Point

Add the entry point to pyproject.toml:

[tool.poetry]
name = "browse-mcp-plugin-myname"
version = "0.1.0"

[tool.poetry.dependencies]
browse-mcp = "^0.1.0"

[tool.poetry.plugins."browse_mcp.searchers"]
my_searcher = "browse_mcp_plugin_myname.searcher:MySearcher"

4. Update Provider Registry

Add your plugin to provider.index.json:

{
  "providers": {
    "custom": {
      "name": "Custom Sources",
      "sources": {
        "my_source": {
          "name": "My Source",
          "description": "Description of my data source",
          "apiKey": "required",
          "documentation": "https://docs.mysource.com"
        }
      }
    }
  }
}
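
If the registry is edited by script rather than by hand, a small validation step can catch missing fields before they reach `provider.index.json`. The sketch below is one way to do that. The required field set and the allowed `apiKey` values are assumptions inferred from the examples and the source table in this document, and `add_source` is a hypothetical helper.

```python
import json
from typing import Any, Dict

# Assumed field set and apiKey values, inferred from the examples and the
# source table in this document; the real schema may differ.
REQUIRED_FIELDS = ("name", "description", "apiKey", "documentation")
API_KEY_VALUES = {"none", "optional", "required"}


def add_source(registry: Dict[str, Any], category: str, key: str,
               entry: Dict[str, str]) -> None:
    """Validate `entry` and insert it under providers[category].sources."""
    missing = [f for f in REQUIRED_FIELDS if f not in entry]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if entry["apiKey"] not in API_KEY_VALUES:
        raise ValueError(f"unexpected apiKey value: {entry['apiKey']!r}")
    providers = registry.setdefault("providers", {})
    sources = providers.setdefault(category, {}).setdefault("sources", {})
    if key in sources:
        raise ValueError(f"source {key!r} already registered in {category!r}")
    sources[key] = entry


registry: Dict[str, Any] = {"providers": {}}
add_source(registry, "custom", "my_source", {
    "name": "My Source",
    "description": "Description of my data source",
    "apiKey": "required",
    "documentation": "https://docs.mysource.com",
})
print(json.dumps(registry, indent=2))
```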

5. Install Plugin

cd backend/plugins/browse-mcp-plugin-myname
poetry install

The plugin will be automatically discovered on next application start.

Plugin Lifecycle

Discovery

1. The application starts.
2. Stevedore scans the browse_mcp.searchers namespace.
3. All registered entry points are discovered.
4. Plugins are loaded and instantiated.

Loading

# In browse_mcp/plugin.py
from stevedore import extension

def discover_searchers():
    """Discover all available searchers (built-in + plugins)."""
    mgr = extension.ExtensionManager(
        namespace='browse_mcp.searchers',
        invoke_on_load=True,
        propagate_map_exceptions=True,
    )

    searchers = {}
    for ext in mgr:
        # ext.name is the entry point name
        # ext.obj is the instantiated searcher
        searchers[ext.name] = ext.obj

    return searchers

Usage

from browse_mcp.plugin import discover_searchers

# Load all searchers
searchers = discover_searchers()

# Use a specific searcher
arxiv = searchers['arxiv']
results = arxiv.search("quantum computing")

# Use a plugin searcher
context7 = searchers['context7_web']
results = context7.search("machine learning")
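
Because every searcher exposes the same `search` method, the dictionary returned by discovery can also be queried as a whole. The sketch below fans one query out to all searchers and keeps going when one of them fails. `search_all` and `StubSearcher` are hypothetical names; the stubs replace real discovered searchers so the snippet runs without any plugin installed.

```python
from typing import Any, Dict, List, Mapping


def search_all(searchers: Mapping[str, Any], query: str) -> Dict[str, List[Dict]]:
    """Run `query` against every searcher; failures yield an empty list."""
    results: Dict[str, List[Dict]] = {}
    for name, searcher in searchers.items():
        try:
            results[name] = searcher.search(query)
        except Exception:
            # Keep going so one failing source does not hide the others.
            results[name] = []
    return results


class StubSearcher:
    """Stand-in for a discovered searcher; returns one canned record."""

    def __init__(self, label: str) -> None:
        self.label = label

    def search(self, query: str, **kwargs) -> List[Dict]:
        return [{"title": f"{self.label}: {query}"}]


stubs = {"arxiv": StubSearcher("arXiv"), "context7_web": StubSearcher("Context7")}
print(search_all(stubs, "quantum computing"))
```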

Best Practices

1. Naming Convention

Entry Point Names:

Use lowercase with underscores: my_source, web_searcher
Be descriptive: twitter not tw, semantic_scholar not ss

Package Names:

Follow pattern: browse-mcp-plugin-{name}
Use hyphens, not underscores: browse-mcp-plugin-context7

Module Names:

Use underscores: browse_mcp_plugin_context7
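
These conventions are simple enough to check mechanically, for example in a pre-commit hook. The sketch below is one possible encoding. The regular expressions are an interpretation of the rules above rather than checks enforced by the project, and both function names are hypothetical.

```python
import re

# Lowercase words separated by single underscores, e.g. "semantic_scholar".
ENTRY_POINT_RE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# "browse-mcp-plugin-" followed by hyphen-separated lowercase words.
PACKAGE_RE = re.compile(r"^browse-mcp-plugin-[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_entry_point_name(name: str) -> bool:
    """True for lowercase names with underscores, e.g. 'semantic_scholar'."""
    return ENTRY_POINT_RE.fullmatch(name) is not None


def module_name_for(package: str) -> str:
    """Derive the importable module name from a hyphenated package name."""
    if PACKAGE_RE.fullmatch(package) is None:
        raise ValueError(f"not a browse-mcp plugin package name: {package!r}")
    return package.replace("-", "_")


print(module_name_for("browse-mcp-plugin-context7"))
```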

2. Error Handling

Plugins should handle errors gracefully:

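class APIError(Exception):
    """Placeholder for the error type raised by your API client."""


class MySearcher(BaseSearcher):
    def search(self, query: str, **kwargs) -> List[Dict]:
        try:
            results: List[Dict] = []  # Search logic fills this list
            return results
        except APIError as e:
            # Expected failure: log it and degrade to an empty result set
            self.logger.error(f"API error: {e}")
            return []
        except Exception as e:
            # Unexpected failure: log with traceback and let it propagate
            self.logger.exception(f"Unexpected error: {e}")
            raise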

3. Configuration

Use environment variables for API keys and configuration:

import os

class MySearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="my_source")
        self.api_key = os.getenv("MY_SOURCE_API_KEY")
        if not self.api_key:
            raise ValueError("MY_SOURCE_API_KEY not set")
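
The source table distinguishes required from optional API keys, and the same lookup can serve both cases. The helper below is a sketch under that assumption: `read_api_key` is a hypothetical name, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def read_api_key(var_name: str, required: bool = True,
                 env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key from the environment, or None for optional keys."""
    source = os.environ if env is None else env
    value = source.get(var_name, "").strip()
    if value:
        return value
    if required:
        # Same message shape as the constructor check above.
        raise ValueError(f"{var_name} not set")
    return None


# An optional key that is absent simply yields None.
print(read_api_key("MY_SOURCE_API_KEY", required=False, env={}))
```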

4. Testing

Each plugin should include tests:

# tests/test_searcher.py
import pytest
from browse_mcp_plugin_myname.searcher import MySearcher

def test_search():
    searcher = MySearcher()
    results = searcher.search("test query")
    assert len(results) > 0
    assert "title" in results[0]
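
A test like the one above depends on the live data source being reachable and returning results. One common alternative is to stub the network call so the test exercises only the plugin's own logic. The sketch below assumes the searcher isolates its HTTP request in a single method; `DemoSearcher` and its `_fetch` method are hypothetical and stand in for a real plugin class.

```python
from typing import Dict, List
from unittest.mock import patch


class DemoSearcher:
    """Hypothetical searcher that isolates its network call in `_fetch`."""

    def _fetch(self, query: str) -> Dict:
        raise RuntimeError("network access is not available in unit tests")

    def search(self, query: str, **kwargs) -> List[Dict]:
        payload = self._fetch(query)
        # Normalise the raw payload into the record shape callers expect.
        return [{"title": item["title"]} for item in payload.get("items", [])]


def test_search_with_stubbed_fetch() -> None:
    canned = {"items": [{"title": "A paper"}, {"title": "Another paper"}]}
    # Replace _fetch on the class for the duration of the with-block.
    with patch.object(DemoSearcher, "_fetch", return_value=canned) as fake:
        results = DemoSearcher().search("test query")
    fake.assert_called_once_with("test query")
    assert [r["title"] for r in results] == ["A paper", "Another paper"]


test_search_with_stubbed_fetch()
```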

5. Documentation

Include in README.md:

Purpose and supported data sources
API key requirements
Installation instructions
Usage examples
Rate limits and limitations

Forbidden Patterns

❌ Hardcoding API Keys

# Bad
class MySearcher(BaseSearcher):
    api_key = "sk-1234567890abcdef"

# Good
import os

class MySearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="my_source")
        self.api_key = os.getenv("MY_SOURCE_API_KEY")

❌ Direct Imports Without Entry Points

# Bad - bypassing plugin system
from browse_mcp_plugin_myname.searcher import MySearcher
searcher = MySearcher()

# Good - using plugin discovery
from browse_mcp.plugin import discover_searchers
searchers = discover_searchers()
searcher = searchers['my_searcher']

❌ Blocking Operations Without Async

# Bad - blocking I/O
def search(self, query: str) -> List[Dict]:
    response = requests.get(url)  # Blocks thread
    return response.json()

# Good - async for I/O
async def search(self, query: str) -> List[Dict]:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.json()
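
When a client library only offers a blocking API, an alternative to rewriting it with aiohttp is to push the blocking call onto a worker thread so the event loop stays responsive. A minimal sketch, assuming a synchronous `search` method: `BlockingSearcher` and `search_async` are hypothetical names, and `time.sleep` stands in for the HTTP request.

```python
import asyncio
import time
from typing import Dict, List


class BlockingSearcher:
    """Stand-in for a searcher whose client library is synchronous."""

    def search(self, query: str) -> List[Dict]:
        time.sleep(0.1)  # simulates a blocking HTTP round trip
        return [{"title": query}]


async def search_async(searcher: BlockingSearcher, query: str) -> List[Dict]:
    """Run the blocking search in the default thread pool executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, searcher.search, query)


async def main() -> List[List[Dict]]:
    searcher = BlockingSearcher()
    # Both calls wait concurrently instead of back to back.
    return await asyncio.gather(
        search_async(searcher, "quantum computing"),
        search_async(searcher, "machine learning"),
    )


print(asyncio.run(main()))
```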

Troubleshooting

Plugin Not Found

Symptoms: Plugin doesn't appear in loaded searchers

Solutions:

Verify that the plugin package is installed, and that its pyproject.toml declares the entry point:
```
poetry show browse-mcp-plugin-myname
```

Check entry point registration:

from stevedore import extension
mgr = extension.ExtensionManager('browse_mcp.searchers')
print([ext.name for ext in mgr])

Reinstall plugin:

cd backend/plugins/browse-mcp-plugin-myname
poetry install

Import Errors

Symptoms: ModuleNotFoundError when loading plugin

Solutions:

Ensure plugin package is installed
Check import paths in entry point definition
Verify __init__.py files exist in all package directories
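
The second check, the import path in the entry point definition, can be automated. The sketch below takes a `module.path:ClassName` target string and reports which part fails to resolve. `check_entry_point_target` is a hypothetical helper, and the example call uses a standard-library target so it runs without any plugin installed.

```python
import importlib


def check_entry_point_target(target: str) -> str:
    """Report which part of a 'module.path:ClassName' target fails to resolve."""
    module_path, sep, attr = target.partition(":")
    if not sep or not module_path or not attr:
        return "malformed target: expected 'module.path:ClassName'"
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # Covers ModuleNotFoundError, e.g. a missing package or __init__.py.
        return f"cannot import module {module_path!r}: {exc}"
    if not hasattr(module, attr):
        return f"module {module_path!r} has no attribute {attr!r}"
    return "ok"


print(check_entry_point_target("json:JSONDecoder"))
```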

API Key Issues

Symptoms: ValueError: MY_SOURCE_API_KEY not set

Solutions:

Set environment variable:

export MY_SOURCE_API_KEY="your-key-here"

Add to .env file:
```
MY_SOURCE_API_KEY=your-key-here
```
Check key loading in plugin code

Examples

Example 1: Context7 Plugin

# browse_mcp_plugin_context7/searcher.py
import os
from typing import Dict, List

from browse_mcp.base import BaseSearcher

class Context7Searcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="context7")
        self.api_key = os.getenv("CONTEXT7_API_KEY")

    def search(self, query: str, **kwargs) -> List[Dict]:
        # Context7 API search implementation
        pass

# pyproject.toml
[tool.poetry.plugins."browse_mcp.searchers"]
context7_web = "browse_mcp_plugin_context7.searcher:Context7Searcher"

Example 2: Social Media Plugin

# browse_mcp_plugin_social_media/twitter.py
import os
from typing import Dict, List

from browse_mcp.base import BaseSearcher

class TwitterSearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="twitter")
        self.bearer_token = os.getenv("TWITTER_BEARER_TOKEN")

    def search(self, query: str, **kwargs) -> List[Dict]:
        # Twitter API v2 search implementation
        pass

# pyproject.toml
[tool.poetry.plugins."browse_mcp.searchers"]
twitter = "browse_mcp_plugin_social_media.twitter:TwitterSearcher"
linkedin = "browse_mcp_plugin_social_media.linkedin:LinkedInSearcher"

References

Stevedore Documentation: https://docs.openstack.org/stevedore/latest/
Plugin Implementation: backend/browse-mcp/browse_mcp/plugin.py
Base Searcher: backend/browse-mcp/browse_mcp/base.py
Provider Registry: provider.index.json
Example Plugins: backend/plugins/

Last Updated: 2026-02-03
