Plugin Architecture

Viben's pluggable Provider system for extensible data sources


Overview

Viben implements a pluggable architecture that allows extending data sources through:

  1. Built-in Providers: Core academic/research data sources in backend/browse-mcp
  2. Plugin Providers: Third-party extensions in backend/plugins/*

The system uses stevedore for dynamic plugin discovery and loading via Python entry points.


Architecture Components

1. Provider Hierarchy

All data sources follow a hierarchical naming convention:

provider/source_name

Examples:

  • browse-mcp/arxiv - Built-in arXiv searcher
  • browse-mcp/pubmed - Built-in PubMed searcher
  • context7/web - Context7 plugin web searcher
  • social-media/twitter - Social media plugin Twitter searcher
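
The naming convention above can be checked mechanically. A minimal sketch, assuming identifiers always contain exactly one slash; `parse_source_id` is a hypothetical helper name, not part of browse-mcp:

```python
from typing import Tuple


def parse_source_id(source_id: str) -> Tuple[str, str]:
    """Split 'provider/source_name' into (provider, source_name)."""
    provider, sep, source = source_id.partition("/")
    # Reject a missing slash, an empty half, or a second slash.
    if not sep or not provider or not source or "/" in source:
        raise ValueError(f"expected 'provider/source_name', got {source_id!r}")
    return provider, source


print(parse_source_id("browse-mcp/arxiv"))
```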

2. Provider Registry

Location: provider.index.json (root directory)

This JSON file catalogs all available data sources by category:

{
  "providers": {
    "academic": {
      "name": "Academic Sources",
      "description": "Research paper databases and preprint servers",
      "sources": {
        "arxiv": {
          "name": "arXiv",
          "description": "Open access preprint repository",
          "apiKey": "none",
          "documentation": "https://arxiv.org/help/api"
        }
      }
    }
  }
}

Categories:

  • academic - Research databases (arXiv, PubMed, Semantic Scholar, etc.)
  • publisher - Publisher-specific sources (IEEE, Springer, ScienceDirect, etc.)
  • institutional - Institutional repositories (CORE, ResearchGate, etc.)
  • web - General web sources (Google Scholar, Sci-Hub, etc.)
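
As an illustration of how the registry could be consumed, the sketch below parses the excerpt shown earlier and lists source keys per category. The helper name `sources_by_category` is hypothetical, and the JSON is inlined so the example runs on its own; a real caller would presumably read provider.index.json from the root directory.

```python
import json
from typing import Dict, List

# Inline copy of the registry excerpt from this page (assumption: the real
# file follows the same shape for every category).
REGISTRY_JSON = """
{
  "providers": {
    "academic": {
      "name": "Academic Sources",
      "description": "Research paper databases and preprint servers",
      "sources": {
        "arxiv": {
          "name": "arXiv",
          "description": "Open access preprint repository",
          "apiKey": "none",
          "documentation": "https://arxiv.org/help/api"
        }
      }
    }
  }
}
"""


def sources_by_category(registry: Dict) -> Dict[str, List[str]]:
    """Map each category key to the sorted list of its source keys."""
    return {
        category: sorted(entry.get("sources", {}))
        for category, entry in registry.get("providers", {}).items()
    }


registry = json.loads(REGISTRY_JSON)
print(sources_by_category(registry))
```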

Built-in Providers

Location

backend/browse-mcp/browse_mcp/sources/

Available Sources (20+)

Source              Description                 API Key
arxiv.py            arXiv preprint server       None
pubmed.py           PubMed/MEDLINE database     None
pmc.py              PubMed Central full text    None
biorxiv.py          bioRxiv preprint server     None
medrxiv.py          medRxiv preprint server     None
semantic.py         Semantic Scholar API        Optional
core.py             CORE aggregator             Optional
crossref.py         Crossref metadata           None
iacr.py             IACR Cryptology ePrint      None
acm.py              ACM Digital Library         Optional
ieee.py             IEEE Xplore                 Required
sciencedirect.py    ScienceDirect               Required
springer.py         SpringerLink                Required
scopus.py           Scopus                      Required
google_scholar.py   Google Scholar              None
jstor.py            JSTOR                       Required
researchgate.py     ResearchGate                None
wos.py              Web of Science              Required
sci_hub.py          Sci-Hub                     None
hub.py              Generic hub searcher        None

Implementation Pattern

All built-in searchers inherit from BaseSearcher and implement:

from typing import Dict, List

from browse_mcp.base import BaseSearcher


class ArxivSearcher(BaseSearcher):
    def search(self, query: str, **kwargs) -> List[Dict]:
        """Execute search and return results."""
        pass

    def get_paper_details(self, paper_id: str) -> Dict:
        """Get detailed metadata for a paper."""
        pass
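
The real BaseSearcher lives in backend/browse-mcp/browse_mcp/base.py and is not reproduced on this page. The sketch below is only a hypothetical stand-in showing one way such a base class could enforce the two-method contract with `abc`; whether the actual class uses abstract methods is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchBaseSearcher(ABC):
    """Hypothetical stand-in for browse_mcp.base.BaseSearcher."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def search(self, query: str, **kwargs) -> List[Dict]:
        """Execute search and return results."""

    @abstractmethod
    def get_paper_details(self, paper_id: str) -> Dict:
        """Get detailed metadata for a paper."""


class IncompleteSearcher(SketchBaseSearcher):
    # get_paper_details is deliberately missing.
    def search(self, query: str, **kwargs) -> List[Dict]:
        return []


try:
    IncompleteSearcher(name="incomplete")
except TypeError as exc:
    # abc refuses to instantiate a subclass with unimplemented methods.
    print(f"rejected: {exc}")
```

With this design, a searcher that forgets one of the methods fails at instantiation rather than at the first call.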

Plugin Providers

Location

backend/plugins/
├── browse-mcp-plugin-context7/
│   ├── pyproject.toml
│   ├── README.md
│   ├── CHANGELOG.md
│   └── browse_mcp_plugin_context7/
│       └── searcher.py
└── browse-mcp-plugin-social-media/
    ├── pyproject.toml
    ├── README.md
    ├── CHANGELOG.md
    └── browse_mcp_plugin_social_media/
        └── searcher.py

Plugin Discovery Mechanism

Plugins register their searchers via entry points in pyproject.toml:

[tool.poetry.plugins."browse_mcp.searchers"]
context7_web = "browse_mcp_plugin_context7.searcher:Context7Searcher"
twitter = "browse_mcp_plugin_social_media.twitter:TwitterSearcher"
linkedin = "browse_mcp_plugin_social_media.linkedin:LinkedInSearcher"

Entry Point Namespace: browse_mcp.searchers
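
Entry points are ordinary package metadata, so the registered names can also be inspected with the standard library alone. A sketch using `importlib.metadata`; the helper name `list_entry_points` is hypothetical, and the output depends on which plugins are installed in the current environment.

```python
from importlib.metadata import entry_points
from typing import Dict


def list_entry_points(group: str = "browse_mcp.searchers") -> Dict[str, str]:
    """Return {entry point name: 'module:attr' target} for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports select(group=...).
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: the result is a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


print(list_entry_points())
```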

Loading Mechanism

The plugin system uses stevedore to discover and load entry points:

from stevedore import extension


def load_plugins():
    """Load all registered searcher plugins."""
    mgr = extension.ExtensionManager(
        namespace='browse_mcp.searchers',
        invoke_on_load=True,
    )
    return {ext.name: ext.obj for ext in mgr}

Reference: backend/browse-mcp/browse_mcp/plugin.py
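
To make the loading step more concrete: an entry point target such as "package.module:ClassName" is imported and the named attribute is looked up, and with invoke_on_load=True stevedore then calls that object to produce the instance. The sketch below reproduces those two steps with `importlib.metadata.EntryPoint` from the standard library. It uses `collections:OrderedDict` as a placeholder target so it runs without any plugin installed; `load_and_instantiate` is a hypothetical name.

```python
from importlib.metadata import EntryPoint


def load_and_instantiate(name: str, target: str):
    """Import 'module:attr' and call the resulting object with no arguments."""
    ep = EntryPoint(name=name, value=target, group="browse_mcp.searchers")
    factory = ep.load()   # import the module, then fetch the attribute
    return factory()      # roughly what invoke_on_load=True adds


obj = load_and_instantiate("demo", "collections:OrderedDict")
print(type(obj).__name__)
```

Because the class is called with no arguments, a searcher loaded this way needs a constructor that works without parameters, as in the examples on this page.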


Creating a New Plugin

1. Package Structure

Create a new package following the naming convention:

backend/plugins/browse-mcp-plugin-{name}/
├── pyproject.toml
├── README.md
├── CHANGELOG.md
├── browse_mcp_plugin_{name}/
│   ├── __init__.py
│   └── searcher.py
└── dist/

2. Implement Searcher

Create a searcher class inheriting from BaseSearcher:

from typing import Dict, List

from browse_mcp.base import BaseSearcher


class MySearcher(BaseSearcher):
    """Custom data source searcher."""

    def __init__(self):
        super().__init__(name="my_source")

    def search(self, query: str, **kwargs) -> List[Dict]:
        """Search implementation."""
        results: List[Dict] = []
        # Your search logic: append one dict per hit to results
        return results

    def get_paper_details(self, paper_id: str) -> Dict:
        """Get paper details."""
        details: Dict = {}
        # Your detail fetching logic: fill in details for paper_id
        return details

3. Register Entry Point

Add entry point in pyproject.toml:

[tool.poetry]
name = "browse-mcp-plugin-myname"
version = "0.1.0"

[tool.poetry.dependencies]
browse-mcp = "^0.1.0"

[tool.poetry.plugins."browse_mcp.searchers"]
my_searcher = "browse_mcp_plugin_myname.searcher:MySearcher"

4. Update Provider Registry

Add the plugin to provider.index.json:

{
  "providers": {
    "custom": {
      "name": "Custom Sources",
      "sources": {
        "my_source": {
          "name": "My Source",
          "description": "Description of my data source",
          "apiKey": "required",
          "documentation": "https://docs.mysource.com"
        }
      }
    }
  }
}
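
Hand-editing the registry is easy to get wrong, so a small validation step may help. The following is a sketch only: `register_source` is a hypothetical helper, and the allowed `apiKey` values are an assumption inferred from the values that appear on this page ("none", "required") plus "optional" from the built-in sources table.

```python
from typing import Dict

# Assumed set of legal apiKey values; not confirmed by the registry schema.
ALLOWED_API_KEY_VALUES = {"none", "optional", "required"}


def register_source(registry: Dict, category: str, category_name: str,
                    source_key: str, source: Dict) -> Dict:
    """Add one source entry to an in-memory registry dict, with basic checks."""
    if source.get("apiKey") not in ALLOWED_API_KEY_VALUES:
        raise ValueError(f"apiKey must be one of {sorted(ALLOWED_API_KEY_VALUES)}")
    providers = registry.setdefault("providers", {})
    entry = providers.setdefault(category, {"name": category_name, "sources": {}})
    sources = entry.setdefault("sources", {})
    if source_key in sources:
        raise KeyError(f"source {source_key!r} already registered in {category!r}")
    sources[source_key] = source
    return registry


registry = register_source(
    {}, "custom", "Custom Sources", "my_source",
    {
        "name": "My Source",
        "description": "Description of my data source",
        "apiKey": "required",
        "documentation": "https://docs.mysource.com",
    },
)
print(sorted(registry["providers"]["custom"]["sources"]))
```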

5. Install Plugin

cd backend/plugins/browse-mcp-plugin-myname
poetry install

The plugin will be automatically discovered on the next application startup.


Plugin Lifecycle

Discovery

  1. Application starts
  2. Stevedore scans the browse_mcp.searchers namespace
  3. Discovers all registered entry points
  4. Loads and instantiates plugins

Loading

# In browse_mcp/plugin.py
from stevedore import extension


def discover_searchers():
    """Discover all available searchers (built-in + plugins)."""
    mgr = extension.ExtensionManager(
        namespace='browse_mcp.searchers',
        invoke_on_load=True,
        propagate_map_exceptions=True,
    )

    searchers = {}
    for ext in mgr:
        # ext.name is the entry point name
        # ext.obj is the instantiated searcher
        searchers[ext.name] = ext.obj

    return searchers

Usage

from browse_mcp.plugin import discover_searchers

# Load all searchers
searchers = discover_searchers()

# Use a specific searcher
arxiv = searchers['arxiv']
results = arxiv.search("quantum computing")

# Use a plugin searcher
context7 = searchers['context7_web']
results = context7.search("machine learning")
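
Indexing the dictionary directly raises a bare KeyError when a name is misspelled or a plugin is not installed. A sketch of a friendlier lookup; `get_searcher` is a hypothetical helper that works on any mapping of names to searcher objects, here demonstrated with placeholder values.

```python
import difflib
from typing import Any, Mapping


def get_searcher(searchers: Mapping[str, Any], name: str) -> Any:
    """Return searchers[name], or raise KeyError with close-match suggestions."""
    try:
        return searchers[name]
    except KeyError:
        close = difflib.get_close_matches(name, list(searchers), n=3)
        hint = f" Did you mean: {', '.join(close)}?" if close else ""
        raise KeyError(f"No searcher registered as {name!r}.{hint}") from None


# Placeholder objects stand in for instantiated searchers.
demo = {"arxiv": object(), "context7_web": object()}
try:
    get_searcher(demo, "arxv")
except KeyError as exc:
    print(exc)
```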

Best Practices

1. Naming Conventions

Entry Point Names:

  • Use lowercase with underscores: my_source, web_searcher
  • Use descriptive names: twitter not tw, semantic_scholar not ss

Package Names:

  • Follow the pattern: browse-mcp-plugin-{name}
  • Use hyphens, not underscores: browse-mcp-plugin-context7

Module Names:

  • Use underscores: browse_mcp_plugin_context7
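
The three conventions above can be derived from a single plugin name. A small sketch; the function names and the exact regular expression are assumptions, chosen to match the examples given in this section.

```python
import re

# Lowercase words made of letters and digits, joined by single underscores.
ENTRY_POINT_RE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def package_name(plugin: str) -> str:
    """Distribution name: hyphens, e.g. browse-mcp-plugin-context7."""
    return f"browse-mcp-plugin-{plugin.replace('_', '-')}"


def module_name(plugin: str) -> str:
    """Importable module name: underscores, e.g. browse_mcp_plugin_context7."""
    return f"browse_mcp_plugin_{plugin.replace('-', '_')}"


def is_valid_entry_point_name(name: str) -> bool:
    """Check the lowercase-with-underscores convention for entry point names."""
    return ENTRY_POINT_RE.fullmatch(name) is not None


print(package_name("social_media"), module_name("social-media"))
```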

2. Error Handling

Plugins should handle errors gracefully:

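# APIError stands for whatever exception your API client raises.
class MySearcher(BaseSearcher):
    def search(self, query: str, **kwargs) -> List[Dict]:
        try:
            results: List[Dict] = []
            # Search logic
            return results
        except APIError as e:
            # Expected failure: log it and return an empty result set
            self.logger.error(f"API error: {e}")
            return []
        except Exception as e:
            # Unexpected failure: log with traceback and re-raise
            self.logger.exception(f"Unexpected error: {e}")
            raise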
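
The same pattern in a form that runs on its own. This is a sketch under stated assumptions: `APIError` and `DemoSearcher` are hypothetical stand-ins (the class does not inherit from BaseSearcher so that the example has no external imports), and the fetch function is injected to simulate both failure modes.

```python
import logging
from typing import Callable, Dict, List


class APIError(Exception):
    """Hypothetical exception a plugin's API client might raise."""


class DemoSearcher:
    """Stand-in searcher; a real plugin would subclass BaseSearcher."""

    def __init__(self, fetch: Callable[[str], List[Dict]]) -> None:
        self.logger = logging.getLogger("demo_searcher")
        self._fetch = fetch

    def search(self, query: str, **kwargs) -> List[Dict]:
        try:
            return self._fetch(query)
        except APIError as exc:
            # Expected upstream failure: degrade to "no results".
            self.logger.error("API error: %s", exc)
            return []
        except Exception:
            # Anything else is likely a bug: log the traceback and re-raise.
            self.logger.exception("Unexpected error during search")
            raise


def rate_limited(query: str) -> List[Dict]:
    raise APIError("rate limited")


print(DemoSearcher(rate_limited).search("quantum computing"))
```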

3. Configuration

Use environment variables for API keys and configuration:

import os


class MySearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="my_source")
        self.api_key = os.getenv("MY_SOURCE_API_KEY")
        if not self.api_key:
            raise ValueError("MY_SOURCE_API_KEY not set")
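
When a plugin reads several settings, the check can be factored into one function. A minimal sketch; `require_env` is a hypothetical helper, and the optional `environ` parameter exists only so the behaviour can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of a required variable, or raise ValueError."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        # Same message format as the constructor check above.
        raise ValueError(f"{name} not set")
    return value


fake_env = {"MY_SOURCE_API_KEY": "example-value"}
print(require_env("MY_SOURCE_API_KEY", fake_env))
```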

4. Testing

Each plugin should include tests:

# tests/test_searcher.py
import pytest
from browse_mcp_plugin_myname.searcher import MySearcher


def test_search():
    searcher = MySearcher()
    results = searcher.search("test query")
    assert len(results) > 0
    assert "title" in results[0]
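
A test like the one above presumably calls the live data source, which makes it dependent on network access and API keys. One common alternative is to stub the network layer with `unittest.mock`. The sketch below assumes a searcher that isolates its HTTP call in a `_fetch_json` method; that method, the class name, and the response shape are all hypothetical.

```python
from typing import Dict, List
from unittest import mock


class SketchSearcher:
    """Hypothetical searcher with its network call isolated in one method."""

    def _fetch_json(self, query: str) -> Dict:
        raise RuntimeError("network access is not available in this sketch")

    def search(self, query: str, **kwargs) -> List[Dict]:
        payload = self._fetch_json(query)
        return [{"title": item["title"]} for item in payload.get("items", [])]


def test_search_with_stubbed_response() -> None:
    searcher = SketchSearcher()
    fake = {"items": [{"title": "Quantum Computing 101"}]}
    # Replace the network call for the duration of the with-block only.
    with mock.patch.object(searcher, "_fetch_json", return_value=fake) as stub:
        results = searcher.search("test query")
    stub.assert_called_once_with("test query")
    assert results == [{"title": "Quantum Computing 101"}]


test_search_with_stubbed_response()
```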

5. Documentation

Include in README.md:

  • Purpose and supported data sources
  • API key requirements
  • Installation instructions
  • Usage examples
  • Rate limits and limitations

Forbidden Patterns

Hardcoded API Keys

# Incorrect
class MySearcher(BaseSearcher):
    api_key = "sk-1234567890abcdef"

# Correct
import os

class MySearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="my_source")
        self.api_key = os.getenv("MY_SOURCE_API_KEY")

Direct Import Bypassing Entry Points

# Incorrect - bypasses plugin system
from browse_mcp_plugin_myname.searcher import MySearcher
searcher = MySearcher()

# Correct - use plugin discovery
from browse_mcp.plugin import discover_searchers
searchers = discover_searchers()
searcher = searchers['my_searcher']

Blocking Operations Without Async

# Incorrect - blocking I/O
def search(self, query: str) -> List[Dict]:
    response = requests.get(url)  # blocks thread
    return response.json()

# Correct - async I/O
async def search(self, query: str) -> List[Dict]:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.json()
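
When an existing synchronous searcher cannot easily be rewritten with an async HTTP client, a different technique is to run the blocking call in a worker thread with `asyncio.to_thread` (Python 3.9+). This is a swapped-in alternative to the aiohttp approach above, not the project's prescribed method; `blocking_search` is a hypothetical stand-in that sleeps instead of calling a real API.

```python
import asyncio
import time
from typing import Dict, List


def blocking_search(query: str) -> List[Dict]:
    """Stand-in for a synchronous search that blocks on network I/O."""
    time.sleep(0.05)
    return [{"title": f"result for {query}"}]


async def search_async(query: str) -> List[Dict]:
    # The blocking call runs in a worker thread, so the event loop stays free.
    return await asyncio.to_thread(blocking_search, query)


async def main() -> List[List[Dict]]:
    # Both searches overlap instead of running back to back.
    return await asyncio.gather(
        search_async("quantum computing"),
        search_async("machine learning"),
    )


results = asyncio.run(main())
print(results)
```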

Troubleshooting

Plugin Not Found

Symptom: Plugin does not appear in loaded searchers

Solution:

  1. Verify that the plugin package is installed (the entry point itself is declared in its pyproject.toml):

    poetry show browse-mcp-plugin-myname
  2. Check entry point registration:

    from stevedore import extension
    mgr = extension.ExtensionManager('browse_mcp.searchers')
    print([ext.name for ext in mgr])
  3. Reinstall the plugin:

    cd backend/plugins/browse-mcp-plugin-myname
    poetry install

Import Errors

Symptom: ModuleNotFoundError when loading plugin

Solution:

  1. Ensure the plugin package is installed
  2. Check the import path in entry point definition
  3. Verify __init__.py exists in all package directories
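
The checks above can be partly automated for a given entry point target. A sketch using only the standard library; `diagnose_target` is a hypothetical helper, and the demonstration uses `json:JSONDecoder` as a placeholder target because no plugin is assumed to be installed.

```python
import importlib
import importlib.util


def diagnose_target(target: str) -> str:
    """Report why a 'package.module:ClassName' target might fail to load."""
    module_name, sep, attr = target.partition(":")
    if not sep or not module_name or not attr:
        return "malformed target: expected 'package.module:ClassName'"
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        spec = None  # a parent package of the module is missing
    if spec is None:
        return f"module {module_name!r} not importable: is the plugin installed?"
    module = importlib.import_module(module_name)
    if not hasattr(module, attr):
        return f"module {module_name!r} has no attribute {attr!r}"
    return "ok"


print(diagnose_target("json:JSONDecoder"))
print(diagnose_target("json:NoSuchClass"))
```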

API Key Issues

Symptom: ValueError: MY_SOURCE_API_KEY not set

Solution:

  1. Set environment variable:

    export MY_SOURCE_API_KEY="your-key-here"
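  2. Add to .env file (this only takes effect if the application loads .env at startup, since os.getenv reads the process environment):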

    MY_SOURCE_API_KEY=your-key-here
  3. Check key loading in plugin code


Examples

Example 1: Context7 Plugin

# browse_mcp_plugin_context7/searcher.py
import os
from typing import Dict, List

from browse_mcp.base import BaseSearcher


class Context7Searcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="context7")
        self.api_key = os.getenv("CONTEXT7_API_KEY")

    def search(self, query: str, **kwargs) -> List[Dict]:
        # Context7 API search implementation
        pass

# pyproject.toml
[tool.poetry.plugins."browse_mcp.searchers"]
context7_web = "browse_mcp_plugin_context7.searcher:Context7Searcher"

Example 2: Social Media Plugin

# browse_mcp_plugin_social_media/twitter.py
import os
from typing import Dict, List

from browse_mcp.base import BaseSearcher


class TwitterSearcher(BaseSearcher):
    def __init__(self):
        super().__init__(name="twitter")
        self.bearer_token = os.getenv("TWITTER_BEARER_TOKEN")

    def search(self, query: str, **kwargs) -> List[Dict]:
        # Twitter API v2 search implementation
        pass

# pyproject.toml
[tool.poetry.plugins."browse_mcp.searchers"]
twitter = "browse_mcp_plugin_social_media.twitter:TwitterSearcher"
linkedin = "browse_mcp_plugin_social_media.linkedin:LinkedInSearcher"

References

  • Stevedore Documentation: https://docs.openstack.org/stevedore/latest/
  • Plugin Implementation: backend/browse-mcp/browse_mcp/plugin.py
  • Base Searcher Class: backend/browse-mcp/browse_mcp/base.py
  • Provider Registry: provider.index.json
  • Example Plugins: backend/plugins/

Last Updated: 2026-02-28