browse_read

The browse_read tool extracts text from papers and other content sources. If the content has not been downloaded yet, the tool downloads it automatically before extracting the text.

Basic Usage

browse_read(searcher="arxiv", paper_id="2303.08774")

Parameters

Parameter	Type	Required	Default	Description
`searcher`	string	Yes	-	Source platform
`paper_id`	string	Yes	-	Content identifier (1-200 characters)
`page`	integer	No	-	Read specific page (1-indexed)
`start_page`	integer	No	-	Page range start (1-indexed)
`end_page`	integer	No	-	Page range end (1-indexed)

Pagination

The browse_read tool supports reading specific pages or page ranges from PDF documents. This is useful for:

Reading only specific sections without loading the entire document
Efficiently browsing long papers
Reducing context length for AI assistants

Pagination Parameters

Parameter	Description	Example
`page`	Read a single specific page	`page=3` returns only page 3
`start_page`	Page range start (inclusive)	`start_page=1` starts from page 1
`end_page`	Page range end (inclusive)	`end_page=5` ends at page 5

Pagination Behavior

Parameters	Result
None	Returns all pages
`page=3`	Returns only page 3
`start_page=1, end_page=5`	Returns pages 1-5
`start_page=10`	Returns page 10 through the last page
`end_page=5`	Returns pages 1-5
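
The table above can be read as a rule for turning the three optional parameters into one inclusive page range. The following is a minimal sketch of that rule, not the tool's actual implementation. The function name `resolve_page_range`, the `total_pages` argument, and the choice to let `page` take precedence over a range are assumptions made for illustration.

```python
from typing import Optional, Tuple


def resolve_page_range(
    total_pages: int,
    page: Optional[int] = None,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the inclusive (first, last) page range, 1-indexed."""
    if page is not None:
        # Assumption: a single page takes precedence over a range.
        return (page, page)
    # A missing bound falls back to the start or end of the document.
    first = start_page if start_page is not None else 1
    last = end_page if end_page is not None else total_pages
    return (first, last)


print(resolve_page_range(20))                            # (1, 20)
print(resolve_page_range(20, page=3))                    # (3, 3)
print(resolve_page_range(20, start_page=1, end_page=5))  # (1, 5)
print(resolve_page_range(20, start_page=10))             # (10, 20)
print(resolve_page_range(20, end_page=5))                # (1, 5)
```

Each printed tuple corresponds to one row of the table, using a hypothetical 20-page document.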

Pagination Examples

# Read only abstract (usually page 1)
browse_read(searcher="arxiv", paper_id="2303.08774", page=1)

# Read introduction (pages 1-3)
browse_read(searcher="arxiv", paper_id="2303.08774", start_page=1, end_page=3)

# Start reading from methods section (assuming it starts at page 5)
browse_read(searcher="arxiv", paper_id="2303.08774", start_page=5)

# Read the first 10 pages (assuming the conclusion ends by page 10)
browse_read(searcher="arxiv", paper_id="2303.08774", end_page=10)

Pagination Response Format

When using pagination, the response includes page markers:

--- Page 1 ---
Title: GPT-4 Technical Report

Abstract
We report the development of GPT-4, a large-scale, multimodal
model which can accept image and text inputs...

--- Page 2 ---
1 Introduction
This technical report presents GPT-4, a large multimodal model
capable of processing image and text inputs...
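
If a client needs the pages separately, the markers can be used to split the response. This is a sketch that assumes every marker sits on its own line in exactly the `--- Page N ---` form shown above; the helper name `split_pages` is hypothetical and not part of the tool.

```python
import re
from typing import Dict

# Assumption: markers look exactly like "--- Page 3 ---" on a line of their own.
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---$", re.MULTILINE)


def split_pages(response: str) -> Dict[int, str]:
    """Map each page number to the text that follows its marker."""
    pages: Dict[int, str] = {}
    markers = list(PAGE_MARKER.finditer(response))
    for i, marker in enumerate(markers):
        # A page's text runs until the next marker or the end of the response.
        end = markers[i + 1].start() if i + 1 < len(markers) else len(response)
        pages[int(marker.group(1))] = response[marker.end():end].strip()
    return pages


sample = """--- Page 1 ---
Title: GPT-4 Technical Report

--- Page 2 ---
1 Introduction
"""
pages = split_pages(sample)
print(sorted(pages))  # [1, 2]
print(pages[2])       # 1 Introduction
```

A line in the paper body that happens to match the marker pattern would be treated as a page boundary, so this approach is only as reliable as the marker format.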

Paper ID Formats

Each platform uses its own identifier format. See browse_download for complete format details.

Searcher	Example
`arxiv`	`2303.08774`
`pubmed`	`32790614`
`pmc`	`PMC7419405`
`biorxiv`	`10.1101/2020.01.01.123456`
`medrxiv`	`10.1101/2020.01.01.123456`
`iacr`	`2009/101`
`crossref`	`10.1038/s41586-020-2649-2`
`semantic`	`DOI:10.18653/v1/N18-3011`
`core`	`123456789`

Reading Examples

Read from Different Data Sources

# Read from arXiv
browse_read(searcher="arxiv", paper_id="2106.12345")

# Read from PubMed
browse_read(searcher="pubmed", paper_id="32790614")

# Read from PubMed Central
browse_read(searcher="pmc", paper_id="PMC7419405")

# Read from bioRxiv
browse_read(searcher="biorxiv", paper_id="10.1101/2020.01.01.123456")

# Read from medRxiv
browse_read(searcher="medrxiv", paper_id="10.1101/2020.01.01.123456")

# Read from IACR
browse_read(searcher="iacr", paper_id="2009/101")

# Read from Semantic Scholar
browse_read(searcher="semantic", paper_id="DOI:10.18653/v1/N18-3011")

# Read from CrossRef
browse_read(searcher="crossref", paper_id="10.1038/s41586-020-2649-2")

# Read from CORE
browse_read(searcher="core", paper_id="123456789")

Read from Plugin Data Sources

If the social media plugin is installed:

# Read from GitHub
browse_read(searcher="github", paper_id="owner/repo")

# Read from Twitter
browse_read(searcher="twitter", paper_id="1234567890")

# Read from Zhihu
browse_read(searcher="zhihu", paper_id="123456789")

How It Works

Check local cache: The tool first checks whether the content has already been downloaded
Download if needed: If the content is not available locally, the tool downloads it automatically
Extract text: Uses the appropriate parser (PDF, HTML, etc.) to extract text
Apply pagination: If pagination parameters are set, extracts only the requested pages
Return content: Returns the extracted text as a string
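
The steps above can be sketched as a small pipeline. This is an illustration of the described flow only, not the tool's source code: `read_content`, the in-memory `cache` dictionary, and the injected `download` and `extract_pages` callables are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional


def read_content(
    paper_id: str,
    cache: Dict[str, bytes],
    download: Callable[[str], bytes],
    extract_pages: Callable[[bytes], List[str]],
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> str:
    # Steps 1-2: check the local cache and download only on a miss.
    if paper_id not in cache:
        cache[paper_id] = download(paper_id)
    # Step 3: extract text, one string per page.
    pages = extract_pages(cache[paper_id])
    # Step 4: apply pagination (1-indexed, inclusive bounds).
    first = (start_page or 1) - 1
    last = end_page if end_page is not None else len(pages)
    # Step 5: return the extracted text as one string.
    return "\n".join(pages[first:last])


calls: List[str] = []


def fake_download(paper_id: str) -> bytes:
    """Stand-in downloader that records each call."""
    calls.append(paper_id)
    return b"page one\fpage two\fpage three"


def fake_extract(raw: bytes) -> List[str]:
    """Stand-in parser: treats form feeds as page breaks."""
    return raw.decode().split("\f")


cache: Dict[str, bytes] = {}
print(read_content("demo", cache, fake_download, fake_extract, end_page=2))
print(read_content("demo", cache, fake_download, fake_extract, start_page=3))
print(len(calls))  # 1, because the second read is served from the cache
```

Passing the downloader and parser in as arguments keeps the sketch runnable without network access or a PDF library.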

Response Format

The tool returns extracted text content:

Title: GPT-4 Technical Report

Abstract
We report the development of GPT-4, a large-scale, multimodal
model which can accept image and text inputs and produce text
outputs. While less capable than humans in many real-world
scenarios, GPT-4 exhibits human-level performance on various
professional and academic benchmarks...

1 Introduction
This technical report presents GPT-4, a large multimodal model
capable of processing image and text inputs and producing text
outputs...

[Full paper text continues...]

Input Validation

searcher: Must be one of the enabled data sources
paper_id: Must be 1-200 characters, cannot be empty or whitespace only
page: Must be a positive integer (1 or greater)
start_page: Must be a positive integer (1 or greater)
end_page: Must be a positive integer and, when start_page is also set, greater than or equal to start_page
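
A caller can apply the same rules before invoking the tool. The sketch below mirrors the rules listed above; the function name `validate_read_args`, the `enabled_searchers` argument, and the message wording are assumptions and will not match the tool's own error text.

```python
from typing import List, Optional


def validate_read_args(
    searcher: str,
    paper_id: str,
    enabled_searchers: List[str],
    page: Optional[int] = None,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    problems: List[str] = []
    if searcher not in enabled_searchers:
        problems.append("searcher is not an enabled data source")
    if not paper_id.strip() or len(paper_id) > 200:
        problems.append("paper_id must be 1-200 non-blank characters")
    # Every page parameter that is set must be 1 or greater.
    for name, value in (("page", page), ("start_page", start_page), ("end_page", end_page)):
        if value is not None and value < 1:
            problems.append(f"{name} must be 1 or greater")
    # The range check only applies when both bounds are given.
    if start_page is not None and end_page is not None and end_page < start_page:
        problems.append("end_page must be greater than or equal to start_page")
    return problems


print(validate_read_args("arxiv", "2303.08774", ["arxiv"], start_page=1, end_page=3))
# []
print(validate_read_args("arxiv", "   ", ["arxiv"], start_page=5, end_page=2))
# ['paper_id must be 1-200 non-blank characters', 'end_page must be greater than or equal to start_page']
```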

Error Handling

Common errors and their meanings:

Error	Cause	Solution
Searcher unavailable	Data source not enabled	Enable the data source in configuration
Paper ID cannot be empty	Empty or whitespace only ID	Provide a valid paper ID
Paper not found	Invalid paper ID	Verify paper ID format
Error converting paper to text	PDF parsing failed	Try re-downloading or use another data source
Invalid page number	Page number out of range	Use page numbers between 1 and the document's page count

Tips

:::tip Workflow
For best results, first use browse_search to find papers, then pass the returned paper ID to browse_read to extract content.
:::

:::tip Pagination for Long Papers
For long papers, use pagination to read specific sections:

page=1 to get the abstract
start_page=1, end_page=3 to get the introduction
Read the full paper only when needed
:::

The tool automatically downloads papers, so you don't need to call browse_download first
Downloaded papers are cached for faster subsequent reads
Text extraction quality depends on PDF structure (some scanned PDFs may have poor extraction results)
Pagination only works for PDF content; other content types return full text

Use Cases

Research Summary

Ask your AI assistant:

"Read page 1 of paper 2303.08774 from arXiv and summarize the abstract"

Literature Review

After searching:

"Search for papers about transformer architecture on arXiv, then read pages 1-5 of the top result"

Citation Extraction

"Read the last 3 pages of this paper to find the references section"

Progressive Reading

"Read pages 1-5 first; if more details are needed, read pages 6-10"
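
For a client that automates progressive reading, the successive page ranges can be generated rather than typed by hand. This is a minimal sketch; the generator name `page_windows` and its parameters are hypothetical, and each yielded pair is meant to be passed as `start_page` and `end_page` to browse_read.

```python
from itertools import islice
from typing import Iterator, Tuple


def page_windows(window: int = 5, first_page: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield consecutive inclusive (start_page, end_page) pairs."""
    start = first_page
    while True:
        yield (start, start + window - 1)
        start += window


# The first two windows match the prompt above: pages 1-5, then pages 6-10.
print(list(islice(page_windows(), 2)))  # [(1, 5), (6, 10)]
```

The generator never stops on its own, so the caller decides when enough has been read, for example when a range comes back empty or produces an Invalid page number error.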

Next Steps

browse_search - Search for papers to read
browse_download - Download papers for offline access
MCP Configuration - Configure download path
Plugins - Extend with more content sources
