OctoReport Docs

Knowledge Base Management

What is a Knowledge Base

A Knowledge Base (Library) is a container in OctoReport for categorizing and organizing content.

Core Functions:

  • Content Classification: Store different topics in separate categories
  • Multi-Source Aggregation: One library can link to multiple data sources, automatically collecting all their content
  • Report Generation: Report templates can extract content from specified libraries for analysis
  • Conversational Q&A: The Ask feature answers questions based on library content

Relationship Explanation:

  • 1 library can link to multiple data sources
  • 1 data source can also link to multiple libraries
  • Content collected by data sources is automatically stored in all linked libraries

[Figure: Library architecture diagram, showing relationships between data sources, libraries, and reports/Ask]

ℹ️ Note

Libraries are the core of content management. Planning the library structure carefully makes report generation and Q&A noticeably more efficient.

Creating a Library

Creation Steps

  1. Click "Knowledge Base Management" in the left sidebar
  2. Click "New Library" button
  3. Fill in basic information:
    • Name: Library name (required)
    • Description: Purpose description (optional, recommended)
  4. Click "Save"

Configuration Examples

Example 1: AI Industry News Library


Example 2: Government Tender Information Library


Example 3: Competitor Analysis Library


Best Practices

  • Concise, clear names: 2-8 words are recommended, so the theme is immediately clear
  • Detailed descriptions: State the library's purpose, the types of data sources linked to it, and its intended use
  • Topic-based categorization: Avoid overly broad libraries (e.g., "All News"); subdivide by industry and topic instead
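The naming and description guidelines above can be sketched as a small check. This is only an illustration: `LibraryConfig` and `check_library` are hypothetical names rather than part of OctoReport, and the description text is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class LibraryConfig:
    """Hypothetical stand-in for the fields of the "New Library" form."""
    name: str              # required
    description: str = ""  # optional, recommended


def check_library(cfg: LibraryConfig) -> list[str]:
    """Return warnings based on the best practices listed above."""
    warnings = []
    words = cfg.name.split()
    if not words:
        warnings.append("name is required")
    elif not 2 <= len(words) <= 8:
        warnings.append("name should be 2-8 words")
    if not cfg.description.strip():
        warnings.append("description is recommended")
    return warnings


good = LibraryConfig(
    name="AI Industry News",
    description="Illustrative text: AI news from RSS and search sources, used for weekly reports.",
)
too_broad = LibraryConfig(name="News")

print(check_library(good))       # []
print(check_library(too_broad))  # ['name should be 2-8 words', 'description is recommended']
```

A word count cannot tell whether a name is too broad in meaning, so a check like this complements rather than replaces a manual review of the library list.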

Linking Data Sources

Method 1: Link from Library Page (Recommended)

  1. Open the library details page
  2. Click the "Link Data Source" button
  3. Select a data source from the dropdown list
  4. Click "Confirm"

Method 2: Link from Data Source Page

  1. Go to "Data Source Management"
  2. When creating or editing a data source, select the target library in the "Linked Libraries" field
  3. Save the data source

Many-to-Many Relationships

Libraries and data sources support many-to-many linking:

Scenario 1: One data source linked to multiple libraries

Data Source: "36Kr Tech News"
  ├─ Linked Library: "AI Industry News"
  ├─ Linked Library: "Startup Investment News"
  └─ Linked Library: "Product Design Inspiration"

Scenario 2: One library linked to multiple data sources

Library: "AI Industry News"
  ├─ Linked Data Source: "36Kr Tech News" (RSS)
  ├─ Linked Data Source: "Machine Learning Blog" (RSS)
  ├─ Linked Data Source: "Google AI News" (Google News)
  └─ Linked Data Source: "AI Keyword Search" (Search Source)

⚠️ Note

  • Linking is bidirectional: a link created from either the library page or the data source page is the same link
  • After unlinking, content that was already collected is not deleted; it remains in the library
  • A newly linked data source only delivers content collected from that point on; there is no historical backfill
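The linking rules above can be illustrated with a toy in-memory model. It is a sketch under stated assumptions, not OctoReport's actual storage: the `Workspace` class and its methods are hypothetical, and links are modeled as a single set of (source, library) pairs.

```python
from collections import defaultdict


class Workspace:
    """Toy in-memory model of library/data source links (hypothetical)."""

    def __init__(self):
        self.links = set()                # {(source, library)} pairs
        self.content = defaultdict(list)  # library -> list of item titles

    def link(self, source, library):
        # One shared set of pairs: linking from either page is the same operation.
        self.links.add((source, library))

    def unlink(self, source, library):
        # Only the link is removed; stored content is left untouched.
        self.links.discard((source, library))

    def collect(self, source, title):
        # New content is stored in every library linked at collection time.
        for src, library in self.links:
            if src == source:
                self.content[library].append(title)


ws = Workspace()
ws.link("36Kr Tech News", "AI Industry News")
ws.collect("36Kr Tech News", "article A")

# Linked later: receives only future content, no backfill of "article A".
ws.link("36Kr Tech News", "Startup Investment News")
ws.collect("36Kr Tech News", "article B")

# Unlinking keeps what was already collected.
ws.unlink("36Kr Tech News", "AI Industry News")
ws.collect("36Kr Tech News", "article C")

print(ws.content["AI Industry News"])         # ['article A', 'article B']
print(ws.content["Startup Investment News"])  # ['article B', 'article C']
```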

[Figure: Library and data source linking diagram, showing many-to-many relationships]

Viewing and Filtering Content

Content List

Open the library details page to see all collected content:

Display Information:

  • Title: Content title
  • Source: Which data source it came from
  • Collection Time: When content was collected
  • Status: Whether the content has been cleaned and whether it has expired

Sorting Options:

  • Default: by "Collection Time", descending (newest first)
  • Alternative: sort by "Title"

Filtering Features

Filter by Data Source:

  • Click the "Data Source" dropdown menu
  • Select a specific data source to show only its content

Filter by Time Range:

  • Click the "Time Range" selector
  • Choose a preset range (last 7/30/90 days) or custom dates

Filter by Cleaning Status:

  • Cleaned: A summary and keywords have been extracted by an LLM
  • Uncleaned: The original HTML content is retained
  • All: Show all content

Filter by Expiration Status:

  • Valid Content: Current latest version (default)
  • Expired Content: Old versions marked as expired due to URL deduplication
  • All: Show all content
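The four filters and the default sort order can be sketched as a single function. This is a minimal illustration, assuming a hypothetical `Item` record and `filter_items` helper; the field names are not taken from OctoReport. Passing `None` corresponds to "All" for that filter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Item:
    """Hypothetical content record with the fields shown in the content list."""
    title: str
    source: str
    collected_at: datetime
    cleaned: bool = False
    expired: bool = False


def filter_items(items, source=None, last_days=None, cleaned=None,
                 expired=False, now=None):
    """None means "All" for a filter; expired=False mirrors the default "Valid Content"."""
    now = now or datetime.now()
    result = []
    for it in items:
        if source is not None and it.source != source:
            continue
        if last_days is not None and it.collected_at < now - timedelta(days=last_days):
            continue
        if cleaned is not None and it.cleaned != cleaned:
            continue
        if expired is not None and it.expired != expired:
            continue
        result.append(it)
    # Default sort: collection time, newest first.
    return sorted(result, key=lambda it: it.collected_at, reverse=True)


now = datetime(2024, 1, 31)
items = [
    Item("old post", "36Kr Tech News", datetime(2023, 11, 1), cleaned=True),
    Item("fresh post", "36Kr Tech News", datetime(2024, 1, 30)),
    Item("superseded", "Machine Learning Blog", datetime(2024, 1, 29), expired=True),
    Item("blog post", "Machine Learning Blog", datetime(2024, 1, 28), cleaned=True),
]

recent = filter_items(items, last_days=7, now=now)
print([it.title for it in recent])  # ['fresh post', 'blog post']

blog_all = filter_items(items, source="Machine Learning Blog", expired=None, now=now)
print([it.title for it in blog_all])  # ['superseded', 'blog post']
```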

Content Details

Click any content title to view detailed information:

Basic Information:

  • Title, source URL, collection time, data source name

Content Preview:

  • If cleaned: Shows the summary and keywords
  • If uncleaned: Shows the original HTML (you can click "Trigger Cleaning")

Actions:

  • View Original: Open the original URL
  • Trigger Cleaning: Manually trigger LLM cleaning (consumes credits)
  • Delete: Remove the content from this library (copies in other libraries are not affected)

Usage Tips

  • Regular quality checks: Look for irrelevant content and adjust the data source configuration accordingly
  • Manual cleaning: For important content, manually trigger cleaning to get better summaries
  • Use filtering: Before generating reports, use filters to confirm the library has enough relevant content

Management Operations

Edit Library

Click the "Edit" button on the library details page to modify the name or description.

Delete Library

Click the "Delete" button on the library list page.

⚠️ Warning: Deleting a library permanently deletes all of its content. Linked data sources are unaffected.

Clear Content

Click "Clear Content" to empty the library while keeping its configuration. This is useful for testing or for restarting collection.
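The difference between clearing and deleting can be sketched with a toy model. The `LibraryStore` class is hypothetical, and it assumes that "configuration" covers the name, description, and data source links; the source text does not spell out exactly what is kept.

```python
class LibraryStore:
    """Toy model of the management operations; all names are hypothetical."""

    def __init__(self):
        # name -> {"description": str, "sources": set, "items": list}
        self.libraries = {}

    def create(self, name, description=""):
        self.libraries[name] = {"description": description, "sources": set(), "items": []}

    def edit(self, name, new_name=None, description=None):
        # Only the name and description can change; content and links carry over.
        lib = self.libraries.pop(name)
        if description is not None:
            lib["description"] = description
        self.libraries[new_name or name] = lib

    def clear_content(self, name):
        # Assumption: configuration (name, description, links) survives a clear.
        self.libraries[name]["items"].clear()

    def delete(self, name):
        # The library and its content are gone; data sources are stored elsewhere.
        del self.libraries[name]


store = LibraryStore()
store.create("AI Industry News", "Illustrative description")
store.libraries["AI Industry News"]["sources"].add("36Kr Tech News")
store.libraries["AI Industry News"]["items"].append("article A")

store.clear_content("AI Industry News")
print(store.libraries["AI Industry News"]["items"])    # []
print(store.libraries["AI Industry News"]["sources"])  # {'36Kr Tech News'}

store.delete("AI Industry News")
print("AI Industry News" in store.libraries)           # False
```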

Best Practices

✅ Subdivide Libraries by Topic

Recommended:

Library 1: "AI Research Progress"
Library 2: "AI Business Applications"
Library 3: "AI Policy & Regulations"

Not Recommended:

Library: "All AI-Related Content"

Reason: Subdivided libraries are easier to manage and use, and report generation can extract exactly the relevant content.

✅ Properly Use Many-to-Many Linking

Scenario: The same data source may cover multiple topics

Example:

Data Source: "Tech Media General News"
  ├─ Linked Library: "AI Industry News" (AI articles)
  ├─ Linked Library: "Blockchain Updates" (blockchain articles)
  └─ Linked Library: "Tech Company Funding" (financing news)

Benefit: Collect once, use multiple times, save costs.

✅ Regular Checks and Optimization

Checklist:

  • Weekly: check content volume to confirm data sources are working normally
  • Monthly: check content quality and remove irrelevant content
  • As needed: consider merging or splitting libraries based on how often they are used
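The weekly volume check can be approximated by looking for linked data sources that delivered nothing recently. This is a sketch only: `silent_sources` is a hypothetical helper, and it assumes content can be obtained as (source name, collection time) pairs.

```python
from collections import Counter
from datetime import datetime, timedelta


def silent_sources(items, linked_sources, now, days=7):
    """Return linked sources with no items collected in the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(src for src, collected_at in items if collected_at >= cutoff)
    return sorted(s for s in linked_sources if counts[s] == 0)


now = datetime(2024, 1, 31)
items = [
    ("36Kr Tech News", datetime(2024, 1, 30)),
    ("Machine Learning Blog", datetime(2024, 1, 10)),  # older than 7 days
]
linked = {"36Kr Tech News", "Machine Learning Blog", "Google AI News"}

print(silent_sources(items, linked, now))
# ['Google AI News', 'Machine Learning Blog']
```

A source that appears in this list is not necessarily broken; it may simply publish infrequently, so the window should match the source's expected update rate.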

FAQ

Q1: What's the difference between libraries and data sources?

  • Data Source: Defines where to collect content from (search, RSS, email, etc.)
  • Library: Defines how to categorize and use content (report generation, Ask Q&A)

Q2: Does deleting a library affect data sources?

No. Data sources continue collecting content; the deleted library is simply no longer there to store it.

Q3: Why does content appear duplicated?

Possible reasons: the same data source is linked to multiple libraries (this is normal), or the deduplication strategy is set to UPDATE, in which case old versions are marked as expired rather than removed.
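The second cause can be illustrated with a sketch of an UPDATE-style strategy. The source only states that old versions are marked as expired; the record layout and the `store_with_update` function below are assumptions made for illustration.

```python
def store_with_update(library, url, title):
    """Append a new version; mark any earlier valid version of the same URL as expired."""
    for item in library:
        if item["url"] == url and not item["expired"]:
            item["expired"] = True
    library.append({"url": url, "title": title, "expired": False})


library = []
store_with_update(library, "https://example.com/a", "Post (v1)")
store_with_update(library, "https://example.com/a", "Post (v2)")

valid = [i["title"] for i in library if not i["expired"]]
print(len(library))  # 2: both versions are stored
print(valid)         # ['Post (v2)']
```

With the expiration filter set to "All", both versions appear in the list, which can look like duplication; the default "Valid Content" filter shows only the latest one.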

Next Steps

  • Report Generation - Use libraries to generate reports
  • Ask - Q&A based on libraries
  • Configuration Tips - Optimize library configuration