Commit Graph

28 Commits

Author SHA1 Message Date
RC-CHN 270c89c12f feat: Add URL document parser for knowledge base (#3622)
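* feat: add uploading documents from a URL, with progress callbacks and error handling

* feat: add the frontend for uploading documents from a URL

* chore: add a warning notice for the URL upload feature to make sure users have configured it correctly

* feat: add content cleaning, with cleaning settings and service provider selection when uploading documents from a URL

* feat: update the content-cleaning system prompt and strengthen the information extraction rules; add a beta label to the URL upload feature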

* style: format code

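* perf: optimize upload settings; improve the disabling logic and cleaning-provider validation for URL uploads

* refactor: use the built-in chunking module

* refactor: extract the prompt into a separate file

* feat: add a Tavily API Key configuration dialog to improve the configuration experience for web search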

* fix: update URL hint and warning messages for clarity in knowledge base upload settings

* fix: fix the hot-reload issue when setting tavily_key

---------

Co-authored-by: Soulter <905617992@qq.com>
2025-11-17 19:05:14 +08:00
LIghtJUNction 0b7fc29ac4 style: add ruff lint module of isort and pyupgrade, and some ruff check fix (#3214)
Co-authored-by: Dt8333 <25431943+Dt8333@users.noreply.github.com>
Co-authored-by: Soulter <905617992@qq.com>
2025-11-01 13:26:19 +08:00
Soulter 0823f7aa48 Use a set when checking membership in a literal collection
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-10-25 22:04:17 +08:00
Soulter eb201c0420 feat: refactor knowledge base parsers and add MarkitdownParser for docx, xls, xlsx support 2025-10-25 22:00:54 +08:00
lxfight 57f868cab1 Merge branch 'feature/knowledge-base' of https://github.com/lxfight/AstrBot into feature/knowledge-base 2025-10-25 13:53:03 +08:00
Soulter 016783a1e5 feat: implement RecursiveCharacterChunker and update KnowledgeBaseManager to use it 2025-10-25 13:46:06 +08:00
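lxfight 594ccff9c8 fix: add database connection checks and knowledge base termination, improve error handling and cleanup logic, and fix the issue where knowledge bases could not be deleted 2025-10-25 11:56:37 +08:00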
Soulter 8f021eb35a feat: refactor document storage to use SQLModel and enhance database operations 2025-10-24 23:17:37 +08:00
Soulter 4cedc6d3c8 feat: add t-SNE visualization for FAISS index and enhance knowledge base retrieval with debug mode 2025-10-24 21:22:46 +08:00
Soulter 4e9cce76da feat: add timing logs for dense and sparse retrieval processes and adjust top K results in sparse retriever 2025-10-24 17:51:30 +08:00
Soulter 9b004f3d2f feat: update document retrieval to include limit and offset parameters 2025-10-24 17:38:22 +08:00
Soulter 9430e3090d feat: add progress callback for document upload and enhance upload progress tracking 2025-10-24 17:13:44 +08:00
Soulter ba44f9117b feat: enhance document upload process with batch settings and improved chunk handling 2025-10-24 16:37:37 +08:00
Soulter 38e3f27899 feat: update knowledge base retrieval configuration and UI adjustments 2025-10-24 15:06:07 +08:00
Soulter a6be0cc135 feat: refresh knowledge base and document after uploading a document 2025-10-24 14:28:27 +08:00
Soulter a53510bc41 refactor: comment out file path handling in KBHelper and search input in DocumentDetail 2025-10-24 14:27:01 +08:00
Soulter 1fd482e899 feat: update chunk deletion to include document ID and refresh metadata 2025-10-24 14:18:32 +08:00
Soulter 2f130ba009 feat: delete chunk and delete document 2025-10-24 13:59:17 +08:00
Soulter e0ac743cdb perf: remove rerank functionality from settings tab and related form data 2025-10-24 12:13:51 +08:00
Soulter 7e0a50fbf2 feat: enhance knowledge base retrieval with chunk metadata and pagination support; remove unused chunk model 2025-10-24 00:44:40 +08:00
Soulter 59df244173 improve 2025-10-23 21:20:41 +08:00
Soulter e3aa1315ae stage 2025-10-23 00:31:15 +08:00
Soulter 65bc5efa19 feat: integrate the knowledge base manager, optimize the knowledge base context injection flow, and remove redundant code 2025-10-22 21:59:00 +08:00
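lxfight a05868cc45 feat: update the knowledge base manager to support rerank model providers; adjust the default configuration and hint text of related components 2025-10-20 22:38:06 +08:00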
lxfight 2fc77aed15 feat: add knowledge base retrieval, with support for listing related sessions by knowledge base ID; update the related UI and i18n text 2025-10-20 22:23:35 +08:00
lxfight beccae933f fix: fix the import issue for KBSessionConfig 2025-10-19 21:36:01 +08:00
lxfight a0254ed817 refactor: tidy the code formatting of the knowledge base manager and database operations 2025-10-19 19:36:26 +08:00
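lxfight ad96d676e6 feat: implement the knowledge base core backend modules
- Implement the complete knowledge base data model (knowledge base, document, document chunk, session configuration)
- Implement SQLite-based vector database storage and retrieval
- Implement document parsers (PDF, TXT) and a fixed-size chunker
- Implement a hybrid retrieval system (dense vector retrieval + BM25 sparse retrieval + RRF fusion)
- Implement knowledge base lifecycle management and a message injector
- Support session-level knowledge base configuration and association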
2025-10-19 18:40:55 +08:00
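
Note on ad96d676e6: the commit message names RRF (Reciprocal Rank Fusion) as the step that merges the dense and BM25 result lists. The sketch below only illustrates the general technique and is not taken from the repository. The function name `rrf_fuse`, the constant `k=60`, and the chunk IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Hypothetical helper, not the project's API. Each ranking is a list of
    IDs ordered best-first. An ID's fused score is the sum of 1 / (k + rank)
    over every list it appears in, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the order is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused if top_n is None else fused[:top_n]


# Made-up chunk IDs standing in for dense and BM25 retrieval output.
dense_hits = ["c1", "c2", "c3"]
sparse_hits = ["c3", "c1", "c4"]
fused_ids = [doc_id for doc_id, _ in rrf_fuse([dense_hits, sparse_hits])]
print(fused_ids)  # ['c1', 'c3', 'c2', 'c4']
```

RRF uses only rank positions, so the dense similarity scores and BM25 scores never need to be normalized onto a common scale before merging.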