f624971613
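* chore(core.utils): 🚨 Fix lint errors
* chore(core.provider): 🚨 Fix lint errors in the base classes
* chore(core.utils): Complete the overloads for session_get()
* chore(core.provider): 🚨 Fix lint errors in the implementations
* chore(core.platform): 🚨 Fix lint errors in the platform base class and webchat
* chore(core.platform): Fix lint errors in the implementations
* fix(core.provider): Fix a circular call and an incorrect assert
* chore(core.platform): Fix lint errors in some implementations
* chore(core.provider): Add parameter types for Dify.text_chat_stream
* chore(core.pipeline): 🚨 Fix lint errors
* fix(core.slack): Add a missing import
* chore(core.utils): Fix the incorrect session_get declaration
* chore(core.platform): Remove the wildcard from the Lark adapter imports
* chore(core.db): Fix declarations and some logic
* chore(core.db): Add typings so that faiss parameters are recognized correctly
* chore(core): Fix declarations
* chore(core): Revise declarations
* chore: Add faiss declarations
* chore(dashboard): Adjust the implementation to reduce reported errors
* chore(package): Adjust some declarations and implementations to reduce reported errors
* chore(core): Add overloads for Handler so that some asserts can be removed while still passing type checking
* chore(core.pipeline): Change Pipeline Scheduler's execute to check the type instead of an attribute, so that it passes static type checking
* chore(core.config): Add type annotations to pass type checking
* chore(core.message): Add a check to File._download_file to pass type checking
* fix: Replace assertions with conditional checks so that graceful shutdown is fault-tolerant
* refactor: Remove asserts from the discord client; check for None with an if and raise an exception instead
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: DiscordPlatformAdapter logs and returns when self.client.user is None; remove the assertion
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Strengthen Lark-related null/exception checks and improve log output
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Replace assertions with conditional checks and add logging and error handling
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* chore: Remove useless LLM-generated comments
* refactor: Replace the download logic with File.get_file, remove the assert, and provide a default filename
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Raise a runtime exception when the Slack Socket is not initialized; change the image URL emptiness check to a non-empty check
* refactor: Replace the assertions in WeChatPadProAdapter with null checks and add logging
* refactor: Use isinstance instead of assertions for type checks, to aid static checking
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Remove cast, access fields and dict entries directly, and fix port parsing
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Refactor ProviderManager loading with match-case and raise TypeError via type checks
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: For group_name_display, log an error and return if the group object is empty
* fix: Replace the assert in _get_current_persona_id with an if guard that returns None
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Improve the plugin directory existence check and image URL non-empty validation; update the JSON sorting configuration
* fix: Replace the assert on datetime_str with an explicit check that raises an exception
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Remove cast in favor of a runtime check, and skip when no scheduler is found
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Remove cast; check for FaissVecDB with isinstance and warn
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Remove the typing.cast import and validate file_ before resolving the file's absolute path
* refactor: Remove typing.cast and simplify the content safety check call
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Make PlatformMetadata.id required, pass id at registration, and remove cast
* refactor: Remove cast; initialize using HasInitialize and isinstance
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Add an ID type check to ProviderManager.initialize so that None does not cause get to fail
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Introduce _client and a client property in OTTSProvider and AzureNativeProvider to improve context management
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Add a model-not-initialized check to the self-hosted Whisper source and call transcribe directly
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Remove the unused cast import and simplify the platform_name assignment
* refactor: Import cast and use cast(str, ...) on id to improve type safety
* fix: Change _id_to_sid to return str, returning an empty string for empty values; use cast on id and message_id
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Refactor Discord handling: force type conversion, prioritize slash commands, and improve mention detection
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* fix: Apply cast consistently when obtaining id, and raise an error when WeChat message parsing fails
* Revert "fix: Remove cast, access fields and dict entries directly, and fix port parsing"
This reverts commit 1cbfdf9d1b.
* fix: Bailian Rerank returns an empty result when the session is closed; initialize request.prompt to avoid concatenating a null value
* fix: Normalize search result links to strings; add a _get_url helper and adapt Bing/Sogo
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
* refactor: Adjust the call_handler generics, Discord channel annotations, and FishAudioTTS API request types
* refactor: Use col(...) in place of column references and cast results to CursorResult
* chore: ruff format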
---------
Co-authored-by: aider (openai/gemini-3-pro-high) <aider@aider.chat>
Co-authored-by: Soulter <905617992@qq.com>
164 lines
5.5 KiB
Python
import asyncio
import base64
import logging
import os
import uuid

import aiohttp
import dashscope
from dashscope.audio.tts_v2 import AudioFormat, SpeechSynthesizer

try:
    from dashscope.aigc.multimodal_conversation import MultiModalConversation
except (
    ImportError
):  # pragma: no cover - older dashscope versions without Qwen TTS support
    MultiModalConversation = None

from astrbot.core.utils.astrbot_path import get_astrbot_data_path

from ..entities import ProviderType
from ..provider import TTSProvider
from ..register import register_provider_adapter


@register_provider_adapter(
    "dashscope_tts",
    "Dashscope TTS API",
    provider_type=ProviderType.TEXT_TO_SPEECH,
)
class ProviderDashscopeTTSAPI(TTSProvider):
    def __init__(
        self,
        provider_config: dict,
        provider_settings: dict,
    ) -> None:
        super().__init__(provider_config, provider_settings)
        self.chosen_api_key: str = provider_config.get("api_key", "")
        self.voice: str = provider_config.get("dashscope_tts_voice", "loongstella")
        self.set_model(provider_config["model"])
        self.timeout_ms = float(provider_config.get("timeout", 20)) * 1000
        dashscope.api_key = self.chosen_api_key

    async def get_audio(self, text: str) -> str:
        model = self.get_model()
        if not model:
            raise RuntimeError("Dashscope TTS model is not configured.")

        temp_dir = os.path.join(get_astrbot_data_path(), "temp")
        os.makedirs(temp_dir, exist_ok=True)

        if self._is_qwen_tts_model(model):
            audio_bytes, ext = await self._synthesize_with_qwen_tts(model, text)
        else:
            audio_bytes, ext = await self._synthesize_with_cosyvoice(model, text)

        if not audio_bytes:
            raise RuntimeError(
                "Audio synthesis failed, returned empty content. The model may not be supported or the service is unavailable.",
            )

        path = os.path.join(temp_dir, f"dashscope_tts_{uuid.uuid4()}{ext}")
        with open(path, "wb") as f:
            f.write(audio_bytes)
        return path

    def _call_qwen_tts(self, model: str, text: str):
        if MultiModalConversation is None:
            raise RuntimeError(
                "dashscope SDK missing MultiModalConversation. Please upgrade the dashscope package to use Qwen TTS models.",
            )

        kwargs = {
            "model": model,
            "messages": None,
            "api_key": self.chosen_api_key,
            "voice": self.voice or "Cherry",
            "text": text,
        }
        if not self.voice:
            logging.warning(
                "No voice specified for Qwen TTS model, using default 'Cherry'.",
            )
        return MultiModalConversation.call(**kwargs)

    async def _synthesize_with_qwen_tts(
        self,
        model: str,
        text: str,
    ) -> tuple[bytes | None, str]:
        loop = asyncio.get_event_loop()
        response = await loop.run_in_executor(None, self._call_qwen_tts, model, text)
        audio_bytes = await self._extract_audio_from_response(response)
        if not audio_bytes:
            raise RuntimeError(
                f"Audio synthesis failed for model '{model}'. {response}",
            )
        ext = ".wav"
        return audio_bytes, ext

    async def _extract_audio_from_response(self, response) -> bytes | None:
        output = getattr(response, "output", None)
        audio_obj = getattr(output, "audio", None) if output is not None else None
        if not audio_obj:
            return None

        data_b64 = getattr(audio_obj, "data", None)
        if data_b64:
            try:
                return base64.b64decode(data_b64)
            except (ValueError, TypeError):
                logging.exception("Failed to decode base64 audio data.")
                return None

        url = getattr(audio_obj, "url", None)
        if url:
            return await self._download_audio_from_url(url)
        return None

    async def _download_audio_from_url(self, url: str) -> bytes | None:
        if not url:
            return None
        timeout = max(self.timeout_ms / 1000, 1) if self.timeout_ms else 20
        try:
            async with (
                aiohttp.ClientSession() as session,
                session.get(
                    url,
                    timeout=aiohttp.ClientTimeout(total=timeout),
                ) as response,
            ):
                return await response.read()
        except (aiohttp.ClientError, asyncio.TimeoutError, OSError) as e:
            logging.exception(f"Failed to download audio from URL {url}: {e}")
            return None

    async def _synthesize_with_cosyvoice(
        self,
        model: str,
        text: str,
    ) -> tuple[bytes | None, str]:
        synthesizer = SpeechSynthesizer(
            model=model,
            voice=self.voice,
            format=AudioFormat.WAV_24000HZ_MONO_16BIT,
        )
        loop = asyncio.get_event_loop()
        audio_bytes = await loop.run_in_executor(
            None,
            synthesizer.call,
            text,
            self.timeout_ms,
        )
        if not audio_bytes:
            resp = synthesizer.get_response()
            if resp and isinstance(resp, dict):
                raise RuntimeError(
                    f"Audio synthesis failed for model '{model}'. {resp}".strip(),
                )
        return audio_bytes, ".wav"

    def _is_qwen_tts_model(self, model: str) -> bool:
        model_lower = model.lower()
        return "tts" in model_lower and model_lower.startswith("qwen")