AstrBot/tests/test_whisper_api_source.py
Weilong Liao 7c366a708b fix: unify media reference handling (#8764)
* fix: unify media reference handling

* fix: accept bare base64 record media refs

* chore: update agents.md

* fix: unify file URI handling across media components and utilities

* fix: unify media reference type handling with MediaRefStr alias

* Potential fix for pull request finding 'CodeQL / Incomplete URL substring sanitization'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Update astrbot/core/platform/sources/discord/discord_platform_adapter.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix: unify media handling and improve base64 decoding across components

* fix: simplify client_kwargs type definition and enhance media message handling in platform adapter

* fix: unify media utility documentation and enhance function descriptions

* perf: drop the "pilk" requirement and improve outbound audio for Tencent-related IM apps that use Silk

* fix: unify Tencent Silk audio handling and enhance media resolver functionality

---

- Centralize media reference materialization and base64 resolution for local paths, http(s), base64://, data URIs, and legacy bare base64 payloads.
- Normalize incoming Record audio to wav and Image media to temporary jpg during preprocess, with event-scoped cleanup.
- Reuse the shared media resolver across OpenAI, Gemini, Anthropic, MiMo, DeerFlow, STT, and platform media paths while sanitizing logs and cleaning temporary conversion outputs.
- Ensure generated TTS audio is tracked for cleanup after the event finishes.
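
To make the reference forms listed above concrete, here is a minimal sketch of how one might tell them apart before materializing them. The names `classify_media_ref` and `_looks_like_base64` are hypothetical and are not taken from the AstrBot code base; the bare-base64 check in particular is only a heuristic assumed for illustration, not the resolver's actual rule.

```python
import base64
import binascii
from urllib.parse import urlparse


def _looks_like_base64(text: str) -> bool:
    """Heuristic (an assumption): a long, strictly valid base64 string."""
    if len(text) < 64 or len(text) % 4 != 0:
        return False
    try:
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def classify_media_ref(ref: str) -> str:
    """Label a media reference by its form (hypothetical helper)."""
    if ref.startswith("base64://"):
        return "base64_uri"
    if ref.startswith("data:"):
        return "data_uri"
    scheme = urlparse(ref).scheme.lower()
    if scheme in {"http", "https"}:
        return "http"
    if scheme == "file":
        return "file_uri"
    # Legacy payloads carry no prefix at all, so fall back to a heuristic.
    if _looks_like_base64(ref):
        return "bare_base64"
    return "local_path"
```

A heuristic like this can misread an unusual extension-less path as base64, which is one reason to keep the decision in a single shared place rather than repeating it in every provider and platform adapter.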

fix #8676
fix #8543
fix #7588
fix #7580
fix #8030
fix #8034
fix #7461
fix #7565
fix #6509
fix #7144
fix #7795



---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-06-14 10:37:16 +08:00


from pathlib import Path
from types import SimpleNamespace
from unittest.mock import AsyncMock

import pytest

from astrbot.core.provider.sources.whisper_api_source import ProviderOpenAIWhisperAPI


def _make_provider() -> ProviderOpenAIWhisperAPI:
    provider = ProviderOpenAIWhisperAPI(
        provider_config={
            "id": "test-whisper-api",
            "type": "openai_whisper_api",
            "model": "whisper-1",
            "api_key": "test-key",
        },
        provider_settings={},
    )
    # Swap the client for a stub so the test never touches the network.
    provider.client = SimpleNamespace(
        audio=SimpleNamespace(
            transcriptions=SimpleNamespace(
                create=AsyncMock(return_value=SimpleNamespace(text="transcribed text"))
            )
        ),
        close=AsyncMock(),
    )
    return provider


@pytest.mark.asyncio
async def test_get_text_converts_opus_files_to_wav_before_transcription(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
):
    provider = _make_provider()
    opus_path = tmp_path / "voice.opus"
    opus_path.write_bytes(b"fake opus data")
    # Records each (input path, output path) pair passed to the fake converter.
    conversions: list[tuple[str, str]] = []

    async def fake_convert_audio_to_wav(
        audio_path: str, output_path: str | None = None
    ):
        if output_path is None:
            output_path = str(tmp_path / "converted.wav")
        conversions.append((audio_path, output_path))
        Path(output_path).write_bytes(b"fake wav data")
        return output_path

    # Point the temp directory and the converter at test-local stand-ins.
    monkeypatch.setattr(
        "astrbot.core.utils.media_utils.get_astrbot_temp_path",
        lambda: str(tmp_path),
    )
    monkeypatch.setattr(
        "astrbot.core.utils.media_utils.convert_audio_to_wav",
        fake_convert_audio_to_wav,
    )

    try:
        result = await provider.get_text(str(opus_path))

        assert result == "transcribed text"
        assert conversions and conversions[0][0] == str(opus_path)

        # The converted WAV must be gone once get_text() returns.
        converted_path = Path(conversions[0][1])
        assert converted_path.suffix == ".wav"
        assert not converted_path.exists()

        # The API must have received the WAV file, not the original opus file.
        create_mock = provider.client.audio.transcriptions.create
        create_mock.assert_awaited_once()
        file_arg = create_mock.await_args.kwargs["file"]
        assert file_arg[0] == "audio.wav"
        assert file_arg[1].name.endswith(".wav")
        file_arg[1].close()
    finally:
        await provider.terminate()
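
The behavior this test checks (convert to WAV, transcribe, then delete the temporary file) can be sketched on its own as follows. This is only an illustration under stated assumptions: `transcribe_with_cleanup`, `convert`, and `transcribe` are hypothetical names, and the real `get_text` implementation in `whisper_api_source.py` may be structured differently.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable


async def transcribe_with_cleanup(
    audio_path: str,
    convert: Callable[[str], Awaitable[str]],
    transcribe: Callable[[str], Awaitable[str]],
) -> str:
    """Convert non-WAV input, transcribe it, and always remove the temp file."""
    converted: str | None = None
    try:
        if Path(audio_path).suffix.lower() != ".wav":
            converted = await convert(audio_path)
        return await transcribe(converted or audio_path)
    finally:
        # Runs on success and on failure, so the temp WAV never leaks.
        if converted is not None:
            Path(converted).unlink(missing_ok=True)
```

Putting the deletion in `finally` is what lets a test assert `not converted_path.exists()` regardless of whether transcription succeeded. The original input file is never removed.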