Using Gemini to translate .srt subtitle files

Neuroplexus · January 13, 2025, 01:10

Translation results

Translation-above-original mode

[Image: screenshot of the translated subtitles, 453×712, 107 KB]

Source code

from fastapi import FastAPI, HTTPException, File, UploadFile
from fastapi.responses import JSONResponse, StreamingResponse
from dotenv import load_dotenv
import os
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold
from typing import List, AsyncGenerator
import re
import asyncio

load_dotenv()

# Create the FastAPI application instance
app = FastAPI()

# Configure the Gemini API
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-pro')

# Disable the safety filters (use with caution)
SAFETY_SETTINGS = {
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE
}


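# Prompt constant
TRANSLATION_PROMPT = """
You are an assistant for translating .srt subtitle files. Your task is to translate the **current subtitle** below into natural, fluent Chinese that is elegant and idiomatic.
**Do not re-translate the context. Translate only the current subtitle. Do not translate the text that follows it!**

Note that you **must not** use Markdown syntax.

The subtitle content is as follows:
Current subtitle: {current_subtitle}
Previous {context_translated_count} translated subtitle(s): {context_translated_str}
Previous {context_untranslated_count} untranslated subtitle(s): {context_untranslated_str}


Follow these steps:
1. **Literal translation**: translate the **current subtitle** sentence by sentence, making sure the basic meaning is accurate.
2. **Reflect on errors**:
   - Check for anything that reads awkwardly.
   - Analyze translation problems caused by grammar, semantics, and cultural differences.
3. **Free-translation refinement**: adjust the translation based on that reflection so it better matches natural Chinese usage.
4. **Generate XML**: output the finished translation inside a `<result>` tag.
Avoid mechanical translation and keep the language fluent and natural. Use elegant wording without excessive modal particles.
Always wrap the translation in a `<result>` tag and place it at the end of the output. If the output is truncated, continue directly without repeating earlier content.

Begin translating:
"""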


def parse_srt(srt_content: str) -> List[dict]:
    """Parses an SRT file content and returns a list of subtitle dictionaries."""
    # splitlines() also handles Windows (CRLF) line endings
    lines = srt_content.strip().splitlines()
    subtitles = []
    i = 0
    while i < len(lines):
        try:
            index = int(lines[i])
            i += 1
            time_str = lines[i]
            i += 1
            text_lines = []
            # Collect text lines until the blank line that ends the block
            while i < len(lines) and lines[i].strip() != "":
                text_lines.append(lines[i])
                i += 1
            text = "\n".join(text_lines)
            subtitles.append({
                "index": index,
                "time": time_str,
                "text": text
            })
            i += 1  # Skip empty line
        except (ValueError, IndexError):
            # Not the start of a valid subtitle block; skip this line
            i += 1
            continue
    return subtitles


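def get_context(subtitles: List[dict], current_index: int, translated_subtitles: List[dict], context_size: int = 2) -> dict:
    """Gets the context subtitles before the current subtitle."""
    # Note: both translation loops below append one entry per processed subtitle,
    # so every earlier index is already translated and the untranslated list stays empty.
    context_translated = []
    context_untranslated = []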

    for i in range(max(0, current_index - context_size), current_index):
        if translated_subtitles and i < len(translated_subtitles):
            context_translated.append(translated_subtitles[i]['translated'])
        else:
            if i < len(subtitles):
                context_untranslated.append(subtitles[i]['text'])
    return {
        "translated": context_translated,
        "untranslated": context_untranslated
    }


def translate_srt(srt_content: str, display_mode: str = "only_translated") -> str:
    """Translates an SRT file content using Gemini API with context."""
    subtitles = parse_srt(srt_content)
    translated_subtitles = []

    for i, subtitle in enumerate(subtitles):
        context = get_context(subtitles, i, translated_subtitles)
        context_translated_str = "\n".join(context["translated"])
        context_untranslated_str = "\n".join(context["untranslated"])
        
        prompt = TRANSLATION_PROMPT.format(
            current_subtitle=subtitle['text'],
            context_translated_str=context_translated_str,
            context_untranslated_str=context_untranslated_str,
            context_translated_count=len(context["translated"]),
            context_untranslated_count=len(context["untranslated"]),
        )

        try:
            response = model.generate_content(prompt, safety_settings=SAFETY_SETTINGS)
            translated_text = ""
            if response.text:
                # Prefer the text inside <result>; fall back to the raw response
                match = re.search(r'<result>(.*?)</result>', response.text, re.DOTALL)
                translated_text = match.group(1).strip() if match else response.text
            else:
                translated_text = "Translation failed"

        except Exception as e:
            translated_text = f"Translation failed: {str(e)}"

        translated_subtitles.append({
            "index": subtitle['index'],
            "time": subtitle['time'],
            "original": subtitle['text'],
            "translated": translated_text
        })

    output_srt = ""
    for sub in translated_subtitles:
        if display_mode == "only_translated":
            output_srt += f"{sub['index']}\n"
            output_srt += f"{sub['time']}\n"
            output_srt += f"{sub['translated']}\n\n"
        elif display_mode == "original_above_translated":
            output_srt += f"{sub['index']}\n"
            output_srt += f"{sub['time']}\n"
            output_srt += f"{sub['original']}\n"
            output_srt += f"{sub['translated']}\n\n"
        elif display_mode == "translated_above_original":
            output_srt += f"{sub['index']}\n"
            output_srt += f"{sub['time']}\n"
            output_srt += f"{sub['translated']}\n"
            output_srt += f"{sub['original']}\n\n"

    return output_srt


async def translate_srt_stream(srt_content: str, display_mode: str = "only_translated") -> AsyncGenerator[str, None]:
    """Translates an SRT file content using Gemini API with context and streaming."""
    subtitles = parse_srt(srt_content)
    translated_subtitles = []
    
    for i, subtitle in enumerate(subtitles):
        context = get_context(subtitles, i, translated_subtitles)
        context_translated_str = "\n".join(context["translated"])
        context_untranslated_str = "\n".join(context["untranslated"])

        prompt = TRANSLATION_PROMPT.format(
            current_subtitle=subtitle['text'],
            context_translated_str=context_translated_str,
            context_untranslated_str=context_untranslated_str,
            context_translated_count=len(context["translated"]),
            context_untranslated_count=len(context["untranslated"]),
        )
        translated_text = ""
        try:
            response_stream = model.generate_content(prompt, stream=True, safety_settings=SAFETY_SETTINGS)

            # Accumulate the streamed chunks into the full response text
            for chunk in response_stream:
                translated_text += chunk.text
            if translated_text:
                # Prefer the text inside <result>; otherwise keep the raw response
                match = re.search(r'<result>(.*?)</result>', translated_text, re.DOTALL)
                if match:
                    translated_text = match.group(1).strip()
            else:
                translated_text = "Translation failed"
        except Exception as e:
            translated_text = f"Translation failed: {str(e)}"
            

        translated_subtitles.append({
            "index": subtitle['index'],
            "time": subtitle['time'],
            "original": subtitle['text'],
            "translated": translated_text
        })
        
        output_srt = ""
        if display_mode == "only_translated":
            output_srt = f"{subtitle['index']}\n"
            output_srt += f"{subtitle['time']}\n"
            output_srt += f"{translated_text}\n\n"
        elif display_mode == "original_above_translated":
            output_srt = f"{subtitle['index']}\n"
            output_srt += f"{subtitle['time']}\n"
            output_srt += f"{subtitle['text']}\n"
            output_srt += f"{translated_text}\n\n"
        elif display_mode == "translated_above_original":
            output_srt = f"{subtitle['index']}\n"
            output_srt += f"{subtitle['time']}\n"
            output_srt += f"{translated_text}\n"
            output_srt += f"{subtitle['text']}\n\n"

        yield output_srt


@app.post("/translate", response_class=JSONResponse)
async def translate_endpoint(file: UploadFile = File(...), display_mode: str = "only_translated"):
    """API endpoint for translating SRT file."""
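    try:
        contents = await file.read()
        # "utf-8-sig" also accepts files that start with a UTF-8 byte order mark
        srt_content = contents.decode("utf-8-sig")
        # Run the blocking translation in a worker thread so the event loop stays responsive
        translated_srt = await asyncio.to_thread(translate_srt, srt_content, display_mode)
        return JSONResponse(content={"translated_srt": translated_srt})
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))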


@app.post("/translate-stream")
async def translate_stream_endpoint(file: UploadFile = File(...), display_mode: str = "only_translated"):
    """API endpoint for streaming translation of SRT file."""
    try:
        contents = await file.read()
        # "utf-8-sig" also accepts files that start with a UTF-8 byte order mark
        srt_content = contents.decode("utf-8-sig")
        return StreamingResponse(translate_srt_stream(srt_content, display_mode), media_type="text/plain")
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
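
Both translation functions pull the model's answer out of the `<result>` tag with the same regular expression. As a sketch only, that step could be factored into one helper. `extract_result` and `RESULT_PATTERN` are hypothetical names that do not appear in the code above, and taking the last `<result>` block is an assumption based on the prompt asking for the tag at the end of the output.

```python
import re

# Same pattern as in the server code; DOTALL lets the match span several lines.
RESULT_PATTERN = re.compile(r"<result>(.*?)</result>", re.DOTALL)


def extract_result(raw: str) -> str:
    """Return the text of the last <result> block, or the stripped raw text if none is found."""
    matches = RESULT_PATTERN.findall(raw)
    if not matches:
        # Assumption: an untagged response is still better than nothing.
        return raw.strip()
    return matches[-1].strip()


if __name__ == "__main__":
    sample = "Literal: Hello, world\n<result>\n你好，世界\n</result>"
    print(extract_result(sample))
```

Unlike the fallback in the code above, this sketch strips whitespace from an untagged response as well.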

How To?

Required libraries

altair==5.5.0
annotated-types==0.7.0
anyio==4.8.0
attrs==24.3.0
blinker==1.9.0
cachetools==5.5.0
certifi==2024.12.14
charset-normalizer==3.4.1
click==8.1.8
fastapi==0.115.6
gitdb==4.0.12
GitPython==3.1.44
google-ai-generativelanguage==0.6.10
google-api-core==2.24.0
google-api-python-client==2.158.0
google-auth==2.37.0
google-auth-httplib2==0.2.0
google-generativeai==0.8.3
googleapis-common-protos==1.66.0
grpcio==1.69.0
grpcio-status==1.69.0
h11==0.14.0
httplib2==0.22.0
idna==3.10
Jinja2==3.1.5
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
narwhals==1.21.1
numpy==2.2.1
packaging==24.2
pandas==2.2.3
pillow==11.1.0
proto-plus==1.25.0
protobuf==5.29.3
pyarrow==18.1.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pydantic==2.10.5
pydantic_core==2.27.2
pydeck==0.9.1
Pygments==2.19.1
pyparsing==3.2.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.20
pytz==2024.2
referencing==0.35.1
requests==2.32.3
rich==13.9.4
rpds-py==0.22.3
rsa==4.9
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
starlette==0.41.3
streamlit==1.41.1
tenacity==9.0.0
toml==0.10.2
tornado==6.4.2
tqdm==4.67.1
typing_extensions==4.12.2
tzdata==2024.2
uritemplate==4.1.1
urllib3==2.3.0
uvicorn==0.34.0
watchdog==6.0.0

Environment variables

Add a GEMINI_API_KEY variable to .env and set it to your Gemini API key.
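
To make a missing or empty key obvious at startup, a fail-fast check can help. This is a minimal sketch using only `os.environ`; `require_env` is a hypothetical helper that is not part of the server code, and in the server it would be called after `load_dotenv()`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or the environment")
    return value


if __name__ == "__main__":
    # Demonstration with a placeholder value, not a real key.
    os.environ.setdefault("GEMINI_API_KEY", "your-api-key-here")
    api_key = require_env("GEMINI_API_KEY")
    print("key loaded:", bool(api_key))
```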

WebUI

For convenience, you can create a WebUI file (saved as webui.py):

import streamlit as st
import requests

st.title("SRT Subtitle Translation Assistant")

uploaded_file = st.file_uploader("Upload an .srt file", type=["srt"])

translation_mode = st.radio("Translation mode", ["Non-streaming", "Streaming"])
display_mode_options = {
    "Translation only": "only_translated",
    "Original above translation": "original_above_translated",
    "Translation above original": "translated_above_original"
}
display_mode_name = st.radio("Display mode", list(display_mode_options.keys()))
display_mode = display_mode_options[display_mode_name]


if uploaded_file is not None:
    if st.button("Start translation"):
        try:
            files = {"file": uploaded_file.getvalue()}
            params = {"display_mode": display_mode}

            if translation_mode == "Non-streaming":
                response = requests.post("http://localhost:8000/translate", files=files, params=params)
                response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
                translated_srt = response.json()["translated_srt"]
                st.text_area("Translation result", value=translated_srt, height=300)

            elif translation_mode == "Streaming":
                response = requests.post("http://localhost:8000/translate-stream", files=files, params=params, stream=True)
                response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
                # Decode incrementally as UTF-8 so a multi-byte character split
                # across two chunks does not raise UnicodeDecodeError
                response.encoding = "utf-8"

                placeholder = st.empty()
                full_text = ""

                for text_chunk in response.iter_content(chunk_size=8192, decode_unicode=True):
                    if text_chunk:
                        full_text += text_chunk
                        placeholder.text_area("Translation result", value=full_text, height=300)


        except requests.exceptions.RequestException as e:
            st.error(f"Translation failed. Please check that the FastAPI service is running: {e}")
        except Exception as e:
            st.error(f"An error occurred: {e}")

Then run it as follows:

streamlit run webui.py
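
The streaming mode delivers the response in byte chunks, and a chunk boundary can fall in the middle of a multi-byte UTF-8 character such as a Chinese one. The standalone sketch below shows how an incremental decoder from the standard library copes with that; `decode_chunks` is a hypothetical helper for illustration and is not part of the WebUI code.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Decode byte chunks incrementally, buffering incomplete characters between chunks."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; this raises if the stream ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


if __name__ == "__main__":
    data = "1\n00:00:01,000 --> 00:00:02,000\n你好\n\n".encode("utf-8")
    # Cut one byte into the three-byte encoding of the first Chinese character.
    cut = data.index("你".encode("utf-8")) + 1
    chunks = [data[:cut], data[cut:]]
    print("".join(decode_chunks(chunks)) == data.decode("utf-8"))
```

Calling `bytes.decode` on each chunk separately would fail on the first chunk in this example, because it ends with an incomplete character.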

Also See

Feature list

Streaming support
Passes context to the model
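
Passing context means that each prompt carries the previous few subtitles alongside the current one. Here is a minimal sketch of that sliding window, independent of the Gemini API. `previous_window` is a hypothetical helper, and the window size of 2 mirrors the default `context_size` in `get_context`.

```python
from typing import List, Sequence


def previous_window(items: Sequence[str], current_index: int, size: int = 2) -> List[str]:
    """Return up to `size` items that come immediately before `current_index`."""
    start = max(0, current_index - size)
    return list(items[start:current_index])


if __name__ == "__main__":
    translated = ["line one", "line two", "line three"]
    for i in range(len(translated) + 1):
        # The window is empty for the first subtitle and grows up to `size` entries.
        print(i, previous_window(translated, i))
```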

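If this helped you, please click my avatar and endorse me in the Development & Tuning category. Thank you very much!
Or could you leave a like? ~~I really want the Best Newcomer badge, I'll do anything~~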

stevessr · January 13, 2025, 01:11

٩(•̤̀ᵕ•̤́๑)ᵒᵏᵎᵎᵎᵎ

gsnqazwsx · January 13, 2025, 01:11

Another excellent tutorial

Metatron · January 13, 2025, 01:13

What kind of ancient model is gemini-pro?

tonie · January 13, 2025, 01:15

I'm lazy and don't know how to do this. Could someone who does just package it up?

MatsuzakaSato · January 13, 2025, 01:16

Good grief, did you list every single package?

Cimix · January 13, 2025, 01:17

Nice. Gemini's translations are actually pretty good.

Neuroplexus · January 13, 2025, 01:17

I ran pip freeze

handsome · January 13, 2025, 02:07

Nice, thanks

ttio · January 13, 2025, 03:21

Thanks for sharing

DROP_TABLE.user_info · January 13, 2025, 03:36

Thanks for sharing, friend

system · February 12, 2025, 03:37

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
