Private Deployment of Google Gemma Large Models

Private Deployment of Google Gemma Large Models
- The Gemma model ecosystem
- Gemma 3

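The Gemma model ecosystem

Gemma: a family of lightweight models built on Gemini technology
CodeGemma: code generation models
PaliGemma: vision-language models
ShieldGemma: models that evaluate whether a generative AI application violates safety policies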

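Why Gemma

Gemma is Google's family of lightweight models built on Gemini technology
Google takes an open approach and releases the model weights
Google has strong engineering capabilities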

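Gemma

Gemma is a family of lightweight models that Google built on Gemini technology. Gemma 3 models are multimodal (they process text and images), have a 128K-token context window, and support more than 140 languages. The 1B variant is the exception: it is text-only and has a 32K-token context window. Gemma 3 comes in four parameter sizes (1B, 4B, 12B, and 27B) and performs well on tasks such as question answering, summarization, and reasoning. Its compact design allows deployment on resource-constrained devices.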

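CodeGemma

CodeGemma is a collection of capable, lightweight models for a range of coding tasks, such as fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.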

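PaliGemma

PaliGemma 2 and PaliGemma are lightweight open vision-language models (VLMs) inspired by PaLI-3 and built from open components such as the SigLIP vision model and the Gemma language model. PaliGemma takes both images and text as input and can answer detailed, context-aware questions about an image. This lets it analyze images in more depth and provide useful insights, for example captioning images and short videos, detecting objects, and reading text embedded in images.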

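ShieldGemma

ShieldGemma is a set of instruction-tuned models for evaluating the safety of text and images against a defined set of safety policies. You can use it as one component of a larger generative AI application to help evaluate whether the application violates safety policies and to prevent such violations. The ShieldGemma models are released with open weights, so you can fine-tune them for your specific use case.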

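Gemma 3

Multimodal: accepts image and text input
128K-token context window; no native function-calling mechanism, but prompt-based tool calling is supported
Broad language support: more than 140 languages
Developer-friendly range of model sizes: 1B, 4B, 12B, 27B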

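Training dataset

These models were trained on a text dataset drawn from a wide variety of sources. The 27B model was trained on 14 trillion tokens, the 12B model on 12 trillion tokens, the 4B model on 4 trillion tokens, and the 1B model on 2 trillion tokens. The knowledge cutoff of the training data is August 2024.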


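Web documents: a large and varied collection of web text exposes the model to a broad range of linguistic styles, topics, and vocabulary. The training dataset includes content in more than 140 languages.
Code: exposing the model to code helps it learn the syntax and patterns of programming languages, which improves its ability to generate code and understand code-related questions.
Mathematics: training on mathematical text helps the model learn logical reasoning and symbolic representation, and answer mathematical questions.
Images: a wide range of images enables the model to perform image analysis and visual data extraction tasks.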
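
The token budgets stated above can be compared across model sizes by dividing tokens by parameters. The short sketch below does that arithmetic; it assumes the nominal parameter counts implied by the model names (27B, 12B, 4B, 1B), which are rounded figures rather than exact counts.

```python
# Nominal parameter counts (assumed from the model names) and the
# training-token budgets quoted in the text above.
TRAINING = {
    "27B": (27e9, 14e12),
    "12B": (12e9, 12e12),
    "4B": (4e9, 4e12),
    "1B": (1e9, 2e12),
}


def tokens_per_parameter(params, tokens):
    """Return how many training tokens were used per model parameter."""
    return tokens / params


if __name__ == "__main__":
    for name, (params, tokens) in TRAINING.items():
        ratio = tokens_per_parameter(params, tokens)
        print(f"{name}: about {ratio:.0f} tokens per parameter")
```

By this rough measure the smaller models see proportionally more data per parameter than the 27B model.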

Gemma performance

Downloading with Ollama

Ollama command line

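$ ollama run gemma3 "What is in this picture?  /private/tmp/demo.jpg"
Added image '/private/tmp/demo.jpg'
Based on this picture, we can see:

*   **Sig Snape** is pulling Ron Weasley's hair up with his hand.
*   **Ron Weasley** is writing something.
*   Other students are also writing.

This picture is from the film "Harry Potter and the Philosopher's Stone".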

Integration with Dify and Open WebUI

API calls

Serving the model with Ollama
- Supports the Ollama API
- Supports the OpenAI-compatible API
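
For the OpenAI-compatible option, a minimal Python sketch is given here. It assumes Ollama is running on its default port and exposes the OpenAI-style chat endpoint under `/v1/chat/completions`. The helper names `build_chat_request` and `chat` are illustrative only and do not come from any library.

```python
import json
import urllib.request

# Assumption: default local Ollama address with its OpenAI-compatible prefix.
BASE_URL = "http://localhost:11434/v1"


def build_chat_request(user_text, model="gemma3"):
    """Build an OpenAI-style chat completion request (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        data = json.loads(resp.read())
    # OpenAI-style responses carry the text under choices[0].message.content
    return data["choices"][0]["message"]["content"]
```

Calling `chat("roses are red")` requires a running Ollama instance with the `gemma3` model already pulled.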

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "roses are red"
}'

# Add a list of base64-encoded images to use vision input
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "caption this image",
  "images": [...]
}'
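
The same native `/api/generate` call can be made from Python, which makes the base64 step for images explicit. This is a sketch under the same assumptions as the curl calls (local server, default port, `gemma3` model). The function names are illustrative, and `"stream": False` is set so that the server returns a single JSON object instead of a stream.

```python
import base64
import json
import urllib.request

# Assumption: same local endpoint as in the curl examples.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(prompt, image_bytes=None, model="gemma3"):
    """Build the JSON body for /api/generate, optionally attaching one image."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Images are sent as a list of base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(prompt, image_path=None):
    """Send the request to a running Ollama server and return the reply text."""
    image_bytes = None
    if image_path is not None:
        with open(image_path, "rb") as f:
            image_bytes = f.read()
    body = json.dumps(build_generate_payload(prompt, image_bytes)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("caption this image", "/private/tmp/demo.jpg")` would reuse the image path from the command-line example.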

ReAct prompt-based tool calling

Native function calling is not supported, but ReAct-style prompting is.


{
  "model": "gemma3",
  "stream": false,
  "options": {
    "stop": ["Observation"]
  },
  "messages": [
    {
      "role": "system",
      "content": "Respond to the human as helpfully and accurately as possible. \n\n\n\nYou have access to the following tools:\n\n[{\"name\": \"current_time\", \"description\": \"A tool for getting the current time.\", \"parameters\": {\"type\": \"object\", \"properties\": {}, \"required\": []}}, {\"name\": \"localtime_to_timestamp\", \"description\": \"A tool for localtime convert to timestamp\", \"parameters\": {\"type\": \"object\", \"properties\": {\"localtime\": {\"type\": \"string\", \"description\": \"\"}, \"timezone\": {\"type\": \"string\", \"description\": \"\"}}, \"required\": [\"localtime\"]}}, {\"name\": \"timestamp_to_localtime\", \"description\": \"A tool for timestamp convert to localtime\", \"parameters\": {\"type\": \"object\", \"properties\": {\"timestamp\": {\"type\": \"number\", \"description\": \"\"}, \"timezone\": {\"type\": \"string\", \"description\": \"\"}}, \"required\": [\"timestamp\"]}}, {\"name\": \"timezone_conversion\", \"description\": \"A tool to convert time to equivalent time zone\", \"parameters\": {\"type\": \"object\", \"properties\": {\"current_time\": {\"type\": \"string\", \"description\": \"\"}, \"current_timezone\": {\"type\": \"string\", \"description\": \"\"}, \"target_timezone\": {\"type\": \"string\", \"description\": \"\"}}, \"required\": [\"current_time\", \"current_timezone\", \"target_timezone\"]}}, {\"name\": \"weekday\", \"description\": \"A tool for calculating the weekday of a given date by year, month and day.\", \"parameters\": {\"type\": \"object\", \"properties\": {\"year\": {\"type\": \"number\", \"description\": \"\"}, \"month\": {\"type\": \"number\", \"description\": \"\"}, \"day\": {\"type\": \"number\", \"description\": \"\"}}, \"required\": [\"year\", \"month\", \"day\"]}}]\n\nUse a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).\nValid \"action\" values: \"Final Answer\" or current_time, localtime_to_timestamp, timestamp_to_localtime, timezone_conversion, weekday\n\nProvide only ONE action per $JSON_BLOB, as shown:\n\n```\n{\n  \"action\": $TOOL_NAME,\n  \"action_input\": $ACTION_INPUT\n}\n```\n\nFollow this format:\n\nQuestion: input question to answer\nThought: consider previous and subsequent steps\nAction:\n```\n$JSON_BLOB\n```\nObservation: action result\n... (repeat Thought/Action/Observation N times)\nThought: I know what to respond\nAction:\n```\n{\n  \"action\": \"Final Answer\",\n  \"action_input\": \"Final response to human\"\n}\n```\n\nBegin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:.\n"
    },
    {
      "role": "user",
      "content": "What is the current time?"
    },
    {
      "role": "assistant",
      "content": "Thought: I need to get the current time to answer the question.\n\nAction: {\"action_name\": \"current_time\", \"action_input\": {}}\n\nObservation: 2025-05-05 05:40:54\n\n"
    },
    {
      "role": "user",
      "content": "continue"
    }
  ]
}
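
In the request above, the `stop` option halts generation at `Observation`, so the model stops after naming an action and the caller is expected to run the tool and feed the result back. One way the calling side could handle a single step is sketched here. The `TOOLS` registry and the function names are hypothetical, and only `current_time` is implemented. The parser accepts both the `action` key requested by the system prompt and the `action_name` key that appears in the recorded assistant turn.

```python
import json
from datetime import datetime

# Hypothetical local tool registry; only current_time is implemented here.
TOOLS = {
    "current_time": lambda **kwargs: datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
}


def parse_action(text):
    """Return (tool_name, tool_input) from the first JSON blob after 'Action:'."""
    marker = text.find("Action:")
    if marker == -1:
        return None
    start = text.find("{", marker)
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores any trailing text
        blob, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return None
    name = blob.get("action", blob.get("action_name"))
    return name, blob.get("action_input", {})


def run_step(model_reply):
    """Return ('final', answer) or ('observation', text to send back)."""
    parsed = parse_action(model_reply)
    if parsed is None:
        # No parsable action: treat the reply itself as the answer.
        return "final", model_reply.strip()
    name, action_input = parsed
    if name == "Final Answer":
        return "final", action_input
    tool = TOOLS.get(name)
    if tool is None:
        return "observation", f"Observation: unknown tool {name}"
    args = action_input if isinstance(action_input, dict) else {}
    return "observation", f"Observation: {tool(**args)}"
```

When `run_step` returns an observation, the caller would append it to the conversation (as the assistant turn above does) and send the messages again, repeating until a final answer arrives.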

Tool call example
