Lyria 3 是 Google 的音乐生成模型系列,可通过 Gemini API 使用。借助 Lyria 3,您可以根据文本提示或图片生成高质量的 44.1 kHz 立体声音频。这些模型可提供结构连贯的音频,包括人声、同步歌词和完整的乐器编排。
Lyria 3 系列包含两个模型:
| 模型 | 模型 ID | 适用场景 | 时长 | 输出 |
|---|---|---|---|---|
| Lyria 3 Clip | lyria-3-clip-preview |
短视频片段、循环、预览 | 30 秒 | MP3 |
| Lyria 3 Pro | lyria-3-pro-preview |
包含主歌、副歌和桥段的完整歌曲 | 几分钟(可通过提示控制) | MP3 |
这两个模型都可以使用标准 generateContent 方法和新的
Interactions API,支持多模态
输入(文本和图片),并生成 44.1 kHz 高保真立体声 音频。
生成音乐片段
Lyria 3 Clip 模型始终生成 30 秒 的片段。如需生成片段,请使用文本提示调用 generateContent 方法。响应始终包含生成的歌词和歌曲结构以及音频。
Python
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="lyria-3-clip-preview",
contents="Create a 30-second cheerful acoustic folk song with "
"guitar and harmonica.",
)
# Parse the response
for part in response.parts:
if part.text is not None:
print(part.text)
elif part.inline_data is not None:
with open("clip.mp3", "wb") as f:
f.write(part.inline_data.data)
print("Audio saved to clip.mp3")
JavaScript
import { GoogleGenAI } from "@google/genai";
import * as fs from "node:fs";
const ai = new GoogleGenAI({});
async function main() {
const response = await ai.models.generateContent({
model: "lyria-3-clip-preview",
contents: "Create a 30-second cheerful acoustic folk song with " +
"guitar and harmonica.",
});
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log(part.text);
} else if (part.inlineData) {
const buffer = Buffer.from(part.inlineData.data, "base64");
fs.writeFileSync("clip.mp3", buffer);
console.log("Audio saved to clip.mp3");
}
}
}
main();
Go
package main
import (
"context"
"fmt"
"log"
"os"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-clip-preview",
genai.Text("Create a 30-second cheerful acoustic folk song " +
"with guitar and harmonica."),
nil,
)
if err != nil {
log.Fatal(err)
}
for _, part := range result.Candidates[0].Content.Parts {
if part.Text != "" {
fmt.Println(part.Text)
} else if part.InlineData != nil {
err := os.WriteFile("clip.mp3", part.InlineData.Data, 0644)
if err != nil {
log.Fatal(err)
}
fmt.Println("Audio saved to clip.mp3")
}
}
}
Java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class GenerateMusicClip {
public static void main(String[] args) throws IOException {
try (Client client = new Client()) {
GenerateContentResponse response = client.models.generateContent(
"lyria-3-clip-preview",
"Create a 30-second cheerful acoustic folk song with "
+ "guitar and harmonica.");
for (Part part : response.parts()) {
if (part.text().isPresent()) {
System.out.println(part.text().get());
} else if (part.inlineData().isPresent()) {
var blob = part.inlineData().get();
if (blob.data().isPresent()) {
Files.write(Paths.get("clip.mp3"), blob.data().get());
System.out.println("Audio saved to clip.mp3");
}
}
}
}
}
}
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "Create a 30-second cheerful acoustic folk song with guitar and harmonica."}
]
}]
}'
C#
using System.Threading.Tasks;
using Google.GenAI;
using Google.GenAI.Types;
using System.IO;
public class GenerateMusicClip {
public static async Task main() {
var client = new Client();
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-clip-preview",
contents: "Create a 30-second cheerful acoustic folk song with guitar and harmonica."
);
foreach (var part in response.Candidates[0].Content.Parts) {
if (part.Text != null) {
Console.WriteLine(part.Text);
} else if (part.InlineData != null) {
await File.WriteAllBytesAsync("clip.mp3", part.InlineData.Data);
Console.WriteLine("Audio saved to clip.mp3");
}
}
}
}
生成完整歌曲
使用 lyria-3-pro-preview 模型生成时长几分钟的完整歌曲。Pro 模型可以理解音乐结构,并能创作出包含层次分明的主歌、副歌和桥段的乐曲。您可以通过在提示中指定时长(例如“创作一首 2 分钟的歌曲”)或通过使用 时间戳来定义结构,从而影响
时长。
Python
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents="An epic cinematic orchestral piece about a journey home. "
"Starts with a solo piano intro, builds through sweeping "
"strings, and climaxes with a massive wall of sound.",
)
JavaScript
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: "An epic cinematic orchestral piece about a journey home. " +
"Starts with a solo piano intro, builds through sweeping " +
"strings, and climaxes with a massive wall of sound.",
});
Go
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
genai.Text("An epic cinematic orchestral piece about a journey " +
"home. Starts with a solo piano intro, builds through " +
"sweeping strings, and climaxes with a massive wall of sound."),
nil,
)
Java
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
"An epic cinematic orchestral piece about a journey home. "
+ "Starts with a solo piano intro, builds through sweeping "
+ "strings, and climaxes with a massive wall of sound.");
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "An epic cinematic orchestral piece about a journey home. Starts with a solo piano intro, builds through sweeping strings, and climaxes with a massive wall of sound."}
]
}]
}'
C#
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: "An epic cinematic orchestral piece about a journey home. " +
"Starts with a solo piano intro, builds through sweeping " +
"strings, and climaxes with a massive wall of sound."
);
选择输出格式
默认情况下,Lyria 3 模型以 MP3 格式生成音频。对于 Lyria 3 Pro,您还可以通过在 generationConfig 中设置 response_mime_type 来请求以 WAV 格式输出。
Python
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents="An atmospheric ambient track.",
config=types.GenerateContentConfig(
response_modalities=["AUDIO", "TEXT"],
response_mime_type="audio/wav",
),
)
JavaScript
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: "An atmospheric ambient track.",
config: {
responseModalities: ["AUDIO", "TEXT"],
responseMimeType: "audio/wav",
},
});
Go
config := &genai.GenerateContentConfig{
ResponseModalities: []string{"AUDIO", "TEXT"},
ResponseMIMEType: "audio/wav",
}
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
genai.Text("An atmospheric ambient track."),
config,
)
Java
GenerateContentConfig config = GenerateContentConfig.builder()
.responseModalities("AUDIO", "TEXT")
.responseMimeType("audio/wav")
.build();
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
"An atmospheric ambient track.",
config);
C#
var config = new GenerateContentConfig {
ResponseModalities = { "AUDIO", "TEXT" },
ResponseMimeType = "audio/wav"
};
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: "An atmospheric ambient track.",
config: config
);
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "An atmospheric ambient track."}
]
}],
"generationConfig": {
"responseModalities": ["AUDIO", "TEXT"],
"responseMimeType": "audio/wav"
}
}'
解析响应
Lyria 3 的响应包含多个部分。文本部分包含生成的歌词或歌曲结构的 JSON 描述。包含 inline_data 的部分包含音频字节。
Python
lyrics = []
audio_data = None
for part in response.parts:
if part.text is not None:
lyrics.append(part.text)
elif part.inline_data is not None:
audio_data = part.inline_data.data
if lyrics:
print("Lyrics:\n" + "\n".join(lyrics))
if audio_data:
with open("output.mp3", "wb") as f:
f.write(audio_data)
JavaScript
const lyrics = [];
let audioData = null;
for (const part of response.candidates[0].content.parts) {
if (part.text) {
lyrics.push(part.text);
} else if (part.inlineData) {
audioData = Buffer.from(part.inlineData.data, "base64");
}
}
if (lyrics.length) {
console.log("Lyrics:\n" + lyrics.join("\n"));
}
if (audioData) {
fs.writeFileSync("output.mp3", audioData);
}
Go
var lyrics []string
var audioData []byte
for _, part := range result.Candidates[0].Content.Parts {
if part.Text != "" {
lyrics = append(lyrics, part.Text)
} else if part.InlineData != nil {
audioData = part.InlineData.Data
}
}
if len(lyrics) > 0 {
fmt.Println("Lyrics:\n" + strings.Join(lyrics, "\n"))
}
if audioData != nil {
err := os.WriteFile("output.mp3", audioData, 0644)
if err != nil {
log.Fatal(err)
}
}
Java
List<String> lyrics = new ArrayList<>();
byte[] audioData = null;
for (Part part : response.parts()) {
if (part.text().isPresent()) {
lyrics.add(part.text().get());
} else if (part.inlineData().isPresent()) {
audioData = part.inlineData().get().data().get();
}
}
if (!lyrics.isEmpty()) {
System.out.println("Lyrics:\n" + String.join("\n", lyrics));
}
if (audioData != null) {
Files.write(Paths.get("output.mp3"), audioData);
}
C#
var lyrics = new List<string>();
byte[] audioData = null;
foreach (var part in response.Candidates[0].Content.Parts) {
if (part.Text != null) {
lyrics.Add(part.Text);
} else if (part.InlineData != null) {
audioData = part.InlineData.Data;
}
}
if (lyrics.Count > 0) {
Console.WriteLine("Lyrics:\n" + string.Join("\n", lyrics));
}
if (audioData != null) {
await File.WriteAllBytesAsync("output.mp3", audioData);
}
REST
# The output from the REST API is a JSON object containing base64 encoded data.
# You can extract the text or the audio data using a tool like jq.
# To extract the audio and save it to a file:
curl ... | jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' | base64 -d > output.mp3
根据图片生成音乐
Lyria 3 支持多模态输入,您可以提供最多 10 张图片 以及文本提示,模型将根据视觉内容创作音乐。
Python
from PIL import Image
image = Image.open("desert_sunset.jpg")
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents=[
"An atmospheric ambient track inspired by the mood and "
"colors in this image.",
image,
],
)
JavaScript
const imageData = fs.readFileSync("desert_sunset.jpg");
const base64Image = imageData.toString("base64");
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: [
{ text: "An atmospheric ambient track inspired by the mood " +
"and colors in this image." },
{
inlineData: {
mimeType: "image/jpeg",
data: base64Image,
},
},
],
});
Go
imgData, err := os.ReadFile("desert_sunset.jpg")
if err != nil {
log.Fatal(err)
}
parts := []*genai.Part{
genai.NewPartFromText("An atmospheric ambient track inspired " +
"by the mood and colors in this image."),
&genai.Part{
InlineData: &genai.Blob{
MIMEType: "image/jpeg",
Data: imgData,
},
},
}
contents := []*genai.Content{
genai.NewContentFromParts(parts, genai.RoleUser),
}
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
contents,
nil,
)
Java
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
Content.fromParts(
Part.fromText("An atmospheric ambient track inspired by "
+ "the mood and colors in this image."),
Part.fromBytes(
Files.readAllBytes(Path.of("desert_sunset.jpg")),
"image/jpeg")));
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d "{
\"contents\": [{
\"parts\":[
{\"text\": \"An atmospheric ambient track inspired by the mood and colors in this image.\"},
{
\"inline_data\": {
\"mime_type\":\"image/jpeg\",
\"data\": \"<BASE64_IMAGE_DATA>\"
}
}
]
}]
}"
C#
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: new List<Part> {
Part.FromText("An atmospheric ambient track inspired by the mood and colors in this image."),
Part.FromBytes(await File.ReadAllBytesAsync("desert_sunset.jpg"), "image/jpeg")
}
);

提供自定义歌词
您可以自行编写歌词并将其包含在提示中。使用 [Verse]、[Chorus] 和 [Bridge] 等部分标记来帮助模型理解歌曲结构:
Python
prompt = """
Create a dreamy indie pop song with the following lyrics:
[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.
[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.
[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
"""
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents=prompt,
)
JavaScript
const prompt = `
Create a dreamy indie pop song with the following lyrics:
[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.
[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.
[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`;
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: prompt,
});
Go
prompt := `
Create a dreamy indie pop song with the following lyrics:
[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.
[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.
[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
genai.Text(prompt),
nil,
)
Java
String prompt = """
Create a dreamy indie pop song with the following lyrics:
[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.
[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.
[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
""";
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
prompt);
C#
var prompt = @"
Create a dreamy indie pop song with the following lyrics:
[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.
[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.
[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
";
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: prompt
);
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "Create a dreamy indie pop song with the following lyrics: ..."}
]
}]
}'
控制时间和结构
您可以使用时间戳准确指定歌曲中特定时刻发生的情况。这对于控制乐器何时进入、歌词何时出现以及歌曲如何进行非常有用:
Python
prompt = """
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
"""
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents=prompt,
)
JavaScript
const prompt = `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`;
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: prompt,
});
Go
prompt := `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
genai.Text(prompt),
nil,
)
Java
String prompt = """
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
""";
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
prompt);
C#
var prompt = @"
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
";
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: prompt
);
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "[0:00 - 0:10] Intro: ..."}
]
}]
}'
生成纯乐器曲目
对于背景音乐、游戏配乐或任何不需要人声的用例,您可以提示模型生成纯乐器曲目:
Python
response = client.models.generate_content(
model="lyria-3-clip-preview",
contents="A bright chiptune melody in C Major, retro 8-bit "
"video game style. Instrumental only, no vocals.",
)
JavaScript
const response = await ai.models.generateContent({
model: "lyria-3-clip-preview",
contents: "A bright chiptune melody in C Major, retro 8-bit " +
"video game style. Instrumental only, no vocals.",
});
Go
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-clip-preview",
genai.Text("A bright chiptune melody in C Major, retro 8-bit " +
"video game style. Instrumental only, no vocals."),
nil,
)
Java
GenerateContentResponse response = client.models.generateContent(
"lyria-3-clip-preview",
"A bright chiptune melody in C Major, retro 8-bit "
+ "video game style. Instrumental only, no vocals.");
C#
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-clip-preview",
contents: "A bright chiptune melody in C Major, retro 8-bit " +
"video game style. Instrumental only, no vocals."
);
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "A bright chiptune melody in C Major, retro 8-bit video game style. Instrumental only, no vocals."}
]
}]
}'
生成不同语言的音乐
Lyria 3 会以提示的语言生成歌词。如需生成带有法语歌词的歌曲,请使用法语编写提示。模型会调整其人声风格和发音,以匹配语言。
Python
response = client.models.generate_content(
model="lyria-3-pro-preview",
contents="Crée une chanson pop romantique en français sur un "
"coucher de soleil à Paris. Utilise du piano et de "
"la guitare acoustique.",
)
JavaScript
const response = await ai.models.generateContent({
model: "lyria-3-pro-preview",
contents: "Crée une chanson pop romantique en français sur un " +
"coucher de soleil à Paris. Utilise du piano et de " +
"la guitare acoustique.",
});
Go
result, err := client.Models.GenerateContent(
ctx,
"lyria-3-pro-preview",
genai.Text("Crée une chanson pop romantique en français sur un " +
"coucher de soleil à Paris. Utilise du piano et de " +
"la guitare acoustique."),
nil,
)
Java
GenerateContentResponse response = client.models.generateContent(
"lyria-3-pro-preview",
"Crée une chanson pop romantique en français sur un "
+ "coucher de soleil à Paris. Utilise du piano et de "
+ "la guitare acoustique.");
C#
var response = await client.Models.GenerateContentAsync(
model: "lyria-3-pro-preview",
contents: "Crée une chanson pop romantique en français sur un " +
"coucher de soleil à Paris. Utilise du piano et de " +
"la guitare acoustique."
);
REST
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "Crée une chanson pop romantique en français sur un coucher de soleil à Paris. Utilise du piano et de la guitare acoustique."}
]
}]
}'
模型智能
Lyria 3 会分析您的提示过程,模型会根据您的提示推断音乐结构(前奏、主歌、副歌、桥段等)。这会在生成音频之前发生,并确保结构连贯性和音乐性。
Interactions API
您可以将 Lyria 3 模型与 Interactions API 搭配使用; 这是一个与 Gemini 模型和智能体交互的统一接口。它简化了复杂多模态用例的状态管理和长时间运行的任务。
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
model="lyria-3-pro-preview",
input="A melancholic jazz fusion track in D minor, " +
"featuring a smooth saxophone melody, walking bass line, " +
"and complex drum rhythms.",
)
for output in interaction.outputs:
if output.text:
print(output.text)
elif output.inline_data:
with open("interaction_output.mp3", "wb") as f:
f.write(output.inline_data.data)
print("Audio saved to interaction_output.mp3")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'lyria-3-pro-preview',
input: 'A melancholic jazz fusion track in D minor, ' +
'featuring a smooth saxophone melody, walking bass line, ' +
'and complex drum rhythms.',
});
for (const output of interaction.outputs) {
if (output.text) {
console.log(output.text);
} else if (output.inlineData) {
const buffer = Buffer.from(output.inlineData.data, 'base64');
fs.writeFileSync('interaction_output.mp3', buffer);
console.log('Audio saved to interaction_output.mp3');
}
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "lyria-3-pro-preview",
"input": "A melancholic jazz fusion track in D minor, featuring a smooth saxophone melody, walking bass line, and complex drum rhythms."
}'
提示指南
您的提示可以很简单,例如“一首关于可爱猫咪躲避水坑的民谣,女声和雨声”,也可以详细且结构化,例如:
一首 20 世纪 80 年代风格的合成器流行乐曲目,节奏强劲、合成器闪耀,副歌朗朗上口、气势恢宏。这首歌应具有复古未来主义风格,让人想起 20 世纪 80 年代的经典流行歌曲,并具有现代制作的润色。节奏应欢快且适合跳舞,约为 120 BPM,具有清晰的主歌-副歌结构和令人难忘的乐器旋律。歌词讲述的是为派对做准备时的心情。
简单和复杂的提示都可以为您提供良好的输出。我们建议您尝试这些提示,找到最适合自己的提示。
类型
在提示中首先说明您想要的音乐流派,例如嘻哈、摇滚和说唱。您可以指定多种流派的组合:
- 金属乐和说唱的融合
- 死亡金属和歌剧的组合
- 带有电子嗡鸣元素的古典乐曲
- 现代电子舞曲 (EDM) 与欧洲流行乐的混合
您还可以加入时代元素:
- 20 世纪 90 年代初的嘻哈
- 20 世纪 60 年代的法国耶耶流行乐
- 20 世纪 80 年代的电子乐实验
- 2000 年代的主流流行乐
如果您提示使用定制流派或区域变体,例如“柏林 techno”或“湾区 hyphy”,模型将尝试捕捉该本质,但可能并不总是能正确捕捉。
乐器
默认情况下,Lyria 3 将使用您期望的流派的乐器和工具来制作歌曲。您无需进行规定。
不过,除非您要求,否则舞曲不会包含萨克斯管。因此,如果您想要萨克斯管独奏,则需要提示:
一首舞曲,节奏强劲、合成器闪耀,副歌朗朗上口、气势恢宏。萨克斯管独奏应在桥段中出现。
您的提示可以包含特定乐器、乐器的声音以及乐器之间的互动方式。您可以使用这种组合来营造特定的氛围或质感:
- 浑浊、失真的低音与干净、清脆的踩镲对抗
- 温暖的模拟合成器音垫在干燥、亲密的吉他声下膨胀
- 由多层模糊吉他声组成的音墙,其中埋藏着遥远的人声
歌曲结构
您可以在提示中概述歌曲的进展。使用箭头或列表定义流程:
[Intro]->[Verse 1]->[Chorus]->[Verse 2]->[Chorus]->[Bridge]->[Outro]- 从安静的钢琴前奏开始,逐渐过渡到响亮的主歌,然后突然静音,最后爆发到副歌。
您还可以指定这些部分之间的能量水平变化方式:
- 在副歌前营造紧张感,然后在副歌前静音
- 整首歌逐渐渐强,一次添加一种乐器,直到形成混乱的音墙
- 桥段后突然停止,然后是无伴奏合唱的副歌
您还可以提示您希望发生特定事件的确切时间:
- 在 12 秒时达到高潮
- 每 2 秒有人说“什么”
- 副歌在 22 秒时开始
歌词
默认情况下,系统会生成人声和歌词。您可以提供自己的歌词,要求不提供歌词(或纯乐器),或者引导歌词生成朝着您想要的方向发展。
您的歌词将采用您编写提示时使用的语言。您还可以要求歌词采用其他语言,例如“用法语写歌词”。
使用自己的歌词
如需向模型提供自己的歌词,请在提示中添加歌词,并使用“歌词:”前缀:
Lyrics:
[Intro]
Oooh, oooh
[Verse 1]
Let's go
Let's go
Go with the flow
[Chorus]
...
您可以使用 [Intro]、[Verse 1]、[Pre-chorus]、[Chorus] 和 [Outro] 等部分标题为歌曲的各个部分添加前缀。
如果您希望重复某个字词或一行,例如回声或伴唱,您可以将其放在括号中:“Let's go (go)”。
提示模型编写歌词
如果您希望 Lyria 3 为您编写歌词,最好在提示中添加有关歌词内容的详细信息。否则,模型需要从您的音乐提示中推断主题,而这可能不是您想要的主题。
歌词讲述的是失去的爱和心碎的痛苦。歌手在回忆过去的一段感情以及涌现的回忆。
如果您想要重复的副歌,最好在提示中要求:
歌词讲述的是失去的爱和心碎的痛苦。歌手在回忆过去的一段感情以及涌现的回忆。一个强劲的副歌专注于克服痛苦并继续前进。
Lyria 3 会自动将歌词的结构引导到您请求的音乐类型,但您也可以在提示中再次强调这一点。例如:
一首 EDM 曲目,一遍又一遍地重复着同样的充满活力的短语。
您还可以提示使用并非严格意义上的歌词的人声效果,例如:
- 整首歌中重复播放电影中的一个样本,内容为“I can't believe this!”
- 一首充满活力的 techno 曲目,在低音部分之前,所有声音都停止了,一个小声音说“I don't know what I'm doing here”,然后音乐低音部分开始。
- 曲目以一段关于 20 世纪 90 年代电影比现在更好的对话开始。然后,曲目过渡到一首流行歌曲。
人声
您可以提示歌词的呈现方式。为了获得最佳效果,请指定详细的歌手个人资料,包括性别、音色和音域。
- 女高音:音色清澈、晶莹剔透,具有敏捷、高亢的 特点。能够发出空灵、轻柔的哨音高音。
- 女中音:低音浑厚、温暖、沙哑。音色烟熏,带有一丝声带颤音,充满灵魂和共鸣。
- 男高音:明亮、尖锐、充满活力。音色年轻,略带鼻音,具有很强的穿透力。
- 男中音:深沉、巧克力色、丝绒般顺滑。胸腔共鸣,声音舒缓、低吟。
- 饱经风霜的摇滚歌手(男):音色沙哑、粗犷,带有沙砾感, 让人想起 20 世纪 90 年代的垃圾摇滚。高音部分紧张,充满情感强度。
其他提示参数
您还可以添加以下参数来进一步优化提示:
- 调/音阶:指定音乐调(例如“G 大调”“D 小调”)。
- 曲调和氛围:使用描述性形容词(例如“怀旧”, “激进”“空灵”“梦幻”)。
- 时长:Clip 模型始终生成 30 秒的片段。对于 Pro 模型,请在提示中指定所需的时长(例如“创作一首 2 分钟的歌曲”)或使用时间戳来控制时长。
示例提示
以下是一些有效提示的示例:
"A 30-second lofi hip hop beat with dusty vinyl crackle, mellow Rhodes piano chords, a slow boom-bap drum pattern at 85 BPM, and a jazzy upright bass line. Instrumental only.""An upbeat, feel-good pop song in G major at 120 BPM with bright acoustic guitar strumming, claps, and warm vocal harmonies about a summer road trip.""A dark, atmospheric trap beat at 140 BPM with heavy 808 bass, eerie synth pads, sharp hi-hats, and a haunting vocal sample. In D minor."
最佳实践
- 先使用 Clip 进行迭代。使用速度更快的
lyria-3-clip-preview模型来尝试提示,然后再使用lyria-3-pro-preview生成完整歌曲。 - 内容要具体。模糊的提示会产生一般性的结果。提及乐器、BPM、调、曲调和结构,以获得最佳输出。
- 使用部分标记。
[Verse]、[Chorus]、[Bridge]标记为模型提供了清晰的结构。 - 将歌词与说明分开。提供自定义歌词时,请将歌词与音乐方向说明清晰分开。
限制
- 安全性:所有提示都经过安全过滤器的检查。触发过滤器的提示将被屏蔽。这包括请求特定歌手的声音或生成受版权保护的歌词的提示。
- 水印:所有生成的音频都包含 SynthID 音频水印,用于 标识。此水印人耳无法察觉,不会影响听觉体验。
- 多轮编辑:音乐生成是一个单轮过程。 Lyria 3 的当前版本不支持通过多个提示迭代编辑或优化生成的片段。
- 时长:Clip 模型始终生成 30 秒的片段。Pro 模型生成的歌曲时长为几分钟;确切时长可以通过提示来影响。
- 确定性:即使使用相同的提示,不同调用之间的结果也可能有所不同。