Try Lyria 3, our new music generation model that produces high-fidelity stereo audio from text and image inputs.

Generate music with Lyria 3

Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality 48kHz stereo audio from text prompts or from images. The models produce structurally coherent music, including vocals, timed lyrics, and full instrumental arrangements.

The Lyria 3 family includes two models:

Model	Model ID	Best for	Duration	Output
Lyria 3 Clip	`lyria-3-clip-preview`	Short clips, loops, previews	30 seconds	MP3
Lyria 3 Pro	`lyria-3-pro-preview`	Full-length songs with verses, choruses, and bridges	A few minutes (controllable through the prompt)	MP3, WAV

You can use both models through the standard generateContent method and through the new Interactions API. Both models accept multimodal input (text and images) and produce high-fidelity 48kHz stereo audio.
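
In application code, the model table could be captured as a small lookup so that the model choice is made in one place. This is only a sketch: `LyriaModel`, `MODELS`, and `pick_model` are hypothetical names, and the values are copied from the table rather than read from the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LyriaModel:
    model_id: str
    duration: str
    outputs: tuple


# Values copied from the model table; they are not fetched from the API.
MODELS = {
    "clip": LyriaModel("lyria-3-clip-preview", "30 seconds", ("MP3",)),
    "pro": LyriaModel("lyria-3-pro-preview", "A few minutes", ("MP3", "WAV")),
}


def pick_model(full_song: bool) -> LyriaModel:
    """Return the Pro model for full-length songs, otherwise the Clip model."""
    return MODELS["pro" if full_song else "clip"]


print(pick_model(full_song=False).model_id)
print(pick_model(full_song=True).model_id)
```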

Generate music clips

The Lyria 3 Clip model always generates a 30-second clip. To generate a clip, call the generateContent method and set response_modalities to ["AUDIO", "TEXT"]. Including TEXT lets you receive the generated lyrics or song structure alongside the audio.

Python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="lyria-3-clip-preview",
    contents="Create a 30-second cheerful acoustic folk song with "
             "guitar and harmonica.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

# Parse the response
for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        with open("clip.mp3", "wb") as f:
            f.write(part.inline_data.data)
        print("Audio saved to clip.mp3")

JavaScript

import { GoogleGenAI } from "@google/genai";
import * as fs from "node:fs";

const ai = new GoogleGenAI({});

async function main() {
  const response = await ai.models.generateContent({
    model: "lyria-3-clip-preview",
    contents: "Create a 30-second cheerful acoustic folk song with " +
              "guitar and harmonica.",
    config: {
      responseModalities: ["AUDIO", "TEXT"],
    },
  });

  for (const part of response.candidates[0].content.parts) {
    if (part.text) {
      console.log(part.text);
    } else if (part.inlineData) {
      const buffer = Buffer.from(part.inlineData.data, "base64");
      fs.writeFileSync("clip.mp3", buffer);
      console.log("Audio saved to clip.mp3");
    }
  }
}

main();

Go

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "google.golang.org/genai"
)

func main() {
    ctx := context.Background()
    client, err := genai.NewClient(ctx, nil)
    if err != nil {
        log.Fatal(err)
    }

    config := &genai.GenerateContentConfig{
        ResponseModalities: []string{"AUDIO", "TEXT"},
    }

    result, err := client.Models.GenerateContent(
        ctx,
        "lyria-3-clip-preview",
        genai.Text("Create a 30-second cheerful acoustic folk song " +
                   "with guitar and harmonica."),
        config,
    )
    if err != nil {
        log.Fatal(err)
    }

    for _, part := range result.Candidates[0].Content.Parts {
        if part.Text != "" {
            fmt.Println(part.Text)
        } else if part.InlineData != nil {
            err := os.WriteFile("clip.mp3", part.InlineData.Data, 0644)
            if err != nil {
                log.Fatal(err)
            }
            fmt.Println("Audio saved to clip.mp3")
        }
    }
}

Java

import com.google.genai.Client;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class GenerateMusicClip {
  public static void main(String[] args) throws IOException {

    try (Client client = new Client()) {
      GenerateContentConfig config = GenerateContentConfig.builder()
          .responseModalities("AUDIO", "TEXT")
          .build();

      GenerateContentResponse response = client.models.generateContent(
          "lyria-3-clip-preview",
          "Create a 30-second cheerful acoustic folk song with "
              + "guitar and harmonica.",
          config);

      for (Part part : response.parts()) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().isPresent()) {
          var blob = part.inlineData().get();
          if (blob.data().isPresent()) {
            Files.write(Paths.get("clip.mp3"), blob.data().get());
            System.out.println("Audio saved to clip.mp3");
          }
        }
      }
    }
  }
}

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a 30-second cheerful acoustic folk song with guitar and harmonica."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'

C#

using System;
using System.IO;
using System.Threading.Tasks;
using Google.GenAI;
using Google.GenAI.Types;

public class GenerateMusicClip {
  public static async Task Main() {
    var client = new Client();
    var config = new GenerateContentConfig {
      ResponseModalities = { "AUDIO", "TEXT" }
    };

    var response = await client.Models.GenerateContentAsync(
      model: "lyria-3-clip-preview",
      contents: "Create a 30-second cheerful acoustic folk song with guitar and harmonica.",
      config: config
    );

    foreach (var part in response.Candidates[0].Content.Parts) {
      if (part.Text != null) {
        Console.WriteLine(part.Text);
      } else if (part.InlineData != null) {
        await File.WriteAllBytesAsync("clip.mp3", part.InlineData.Data);
        Console.WriteLine("Audio saved to clip.mp3");
      }
    }
  }
}
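
Where the SDK is not available, the same REST call can be prepared with Python's standard library. The following is a minimal sketch: `build_generate_request` is a hypothetical helper name, the endpoint, header, and body shape are taken from the curl example in this section, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_generate_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a Lyria 3 model."""
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["AUDIO", "TEXT"]},
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-goog-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generate_request(
    "lyria-3-clip-preview",
    "Create a 30-second cheerful acoustic folk song with guitar and harmonica.",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)
print(request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request)` and decode the JSON body of the reply.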

Generate full-length songs

Use the lyria-3-pro-preview model to generate full-length songs that last a few minutes. The Pro model understands musical structure and can create compositions with distinct verses, choruses, and bridges. You can influence the duration by specifying it in your prompt (for example, "create a 2-minute song") or by using timestamps to define the structure.

Python

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents="An epic cinematic orchestral piece about a journey home. "
             "Starts with a solo piano intro, builds through sweeping "
             "strings, and climaxes with a massive wall of sound.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: "An epic cinematic orchestral piece about a journey home. " +
            "Starts with a solo piano intro, builds through sweeping " +
            "strings, and climaxes with a massive wall of sound.",
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text("An epic cinematic orchestral piece about a journey " +
               "home. Starts with a solo piano intro, builds through " +
               "sweeping strings, and climaxes with a massive wall of sound."),
    config,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    "An epic cinematic orchestral piece about a journey home. "
        + "Starts with a solo piano intro, builds through sweeping "
        + "strings, and climaxes with a massive wall of sound.",
    config);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "An epic cinematic orchestral piece about a journey home. Starts with a solo piano intro, builds through sweeping strings, and climaxes with a massive wall of sound."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: "An epic cinematic orchestral piece about a journey home. " +
            "Starts with a solo piano intro, builds through sweeping " +
            "strings, and climaxes with a massive wall of sound.",
  config: config
);

Parse the response

The response from Lyria 3 contains multiple parts. Text parts contain the generated lyrics or a JSON description of the song structure. Parts with inline_data contain the audio bytes.

Python

lyrics = []
audio_data = None

for part in response.parts:
    if part.text is not None:
        lyrics.append(part.text)
    elif part.inline_data is not None:
        audio_data = part.inline_data.data

if lyrics:
    print("Lyrics:\n" + "\n".join(lyrics))

if audio_data:
    with open("output.mp3", "wb") as f:
        f.write(audio_data)

JavaScript

const lyrics = [];
let audioData = null;

for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    lyrics.push(part.text);
  } else if (part.inlineData) {
    audioData = Buffer.from(part.inlineData.data, "base64");
  }
}

if (lyrics.length) {
  console.log("Lyrics:\n" + lyrics.join("\n"));
}

if (audioData) {
  fs.writeFileSync("output.mp3", audioData);
}

Go

var lyrics []string
var audioData []byte

for _, part := range result.Candidates[0].Content.Parts {
    if part.Text != "" {
        lyrics = append(lyrics, part.Text)
    } else if part.InlineData != nil {
        audioData = part.InlineData.Data
    }
}

if len(lyrics) > 0 {
    fmt.Println("Lyrics:\n" + strings.Join(lyrics, "\n"))
}

if audioData != nil {
    err := os.WriteFile("output.mp3", audioData, 0644)
    if err != nil {
        log.Fatal(err)
    }
}

Java

List<String> lyrics = new ArrayList<>();
byte[] audioData = null;

for (Part part : response.parts()) {
  if (part.text().isPresent()) {
    lyrics.add(part.text().get());
  } else if (part.inlineData().isPresent()) {
    audioData = part.inlineData().get().data().get();
  }
}

if (!lyrics.isEmpty()) {
  System.out.println("Lyrics:\n" + String.join("\n", lyrics));
}

if (audioData != null) {
  Files.write(Paths.get("output.mp3"), audioData);
}

C#

var lyrics = new List<string>();
byte[] audioData = null;

foreach (var part in response.Candidates[0].Content.Parts) {
  if (part.Text != null) {
    lyrics.Add(part.Text);
  } else if (part.InlineData != null) {
    audioData = part.InlineData.Data;
  }
}

if (lyrics.Count > 0) {
  Console.WriteLine("Lyrics:\n" + string.Join("\n", lyrics));
}

if (audioData != null) {
  await File.WriteAllBytesAsync("output.mp3", audioData);
}

REST

# The output from the REST API is a JSON object containing base64 encoded data.
# You can extract the text or the audio data using a tool like jq.
# To extract the audio and save it to a file:
curl ... | jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' | base64 -d > output.mp3
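
The jq pipeline above can also be expressed in Python. This is a sketch under the assumption that the REST reply has the `candidates[0].content.parts[]` shape used in the jq filter; `split_parts` is a hypothetical helper, and the sample reply is fabricated placeholder data rather than real model output.

```python
import base64
import json
from typing import Optional


def split_parts(response_json: str) -> tuple[list[str], Optional[bytes]]:
    """Return (text parts, decoded audio bytes) from a generateContent JSON reply."""
    payload = json.loads(response_json)
    parts = payload["candidates"][0]["content"]["parts"]

    texts: list[str] = []
    audio: Optional[bytes] = None
    for part in parts:
        if "text" in part:
            texts.append(part["text"])
        elif "inlineData" in part:
            # REST replies carry binary data as base64 text.
            audio = base64.b64decode(part["inlineData"]["data"])
    return texts, audio


# Fabricated reply used only to exercise the function; not real audio.
sample = json.dumps({
    "candidates": [{
        "content": {
            "parts": [
                {"text": "[Verse]\nExample lyric line"},
                {"inlineData": {
                    "mimeType": "audio/mpeg",
                    "data": base64.b64encode(b"fake-audio").decode("ascii"),
                }},
            ]
        }
    }]
})

texts, audio = split_parts(sample)
print(texts)
print(len(audio))
```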

Generate music from images

Lyria 3 supports multimodal input: you can provide up to 10 images alongside your text prompt, and the model composes music inspired by the visual content.

Python

from PIL import Image

image = Image.open("desert_sunset.jpg")

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=[
        "An atmospheric ambient track inspired by the mood and "
        "colors in this image.",
        image,
    ],
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const imageData = fs.readFileSync("desert_sunset.jpg");
const base64Image = imageData.toString("base64");

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: [
    { text: "An atmospheric ambient track inspired by the mood " +
            "and colors in this image." },
    {
      inlineData: {
        mimeType: "image/jpeg",
        data: base64Image,
      },
    },
  ],
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

imgData, err := os.ReadFile("desert_sunset.jpg")
if err != nil {
    log.Fatal(err)
}

parts := []*genai.Part{
    genai.NewPartFromText("An atmospheric ambient track inspired " +
        "by the mood and colors in this image."),
    &genai.Part{
        InlineData: &genai.Blob{
            MIMEType: "image/jpeg",
            Data:     imgData,
        },
    },
}

contents := []*genai.Content{
    genai.NewContentFromParts(parts, genai.RoleUser),
}

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    contents,
    config,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    Content.fromParts(
        Part.fromText("An atmospheric ambient track inspired by "
            + "the mood and colors in this image."),
        Part.fromBytes(
            Files.readAllBytes(Path.of("desert_sunset.jpg")),
            "image/jpeg")),
    config);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d "{
    \"contents\": [{
      \"parts\":[
          {\"text\": \"An atmospheric ambient track inspired by the mood and colors in this image.\"},
          {
            \"inline_data\": {
              \"mime_type\":\"image/jpeg\",
              \"data\": \"<BASE64_IMAGE_DATA>\"
            }
          }
      ]
    }],
    \"generationConfig\": {
      \"responseModalities\": [\"AUDIO\", \"TEXT\"]
    }
  }"

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: new List<Part> {
    Part.FromText("An atmospheric ambient track inspired by the mood and colors in this image."),
    Part.FromBytes(await File.ReadAllBytesAsync("desert_sunset.jpg"), "image/jpeg")
  },
  config: config
);
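
Because at most 10 images are accepted per request, it can help to check the count before building the request body. The sketch below assumes the REST field names shown in the curl example (`inline_data`, `mime_type`); `build_image_prompt_body` and `MAX_IMAGES` are hypothetical names, and the image bytes are placeholders.

```python
import base64

MAX_IMAGES = 10  # limit stated in this section


def build_image_prompt_body(prompt: str, images: list[tuple[bytes, str]]) -> dict:
    """Build a generateContent body from a text prompt and (bytes, mime_type) images."""
    if len(images) > MAX_IMAGES:
        raise ValueError(f"At most {MAX_IMAGES} images are accepted, got {len(images)}")

    parts: list[dict] = [{"text": prompt}]
    for data, mime_type in images:
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(data).decode("ascii"),
            }
        })
    return {
        "contents": [{"parts": parts}],
        "generationConfig": {"responseModalities": ["AUDIO", "TEXT"]},
    }


# Placeholder bytes stand in for real JPEG files.
body = build_image_prompt_body(
    "An atmospheric ambient track inspired by the mood and colors in these images.",
    [(b"image-one", "image/jpeg"), (b"image-two", "image/jpeg")],
)
print(len(body["contents"][0]["parts"]))  # one text part plus two image parts
```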

Provide custom lyrics

You can write your own lyrics and include them in the prompt. Use section tags such as [Verse], [Chorus], and [Bridge] to help the model understand the song structure:

Python

prompt = """
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
"""

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=prompt,
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const prompt = `
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`;

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: prompt,
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

prompt := `
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text(prompt),
    config,
)

Java

String prompt = """
    Create a dreamy indie pop song with the following lyrics:

    [Verse 1]
    Walking through the neon glow,
    city lights reflect below,
    every shadow tells a story,
    every corner, fading glory.

    [Chorus]
    We are the echoes in the night,
    burning brighter than the light,
    hold on tight, don't let me go,
    we are the echoes down below.

    [Verse 2]
    Footsteps lost on empty streets,
    rhythms sync to heartbeats,
    whispers carried by the breeze,
    dancing through the autumn leaves.
    """;

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    prompt,
    config);

C#

var prompt = @"
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
";

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: prompt,
  config: config
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a dreamy indie pop song with the following lyrics: ..."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'
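
When lyrics come from application data rather than a hand-written string, the tagged prompt can be assembled programmatically. A minimal sketch, assuming the same layout as the prompts in this section; `build_lyrics_prompt` is a hypothetical helper, not part of the SDK.

```python
def build_lyrics_prompt(style: str, sections: list[tuple[str, list[str]]]) -> str:
    """Join a style instruction and tagged lyric sections into one prompt."""
    blocks = [f"{style} with the following lyrics:"]
    for tag, lines in sections:
        blocks.append(f"[{tag}]\n" + "\n".join(lines))
    # Blank lines keep the instruction and each section visually separate.
    return "\n\n".join(blocks)


prompt = build_lyrics_prompt(
    "Create a dreamy indie pop song",
    [
        ("Verse 1", ["Walking through the neon glow,", "city lights reflect below,"]),
        ("Chorus", ["We are the echoes in the night,", "burning brighter than the light,"]),
    ],
)
print(prompt)
```

The resulting string can be passed as `contents` in the same way as the literal prompts above.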

Control timing and structure

You can specify exactly what happens at specific moments in a song by using timestamps. This is useful for controlling when instruments enter, when lyrics are delivered, and how the song progresses:

Python

prompt = """
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
"""

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=prompt,
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const prompt = `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`;

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: prompt,
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

prompt := `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text(prompt),
    config,
)

Java

String prompt = """
    [0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
                  vinyl crackle.
    [0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
                  and gentle vocals singing about a rainy morning.
    [0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
                  synth leads. The lyrics are hopeful and uplifting.
    [0:50 - 1:00] Outro: Fade out with the piano melody alone.
    """;

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    prompt,
    config);

C#

var prompt = @"
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
";

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: prompt,
  config: config
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "[0:00 - 0:10] Intro: ..."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'
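
Writing timestamps by hand is error-prone, since each section has to start where the previous one ends. The sketch below derives the ranges from section durations and emits the same `[M:SS - M:SS] Label: description` layout used in this section; `format_timestamp` and `build_timed_prompt` are hypothetical helpers.

```python
def format_timestamp(total_seconds: int) -> str:
    """Format seconds as M:SS, matching the timestamps in the prompts above."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


def build_timed_prompt(sections: list[tuple[int, str, str]]) -> str:
    """Turn (duration_seconds, label, description) tuples into a timestamped prompt."""
    lines = []
    start = 0
    for duration, label, description in sections:
        end = start + duration
        lines.append(
            f"[{format_timestamp(start)} - {format_timestamp(end)}] {label}: {description}"
        )
        start = end  # the next section begins where this one ends
    return "\n".join(lines)


prompt = build_timed_prompt([
    (10, "Intro", "Begin with a soft lo-fi beat and muffled vinyl crackle."),
    (20, "Verse 1", "Add a warm Fender Rhodes piano melody and gentle vocals."),
    (20, "Chorus", "Full band with upbeat drums and soaring synth leads."),
    (10, "Outro", "Fade out with the piano melody alone."),
])
print(prompt)
```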

Generate instrumental tracks

For background music, soundtracks, or any use case that doesn't require vocals, you can prompt the model to produce instrumental-only tracks:

Python

response = client.models.generate_content(
    model="lyria-3-clip-preview",
    contents="A bright chiptune melody in C Major, retro 8-bit "
             "video game style. Instrumental only, no vocals.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-clip-preview",
  contents: "A bright chiptune melody in C Major, retro 8-bit " +
            "video game style. Instrumental only, no vocals.",
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-clip-preview",
    genai.Text("A bright chiptune melody in C Major, retro 8-bit " +
               "video game style. Instrumental only, no vocals."),
    config,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-clip-preview",
    "A bright chiptune melody in C Major, retro 8-bit "
        + "video game style. Instrumental only, no vocals.",
    config);

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-clip-preview",
  contents: "A bright chiptune melody in C Major, retro 8-bit " +
            "video game style. Instrumental only, no vocals.",
  config: config
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "A bright chiptune melody in C Major, retro 8-bit video game style. Instrumental only, no vocals."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'

Generate music in multiple languages

Lyria 3 generates lyrics in the language of your prompt. To generate a song with French lyrics, write the prompt in French. The model adapts its vocal style and pronunciation to the language.

Python

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents="Crée une chanson pop romantique en français sur un "
             "coucher de soleil à Paris. Utilise du piano et de "
             "la guitare acoustique.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
    ),
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: "Crée une chanson pop romantique en français sur un " +
            "coucher de soleil à Paris. Utilise du piano et de " +
            "la guitare acoustique.",
  config: {
    responseModalities: ["AUDIO", "TEXT"],
  },
});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text("Crée une chanson pop romantique en français sur un " +
               "coucher de soleil à Paris. Utilise du piano et de " +
               "la guitare acoustique."),
    config,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    "Crée une chanson pop romantique en français sur un "
        + "coucher de soleil à Paris. Utilise du piano et de "
        + "la guitare acoustique.",
    config);

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: "Crée une chanson pop romantique en français sur un " +
            "coucher de soleil à Paris. Utilise du piano et de " +
            "la guitare acoustique.",
  config: config
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Crée une chanson pop romantique en français sur un coucher de soleil à Paris. Utilise du piano et de la guitare acoustique."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"]
    }
  }'

Model intelligence

Before it generates audio, Lyria 3 analyzes your prompt and reasons through the musical structure it implies (intro, verse, chorus, bridge, and so on). This planning step helps keep the result structurally coherent and musical.

Interactions API

You can use Lyria 3 models with the Interactions API, a unified interface for interacting with Gemini models and agents. It simplifies state management and long-running tasks for complex multimodal use cases.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="lyria-3-pro-preview",
    input="An epic cinematic orchestral piece about a journey home. " +
          "Starts with a solo piano intro, builds through sweeping " +
          "strings, and climaxes with a massive wall of sound.",
    response_modalities=["AUDIO", "TEXT"]
)

for output in interaction.outputs:
    if output.text:
        print(output.text)
    elif output.inline_data:
        with open("interaction_output.mp3", "wb") as f:
            f.write(output.inline_data.data)
        print("Audio saved to interaction_output.mp3")

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'node:fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
  model: 'lyria-3-pro-preview',
  input: 'An epic cinematic orchestral piece about a journey home. ' +
         'Starts with a solo piano intro, builds through sweeping ' +
         'strings, and climaxes with a massive wall of sound.',
  responseModalities: ['AUDIO', 'TEXT'],
});

for (const output of interaction.outputs) {
  if (output.text) {
    console.log(output.text);
  } else if (output.inlineData) {
    const buffer = Buffer.from(output.inlineData.data, 'base64');
    fs.writeFileSync('interaction_output.mp3', buffer);
    console.log('Audio saved to interaction_output.mp3');
  }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "lyria-3-pro-preview",
    "input": "An epic cinematic orchestral piece about a journey home. Starts with a solo piano intro, builds through sweeping strings, and climaxes with a massive wall of sound.",
    "responseModalities": ["AUDIO", "TEXT"]
}'

Prompting guide

The more specific your prompt, the better the results. Here is what you can include to guide generation:

Genre: Specify a genre or a blend of genres (for example, "lo-fi hip hop", "jazz fusion", "cinematic orchestral").
Instruments: Name specific instruments (for example, "Fender Rhodes piano", "slide guitar", "TR-808 drum machine").
BPM: Set the tempo (for example, "120 BPM", "slow tempo around 70 BPM").
Key/Scale: Specify a musical key (for example, "in G major", "D minor").
Mood and atmosphere: Use descriptive adjectives (for example, "nostalgic", "aggressive", "ethereal", "dreamy").
Structure: Use tags such as [Verse], [Chorus], [Bridge], [Intro], and [Outro], or timestamps, to control the song's progression.
Duration: The Clip model always generates 30-second clips. For the Pro model, specify the desired length in your prompt (for example, "create a 2-minute song") or use timestamps to control the duration.
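
One way to make sure these elements are not forgotten is to collect them in a small structure and render the prompt from it. This is an illustrative sketch only: `MusicPrompt` and its fields are hypothetical, and the sentence template is one possible phrasing, not a format the model requires.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MusicPrompt:
    genre: str
    instruments: list = field(default_factory=list)
    bpm: Optional[int] = None
    key: Optional[str] = None
    mood: list = field(default_factory=list)
    instrumental: bool = False

    def render(self) -> str:
        """Combine the populated fields into a single descriptive prompt."""
        # Mood adjectives come first, then the genre, e.g. "dreamy jazz fusion track".
        text = " ".join(filter(None, [", ".join(self.mood), self.genre, "track"]))
        if self.key:
            text += f" in {self.key}"
        if self.bpm:
            text += f" at {self.bpm} BPM"
        if self.instruments:
            text += " with " + ", ".join(self.instruments)
        text = text[0].upper() + text[1:] + "."
        if self.instrumental:
            text += " Instrumental only, no vocals."
        return text


prompt = MusicPrompt(
    genre="lo-fi hip hop",
    instruments=["Fender Rhodes piano", "TR-808 drum machine"],
    bpm=85,
    key="D minor",
    mood=["nostalgic", "dreamy"],
    instrumental=True,
).render()
print(prompt)
```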

Example prompts

Here are some examples of effective prompts:

"A 30-second lofi hip hop beat with dusty vinyl crackle, mellow Rhodes piano chords, a slow boom-bap drum pattern at 85 BPM, and a jazzy upright bass line. Instrumental only."
"An upbeat, feel-good pop song in G major at 120 BPM with bright acoustic guitar strumming, claps, and warm vocal harmonies about a summer road trip."
"A dark, atmospheric trap beat at 140 BPM with heavy 808 bass, eerie synth pads, sharp hi-hats, and a haunting vocal sample. In D minor."

Best practices

Iterate with Clip first. Use the faster lyria-3-clip-preview model to experiment with prompts before committing to a full-length generation with lyria-3-pro-preview.
Be specific. Vague prompts produce generic results. Mention instruments, BPM, key, mood, and structure for the best output.
Match the language. Write the prompt in the language you want the lyrics to be in.
Use section tags. The [Verse], [Chorus], and [Bridge] tags give the model a clear structure to follow.
Separate lyrics from instructions. When you provide custom lyrics, keep them clearly separate from your musical direction.

Limitations

Safety: All prompts are checked by safety filters. Prompts that trigger the filters are blocked. This includes prompts that request a specific artist's voice or the generation of copyrighted lyrics.
Watermarking: All generated audio includes a SynthID audio watermark for identification. The watermark is inaudible to the human ear and does not affect the listening experience.
Multi-turn editing: Music generation is a single-turn process. The current version of Lyria 3 does not support iteratively editing or refining a generated clip across multiple prompts.
Length: The Clip model always generates 30-second clips. The Pro model generates songs that last a few minutes; you can influence the exact duration through your prompt.
Determinism: Results can vary between calls, even with the same prompt.

What's next

Check pricing for the Lyria 3 models
Try real-time streaming music generation with Lyria RealTime
Generate multi-speaker conversations with the text-to-speech models
Explore how to generate images or videos
Learn how Gemini can understand audio files
Have a real-time conversation with Gemini using the Live API