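Generate music with Lyria 3

Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality 44.1 kHz stereo audio from text prompts or images. The models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements.

The Lyria 3 family includes two models: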

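Model	Model ID	Best for	Duration	Output
Lyria 3 Clip	`lyria-3-clip-preview`	Short clips, loops, and previews	30 seconds	MP3
Lyria 3 Pro	`lyria-3-pro-preview`	Full-length songs with verses, choruses, and bridges	Several minutes (controllable through the prompt)	MP3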

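Both models are available through the standard generateContent method and the new Interactions API. They support multimodal input (text and images) and produce high-fidelity 44.1 kHz stereo audio.

Generate a music clip

The Lyria 3 Clip model always generates a 30-second clip. To generate a clip, call the generateContent method with a text prompt. The response always includes the generated lyrics and song structure alongside the audio.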

Python

from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="lyria-3-clip-preview",
    contents="Create a 30-second cheerful acoustic folk song with "
             "guitar and harmonica.",
)

# Parse the response
for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        with open("clip.mp3", "wb") as f:
            f.write(part.inline_data.data)
        print("Audio saved to clip.mp3")

JavaScript

import { GoogleGenAI } from "@google/genai";
import * as fs from "node:fs";

const ai = new GoogleGenAI({});

async function main() {
  const response = await ai.models.generateContent({
    model: "lyria-3-clip-preview",
    contents: "Create a 30-second cheerful acoustic folk song with " +
              "guitar and harmonica.",

  });

  for (const part of response.candidates[0].content.parts) {
    if (part.text) {
      console.log(part.text);
    } else if (part.inlineData) {
      const buffer = Buffer.from(part.inlineData.data, "base64");
      fs.writeFileSync("clip.mp3", buffer);
      console.log("Audio saved to clip.mp3");
    }
  }
}

main();

Go

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "google.golang.org/genai"
)

func main() {
    ctx := context.Background()
    client, err := genai.NewClient(ctx, nil)
    if err != nil {
        log.Fatal(err)
    }

    result, err := client.Models.GenerateContent(
        ctx,
        "lyria-3-clip-preview",
        genai.Text("Create a 30-second cheerful acoustic folk song " +
                   "with guitar and harmonica."),
        nil,
    )
    if err != nil {
        log.Fatal(err)
    }

    for _, part := range result.Candidates[0].Content.Parts {
        if part.Text != "" {
            fmt.Println(part.Text)
        } else if part.InlineData != nil {
            err := os.WriteFile("clip.mp3", part.InlineData.Data, 0644)
            if err != nil {
                log.Fatal(err)
            }
            fmt.Println("Audio saved to clip.mp3")
        }
    }
}

Java

import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class GenerateMusicClip {
  public static void main(String[] args) throws IOException {

    try (Client client = new Client()) {
      GenerateContentResponse response = client.models.generateContent(
          "lyria-3-clip-preview",
          "Create a 30-second cheerful acoustic folk song with "
              + "guitar and harmonica.");

      for (Part part : response.parts()) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().isPresent()) {
          var blob = part.inlineData().get();
          if (blob.data().isPresent()) {
            Files.write(Paths.get("clip.mp3"), blob.data().get());
            System.out.println("Audio saved to clip.mp3");
          }
        }
      }
    }
  }
}

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a 30-second cheerful acoustic folk song with guitar and harmonica."}
      ]
    }]
  }'

C#

using System;
using System.IO;
using System.Threading.Tasks;
using Google.GenAI;
using Google.GenAI.Types;

public class GenerateMusicClip {
  // The C# entry point must be named Main (capitalized).
  public static async Task Main() {
    var client = new Client();
    var response = await client.Models.GenerateContentAsync(
      model: "lyria-3-clip-preview",
      contents: "Create a 30-second cheerful acoustic folk song with guitar and harmonica."
    );

    foreach (var part in response.Candidates[0].Content.Parts) {
      if (part.Text != null) {
        Console.WriteLine(part.Text);
      } else if (part.InlineData != null) {
        await File.WriteAllBytesAsync("clip.mp3", part.InlineData.Data);
        Console.WriteLine("Audio saved to clip.mp3");
      }
    }
  }
}

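Generate a full-length song

Use the lyria-3-pro-preview model to generate full-length songs that run for several minutes. The Pro model understands musical structure and can create compositions with distinct verses, choruses, and bridges. You can influence the duration by stating it in the prompt (for example, "create a 2-minute song") or by using timestamps to define the structure.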

Python

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents="An epic cinematic orchestral piece about a journey home. "
             "Starts with a solo piano intro, builds through sweeping "
             "strings, and climaxes with a massive wall of sound.",
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: "An epic cinematic orchestral piece about a journey home. " +
            "Starts with a solo piano intro, builds through sweeping " +
            "strings, and climaxes with a massive wall of sound.",

});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text("An epic cinematic orchestral piece about a journey " +
               "home. Starts with a solo piano intro, builds through " +
               "sweeping strings, and climaxes with a massive wall of sound."),
    nil,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    "An epic cinematic orchestral piece about a journey home. "
        + "Starts with a solo piano intro, builds through sweeping "
        + "strings, and climaxes with a massive wall of sound.");

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "An epic cinematic orchestral piece about a journey home. Starts with a solo piano intro, builds through sweeping strings, and climaxes with a massive wall of sound."}
      ]
    }]
  }'

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: "An epic cinematic orchestral piece about a journey home. " +
            "Starts with a solo piano intro, builds through sweeping " +
            "strings, and climaxes with a massive wall of sound."
);

Choose an output format

By default, Lyria 3 models generate audio in MP3 format. For Lyria 3 Pro, you can also request WAV output by setting response_mime_type in generationConfig.

Python

from google.genai import types

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents="An atmospheric ambient track.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO", "TEXT"],
        response_mime_type="audio/wav",
    ),
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: "An atmospheric ambient track.",
  config: {
    responseModalities: ["AUDIO", "TEXT"],
    responseMimeType: "audio/wav",
  },
});

Go

config := &genai.GenerateContentConfig{
    ResponseModalities: []string{"AUDIO", "TEXT"},
    ResponseMIMEType:   "audio/wav",
}

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text("An atmospheric ambient track."),
    config,
)

Java

GenerateContentConfig config = GenerateContentConfig.builder()
    .responseModalities("AUDIO", "TEXT")
    .responseMimeType("audio/wav")
    .build();

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    "An atmospheric ambient track.",
    config);

C#

var config = new GenerateContentConfig {
  ResponseModalities = { "AUDIO", "TEXT" },
  ResponseMimeType = "audio/wav"
};

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: "An atmospheric ambient track.",
  config: config
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "An atmospheric ambient track."}
      ]
    }],
    "generationConfig": {
      "responseModalities": ["AUDIO", "TEXT"],
      "responseMimeType": "audio/wav"
    }
  }'
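
Because the output format can differ from request to request, it may help to derive the file extension from the MIME type of the returned audio part rather than hard-coding `.mp3`. The following is a minimal sketch, not part of any SDK: `extension_for_mime_type` and the `AUDIO_EXTENSIONS` table are hypothetical names, and the exact MIME strings the service returns are an assumption that should be checked against a real response.

```python
# Hypothetical helper: map an audio MIME type to a file extension.
# The MIME strings listed here are assumptions; verify them against
# the MIME type reported in an actual response.
AUDIO_EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/mp3": ".mp3",
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
}


def extension_for_mime_type(mime_type, default=".bin"):
    """Return a file extension for the given MIME type.

    Parameters such as "; rate=44100" are ignored and matching is
    case-insensitive. Unknown or missing types fall back to `default`.
    """
    if not mime_type:
        return default
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return AUDIO_EXTENSIONS.get(base_type, default)
```

With the Python SDK, one way to use this would be to pass `part.inline_data.mime_type` to the helper when choosing the output filename.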

Parse the response

A response from Lyria 3 contains multiple parts. Text parts contain the generated lyrics or a JSON description of the song structure. Parts with inline_data contain the audio bytes.

Python

lyrics = []
audio_data = None

for part in response.parts:
    if part.text is not None:
        lyrics.append(part.text)
    elif part.inline_data is not None:
        audio_data = part.inline_data.data

if lyrics:
    print("Lyrics:\n" + "\n".join(lyrics))

if audio_data:
    with open("output.mp3", "wb") as f:
        f.write(audio_data)

JavaScript

const lyrics = [];
let audioData = null;

for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    lyrics.push(part.text);
  } else if (part.inlineData) {
    audioData = Buffer.from(part.inlineData.data, "base64");
  }
}

if (lyrics.length) {
  console.log("Lyrics:\n" + lyrics.join("\n"));
}

if (audioData) {
  fs.writeFileSync("output.mp3", audioData);
}

Go

var lyrics []string
var audioData []byte

for _, part := range result.Candidates[0].Content.Parts {
    if part.Text != "" {
        lyrics = append(lyrics, part.Text)
    } else if part.InlineData != nil {
        audioData = part.InlineData.Data
    }
}

if len(lyrics) > 0 {
    fmt.Println("Lyrics:\n" + strings.Join(lyrics, "\n"))
}

if audioData != nil {
    err := os.WriteFile("output.mp3", audioData, 0644)
    if err != nil {
        log.Fatal(err)
    }
}

Java

List<String> lyrics = new ArrayList<>();
byte[] audioData = null;

for (Part part : response.parts()) {
  if (part.text().isPresent()) {
    lyrics.add(part.text().get());
  } else if (part.inlineData().isPresent()) {
    audioData = part.inlineData().get().data().get();
  }
}

if (!lyrics.isEmpty()) {
  System.out.println("Lyrics:\n" + String.join("\n", lyrics));
}

if (audioData != null) {
  Files.write(Paths.get("output.mp3"), audioData);
}

C#

var lyrics = new List<string>();
byte[] audioData = null;

foreach (var part in response.Candidates[0].Content.Parts) {
  if (part.Text != null) {
    lyrics.Add(part.Text);
  } else if (part.InlineData != null) {
    audioData = part.InlineData.Data;
  }
}

if (lyrics.Count > 0) {
  Console.WriteLine("Lyrics:\n" + string.Join("\n", lyrics));
}

if (audioData != null) {
  await File.WriteAllBytesAsync("output.mp3", audioData);
}

REST

# The output from the REST API is a JSON object containing base64 encoded data.
# You can extract the text or the audio data using a tool like jq.
# To extract the audio and save it to a file:
curl ... | jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' | base64 -d > output.mp3
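
If you would rather process the REST output in Python than with jq, the same extraction can be sketched as follows. This is an illustrative helper, not an official API: `split_rest_response` is a hypothetical name, and it assumes the response follows the `candidates[0].content.parts[]` shape used in the jq command above, with audio carried as a base64 string in `inlineData.data`.

```python
import base64


def split_rest_response(response_json):
    """Split a generateContent REST response into text and audio.

    Returns (texts, audio_bytes). audio_bytes is None when no part
    carries inlineData.
    """
    texts = []
    audio_bytes = None
    candidates = response_json.get("candidates") or []
    if not candidates:
        return texts, audio_bytes
    parts = (candidates[0].get("content") or {}).get("parts") or []
    for part in parts:
        if "text" in part:
            texts.append(part["text"])
        elif "inlineData" in part:
            # REST responses carry the audio as a base64 string.
            audio_bytes = base64.b64decode(part["inlineData"]["data"])
    return texts, audio_bytes
```

For example, after saving the curl output to a file, you could load it with `json.load` and pass the resulting dictionary to `split_rest_response`, then write the returned bytes to disk.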

Generate music from images

Lyria 3 supports multimodal input. Provide up to 10 images alongside a text prompt, and the model creates music inspired by the visual content.

Python

from PIL import Image

image = Image.open("desert_sunset.jpg")

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=[
        "An atmospheric ambient track inspired by the mood and "
        "colors in this image.",
        image,
    ],
)

JavaScript

const imageData = fs.readFileSync("desert_sunset.jpg");
const base64Image = imageData.toString("base64");

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: [
    { text: "An atmospheric ambient track inspired by the mood " +
            "and colors in this image." },
    {
      inlineData: {
        mimeType: "image/jpeg",
        data: base64Image,
      },
    },
  ],

});

Go

imgData, err := os.ReadFile("desert_sunset.jpg")
if err != nil {
    log.Fatal(err)
}

parts := []*genai.Part{
    genai.NewPartFromText("An atmospheric ambient track inspired " +
        "by the mood and colors in this image."),
    &genai.Part{
        InlineData: &genai.Blob{
            MIMEType: "image/jpeg",
            Data:     imgData,
        },
    },
}

contents := []*genai.Content{
    genai.NewContentFromParts(parts, genai.RoleUser),
}

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    contents,
    nil,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    Content.fromParts(
        Part.fromText("An atmospheric ambient track inspired by "
            + "the mood and colors in this image."),
        Part.fromBytes(
            Files.readAllBytes(Path.of("desert_sunset.jpg")),
            "image/jpeg")));

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d "{
    \"contents\": [{
      \"parts\":[
          {\"text\": \"An atmospheric ambient track inspired by the mood and colors in this image.\"},
          {
            \"inline_data\": {
              \"mime_type\":\"image/jpeg\",
              \"data\": \"<BASE64_IMAGE_DATA>\"
            }
          }
      ]
    }]
  }"

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: new List<Part> {
    Part.FromText("An atmospheric ambient track inspired by the mood and colors in this image."),
    Part.FromBytes(await File.ReadAllBytesAsync("desert_sunset.jpg"), "image/jpeg")
  }
);
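
When sending several images, it can be useful to check the 10-image limit before making the request. The sketch below builds a REST request body in the same shape as the REST example above (`inline_data`, `mime_type`, `data`). The names `build_image_request` and `MAX_IMAGES` are hypothetical, and how the service itself reacts to more than 10 images is not covered here.

```python
import base64

MAX_IMAGES = 10  # Limit stated in this guide.


def build_image_request(prompt, images):
    """Build a generateContent request body from text and images.

    `images` is a list of (mime_type, raw_bytes) tuples. Raises
    ValueError if more than MAX_IMAGES images are supplied.
    """
    if len(images) > MAX_IMAGES:
        raise ValueError(
            f"Got {len(images)} images; at most {MAX_IMAGES} are supported."
        )
    parts = [{"text": prompt}]
    for mime_type, raw_bytes in images:
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                # The REST API expects image bytes as base64 text.
                "data": base64.b64encode(raw_bytes).decode("ascii"),
            }
        })
    return {"contents": [{"parts": parts}]}
```

The returned dictionary can be serialized with `json.dumps` and sent as the body of the POST request.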

Provide custom lyrics

You can write your own lyrics and include them in the prompt. Use section tags such as [Verse], [Chorus], and [Bridge] to help the model understand the song structure.

Python

prompt = """
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
"""

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=prompt,
)

JavaScript

const prompt = `
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`;

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: prompt,

});

Go

prompt := `
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
`

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text(prompt),
    nil,
)

Java

String prompt = """
    Create a dreamy indie pop song with the following lyrics:

    [Verse 1]
    Walking through the neon glow,
    city lights reflect below,
    every shadow tells a story,
    every corner, fading glory.

    [Chorus]
    We are the echoes in the night,
    burning brighter than the light,
    hold on tight, don't let me go,
    we are the echoes down below.

    [Verse 2]
    Footsteps lost on empty streets,
    rhythms sync to heartbeats,
    whispers carried by the breeze,
    dancing through the autumn leaves.
    """;

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    prompt);

C#

var prompt = @"
Create a dreamy indie pop song with the following lyrics:

[Verse 1]
Walking through the neon glow,
city lights reflect below,
every shadow tells a story,
every corner, fading glory.

[Chorus]
We are the echoes in the night,
burning brighter than the light,
hold on tight, don't let me go,
we are the echoes down below.

[Verse 2]
Footsteps lost on empty streets,
rhythms sync to heartbeats,
whispers carried by the breeze,
dancing through the autumn leaves.
";

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: prompt
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Create a dreamy indie pop song with the following lyrics: ..."}
      ]
    }]
  }'
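
If your lyrics live in structured data rather than in a single string, a small helper can assemble the tagged prompt. This is a sketch under assumptions: `build_lyrics_prompt` is a hypothetical function, and the layout (direction first, then bracketed section tags separated by blank lines) simply mirrors the examples above.

```python
def build_lyrics_prompt(direction, sections):
    """Combine a musical direction with tagged lyric sections.

    `sections` is a list of (tag, lines) tuples, for example
    ("Verse 1", ["line one", "line two"]). Each tag is wrapped in
    square brackets and sections are separated by a blank line.
    """
    blocks = [direction.strip()]
    for tag, lines in sections:
        blocks.append("\n".join([f"[{tag}]"] + list(lines)))
    return "\n\n".join(blocks)
```

The resulting string can be passed as `contents`. For REST calls, serializing the request body with `json.dumps` takes care of escaping the newlines.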

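Control timing and structure

You can use timestamps to specify exactly what happens at particular moments in the song. This is useful for controlling when instruments come in, when lyrics are delivered, and how the song progresses.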

Python

prompt = """
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
"""

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents=prompt,
)

JavaScript

const prompt = `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`;

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: prompt,

});

Go

prompt := `
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
`

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text(prompt),
    nil,
)

Java

String prompt = """
    [0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
                  vinyl crackle.
    [0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
                  and gentle vocals singing about a rainy morning.
    [0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
                  synth leads. The lyrics are hopeful and uplifting.
    [0:50 - 1:00] Outro: Fade out with the piano melody alone.
    """;

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    prompt);

C#

var prompt = @"
[0:00 - 0:10] Intro: Begin with a soft lo-fi beat and muffled
              vinyl crackle.
[0:10 - 0:30] Verse 1: Add a warm Fender Rhodes piano melody
              and gentle vocals singing about a rainy morning.
[0:30 - 0:50] Chorus: Full band with upbeat drums and soaring
              synth leads. The lyrics are hopeful and uplifting.
[0:50 - 1:00] Outro: Fade out with the piano melody alone.
";

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: prompt
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "[0:00 - 0:10] Intro: ..."}
      ]
    }]
  }'
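
Timestamped prompts are easy to get wrong by hand, for example with overlapping ranges. The following sketch generates the `[m:ss - m:ss] Label: description` lines from a list of segments and rejects overlaps. `format_timestamp` and `build_timed_prompt` are hypothetical helpers, and the validation rules are an assumption of this example rather than a documented requirement of the model.

```python
def format_timestamp(seconds):
    """Format a whole number of seconds as m:ss."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def build_timed_prompt(segments):
    """Build a timestamped prompt from (start, end, label, text) tuples.

    Start and end are in seconds. Segments must be in order, each must
    have end > start, and they must not overlap.
    """
    lines = []
    previous_end = 0
    for start, end, label, text in segments:
        if start < previous_end or end <= start:
            raise ValueError(f"Invalid segment: {start}-{end} ({label})")
        lines.append(
            f"[{format_timestamp(start)} - {format_timestamp(end)}] "
            f"{label}: {text}"
        )
        previous_end = end
    return "\n".join(lines)
```

Keeping the segments as data also makes it straightforward to stretch or shorten a section and regenerate the prompt.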

Generate instrumental tracks

For background music, game soundtracks, or any use case that doesn't need vocals, you can prompt the model to generate instrumental-only tracks.

Python

response = client.models.generate_content(
    model="lyria-3-clip-preview",
    contents="A bright chiptune melody in C Major, retro 8-bit "
             "video game style. Instrumental only, no vocals.",
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-clip-preview",
  contents: "A bright chiptune melody in C Major, retro 8-bit " +
            "video game style. Instrumental only, no vocals.",

});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-clip-preview",
    genai.Text("A bright chiptune melody in C Major, retro 8-bit " +
               "video game style. Instrumental only, no vocals."),
    nil,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-clip-preview",
    "A bright chiptune melody in C Major, retro 8-bit "
        + "video game style. Instrumental only, no vocals.");

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-clip-preview",
  contents: "A bright chiptune melody in C Major, retro 8-bit " +
            "video game style. Instrumental only, no vocals."
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-clip-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "A bright chiptune melody in C Major, retro 8-bit video game style. Instrumental only, no vocals."}
      ]
    }]
  }'

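Generate music in different languages

Lyria 3 generates lyrics in the language of the prompt. To generate a song with French lyrics, write the prompt in French. The model adapts its vocal style and pronunciation to the language.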

Python

response = client.models.generate_content(
    model="lyria-3-pro-preview",
    contents="Crée une chanson pop romantique en français sur un "
             "coucher de soleil à Paris. Utilise du piano et de "
             "la guitare acoustique.",
)

JavaScript

const response = await ai.models.generateContent({
  model: "lyria-3-pro-preview",
  contents: "Crée une chanson pop romantique en français sur un " +
            "coucher de soleil à Paris. Utilise du piano et de " +
            "la guitare acoustique.",

});

Go

result, err := client.Models.GenerateContent(
    ctx,
    "lyria-3-pro-preview",
    genai.Text("Crée une chanson pop romantique en français sur un " +
               "coucher de soleil à Paris. Utilise du piano et de " +
               "la guitare acoustique."),
    nil,
)

Java

GenerateContentResponse response = client.models.generateContent(
    "lyria-3-pro-preview",
    "Crée une chanson pop romantique en français sur un "
        + "coucher de soleil à Paris. Utilise du piano et de "
        + "la guitare acoustique.");

C#

var response = await client.Models.GenerateContentAsync(
  model: "lyria-3-pro-preview",
  contents: "Crée une chanson pop romantique en français sur un " +
            "coucher de soleil à Paris. Utilise du piano et de " +
            "la guitare acoustique."
);

REST

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/lyria-3-pro-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [
        {"text": "Crée une chanson pop romantique en français sur un coucher de soleil à Paris. Utilise du piano et de la guitare acoustique."}
      ]
    }]
  }'

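Model intelligence

Lyria 3 first analyzes the prompt and reasons about the musical structure it implies (intro, verse, chorus, bridge, and so on). This happens before any audio is generated and helps ensure structural coherence and musicality.

Interactions API

Lyria 3 models can be used with the Interactions API, a unified interface for working with Gemini models and agents. It simplifies state management and long-running tasks for complex multimodal use cases.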

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="lyria-3-pro-preview",
    input="A melancholic jazz fusion track in D minor, " +
          "featuring a smooth saxophone melody, walking bass line, " +
          "and complex drum rhythms.",
)

for output in interaction.outputs:
    if output.text:
        print(output.text)
    elif output.inline_data:
        with open("interaction_output.mp3", "wb") as f:
            f.write(output.inline_data.data)
        print("Audio saved to interaction_output.mp3")

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'node:fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
  model: 'lyria-3-pro-preview',
  input: 'A melancholic jazz fusion track in D minor, ' +
         'featuring a smooth saxophone melody, walking bass line, ' +
         'and complex drum rhythms.',
});

for (const output of interaction.outputs) {
  if (output.text) {
    console.log(output.text);
  } else if (output.inlineData) {
    const buffer = Buffer.from(output.inlineData.data, 'base64');
    fs.writeFileSync('interaction_output.mp3', buffer);
    console.log('Audio saved to interaction_output.mp3');
  }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "lyria-3-pro-preview",
    "input": "A melancholic jazz fusion track in D minor, featuring a smooth saxophone melody, walking bass line, and complex drum rhythms."
}'

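Prompting guide

A prompt can be as simple as "a folk song about a cute cat dodging puddles, with female vocals and the sound of rain," or as detailed and structured as the following:

A 1980s-style synth-pop track with a driving beat, shimmering synthesizers, and a catchy, anthemic chorus. The song has a retro-futuristic feel reminiscent of classic '80s pop hits, with a modern production polish. The tempo is upbeat and danceable at around 120 BPM, with a clear verse-chorus structure and a memorable instrumental hook. The lyrics are about the feeling of getting ready for a party.

Both simple and complex prompts can produce great output. Experiment with the following tips to find what works best for you.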

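Genre

Start the prompt with the genre of music you want, such as hip-hop, rock, or rap. You can also combine genres:

A fusion of metal and rap
Death metal combined with opera
A classical piece with electronic drone elements
Modern electronic dance music (EDM) mixed with Europop

You can also bring in an era:

Early '90s hip-hop
'60s French yé-yé pop
'80s electronic experimentation
2000s mainstream pop

If you prompt for custom genres or regional variants, such as "Berlin techno" or "Bay Area hyphy," the model tries to capture their essence but might not always get it right.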

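Instruments

By default, Lyria 3 builds songs with instruments and tools that suit the genre, so you don't need to spell them out.

However, a dance track won't include a saxophone unless you ask for one. If you want a saxophone solo, prompt for it like this:

A dance track with a driving beat, shimmering synthesizers, and a catchy, anthemic chorus. Include a saxophone solo in the bridge.

Prompts can name specific instruments, describe how they sound, and describe how they interact with each other. Use these combinations to create a particular mood or texture:

A dirty, distorted bassline facing off against clean, crisp hi-hats
A warm analog synth pad swelling beneath a dry, intimate acoustic guitar
A wall of sound built from layers of fuzzy guitars, with the vocals buried and distant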

Song structure

You can outline the song's progression in the prompt. Use arrows or a list to define the flow:

[Intro] -> [Verse 1] -> [Chorus] -> [Verse 2] -> [Chorus] -> [Bridge] -> [Outro]
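Start with a quiet piano intro, build into a big verse, drop to silence, then explode into the chorus.

You can also specify how the energy level changes between these sections:

Build tension in the pre-chorus, then drop to silence before a massive, explosive chorus
A gradual crescendo across the whole song, adding instruments one at a time until it becomes a chaotic wall of sound
A sudden stop after the bridge, followed by an a cappella chorus

You can also prompt for the exact time at which something should happen:

Drop at 12 seconds
Say "what" every 2 seconds
The chorus starts at 22 seconds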

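Lyrics

By default, the model generates vocals and lyrics. You can provide your own lyrics, request no lyrics (an instrumental), or steer lyric generation in the direction you want.

Lyrics are written in the language of the prompt. You can also request lyrics in a different language, for example "write the lyrics in French."

Use your own lyrics

To give the model your own lyrics, include them in the prompt with the prefix "Lyrics:".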

Lyrics:

[Intro]
Oooh, oooh

[Verse 1]
Let's go
Let's go
Go with the flow

[Chorus]
...

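You can prefix parts of the song with section titles such as [Intro], [Verse 1], [Pre-chorus], [Chorus], and [Outro].

If you want a word or line repeated, like an echo or a backing vocal, include it in parentheses, for example "Let's go (go)".

Prompt the model to write lyrics

If you want Lyria 3 to write the lyrics, we recommend including details about the lyrical content in the prompt. Otherwise, the model has to infer the subject from the music prompt, and the result might not be what you want.

The lyrics are about heartbreak and the pain of lost love. The singer reflects on a past relationship and the memories that keep coming back.

If you want a repeating chorus, it helps to ask for one in the prompt:

The lyrics are about heartbreak and the pain of lost love. The singer reflects on a past relationship and the memories that keep coming back. The powerful chorus focuses on getting through the pain and moving on.

Lyria 3 automatically steers the lyric structure to fit the type of music you request, but you can also reinforce this in the prompt. For example:

An EDM track that repeats the same energetic phrase over and over.

You can also prompt for vocal effects that aren't strictly lyrics. For example:

A repeating movie sample says "Unbelievable!" throughout the song
A high-energy techno track. Right before the drop, all sound stops and a small voice says "I don't know what I'm doing here," and then the music drops.
The track starts with a conversation about how '90s movies were better than today's. It then transitions into a pop song.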

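Vocals

You can prompt for how the lyrics are delivered. For best results, provide a detailed singer profile that covers gender, timbre, and vocal range:

Female soprano: Light and soaring, with a clear, crystalline tone. An airy, breathy texture, able to hit high notes in the whistle register.
Female alto: Rich and warm, with a husky low range. A smoky tone with a touch of vocal fry. Soulful and resonant.
Male tenor: Bright, piercing, and energetic. A youthful tone with a nasal edge and strong belting power that cuts through the mix.
Male baritone: Deep, chocolatey, and velvety smooth. A resonant chest voice with a comfortable, crooning delivery.
Veteran rocker (male): A gravelly tone and texture reminiscent of '90s grunge. A strained upper range for emotional intensity.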

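Other prompt parameters

You can refine the prompt further with the following parameters:

Key/scale: Specify the musical key (for example, "G major" or "D minor").
Mood and atmosphere: Use descriptive adjectives (for example, "nostalgic," "aggressive," "ethereal," or "dreamy").
Duration: The Clip model always generates 30-second clips. For the Pro model, state the length you want in the prompt (for example, "create a 2-minute song") or use timestamps to control the duration.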
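
The prompt elements described in this guide (genre, instruments, tempo, key, mood, and vocals) can be combined programmatically. The sketch below is one possible way to do that. `build_music_prompt` is a hypothetical helper, and the sentence templates are arbitrary choices made for this example, not wording the model requires.

```python
def build_music_prompt(genre, instruments=(), bpm=None, key=None,
                       mood=None, instrumental=False):
    """Assemble a one-paragraph music prompt from optional details.

    Only the fields that are provided are included, in the order:
    genre, instruments, tempo, key, mood, vocals.
    """
    # The genre goes first, following the guidance in the Genre section.
    sentences = [f"A {genre} track."]
    if instruments:
        sentences.append("Featuring " + ", ".join(instruments) + ".")
    if bpm is not None:
        sentences.append(f"Tempo around {bpm} BPM.")
    if key:
        sentences.append(f"In {key}.")
    if mood:
        sentences.append(f"The mood is {mood}.")
    if instrumental:
        sentences.append("Instrumental only, no vocals.")
    return " ".join(sentences)
```

A builder like this makes it easy to vary one parameter at a time while experimenting with the Clip model.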

Prompt examples

Here are some examples of effective prompts:

"A 30-second lofi hip hop beat with dusty vinyl crackle, mellow Rhodes piano chords, a slow boom-bap drum pattern at 85 BPM, and a jazzy upright bass line. Instrumental only."
"An upbeat, feel-good pop song in G major at 120 BPM with bright acoustic guitar strumming, claps, and warm vocal harmonies about a summer road trip."
"A dark, atmospheric trap beat at 140 BPM with heavy 808 bass, eerie synth pads, sharp hi-hats, and a haunting vocal sample. In D minor."

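Best practices

Iterate with Clip first. Experiment with prompts on the faster lyria-3-clip-preview model before generating full-length songs with lyria-3-pro-preview.
Be specific. Vague prompts produce generic results. For the best output, specify instruments, BPM, key, mood, and structure.
Use section tags. [Verse], [Chorus], and [Bridge] tags give the model a clear structure to follow.
Separate lyrics from instructions. When you provide custom lyrics, clearly distinguish them from the instructions about musical direction.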

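Limitations

Safety: All prompts are checked by safety filters. Prompts that trigger the filters are blocked. This includes prompts that request a specific artist's voice or the generation of copyrighted lyrics.
Watermarking: All generated audio includes a SynthID audio watermark for identification. The watermark is inaudible to the human ear and doesn't affect the listening experience.
Multi-turn editing: Music generation is a single-turn process. The current version of Lyria 3 doesn't support iteratively editing or refining a generated clip across multiple prompts.
Length: The Clip model always generates 30-second clips. The Pro model generates songs that run for several minutes, and you can request a duration in the prompt.
Determinism: The same prompt can produce different results on each call.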
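
Because a blocked prompt returns no audio, client code may want to check for that case before writing a file. The sketch below inspects a REST response dictionary. `describe_missing_audio` is a hypothetical helper, and it assumes that a blocked Lyria 3 request follows the general generateContent response shape, with a `promptFeedback.blockReason` field. This guide does not specify that shape, so treat it as an assumption to verify.

```python
def describe_missing_audio(response_json):
    """Return a short reason if the response has no audio, else None.

    Assumes the generateContent REST response shape: an optional
    promptFeedback.blockReason and candidates[].content.parts[].
    """
    feedback = response_json.get("promptFeedback") or {}
    block_reason = feedback.get("blockReason")
    if block_reason:
        return f"Prompt blocked: {block_reason}"
    for candidate in response_json.get("candidates") or []:
        parts = (candidate.get("content") or {}).get("parts") or []
        if any("inlineData" in part for part in parts):
            return None
    return "No audio part in response"
```

A caller could log the returned reason and skip saving, rather than failing later on a missing audio part.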

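What's next

Check the pricing for Lyria 3 models.
Try real-time streaming music generation with Lyria RealTime.
Generate multi-speaker conversations with the TTS models.
Learn how to generate images and videos.
Learn how Gemini understands audio files.
Have real-time conversations with Gemini using the Live API.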
