Gemini dapat membuat dan memproses gambar secara percakapan. Anda dapat memberi Gemini perintah dengan teks, gambar, atau kombinasi keduanya sehingga Anda dapat membuat, mengedit, dan melakukan iterasi pada visual dengan kontrol yang belum pernah ada sebelumnya:
- Text-to-Image: Hasilkan gambar berkualitas tinggi dari deskripsi teks sederhana atau kompleks.
- Gambar + Teks ke Gambar (Pengeditan): Berikan gambar dan gunakan perintah teks untuk menambahkan, menghapus, atau mengubah elemen, mengubah gaya, atau menyesuaikan gradasi warna.
- Multi-Gambar ke Gambar (Komposisi & Transfer Gaya): Menggunakan beberapa gambar input untuk menyusun adegan baru atau mentransfer gaya dari satu gambar ke gambar lain.
- Penyempurnaan Iteratif: Lakukan percakapan untuk menyempurnakan gambar Anda secara progresif dalam beberapa giliran, dengan melakukan penyesuaian kecil hingga gambar tersebut sempurna.
- Rendering Teks dengan Akurasi Tinggi: Hasilkan gambar secara akurat yang berisi teks yang mudah dibaca dan ditempatkan dengan baik, ideal untuk logo, diagram, dan poster.
Semua gambar yang dihasilkan menyertakan watermark SynthID.
Pembuatan gambar (teks ke gambar)
Kode berikut menunjukkan cara membuat gambar berdasarkan perintah deskriptif.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
prompt = (
"Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme"
)
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[prompt],
)
for part in response.candidates[0].content.parts:
if part.text is not None:
print(part.text)
elif part.inline_data is not None:
image = Image.open(BytesIO(part.inline_data.data))
image.save("generated_image.png")
JavaScript
import { GoogleGenAI, Modality } from "@google/genai";
import * as fs from "node:fs";
async function main() {
const ai = new GoogleGenAI({});
const prompt =
"Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme";
const response = await ai.models.generateContent({
model: "gemini-2.5-flash-image-preview",
contents: prompt,
});
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log(part.text);
} else if (part.inlineData) {
const imageData = part.inlineData.data;
const buffer = Buffer.from(imageData, "base64");
fs.writeFileSync("gemini-native-image.png", buffer);
console.log("Image saved as gemini-native-image.png");
}
}
}
main();
Go
package main
import (
"context"
"fmt"
"os"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
result, _ := client.Models.GenerateContent(
ctx,
"gemini-2.5-flash-image-preview",
genai.Text("Create a picture of a nano banana dish in a " +
" fancy restaurant with a Gemini theme"),
)
for _, part := range result.Candidates[0].Content.Parts {
if part.Text != "" {
fmt.Println(part.Text)
} else if part.InlineData != nil {
imageBytes := part.InlineData.Data
outputFilename := "gemini_generated_image.png"
_ = os.WriteFile(outputFilename, imageBytes, 0644)
}
}
}
REST
curl -s -X POST
"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme"}
]
}]
}' \
| grep -o '"data": "[^"]*"' \
| cut -d'"' -f4 \
| base64 --decode > gemini-native-image.png

Pengeditan gambar (teks dan gambar ke gambar)
Pengingat: Pastikan Anda memiliki hak yang diperlukan atas gambar apa pun yang Anda upload. Jangan membuat konten yang melanggar hak orang lain, termasuk video atau gambar yang menipu, melecehkan, atau membahayakan. Penggunaan layanan AI generatif ini oleh Anda tunduk pada Kebijakan Penggunaan Terlarang kami.
Contoh berikut menunjukkan cara mengupload gambar berenkode base64. Untuk beberapa gambar, payload yang lebih besar, dan jenis MIME yang didukung, lihat halaman Pemahaman gambar.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
prompt = (
"Create a picture of my cat eating a nano-banana in a "
"fancy restaurant under the Gemini constellation",
)
image = Image.open("/path/to/cat_image.png")
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[prompt, image],
)
for part in response.candidates[0].content.parts:
if part.text is not None:
print(part.text)
elif part.inline_data is not None:
image = Image.open(BytesIO(part.inline_data.data))
image.save("generated_image.png")
JavaScript
import { GoogleGenAI, Modality } from "@google/genai";
import * as fs from "node:fs";
async function main() {
const ai = new GoogleGenAI({});
const imagePath = "path/to/cat_image.png";
const imageData = fs.readFileSync(imagePath);
const base64Image = imageData.toString("base64");
const prompt = [
{ text: "Create a picture of my cat eating a nano-banana in a" +
"fancy restaurant under the Gemini constellation" },
{
inlineData: {
mimeType: "image/png",
data: base64Image,
},
},
];
const response = await ai.models.generateContent({
model: "gemini-2.5-flash-image-preview",
contents: prompt,
});
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log(part.text);
} else if (part.inlineData) {
const imageData = part.inlineData.data;
const buffer = Buffer.from(imageData, "base64");
fs.writeFileSync("gemini-native-image.png", buffer);
console.log("Image saved as gemini-native-image.png");
}
}
}
main();
Go
package main
import (
"context"
"fmt"
"os"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, nil)
if err != nil {
log.Fatal(err)
}
imagePath := "/path/to/cat_image.png"
imgData, _ := os.ReadFile(imagePath)
parts := []*genai.Part{
genai.NewPartFromText("Create a picture of my cat eating a nano-banana in a fancy restaurant under the Gemini constellation"),
&genai.Part{
InlineData: &genai.Blob{
MIMEType: "image/png",
Data: imgData,
},
},
}
contents := []*genai.Content{
genai.NewContentFromParts(parts, genai.RoleUser),
}
result, _ := client.Models.GenerateContent(
ctx,
"gemini-2.5-flash-image-preview",
contents,
)
for _, part := range result.Candidates[0].Content.Parts {
if part.Text != "" {
fmt.Println(part.Text)
} else if part.InlineData != nil {
imageBytes := part.InlineData.Data
outputFilename := "gemini_generated_image.png"
_ = os.WriteFile(outputFilename, imageBytes, 0644)
}
}
}
REST
IMG_PATH=/path/to/cat_image.jpeg
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
B64FLAGS="--input"
else
B64FLAGS="-w0"
fi
IMG_BASE64=$(base64 "$B64FLAGS" "$IMG_PATH" 2>&1)
curl -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d "{
\"contents\": [{
\"parts\":[
{\"text\": \"'Create a picture of my cat eating a nano-banana in a fancy restaurant under the Gemini constellation\"},
{
\"inline_data\": {
\"mime_type\":\"image/jpeg\",
\"data\": \"$IMG_BASE64\"
}
}
]
}]
}" \
| grep -o '"data": "[^"]*"' \
| cut -d'"' -f4 \
| base64 --decode > gemini-edited-image.png

Mode pembuatan gambar lainnya
Gemini mendukung mode interaksi gambar lainnya berdasarkan struktur dan konteks perintah, termasuk:
- Teks ke gambar dan teks (disisipkan): Menghasilkan gambar dengan teks terkait.
- Contoh perintah: "Buat resep paella yang diilustrasikan."
- Gambar dan teks ke gambar dan teks (berselang-seling): Menggunakan gambar dan teks input untuk membuat gambar dan teks baru yang terkait.
- Contoh perintah: (Dengan gambar ruangan yang dilengkapi perabot) "Sofa warna apa lagi yang cocok untuk ruangan saya? Bisakah Anda memperbarui gambar ini?"
- Pengeditan gambar multi-turn (chat): Terus buat dan edit gambar secara percakapan.
- Contoh perintah: [upload gambar mobil biru.] , "Ubah mobil ini menjadi mobil convertible", "Sekarang ubah warnanya menjadi kuning."
Panduan dan strategi penulisan perintah
Menguasai Pembuatan Gambar Gemini 2.5 Flash dimulai dengan satu prinsip dasar:
Deskripsikan suasananya, jangan hanya mencantumkan kata kunci. Kekuatan inti model ini adalah pemahaman bahasanya yang mendalam. Paragraf naratif dan deskriptif hampir selalu menghasilkan gambar yang lebih baik dan lebih koheren daripada daftar kata-kata yang tidak terhubung.
Perintah untuk membuat gambar
Strategi berikut akan membantu Anda membuat perintah yang efektif untuk membuat gambar yang Anda inginkan.
1. Adegan fotorealistik
Untuk gambar yang realistis, gunakan istilah fotografi. Sebutkan sudut kamera, jenis lensa, pencahayaan, dan detail halus untuk memandu model menghasilkan gambar fotorealistik.
Template
A photorealistic [shot type] of [subject], [action or expression], set in
[environment]. The scene is illuminated by [lighting description], creating
a [mood] atmosphere. Captured with a [camera/lens details], emphasizing
[key textures and details]. The image should be in a [aspect ratio] format.
Perintah
A photorealistic close-up portrait of an elderly Japanese ceramicist with
deep, sun-etched wrinkles and a warm, knowing smile. He is carefully
inspecting a freshly glazed tea bowl. The setting is his rustic,
sun-drenched workshop. The scene is illuminated by soft, golden hour light
streaming through a window, highlighting the fine texture of the clay.
Captured with an 85mm portrait lens, resulting in a soft, blurred background
(bokeh). The overall mood is serene and masterful. Vertical portrait
orientation.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="A photorealistic close-up portrait of an elderly Japanese ceramicist with deep, sun-etched wrinkles and a warm, knowing smile. He is carefully inspecting a freshly glazed tea bowl. The setting is his rustic, sun-drenched workshop with pottery wheels and shelves of clay pots in the background. The scene is illuminated by soft, golden hour light streaming through a window, highlighting the fine texture of the clay and the fabric of his apron. Captured with an 85mm portrait lens, resulting in a soft, blurred background (bokeh). The overall mood is serene and masterful.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('photorealistic_example.png')
image.show()

2. Ilustrasi & stiker bergaya
Untuk membuat stiker, ikon, atau aset, sebutkan gaya secara eksplisit dan minta latar belakang transparan.
Template
A [style] sticker of a [subject], featuring [key characteristics] and a
[color palette]. The design should have [line style] and [shading style].
The background must be transparent.
Perintah
A kawaii-style sticker of a happy red panda wearing a tiny bamboo hat. It's
munching on a green bamboo leaf. The design features bold, clean outlines,
simple cel-shading, and a vibrant color palette. The background must be white.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="A kawaii-style sticker of a happy red panda wearing a tiny bamboo hat. It's munching on a green bamboo leaf. The design features bold, clean outlines, simple cel-shading, and a vibrant color palette. The background must be white.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('red_panda_sticker.png')
image.show()

3. Teks yang akurat dalam gambar
Gemini unggul dalam merender teks. Jelaskan teks, gaya font (secara deskriptif), dan desain keseluruhan.
Template
Create a [image type] for [brand/concept] with the text "[text to render]"
in a [font style]. The design should be [style description], with a
[color scheme].
Perintah
Create a modern, minimalist logo for a coffee shop called 'The Daily Grind'.
The text should be in a clean, bold, sans-serif font. The design should
feature a simple, stylized icon of a a coffee bean seamlessly integrated
with the text. The color scheme is black and white.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="Create a modern, minimalist logo for a coffee shop called 'The Daily Grind'. The text should be in a clean, bold, sans-serif font. The design should feature a simple, stylized icon of a a coffee bean seamlessly integrated with the text. The color scheme is black and white.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('logo_example.png')
image.show()

4. Mockup produk & fotografi komersial
Sempurna untuk membuat foto produk yang bersih dan profesional untuk e-commerce, iklan, atau branding.
Template
A high-resolution, studio-lit product photograph of a [product description]
on a [background surface/description]. The lighting is a [lighting setup,
e.g., three-point softbox setup] to [lighting purpose]. The camera angle is
a [angle type] to showcase [specific feature]. Ultra-realistic, with sharp
focus on [key detail]. [Aspect ratio].
Perintah
A high-resolution, studio-lit product photograph of a minimalist ceramic
coffee mug in matte black, presented on a polished concrete surface. The
lighting is a three-point softbox setup designed to create soft, diffused
highlights and eliminate harsh shadows. The camera angle is a slightly
elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with
sharp focus on the steam rising from the coffee. Square image.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="A high-resolution, studio-lit product photograph of a minimalist ceramic coffee mug in matte black, presented on a polished concrete surface. The lighting is a three-point softbox setup designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a slightly elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with sharp focus on the steam rising from the coffee. Square image.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('product_mockup.png')
image.show()

5. Desain minimalis & ruang negatif
Sangat cocok untuk membuat latar belakang situs, presentasi, atau materi pemasaran yang akan ditumpuk dengan teks.
Template
A minimalist composition featuring a single [subject] positioned in the
[bottom-right/top-left/etc.] of the frame. The background is a vast, empty
[color] canvas, creating significant negative space. Soft, subtle lighting.
[Aspect ratio].
Perintah
A minimalist composition featuring a single, delicate red maple leaf
positioned in the bottom-right of the frame. The background is a vast, empty
off-white canvas, creating significant negative space for text. Soft,
diffused lighting from the top left. Square image.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="A minimalist composition featuring a single, delicate red maple leaf positioned in the bottom-right of the frame. The background is a vast, empty off-white canvas, creating significant negative space for text. Soft, diffused lighting from the top left. Square image.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('minimalist_design.png')
image.show()

6. Seni sekuensial (Panel komik / Storyboard)
Membangun konsistensi karakter dan deskripsi adegan untuk membuat panel penceritaan visual.
Template
A single comic book panel in a [art style] style. In the foreground,
[character description and action]. In the background, [setting details].
The panel has a [dialogue/caption box] with the text "[Text]". The lighting
creates a [mood] mood. [Aspect ratio].
Perintah
A single comic book panel in a gritty, noir art style with high-contrast
black and white inks. In the foreground, a detective in a trench coat stands
under a flickering streetlamp, rain soaking his shoulders. In the
background, the neon sign of a desolate bar reflects in a puddle. A caption
box at the top reads "The city was a tough place to keep secrets." The
lighting is harsh, creating a dramatic, somber mood. Landscape.
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents="A single comic book panel in a gritty, noir art style with high-contrast black and white inks. In the foreground, a detective in a trench coat stands under a flickering streetlamp, rain soaking his shoulders. In the background, the neon sign of a desolate bar reflects in a puddle. A caption box at the top reads \"The city was a tough place to keep secrets.\" The lighting is harsh, creating a dramatic, somber mood. Landscape.",
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('comic_panel.png')
image.show()

Perintah untuk mengedit gambar
Contoh ini menunjukkan cara memberikan gambar bersama perintah teks Anda untuk pengeditan, komposisi, dan transfer gaya.
1. Menambahkan dan menghapus elemen
Berikan gambar dan jelaskan perubahan Anda. Model akan cocok dengan gaya, pencahayaan, dan perspektif gambar asli.
Template
Using the provided image of [subject], please [add/remove/modify] [element]
to/from the scene. Ensure the change is [description of how the change should
integrate].
Perintah
"Using the provided image of my cat, please add a small, knitted wizard hat
on its head. Make it look like it's sitting comfortably and matches the soft
lighting of the photo."
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Base image prompt: "A photorealistic picture of a fluffy ginger cat sitting on a wooden floor, looking directly at the camera. Soft, natural light from a window."
image_input = Image.open('/path/to/your/cat_photo.png')
text_input = """Using the provided image of my cat, please add a small, knitted wizard hat on its head. Make it look like it's sitting comfortably and not falling off."""
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[text_input, image_input],
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('cat_with_hat.png')
image.show()
Input |
Output |
![]() |
![]() |
2. Lukisan (Penyamaran semantik)
Tentukan "mask" secara percakapan untuk mengedit bagian tertentu dari gambar tanpa mengubah bagian lainnya.
Template
Using the provided image, change only the [specific element] to [new
element/description]. Keep everything else in the image exactly the same,
preserving the original style, lighting, and composition.
Perintah
"Using the provided image of a living room, change only the blue sofa to be
a vintage, brown leather chesterfield sofa. Keep the rest of the room,
including the pillows on the sofa and the lighting, unchanged."
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Base image prompt: "A wide shot of a modern, well-lit living room with a prominent blue sofa in the center. A coffee table is in front of it and a large window is in the background."
living_room_image = Image.open('/path/to/your/living_room.png')
text_input = """Using the provided image of a living room, change only the blue sofa to be a vintage, brown leather chesterfield sofa. Keep the rest of the room, including the pillows on the sofa and the lighting, unchanged."""
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[living_room_image, text_input],
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('living_room_edited.png')
image.show()
Input |
Output |
![]() |
![]() |
3. Transfer gaya
Berikan gambar dan minta model untuk membuat ulang kontennya dalam gaya artistik yang berbeda.
Template
Transform the provided photograph of [subject] into the artistic style of [artist/art style]. Preserve the original composition but render it with [description of stylistic elements].
Perintah
"Transform the provided photograph of a modern city street at night into the artistic style of Vincent van Gogh's 'Starry Night'. Preserve the original composition of buildings and cars, but render all elements with swirling, impasto brushstrokes and a dramatic palette of deep blues and bright yellows."
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Base image prompt: "A photorealistic, high-resolution photograph of a busy city street in New York at night, with bright neon signs, yellow taxis, and tall skyscrapers."
city_image = Image.open('/path/to/your/city.png')
text_input = """Transform the provided photograph of a modern city street at night into the artistic style of Vincent van Gogh's 'Starry Night'. Preserve the original composition of buildings and cars, but render all elements with swirling, impasto brushstrokes and a dramatic palette of deep blues and bright yellows."""
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[city_image, text_input],
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('city_style_transfer.png')
image.show()
Input |
Output |
![]() |
![]() |
4. Komposisi lanjutan: Menggabungkan beberapa gambar
Berikan beberapa gambar sebagai konteks untuk membuat adegan komposit baru. Fitur ini sangat cocok untuk membuat mockup produk atau kolase kreatif.
Template
Create a new image by combining the elements from the provided images. Take
the [element from image 1] and place it with/on the [element from image 2].
The final image should be a [description of the final scene].
Perintah
"Create a professional e-commerce fashion photo. Take the blue floral dress
from the first image and let the woman from the second image wear it.
Generate a realistic, full-body shot of the woman wearing the dress, with
the lighting and shadows adjusted to match the outdoor environment."
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Base image prompts:
# 1. Dress: "A professionally shot photo of a blue floral summer dress on a plain white background, ghost mannequin style."
# 2. Model: "Full-body shot of a woman with her hair in a bun, smiling, standing against a neutral grey studio background."
dress_image = Image.open('/path/to/your/dress.png')
model_image = Image.open('/path/to/your/model.png')
text_input = """Create a professional e-commerce fashion photo. Take the blue floral dress from the first image and let the woman from the second image wear it. Generate a realistic, full-body shot of the woman wearing the dress, with the lighting and shadows adjusted to match the outdoor environment."""
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[dress_image, model_image, text_input],
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('fashion_ecommerce_shot.png')
image.show()
Masukan 1 |
Input 2 |
Output |
![]() |
![]() |
![]() |
5. Mempertahankan detail fidelitas tinggi
Untuk memastikan detail penting (seperti wajah atau logo) dipertahankan selama pengeditan, deskripsikan detail tersebut secara mendalam bersama dengan permintaan pengeditan Anda.
Template
Using the provided images, place [element from image 2] onto [element from
image 1]. Ensure that the features of [element from image 1] remain
completely unchanged. The added element should [description of how the
element should integrate].
Perintah
"Take the first image of the woman with brown hair, blue eyes, and a neutral
expression. Add the logo from the second image onto her black t-shirt.
Ensure the woman's face and features remain completely unchanged. The logo
should look like it's naturally printed on the fabric, following the folds
of the shirt."
Python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO
client = genai.Client()
# Base image prompts:
# 1. Woman: "A professional headshot of a woman with brown hair and blue eyes, wearing a plain black t-shirt, against a neutral studio background."
# 2. Logo: "A simple, modern logo with the letters 'G' and 'A' in a white circle."
woman_image = Image.open('/path/to/your/woman.png')
logo_image = Image.open('/path/to/your/logo.png')
text_input = """Take the first image of the woman with brown hair, blue eyes, and a neutral expression. Add the logo from the second image onto her black t-shirt. Ensure the woman's face and features remain completely unchanged. The logo should look like it's naturally printed on the fabric, following the folds of the shirt."""
# Generate an image from a text prompt
response = client.models.generate_content(
model="gemini-2.5-flash-image-preview",
contents=[woman_image, logo_image, text_input],
)
image_parts = [
part.inline_data.data
for part in response.candidates[0].content.parts
if part.inline_data
]
if image_parts:
image = Image.open(BytesIO(image_parts[0]))
image.save('woman_with_logo.png')
image.show()
Masukan 1 |
Input 2 |
Output |
![]() |
![]() |
![]() |
Praktik Terbaik
Untuk meningkatkan hasil dari baik menjadi luar biasa, masukkan strategi profesional ini ke dalam alur kerja Anda.
- Buat Perintah yang Sangat Spesifik: Semakin banyak detail yang Anda berikan, semakin besar kontrol yang Anda miliki. Daripada "armor fantasi", deskripsikan: "armor pelat elf yang indah, diukir dengan pola daun perak, dengan kerah tinggi dan pauldron berbentuk sayap elang".
- Berikan Konteks dan Maksud: Jelaskan tujuan gambar. Pemahaman model tentang konteks akan memengaruhi output akhir. Misalnya, "Buat logo untuk merek perawatan kulit kelas atas yang minimalis" akan memberikan hasil yang lebih baik daripada hanya "Buat logo".
- Lakukan Iterasi dan Tingkatkan Kualitas: Jangan mengharapkan gambar yang sempurna pada percobaan pertama. Gunakan sifat percakapan model untuk melakukan perubahan kecil. Lanjutkan dengan perintah seperti, "Bagus, tapi bisakah kamu membuat pencahayaannya sedikit lebih hangat?" atau "Biarkan semuanya sama, tapi ubah ekspresi karakter menjadi lebih serius."
- Gunakan Petunjuk Langkah demi Langkah: Untuk adegan kompleks dengan banyak elemen, pecah perintah Anda menjadi beberapa langkah. "Pertama, buat latar belakang hutan berkabut yang tenang saat fajar. Kemudian, di latar depan, tambahkan altar batu kuno yang tertutup lumut. Terakhir, letakkan pedang tunggal yang bercahaya di atas altar."
- Gunakan "Semantic Negative Prompts": Daripada mengatakan "tidak ada mobil", deskripsikan adegan yang diinginkan secara positif: "jalan yang kosong dan sepi tanpa tanda-tanda lalu lintas".
- Mengontrol Kamera: Gunakan bahasa fotografi dan sinematik untuk mengontrol komposisi. Istilah seperti
wide-angle shot
,macro shot
,low-angle perspective
.
Batasan
- Untuk performa terbaik, gunakan bahasa berikut: EN, es-MX, ja-JP, zh-CN, hi-IN.
- Pembuatan gambar tidak mendukung input audio atau video.
- Model tidak akan selalu mengikuti jumlah output gambar persis yang diminta pengguna secara eksplisit.
- Model ini berfungsi paling baik dengan maksimal 3 gambar sebagai input.
- Saat membuat teks untuk gambar, Gemini akan bekerja paling baik jika Anda membuat teks terlebih dahulu, lalu meminta gambar dengan teks tersebut.
- Mengupload gambar anak saat ini tidak didukung di EEA, Swiss, dan Inggris Raya.
- Semua gambar yang dihasilkan menyertakan watermark SynthID.
Kapan menggunakan Imagen
Selain menggunakan kemampuan pembuatan gambar bawaan Gemini, Anda juga dapat mengakses Imagen, model pembuatan gambar khusus kami, melalui Gemini API.
Atribut | Imagen | Gambar Native Gemini |
---|---|---|
Keunggulan | Model pembuatan gambar tercanggih hingga saat ini. Direkomendasikan untuk gambar fotorealistik, kejernihan yang lebih tajam, ejaan dan tipografi yang lebih baik. | Rekomendasi default. Fleksibilitas yang tak tertandingi, pemahaman kontekstual, dan pengeditan tanpa mask yang sederhana. Mampu melakukan pengeditan percakapan bolak-balik secara unik. |
Ketersediaan | Tersedia secara umum | Pratinjau (Penggunaan produksi diizinkan) |
Latensi | Rendah. Dioptimalkan untuk performa yang mendekati real-time. | Lebih banyak. Lebih banyak komputasi diperlukan untuk kemampuan tingkat lanjutnya. |
Biaya | Hemat biaya untuk tugas khusus. $0,02/gambar hingga $0,12/gambar | Harga berbasis token. $30 per 1 juta token untuk output gambar (output gambar di-tokenisasi dengan tarif tetap 1.290 token per gambar, hingga 1024x1024 piksel) |
Tugas yang direkomendasikan |
|
|
Imagen 4 akan menjadi model pilihan Anda untuk mulai membuat gambar dengan Imagen. Pilih Imagen 4 Ultra untuk kasus penggunaan lanjutan atau saat Anda memerlukan kualitas gambar terbaik (perhatikan bahwa Imagen 4 Ultra hanya dapat membuat satu gambar dalam satu waktu).
Langkah berikutnya
- Temukan contoh dan sampel kode lainnya di panduan cookbook.
- Lihat panduan Veo untuk mempelajari cara membuat video dengan Gemini API.
- Untuk mempelajari model Gemini lebih lanjut, lihat Model Gemini.