ChatGoogleGenerativeAI
You can access Google's gemini and gemini-vision models, as well as other generative models, in LangChain through the ChatGoogleGenerativeAI class in the @langchain/google-genai integration package.
You can also access Google's gemini family of models via the LangChain VertexAI and VertexAI-web integrations.
Click here to read the docs.
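A minimal sketch of that Vertex AI route, assuming the @langchain/google-vertexai package is installed and Google Cloud credentials are already configured for your project:

import { ChatVertexAI } from "@langchain/google-vertexai";

// Sketch only: same gemini model family, served through Vertex AI instead
const vertexModel = new ChatVertexAI({
  model: "gemini-pro",
});

const vertexRes = await vertexModel.invoke("Hello from Vertex AI!");
console.log(vertexRes.content);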
Get an API key here: https://ai.google.dev/tutorials/setup
You'll first need to install the @langchain/google-genai package:
- npm: npm install @langchain/google-genai
- Yarn: yarn add @langchain/google-genai
- pnpm: pnpm add @langchain/google-genai
Usage
We're unifying model params across all packages. We now suggest using model instead of modelName, and apiKey for API keys.
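For example, a minimal sketch passing both params explicitly (the key is read here from the GOOGLE_API_KEY environment variable, which is also where the class looks if apiKey is omitted):

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Sketch only: `model` replaces the older `modelName`, and `apiKey` supplies the API key
const llm = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  apiKey: process.env.GOOGLE_API_KEY,
});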
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HarmBlockThreshold, HarmCategory } from "@google/generative-ai";
/*
* Before running this, you should make sure you have created a
* Google Cloud Project that has `generativelanguage` API enabled.
*
* You will also need to generate an API key and set
* an environment variable GOOGLE_API_KEY
*
*/
// Text
const model = new ChatGoogleGenerativeAI({
model: "gemini-pro",
maxOutputTokens: 2048,
safetySettings: [
{
category: HarmCategory.HARM_CATEGORY_HARASSMENT,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
},
],
});
// Batch and stream are also supported
const res = await model.invoke([
[
"human",
"What would be a good company name for a company that makes colorful socks?",
],
]);
console.log(res);
/*
AIMessage {
content: '1. Rainbow Soles\n' +
'2. Toe-tally Colorful\n' +
'3. Bright Sock Creations\n' +
'4. Hue Knew Socks\n' +
'5. The Happy Sock Factory\n' +
'6. Color Pop Hosiery\n' +
'7. Sock It to Me!\n' +
'8. Mismatched Masterpieces\n' +
'9. Threads of Joy\n' +
'10. Funky Feet Emporium\n' +
'11. Colorful Threads\n' +
'12. Sole Mates\n' +
'13. Colorful Soles\n' +
'14. Sock Appeal\n' +
'15. Happy Feet Unlimited\n' +
'16. The Sock Stop\n' +
'17. The Sock Drawer\n' +
'18. Sole-diers\n' +
'19. Footloose Footwear\n' +
'20. Step into Color',
name: 'model',
additional_kwargs: {}
}
*/
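As the comment above notes, batch and streaming calls are supported as well. A minimal sketch reusing the same model instance (prompts and logging are illustrative):

// Stream chunks as they are generated
const streamRes = await model.stream(
  "Write a one-line slogan for a colorful sock company."
);
for await (const chunk of streamRes) {
  console.log(chunk.content);
}

// Run several prompts as a batch
const batchRes = await model.batch([
  "Name one color.",
  "Name one fabric.",
]);
console.log(batchRes.map((m) => m.content));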
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
Tool calling
import { StructuredTool } from "@langchain/core/tools";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
const model = new ChatGoogleGenerativeAI({
model: "gemini-pro",
});
// Define your tool
class FakeBrowserTool extends StructuredTool {
schema = z.object({
url: z.string(),
query: z.string().optional(),
});
name = "fake_browser_tool";
description =
"useful for when you need to find something on the web or summarize a webpage.";
async _call(_: z.infer<this["schema"]>): Promise<string> {
return "fake_browser_tool";
}
}
// Bind your tools to the model
const modelWithTools = model.bind({
tools: [new FakeBrowserTool()],
});
// Or, you can use `.bindTools` which works the same under the hood
// const modelWithTools = model.bindTools([new FakeBrowserTool()]);
const res = await modelWithTools.invoke([
[
"human",
"Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
],
]);
console.log(res.tool_calls);
/*
[
{
name: 'fake_browser_tool',
args: {
query: 'weather in new york',
url: 'https://www.google.com/search?q=weather+in+new+york'
}
}
]
*/
API Reference:
- StructuredTool from @langchain/core/tools
- ChatGoogleGenerativeAI from @langchain/google-genai
See the above run's LangSmith trace here
.withStructuredOutput
import { StructuredTool } from "@langchain/core/tools";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
const model = new ChatGoogleGenerativeAI({
model: "gemini-pro",
});
// Define your tool
class FakeBrowserTool extends StructuredTool {
schema = z.object({
url: z.string(),
query: z.string().optional(),
});
name = "fake_browser_tool";
description =
"useful for when you need to find something on the web or summarize a webpage.";
async _call(_: z.infer<this["schema"]>): Promise<string> {
return "fake_browser_tool";
}
}
const tool = new FakeBrowserTool();
// Bind your tools to the model
const modelWithTools = model.withStructuredOutput(tool.schema, {
name: tool.name, // this is optional
});
// Optionally, you can pass just a Zod schema, or JSONified Zod schema
// const modelWithTools = model.withStructuredOutput(
// zodSchema,
// );
const res = await modelWithTools.invoke([
[
"human",
"Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
],
]);
console.log(res);
/*
{
url: 'https://www.accuweather.com/en/us/new-york-ny/10007/night-weather-forecast/349014',
query: 'weather tonight'
}
*/
API Reference:
- StructuredTool from @langchain/core/tools
- ChatGoogleGenerativeAI from @langchain/google-genai
See the above run's LangSmith trace here
Multimodal support
To provide an image, pass a human message with a content field set to an array of content objects, where each object contains either an image value (type of image_url) or a text value (type of text). The value of image_url must be a base64-encoded image (e.g., data:image/png;base64,abcd124):
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";
// Multi-modal
const vision = new ChatGoogleGenerativeAI({
model: "gemini-pro-vision",
maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
new HumanMessage({
content: [
{
type: "text",
text: "Describe the following image.",
},
{
type: "image_url",
image_url: `data:image/png;base64,${image}`,
},
],
}),
];
const res2 = await vision.invoke(input2);
console.log(res2);
/*
AIMessage {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
// Multi-modal streaming
const res3 = await vision.stream(input2);
for await (const chunk of res3) {
console.log(chunk);
}
/*
AIMessageChunk {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
name: 'model',
additional_kwargs: {}
}
AIMessageChunk {
content: ' brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
- HumanMessage from @langchain/core/messages
Gemini Prompting FAQs
As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:
- When providing multimodal (image) inputs, you are restricted to at most 1 message of "human" (user) type. You cannot pass multiple messages (though the single human message may have multiple content entries)
- System messages are not natively supported, and will be merged with the first human message if present.
- For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide two AI or human messages in sequence (see the sketch after this list).
- Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
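For example, a minimal sketch of a conversation that satisfies these constraints (message contents are illustrative; the system message is merged into the first human message by the integration, as described above):

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import {
  AIMessage,
  HumanMessage,
  SystemMessage,
} from "@langchain/core/messages";

const chat = new ChatGoogleGenerativeAI({ model: "gemini-pro" });

const history = [
  // Merged into the first human message, since system messages aren't natively supported
  new SystemMessage("You are a terse assistant."),
  new HumanMessage("What is the capital of France?"),
  new AIMessage("Paris."),
  // Human and AI messages must alternate; never two of the same type in a row
  new HumanMessage("And of Italy?"),
];

const reply = await chat.invoke(history);
console.log(reply.content);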