Which AI models can you automate on Zapier?

New AI models launch practically every week, and keeping up with which ones to use for specific workflows is a job in itself. Consider this article your living reference.

At Zapier, we run every model through AutomationBench. It's our benchmark for testing how well models carry out multi-step workflows, not just static prompts.

Below, I'll walk through every major AI provider available on Zapier, the models you can plug into your Zap workflows today, and what each one is best for based on Zapier's AutomationBench. You'll also learn about direct AI integrations with hundreds of other AI apps—and how easy it is to automate AI with our built-in tool, AI by Zapier.

Zapier is the most connected AI orchestration platform, integrating with thousands of apps from partners like Google, Salesforce, and Microsoft. Use forms, data tables, and logic to build secure, automated, AI-powered systems for business-critical workflows across your organization's technology stack.

AutomationBench, Zapier's benchmarking tool
OpenAI (ChatGPT) models
Anthropic (Claude) models
Gemini (Google AI Studio) models
What is AI by Zapier?
Other AI apps on Zapier

AutomationBench, Zapier's benchmarking tool

As you scroll through the content below for OpenAI, Anthropic, and Gemini models, you'll notice a "best for" section based on AutomationBench. That's Zapier's benchmarking tool for measuring AI model efficacy.

The Zapier team built AutomationBench to determine which models to deploy on our platform. We couldn't find an AI benchmark that measured whether an AI model could do the messy, complicated work businesses actually rely on. Realizing that gap existed in the market, we made it public.

Every measured task is modeled on real workflow patterns we noticed on our platform. (No PII was used in the process, though.) To make scoring meaningful, we complicated those tasks to reflect the friction that shows up in real business environments. That included adding irrelevant data, hiding key info behind tool calls, introducing ambiguity about where the right info could be found, using similar naming conventions to create plausible wrong answers, and enforcing strict business policy rules with overriding priorities.

To show you what we mean by "complicated," here's an example task used for testing purposes (you can find more in the white paper):

There’s a scheduling conflict on February 20, 2026 at 2:00 PM — a Zoom meeting and a Google Calendar event overlap. Check the meeting priority policy in the spreadsheet to determine which one wins, then reschedule the loser by prepending [RESCHEDULED] to its topic/title. Post a summary to #ops-updates on Slack noting which meeting won and which was rescheduled, including both the Zoom meeting ID and Calendar event ID.

When scoring models, we don't evaluate how an agent completes a task. It doesn't matter which tools it calls or in what order. We look only at the end state: whether the job got done and whether the run caused any side effects. This means a model that costs more but gets the job done will score higher than a cheaper one that doesn't.
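
One way to picture end-state scoring is as a comparison between the apps' final state and an expected state. The sketch below is a simplified illustration, not AutomationBench's actual harness: the `score_end_state` function, the flat dictionary used to represent app state, and the record keys are all assumptions made for this example.

```python
from typing import Dict

# Hypothetical representation: app state as a flat mapping of record -> value.
State = Dict[str, str]


def score_end_state(initial: State, final: State, expected_changes: State) -> bool:
    """Return True only if every expected change happened and nothing else changed.

    The sequence of tool calls is never inspected; only the final state matters.
    """
    # 1. Did the job get done? Every expected record must hold its expected value.
    job_done = all(final.get(key) == value for key, value in expected_changes.items())

    # 2. Any side effects? Records outside the expected set must be untouched.
    all_keys = set(initial) | set(final)
    side_effects = [
        key
        for key in all_keys
        if key not in expected_changes and initial.get(key) != final.get(key)
    ]

    return job_done and not side_effects


# Illustrative records loosely based on the rescheduling task (values are made up).
initial = {
    "zoom:topic": "Vendor sync",
    "calendar:title": "Board prep",
    "slack:#ops-updates": "",
}
expected = {
    "zoom:topic": "[RESCHEDULED] Vendor sync",
    "slack:#ops-updates": "summary posted",
}

# A run that completes the task cleanly.
clean_run = {**initial, **expected}

# A run that completes the task but also renames the winning calendar event.
messy_run = {**clean_run, "calendar:title": "[RESCHEDULED] Board prep"}

print(score_end_state(initial, clean_run, expected))  # True
print(score_end_state(initial, messy_run, expected))  # False
```

Under this framing, two runs that take completely different tool-call paths earn the same score as long as they leave the apps in the same state.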

Here are the top five models from our leaderboard. Percentages represent the share of workflow tasks each AI model was able to fully complete.

Rank	Model	Provider	Score
1	GPT-5.6 Sol (Max)	OpenAI	18.1%
2	Claude Fable 5.0 (Max)	Anthropic	17.4%
3	GPT-5.6 Sol (XHigh)	OpenAI	17.0%
4	Claude Fable 5.0 (XHigh)	Anthropic	16.0%
5	Claude Opus 4.8 (XHigh)	Anthropic	15.5%

See the full leaderboard

OpenAI (ChatGPT) models

OpenAI's model lineup is the broadest on Zapier, spanning everything from budget-friendly mini models to heavy-duty reasoning engines, plus specialized tools for transcription and image generation.

Best for: Most business functions. OpenAI's new GPT-5.6 models now lead five of the six domains we test. GPT-5.6 Sol (Max) tops Marketing (21.0%) and ties GPT-5.6 Terra (Max) for the lead in Finance (14.2%) and HR (22.5%). Sol (XHigh) ties Claude Fable 5.0 (Max) at the top of Operations (27.0%), and Luna (Max) tops Sales (20.5%). See the AutomationBench leaderboard.

What's new: OpenAI just released the GPT-5.6 family—Sol, Terra, and Luna—each tuned for a different kind of work. Sol is built for high-stakes, one-shot workflows like compliance reviews, record pulls, approval chains, and processes where the right move is sometimes to pause, escalate, or do nothing at all. Terra handles workflows that gather from many tools before acting, like assembling a sales brief from your CRM, email, and Slack, or reconciling data across systems ahead of a report. And Luna targets high-volume, rule-based tasks where cost adds up fast—applying discount logic row by row, tagging leads against set criteria, or routing support tickets at scale.

Model	Best for	Inputs	Outputs	Context window	Output pricing (per 1M tokens)
GPT-5.6 Sol	Complex reasoning and coding	Text, images	Text	1 million tokens	$30
GPT-5.6 Terra	Balancing intelligence with cost	Text, images	Text	1 million tokens	$15
GPT-5.6 Luna	Cost-sensitive, high-volume workloads	Text, images	Text	1 million tokens	$6
GPT-5.5 Pro	Problems that need the deepest reasoning and highest reliability, where getting it right matters more than speed or cost	Text, images	Text	1 million tokens	$180
GPT-5.5	Complex professional work, including coding, research, data analysis, and autonomous multi-step tasks across tools	Text, images	Text	1 million tokens	$30
GPT-5.4 nano	High-volume, repeatable tasks where speed and cost matter most, like classification, data extraction, and ranking	Text, images	Text	400,000 tokens	$1.25
GPT-5.4 mini	Complex, multi-step workflows that need fast reasoning across different content types and tools	Text, images	Text	400,000 tokens	$4.50
GPT-5.4	Complex, multi-step professional workflows that need deep reasoning and planning	Text, images, audio	Text	1,050,000 tokens	$15
GPT-5.3	Fast, context-aware chat and search	Text, images	Text	128,000 tokens	$14
GPT-5.2	Advanced coding and agentic tasks with reliable multi-step reasoning	Text, images	Text	128,000 tokens	$14
GPT-5 mini	Affordable reasoning and logic for well‑defined tasks	Text, images	Text	400,000 tokens	$2
GPT-5 nano	Very affordable reasoning and logic for summaries, classification, and other lightweight tasks	Text	Text	400,000 tokens	$0.40
GPT-4o mini	Multimodal on a budget	Text, images, audio	Text	128,000 tokens	$0.60
GPT-4o	Multimodal tasks, especially live, human‑like voice and vision interaction	Text, images, audio, video	Text	128,000 tokens	$10
GPT-4.1 mini	Balancing power, performance, and affordability for general‑purpose workloads	Text, images	Text	1,047,576 tokens	$1.60
GPT-4.1	Complex tasks that don't require advanced reasoning, with very long context windows	Text, images	Text	1,047,576 tokens	$8
GPT-4.1 nano	Simple tasks where speed and price matter more than raw capability	Text	Text	1,047,576 tokens	$0.40
o4-mini	Fast, cost‑efficient reasoning	Text, images	Text	200,000 tokens	$4.40
o3-mini	Lightweight, lower‑cost alternative to o3 for reasoning‑heavy tasks	Text	Text	200,000 tokens	$4.40
o3	Advanced reasoning and logic	Text, images	Text	200,000 tokens	$8
GPT Image 1.5	State‑of‑the‑art image generation	Text, images	Text, images	N/A	$10
GPT Image 1	Image generation	Text, images	Images	N/A	$40
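
To turn the pricing column into a rough budget, multiply your expected output volume by the per-million-token price. The sketch below is a back-of-the-envelope estimate only: it assumes output tokens are the only cost (the table doesn't list input pricing), and the workload figures and the `estimate_output_cost` helper are hypothetical.

```python
# Output prices in USD per 1M tokens, copied from the table above.
OUTPUT_PRICE_PER_MILLION = {
    "GPT-5.6 Sol": 30.00,
    "GPT-5.6 Terra": 15.00,
    "GPT-5.6 Luna": 6.00,
}


def estimate_output_cost(model: str, runs: int, output_tokens_per_run: int) -> float:
    """Estimate output-token spend in USD for a batch of workflow runs."""
    total_tokens = runs * output_tokens_per_run
    return total_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION[model]


# Hypothetical workload: 10,000 runs per month, 500 output tokens per run.
for model in OUTPUT_PRICE_PER_MILLION:
    cost = estimate_output_cost(model, runs=10_000, output_tokens_per_run=500)
    print(f"{model}: ${cost:.2f} per month")
```

At that volume the same workload costs five times more on Sol than on Luna, which is why the cheaper tiers tend to suit high-volume, rule-based steps.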

Note: You'll see additional models inside the OpenAI integration on Zapier. OpenAI sometimes retires models from its product while keeping them in the API, and it deprecates them from the API on a separate schedule. We recommend building new workflows on the models listed above, but you can also see the complete list below.

See all available OpenAI models

GPT-5.4 pro

GPT-5.4 thinking

GPT-5.3-Codex

GPT-5.3 instant

GPT-5.2 pro

GPT-5.2 thinking

GPT-5.2-Codex

GPT-5.2 instant

GPT-5.1-Codex-Max

GPT-5.1 Codex mini

GPT-5.1-Codex

GPT-5.1

GPT-5.1 Chat

GPT-5 pro

GPT-5-Codex

GPT-5 nano

GPT-5 Chat

GPT-5

o4-mini

o3-mini

o1 Pro

GPT-4.5 Turbo

GPT-4o

GPT-4o Audio

GPT-4o Search Preview

GPT-4o Transcribe

GPT-4o Transcribe Diarize

GPT-4o mini

GPT-4o mini Audio

GPT-4o mini Realtime

GPT-4o mini Search Preview

GPT-4o mini Transcribe

GPT-4o mini TTS

GPT-4.1

GPT-4 Turbo

GPT-4

GPT Image 1.5

GPT Image 1

DALL·E 3

DALL·E 2

Whisper

omni-moderation

Sora 2 Pro

Sora 2

TTS-1 HD

TTS-1

text-embedding-3-large

text-embedding-3-small

text-embedding-ada-002

Anthropic (Claude) models

Anthropic's Claude models are known for strong writing quality, careful instruction-following, and a safety-first design philosophy. Claude is a popular choice for tasks like drafting long-form content, analyzing documents, and powering customer-facing chatbots that need a natural, conversational tone.

Best for: Support workflows. Claude Opus 4.8 (XHigh) leads Support at 15.0%, with Fable 5.0 (XHigh) right behind in second (14.0%). Claude still holds strong ground elsewhere: Fable 5.0 (Max) ties GPT-5.6 Sol (XHigh) at the top of Operations (27.0%) and lands second overall on the leaderboard at 17.4%. See the AutomationBench leaderboard.

What's new: Sonnet 5 is now available on Zapier. It delivers Opus-level performance at Sonnet-level pricing, working more than twice as effectively as Sonnet 4.6 on multi-step workflows. The biggest difference is follow-through: Sonnet 5 trusts what it finds, understands how your apps work, and completes tasks that would've stalled halfway on Sonnet 4.6.

Model	Best for	Inputs	Outputs	Context window	Output pricing (per 1M tokens)
Sonnet 5	Coding, agents, and everyday professional work at scale	Text, images	Text	1 million tokens	$15
Fable 5.0	The most demanding reasoning and long-horizon agentic work	Text, images	Text	1 million tokens	$50
Sonnet 4.6	Coding, agents, enterprise workflow—best balance of price and performance	Text, images	Text	1 million tokens	$15
Opus 4.8	Complex reasoning, agentic coding, long-running tasks	Text, images	Text	1 million tokens	$25
Opus 4.7	Complex reasoning, agentic coding	Text, images	Text	1 million tokens	$25
Opus 4.6	Complex reasoning, coding, long-horizon tasks	Text, images	Text	1 million tokens	$25
Haiku 4.5	High-volume, latency-sensitive, cost-efficient tasks	Text, images	Text	200,000 tokens	$5
Sonnet 4.5	Complex agents, coding; highest general intelligence	Text, images	Text	200,000 tokens	$15
Opus 4.1	Complex reasoning, analysis, creative tasks	Text, images	Text	200,000 tokens	$75
Sonnet 4	Balanced coding and workflows	Text, images	Text	200,000 tokens	$15
Haiku 3	Fast, simple, cost-effective classification tasks	Text, images	Text	200,000 tokens	$1.25

Gemini (Google AI Studio) models

Google's Gemini family stands out for its massive context windows, competitive pricing, and strong multimodal capabilities across text, images, audio, and video. Gemini models are a great fit for processing long documents, research-heavy workflows, and tasks where keeping costs low matters.

Best for: High-volume workflows where cost is the priority. Gemini 3.5 Flash (Medium) posts a competitive 14.5%, landing ninth overall at just $0.87 per task—cheaper than the top-ranked GPT-5.6 Sol (Max) at $1.53, while staying within about four points of it. It remains one of the lowest-cost models on the leaderboard. See the AutomationBench leaderboard.
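
A useful way to read those numbers together is cost per completed task: the price of one attempt divided by the share of attempts that succeed. The sketch below is illustrative arithmetic, not an AutomationBench metric. It assumes the quoted per-task cost is averaged over all attempted tasks, and it pairs GPT-5.6 Sol (Max) with the 18.1% top score from the leaderboard.

```python
def cost_per_completed_task(cost_per_task: float, completion_rate: float) -> float:
    """Average spend needed to get one fully completed task."""
    return cost_per_task / completion_rate


# (cost per attempted task in USD, share of tasks fully completed)
models = {
    "Gemini 3.5 Flash (Medium)": (0.87, 0.145),
    "GPT-5.6 Sol (Max)": (1.53, 0.181),
}

for name, (cost, rate) in models.items():
    per_success = cost_per_completed_task(cost, rate)
    print(f"{name}: ${per_success:.2f} per completed task")

# Gap between the two completion rates, in percentage points.
gap_points = (0.181 - 0.145) * 100
print(f"Score gap: {gap_points:.1f} points")
```

On these assumptions, Gemini 3.5 Flash works out to roughly $6 per completed task against roughly $8.45 for GPT-5.6 Sol (Max), with a score gap of 3.6 points.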

What's new: Gemini 3.5 Flash is now available on Zapier. It excels at step coordination and strict policy adherence, the kind of work that tends to cause drift in other models. But it can struggle with following strict output formats and with making decisions that depend on math it has to do on its own.

Model	Best for	Inputs	Outputs	Context window	Output pricing (per 1M tokens)
Gemini 3.5 Flash	Sub-agent deployment, multi-step workflows, and long-horizon tasks at scale	Text, images, audio, video, PDF	Text	1 million tokens	$9
Gemini 3.1 Pro	Complex reasoning, high-stakes coding, and massive data synthesis	Text, images, audio, video, PDF	Text, code, reasoning	1 million tokens	$30
Gemini 3 Flash	High-speed automation, real-time chat, and cost-effective scaling	Text, images, audio, video	Text, code	1 million tokens	$0.30
Gemini 3 Pro	Balanced professional workflows and creative content generation	Text, images, audio, video	Text, code	1 million tokens	$3.75
Gemini 2.0 Flash Lite	Basic classification, ultra-low latency tasks, and simple extraction	Text, images	Text	1 million tokens	$0.15
Gemini 2.0 Flash	Legacy support for high-throughput 2.0-era applications	Text, images, audio	Text, code	1 million tokens	$0.30
Gemini 2.5 Pro	Detailed multimodal analysis with high accuracy for older pipelines	Text, images, audio, video	Text, code	2 million tokens	$10.50
Gemini 2.5 Flash	Transition-tier speed for multimodal processing	Text, images, audio, video	Text, code	1 million tokens	$0.90
Nano Banana Pro	Professional-grade high-fidelity image generation and editing	Text, images	Images	N/A	$0.05

Note: You'll see additional models inside the Google (Gemini) integration on Zapier. We recommend building new workflows on the models listed above, but you can also see the complete list below.

See all available Gemini models

Nano Banana 2

Gemini 3.1 Pro

Gemini 3 Flash

Gemini Deep Research Pro

Gemini 2.5 Pro TTS (Preview)

Gemini 2.5 Flash TTS (Preview)

Nano Banana Pro

Gemini 3 Pro

Veo 3.1 (Fast)

Veo 3.1 (Preview)

Gemini 2.5 Flash Image

Imagen 4.0 (Fast)

Veo 3 (Fast)

Gemini 2.5 Flash-Lite

Gemma 3N

Gemini 2.5 Flash

Gemini 2.5 Pro

Gemini 2.5 Flash-Lite (Preview)

Veo 3

Veo 2

Gemini 2.0 Flash Image Generation

Gemini Robotics

Gemma 3 27B

Gemma 3 12B

Gemma 3 1B

Gemini 2.0 Flash-Lite

Gemini 2.0 Flash

Gemini Flash-Lite

Gemini Pro

What is AI by Zapier?

AI by Zapier is our built-in integration that lets you add AI steps directly to any Zap. It comes with several OpenAI and Google models out of the box, no account required, plus a prompt optimizer. But the real value is in how easy it is to swap models inside AI by Zapier without breaking your existing workflows.

When you're configuring an AI by Zapier step, you can select the model you want from a dropdown menu with just a couple of clicks. That's handy when an AI provider releases a model that leapfrogs the one you're currently using, or you're handing off a Zap template to a team that prefers automating with another model. Whoever manages the Zap can swap in their preferred model without having to fuss over deleting the original step and reconfiguring a new one from scratch.
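
Conceptually, swapping a model means changing one field of the step while everything else stays put. The sketch below models that idea in plain Python. The `AIStep` class and its fields are hypothetical stand-ins for a step's settings, not Zapier's API or internal data model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AIStep:
    """Hypothetical stand-in for an AI step's configuration (not a Zapier API)."""

    model: str
    prompt: str
    output_field: str


original = AIStep(
    model="GPT-5.4 mini",
    prompt="Summarize this support ticket in two sentences.",
    output_field="summary",
)

# Swap only the model; the prompt and output mapping carry over unchanged.
swapped = replace(original, model="Gemini 3.5 Flash")

print(swapped.model)
print(swapped.prompt == original.prompt)
```

Because the prompt and output mapping are untouched, later steps that read the output field keep working after the swap.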

Image: The model dropdown in the Zap editor, with an orange arrow pointing to the list of available OpenAI models.

Here's a snapshot of the models available through AI by Zapier today:

Provider	Models
OpenAI (ChatGPT)	GPT-5.6 Sol, GPT-5.6 Terra, GPT-5.6 Luna, GPT-5.5 Pro, GPT-5.5, GPT-5.4 nano, GPT-5.4 mini, GPT-5.4, GPT-5.2, GPT-5, GPT-5 mini, GPT-5 nano, GPT-4o mini, GPT-4.1 nano, o3, o3-mini, o1
Anthropic (Claude)	Sonnet 5, Opus 4.8, Opus 4.7, Opus 4.6, Haiku 4.5, Opus 4.5, Sonnet 4.6
Google (Gemini)	Gemini 3.5 Flash, Gemini 3.1 Pro, Gemini 3 Pro, Gemini 2.5 Pro, Gemini 2.5 Flash Lite, Gemini 2.5 Flash, Gemini 2.0 Flash Lite*, Gemini 2.0 Flash
Azure OpenAI	Uses the AI models you've already set up in your own Azure OpenAI account. The exact models depend on what your Azure admin has turned on.
Amazon Bedrock	Uses the AI models your company has access to in Amazon Bedrock. The exact models depend on what's enabled in your AWS account and region.

*Can be used for free in AI by Zapier

Looking for setup guidance or automation inspiration? Check out our AI by Zapier feature guide.

Other AI apps on Zapier

These aren't the only providers in town. Zapier also integrates directly with hundreds of specialized AI apps, including:

Grok by xAI—xAI's conversational model, with real-time access to web and X data and a more irreverent tone than most assistants.
DeepSeek—A cost-efficient model with strong coding and reasoning chops, popular for technical workflows on a budget.
Mistral AI—Models with strong instruction-following and multilingual performance, ranging from fast lightweight options to larger frontier models.
OpenRouter—A single integration that gives you access to models from dozens of providers, so you can mix and match without managing multiple connections.
Groq—Not a model, but a hardware-accelerated inference engine. Use it when speed is the priority and you need near-instant response.
AssemblyAI—Specializes in speech-to-text and audio intelligence, including transcription, speaker detection, and sentiment analysis.
Google Vertex AI—Google's enterprise AI platform, ideal for teams already in the Google Cloud ecosystem who need more control and customization.

With Zapier, you're never locked into a single model or provider. You can take advantage of every app's unique strengths and experiment with whatever fits your workflow best. Browse the full list in our ever-growing AI app directory.

Connect to the latest AI models on Zapier

Whether you're just getting started with AI automation or you're deep into building multi-step workflows, Zapier gives you the flexibility to use the AI tools and models that actually fit your needs. The landscape keeps evolving, and so will this guide. Bookmark it and check back whenever a new model drops.

This article was originally published in March 2026. It was most recently updated in July 2026.
