Gemini 3.5 and action-first AI: what small teams should test before automating more work

Google’s Gemini 3.5 announcement gives small teams a useful signal: the next wave of AI tools is being built to take more action inside real workflows. Google describes Gemini 3.5 as its latest model family for “frontier intelligence with action”, with 3.5 Flash released first for agentic workflows, coding and long-running tasks.

That matters because many small businesses are already past the first phase of AI adoption. Staff have tried chatbots for drafts, summaries and research. The harder question now is whether AI should be allowed to touch live workflows: updating records, preparing documents, categorising files, changing code, or passing work between tools.

The sensible answer is a controlled test with clear limits.

What changed with Gemini 3.5

Google’s primary claim is that Gemini 3.5 is built for complex, agentic workflows. In its 19 May 2026 announcement, Google says 3.5 Flash is available in the Gemini app, AI Mode in Search, Google Antigravity, the Gemini API in Google AI Studio and Android Studio, Gemini Enterprise Agent Platform and Gemini Enterprise.

The announcement also puts emphasis on multi-step work. Google says 3.5 Flash can plan, build and iterate on real-world tasks, and it gives examples such as asset categorisation, document-heavy onboarding, invoice OCR and long-running data analysis.

For a small team, the practical lesson is that AI pilots need to move beyond isolated prompts. The model, the surrounding tool access and the review process now need to be tested together.

Start with work that has a clear stop point

Action-capable AI is easiest to assess when the task has a visible definition of done. Good early candidates include:

  • categorising support emails before a human replies
  • drafting a first pass of a supplier comparison table
  • turning meeting notes into task records
  • checking invoices against purchase orders
  • reviewing a web page for missing metadata or broken links
  • preparing a weekly analytics summary from agreed data sources

Each of these tasks has an output a person can inspect. That makes them safer than open-ended work where the AI decides what matters, chooses the tools and commits changes without review.

The best first pilots usually share three features: the task happens often enough to be worth improving, mistakes are easy to spot before they reach a customer, and the team already understands the current manual process.

Keep approval close to the action

The biggest operational risk with agentic AI is speed: it can take a wrong action before anyone has had a chance to check it.

Small teams should separate preparation from execution. Let the system collect context, draft the next step, prepare records or identify exceptions. Keep approval with a named person until the task has passed enough live tests.

For example, an AI workflow might review customer onboarding documents and flag missing information. It should not silently approve the customer, change their account status and trigger downstream emails until the business has measured its accuracy and understood its failure cases.

This distinction is important for web, marketing and operations teams. AI can shorten the time between identifying a problem and preparing a fix, but the final action still needs ownership.
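One way to picture the split between preparation and execution is a small approval queue. The sketch below is illustrative only: `ProposedAction`, `ApprovalQueue` and the approver names are hypothetical, and a real workflow would sit on top of whatever ticketing or automation tool the team already uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedAction:
    """A step the AI has prepared but not yet executed (hypothetical type)."""
    description: str
    execute: Callable[[], str]           # only called after a person approves
    approved_by: Optional[str] = None


class ApprovalQueue:
    """Holds prepared actions until a named person approves them."""

    def __init__(self, approvers: set[str]):
        self.approvers = approvers
        self.pending: list[ProposedAction] = []
        self.log: list[str] = []

    def propose(self, action: ProposedAction) -> None:
        # Preparation: the action is recorded, nothing is executed.
        self.pending.append(action)
        self.log.append(f"proposed: {action.description}")

    def approve(self, index: int, approver: str) -> str:
        # Execution: allowed only for a named approver.
        if approver not in self.approvers:
            raise PermissionError(f"{approver} is not a named approver")
        action = self.pending.pop(index)
        action.approved_by = approver
        result = action.execute()
        self.log.append(f"approved by {approver}: {action.description}")
        return result
```

The AI side only ever calls `propose`. Nothing runs until a person on the approver list calls `approve`, and every step lands in the log.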

Measure outcomes, not model demos

Gemini 3.5’s announcement includes benchmark and partner examples, but a small business needs local evidence. A useful pilot should answer practical questions:

  • How much time did the workflow save per run?
  • How often did a human need to correct the output?
  • Which mistakes were harmless, and which could have reached a customer?
  • Did the AI need more context than expected?
  • Did the workflow create new review work for another person?
  • Was the result better, faster or more consistent than the manual process?

Track these for a fixed period, such as two weeks or 50 completed runs. Without a measurement window, teams tend to judge AI by impressive examples and forget the quiet rework.
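A spreadsheet is enough for this, but the same record can be kept in a few lines of code. The following is a minimal sketch, assuming each run is logged with a manual-time estimate, the actual time including review, and two yes/no flags. The field names and the `summarise` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PilotRun:
    manual_minutes: float   # team's estimate of the manual time for this task
    ai_minutes: float       # AI run time plus human review time
    corrected: bool         # did a human have to fix the output?
    customer_risk: bool     # could the mistake have reached a customer?


def summarise(runs: list[PilotRun]) -> dict[str, float]:
    """Summarise a measurement window, e.g. two weeks or 50 completed runs."""
    if not runs:
        raise ValueError("no runs recorded yet")
    n = len(runs)
    return {
        "runs": n,
        "avg_minutes_saved": sum(r.manual_minutes - r.ai_minutes for r in runs) / n,
        "correction_rate": sum(r.corrected for r in runs) / n,
        "customer_risk_rate": sum(r.customer_risk for r in runs) / n,
    }
```

Counting review time inside `ai_minutes` is deliberate: it keeps the quiet rework visible in the time-saved figure instead of hiding it.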

This is also where analytics discipline helps. Kahunam’s article on AI search traffic and Q1 2026 data makes a similar point for marketing measurement: small percentages can matter when the channel is growing, but only if you can see the numbers clearly.

Build a lightweight control list

Before a small team gives AI more workflow access, write down a short control list. It doesn’t need to be complicated.

  • What systems can the AI read?
  • What systems can it write to?
  • Which actions require human approval?
  • What should happen when confidence is low?
  • Who reviews exceptions?
  • Where are prompts, outputs and decisions logged?
  • How can the workflow be stopped quickly?

The final question matters. A useful automation can still fail because an API changes, a data source goes stale, or an instruction no longer matches the business process. A stop switch is part of making the system usable.
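The control list can also be written down as a small piece of configuration that the workflow checks before every write. This is a sketch under stated assumptions: the `ControlList` class, the system and action names, the 0.8 confidence threshold and the idea that the workflow reports a confidence score at all are illustrative, not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class ControlList:
    can_read: set[str]                   # systems the AI may read
    can_write: set[str]                  # systems the AI may write to
    needs_approval: set[str]             # actions that require human approval
    min_confidence: float = 0.8          # assumed threshold, tune per workflow
    exception_reviewer: str = "ops-lead" # hypothetical named reviewer
    stopped: bool = False                # the stop switch
    audit_log: list[str] = field(default_factory=list)

    def decide(self, action: str, system: str, confidence: float) -> str:
        """Return 'blocked', 'review', 'approve' or 'allow' for a proposed write."""
        if self.stopped or system not in self.can_write:
            decision = "blocked"
        elif confidence < self.min_confidence:
            decision = "review"    # low confidence goes to the exception reviewer
        elif action in self.needs_approval:
            decision = "approve"   # wait for a named person
        else:
            decision = "allow"
        self.audit_log.append(f"{action} on {system} ({confidence:.2f}) -> {decision}")
        return decision
```

Setting `stopped = True` blocks every write from that point on, and each decision is logged whether or not the action goes ahead.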

Where not to start

Avoid first pilots where the cost of a mistake is high or the result is hard to review. That includes changing payment settings, sending customer-facing messages without approval, deleting records, rewriting legal or HR documents, and publishing website changes directly to production.

These tasks may become candidates later, but they need stronger controls: permissions, audit trails, rollback steps, test environments and clear accountability.

For now, small teams should look for repetitive work where AI can prepare the next move and a person can approve it quickly.

The practical takeaway

Gemini 3.5 is a reminder that AI adoption is shifting from content generation towards action. Small teams should use each major model release to improve how they test AI.

The right question is no longer “can the model answer this?” It’s “can this workflow take a useful next step, under supervision, with fewer errors and less manual effort than before?”

Start there. Pick one bounded process, run it with human approval, measure the results and only then decide whether the AI should be trusted with more of the workflow.
