Case Study: a Czech agency sped up reporting by 60% thanks to an AI workflow

Reporting is often a routine activity in agencies that consumes surprisingly expensive hours of senior specialists. The typical problem is not just downloading data from advertising systems themselves, but mainly unifying them, checking anomalies, adding context, and turning them into a clear comment for the client. In this case study, I describe the deployment of an AI workflow in a Czech performance agency of roughly 25 people that manages dozens of client accounts in Google Ads, Meta Ads, and GA4. The result: the average time needed to prepare one regular report dropped by approximately 60%, more precisely from 5 hours to 2 hours.

Zapier

It was not a fully autonomous solution without people. The agency built the process on a combination of Looker Studio, Google Sheets, Zapier, the OpenAI API, and internal rules for output control. This combination is exactly what matters: the data stayed in familiar reporting tools, while AI generated only summaries, interpretation suggestions, and the first layer of comments. If you are dealing with which models and tools currently make sense for text outputs, it is also useful to keep an eye on the overviews at AIVýběr, where real AI services used in practice are compared.

The article is structured as a practical reconstruction of the project: what the agency changed, how it set metrics, where AI actually saved time, and where it ran into limits instead. Take all prices and time estimates as approximate, because they vary depending on data volume, number of clients, and the complexity of approvals.

Initial state: where reporting burned the most time and money

Before deploying AI, the agency prepared three main types of reports: weekly operational overviews, monthly performance reports, and quarterly strategic summaries. The monthly reports for clients with spending from CZK 80,000 to 400,000 per month consumed the most time. The typical process looked like this:

a specialist downloaded data from GA4, Google Ads, and Meta Ads,
manually adjusted spreadsheets and unified campaign names,
wrote comments on performance, deviations, and recommendations,
an account manager edited the text into a client-friendly form,
the output went through internal review.

The biggest problem was not exporting the data, but the repeated “translation” between numbers and commentary. Specialists kept writing very similar passages: why cost per conversion increased, why ROAS dropped, what seasonality did, which campaigns drove performance, and which were limited by budget. In practice, mainly the numbers and a few contextual sentences differed.

What to do: First measure the real time spent on each step, not just your impression. In this agency, they measured separately: data collection, data cleaning, comment writing, review, and finalization.

Who it is for: Primarily agencies and in-house teams that do recurring reporting over a stable set of metrics.

When not to use it: If every report is created as a one-off analytical study without a recurring structure, text automation pays back much less effectively.

An internal audit showed that one monthly report took 300 minutes on average. Of that, only 70 minutes were spent on downloading and consolidating the data itself, while 140 minutes went to writing and editing comments. That was exactly where the biggest room for change was.

Workflow design: what the agency automated and what it left to people

The deployment did not start with choosing a model, but with dividing activities into three layers:

Zapier

Data layer – collecting and unifying metrics.
Interpretation layer – identifying changes, trends, and anomalies.
Presentation layer – turning them into commentary for the client.

The agency kept the data layer free of generative AI. The reason is simple: if the input data are prepared incorrectly, AI will only produce a bad comment faster. The data therefore remained in Looker Studio and Google Sheets, where unified client-level tables were prepared through connectors and exports.

AI entered only the interpretation and presentation layers. Via Zapier, after the month was closed, a scenario was triggered that took a defined range of values from the prepared Google Sheet: revenue, number of conversions, CPA, PNO/ROAS, impression share, top campaigns, top ad sets, and the biggest month-over-month changes. It sent these data to the OpenAI API with a fixed written prompt and style rules.

The generated text was not sent to the client automatically. It was saved to a separate sheet, where the specialist added two types of information that the model could not reliably handle without broader account knowledge:

reasons for changes outside the advertising platforms, for example a stock outage or a margin change,
commentary on strategic decisions, for example why brand activity was temporarily cut or why a new campaign structure was being tested.

What to do: Deploy AI only on tasks where you already have standardized input. First unify metric names, periods, currencies, and conversion definitions.

Who it is for: Teams that already have reporting in Looker Studio, Google Sheets, or a BI tool and want to shorten the time spent on commentary, not rebuild the entire data stack.

When not to use it: If you do not have a stable KPI definition and each client understands “conversion” differently, AI will create incomparable or misleading outputs.

The chosen architecture is also important from a security perspective. No personal data or detailed customer lists were sent in the prompt, only aggregated performance metrics at the campaign and account level.

Tools used, functions, and approximate costs

The agency chose a setup of commonly available services, not custom development. That significantly shortened deployment time.

1. Looker Studio

Zapier

It served as the main reporting interface for clients and the internal team. Standard dashboards, filters by period and source, and connections to advertising platforms via connectors and exports were used. Official service: https://lookerstudio.google.com/.

Approximate cost: basic use is free; paid costs may arise with some third-party connectors.

2. Google Sheets

It functioned as an intermediate layer for cleaned data and as the AI input template. The advantage was easy versioning, comments, and availability for the whole team. Official service: https://www.google.com/sheets/about/.

Approximate cost: within Google Workspace; Business Starter starts approximately in the low single-digit euro range per user per month depending on the plan and billing.

3. Zapier

It automated workflow triggering after data completion, prompt assembly, and writing the output back into the spreadsheet. Official service: https://zapier.com/.

Approximate cost: from the lower tens of euros per month, but with a larger number of steps and clients, costs rise quickly.

4. OpenAI API

Via the API, text summaries, lists of reasons for changes, and recommendation suggestions were generated. Official documentation: https://platform.openai.com/docs/overview.

Approximate cost: depends on the specific model, number of tokens, and volume of processed inputs; in this case study, for dozens of reports per month, it amounted to low to mid hundreds of CZK per client per month.

For comparing the suitability of different tools for automation and text outputs, the directory of AI tools on AIVýběr is also useful, especially if you are deciding whether a ready-made workflow is enough for you or whether you already need an API.

What to do: Start with tools your team already uses. The biggest savings usually do not come from the “smartest” model, but from reducing the number of manual handoff steps.

Who it is for: Small and mid-sized agencies that want to deploy a solution within weeks, not months.

When not to use it: If you have strict requirements for operation in a closed environment without third-party cloud services, an internal custom implementation or another type of infrastructure will be more suitable.

What the prompt looked like and why outputs fell apart without style rules

The first version of the workflow failed at one important point: the model wrote texts quickly, but too generally. So the agency did not expand the prompt with “be more specific,” but with very precise rules:

write in Czech, briefly, and without marketing clichés,
first state the 3 most important changes compared to the previous period,
always separate performance, causes, and recommendations,
when data are missing, do not estimate and explicitly state that context is missing,
do not use claims of causality without support in the data,
mention only metrics that are in the input.

The prompt also included a miniature “style guide” with forbidden phrasing. This removed phrases such as “the campaign performed well” or “we recommend focusing on optimization.” Instead, the model was given sample formulations: “The number of conversions increased by 18%, but cost per conversion rose by 11% because the share of remarketing with a smaller audience volume increased.”

Limiting length was also essential. If the model received too broad an input, it added irrelevant details. The ideal approach turned out to be splitting the input into several blocks: account summary, top changes, risks, specialist comment. The output had fixed sections and a maximum length.

What to do: Do not write one universal prompt for all clients. Create a prompt core and 2 to 4 variants for different account types: e-commerce, lead gen, B2B, local services.

Who it is for: Teams that want stable tone of voice and less need for manual editing after generation.

When not to use it: If you expect the model to understand the client’s business context on its own without supplied rules and structure, you will correct more than you save.

Project metrics: how the agency measured success and what actually improved

The most important part of the deployment was not the automation itself, but the evaluation method. The agency tracked five metrics over three months:

Time per report – from data closing to sending it to the client.
Number of manual interventions – how many text blocks the specialist had to substantially rewrite.
Error rate – factual inaccuracies caught by internal review.
Approval time – how long the report waited between the specialist and the account manager.
Client satisfaction – measured by a simple report rating and the number of follow-up questions.

After three months, the picture looked like this:

average time for a monthly report dropped from 300 to 120 minutes,
time spent on writing the commentary itself dropped from 140 to 35 minutes,
the number of major rewrites after the first version fell by approximately half,
the error rate increased slightly in the first month, then returned to the original level from the second month thanks to better input validation,
clients most often appreciated faster delivery and clearer summaries of the main changes.

The net 60% saving did not come from AI “writing the report” on its own. It came from a combination of three effects: a shorter first draft, fewer repeated formulations, and faster internal approval thanks to a unified structure.

What to do: Evaluate time savings and output quality separately. Speed alone without accuracy control is a dangerous metric in reporting.

Who it is for: Heads of delivery teams and operations managers who decide on the return on automation.

When not to use it: If you do not have a baseline benchmark and cannot say how much time reporting really costs today, you will only be estimating the return.

Converted into costs, for roughly 40 regular reports per month, the change meant savings of approximately 120 working hours per month. With the internal hourly rate of the specialist and account manager combined at approximately CZK 700 to 1,200, the investment in the workflow paid back very quickly, within several weeks to a few months depending on report volume.

Practical scenarios: where AI reporting worked best

Monthly e-commerce report

The best results were with e-shops with clear metrics: revenue, ROAS, PNO, new vs. returning orders, brand share, and top product campaigns. The model was able to quickly summarize what changed and which segments the growth or decline concerned. The specialist then added business context, for example a promotion, a feed change, or a stock outage.

Weekly overview for account management

Here, AI did not generate a long text for the client, but a short internal summary: what to watch, where there is a risk of budget overspend, which campaigns dropped in impression share, and where lead quality changed. This saved time in internal meetings.

Commentary on anomalies

When CPA jumped above a predefined threshold or when the number of conversions fell below a certain minimum, the workflow generated a draft comment on the deviation. The specialist thus did not have to start from a blank page.

What to do: Start where the same types of interpretation repeat and clients expect a regular structure.

Who it is for: Performance marketing, PPC teams, account management, and smaller analytics departments.

When not to use it: For one-off audits, brand campaigns without a clearly defined performance metric, or board presentations where deeper business interpretation is necessary.

Limits and mistakes: where AI added work instead

The most problematic situations were those where the input data formally matched, but context was missing. A typical example: campaign performance dropped because the client had limited stock availability. Without this information, the model suggested standard optimizations that would not have made sense at that moment.

The second limit concerned causality. AI could describe concurrent phenomena well, but without supplied context it was not safe to claim that a specific performance change was caused by one particular intervention in the account.

The third weakness appeared in multilingual or multi-market accounts. If the input table was not perfectly unified, the model mixed metrics across different countries and the interpretation lost accuracy.

The agency therefore added three protective mechanisms:

validation of input data before sending them to the API,
a list of mandatory contextual fields that the specialist must complete,
an internal checklist of what must be checked before sending to the client.

What to do: Introduce a rule that AI must not create final recommendations on its own without at least one human review.

Who it is for: Agencies working with greater responsibility for budgets and interpretation of results in front of the client.

When not to use it: If the client requires legally or regulatorily sensitive commentary, for example in finance or healthcare, without a very strict governance process.

How to replicate a similar deployment in practice within 30 days

If you want to introduce a similar workflow without unnecessarily overshooting the scope, this process proved effective:

Week 1: reporting audit

choose one report type with the highest repeatability,
measure time spent on each step,
determine 8 to 15 metrics that must always be in the input.

Week 2: data standardization

unify campaign names and KPIs,
separate “hard data” from specialist context,
prepare an input template in Google Sheets or a similar tool.

Week 3: prompt and testing

create a prompt with a fixed output structure,
test at least 10 historical reports,
compare time, quality, and number of corrections.

Week 4: live operation on a limited sample

run the workflow on 3 to 5 clients,
collect errors and adjust validation,
only then expand to other accounts.

What to do: Pilot on a limited number of clients and decide on scaling only after two completed cycles.

Who it is for: Teams that want a fast pilot without custom development and without interfering with the entire BI environment.

When not to use it: If you are simultaneously planning to change attribution, the data model, and the reporting format. Too many changes at once will undermine the evaluation.

FAQ

Is it possible to send AI reports to clients fully automatically without review?

Technically yes, but in agency practice it is usually not sensible. The biggest risk is not a stylistic mistake, but incorrect interpretation without business context. For regular client reports, human review is safer.

What data should not be sent into a workflow like this?

Personal data, customer lists, non-public contractual data, and anything you do not need for performance interpretation itself. For reporting, aggregated metrics at the campaign, ad set, or account level are usually enough.

From what report volume does it pay off?

Usually where you produce at least dozens of recurring reports per month or where one report takes several hours of senior time. With only a few reports per month, the benefit may be smaller than the setup time.

Will AI replace the specialist or the account manager?

No. It works best as the first layer of summary and draft commentary. Responsibility for interpretation, prioritization, and client communication remains with people.

How do I know whether the problem is in the prompt and not in the data?

If the model repeats generic formulations across clients, the problem is usually in the prompt. If it makes factual mistakes or confuses metrics, the problem is usually in the input data or their labeling.

Conclusion

This case study shows a fairly sober, but in practice very effective model of AI deployment in agency reporting. The key was not to “replace reporting with artificial intelligence,” but rather to divide the process into parts that can be standardized well and leave to people what requires irreplaceable context and responsibility. The result in the form of a 60% speed-up is realistic where reports are created regularly, from stable data, and with a recurring commentary structure.

If your work has a similar profile, start with a time audit and one pilot workflow. The greatest benefit comes when AI speeds up not only text writing, but also approvals, handoff of materials, and output consistency. And that is exactly where today’s tools have the greatest practical value.

Recommended AI stack for implementation

Choose tools according to your budget and level of automation. Below is a direct overview of services for implementing the project.

Service	Service description	Offer
NordVPN	VPN service for privacy protection and secure connections.	Open offer
Semrush	SEO and marketing platform for analysis and traffic growth.	Open offer
Make	Advanced visual automation for workflows and integrations.	Open offer
Hostinger	Web hosting and domains for fast website launch.	Open offer
Fiverr	Marketplace for freelancers and external specialists.	Open offer
Adobe	Creative tools for graphics, video, and digital content.	Open offer
Canva	Online design tool for graphics, presentations, and social media.	Open offer
Jasper	AI tool for marketing copy and content campaigns.	Open offer