
This system tells you what’s working in your startup — every week

Most founders don't run enough experiments.

Here's how to build a simple system that forces you to run growth experiments every week --- and handles the idea generation, tracking, and scoring for you.

What you're building

A simple system that runs like this:

  • Every Monday → new experiments appear
  • You run 1--3
  • You enter results
  • The system tells you what worked

Tools

  • Jotform → stores experiments
  • Zapier → runs automation
  • ChatGPT (via Zapier) → creates new ideas and variations
  • Google Sheets → tracks results and shows winners

Step 1 --- Create the experiment form

Go to Jotform and create a new form.

Add these fields:

  • Experiment Name
  • Channel
  • What are you changing?
  • Primary metric
  • Baseline rate
  • Target rate
  • Hypothesis
  • AI Variations
  • Start date
  • End date

Add one required field:

  • If I change [X], then [metric] will improve because [reason]

This forces every idea to be clear before it enters your system.
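
If you ever want to handle these records outside Jotform, here is a minimal sketch of the same shape in Python. The field names mirror the form above; the clarity check at the end is just an illustration, not something Jotform does for you.

from dataclasses import dataclass
from datetime import date

@dataclass
class Experiment:
    name: str
    channel: str
    change: str              # "What are you changing?"
    primary_metric: str
    baseline_rate: float
    target_rate: float
    hypothesis: str          # "If I change [X], then [metric] will improve because [reason]"
    ai_variations: str
    start_date: date
    end_date: date

    def hypothesis_is_clear(self) -> bool:
        # Crude check that the required "If I change ... then ... because ..." shape is present.
        h = self.hypothesis.lower()
        return h.startswith("if i change") and " then " in h and " because " in h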

Step 2 --- Connect Jotform to Google Sheets

In Jotform: Settings → Integrations → Google Sheets → Connect

What happens: When a form is submitted, a new row is added to Google Sheets.

This sheet becomes your experiment tracker.

Step 3 --- Automatically generate experiments every week

Go to Zapier and create a new Zap.

  • Trigger: Schedule → Every Monday → 8:00 AM

Action 1 --- Generate experiments (ChatGPT)

  • App: OpenAI
  • Action: Send Prompt

Paste this (or similar):

Generate 5 growth experiments for a bootstrapped SaaS founder.

Business:
[describe product]

Audience:
[describe audience]

Rules:
- Must be testable in 7 days
- No engineering required
- No paid ads above $100

Return:
Experiment Name
Channel
What to change
Metric
Hypothesis
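
If you prefer to script this step instead of running it through Zapier, a minimal sketch with the OpenAI Python SDK looks like this. The model name is only an example, and the prompt is the one above, shortened here.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = """Generate 5 growth experiments for a bootstrapped SaaS founder.
Business: [describe product]
Audience: [describe audience]
Rules: testable in 7 days, no engineering required, no paid ads above $100.
Return: Experiment Name, Channel, What to change, Metric, Hypothesis."""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat model works; use whatever you pick in Zapier
    messages=[{"role": "user", "content": prompt}],
)
experiments = response.choices[0].message.content
print(experiments)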

Action 2 --- Send to Jotform

  • Jotform → Create Submission (via Zapier integration)
  • Map fields from ChatGPT → Jotform.

Result: Every Monday → 5 experiments are automatically added to your system

You no longer ask: "What should I test?"

Step 4 --- Automatically add variations

Add another step in the same Zap:

  • OpenAI → Send Prompt (In the prompt, use the experiment name from the previous step)

Paste this (or similar):

For this experiment:

{Experiment Name}

Give:
- 5 headlines
- 5 hooks
- 3 offers

Then map the output to the "AI Variations" field in Jotform.

Result: Each experiment now includes ready-to-use variations.
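
The same sketch extends to this step: take the experiment name returned by the previous call and feed it into a second prompt. This is only an illustration of what the Zapier step does; the experiment name below is a placeholder.

from openai import OpenAI

client = OpenAI()
experiment_name = "Rewrite the pricing page headline"  # in Zapier, this comes from the previous step

variation_prompt = f"""For this experiment:

{experiment_name}

Give:
- 5 headlines
- 5 hooks
- 3 offers"""

variations = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": variation_prompt}],
).choices[0].message.content  # map this to the "AI Variations" field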

Step 5 --- Run experiments

During the week:

  • Pick 1--3 experiments
  • Run them

That's it.

No dashboards. No complexity.

Step 6 --- Enter results in Google Sheets

Open your sheet.

Each row = one experiment.

For each experiment, enter:

  • Traffic
  • Conversions
  • Baseline rate

This takes just a few minutes.
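
If typing the numbers in gets tedious, the entry can also be scripted. Here is a hedged sketch with the gspread library; the sheet name "Experiment Tracker" and the B/C/D column layout are assumptions based on Step 7.

import gspread

gc = gspread.service_account()                 # uses your Google service-account credentials
ws = gc.open("Experiment Tracker").sheet1      # assumed sheet name

# Last week's numbers for the experiment in row 2:
ws.update_acell("B2", 400)     # Traffic
ws.update_acell("C2", 32)      # Conversions
ws.update_acell("D2", 0.05)    # Baseline rate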

Step 7 --- Automatically flag real winners

This happens inside Google Sheets.

Set your columns first
Make sure your columns look like this:

  • B = Traffic
  • C = Conversions
  • D = Baseline Rate

Add "Conversion Rate"

Add a new column called: Conversion Rate

Click the first cell in that column (example E2) and type:

=C2/B2

Press Enter, then drag the formula down.

Add "Z Score"

Add another column called: Z Score

Click the first cell (example F2) and type:

=(E2 - D2) / SQRT((D2*(1-D2))/B2)

Press Enter, then drag down.

Add "Winner Flag"

Add another column called: Winner Flag

Click the first cell (example G2) and type:

=IF(AND(B2>100, ABS(F2)>1.96), "Winner", "Keep Testing")

Press Enter, then drag down.
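
If you want to sanity-check the spreadsheet logic outside Sheets, here is the same arithmetic in a few lines of Python. The numbers are made up for illustration; the 1.96 threshold corresponds to roughly 95% confidence on a two-sided one-proportion z-test against your baseline.

from math import sqrt

traffic, conversions, baseline = 400, 32, 0.05   # columns B, C, D

rate = conversions / traffic                                         # =C2/B2  -> 0.08
z = (rate - baseline) / sqrt(baseline * (1 - baseline) / traffic)    # about 2.75
flag = "Winner" if traffic > 100 and abs(z) > 1.96 else "Keep Testing"

print(rate, round(z, 2), flag)                   # 0.08 2.75 Winner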

What happens now
After you enter:

  • Traffic
  • Conversions
  • Baseline rate

Google Sheets will:

  • Calculate your conversion rate
  • Compare it to your baseline
  • Check if the result is strong enough

Then it will label each experiment: Winner or Keep Testing

Important: This is just a quick check. It may not always be right. Use it to spot good experiments, then review them.

Step 8 --- Get notified when something works

In Zapier, create a second Zap:

  • Trigger: Google Sheets → Updated Row
  • Filter: Only continue if Winner Flag = Winner
  • Action: Send Email (or Slack)

Message (or something similar):


You have a winner:

{{Experiment Name}}
Conversion Rate: {{Conversion Rate}}

Review now.

Result: When something works, you get notified automatically.
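
If you'd rather not maintain a second Zap, the same check is easy to script. Here is a minimal sketch that reuses the gspread setup from Step 6 and just prints winners; swap the print for an email or Slack webhook of your choice, and note the column headers are assumed to match Step 7.

import gspread

ws = gspread.service_account().open("Experiment Tracker").sheet1

for row in ws.get_all_records():                 # one dict per experiment row, keyed by header
    if row.get("Winner Flag") == "Winner":
        print(f"You have a winner: {row['Experiment Name']} "
              f"(Conversion Rate: {row['Conversion Rate']})")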

For best results

1. Keep it small. Run a small number of experiments per week (1-3).

2. Use simple tests. The best ones are:

  • Headline changes
  • Offer changes
  • Distribution (where you post)

3. Be consistent... consistency beats creativity.

4. Don't overthink stats. Good enough is good enough. You don't need perfect math. You need clear direction.

  1. 1

    What people usually miss with weekly review systems is the cost of capturing everything. If I have to stop and type out a bunch of notes after every call or experiment, the review never really happens. I get better signal when I can dump the rough thought right away and sort it out later. That’s where DictaFlow helps, especially when the week is already too packed for clean note-taking to feel realistic.

  2. 1

    This is gold. The "If I change X, then Y will improve because Z" forced field alone is worth implementing. Most experiments fail because the hypothesis was never clear to begin with. Saved this.

  3. 1

    Having ChatGPT automatically suggest 5 ideas every week sounds super convenient, but I’m a bit worried the ideas might eventually become repetitive or too generic. Do you have any prompt tricks that help the AI understand your customer insights more deeply? Or do you think it’s better to feed past failed experiments into Zapier as well so it can avoid repeating the same mistakes?

  4. 1

    I like how practical this is. I’ve noticed most founders don’t fail because of lack of ideas — they fail because they don’t test consistently enough. Turning experimentation into a weekly system instead of a random task makes a huge difference.

  5. 1

    The discipline of forcing yourself to write down the answer every week is
    the actual value of the system, more than the framework itself. The
    framework is just the prompt.

    One thing I would add to a weekly signal scan for early stage products:
    qualitative signals from cold conversations. Not just metrics. If three
    people you cold-pitched this week independently used the same word to push
    back, that is a signal worth more than a 5 percent conversion bump.

    Numbers tell you what happened. Conversations tell you why.

  6. 1

    Visibility into what's actually working is underrated - most solo founders are flying blind.
    What's the one metric you track most obsessively?

  7. 2

    Once you write “if I change X, then metric Y improves because Z,” you can’t smuggle in a feature you just wanted to build. The other underrated piece is that small, structured experiments compound into a written record of what actually moved the needle, which is the closest thing to institutional memory a small team can have. One thing I’d add from running this in a few orgs: tie the weekly experiment back to the problem it’s solving for a named user or segment, otherwise the system optimises for cleverness over learning. Anything we create as founders needs to solve a problem, unless we have enough stamina (money, people, resources etc..) to keep an experiment loop to identify a problem. A KPI is always tied to a business outcome, and the business outcome is always tied to a problem statement.

  8. 2

    Yeah, I totally agree, but don't you think a founder already has enough on their plate? Going through the complex architecture and UI of tools like Zapier or n8n just makes them scratch their head even more.
    I've been working on solving exactly this problem for the past 2 months. I'm the co-founder of Autom8AI, where we let you build these workflows just by describing them in plain English. No nodes, no drag-and-drop complexity.

    Would love to know what you think (visit Autom8AI, with io as the domain).

  9. 1

    Love this framework — especially the Z-score step. Most founders skip statistical significance entirely and just eyeball results, which leads to chasing noise.

    One thing worth noting: Google Sheets works great here at 50-100 experiments. Once you're tracking 300+ experiments across multiple channels though, it starts to buckle — slow formulas, version conflicts, accidental overwrites. At that point a simple SQL table (even SQLite) with a Power BI dashboard on top is a much cleaner setup. You get the same Winner Flag logic as a stored query, automatic history, and you can slice by channel/hypothesis type instantly.

    But for the stage most IH founders are at, this Sheets setup is exactly the right level of complexity. The key insight you nailed: make the system do the thinking, not you.

    Also the forced hypothesis format ("If I change X, then metric will improve because Y") is underrated. It eliminates at least half the garbage experiments before they start.

    I put together a free Top 15 SQL Server Interview Q&A that covers some of this query-building logic if anyone wants to go deeper on the data side: https://growthwithshehroz.gumroad.com/l/vgiex

  10. 1

    Great system. What strikes me most is step 1: forcing the wording "If I change X, then Y will improve because Z" before you even launch. Most founders test without a clear hypothesis, so they never really know why something worked.

    I'm building Visario at the moment, and I'll try to integrate this logic into my weekly tracking.

  11. 1

    The forced "If I change X, then metric will improve because reason" syntax is the part of this that actually makes the system work. Most experiment trackers fail not because founders can't generate ideas but because the ideas they generate are masked-as-experiments-but-actually-features ("let's try a new landing page" is not a hypothesis).

    One thing I'd add as a required field: a termination rule. "I will stop this experiment when X" — either a sample size, a date, or a result threshold. Without it, every experiment stays open forever waiting for "just a bit more data," and the system slowly becomes a graveyard of unconcluded tests. The hypothesis field forces clarity at the start; the termination field forces a decision at the end.

    Genuinely useful template. Bookmarking this.

  12. 1

    Really impressive system. Love how you've turned growth experiments into a structured weekly workflow with automation. The combination of ChatGPT + Zapier + Sheets makes it super scalable and removes a lot of manual thinking. Definitely a smart way to stay consistent with testing and actually learn from results.

  13. 1

    Really practical system—love that it forces a hypothesis before any experiment goes in. One small fix worth catching: the Winner Flag formula uses ABS(E2) > 1.96 but E2 is your conversion rate, not your Z-score. You want ABS(F2) > 1.96 (the Z-score column) to actually test for statistical significance. Easy fix but it matters when you're making real budget decisions off the output. This Sheets setup is perfect for 0–$10k MRR. Once you're running 20+ experiments across multiple channels and need to slice by cohort, user segment, or acquisition source simultaneously, you'll want this living in a proper data warehouse—that's where the real patterns emerge. I've built experiment tracking pipelines in SQL Server + Power BI for SaaS founders at exactly that inflection point. Free SQL interview guide here if useful → https://growthwithshehroz.gumroad.com/l/vgiex

  14. 1

    The system is well built. Two pushbacks for early-stage founders.

    The Z-score gate at 100 traffic is too low. With realistic conversion rates around 2 to 5 percent, you need roughly 1,000 visits per arm to detect a meaningful lift, otherwise you are calling random noise a 'winner' and chasing it. Below that volume, the better unit of testing is qualitative ('talked to 20 ICPs, ran 5 cold loops') not statistical.

    Also, ChatGPT generating 5 experiments every Monday is a recipe for novelty bias. The hardest part of growth is not running a new experiment, it is running the same boring thing for 12 weeks until the pattern shows up. At SocialPost we tracked weekly experiments for a year and the winners were almost never the clever AI-generated ones, they were the obvious ones I had been resisting. I would replace 'generate 5 new experiments' with 'add 1 new experiment per week and continue the 3 that are still running.'

  15. 1

    Curious how you handle the ChatGPT generation repeating similar ideas week after week. Do you feed it the previous week’s experiment list in the prompt, or does it stay fresh on its own?

  16. 1

    The Z-score approach for flagging winners is the right call — most founders skip statistical significance entirely and declare a winner based on gut feel after 50 visits.

    Quick heads-up on Step 7: the Winner Flag formula checks ABS(E2)>1.96 but E2 is your Conversion Rate (a decimal like 0.05), not the Z Score. You want to check column F (Z Score) instead: =IF(AND(B2>100, ABS(F2)>1.96), "Winner", "Keep Testing"). Otherwise you'd be flagging any conversion rate above 196% — probably not what you intended!

    The Sheets setup is a great starting point. In my experience working with SaaS founders on their data infrastructure, once you hit 50+ experiments across multiple channels, the sheet gets unwieldy. That's usually when people consider moving experiment logs into a proper DB — makes it much easier to slice by channel, timeframe, or experiment type.

    Your "consistency beats creativity" takeaway is the most valuable line in here. The founders I've seen build real growth habits run boring, simple, repeatable tests — not clever ones.

    If you're also running a SQL backend and want to catch performance issues before they sneak up on you, I put together a free diagnostic scripts pack → https://growthwithshehroz.gumroad.com/l/psmqnx

  17. 1

    Most people get stuck in the Jotform stage and never move to the Zapier automation part. Automating the prompt to generate variations every Monday morning takes the emotional labor out of deciding what to test next. It basically turns growth into a scheduled task rather than a creative burst you have to force.

  18. 1

    This is actually a really smart approach. A lot of founders operate almost entirely on intuition in the early stages, so having a simple weekly feedback system for product, growth, and retention signals probably helps reduce emotional decision-making a lot.

    Also liked the focus on trend tracking instead of obsessing over single-day metrics — that’s something more early-stage builders probably need to hear.

  19. 1

    I think my favorite part of this is the forcing function. Asking yourself "what should I test this week" is a question that derails more growth loops than anything else. If you automate the question so you literally only have to pick an experiment, you remove all the decision fatigue.

    I think the Z-score flag in Sheets is also underrated for how simple it is. Founders tend to either ignore statistical significance altogether or way over-engineer it. This is built at just the right level of rigour for early stage.

    1. 1

      That point about decision fatigue is spot on. When you're in the middle of a growth sprint, having to come up with a test from scratch every single week usually just leads to analysis paralysis or picking something low-impact because you're tired of thinking about it. Having a pre-defined queue changes the whole dynamic.

  20. 1

    This is brilliantly practical—thank you for sharing, Aytekin.

    Quick question: For the weekly ChatGPT experiment generation, how do you keep it from repeating similar ideas week after week without adding a "past experiments" memory block?

    Really appreciate you laying this out so cleanly. Already mapping it out for my own stack.

  21. 1

    In the early stages, it’s so easy to get fooled by 'false wins' due to small sample sizes. Setting a technical threshold (B > 100 and Z > 1.96) keeps founders grounded. For products with extremely low traffic, would you suggest leaning more into qualitative insights or sticking to these statistical formulas?

  22. 1

    The setup is clean, but the failure mode I've watched 50+ times running marketing teams isn't the tooling. It's that the experiment list becomes a wishlist. Week one you run three. Week three you run zero, and the sheet quietly turns into a graveyard. The discipline that keeps it alive is having someone whose only job that hour is reviewing last week's results before any new ideas get added. The Jotform/Zapier/Sheets stack works. The weekly review meeting is the actual product.

  23. 1

    The Z-score plus minimum-traffic guard is the part that quietly earns this whole post. Most growth-experiment templates I've seen skip the significance check entirely, which is exactly the failure mode I fell into running my own weekly tests on a small indie iOS app — for about six weeks I called anything above baseline a "win", then realized maybe half of those were just sample-size noise. The B>100 plus |Z|>1.96 rule on a sheet is a tiny piece of friction that prevents an enormous amount of self-deception, and it's also more honest than most founder dashboards. One thing I haven't solved on my side: how do you decide when to retire a losing experiment versus let it run another week? Do you have a rule, or is it gut?

    1. 1

      Most people hate doing this, but setting a target sample size before you start is the cleanest way. If you hit that number and the Z-score is still nowhere near 1.96, it’s a wash.

  24. 1

    This system is certainly effective when the product-market fit is clear—meaning you know exactly who your users are and what value you're delivering. After all, if you don't deliver the right message, you'll never see a drop in churn rates.

    However, we have to be careful about two things:

    First, are we truly delivering the right message to the right users? If the product is cluttered with unnecessary or misaligned features, users simply won't perceive its value, no matter how much you optimize the messaging.

    Second, this method won't work during the "unscalable" early stage where you're manually acquiring users one by one. Just because people are visiting your site doesn't mean they're willing to pay.

    It's easy to forget.

  25. 1

    This is honestly one of the best “simple but effective” startup systems I’ve seen in a while.

    Most founders spend too much time thinking about growth and not enough time testing things consistently.

    What I like here is the mindset:
    small experiments, every week, clear feedback loop.

    No giant dashboards.
    No over-engineering.
    Just momentum and iteration.

    The part about forcing every experiment into a clear hypothesis before running it is especially underrated. That alone probably improves decision-making massively.

    Really solid framework 👏

  26. 1

    This is a really practical system especially the part about forcing structured thinking before an experiment even gets created. Most teams skip that step and end up testing random ideas instead of clear hypotheses.

    I also like the automation angle with Zapier + Sheets. It removes the “what should we test next?” friction, which is usually where consistency breaks down. The winner flag setup is a nice lightweight way to avoid over-engineering analytics too.

    Curious if anyone here has tried something similar but applied it beyond SaaS (e.g. content growth, e-commerce, or agency workflows)?

  27. 1

    This is exactly the mental shift I've been applying to a different problem — managing multiple Claude API accounts.

    I used to do it manually: check which account hit its limit, switch keys, restart services. Same chaos you're describing with experiments — too much overhead, so you end up doing less of it.

    The fix was the same as your approach: stop doing it manually, build a layer that does it for you. Pool the accounts under a virtual key alias, let the system handle rotation and health checking. Now I almost never think about it.

    The principle generalizes well: if you're manually managing something more than once a week, it probably deserves a system.

  28. 1

    I loved it... thank you for explaining how to do these very important experiments that many people overlook...

  29. 1

    I've run something similar with agents doing the experiment queuing - what breaks first is usually results entry. Founders skip it when the result is ambiguous. The system assumes you'll be honest about the data.

  30. 1

    This is one of those systems that looks simple on the surface but actually solves a huge founder problem: inconsistency. Most startups don’t fail because they never had ideas — they fail because they stop testing systematically. I really liked how this turns growth into a weekly operating habit instead of random bursts of motivation.

    The best part for me was the balance between automation and practicality. No giant analytics stack, no overengineered dashboards — just experiments, tracking, and feedback loops. Also appreciated the point about “good enough” stats. Early-stage founders usually need direction more than perfect scientific certainty.

    And honestly, the real power here isn’t the AI generation itself — it’s building a repeatable system that compounds what works over time. That mindset alone can separate stagnant startups from fast-moving ones 🚀

  31. 1

    Although it seems a bit complicated, I think it will be useful in the long run. Thanks for sharing.

  32. 1

    Love the transparency

  33. 1

    This is a really useful framework.

    I’m probably not at the stage yet where I need a fully automated experiment system, but the core idea makes a lot of sense: stop guessing, run small tests every week, and actually write down what worked.

    As an early builder, I think even a simpler version of this could help a lot — one sheet, 1–2 experiments per week, and one clear metric.

    The biggest takeaway for me is consistency. It’s easy to keep thinking about growth, but much harder to keep testing it every week.

  34. 1

    This is useful because it turns growth into a repeatable habit instead of something founders only do when they feel stuck.

    The part I like most is the weekly rhythm: new experiments, run 1–3, enter results, then let the system show what actually worked. That removes a lot of guessing.

    One thing I’d be curious about: how do you decide which experiments are worth running first? Do you rank them by effort, expected impact, funnel stage, or just rotate through ideas every week?

  35. 1

    This is actually a really smart lightweight experimentation system. Most founders get stuck waiting for the “perfect” strategy instead of just testing consistently. I like that the workflow removes idea fatigue and keeps everything measurable without needing a huge analytics stack.

  36. 1

    Love the transparency. Just started building in public myself this week. It's scary but the feedback is invaluable.

  37. 1

    There's a bug in the Winner Flag formula. You have:

    =IF(AND(B2>100, ABS(E2)>1.96), "Winner", "Keep Testing")

    But E2 is the Conversion Rate column, not the Z-Score. A conversion rate is between 0 and 1, so ABS(E2) > 1.96 will almost never be true. The formula should reference F2 (the Z-Score):

    =IF(AND(B2>100, ABS(F2)>1.96), "Winner", "Keep Testing")

    With E2, the system will almost never flag a winner even when results are statistically significant.

  38. 1

    Solid system, but the bottleneck for most early-stage founders isn't the workflow, it's the willingness to call something a loser and kill it. I've watched founders run this kind of process and then keep variants alive 'one more week' for sentimental reasons until the data goes stale. The consistency point at the end is the actual unlock. One add: at sub-1,000 weekly visitors the z-score will almost always say 'Keep Testing,' which is technically right but operationally useless. For early-stage products, swap the statistical filter for a 'directional plus repeat' rule. If a change moves the metric in the right direction two weeks in a row, treat it as a winner.

  39. 1

    The feedback engine framing is exactly right, generate, test, score, reinforce. Most founders treat distribution the same way they treat experiments: random and inconsistent. Curious, are you applying this same system to content distribution or just conversion testing?

  40. 1

    The interesting part here isn’t the automation.

    It’s that you’re turning “growth intuition” into a repeatable operating system.

    Most founders still run experiments randomly:
    idea → test → forget → repeat.

    What you built is closer to a feedback engine:
    generate → test → score → reinforce.

    That layer becomes much bigger than “weekly growth experiments” if you keep pushing it.

    Also feels like the product may eventually outgrow educational/system-style branding into something more infrastructure-grade.

    Names like Xevoa.com, Beryxa.com, or Exirra.com would fit that direction unusually well.

    1. 1

      The infrastructure angle is interesting, most growth tools stay educational and never cross into actual operating layer. Are you building something in this space yourself?

      1. 1

        Not building a growth tool.

        I work more on the naming and positioning side for early products.

        That’s why the infrastructure angle stood out.

        If the product stays as “growth experiment ideas,” the current frame may work.

        But if it becomes the layer founders rely on to decide what to test, score what worked, and repeat what compounds, then the brand has to feel more like a system than a course or template.

        That’s where names like Xevoa, Beryxa, or Exirra start making more sense.

        They give the product room to feel like operating infrastructure, not just another growth playbook.

        1. 1

          Naming and positioning side makes sense, that infrastructure framing came naturally from how the product actually behaves, not how it's currently marketed. Interesting niche to be in, early founders usually get both wrong at the same time. Do you work with founders pre-launch or mostly after they've already shipped?

          1. 1

            Mostly early-stage founders after there’s already a real product or clear direction.

            Pre-launch naming is often guesswork.

            The best timing is when the product has enough shape to reveal what it actually wants to become, but early enough that the brand can still be changed without massive cost.

            That’s usually where the leverage is highest.

            The product starts as one thing, but the real category shows up through usage.

            That’s what I was pointing at here:
            if this becomes a repeatable growth operating layer, the name should not keep it trapped in “growth ideas” or “playbook” territory.

            That’s the gap I usually help founders pressure-test.

            1. 1

              That timing makes sense, post-launch is when the brand actually has something real to work with, not just a hypothesis.
              Interesting overlap actually. We work at a similar stage, when the product has shape but distribution hasn't clicked yet. We help founders get organic reach at scale through creator content, that's usually when positioning work like yours compounds fastest.
              Might be worth staying connected, sounds like we're often talking to the same founders at the same stage.

              1. 1

                Yeah, that overlap is real.

                You’re helping founders get reach.
                I’m usually looking at whether the product is framed and named well enough for that reach to actually convert.

                A lot of founders try to scale distribution while the brand still reads too small for what the product is becoming.

                That’s where the work connects.

                Send me your LinkedIn — worth staying connected. We’re probably seeing the same naming and positioning gaps from different sides.

  41. 1

    I’d use this more as a discipline system than a magic growth system. It keeps you honest: what did we test, what happened, what are we doing next?

  42. 1

    gonna try this stat. thanks!
