Your analytics dashboard is blind to most AI traffic. Here's how I fixed mine.
A few weeks ago I went digging through raw server logs on a WordPress site I run, out of simple curiosity about how often AI crawlers (GPTBot, ClaudeBot, Perplexity, and friends) were actually visiting.

The number I found didn't match GA4 at all. Not even close.

The blind spot

GA4 (and most JS-based analytics) works by firing an event from client-side JavaScript when a page loads in a browser. That's a reasonable assumption when your visitors are humans with browsers. It's a bad assumption when an increasing share of your traffic is AI agents fetching pages via HTTP to read, summarize, or train on your content.

Most of these agents:

- Don't execute JavaScript
- Don't render the DOM
- Just request the HTML and parse it server-side

Which means GA4 never sees them. Not "undercounts them": never sees them at all, structurally, by design.

When I cross-checked GA4's pageview count against my raw access logs filtered for known AI user-agents, the gap was roughly 9x. Nine times more AI bot requests than GA4 reported as traffic of any kind. That's not a rounding error; that's an entire category of visitor your dashboard doesn't know exists.

Why this matters more every month

As more search behavior shifts toward AI Overviews, AI Mode, and conversational assistants doing the browsing on a user's behalf, the traffic GA4 can see is shrinking as a proportion of total attention your content receives. You can be making real progress with the systems generating zero-click answers, and your analytics will tell you nothing changed.

If you can't see it, you can't optimize for it. You're flying half-blind.

What I built

EdgeShaping Lite is a small, free WordPress plugin that observes AI bot traffic at the PHP layer instead of the JavaScript layer. No JS dependency, no reliance on the bot executing anything: it just logs the request when it matches a dictionary of known AI crawler user-agents.

Core design constraints I held myself to:

- It doesn't block anything.
This is an observation tool, not a firewall. Blocking AI crawlers is a different (valid) problem with different tools.
- It doesn't modify content. No injected markup, no cloaking.
- No data leaves the site. Everything stays in the WordPress database. No third-party telemetry.

Install it, activate it, and you immediately get a dashboard: which bots, which pages, how often, when.

The more interesting part: the AHQG Matrix

Knowing that AI reads your pages is useful. Knowing which pages AI reads relative to which pages humans actually find through search is more useful, because the mismatch between those two signals is where the actionable insight lives.

That's what the AHQG Matrix does (patent application filed on the underlying method). It's a simple idea executed as a 2x2:

                          | Low AI bot visits        | High AI bot visits
--------------------------------------------------------------------------------------
High human search clicks  | STANDARD                 | ALIGNED
                          | (humans find it,         | (both AI and humans
                          |  AI mostly ignores it)   |  find it, healthy state)
--------------------------------------------------------------------------------------
Low human search clicks   | INCUBATION               | LATENT GAP
                          | (neither finds it yet)   | (AI already reads it heavily,
                          |                          |  humans haven't discovered it yet)

The quadrant that matters most in practice is LATENT GAP: pages AI is already crawling frequently (meaning some AI system has judged them worth reading and probably worth citing) that haven't yet translated into human search visibility. These are early signals worth acting on before they show up anywhere else in your funnel metrics.

Implementation-wise, the matrix needs two data sources:

- AI bot visit counts per page (from EdgeShaping's own observation log)
- Human search click counts per page (from the Search Console API)

It plots every page on those two axes, splits the distribution at a computed threshold per axis, and buckets pages into the four quadrants. The Google Search Console integration is optional: without it, you still get the raw AI traffic ranking, just not the cross-reference.
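To make the bucketing step concrete, here is a minimal sketch in Python (not the plugin's PHP, and not the patented method). It assumes the threshold on each axis is the median, which is my stand-in for "a computed threshold"; the function name `bucket_pages` and the input dictionaries are hypothetical.

```python
from statistics import median


def bucket_pages(ai_visits, human_clicks):
    """Assign each page to one of the four quadrant names from the matrix.

    ai_visits and human_clicks map a page path to a count. A page missing
    from one dictionary is treated as having a count of 0 on that axis.
    ASSUMPTION: the per-axis threshold is the median of all page counts.
    """
    pages = sorted(set(ai_visits) | set(human_clicks))
    if not pages:
        return {}

    ai_cut = median(ai_visits.get(p, 0) for p in pages)
    human_cut = median(human_clicks.get(p, 0) for p in pages)

    quadrants = {}
    for page in pages:
        high_ai = ai_visits.get(page, 0) > ai_cut
        high_human = human_clicks.get(page, 0) > human_cut
        if high_ai and high_human:
            quadrants[page] = "ALIGNED"
        elif high_human:
            quadrants[page] = "STANDARD"
        elif high_ai:
            quadrants[page] = "LATENT GAP"
        else:
            quadrants[page] = "INCUBATION"
    return quadrants


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    ai = {"/a": 100, "/b": 2, "/c": 90, "/d": 1}
    human = {"/a": 50, "/b": 40, "/c": 0, "/d": 1}
    for page, quadrant in bucket_pages(ai, human).items():
        print(page, quadrant)
```

A median split always puts roughly half the pages on each side of an axis, so it is only one plausible choice; a percentile or an absolute floor would move pages between quadrants.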
There's also a secondary signal I didn't expect to find useful until I built it: pages that get AI traffic but aren't in your sitemap at all (an "inferred path": AI found a route to a page your own site architecture doesn't formally declare), and the inverse, pages in your sitemap that neither AI nor humans ever reach (a genuine dead end, observable for the first time).

What I'd do differently

Two honest lessons from shipping this:

- OAuth is a bad default for a free tier. The original GSC integration required users to create a Google Cloud project and an OAuth client just to unlock the matrix view. For a plugin aimed at WordPress site owners, who are not necessarily developers, that's a steep ask, and it shows in support friction. I'm moving the free tier to a simpler CSV-import flow and reserving live OAuth sync for the paid edition.
- Localization infrastructure has more layers than you'd guess. WordPress.org's plugin UI strings and the plugin's directory listing page (the readme) are translated through completely separate systems. I had the in-plugin UI fully localized into Japanese while the public-facing listing page was silently still in English for over a week, with zero indication anything was wrong, quietly costing conversions from non-English-speaking visitors who landed on the page and bounced. If you're shipping a plugin for a non-English-primary audience, check both translation projects independently; don't assume one implies the other.

Try it

Free, open on the WordPress.org directory: https://wordpress.org/plugins/edgeshaping-lite/

If you run a non-trivial amount of content and haven't checked your raw logs for AI crawler traffic recently, I'd genuinely be curious what gap you find. Mine was 9x. I don't think that's an outlier.
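For anyone who wants to run that raw-log check by hand, here is a rough sketch in Python. It assumes your server writes the common "combined" access-log format. The marker list is illustrative (taken from the crawler names mentioned above), not the plugin's dictionary, and the names `AI_BOT_MARKERS`, `LOG_LINE`, and `count_ai_hits` are hypothetical.

```python
import re
from collections import Counter

# Illustrative substrings only; real crawler user-agent strings change over time.
AI_BOT_MARKERS = ("GPTBot", "ClaudeBot", "Perplexity")

# ASSUMPTION: combined log format, i.e.
# ... "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def count_ai_hits(lines):
    """Count requests per (bot marker, path) for lines whose UA matches a marker."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        user_agent = match.group("ua").lower()
        for marker in AI_BOT_MARKERS:
            if marker.lower() in user_agent:
                hits[(marker, match.group("path"))] += 1
                break
    return hits


if __name__ == "__main__":
    # Made-up log lines using documentation-range IP addresses.
    sample = [
        '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /post-a HTTP/1.1" '
        '200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
        '198.51.100.7 - - [01/Jan/2025:00:00:02 +0000] "GET /post-a HTTP/1.1" '
        '200 5120 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    ]
    for (bot, path), count in count_ai_hits(sample).items():
        print(bot, path, count)
```

Summing the counts and comparing the total with the pageviews GA4 reports for the same period gives a ratio comparable to the 9x figure. User-agent strings can be spoofed, so treat the result as an estimate.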