Animating Japan's Population Pyramid From 1950 to 2070: Why You Need Cohort-Specific Mortality to Get the 1950 Triangle Right
"Have you ever seen Japan's ageing?" I knew the data, but I'd never watched the shape of the country's population pyramid actually deform in front of me. So I built a 250-line page that does it: year slider plus auto-play, 1950 to 2070, with the first baby-boom cohort visibly rising from the bottom of the chart all the way to the top.

🌐 Demo: https://sen.ltd/portfolio/jp-population-pyramid/
📦 GitHub: https://github.com/sen-ltd/jp-population-pyramid

Hit the ▶ button and watch the chart morph at 120 ms per year. The first baby-boomers (born 1947-49) start as the broad base of a true triangle in 1950, become a fat bulge that walks up the chart through the decades, hit the very top of the projection in the 2030s, and disappear off the top by 2050. The two-bulge shape of 2020 turns into the inverted lopsided shape of 2070. Median age goes from 20 to 56, share aged 65+ from 7% to 38%.

Why this is harder than it looks

Drawing a single population pyramid in Plotly or D3 is a 30-line job. The interesting questions are upstream:

- Where do the numbers come from, and how do you cover both 70 years of historical data and 50 years of projection in one consistent dataset?
- How do you keep the shape evolving smoothly as a user drags the slider, when the source data is naturally 5- or 10-year snapshots?

Both of those have non-obvious answers, and one of them I got wrong on the first pass.

The data: emit it from a function, then nail the totals

The dataset that ships in `data.json` is generated by `generate-data.py`, a small Python script that exposes the demographic model as four functions:

- `annual_births(year)`: single-year births in thousands, 1850 → 2070. Two Gaussian bumps for the first and second baby booms, plus a piecewise-linear trend that drops through the late 20th century.
- `survival(age, sex, birth_year)`: share of a cohort still alive at the given exact age. More on this below.
- `cohort_size(birth_year, age, sex)`: multiply the two together with a sex-split factor at birth.
- `bin_population(year, bin_idx, sex)`: sum five single-age cohorts to make a 5-year bin.

Once those are in place, building a snapshot for any calendar year is just calling `bin_population` 21 times for each sex. The last thing the script does is calibrate:

```python
def snapshot(year):
    # Unscaled bin counts straight from the model.
    raw_male = [bin_population(year, i, "M") for i in range(N_BINS)]
    raw_female = [bin_population(year, i, "F") for i in range(N_BINS)]
    raw_total = sum(raw_male) + sum(raw_female)
    target = TARGET_TOTALS[year]  # IPSS / UN WPP value
    # Rescale so the snapshot total matches the published figure.
    scale = target / raw_total
    male = [round(m * scale) for m in raw_male]
    female = [round(f * scale) for f in raw_female]
    return {"year": year, "male": male, "female": female}
```

Even with carefully tuned birth and survival functions, the unscaled total comes out at 92-105% of the actual figure. Rather than fight the model into perfect alignment, I let it generate the shape and then scale each year's totals to the published numbers (1950 = 83.2 M, 2020 = 126.3 M, 2070 = 87.0 M, etc.). The shape lives in the model, the size is anchored to reality.

Where I got it wrong: 1950 wasn't a pyramid

The first version of `survival()` took only two arguments, age and sex, and used a stylized contemporary Japanese life table:

```python
import math

def survival(age, sex):
    if sex == "F":
        return math.exp(-((age / 90) ** 7))
    return math.exp(-((age / 85) ** 7))
```

Plug this into the 1950 generator. Anyone in the chart who's age 75 in 1950 was born in 1875, and the function predicts that about 66% of the men in that cohort are still alive: exp(-(75/85)^7) = exp(-0.42) ≈ 0.66. That's roughly correct for someone born in 1945, but for someone born in 1875 it's wildly optimistic. Tuberculosis, infant mortality, two wars: real cohort survival to age 75 for 1875-born Japanese was something like 5-10%. The result is a 1950 pyramid that looks like a slightly tapered trapezoid, with a median age of 36.9 and an aging ratio of 18.6%.

Both numbers are way off from reality (median age in 1950 Japan was about 22; aging ratio about 5%). The shape isn't a pyramid at all.

The fix is to make `survival` take `birth_year` as a parameter, and attenuate the modern curve for older cohorts:

```python
def cohort_factor(birth_year):
    if birth_year >= 1945:
        return 1.0
    if birth_year >= 1925:
        return 0.55 + 0.45 * (birth_year - 1925) / 20
    if birth_year >= 1890:
        return 0.25 + 0.30 * (birth_year - 1890) / 35
    return 0.20

def survival(age, sex, birth_year):
    # Pick the scale age first: `age / 90 if sex == "F" else 85` would parse
    # as `(age / 90) if ... else 85` and drive male survival to zero.
    scale_age = 90 if sex == "F" else 85
    base = math.exp(-((age / scale_age) ** 7))
    return base * cohort_factor(birth_year)
```

Now the 1875-born cohort's survival to 75 drops to 0.66 × 0.20 ≈ 13%, which is much closer to the historical range. Median age in 1950 is 20.4, aging ratio 7.2%. The pyramid actually looks like a pyramid.

The lesson here is one any demographer would tell you immediately: mortality is cohort-specific, not just age-specific. If your survival function takes only `(age, sex)`, you're applying today's life table to 19th-century births. The correction is one extra parameter and a piecewise-linear adjustment factor.

Linear interpolation between snapshots

The dataset only carries 13 snapshots: 1950, 1960, ..., 2070. When the user drags the slider to 2005, we blend the 2000 and 2010 snapshots 50/50:

```javascript
export function interpolateSnapshots(a, b, year) {
  // t runs from 0 at snapshot a to 1 at snapshot b.
  const t = (year - a.year) / (b.year - a.year);
  return {
    year,
    male: interpolateArrays(a.male, b.male, t),
    female: interpolateArrays(a.female, b.female, t),
  };
}

export function getSnapshot(snapshots, year) {
  // Clamp to the first and last snapshots outside the covered range.
  if (year <= snapshots[0].year) return clone(snapshots[0]);
  if (year >= snapshots.at(-1).year) return clone(snapshots.at(-1));
  for (let i = 0; i < snapshots.length - 1; i++) {
    const a = snapshots[i], b = snapshots[i + 1];
    if (year >= a.year && year <= b.year) return interpolateSnapshots(a, b, year);
  }
}
```

Linear blending of bin counts is not what real demographic dynamics do: cohorts move discretely between bins as they age, with deaths and births perturbing the totals. But for a draggable visualization at one-year resolution, the lie is small enough not to matter; nothing on screen jumps. If you wanted to show actual cohort flow you'd need a continuous-time integrator, which is a different tool with a different scope.

SVG diverging bars with one global scale

The chart is a vertical stack of 21 horizontal bars, with a centre column for the age labels. Males extend leftward from the centre, females rightward.

```javascript
function renderBars(snapshot, globalMax) {
  const halfPlotW = (VIEW_W - PAD_LEFT - PAD_RIGHT - CENTER_GAP) / 2;
  const cx = VIEW_W / 2;
  for (let i = 0; i < snapshot.male.length; i++) {
    const m = snapshot.male[i], f = snapshot.female[i];
    const mw = (m / globalMax) * halfPlotW;
    const fw = (f / globalMax) * halfPlotW;
    // Male bars grow leftward, so both x and width change.
    barCache.male[i].setAttribute("x", cx - CENTER_GAP / 2 - mw);
    barCache.male[i].setAttribute("width", mw);
    // Female bars only change width here.
    barCache.female[i].setAttribute("width", fw);
  }
}
```

Two design choices worth calling out:

- `globalMax` is the largest single bin across all snapshots, not per-frame. If you re-normalise per-frame, the boomer bulge appears stationary in the chart while everything around it grows and shrinks; the eye-catching effect (the bulge visibly rising up the pyramid as cohorts age) disappears. With a fixed scale, a bar of a given length means the same head count in every frame.
- CSS transitions on `<rect>` `width` and `x`. The `setAttribute` calls are direct, no animation library, but `.bar-male { transition: x 0.15s, width 0.15s; }` makes every per-year update interpolate smoothly. Drag the slider and the bars glide. Hit play and you get a film.

The `<rect>` elements are pooled at first render and reused on every year update. No DOM churn, no React.
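The fixed scale described above only needs to be computed once, when the data loads. A minimal sketch, assuming each snapshot has the `{ year, male, female }` shape used elsewhere in this post; the name `computeGlobalMax` and the sample numbers are hypothetical and not taken from the repository:

```javascript
// Hypothetical helper: largest single bin (male or female) across every snapshot.
function computeGlobalMax(snapshots) {
  let max = 0;
  for (const snap of snapshots) {
    for (const v of snap.male) if (v > max) max = v;
    for (const v of snap.female) if (v > max) max = v;
  }
  return max;
}

// Illustrative data only, not real population figures.
const demoSnapshots = [
  { year: 2000, male: [400, 300], female: [380, 310] },
  { year: 2010, male: [350, 420], female: [340, 390] },
];
const demoGlobalMax = computeGlobalMax(demoSnapshots); // 420
```

Passing the same value to `renderBars` on every frame is what keeps bar lengths comparable from one year to the next.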
Computing stats live

Median age, aging ratio, and working-age ratio are recomputed in JS on every render. With 21 bins and 13 snapshots there's nothing to optimise; just walk the cumulative sum.

```javascript
export function medianAge(snapshot, binWidth = 5) {
  const total = totalPopulation(snapshot);
  const half = total / 2;
  let cumulative = 0;
  for (let i = 0; i < snapshot.male.length; i++) {
    const binSize = snapshot.male[i] + snapshot.female[i];
    const next = cumulative + binSize;
    if (next >= half) {
      // Interpolate linearly inside the bin that contains the median.
      const fraction = binSize === 0 ? 0 : (half - cumulative) / binSize;
      return i * binWidth + fraction * binWidth;
    }
    cumulative = next;
  }
  return (snapshot.male.length - 1) * binWidth;
}
```

The unit tests pin a tiny synthetic dataset that lets you verify the median by hand:

```javascript
const TINY = [
  { year: 2000, male: [400, 300, 200, 100], female: [400, 300, 200, 100] },
  ...
];
// Bin sizes (M+F): 800, 600, 400, 200. Cumulative: 800, 1400, 1800, 2000.
// Half-total = 1000 → falls in bin 1. Need 200 more out of 600 → 1/3 of the bin.
// median ≈ 5 + (1/3)*5 = 6.667 years.
assert.ok(Math.abs(medianAge(TINY[0]) - 6.667) < 0.01);
```

The Japanese number formatting trap

Population in this codebase is in thousands, because that's what the source data looks like. Rendering "1.26億" or "8,320万" requires getting the unit conversions exactly right, and I shipped two different bugs before settling:

- 1万 = 10 千, so `man = thousands / 10`. I once wrote `Math.round(thousands / 10)` followed by `"0万"`, which appended an extra zero and rendered 83,202 千 (= 83.2 M people = 8,320 万) as "83200万" (= 8.32 億, or 832 M people, off by a factor of 10).
- 1億 = 1万 × 1万 = 100,000 千. I once divided by `10_000` instead of `100_000` and rendered 128,097 千 as "12.81億" (off by a factor of 10).
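Both slips are easy to reproduce. The sketch below is a hypothetical reconstruction of the two buggy conversions from their descriptions above; the function names `buggyMan` and `buggyOku` are invented for illustration and this is not the original code:

```javascript
// Hypothetical reconstruction of bug 1: dividing by 10 is already the full
// conversion to 万, so the literal "0" in the suffix adds a spurious digit.
function buggyMan(thousands) {
  return `${Math.round(thousands / 10)}0万`;
}

// Hypothetical reconstruction of bug 2: 1億 is 100,000 千, not 10,000 千.
function buggyOku(thousands) {
  return `${(thousands / 10_000).toFixed(2)}億`;
}

console.log(buggyMan(83_202));  // "83200万"
console.log(buggyOku(128_097)); // "12.81億"
```

Read back as head counts, "83200万" is 832 million people and "12.81億" is 1.281 billion, each ten times the true figure.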
The fix is one careful function with three branches and boundary-value tests for all three:

```javascript
export function formatJpPopulation(thousands) {
  if (thousands >= 100_000) return `${(thousands / 100_000).toFixed(2)}億`;
  if (thousands >= 10) return `${Math.round(thousands / 10).toLocaleString("en-US")}万`;
  return `${thousands}千`;
}
```

assert .