Age Period Cohort Analysis
Age-Period-Cohort analysis identifies patterns in cancer incidence or mortality rates from population-based Count (numerator) and Population (denominator) data. Often the data come from a Cancer Registry (e.g., SEER) in the form of a table showing the numbers of cancer cases or cancer deaths (counts) and corresponding person-years at risk (population) for particular age groups and calendar time periods. In cancer research, the Age-Period-Cohort (APC) framework is a fundamental model to analyze these data. The APC Model includes parameters that describe the mathematical relationships between the Rate of cancer and attained age, calendar period (year of diagnosis), and birth cohort (year of birth). The cancer Rate is expressed as the number of cancers per 100,000 persons at risk, which is calculated from the data using the formula: (Count/Population) x 100,000.
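As a minimal sketch of the rate formula above, the following Python snippet computes the Rate per 100,000 for a single cell of a rate table. The function name `rate_per_100k` is hypothetical and is not part of the web tool; the input values are the first cell of Sample Data 1 on this page.

```python
def rate_per_100k(count, population):
    """Return the rate per 100,000 persons at risk: (count / population) x 100,000."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# First cell of Sample Data 1: 177 deaths over 301,000 person-years.
print(round(rate_per_100k(177, 301000), 1))  # prints 58.8
```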
This web tool uses the inputted count and population data to fit the APC Model and returns a number of Outputs. To use this tool, the age and period intervals must all have the same width. When this is so, the diagonals of the rate table represent birth cohorts.
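To illustrate why equal interval widths make the diagonals of the rate table correspond to birth cohorts, the sketch below labels every cell with a cohort index. It assumes 1-based row (age) and column (period) positions and uses the relation cohort = period - age + number of age groups, which matches the reference cohort formula given later on this page. The function names are hypothetical and not part of the web tool.

```python
def cohort_index(age_idx, period_idx, n_age):
    """1-based birth-cohort index for the cell in row age_idx, column period_idx."""
    return period_idx - age_idx + n_age


def cohort_table(n_age, n_period):
    """Return an n_age x n_period grid of 1-based cohort indices."""
    return [
        [cohort_index(a, p, n_age) for p in range(1, n_period + 1)]
        for a in range(1, n_age + 1)
    ]


# A small table with 3 age groups and 4 periods:
# cells on the same diagonal share a cohort index.
for row in cohort_table(3, 4):
    print(row)
# prints:
# [3, 4, 5, 6]
# [2, 3, 4, 5]
# [1, 2, 3, 4]
```

The largest index equals P + A - 1, the number of birth cohorts spanned by the table.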
Visit the following sections to learn more about the APC web tool:
- Getting started - Learn how to enter data into the tool.
- Changing the default reference age, period, and cohort.
- Sample data - Enter any of these sample datasets to see the tool in action.
- What it does - Find out what outputs are produced.
- Saving your results - Find out how to save raw output as an Excel Workbook, R Workspace, or text file.
- FAQ - Find out answers to some common questions about the APC tool.
Getting started
Input data for the web tool consist of Count and Population data for particular age groups over calendar time, in the form of a matrix of rows with paired columns. Rows correspond to particular age groups and columns correspond to calendar time periods. The age and period intervals must all be equal, i.e. if 5-year age groups are used then 5-year calendar periods must also be used. The data can be input by copy-and-paste from an Excel worksheet or file upload of a comma-separated-values (csv) file.
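The paired-column layout described above can be sketched as follows. This hypothetical helper (not part of the web tool) splits pasted rows into a count matrix and a population matrix, assuming that each pair lists the count first and the population second, as in the sample data on this page.

```python
def split_paired_columns(text):
    """Split rows of paired Count/Population columns into two matrices."""
    counts, populations = [], []
    for line in text.strip().splitlines():
        # Accept whitespace-, tab-, or comma-separated values.
        values = [float(v) for v in line.replace(",", " ").split()]
        if not values:
            continue  # skip blank lines
        if len(values) % 2 != 0:
            raise ValueError("each row needs an even number of values")
        counts.append(values[0::2])       # 1st, 3rd, 5th, ... columns: counts
        populations.append(values[1::2])  # 2nd, 4th, 6th, ... columns: populations
    return counts, populations


# First two age groups and first three periods of Sample Data 1.
pasted = """177 301000 271 317000 312 353000
262 212000 350 248000 552 279000"""
counts, populations = split_paired_columns(pasted)
print(counts)
print(populations)
```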
To input from Excel:
- Copy the paired columns of data you want to analyze from your spreadsheet, right-click inside the empty matrix on the Input tab, and paste your selection.
- Fill in the information (meta-data) on the left hand side of the Input page:
  - Title - describe your data
  - Description - add optional details
  - Start Year - list the first calendar year of the first calendar period of your data, for example, use 1990 for the interval 1990 - 1994
  - Start Age - list the first age of the first age group of your data, for example, use 30 for the interval 30 - 34
  - Interval (Years) - the width of the age and period intervals, for example, use 1 for single-year data, 2 for two-year data, 5 for five-year data (e.g., 1990 - 1994), etc.
- Click the Calculate button.
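To show how Start Year, Start Age, and Interval together determine the age groups and calendar periods, here is a small sketch. The helper `interval_labels` is a hypothetical name used only for illustration and is not part of the web tool.

```python
def interval_labels(start, width, n):
    """Return n consecutive interval labels of equal width, e.g. '1990 - 1994'."""
    labels = []
    for i in range(n):
        lower = start + i * width
        upper = lower + width - 1
        # Single-year data has identical lower and upper bounds.
        labels.append(str(lower) if width == 1 else f"{lower} - {upper}")
    return labels


print(interval_labels(1990, 5, 3))  # calendar periods starting in 1990
print(interval_labels(30, 5, 2))    # age groups starting at age 30
# prints:
# ['1990 - 1994', '1995 - 1999', '2000 - 2004']
# ['30 - 34', '35 - 39']
```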
To input data from a csv file:
The csv file can include only count and population data, or it can include information fields in header lines delimited by semicolons, followed by the count and population data. Click here to see an example.
- Click the Browse button and select your file.
- Add or modify the meta-data on the left hand side of the input page.
- Click the Calculate button.
Changing the default reference age, period, and cohort
Default references
By default, the web tool uses the median age and period ranges as reference points for calculations. The reference ranges are calculated by the following formulas:
- (1) Reference Age = (Number of Age Groups + 1)/2
- (2) Reference Period = (Number of Periods + 1)/2
- (3) Reference Cohort = (Reference Period – Reference Age + Number of Age Groups)
Using Sample Data 1, which has 7 age groups and 7 periods, as an example:
- Reference Age = (7+1)/2 = 4, so the 4th age range is the reference age range
- Reference Period = (7+1)/2=4, so the 4th period group is the reference period range
- Reference Cohort = (4-4+7)=7, so the 7th cohort group is the reference cohort range
Since these are all ranges, the reference points are calculated by the following formulas:
- (4) Period and Age Points: (Lower Value + Upper Value + 1)/2
- (5) Cohort Point: [(Lower Period Value + Upper Period Value + 1)/2] – [(Lower Age Value + Upper Age Value + 1)/2], or more simply
- (6) Cohort Point: (Reference Period Point – Reference Age Point)
Using Sample Data 1 as an example:
- Age: 4th age range is 65-69, so (65+69+1)/2 = 67.5
- Period: 4th period range is 1950-54, so (1950+1954+1)/2 = 1952.5
- Cohort: 1952.5-67.5 = 1885
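The worked example above can be reproduced with a short sketch of formulas 1 to 6. The function name and return layout are hypothetical, not the web tool's own code. Integer division is an assumption made here so that an even number of groups yields the lower of the two central groups, following the convention stated in the What it does section.

```python
def reference_points(n_age, n_period, start_age, start_year, width):
    """Default reference positions (1-based) and reference points."""
    # Formulas 1-3: positions of the reference age, period, and cohort groups.
    # Floor division picks the lower central group when the number is even.
    ref_age_idx = (n_age + 1) // 2
    ref_period_idx = (n_period + 1) // 2
    ref_cohort_idx = ref_period_idx - ref_age_idx + n_age

    # Formula 4: point = (lower value + upper value + 1) / 2.
    age_lower = start_age + (ref_age_idx - 1) * width
    age_upper = age_lower + width - 1
    age_point = (age_lower + age_upper + 1) / 2

    period_lower = start_year + (ref_period_idx - 1) * width
    period_upper = period_lower + width - 1
    period_point = (period_lower + period_upper + 1) / 2

    # Formula 6: cohort point = reference period point - reference age point.
    cohort_point = period_point - age_point

    return {
        "age_idx": ref_age_idx,
        "period_idx": ref_period_idx,
        "cohort_idx": ref_cohort_idx,
        "age_point": age_point,
        "period_point": period_point,
        "cohort_point": cohort_point,
    }


# Sample Data 1: 7 age groups, 7 periods, Start Age 50, Start Year 1935, Interval 5.
print(reference_points(7, 7, 50, 1935, 5))
```

For Sample Data 1 this gives the 4th age group, 4th period, and 7th cohort, with reference points 67.5, 1952.5, and 1885.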
Changing the default:
To change this default to your own selection, load your data and then change the radio button under References from Automatic to Manual. The drop-down menus for Age and Period will contain all possible reference points for the loaded data. Choose the reference points you want for Age and Period; the tool then calculates the reference point for Cohort from formula 6 above.
Sample data
Illustrative Examples
You can demo the web tool by copying and pasting the example data into the web tool, then clicking the Calculate button.
Sample Data 1
Title: Prostate Cancer Mortality in Nonwhites
Copy all data below:
177 301000 271 317000 312 353000 382 395000 321 426000 305 473000 308 498000 262 212000 350 248000 552 279000 620 301000 714 358000 649 411000 738 443000 360 159000 479 194000 644 222000 949 222000 932 258000 1292 304000 1327 341000 409 132000 544 144000 812 169000 1150 210000 1668 230000 1958 264000 2153 297000 328 76000 509 94000 763 110000 1097 125000 1593 149000 2039 180000 2433 197000 222 37000 359 47000 584 59000 845 71000 1192 91000 1638 108000 2068 118000 108 19000 178 22000 285 32000 475 39000 742 44000 992 56000 1374 66000 |
Description: Example from Holford T.R. The estimation of age, period, and cohort effects for vital rates. Biometrics, 1983; 39:311-324.
Start Year: 1935
Start Age: 50
Interval: 5
The csv file is found here: Prostate Cancer Example File
Sample Data 2
Title: Belgium Female Lung Cancer Mortality
Copy all data below:
3 1578947.368 2 1538461.538 7 1400000.000 3 1578947.368 10 1428571.429 11 1666666.667 16 1632653.061 11 1527777.778 10 1408450.704 7 1228070.175 11 1410256.41 22 1666666.667 24 1632653.061 25 1524390.244 15 1136363.636 36 1348314.607 44 1392405.063 42 1660079.051 53 1568047.337 48 1221374.046 77 1590909.091 74 1321428.571 68 1379310.345 99 1636363.636 88 1288433.382 106 1606060.606 131 1541176.471 99 1294117.647 142 1340887.63 134 1285988.484 157 1515444.015 184 1533333.333 189 1490536.278 180 1255230.126 177 986072.4234 193 1307588.076 232 1417226.634 262 1455555.556 249 1414772.727 239 999581.765 219 1066731.612 267 1181415.929 323 1297188.755 325 1335799.425 343 1048929.664 223 849847.561 250 902527.0758 308 1010830.325 412 1115322.144 358 930595.269 198 591574.5444 214 636715.2633 253 688060.9192 338 773632.4102 312 690265.4867 |
Description: Example from Clayton D. & Schifflers E. Models for temporal variation in cancer rates. I: Age-period and age-cohort models. Stat. Med., 1987; 6:449-467.
Start Year: 1955
Start Age: 25
Interval: 5
The csv file is found here: Lung Cancer Example File
Sample Data 3
Title: US White Female Breast Cancer Mortality
Copy all data below:
45 5415162.455 37 5891719.745 27 6293706.294 19 6690140.845 28 6796116.505 21 7000000 20 7194244.604 22 7213114.754 20 7092198.582 15 6849315.068 66 4958677.686 64 5531547.105 78 5967865.34 57 6390134.529 63 6508264.463 55 6740196.078 59 7065868.263 62 7217694.994 43 7251264.755 62 7085714.286 103 4575744.114 138 5126300.149 143 5583756.345 145 6021594.684 114 6263736.264 124 6595744.681 128 6885422.27 123 7089337.176 138 7236497.116 113 7220447.284 172 4269049.392 201 4713883.677 206 5174579.251 195 5609896.433 226 6052490.627 205 6507936.508 220 6644518.272 229 6852184.321 226 7080200.501 222 7264397.906 256 4055124.347 254 4362761.937 301 4768694.55 317 5164548.713 314 5729927.007 410 6220603.854 358 6348643.376 346 6591731.758 346 6881463.803 336 7151979.566 334 3936822.254 341 4110910.187 359 4394124.847 410 4711019.189 427 5243123.772 412 5630722.974 549 6003937.008 562 6349565.021 499 6698885.756 493 6873954.267 442 3894273.128 433 3932788.374 455 4080717.489 460 4303086.997 536 4764444.444 606 5029045.643 612 5630174.793 756 6062550.12 732 6449339.207 761 6526586.621 661 3993957.704 631 3929016.189 587 3942243.116 567 4061604.585 572 4400000 713 4605943.152 729 5192307.692 856 5638998.682 992 6048780.488 869 6242816.092 905 4178208.68 792 4040816.327 750 3928758.512 786 3939849.624 782 4124472.574 824 4309623.431 961 4738658.777 995 5147439.214 997 5560513.107 1225 5993150.685 1205 4339214.98 1123 4168522.643 1013 3986619.441 878 3914400.357 1005 3958251.28 950 4077253.219 985 4356479.434 1144 4703947.368 1184 5083726.921 1351 5638564.274 1501 4442142.646 1408 4287454.324 1328 4122943.185 1251 4009615.385 1139 3948006.932 1167 3937246.964 1147 4081850.534 1237 4340350.877 1348 4638678.596 1417 5141509.434 1945 4509622.073 1826 4396821.575 1707 4274981.217 1851 4647250.816 1463 3795071.336 1291 3874549.82 1367 3884626.314 1412 4041213.509 1484 4243637.403 1701 4657721.796 2118 4453322.119 2260 4443570.586 2192 4394546.913 2038 4281512.605 1875 4120879.121 
1779 3963020.717 1523 3869410.569 1632 3891273.247 1728 3986159.17 1728 4295302.013 2348 4309838.473 2529 4439178.515 2464 4482444.97 2389 4411004.431 2256 4271109.428 2122 4139680.062 1981 3973921.765 1913 3848320.257 1839 3832847.02 1977 4024017.912 2634 4157853.197 2654 4358679.586 2695 4461920.53 2823 4448471.478 2657 4370784.669 2545 4287398.922 2421 4094368.341 2207 3888301.621 2185 3790112.749 2195 3846827.9 2825 4012214.174 2808 4168027.312 2807 4264661.197 3019 4330177.854 3117 4371055.953 3026 4366522.367 2710 4208074.534 2739 4026756.836 2432 3894939.142 2299 3802514.059 2827 3849400.871 2994 3946223.804 2977 4028416.779 3111 4167448.091 3211 4325744.308 3271 4404200.889 3221 4303848.21 3047 4173972.603 2810 4032721.01 2682 3817793.594 3000 3687768.9 3006 3774011.299 3015 3855498.721 3258 4000982.439 3170 4175997.892 3407 4291472.478 3478 4280615.385 3423 4229058.562 3159 4109535.58 3147 3903013.767 2744 3524727.039 3025 3623188.406 3087 3709890.638 3175 3817941.318 3183 3937894.346 3431 4070953.963 3626 4169253.766 3662 4215979.738 3676 4139173.516 3496 4023941.068 2614 3101198.244 3065 3450799.37 3058 3549210.771 3351 3645164.799 3301 3748154.877 3421 3872537.922 3637 4028131.576 3916 4134290.541 3869 4117271.47 3750 4086748.038 2770 3120423.567 2996 3260774.924 2962 3382436.908 3216 3516291.275 3473 3623747.913 3462 3737853.595 3623 3855075.548 3866 3944093.042 3992 4008837.116 3906 4041804.636 2649 2894765.599 2905 3058538.64 3059 3203476.804 3231 3370188.797 3237 3900939.986 3546 3581456.419 3584 3650809.82 3685 3713594.679 3977 3852935.478 4008 3946047.061 2382 2689398.216 2737 2827187.274 2797 2969844.978 3181 3138938.228 3177 3260802.628 3562 3361963.19 3642 3437794.978 3912 3505376.344 3888 3638745.905 4097 3740869.248 2374 2493173.703 2599 2573012.573 2662 2699523.375 2995 2854283.808 3078 2991835.148 3282 3102079.395 3511 3212847.731 3766 3299745.904 3935 3375943.72 4016 3452544.704 2310 2277881.866 2614 2328523.071 2598 2437605.555 2722 
2569621.448 2975 2713673.265 3206 2830405.226 3344 2962962.963 3600 3060964.204 3746 3111812.593 4021 3177400.237 2279 2037368.139 2366 2106293.955 2502 2196663.74 2513 2292046.698 2777 2427447.552 3000 2548203.517 3258 2679276.316 3495 2790419.162 3627 2871733.967 3919 2955282.407 2169 1787686.475 2290 1880594.564 2277 1956521.739 2428 2017784.426 2448 2137617.883 2710 2256452.956 2938 2382033.404 3388 2514099.139 3346 2612633.716 3670 2714095.548 1946 1538826.506 2023 1634879.586 2237 1705290.441 2208 1747388.414 2256 1850090.208 2403 1957318.563 2740 2078119.075 2868 2194506.083 3155 2306117.974 3294 2407542.757 1723 1286300.859 1834 1375121.841 1884 1453367.276 2047 1494596.963 1998 1574592.166 2219 1657702.077 2340 1767238.124 2610 1872981.701 2863 1963918.233 3175 2056746.777 1424 1018816.627 1539 1117484.752 1711 1220312.389 1848 1278185.088 1920 1322314.05 1915 1366393.15 2167 1444859.315 2308 1527768.584 2419 1614173.228 2732 1711350.539 |
Description: Example from Tarone R.E. & Chu K.C. Evaluation of birth cohort patterns in population disease rates. Am. J. Epidemiol., 1996; 143:85-91.
Start Year: 1970
Start Age: 24
Interval: 2
The csv file is found here: Breast Cancer Example File
What it does
The web tool applies the APC Model to the data in order to estimate parameters (trends and deviations). The parameters are combined to produce functions that describe relationships between the observed Rate of cancer and attained age, calendar period, and birth cohort. The web tool also calculates a number of statistical hypothesis tests (Wald Tests), which address whether the Rate of cancer varies significantly with age, period, and cohort.
The most important functions calculated by the web tool are summarized in the Table of Key Functions using the following conventions:
- The APC model is defined over A age groups and P calendar periods with equal intervals.
- The central age group, calendar period, and birth cohort define standard reference values a0, p0 and c0, respectively.
- When there is an even number of age, period, or cohort categories, the reference value is the lower of the two central values.
- All values labeled CI Lo are lower 95% confidence limits. All values labeled CI Hi are upper 95% confidence limits.
Table of Key Functions
Nomenclature | Interpretation |
---|---|
Fitted Temporal Trend | Expected rates over time in reference age group a0 adjusted for cohort effects |
Net Drift | Annual percentage change of the expected age-adjusted rates over time |
Local Drifts | Annual percentage change of the expected age-specific rates over time |
Cross-Sectional Age Curve (Cross Age) | Expected age-specific rates in reference period p0 adjusted for cohort effects |
Longitudinal Age Curve (Long Age) | Expected age-specific rates in reference cohort c0 adjusted for period effects |
Period Rate Ratios (PeriodRR) | Ratio of age-specific rates in each period relative to reference period p0 |
Cohort Rate Ratios (CohortRR) | Ratio of age-specific rates in each cohort relative to reference cohort c0 |
Other parameters and functions calculated by the web tool but not listed above are described in greater mathematical detail elsewhere (click here for a PowerPoint presentation).
Statistical hypothesis tests calculated by the web tool are summarized in the Table of Hypothesis Tests.
The Wald test statistics approximately follow a Chi-Square distribution when the Null Hypothesis is true. The df (degrees of freedom) is the number of free parameters included in each test. The web tool reports P-values; values less than 0.05 are often considered "statistically significant", meaning the data provide evidence against the Null Hypothesis.
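To make the link between a Wald statistic, its degrees of freedom, and the reported P-value concrete, here is a minimal sketch that evaluates the upper tail of a Chi-Square distribution using only the Python standard library. It is not the web tool's R code; the function name and the example statistic are hypothetical. Where SciPy is available, `scipy.stats.chi2.sf` gives the same quantity.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square with integer df."""
    if statistic < 0 or df < 1:
        raise ValueError("statistic must be >= 0 and df must be >= 1")
    if statistic == 0:
        return 1.0
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


# Hypothetical Wald statistic of 3.84 on 1 degree of freedom (e.g., net drift = 0).
print(chi_square_p_value(3.84, 1))
```

A statistic of about 3.84 on 1 degree of freedom sits right at the conventional 0.05 threshold; larger statistics give smaller P-values.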
Table of Hypothesis Tests
Null Hypothesis | Implications | Degrees of Freedom* |
---|---|---|
Net drift = 0 | Fitted temporal trends are stable (i.e., flat with no change) over time. Fitted longitudinal and cross-sectional age curves are proportional. | 1 |
All age deviations = 0 | Fitted longitudinal and cross-sectional age curves are log-linear (i.e., log-additive). | A - 2 |
All period deviations = 0 | Fitted temporal trends and period rate ratios are log-linear (i.e., log-additive). | P - 2 |
All cohort deviations = 0 | Cohort rate ratios are log-linear; all local drifts equal the net drift. | C - 2 |
All period rate ratios = 1 | Net drift is 0 and fitted temporal trends are constant; Cross-sectional age curve describes age incidence pattern in every period. | P - 1 |
All cohort rate ratios = 1 | Net drift is 0 and all local drifts are 0; Longitudinal age curve describes age incidence pattern in every cohort. | C - 1 |
All local drifts = the net drift | Temporal patterns are the same in every age group. | A - 1 if A = P, A otherwise |
*For APC model defined over A age groups, P calendar periods, and C = P + A - 1 birth cohorts.
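The degrees of freedom in the table can be tabulated for any table size with a short sketch like the one below. The function name is hypothetical; the formulas are copied directly from the Table of Hypothesis Tests, with C = P + A - 1.

```python
def wald_test_degrees_of_freedom(n_age, n_period):
    """Degrees of freedom for each Wald test, keyed by null hypothesis."""
    n_cohort = n_period + n_age - 1  # C = P + A - 1
    return {
        "Net drift = 0": 1,
        "All age deviations = 0": n_age - 2,
        "All period deviations = 0": n_period - 2,
        "All cohort deviations = 0": n_cohort - 2,
        "All period rate ratios = 1": n_period - 1,
        "All cohort rate ratios = 1": n_cohort - 1,
        "All local drifts = the net drift": n_age - 1 if n_age == n_period else n_age,
    }


# Sample Data 1 has A = 7 age groups and P = 7 periods, so C = 13 cohorts.
for hypothesis, df in wald_test_degrees_of_freedom(7, 7).items():
    print(f"{hypothesis}: df = {df}")
```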
Saving your results
You can save a complete record of the inputs and outputs using the Download button:
- "Text Input" - save the input rates in a separate tab in your web browser. This is a printed version of the inputs in R format.
- "Text Output" - save the model outputs in a separate tab in your web browser. This is a printed version of the outputs in R format.
- "R-Studio Input" - save the input data in an R Workspace.
- "R-Studio Output" - save the model outputs in an R Workspace.
- "Excel Output" - save the model outputs in an Excel Workbook.
Frequently Asked Questions:
- Can my counts include zeroes? Yes. The program fills in the zero counts with a small positive value. However, the results may be unreliable or unstable if many counts are zero.
- Can my Populations include zeroes? No. Zero population values will generate errors because the corresponding rates are undefined (you can't divide by zero!).
- What algorithm do you use to estimate the parameters and functions? We use the statistical method of weighted least squares and the assumption that the count data follow a Poisson distribution allowing for extra-Poisson variation.
- What software do you use? Our calculations are carried out using the R environment for statistical computing and graphics.
- Are your R programs available? Yes, our R programs are freely available. You can obtain a copy of our code here.
- Where do I go for technical support? Please send an e-mail to our technical support team.
- What browsers does the web tool support? The web tool has been tested in Google Chrome, Mozilla Firefox, Microsoft Edge, and Safari. Internet Explorer 11 and earlier are not supported.
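The first two answers above suggest a simple check that can be run on a dataset before it is entered into the tool. The sketch below is a hypothetical helper, not the web tool's own validation: it rejects non-positive populations and reports where zero counts occur. It does not replace zero counts, because the value the tool substitutes is not stated here. The example values are made up for illustration.

```python
def check_input(counts, populations):
    """Validate count/population matrices; return 1-based (row, column) positions of zero counts."""
    if len(counts) != len(populations):
        raise ValueError("counts and populations need the same number of rows")
    zero_counts = []
    for i, (count_row, pop_row) in enumerate(zip(counts, populations), start=1):
        if len(count_row) != len(pop_row):
            raise ValueError(f"row {i}: counts and populations differ in length")
        for j, (count, pop) in enumerate(zip(count_row, pop_row), start=1):
            if pop <= 0:
                # A zero population makes the rate undefined.
                raise ValueError(f"row {i}, column {j}: population must be positive")
            if count < 0:
                raise ValueError(f"row {i}, column {j}: count must not be negative")
            if count == 0:
                zero_counts.append((i, j))
    return zero_counts


# Made-up 2 x 2 example with two zero counts.
zeros = check_input([[0, 3], [5, 0]], [[1000, 1200], [900, 1100]])
print(zeros)  # prints [(1, 1), (2, 2)]
```

If many cells are reported, the results may be unreliable or unstable, as noted above.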