Werner03.Data

Reading: BASIC RATEMAKING, Fifth Edition, May 2016, Geoff Werner, FCAS, MAAA & Claudine Modlin, FCAS, MAAA Willis Towers Watson

Chapter 3: Data

Pop Quiz

What is the Fundamental Insurance Equation?

Study Tips

VIDEO: W-03 (001) Aggregating Data (4:00)

Geoff Werner and Claudine Modlin write very well, and I think it's valuable to keep the source text handy. For our purposes, however, we need to focus firmly on the types of questions that appear on the exam. If you click through the old questions in the BattleTable below, you'll see there are essentially just two types of problems:

advantages / disadvantages of different data aggregation methods (calendar year, accident year, policy year, report year)
calculating losses and premiums from policy & claims information

The first is memorization, the second is drill-type practice. We'll cover both types of problems in the quizzes.

This chapter is listed as Core Content in the Ranking Table, but you don't need to spend a lot of time here. The early chapters in the text are all building to Chapter 8 - Overall Indication. Chapter 8 encompasses everything from the earlier chapters.

Estimated study time: ½ day (not including subsequent review time)

BattleTable

Based on past exams, the main things you need to know (in rough order of importance) are:

advantage / disadvantage of data aggregation methods
calculation of losses from policy or claims databases

reference	part (a)	part (b)	part (c)	part (d)
E (2019.Spring #3)	advantage / disadvantage: - CY aggregation ^2a	aggregation method: - long-tailed line	aggregation method: - claims-made line ¹
E (2018.Spring #3)	BattleActs PowerPack - Past Exams
E (2018.Spring #16)	BattleActs PowerPack - Past Exams
E (2017.Spring #3)	calculation: - AY IL	calculation: - CY IL	advantage / disadvantage: - CY aggregation ^2b
E (2016.Spring #4)	calculation: - CY case IL	calculation: - PY case IL	calculation: - AY case ILR	advantage / disadvantage: - PY aggregation
E (2015.Spring #6)	advantage / disadvantage: - CY aggregation	advantage / disadvantage: - PY aggregation	advantage / disadvantage: - RY aggregation
E (2013.Fall #3)	objectives met? - CY aggregation	objectives met? - CY/AY aggregation	objectives met? - PY aggregation

¹ Claims-made lines are discussed in Werner16.ClaimsMade.

^2a,2b One of the advantages of CY aggregation given in the examiner's report for 2017.Spring #3 (suitability for financial reporting) was not accepted in 2019.Spring #3. This happens from time to time in the grading process but there is likely nothing you can do about it. An appeal on this basis is not likely to be successful.

Full BattleQuiz

In Plain English!

Alice had a good laugh over the very first sentence from this chapter in the source text:

One of the most significant underpinnings of the ratemaking process is data. The quality of the final rates depends largely on the quality and quantity of data available.

She laughed because it's just so obvious. Yes, quality data is important! Duh.

Important Terminology: These terms mean the same thing: reported loss (Friedland's term) & incurred loss (Werner's term)

The term reported loss is better because it's consistent with the formula:

ultimate loss = reported loss + unreported loss = reported loss + IBNR

The reason this makes sense is that IBNR (Incurred-But-Not-Reported) is just a different way of saying unreported. But if you use the term incurred loss then the above formula looks like this:

ultimate loss = incurred loss + IBNR

which, strictly speaking, doesn't make sense. If you take incurred loss literally, it should include IBNR. (Incurred loss should include both reported and unreported losses.) But the convention is that incurred loss includes only case incurred loss, making it exactly the same as reported loss. Anyway, don't get too hung up on this. Just be aware that the terminology is slightly different between Friedland and Werner.

Internal Data

Ratemaking uses 2 types of internal information or data:

risk information at the policy level (exposures, premiums, claim counts, losses, individual risk characteristics like age, gender,...)
accounting information at the aggregate level (U/W expenses, ALAE, ULAE,...)

The goal in a pricing analysis is to determine whether current rates are adequate and to assess the need for future rate changes. Most companies maintain risk data in 2 separate databases:

policy database (policy ID, premiums, exposures, individual risk characteristics)
claims database (policy ID, claim #, paid loss, case reserve,...)

These databases are linked by policy ID but since not every policy will have a claim, the IDs in the claims database will be a subset of the IDs in the policy database. Here is a very simple example of how policy data is organized into a data table. Read the short highlighted excerpt from Werner using the link below.

EXAMPLE: Policy Database

The claims database is also very simple. Here's the link to Werner. Take a few minutes to make sure you can follow it.

EXAMPLE: Claims Database
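To make the link between the two databases concrete, here is a minimal Python sketch. The records, field names, and dollar amounts are all made up for illustration (they are not Werner's examples). It checks that the policy IDs in the claims database are a subset of those in the policy database, then totals the reported (case incurred) loss for each policy as paid loss plus case reserve.

```python
# Hypothetical policy database keyed by policy ID (illustrative fields only).
policy_db = {
    "P1": {"written_premium": 1_000, "exposures": 1},
    "P2": {"written_premium": 1_200, "exposures": 1},
    "P3": {"written_premium": 900, "exposures": 1},
}

# Hypothetical claims database: one row per claim, linked by policy ID.
claims_db = [
    {"policy_id": "P1", "claim_no": 1, "paid_loss": 500, "case_reserve": 200},
    {"policy_id": "P3", "claim_no": 2, "paid_loss": 0, "case_reserve": 1_000},
    {"policy_id": "P3", "claim_no": 3, "paid_loss": 300, "case_reserve": 0},
]

# Not every policy has a claim, so the claim-side IDs form a subset.
claim_policy_ids = {claim["policy_id"] for claim in claims_db}
ids_are_subset = claim_policy_ids.issubset(policy_db)


def reported_loss_by_policy(policies, claims):
    """Sum paid loss + case reserve per policy; policies with no claims get 0."""
    totals = {policy_id: 0 for policy_id in policies}
    for claim in claims:
        totals[claim["policy_id"]] += claim["paid_loss"] + claim["case_reserve"]
    return totals


reported = reported_loss_by_policy(policy_db, claims_db)
print(ids_are_subset)  # True
print(reported)        # {'P1': 700, 'P2': 0, 'P3': 1300}
```

Policy P2 has no claims, so it appears in the policy database only and shows a reported loss of 0.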

There is no example for the accounting database in the text. Note that the U/W expense and ULAE components are generally tracked by calendar year, whereas ALAE can be tracked by accident year. Accounting data is used to determine the expense provision in the ratemaking process.

Here are 4 practice problems in using a claims database to calculate AY and CY losses. (You might first want to watch the Quick-Vids in the web-based problems in the next quiz.)

Practice: 4 claim aggregation (like 2017.Spring #3)
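If it helps to see the mechanics before drilling, here is a rough Python sketch of the two calculations. The transactions are made up for illustration, and the row layout is an assumption rather than Werner's exact table. It uses the standard definitions: AY reported loss as of a valuation date is cumulative paid loss plus the case reserve at that date for claims whose accident occurred in the year; CY reported loss is paid loss during the year plus the change in case reserves during the year.

```python
from datetime import date

# Hypothetical claim transactions (not from Werner). Each row is:
# (claim_id, accident_date, transaction_date, payment, case_reserve_after)
TRANSACTIONS = [
    ("C1", date(2022, 11, 1), date(2022, 12, 1), 0, 10_000),
    ("C1", date(2022, 11, 1), date(2023, 3, 1), 4_000, 5_000),
    ("C1", date(2022, 11, 1), date(2024, 2, 1), 5_500, 0),
    ("C2", date(2023, 6, 15), date(2023, 7, 1), 1_000, 8_000),
    ("C2", date(2023, 6, 15), date(2024, 1, 10), 3_000, 2_000),
]


def paid_through(claim_id, as_of):
    """Cumulative payments on a claim up to and including the as-of date."""
    return sum(pay for cid, _, txn, pay, _ in TRANSACTIONS
               if cid == claim_id and txn <= as_of)


def reserve_as_of(claim_id, as_of):
    """Case reserve after the latest transaction on or before the as-of date."""
    rows = [(txn, res) for cid, _, txn, _, res in TRANSACTIONS
            if cid == claim_id and txn <= as_of]
    return max(rows, key=lambda row: row[0])[1] if rows else 0


def ay_reported_loss(accident_year, as_of):
    """Cumulative paid + case reserve for claims with accidents in the year."""
    claims = {cid for cid, acc, _, _, _ in TRANSACTIONS if acc.year == accident_year}
    return sum(paid_through(c, as_of) + reserve_as_of(c, as_of) for c in claims)


def cy_reported_loss(calendar_year):
    """Paid during the year + change in case reserves during the year."""
    start = date(calendar_year - 1, 12, 31)
    end = date(calendar_year, 12, 31)
    claims = {cid for cid, *_ in TRANSACTIONS}
    paid = sum(pay for _, _, txn, pay, _ in TRANSACTIONS if txn.year == calendar_year)
    reserve_change = sum(reserve_as_of(c, end) - reserve_as_of(c, start) for c in claims)
    return paid + reserve_change


print(ay_reported_loss(2023, date(2023, 12, 31)))  # 9000
print(cy_reported_loss(2023))                      # 8000
```

Notice that CY 2023 picks up activity on the 2022 accident (a 4,000 payment and a 5,000 reserve decrease) as well as the 2023 accident, while AY 2023 only ever includes claim C2. In this made-up data, CY 2024 comes out negative (-2,500) because case reserves were released faster than losses were paid.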

Once you've done the practice problems, you won't have any trouble with this exam problem:

E (2017.Spring #3)

And the quiz...

mini BattleQuiz 1

Data Aggregation

There are 4 common ways of aggregating data (mnemonic: CAP-R):

Calendar Year (CY)
Accident Year (AY)
Policy Year (PY)
Report Year (RY)

The definitions and advantages/disadvantages of each are discussed in Reserving Chapter 3 - Organizing the Data. Please review that section before continuing. The pricing and reserving source texts cover essentially the same concepts. The pricing text however specifically covers the following:

Question: identify the objectives of data aggregation for ratemaking [Hint: MMR]

Match losses and premiums as closely as possible

Minimize the cost of collecting the data

Recent? (Use the most recent data available)

Based on what you already know from reserving, you can probably guess whether these MMR objectives are met for each of the 4 data aggregation methods.

Calendar Year

losses and premiums are not well-matched because premiums come from policies from the current and/or prior year but losses may come from many prior years depending on the length of the development tail
low cost and readily available since CY data is collected for other purposes like financial statements
CY data is recent because it doesn't develop after year-end and is available immediately

Accident Year

losses and premiums are better matched than in CY aggregation because losses on accidents occurring during the year are compared to premiums earned on policies during that same year (this is sometimes called calendar-accident year)
higher cost versus CY data since AY data is generally collected only for actuarial purposes and is specific to insurance
fully developed AY data is not recent so if you need recent AY data, you may need to estimate future development (this is why actuaries exist)

Policy Year

losses and premiums are perfectly matched because losses on policies written during the PY are compared with premiums earned on those same policies
higher cost versus CY data since PY data is specific to insurance
fully developed PY data is not recent so if you need recent PY data, you may need to estimate future development (this is why actuaries exist)

(But PY data has an additional complication that AY data doesn't. For AY 2024 data, all accidents have occurred by Dec 31, 2024. For PY 2024, assuming annual policies, a policy written at the end of 2024 will be in force until the same date in 2025. That means accidents attributed to PY 2024 can happen over a 2-year period from Jan 1, 2024 to Dec 31, 2025. This extra lag time means PYs are relatively less mature than AYs, so there's potentially more uncertainty in estimates of ultimate losses.)
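Here is a small Python sketch of the AY versus PY assignment described above. It assumes annual policies and follows the convention in the paragraph above that a policy written on Dec 31 stays in force until Dec 31 of the following year. The function names and the sample claim are made up for illustration.

```python
from datetime import date


def accident_year(accident_date):
    """AY is set by when the accident happened."""
    return accident_date.year


def policy_year(policy_effective_date):
    """PY is set by when the policy was written, not when the accident happened."""
    return policy_effective_date.year


def py_accident_window(year):
    """Earliest and latest accident dates for a PY (assumes annual policies)."""
    return date(year, 1, 1), date(year + 1, 12, 31)


# Hypothetical claim: policy written at the very end of 2024, accident late in 2025.
effective = date(2024, 12, 31)
accident = date(2025, 11, 15)

print(accident_year(accident))   # 2025
print(policy_year(effective))    # 2024
print(py_accident_window(2024))  # (datetime.date(2024, 1, 1), datetime.date(2025, 12, 31))
```

The same claim lands in AY 2025 but PY 2024, which is why PY 2024 is still accumulating accidents a full year after AY 2024 has closed.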

Report Year (often used for commercial products like medical malpractice that use "claims-made" policies – discussed in detail in Werner16.ClaimsMade)

losses and premiums are better matched than in CY aggregation because losses on accidents reported during the year are compared to premiums earned on policies during that same year (similar to calendar-accident year)
higher cost versus CY data since RY data is specific to insurance
RY data has no pure IBNR (IBNYR) because by definition RY data contains only claims reported for that year (makes actuarial estimates less uncertain because there is only IBNER)

External Data

We discussed external data in Reserving Chapter 3 - Sources of Data, so click the link for a quick review. In the reserving material, I mentioned 1 source of external data for the U.S.: ISO or Insurance Services Office. Werner goes into a little more detail on what this is and also mentions a few other good sources of external data, but the text has far more detail than you need for the exam. Good bedtime reading.

Statistical Plans

ISO and NCCI (National Council on Compensation Insurance) collect data from insurers and aggregate it into statistical plans. It is often mandatory for insurers to submit this data, and it must be in a standard format to ensure consistency.

Other aggregated industry data

A common example is the Fast Track Monitoring System. It's based on voluntary reporting of data by insurers to a data collecting organization. Fast Track data is often used to analyze frequency and severity trends.

Competitor information

A rate filing is public information in many jurisdictions, so an insurer can use it to understand its competition. The drawbacks, however, are that an individual rate filing may not be complete (rate filings only contain changes from the last filing) and that each insurer may have different customers, goals, expenses, and operating procedures, so the competitors' pricing strategies may not be relevant.

Other 3rd-party data

This refers to non-insurance data including economic, demographic, and credit data. For example, the Consumer Price Index (CPI) might be used to connect inflation with severity trends.

And the quiz...

mini BattleQuiz 2

Exam Problems

Here are just a few extra exam problems for chapter 3:

mini BattleQuiz 3

Full BattleQuiz

POP QUIZ ANSWERS

Fundamental Insurance Equation: premium = (Losses + LAE + U/W Expenses) + U/W profit
