Friedland03.Data

Reading: Friedland, J.F., Estimating Unpaid Claims Using Basic Techniques, Casualty Actuarial Society, Third Version, July 2010. The Appendices are excluded.

Chapter 3: Understanding the Types of Data Used in the Estimation of Unpaid Claims

Pop Quiz

Alice gave you 2 mega-useful formulas in Chapter 2: The Claims Process, for total unpaid and ultimate. What are they? (The answers are at the end of this page.)

Study Tips


There is a lot of basic information in this chapter but aside from a few key facts to memorize, it isn't something you can fully absorb on your first pass. You are probably familiar with some of it anyway from your work duties. So don't spend too long here – you can always refer back when necessary. The most important set of facts you have to memorize is:

advantages/disadvantages of various data aggregation methods (CY, AY, RY, PY)

There are a lot of BattleCards in the quizzes but most of them are pretty easy and won't take a tremendous amount of time to memorize. Remember: you have to get through these first 6 chapters quickly because the main reserving material is in chapters 7-17.

Estimated study time: ½ day (not including subsequent review time)

BattleTable

Based on past exams, the main things you need to know (in rough order of importance) are:

data aggregation - advantages/disadvantages of CY/AY/PY/RY aggregation
homogeneity & credibility - considerations for combining data
large claims threshold - selection considerations

reference	part (a)	part (b)	part (c)	part (d)
E (2016.Fall #17)	data for analysis: - compare strategies
E (2016.Spring #15)	Friedland05.Triangles	data aggregation: - is CY appropriate?	data aggregation: - is AY appropriate?	Friedland07.Development
E (2015.Fall #15)	large claim threshold: - selection considerations	Friedland09.BornFerg
E (2015.Fall #16)	Friedland05.Triangles	data aggregation: - RY versus AY

Full BattleQuiz

In Plain English!

Source of Data

This short section is just memorization of the facts below. You sometimes need to draw on this background information to answer certain essay-style questions.

Question: do large insurers normally use their own internal data or rely on external/industry data

large insurers prefer to use their own data as it's more relevant to their own experience
large insurers might still use external data as benchmarks in relation to their own data

Question: for what purposes might external benchmarks be particularly useful to an insurer [Hint: TET]

Tail factors

Expected claims (loss) ratios

(used in some reserving methods:

→ Chapter 8: Expected Claims Method

→ Chapter 9: Bornhuetter-Ferguson Method

→ Chapter 10: Cape Cod Method)

Trends (severity & frequency of claims)

Question: in what situations might an insurer want to use external data

small insurers without credible internal data
a company with limitations on what data their systems can provide (often the case with a small company)
entering a new LOB (Line of Business) or geographical region where the company wouldn't have any prior data

Question: identify 1 U.S. source and 1 Canadian source of external data

U.S.: ISO (Insurance Services Office)

Canada: IBC (Insurance Bureau of Canada)

Question: identify potential differences in external versus internal data that may reduce the relevance of external data

insurance product
insurer operations
case reserving and case settlement practices
mix of business

mini BattleQuiz 1

Homogeneity and Credibility

Suppose your reserve analysis covers 5 different LOBs (Lines of Business): A, B, C, D, E.

LOB A: 20,000 claims

LOB B: 100 claims

LOB C: 20,000 claims

LOB D: 100 claims

LOB E: 100 claims

Let's also suppose that LOBs A & B are similar to each other, LOBs C & D are similar to each other, but LOB E is not similar to any of the others.

Question: what is a reasonable way to group these LOBs for a reserve analysis

The key concepts in making that decision are homogeneity and credibility. We would like to maximize both.

→ homogeneity refers to how similar claims are within a grouping (a grouping of similar claims is called a homogeneous grouping)

→ credibility refers to the statistical significance of a grouping (the more claims in a grouping, the more statistically significant it is)

So a reasonable answer to the above question is:

group LOBs A & B:

- LOB A is homogeneous and has enough claims to be credible (and could be analyzed on its own)

- LOB B is homogeneous but is not credible (cannot be analyzed on its own)

- since LOBs A & B are similar, they can be combined and the resulting grouping is credible and still homogeneous

group LOBs C & D:

- the reasoning is the same as for A & B

LOB E should be supplemented with external data

- LOB E is homogeneous but not credible (cannot be analyzed on its own)

- LOB E is not similar to any other LOBs (so cannot be grouped with them)

- a valid alternative is to look for credible external data that is similar to LOB E
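The grouping logic above can be sketched in a few lines of Python. This is only an illustration: the claim counts and similarity groups come from the example, but the 1,000-claim credibility cutoff (MIN_CREDIBLE_CLAIMS) and the function name are assumptions made up for the sketch. The source text doesn't give a numeric cutoff.

```python
# Claim counts from the example above.
claim_counts = {"A": 20_000, "B": 100, "C": 20_000, "D": 100, "E": 100}

# Which LOBs are similar to each other (given in the example).
similar_groups = [{"A", "B"}, {"C", "D"}, {"E"}]

# ASSUMPTION: an illustrative credibility cutoff, not from the source text.
MIN_CREDIBLE_CLAIMS = 1_000


def assess_groupings(counts, groups, min_claims):
    """Return (sorted LOB names, total claims, is_credible) for each homogeneous group."""
    results = []
    for group in groups:
        total = sum(counts[lob] for lob in group)
        results.append((sorted(group), total, total >= min_claims))
    return results


results = assess_groupings(claim_counts, similar_groups, MIN_CREDIBLE_CLAIMS)
for lobs, total, credible in results:
    action = "analyze as a group" if credible else "supplement with external data"
    print(lobs, total, action)
```

Under that assumed cutoff, A & B and C & D each come out credible as combined groups, while E on its own does not.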

I said above that we'd like to maximize homogeneity and credibility simultaneously but that isn't always possible. Often, increasing homogeneity of a group of claims means getting rid of some individual claims that aren't like the others. That means the group gets smaller. But a smaller group has less credibility.

Inconvenient Fact: there is a trade-off between homogeneity and credibility (increasing one tends to decrease the other)

Homogeneity and credibility have to be balanced. The overall goal is to produce an accurate reserve analysis, so you have to strike a balance in how you group your data. The source text provides a list of considerations in making this determination. If you want to glance ahead, there's an example of combining data for analysis in Chapter 7: Influence of a Changing Environment.

Question: identify considerations in determining data groupings for analysis (many answers are possible)

number of claims in grouping (more claims means higher credibility, but potentially less homogeneity)
claim development patterns (better if each group member has a similar pattern – this is discussed much more fully in Chapter 7: Development)
severity distribution of claims
case reserving strength
likelihood of a claim reopening

mini BattleQuiz 2

Types of Data Used by Actuaries

This whole section in the source text seems pretty pointless – bedtime reading at best. There was 1 exam question where you had to memorize a list of considerations regarding thresholds for large claims but other than that, the material is either obvious or covered in later chapters. Read the summary below and memorize the answers to the 3 questions.

Claim and Claim Count Data

- a list of types of data that actuaries use in their analysis: claims/losses, counts, claims closed with payment, claims closed without payment,...

Claim-Related Expenses

- you need to know how the insurer handles expenses before using the data

- see Chapter 16: ALAE and Chapter 17: ULAE

Multiple Currencies

- make sure all your data is in the same currency

Large Claims

- large claims can distort your data and subsequent analysis

- sometimes it's preferable to remove large claims from your basic analysis and deal with them separately

Question: identify considerations in establishing a large claims threshold

number of claims over threshold
size of claim relative to policy limits
size of claim relative to reinsurance limits
credibility of internal data regarding large claims
availability of relevant external data

Here's the exam problem that asked you for this list (part a only): E (2015.Fall #15)
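To get a feel for the first consideration (number of claims over the threshold), here is a minimal Python sketch that counts how many claims exceed each candidate threshold. The claim sizes and candidate thresholds are made-up numbers for illustration only, and the function name is hypothetical.

```python
# ASSUMPTION: made-up claim sizes purely for illustration.
claim_sizes = [5_000, 12_000, 40_000, 75_000, 250_000, 600_000, 1_200_000]


def claims_over_threshold(sizes, threshold):
    """Count claims strictly above the threshold and total their amounts."""
    large = [s for s in sizes if s > threshold]
    return len(large), sum(large)


# ASSUMPTION: candidate thresholds chosen only for the example.
candidates = [100_000, 500_000, 1_000_000]
summary = {t: claims_over_threshold(claim_sizes, t) for t in candidates}
for threshold, (count, total) in summary.items():
    print(f"threshold {threshold:>9,}: {count} claims, {total:,} total")
```

A higher threshold leaves fewer claims to analyze separately, which ties back to the credibility consideration in the list.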

Recoveries

- includes deductibles, salvage & subrogation, reinsurance

- see Chapter 14: Recoveries

Reinsurance

- see Chapter 14: Recoveries

Question: identify 3 possible treatments of ALAE in excess-of-loss reinsurance

included with the claim amount in determining excess of loss coverage (most common treatment)
not included in the coverage
included on a pro rata basis; the ratio of the excess portion of the claim to the total claim amount determines coverage for ALAE
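The 3 treatments are easier to remember with numbers. The sketch below is one reading of the list above, not an official formula: the retention, limit, loss and ALAE amounts are all made up, the function name is hypothetical, and the assumption that the layer limit also caps the combined loss + ALAE amount in the first treatment is mine.

```python
def xol_recovery(loss, alae, retention, limit, treatment):
    """Reinsurance recovery under one of three ALAE treatments (illustrative only)."""
    if treatment == "included":
        # ALAE is added to the loss before applying the retention.
        # ASSUMPTION: the layer limit caps the combined amount.
        return min(limit, max(0.0, loss + alae - retention))
    excess_loss = min(limit, max(0.0, loss - retention))
    if treatment == "excluded":
        # ALAE is not covered at all.
        return excess_loss
    if treatment == "pro_rata":
        # ALAE is shared in the ratio of the excess loss to the total loss.
        alae_share = alae * excess_loss / loss if loss > 0 else 0.0
        return excess_loss + alae_share
    raise ValueError(f"unknown treatment: {treatment}")


# ASSUMPTION: made-up claim in a 500,000 excess of 500,000 layer.
loss, alae, retention, limit = 800_000.0, 100_000.0, 500_000.0, 500_000.0
recoveries = {
    t: xol_recovery(loss, alae, retention, limit, t)
    for t in ("included", "excluded", "pro_rata")
}
print(recoveries)
```

In this example the excess portion of the loss is 300,000 out of 800,000, so the pro rata treatment covers 37.5% of the ALAE.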

Exposure Data

- EP (Earned Premium) is a very common exposure base

- other exposure bases are: ECYs (Earned Car-Years), payroll (common in Workers' Comp), miles driven (auto), square footage (General Liability for corporations),...

Insurer Reporting & Understanding the Data

- the idea here is that you must understand your data before you begin your analysis

Verification of the Data

Question: identify some components of a data review process

reconcile data with financial statements
check consistency of current data against prior data
check that the data look reasonable
check data definitions (Ex: are counts tallied by claim or by claimant – because 1 claim could have multiple claimants)
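As a rough illustration of the first item (reconciling data with financial statements), here is a minimal Python sketch that compares a total from claim-level data against a financial statement figure. Everything in it is an assumption for illustration: the function name, the 0.5% tolerance, and the dollar amounts are not from the source text.

```python
def reconciles(claim_data_total, financial_statement_total, tolerance=0.005):
    """True if the two totals agree within a relative tolerance (assumed 0.5%)."""
    if financial_statement_total == 0:
        return claim_data_total == 0
    diff = abs(claim_data_total - financial_statement_total)
    return diff / abs(financial_statement_total) <= tolerance


# ASSUMPTION: made-up paid amounts by claim and made-up statement totals.
paid_by_claim = [12_500.0, 48_000.0, 3_250.0]
data_total = sum(paid_by_claim)

close_match = reconciles(data_total, 63_800.0)  # differs by 50
mismatch = reconciles(data_total, 70_000.0)     # differs by 6,250
print(data_total, close_match, mismatch)
```

In practice the acceptable tolerance is a judgment call, and any difference would still need to be explained.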

Here's a short quiz on this material...

mini BattleQuiz 3

Organizing the Data

We'll start with some very basic information. This would likely never be asked on the exam but it's something you should know.

Question: what are the 5 key dates for the organization of claim data [Hint: PARAV]

Policy effective dates: beginning and ending dates of the policy term

Accident date: when the accident or event occurred that triggered coverage

Report date: when the claim was reported/recorded in the claim system

Accounting date: defines a group of claims for which liability exists, often Dec 31 of the given year

Valuation date: the date through which transactions are included in the data used by the actuary for the analysis, often Dec 31 of the given year

The first 3 terms above are completely obvious. The accounting date is often referred to as when the books close, whether for the month, quarter, or year. The accounting date and valuation date are usually the same. If the books close on Dec 31, the actuary would normally do the reserve analysis using data through Dec 31 (valuation date of Dec 31).

The rest of this section is stuff you definitely have to know as it does come up on the exam.

Question: identify 4 common ways of aggregating data

CY, AY, PY, RY

→ These terms are used very often and I don't want to write them out in full every time. They are, respectively, Calendar Year, Accident Year, Policy Year, Report Year. Note that Policy Year is also sometimes called Underwriting Year.

define & describe: CY aggregation of data

definition: CY data includes all data with a transaction date within the given year.

use:

- premium and exposure data is usually aggregated by CY

- loss data is not usually aggregated by CY, except possibly for diagnostics

- here's a very useful formula for CYEP, Calendar Year Earned Premium,

CYEP = WP + ( UEP_beg – UEP_end )
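Here's a quick numerical check of the CYEP formula as a Python sketch. The premium amounts are made up for illustration and the function name is hypothetical.

```python
def calendar_year_earned_premium(written_premium, uep_beginning, uep_ending):
    """CYEP = WP + (UEP_beg - UEP_end)."""
    return written_premium + (uep_beginning - uep_ending)


# ASSUMPTION: illustrative amounts only.
cyep = calendar_year_earned_premium(
    written_premium=1_000,
    uep_beginning=400,
    uep_ending=450,
)
print(cyep)
```

Here the unearned premium reserve grew by 50 during the year, so earned premium is 50 less than written premium.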

advantages:

- readily available

- no future development so CY data is recent (data doesn't change after accounting date so it's available immediately)

disadvantages:

- no future development (generally cannot use CY loss data for a reserve analysis)

define & describe: AY aggregation of data

definition: AY data includes all data with an occurrence date within the given year.

use:

- loss data is very commonly aggregated by AY

advantages:

- AY loss aggregation is the accepted norm in the U.S. and Canada (uniformity is good)

- easy to obtain & understand

- industry benchmarks are available

disadvantages:

- mismatch between AY losses and CY premiums or exposures (premiums & exposures are usually aggregated by CY)

- AY losses may contain policies written at different price levels (because an AY may contain policies from different PYs)

- AY losses may contain policies written at different retention levels (because an AY may contain policies from different PYs)

define & describe: PY aggregation of data

definition: PY data includes all data with a policy effective date within the given year.

use:

- good for self-insurers because they have only 1 policy (their own)

advantages:

- perfect match between losses and premiums or exposures

- easier to isolate effects of policy changes from year to year (because all policies in a PY will have the same policy characteristics)

disadvantages:

- extended time frame of 24 months (an accident may occur on day 1 for a policy written on Jan 1, all the way through to day 365 for a policy written on Dec 31)

- harder to isolate effects of catastrophes or court rulings that happen on a specific calendar date (because policies from the previous and current PYs would be affected)
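The 24-month time frame can be checked with a little date arithmetic. This Python sketch assumes 12-month policies and assumes coverage ends the day before the anniversary of the effective date. Both are simplifications for illustration, and the function name is hypothetical.

```python
from datetime import date, timedelta


def policy_year_accident_window(policy_year):
    """Earliest and latest possible accident dates for a PY of 12-month policies."""
    # Policy written Jan 1, accident on day 1.
    earliest = date(policy_year, 1, 1)
    # Policy written Dec 31, accident on the last day of coverage.
    last_effective = date(policy_year, 12, 31)
    # ASSUMPTION: coverage ends the day before the anniversary of the effective date.
    latest = last_effective.replace(year=policy_year + 1) - timedelta(days=1)
    return earliest, latest


earliest, latest = policy_year_accident_window(2021)
span_days = (latest - earliest).days + 1  # inclusive count of days
print(earliest, latest, span_days)
```

For PY 2021 the window runs from Jan 1, 2021 to Dec 30, 2022, which is just shy of 24 full months.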

define & describe: RY aggregation of data

definition: RY data includes all data with a report date within the given year.

use:

- good for claims-made policies like medical malpractice or product liability (policies where coverage is triggered by the reporting of a claim versus the date of the event)

advantages:

- number of claims is fixed at the close of an RY (unlike an AY where the number of claims can still increase due to late-reported claims)

- development patterns are more stable (because number of claims is fixed, and there is only IBNER, no pure IBNR)

disadvantages:

- only measures development on known claims (unreported claims get dumped into the subsequent RY so there's a potential lag in understanding the true extent of an insurer's liabilities)
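To tie the 4 definitions together, here is a small Python sketch showing how a single payment lands in a different year under each aggregation method. The Transaction class, its field names, and the dates are all hypothetical, chosen only to illustrate the definitions above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    policy_effective: date  # start of the policy term
    accident: date          # when the covered event occurred
    report: date            # when the claim was reported
    transaction: date       # when the payment was recorded
    amount: float


def aggregation_years(t):
    """Return the year each aggregation method would assign this transaction to."""
    return {
        "CY": t.transaction.year,
        "AY": t.accident.year,
        "PY": t.policy_effective.year,
        "RY": t.report.year,
    }


# ASSUMPTION: a made-up payment on a made-up claim.
payment = Transaction(
    policy_effective=date(2019, 10, 1),
    accident=date(2020, 3, 15),
    report=date(2021, 1, 10),
    transaction=date(2021, 6, 30),
    amount=2_500.0,
)
years = aggregation_years(payment)
print(years)
```

The same payment belongs to PY 2019, AY 2020, RY 2021 and CY 2021, which is why it matters to know how a data set was aggregated before using it.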

mini BattleQuiz 4

Full BattleQuiz

POP QUIZ ANSWERS

Alice's mega-useful formula #1: total unpaid = Case O/S + IBNR

Alice's mega-useful formula #2: ultimate = paid + total unpaid
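A quick numerical example of the 2 formulas, as a Python sketch. The amounts are made up and the function names are hypothetical.

```python
def total_unpaid(case_outstanding, ibnr):
    """Formula #1: total unpaid = Case O/S + IBNR."""
    return case_outstanding + ibnr


def ultimate(paid, unpaid):
    """Formula #2: ultimate = paid + total unpaid."""
    return paid + unpaid


# ASSUMPTION: illustrative amounts only.
unpaid = total_unpaid(case_outstanding=300, ibnr=100)
ult = ultimate(paid=600, unpaid=unpaid)
print(unpaid, ult)
```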

