Friedland03.Data
Reading: Friedland, J.F., Estimating Unpaid Claims Using Basic Techniques, Casualty Actuarial Society, Third Version, July 2010. The Appendices are excluded.
Chapter 3: Understanding the Types of Data Used in the Estimation of Unpaid Claims
Pop Quiz
Study Tips
There is a lot of good, basic information in this chapter but aside from a few key facts to memorize, it isn't something you can fully absorb on your first pass. You are probably familiar with some of it anyway from your work duties. So don't spend too long here – you can always refer back when necessary. Among the facts you have to memorize are:
- advantages/disadvantages of various data aggregation methods (CY, AY, RY, PY)
BattleTable
Based on past exams, the main things you need to know (in rough order of importance) are:
- fact A...
- fact B...
reference: part (a) | part (b) | part (c) | part (d)
- E (2019.Spring #13): combining data: argue for | combining data: argue against | rate recommendation: provide comment
- E (2017.Spring #14): wrong chapter - move to 7
- E (2016.Fall #17): data for analysis: compare strategies
- E (2016.Spring #15): Friedland05.Triangles | data aggregation: is CY appropriate? | data aggregation: is AY appropriate? | Friedland07.Development
- E (2015.Fall #14): Friedland07.Development | Friedland07.Development
- E (2015.Fall #15): large claim threshold: selection considerations | Friedland09.BornFerg
- E (2015.Fall #16): Friedland05.Triangles | data aggregation: RY versus AY
In Plain English!
Source of Data
This short section is just memorization of the facts below. You sometimes need to draw on this background information to answer certain essay-style questions.
Question: do large insurers normally use their own internal data or rely on external/industry data
- large insurers prefer to use their own data as it's more relevant to their own experience
- large insurers might still use external data as benchmarks in relation to their own data
Question: when might external benchmarks be particularly useful to an insurer [Hint: TET]
- Tail factors
- Expected claims (loss) ratios
- (used in some reserving methods:
- → Chapter 8: Expected Claims Method
- → Chapter 9: Bornhuetter-Ferguson Method
- → Chapter 10: Cape Cod Method)
- Trends (severity & frequency of claims)
Question: in what situations might an insurer want to use external data
- small insurers without credible internal data
- a company with limitations on what data their systems can provide (often the case with a small company)
- entering a new LOB (Line of Business) or geographical region where the company wouldn't have any prior data
Question: identify 1 U.S. source and 1 Canadian source of external data
- U.S.: ISO (Insurance Services Office)
- Canada: IBC (Insurance Bureau of Canada)
Question: identify potential differences in external versus internal data that may reduce the relevance of external data
- insurance product
- insurer operations
- case reserving and case settlement practices
- mix of business
mini BattleQuiz 1
Homogeneity and Credibility
Suppose you have 5 different LOBs: A, B, C, D, E (Lines of Business), as part of your reserve analysis:
- LOB A: 20,000 claims
- LOB B: 100 claims
- LOB C: 20,000 claims
- LOB D: 100 claims
- LOB E: 100 claims
Let's also suppose that LOBs A & B are similar to each other, LOBs C & D are similar to each other, but LOB E is not similar to any of the others.
Question: what is a reasonable way to group these LOBs for a reserve analysis
The key concepts in making that decision are homogeneity and credibility. We would like to maximize both.
- homogeneity refers to how similar claims are within a grouping (a grouping of similar claims is called a homogeneous grouping)
- credibility refers to the statistical significance of a grouping (the more claims in a grouping the more statistically significant it is)
So a reasonable answer to the above question is:
- group LOBs A & B
- - LOB A is homogeneous and has enough claims to be credible (and could be analyzed on its own)
- - LOB B is homogeneous but is not credible (cannot be analyzed on its own)
- - since LOBs A & B are similar, they can be combined and the resulting grouping is credible and still homogeneous
- group LOBs C & D
- - the reasoning is the same as for A & B
- LOB E should be supplemented with external data
- - LOB E is homogeneous but not credible (cannot be analyzed on its own)
- - LOB E is not similar to any other LOBs (so cannot be grouped with them)
- - a valid alternative is to look for credible external data that is similar to LOB E
I said above that we'd like to maximize homogeneity and credibility simultaneously but that isn't always possible. Often, increasing homogeneity of a group of claims means getting rid of some individual claims that aren't like the others. That means the group gets smaller. But a smaller group has less credibility.
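Inconvenient Fact: homogeneity and credibility generally work against each other (splitting data into more homogeneous groups leaves fewer claims, and hence less credibility, in each group)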
Homogeneity and credibility have to be balanced. The overall goal is to produce an accurate reserve analysis and you have to strike a balance in terms of how to group your data.
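The grouping logic from the LOB example can be sketched in a few lines of Python. This is only an illustration: the claim counts and similarity sets come from the example above, but the 1,000-claim credibility threshold and the function name `assess_groupings` are assumptions made up for this sketch (in practice, what counts as credible is an actuarial judgment, not a fixed number).

```python
# Claim counts and similarity sets are taken from the LOB example above.
# ASSUMPTION: the 1,000-claim threshold is hypothetical, chosen only to
# separate the large LOBs from the small ones in this illustration.
CLAIM_COUNTS = {"A": 20_000, "B": 100, "C": 20_000, "D": 100, "E": 100}
SIMILAR_SETS = [{"A", "B"}, {"C", "D"}, {"E"}]
CREDIBILITY_THRESHOLD = 1_000


def assess_groupings(claim_counts, similar_sets, threshold):
    """Return (LOBs, total claims, recommendation) for each homogeneous grouping."""
    results = []
    for lobs in similar_sets:
        # Combining similar LOBs keeps the grouping homogeneous
        # while adding claims (credibility).
        total = sum(claim_counts[lob] for lob in lobs)
        if total >= threshold:
            recommendation = "analyze as one group"
        else:
            recommendation = "supplement with external data"
        results.append((sorted(lobs), total, recommendation))
    return results


groupings = assess_groupings(CLAIM_COUNTS, SIMILAR_SETS, CREDIBILITY_THRESHOLD)
for lobs, total, recommendation in groupings:
    print(" & ".join(lobs), total, recommendation)
```

The sketch only combines LOBs that are already marked as similar, so credibility is never bought at the cost of homogeneity; a grouping that is still too small falls back to external data.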
mini BattleQuiz 2
Types of Data Used by Actuaries
- Claim and Claim Count Data
- Claim-Related Expenses
- Multiple Currencies
- Large Claims
- Recoveries
- Reinsurance
- Exposure Data
- Insurer Reporting & Understanding the Data
- Verification of the Data
mini BattleQuiz 3
Organizing the Data
We'll start with some very basic information. This would likely never be asked on the exam but it's something you should know.
Question: what are the 5 key dates for the organization of claim data [Hint: PARAV]
- Policy effective dates: beginning and ending dates of the policy term
- Accident date: when the accident or event occurred that triggered coverage
- Report date: when the claim was reported/recorded in the claim system
- Accounting date: defines a group of claims for which liability exists, often Dec 31 of the given year
- Valuation date: the date through which transactions are included in the data used by the actuary for the analysis, often Dec 31 of the given year
- The first 3 terms above are completely obvious. The accounting date is often referred to as when the books close, either for the month, quarter, or year. The accounting date and valuation date are usually the same. If the books close on Dec 31, the actuary would normally do the reserve analysis using data through Dec 31 (valuation date of Dec 31).
The rest of this section is stuff you definitely have to know as it does come up on the exam.
Question: identify 4 common ways of aggregating data
- CY, AY, PY, RY
- → The terms are used very often and I don't want to write them out in full every time. They are, respectively, Calendar Year, Accident Year, Policy Year, Report Year. Note that Policy Year is also sometimes called Underwriting Year.
define & describe: CY aggregation of data
- definition: CY data includes all data with a transaction date within the given year.
- use:
- - premium and exposure data is usually aggregated by CY
- - loss data is not usually aggregated by CY, except possibly for diagnostics
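- - here's a very useful formula for CYEP, Calendar Year Earned Premium:
- - CYEP = WP + (UEP_beg – UEP_end), where WP is the premium written during the calendar year and UEP_beg, UEP_end are the unearned premium at the beginning and end of the year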
- advantages:
- - readily available
- - no future development (data doesn't change after accounting date)
- disadvantages:
- - no future development (generally cannot use it for a reserve analysis)
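As a quick check on the CYEP formula given in the CY section above, here is a minimal worked example in Python. The dollar amounts and the function name `calendar_year_earned_premium` are hypothetical, chosen only to show the arithmetic.

```python
def calendar_year_earned_premium(written_premium, uep_begin, uep_end):
    """CYEP = WP + (UEP_beg - UEP_end)."""
    return written_premium + (uep_begin - uep_end)


# Hypothetical figures: 1,000,000 written during the year, unearned premium
# of 400,000 at the start of the year and 450,000 at the end.
cyep = calendar_year_earned_premium(1_000_000, 400_000, 450_000)
print(cyep)  # 950000
```

Because the unearned premium grew by 50,000 over the year, earned premium comes out 50,000 below written premium.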
define & describe: AY aggregation of data
- definition: AY data includes all data with an occurrence date within the given year.
- use:
- - loss data is very commonly aggregated by AY
- advantages:
- - AY loss aggregation is the accepted norm in the U.S. and Canada (uniformity is good)
- - easy to obtain & understand
- - industry benchmarks are available
- disadvantages:
- - mismatch between AY losses and CY premiums or exposures (premiums & exposures are usually aggregated by CY)
- - AY losses may come from policies written at different price levels (because an AY may contain policies from different PYs)
- - AY losses may come from policies written at different retention levels (because an AY may contain policies from different PYs)
define & describe: PY aggregation of data
- definition: PY data includes all data with a policy effective date within the given year.
- use:
- - great for self-insurers because they have only 1 policy (their own)
- advantages:
- - perfect match between losses and premiums or exposures
- - easier to isolate effects of policy changes from year to year (because all policies in a PY will have the same policy characteristics)
- disadvantages:
- - extended time frame of 24 months for annual policies (an accident may occur as early as day 1 of a policy written on Jan 1, or as late as day 365 of a policy written on Dec 31)
- - harder to isolate effects of catastrophes or court rulings that happen on a specific calendar date (because policies from the previous and current PYs would be affected)
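The 24-month time frame for a policy year can be verified with a short date calculation. This sketch assumes annual policies and a policy year that coincides with the calendar year; the function name `policy_year_exposure_window` is made up for illustration.

```python
from datetime import date


def policy_year_exposure_window(policy_year):
    """Return the first and (approximate) last dates on which an accident
    could occur for policies written in the given policy year."""
    first_effective = date(policy_year, 1, 1)
    last_effective = date(policy_year, 12, 31)
    # ASSUMPTION: annual policies. The policy written on Dec 31 is treated as
    # providing coverage through Dec 31 of the following year.
    last_expiry = last_effective.replace(year=policy_year + 1)
    return first_effective, last_expiry


window_start, window_end = policy_year_exposure_window(2021)
# Count calendar months touched by the window, inclusive of both ends.
months = (
    (window_end.year - window_start.year) * 12
    + (window_end.month - window_start.month)
    + 1
)
print(window_start, window_end, months)
```

For PY 2021 the window runs from Jan 1, 2021 to Dec 31, 2022, which is the 24 months mentioned above, compared with 12 months for an accident year.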
define & describe: RY aggregation of data
- definition: RY data includes all data with a report date within the given year.
- use:
- - great for claims-made policies like medical malpractice or product liability (policies where coverage is triggered by the reporting of a claim versus the date of the event)
- advantages:
- - number of claims is fixed at the close of a RY (unlike an AY where the number of claims can still increase due to late-reported claims)
- - development patterns are more stable (because number of claims is fixed, and there is only IBNER, no pure IBNR)
- disadvantages:
- - only measures development on known claims (unreported claims get dumped into the subsequent RY so there's a potential lag in understanding the true extent of an insurer's liabilities)
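To see how the four aggregation methods differ, it can help to take a single claim payment and work out which year it falls into under each method. The sketch below is illustrative only: the `ClaimTransaction` class, its field names, and the dates are all hypothetical, and each "year" is assumed to run from Jan 1 to Dec 31.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ClaimTransaction:
    """One payment on a claim, with the dates needed for aggregation."""
    policy_effective: date  # start of the policy term
    accident: date          # when the covered event occurred
    report: date            # when the claim was reported
    transaction: date       # when this payment was recorded
    paid: float


def aggregation_years(txn):
    """Return the year this transaction is assigned to under each method."""
    return {
        "CY": txn.transaction.year,       # by transaction date
        "AY": txn.accident.year,          # by occurrence date
        "PY": txn.policy_effective.year,  # by policy effective date
        "RY": txn.report.year,            # by report date
    }


# Hypothetical claim: policy effective Oct 2018, accident Mar 2019,
# reported Jan 2020, payment made Jun 2021.
example = ClaimTransaction(
    policy_effective=date(2018, 10, 1),
    accident=date(2019, 3, 15),
    report=date(2020, 1, 20),
    transaction=date(2021, 6, 30),
    paid=5_000.0,
)
years = aggregation_years(example)
print(years)
```

The same 5,000 payment lands in four different years depending on the method, which is why AY losses and CY premiums don't line up exactly, and why a late-reported claim adds to an old AY but only ever shows up in the RY in which it was reported.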
mini BattleQuiz 4