Forecasting Age Distribution in a Swimming Program

By Henrik Bech, Copenhagen, Denmark, and Tim Henrich, Ph.D., University of the Incarnate Word, San Antonio, Texas



Background: The model was developed in the Danish Ministry of Finance to forecast the age distribution of the Danish police force and has also been used to forecast the age distribution of all government employees. The principles can be applied to any population and will here be applied to a swimming program.

Introduction: Coaches compile a lot of data in their everyday work, mostly data related directly to the swimmers’ performance in the water but also attendance records. This paper shows how those data, if they are organized in the right way, can be used to forecast the age distribution of your swimming program.

The results of the forecasts can help you plan how you recruit swimmers for the program. They can also help you account for demographic changes in the geographic area you recruit from. They can thus enable you to see problems before they happen, so that you can inform the management before a problem occurs. In this paper the model is used to forecast the age distribution from fictitious data.

There are many examples of swimming clubs where performance varies considerably over time, mainly because large numbers of swimmers quit from one season to the next, leaving the club with an uneven age-group distribution. Reasons for this problem might be that

1) coaches don’t acknowledge that the problem is there, 2) they believe there is nothing they can do about it beyond the usual promotion of the program, or 3) they do not believe that anticipating the problems could help correct them.

When coaches talk about planning, they most often mean planning the training in the pool, and this is often done with a sophisticated and long-standing understanding of training methodology. The same effort is not, however, put into planning the recruiting policy of a program or determining what participation by age might be in the future. While many articles have been written that describe the reasons why young swimmers or other athletes drop out of various programs at different ages, this paper’s focus is on the numerical development.

This project outlines a means of analyzing and forecasting the age distribution of a given population, thus enabling coaches to plan their way around the problem. More specifically, the model shows how to forecast the age distribution of a program or club, forecast the total natural attrition from it, and analyze the age profile of that attrition.

The paper will show how the data can be organized and how to make the model. The relevance of doing this will be discussed in relation to priorities of Danish swim leaders.

The Danish way

Team Nationals is a prestigious event in the Danish meet schedule. There is a women’s and a men’s title, and the event is organized in 4 divisions. The championships run over two weekends, preliminaries and finals, and each time the clubs swim the individual Olympic schedule plus the 400 backstroke, 400 butterfly, 400 breaststroke and 800 freestyle relays. Each club must enter 2 swimmers in each individual event and one team in each relay.

Each swimmer can participate in 3 individual events and 2 relays.

The scoring is not based on the place the swimmer achieves in each event. Each swim is scored using the German points table, where the long course world record equals 1000 points.

The points of the individual swims are added, and the club with the most points wins. The advantage of scoring the meet this way is that each swimmer must do the best that he or she can, because the place in the result list doesn’t matter.

Data – what goes into the model

The most difficult part of these calculations is probably producing the relevant data. The forecasts are calculated from historical data.

First, an expression for the movements within the program is needed. This is obtained by registering the status of each athlete during the period in question, for instance a season or a whole year. The periods must have a one-day overlap to catch all movements.

An athlete can have one of the following statuses:

  1. The athlete stays with the program for the full period, i.e. they were enrolled in the program at the beginning and continued to participate until the end of the period.

  2. The athlete quits during the period, i.e. they were in the program at the beginning of the period but not at the end of it.

  3. The athlete joins during the period, i.e. they were not in the program at the beginning of the period but were in it at the end of it.


There is of course a fourth possibility, namely that the athlete both joins and quits during the period. These movements are not included in the model because they contain no information relevant to this analysis.
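The status rules above can be sketched in a few lines of Python. This is only an illustration: the names `Status` and `classify` are our own, and the sketch assumes that all you record for each athlete is whether they were in the program at the start and at the end of the period.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    STAYED = "stayed"   # in the program at both the start and the end
    QUIT = "quit"       # in the program at the start, gone at the end
    JOINED = "joined"   # not in the program at the start, in it at the end


def classify(in_at_start: bool, in_at_end: bool) -> Optional[Status]:
    """Return the movement status of one athlete for one period.

    Returns None when the athlete was in the program at neither boundary.
    That covers the fourth possibility (joining and quitting within the
    same period), which the model leaves out.
    """
    if in_at_start and in_at_end:
        return Status.STAYED
    if in_at_start:
        return Status.QUIT
    if in_at_end:
        return Status.JOINED
    return None
```

Because only the two boundary dates are compared, a swimmer who joins and quits inside the period never shows up in the counts.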

For the latter two categories, I think it would be appropriate to register where the athletes go: whether they move to another team, move to another level of swimming outside the group being forecast, or leave swimming entirely.

These data should be collected as far back as possible, preferably five years and at least three years, because the calculations are mainly done using averages. The results will be invalid if you have insufficient historical data.

Besides the above, it would be appropriate to collect data about gender, age and which team the athlete belongs to. Running the model at all levels of detail will then be possible.
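One possible way to organize such records is shown below, as a sketch only. The record layout `Movement`, its field names and the helper `count_by_age` are assumptions made for illustration; the same tally can be kept in a spreadsheet with one row per athlete per period.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Movement:
    """One athlete's registered status for one period (hypothetical layout)."""
    season: int
    age: int
    gender: str
    team: str
    status: str  # "stayed", "quit" or "joined"


def count_by_age(records: Iterable[Movement], season: int, status: str,
                 gender: Optional[str] = None,
                 team: Optional[str] = None) -> Dict[int, int]:
    """Count athletes per age for one season and status.

    The optional gender and team filters allow the same tally to be
    produced at a finer level of detail.
    """
    counts: Counter = Counter()
    for record in records:
        if record.season != season or record.status != status:
            continue
        if gender is not None and record.gender != gender:
            continue
        if team is not None and record.team != team:
            continue
        counts[record.age] += 1
    return dict(counts)
```

Calling `count_by_age(records, season, "quit")` for each historical season gives the per-age attrition counts that the averages are built from, and the same call with `"joined"` gives the intake counts.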

Assumptions

The model is based on the assumption that the club is structured as a hierarchy. Natural attrition increases as the swimmers get older. That is why one needs relatively more swimmers at the younger ages than at the older end of the age scale.

It is possible to have movements both up and down in the hierarchy. The movements of primary interest for this model are those going up out of the program and up within the program. The movements out of the program equal the recruitment the program needs to maintain the same size over time.

For bigger clubs, a small part of the swimmers you recruit will probably come from other clubs. These are included in the model like the rest of the intake. Forecasting them separately would not be meaningful, because they are too random.

The model

This part of the paper describes how the model calculates.

The methods described here are used to analyze a fictitious set of data in the section “Analysis of fictitious data”.

Once the data have been produced, the frequency of the natural attrition and the intake frequency of the program have to be calculated. The frequency of the natural attrition is calculated by taking the average natural attrition in the historical data in proportion to the current population. This has to be done for each age group in the population. In the forecast, the attrition frequency is then multiplied by the current number of athletes of each age.
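As a minimal sketch of this calculation, assume the historical attrition is held as one dictionary per period mapping age to the number of swimmers who quit, and the current population as a dictionary mapping age to headcount. The function name and the data layout are our own choices.

```python
from typing import Dict, List


def attrition_frequencies(quits_by_period: List[Dict[int, int]],
                          population: Dict[int, float]) -> Dict[int, float]:
    """Average historical attrition per age, relative to the current population."""
    periods = len(quits_by_period)
    frequencies = {}
    for age, headcount in population.items():
        if periods == 0 or headcount == 0:
            # No history or nobody of this age: no frequency can be estimated.
            frequencies[age] = 0.0
            continue
        average_quits = sum(p.get(age, 0) for p in quits_by_period) / periods
        frequencies[age] = average_quits / headcount
    return frequencies
```

With three made-up seasons in which 2, 4 and 3 ten-year-olds quit and a current group of 30 ten-year-olds, the average attrition is 3 swimmers and the frequency is 3/30, or 10 percent.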

The intake frequency is found by calculating the average intake for each age in the available data. Each age’s average intake is then expressed relatively, in proportion to the total intake.
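The intake frequencies can be sketched the same way. The assumption here is that the historical intake is stored as one dictionary per period mapping age to the number of swimmers who joined; `intake_shares` is a name chosen for this example.

```python
from typing import Dict, List


def intake_shares(joins_by_period: List[Dict[int, int]]) -> Dict[int, float]:
    """Share of the total intake that falls on each age, based on averages."""
    periods = len(joins_by_period)
    ages = sorted({age for period in joins_by_period for age in period})
    if periods == 0:
        return {}
    averages = {
        age: sum(period.get(age, 0) for period in joins_by_period) / periods
        for age in ages
    }
    total = sum(averages.values())
    if total == 0:
        return {age: 0.0 for age in ages}
    # Each share is the age's average intake divided by the total average intake.
    return {age: average / total for age, average in averages.items()}
```

The shares add up to 1, so multiplying them by any total intake distributes that total across the ages in the historical pattern.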

The model calculates 3 times for each year or period using the frequencies described above, which requires 3 columns in a spreadsheet per period.

In the first column of the spreadsheet you put the population that is to be forecast, namely the current age distribution of the program. The current age distribution is calculated by taking the swimmers in your data who have an unchanged status, meaning they have been there for the entire period in question, and adding the swimmers who left during the period.

The next step is to calculate the natural attrition. This is done by multiplying the attrition frequency for each age by the number of swimmers of that age. The remaining swimmers are then summed, and subtracting that sum from the initial sum gives the total attrition. This calculation has to be done every year; in this way you can forecast the age distribution and the total attrition. It is also possible to distribute the total attrition by age, as will be seen later on.

The difference between the two populations that was just calculated is then distributed as the intake of the program, provided you assume that the program keeps the same size over time. If not, you can add to or subtract from this difference to change the size of the program. That is one way to calculate different scenarios. It can also be done by changing the attrition frequencies or the pattern of the intake.
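One period of the calculation described above could look like the following sketch. The function names, the dictionary layout and the `size_change` parameter are assumptions for illustration. The text does not spell out how swimmers move up one age between periods, so the helper `age_one_year` is also an assumption rather than part of the described model.

```python
from typing import Dict, Tuple

Population = Dict[int, float]  # age -> (possibly fractional) number of swimmers


def forecast_period(population: Population,
                    attrition_freq: Dict[int, float],
                    intake_share: Dict[int, float],
                    size_change: float = 0.0
                    ) -> Tuple[Population, Population, Population]:
    """Return the three columns for one period.

    Column 1: the population at the start of the period.
    Column 2: the population after natural attrition.
    Column 3: the population after the intake has been distributed.
    """
    start = dict(population)
    after_attrition = {
        age: n * (1.0 - attrition_freq.get(age, 0.0)) for age, n in start.items()
    }
    total_attrition = sum(start.values()) - sum(after_attrition.values())
    # With size_change = 0 the intake replaces exactly what was lost.
    intake_total = max(total_attrition + size_change, 0.0)
    ages = sorted(set(after_attrition) | set(intake_share))
    after_intake = {
        age: after_attrition.get(age, 0.0) + intake_total * intake_share.get(age, 0.0)
        for age in ages
    }
    return start, after_attrition, after_intake


def age_one_year(population: Population) -> Population:
    """ASSUMPTION: every swimmer is one year older in the next period."""
    return {age + 1: n for age, n in population.items()}
```

To forecast several years, the third column, aged by one year under the assumption above, becomes the first column of the next period. A positive or negative `size_change` gives a growth or shrinkage scenario; the intake is never allowed to go below zero in this sketch.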

Analysis of fictitious data

An age distribution of a swimming program is shown in figure 1. The known age distribution of the program is depicted in the left columns for each age. The objective is to forecast the age distribution two years later. It is assumed that the program has the same number of swimmers all the way through the forecast.

In figure 1, the result of the forecast is depicted in the right columns for each age. In this case, the forecast shows that the number of swimmers aged 8-11 will increase dramatically and the number of older swimmers will decrease.

In figure 2 the same age distribution has been forecasted using another set of intake and attrition frequencies.

Compared to figure 1 the age distribution has shifted slightly to the right. That means that the bulk of the swimmers will be from the age of 10 to 15.

The two situations illustrate how the same population will develop under two different sets of intake and attrition frequencies.

In the example above, the club has the same size all the way through the forecast. The size can of course be changed if one believes that the number of swimmers will change. The way to do it in the model is by reducing or increasing the number of swimmers you distribute in the third column of the model using the intake frequencies.



The age profile of the attrition can be calculated at the end of each period by subtracting the second column from the first in each iteration. It will then be possible to analyze whether the club has above-normal attrition for certain ages. A more thorough, qualitative study can then be initiated as a result of the calculations.
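The subtraction can be sketched as follows, assuming the first two columns of a period are held as dictionaries mapping age to number of swimmers. The function name and the extra share-of-total output are our own additions for illustration.

```python
from typing import Dict, Tuple


def attrition_profile(start: Dict[int, float],
                      after_attrition: Dict[int, float]
                      ) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Return the swimmers lost per age and each age's share of the total loss."""
    lost = {
        age: start[age] - after_attrition.get(age, 0.0) for age in sorted(start)
    }
    total = sum(lost.values())
    share = {age: (n / total if total else 0.0) for age, n in lost.items()}
    return lost, share
```

Ages whose share of the total loss is much larger than their share of the population are the natural candidates for the closer qualitative study.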

Discussion: After going through these rather technical calculations one could ask, why go through all this trouble?

Some coaches will argue that ordinary common sense can give you a feeling of the mobility of your program, so organizing the data and constructing the model is not worthwhile.

I have to disagree for at least two reasons. First, it is very hard to have perfect knowledge about the mobility, as we have seen earlier. Second, common sense alone does not make it possible to assess the consequences for the development of the program. This is the strength of the model: you can change its assumptions any way you like.

This model is a planning tool and can – like any other tool – be put to good or bad use. It has its strengths and weaknesses.

The weakness is that most programs will probably have a very limited amount of data available. This will obviously be the case when data collection has just begun.


One advantage of the methods described above is that the calculations can be done without sophisticated statistical or econometric methods, which means anybody who can use a spreadsheet can do them.

The strength, as I see it, is the possibility of doing “if-then” scenarios. You make assumptions about the mobility and the size of the program over time, and if they hold, then the development will follow a certain path.

That means you can present the politicians with actual analysis that is not based on guesses. You present your assumptions, describe what has been calculated and present the results. When that has been done, the job of the model is done.

The decisions made after that are political. But with realistic and well-thought-out assumptions worked into this model, the risk of making bad political decisions is reduced considerably.


