The Irish Longitudinal Study on Ageing (TILDA) Anonymised Microdata File – Frequently Asked Questions

If you have any queries about The Irish Longitudinal Study on Ageing (TILDA) Anonymised Microdata File (AMF), please read through these frequently asked questions (FAQs), which cover the most common queries received.


Q. What is the scale used as BEHcage: CAGE alcohol scale?

A. The BEHcage scale is the CAGE Questionnaire scale, which is asked in the SCQ part of TILDA. It is used to screen for alcoholism. The questions are:

  • Have you ever felt you needed to cut down on your drinking?
  • Have people annoyed you by criticizing your drinking?
  • Have you ever felt guilty about drinking?
  • Have you ever felt you needed a drink first thing in the morning (eye-opener) to steady your nerves or to get rid of a hangover?

Each ‘yes’ answer adds 1 to the respondent’s score. Respondents who have a missing value (don’t know / refused) for any of the questions are excluded from the final scoring of the scale. There are also a number of missing values for respondents who did not fill out the SCQ booklet. The four original variables used to create the scored variable are SCQcage1, SCQcage2, SCQcage3 and SCQcage4. The study found here outlines how the scoring works for identification of excessive drinking and alcoholism.
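The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the code TILDA used to derive BEHcage: the function name cage_score is hypothetical, and the response codes (1 for yes, 0 for no, None for missing) are assumptions, so please check the actual codes for SCQcage1 to SCQcage4 in the AMF codebook before adapting it.

```python
from typing import Optional, Sequence

YES = 1  # assumed code for "yes"; check the AMF codebook
NO = 0   # assumed code for "no"; check the AMF codebook


def cage_score(answers: Sequence[Optional[int]]) -> Optional[int]:
    """Count the 'yes' answers across the four CAGE items.

    Returns None (missing) if any item is not a valid yes/no answer,
    mirroring the rule that respondents with a missing value on any
    question are excluded from the final score.
    """
    if len(answers) != 4:
        raise ValueError("expected four CAGE items (SCQcage1 to SCQcage4)")
    if any(a not in (YES, NO) for a in answers):
        return None
    return sum(1 for a in answers if a == YES)


# Hypothetical respondents, items in the order SCQcage1..SCQcage4
print(cage_score([1, 0, 1, 0]))     # 2
print(cage_score([1, None, 0, 0]))  # None
```

A respondent who answered all four items therefore scores between 0 and 4, while anyone with a missing item has a missing score.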


Q. 1. I cannot figure out the children living in the household because variables cm010_* are missing.

2. I cannot get the age of children who live outside the household since the variables cs031_* are missing. I have no idea how I can get such information.

A. These variables were removed from the public dataset because they could potentially be used, in conjunction with other variables, to identify participants. We understand that omitting these variables reduces the quality of the dataset in terms of the research that can be carried out, but the risk to data protection has to take priority.


Q. Self-rated vision, hearing, taste and smell have been used twice in the questionnaire as well as the derived variables. I am confused which to use? What is the difference between them?

A. Self-rated vision

PH102 asks respondents to rate their general vision. PH103 asks to rate their vision at a distance and PH104 asks to rate vision close up.

DISvision is identical to PH102 and was only derived to group a number of the disability related questions within the questionnaire (those with DIS as the prefix).

The response options for PH102 and DISvision in the AMF differ slightly from those in the questionnaire. Options 1 to 4 stay the same (excellent, very good, good and fair), while options 5 and 6 have been grouped into a single option, 5, for those who rate their vision as poor or legally blind. This is for anonymity purposes in the publicly available data.
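As a small illustration of this grouping, the mapping from questionnaire codes to AMF codes can be written as below. This is a sketch only: the function name to_amf_vision_code is hypothetical, and it assumes that 5 means poor and 6 means legally blind in the questionnaire, which should be confirmed against the questionnaire itself.

```python
# Labels for the codes as they appear in the AMF (PH102 / DISvision)
AMF_VISION_LABELS = {
    1: "excellent",
    2: "very good",
    3: "good",
    4: "fair",
    5: "poor or legally blind",
}


def to_amf_vision_code(questionnaire_code: int) -> int:
    """Map a questionnaire response (1 to 6) to the grouped AMF code (1 to 5)."""
    if questionnaire_code not in range(1, 7):
        raise ValueError("questionnaire code must be between 1 and 6")
    # Codes 1 to 4 are unchanged; codes 5 and 6 are both stored as 5.
    return min(questionnaire_code, 5)


for code in range(1, 7):
    amf_code = to_amf_vision_code(code)
    print(code, "->", amf_code, AMF_VISION_LABELS[amf_code])
```

Because the two lowest categories share one code, they cannot be separated again in the AMF.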

PH102 or DISvision can be used for general investigation into respondents’ vision, while PH103 and PH104 can be used if you want to investigate anything specific to close or distance vision problems.

Self-rated hearing

PH108 = DIShearing (General self-rated hearing from excellent to poor)

DIShearing and PH108 are identical variables derived with the DIS prefix to group them with other disability related variables. Either can be used for general investigation into respondents’ level of hearing.

PH109 looks at difficulty following conversation with one person while PH110 looks at difficulty following conversation with four people. These would be used for more specific hearing related research.

Self-rated taste/smell

DISsmell is derived from PH112 so that it can be grouped with the other disability related variables.

DIStaste is derived from PH113 so that it can be grouped with the other disability related variables.

For the DISsmell and DIStaste variables, the only difference from their original variables is that the don’t know responses from PH112 and PH113 have been coded as general missing responses (.).


Q. Word list learning is presented from code ph117-ph120, where people are asked to recall a list of words which are presented to them. The values 98 and 99 indicate ‘don’t know’ and ‘refusal to answer’ but there are quite a lot of responses coded as -1. Please advise what this code means as I would suppose that if people did not remember any of the words on the list they would have scored zero. Is this correct?

A. For ph117-ph120, the ‘-1’ score relates to the variable ph116. This variable states whether the respondent had the list of words read out to them by computer or by the interviewer. If the words were read out by the computer, the respondent’s scores were recorded in ph117 and ph118. If the words were read out by the interviewer, the respondent’s scores were recorded in ph119 and ph120. No respondent should have answers recorded in all four variables. A score of ‘-1’ simply denotes that nothing was recorded for the respondent in that variable; it does not mean that no words were recalled.
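One way to work with these variables is to combine each computer/interviewer pair into a single score, treating -1 as "not applicable" rather than as a score. The sketch below is an assumption-laden illustration, not an official TILDA derivation: the function name pick_score is hypothetical, the pairing of ph117 with ph119 and of ph118 with ph120 is assumed, and returning None for the non-response codes 98 and 99 is just one possible choice.

```python
from typing import Optional

NOT_APPLICABLE = -1       # nothing recorded in this variable
NON_RESPONSE = {98, 99}   # don't know / refused


def pick_score(computer_value: int, interviewer_value: int) -> Optional[int]:
    """Combine a computer-read and an interviewer-read variable into one score.

    Returns the recorded score, or None when the score is unusable:
    nothing recorded in either variable, values in both (unexpected),
    or a don't know / refused code.
    """
    recorded = [v for v in (computer_value, interviewer_value)
                if v != NOT_APPLICABLE]
    if len(recorded) != 1:
        return None  # neither or both recorded; inspect these cases by hand
    value = recorded[0]
    return None if value in NON_RESPONSE else value


# Hypothetical respondent whose list was read out by the interviewer:
# ph117 = -1, ph118 = -1, ph119 = 6, ph120 = 7
first = pick_score(-1, 6)    # assumed pair: ph117 / ph119
second = pick_score(-1, 7)   # assumed pair: ph118 / ph120
print(first, second)         # 6 7
```

A genuine score of zero is kept as 0, so it stays distinct from the -1 placeholder.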

Q. In the verbal fluency task data is presented that indicates how many animals were named in one minute. I wonder if data is available which gives a breakdown of the actual animals recalled, in what order and what timeframe within the one minute.

A. Unfortunately, the verbal fluency data contain no breakdown of the actual animals recalled, their order or the timeframe. The interviewers were not asked to record this, so we do not have any such data available. It would have added time to the interview, which we could not afford.


Q. For the individuals who report that they are widowed, I am looking for year of widowhood to construct a variable that reflects the duration of widowhood at baseline. I can't seem to find anything that relates to year the individual lost their spouse/partner. Please advise.

A. We do collect the year the person became widowed, but we do not include it in the AMF as it was deemed too identifiable. The RMF is only available within the TILDA offices, so we are unable to provide this information to researchers who cannot attend in person.


Q. Is it possible to access the Researcher Microdata Files (RMF) for TILDA?

A. TILDA have the RMF for both wave 1 and wave 2 available. All proposals to access the RMF are reviewed individually, and researchers may be given access to only one of the waves depending on their research question. If a researcher wishes to use the RMF, they can email our executive officer, Anya Guiney, with details of their research. Anya will then send on a ‘Proposal Form for Permission to use the TILDA Dataset’. When we have received the completed form and access has been approved by the TILDA Management Team, we will try to link the applicant with an internal researcher based on the research topic. Once this is completed, the applicant can arrange to come into the TILDA offices and access the RMF from a ‘hot desk’. The applicant will have to sign a Data Use Contract to ensure confidential use of the data, and there are strict policies around what information can be taken from the TILDA ‘hot desks’, with all files being automatically scanned for confidential information before being released.