Dietary intake (DI) may be assessed through several methods, but some of them are complex and laborious. Population studies require methods that are simple and quick as well as reliable. The Food Frequency Questionnaire (FFQ) has been commonly used as a practical and efficient method for assessing diet over periods of time in large-scale dietary surveys.1 It has also been widely used in epidemiological studies given its low cost and usefulness for determining food and nutrient intake. Additionally, the FFQ allows the identification of changes in food intake over time. This is particularly relevant for national surveys, where monitoring plays an essential role.2
The Mexican National Health and Nutrition Survey, 2012 (Ensanut 2012 by its Spanish acronym, from Encuesta Nacional de Salud y Nutrición, 2012) employed a Semi-Quantitative Food Frequency Questionnaire (SFFQ) to collect dietary data from the Mexican population. Analysis of dietary data helps characterize the dietary status of the Mexican population, and allows for monitoring dietary trends, formulating recommendations for improving food and nutrient intake, and designing evidence-based interventions as well as public health policies.3,4 The quality of dietary data is crucial, and documented methods and processes for estimating food, energy and nutrient intake as well as for cleaning data are of primary importance. They provide technical information towards establishing a benchmark for reliable and valid DI data. Therefore, the aim of the present article was to describe the methodology used to estimate and clean DI data derived from the SFFQ used by Ensanut 2012.
Materials and methods
Design and study population
We analyzed the dietary data collected through the SFFQ of Ensanut 2012 (SFFQ-2012). Ensanut 2012 is a probabilistic survey stratified by cluster, and representative at the national, regional and urban/rural levels. For determining sample size, a design effect of 1.836 was considered, allowing for the estimation of the following expected prevalences for the different population age groups: 2% for adults (>20 years), 3% for adolescents (12-19 years) and 4% for pre-school (1-4 years) and school (5-11 years) children, with a 95% confidence interval and response rates of 0.81 in pre-school and school children and 0.78 in adolescents and adults. Ensanut 2012 obtained information from 50 528 households, and its SFFQ was applied to a random population subsample made up of one out of every six subjects per age group: pre-school children (1-4 y), school children (5-11 y), adolescents (12-19 y) and adults (>20 y). Survey sampling and methodology procedures have been described elsewhere.5,6
The Ensanut 2012 protocol and survey tools were approved by the Ethics Committee of the National Institute of Public Health in Mexico (INSP by its Spanish acronym, from Instituto Nacional de Salud Pública). Informed consent was obtained from all study subjects.
SFFQ-2012: food items and portion sizes
The SFFQ-2012 was based on the previous SFFQ used by Ensanut 2006 (SFFQ-2006).7 Designed to characterize the diets of the Mexican population by age group, the SFFQ-2006 collected dietary data over the last seven days to obtain current dietary data for the population. Its foods were selected according to their contribution to total energy intake and to the intake of 11 nutrients, as estimated from the 24-hour recall (24-HR) of the National Nutrition Survey 1999 (ENN 99, by the Spanish acronym for Encuesta Nacional de Nutrición, 1999). The foods that contributed ≥90% of the total energy and nutrient intake were included in the SFFQ-2006 list. The questionnaire was also complemented with food items identified as inhibitors or facilitators of iron and zinc absorption. The portion sizes used in the 2006 and 2012 SFFQs were divided into (1) a standard portion size (the principal portion size specified in the food list) based on the median food intake by population age group from the 24-HR of the ENN 99; and (2) alternative (standardized) portion sizes ranging from very small to extra-large. All the portion sizes for each food item were expressed in standardized home measurements.
In addition, the SFFQ-2012 incorporated 39 food items not included in the SFFQ-2006, given their relation to the high prevalence of non-communicable diseases across all age groups. The new food items were selected by a group of experts in nutrition from the Center for Nutrition and Health Research in the National Institute of Public Health in Mexico. Commonly consumed foods and dishes high in calories, sodium, fats and sugars were added (Table I). With 140 food items classified into 14 general groups (Table II), the final SFFQ-2012 was validated for estimating energy and nutrient intake (further details are published elsewhere).8
SFFQ application
The SFFQ-2012 was administered by trained health personnel (i.e., nutritionists, nurses and physicians) using laptop computers (Hewlett Packard 435) with specifically designed Ensanut 2012 software (Visual Fox Pro program, v.7) for data entry. Interviewers asked study subjects to recall all foods (and portions) consumed in the seven days prior to the interview. The SFFQ standard portion sizes specified by interviewers were based on the average weight value assigned to each food item per age group. The subjects who reported not having consumed the standard portion size were asked to select an alternative portion size, ranging from very small to extra-large. They were then asked to specify the number of days and the number of times per day they consumed the food item in question during the seven days prior to the interview, as well as the number of portions consumed on each occasion. For estimations, the number of days was multiplied by the number of times per day that the food item was consumed in the last seven days. Exclusively in the case of children aged from 1 to 11, the questionnaire was administered to their mothers or caretakers; for adolescents and adults (≥12 years of age) it was applied directly. Figure 1 shows SFFQ-2012 software and its use.
Food composition database
We constructed a database to estimate the energy and nutrient intake data collected through the SFFQ-2012. It was based on a more general food composition database (GDFC) containing 1 600 foods which had been compiled and updated that same year by the Center for Nutrition and Health Research of the INSP.9 To construct our database, we followed a series of steps and criteria:
Step 1. We identified and selected the GDFC foods that represented each food item on the SFFQ-2012 food list. For example, in the case of the "banana" food item, we selected all GDFC banana types: dominico, manzano, macho, red and tabasco. For the mature and fresh cheese item, we selected all GDFC cheeses and classified them within the fresh and mature categories according to the corresponding Official Mexican Regulation.10
Step 2. We complemented the food items in Step 1 with those reported under the Ensanut 2012 24-HR, taking into account information from all the population age groups (pre-school and school children, adolescents, adults and older adults).
Step 3. For 40% of the SFFQ list, we estimated the composition of each food item, and computed the average energy, fiber and nutrient content (per 100 g) taking into account the matching GDFC foods. For the remaining 60%, we weighted the average energy and nutrient content based on the Ensanut 2012 24-HR data.
Step 4. In the case of the newly incorporated dish items, we estimated the energy and nutrient content (per 100 g of preparation) according to their raw ingredients.
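The difference between the simple and the weighted averages described in Step 3 can be sketched as follows. This is a minimal illustration rather than the procedure actually implemented for the survey: the function names and all numeric values are hypothetical, and we assume that the weighting variable is the amount of each matching food reported in the 24-HR, which the text does not specify.

```python
def simple_mean(values):
    """Unweighted mean of per-100 g contents across the matching GDFC foods."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Mean of per-100 g contents, weighted by reported consumption."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical energy contents (kcal per 100 g) of three foods matched to one SFFQ item
energy_kcal = [90.0, 100.0, 120.0]
# Hypothetical grams of each food reported in the 24-HR (assumed weighting variable)
grams_24hr = [600.0, 300.0, 100.0]

unweighted = simple_mean(energy_kcal)
weighted = weighted_mean(energy_kcal, grams_24hr)
```

With weighting, the most frequently consumed variety dominates the item's composition, so the value assigned to the item moves toward the foods that respondents actually report eating.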
Nutrient estimates
Using the food composition database and the weights of the portion sizes described above, we estimated the average daily intake of energy, fiber, macronutrients and micronutrients of the population (Figure 2). Net grams of food consumed were determined considering the density factor for beverages and the edible portion factor for fruits, vegetables and meats. Daily dietary intake per person was calculated using Access (Microsoft Office, 2003), Visual Basic code and SQL queries, and Stata v.13 software. The SFFQ-2012 also asked about consumption of food supplements and complements; from these responses we constructed a qualitative variable that allowed us to identify who had consumed food supplements/complements in the last seven days. Figure 2 shows the process followed for computing the average daily intake obtained through the SFFQ-2012.
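The chain of calculations from reported frequency to daily energy intake can be sketched as below. This is an illustrative Python sketch, not the Access, Visual Basic, SQL or Stata code used for the survey. The class, field and function names are hypothetical, the example values are invented, and dividing by seven to obtain a daily average is our assumption based on the seven-day reference period.

```python
from dataclasses import dataclass

RECALL_DAYS = 7  # reference period of the SFFQ-2012


@dataclass
class FoodReport:
    days: int              # days the item was consumed in the last seven days
    times_per_day: int     # occasions per consumption day
    portions: float        # portions consumed per occasion
    portion_weight: float  # grams (or mL for beverages) per portion
    net_factor: float      # edible portion or density factor (hypothetical values)
    kcal_per_100g: float   # energy content from the composition database


def daily_net_grams(r):
    """Average net grams per day for one reported food item."""
    occasions = r.days * r.times_per_day
    gross_amount = occasions * r.portions * r.portion_weight
    return gross_amount * r.net_factor / RECALL_DAYS


def daily_energy(reports):
    """Sum of daily energy (kcal) over all reported items."""
    return sum(daily_net_grams(r) * r.kcal_per_100g / 100.0 for r in reports)


reports = [
    FoodReport(days=7, times_per_day=2, portions=1.0, portion_weight=30.0,
               net_factor=1.0, kcal_per_100g=220.0),
    FoodReport(days=3, times_per_day=1, portions=1.0, portion_weight=140.0,
               net_factor=0.65, kcal_per_100g=90.0),
]
total_kcal = daily_energy(reports)
```

The same structure applies to any nutrient: only the per-100 g content changes, while the net grams per day are computed once per food item.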
Finally, we generated DI data for 133 nutrients: energy, water, fiber, five macronutrients, 12 minerals, 26 vitamins, 66 types of fatty acids, 19 amino acids and other nutrients such as phytic acid and tannins. We also estimated the proportion of the population that reported consuming food complements/supplements: <11% of adults, 16.7% of older adults, 6.3% of adolescents, 8.8% of pre-school and 8.3% of school children.
Cleaning DI data
We prioritized the cleaning of DI data with regard to energy, fiber and the 13 nutrients identified as most relevant in the area of public health according to evidence from Mexico. These consisted of five macronutrients (carbohydrates, proteins, total fats, saturated fats and polyunsaturated fats), five vitamins (vitamin A, vitamin C, folates, vitamin D and vitamin B12), and three minerals (iron, zinc and calcium). Cleaning was performed in two stages. We first cleaned data in grams per food item consumed per study subject. At this level, it is important to clarify that there were no individuals with missing data. Those who had consumed one or more foods above three standard deviations (>3 SD) were excluded from analysis (Figure 3).11 We also evaluated the biological plausibility of food intake and the percentage contribution of each food to total DI. Additionally, the data identified as high were corroborated. Extremely low values were considered valid.
Figure 3 footnotes. Subjects excluded for having one or more implausible values in energy, fiber and nutrients:
* energy=49, macronutrients=8, vitamins=62 and minerals=23
‡ energy=49, macronutrients=6, vitamins=1 and minerals=3
§ energy=121, macronutrients=9 and vitamins=22
# energy=99, macronutrients=12 and vitamins=37
& energy=42, macronutrients=6 and vitamins=1
In a second stage, we cleaned daily intake regarding energy, macronutrients, micronutrients and fiber. To clean data at the upper extreme values of energy intake, we estimated the ratio of energy intake/estimated energy requirement (EER). As a reference, we used two sets of equations from the Institute of Medicine (IOM).12 With regard to body mass maintenance, we used specific equations for the population with obesity, overweight and normal nutritional status by age group. For children <3 years of age, only one type of equation was applied. For study subjects without weight and height information, values were imputed according to data drawn from the same survey population of the same age and sex. Only data for 5.4% of subjects from the total SFFQ-2012 population were imputed. The physical activity level of each subject was assigned on the basis of several studies of ENN-99 data.13 We assigned the physical activity factor as follows: a light physical activity factor was assigned to pre-school and school children as well as adolescent and adult males, whereas a sedentary activity value was assigned to adolescent and adult females. In relation to nutritional status, physical activity factors of 1.13 and 1.16 were assigned to non-obese boys and girls ages 3 to 18, respectively, whereas factors of 1.12 and 1.18 were assigned to obese boys and girls of the same age, respectively. For adults ≥19 years, factors of 1.11 and 1.0 were assigned to men and women, respectively. Physical activity levels were not assigned to children <3 years.12
To clean data at the lower extreme values of energy intake, we excluded subjects with energy intake/basal metabolic rate (BMR) ratios below 0.5. We estimated the BMR for adults (≥19 years of age) according to the Mifflin-St Jeor equations14 for overweight and obese populations by gender. For subjects <19 years, we used the Food and Agriculture Organization (FAO) equations15 according to age and gender.
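The lower-bound check for adults can be sketched as follows. The sketch uses the standard published form of the Mifflin-St Jeor equation (weight in kg, height in cm, age in years); the function names and example subjects are hypothetical. It covers adults only, since the FAO equations applied to subjects under 19 years are not reproduced here.

```python
MIN_RATIO = 0.5  # energy intake/BMR cutoff stated in the text


def mifflin_st_jeor_bmr(weight_kg, height_cm, age_years, sex):
    """Basal metabolic rate (kcal/day) from the Mifflin-St Jeor equation."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")


def is_implausibly_low(energy_kcal, bmr_kcal, cutoff=MIN_RATIO):
    """True when reported energy intake falls below cutoff x BMR."""
    return energy_kcal / bmr_kcal < cutoff


# Hypothetical adult woman: 70 kg, 160 cm, 40 years
bmr_woman = mifflin_st_jeor_bmr(70.0, 160.0, 40, "female")
low_report = is_implausibly_low(600.0, bmr_woman)   # very low reported intake
ok_report = is_implausibly_low(900.0, bmr_woman)    # higher reported intake
```

A fixed ratio to BMR scales the cutoff to each subject's body size, age and sex, rather than applying a single absolute energy threshold to everyone.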
Cleaning of nutrient data was performed by age group using the following ratios: intake/estimated average requirement (EAR) for micronutrients16-18 and proteins;12 intake/adequate intake (AI) for fiber;12 and intake/acceptable macronutrient distribution range (AMDR).12 For carbohydrates and total fats, average AMDR values of 55% and 30%, respectively, were considered to be adequate.
We cleaned iron data for all age groups using the specific recommendations for the Mexican population,19 as they take into account the low bioavailability of iron in the diet (<10%). Intake of phytates, the main inhibitor of iron absorption, is high in Mexico. Additionally, the Mexican diet is low in heme iron (<5% of total iron), which has a higher absorption rate than non-heme iron.20 In light of the foregoing, iron intake recommendations for the Mexican population are higher than those issued, for instance, for the US population by the IOM, which assumes an iron bioavailability of 18% and a heme intake of ≈ 10% of total iron.16 For adults >60, we used the WHO nutrient recommendations21 given their comparability with Ensanut 2006 intake results, as well as their similarity to the IOM recommendations, except as regards folates (400 μg/d recommended by the WHO and 320 μg/d recommended by the IOM).
All respondents with ratios above +3 SD were excluded. All DI cleaning was done using Stata v.13.1 (StataCorp. 2011, College Station, TX: Stata Press).
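The +3 SD rule can be sketched as below for a single nutrient within a single age group. This is an illustrative Python sketch and not the Stata code used for the survey. The function name and the example ratios are hypothetical, and we assume that the cutoff is the group mean plus three sample standard deviations computed on the untransformed ratios, which the text does not specify.

```python
from statistics import mean, stdev


def flag_above_sd(ratios, n_sd=3.0):
    """Return one boolean per ratio: True when it exceeds mean + n_sd * SD."""
    cutoff = mean(ratios) + n_sd * stdev(ratios)
    return [r > cutoff for r in ratios]


# Hypothetical intake/EAR ratios for 20 subjects in one age group
ratios = [0.9, 1.0, 1.1] * 6 + [1.0, 5.0]
flags = flag_above_sd(ratios)
excluded = [r for r, f in zip(ratios, flags) if f]
```

Only the upper tail is flagged, in line with the treatment of extremely low values as valid at the food level. Note that in very small groups a single extreme value inflates the SD enough that it may not exceed the cutoff, so the rule is best applied to reasonably large groups.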
Finally, we obtained plausible DI data for 7 214 of the 7 810 SFFQ-2012 study subjects. Between 4.8% and 9.6% of the population in the different age groups were excluded for lack of plausible information (Figure 3). Final estimated daily intake did not include energy or nutrients from food complements/supplements consumed by <11% of adults, 16.7% of older adults, 6.3% of adolescents, 8.3% of school children and 8.8% of pre-school children.
Discussion
We documented the procedures used to estimate the energy and nutrient intake data collected through the SFFQ-2012, as well as the criteria underlying the corresponding data cleaning process. The SFFQ-2012 is similar to the questionnaire developed by Block22 and modified by Willett.23 It has been used for the U.S. population and also in the Ensanut 2006 for the Mexican population. In the latter case, it was designed to evaluate total DI.7
DI evaluation is complex and always represents a challenge given the broad range of random and systematic errors to which it is subject.
An aspect to be considered in processing the SFFQ is the food composition database for estimating energy and nutrient intake. Subar and colleagues24 recommend using means rather than medians for the nutrient content of SFFQ food items, and obtaining portion size information through 24-HR. We used means accordingly, and considered the food intake from different population groups based on consumption weights (classified by age group) instead of portion sizes, because this information was already available. This was useful for representing the contribution of each food to total energy and nutrient intake for each food item.
While the SFFQ-2012 drew its standard portion sizes from the ENN 1999 24-HR,24,25 as indicated in the first column of the instrument, it offered subjects the possibility of selecting alternative portion sizes ranging from very small to extra-large. All portion sizes for food items were expressed in home measurements; being standardized, they have not varied over time.
At the international level, various studies have demonstrated the limited capacity of SFFQs to estimate DI given the large number of potential measurement errors,26 whereas others have concluded that their data are reasonable.27,28 It is worth noting that the structure and content of SFFQs can vary substantially, such that the findings and limitations of one cannot always be transferred to another FFQ.
Tuker and colleagues29 documented that one of the greatest sources of systematic error in DI estimation may stem from the way in which foods and preparations are considered in FFQs, often occurring when preparations are incorrectly described or planned. For example, omitting oil as one of the ingredients of a commonly consumed dish is one source of energy underestimation. The SFFQ-2012 considers whether fats and oils are consumed individually or incorporated into foods, thereby allowing estimates to distinguish between fried and unfried foods. Likewise, the questionnaire explores whether sugars are added to drinks and foods.
Energy and nutrient estimates took into account key factors such as whether the food was raw or cooked. They also considered factors for estimates in net grams, bearing in mind the density factor for beverages and the edible portion factor for foods. This improved the precision of DI data generated from the SFFQ-2012.9
Another strength of the SFFQ-2012 concerns its national representativeness, which allows for extrapolating the results to the Mexican population. One of the methodological limitations to take into account, though, was the effect of using surrogate respondents for children ages 1 to 11. In this respect, it should be noted that the questionnaire was administered to caretakers in the presence of the children, thus allowing them to complement the information.30 Another limitation is that estimated DI does not include the energy and nutrients consumed from food complements/supplements, as reported by 6.3 to 16.7% of subjects. Thus, the nutrient intake of this proportion of subjects may be underestimated to a certain degree. It is important to mention, though, that those who reported consuming supplements indicated a higher mean energy and nutrient intake than those who did not (data not shown). A final aspect to consider is that intake of vitamins and minerals was not adjusted for loss factors during cooking and may therefore be overestimated.
For cleaning data at the extreme low values, we used BMR as a reference; in turn, we estimated BMR for the adult population based on the Mifflin-St Jeor equation. A systematic review14 found this equation to be more reliable for predicting BMR in obese and non-obese individuals than any other equation, including those proposed by Harris-Benedict, Owen and the WHO/FAO/UNU, as it offers a narrower range of error.
We consider that final daily intake data were reliable because <10% of the subjects from the total survey sample were excluded during the cleaning process. The highest losses occurred among pre-school children and older adults. This may have resulted from surrogate reporting.
In conclusion, the methodology for estimating daily intake in the Mexican population based on the SFFQ-2012 was documented. This is of vital importance for health and nutrition surveys in order to ensure reproducibility, and allows assessment of the reliability and validity of the findings. The latter will be of use as a guide for planning and evaluating interventions, programs and policies at the national level.