Introduction
Smell dysfunction is defined as a decreased or impaired ability to smell when sniffing (orthonasal smell) or eating (retronasal smell), and mild or even asymptomatic cases are commonly reported. Patients' quality of life is affected, as those with smell dysfunction may have difficulties with cooking, personal hygiene, and social relationships, and may develop emotional problems such as depression. Smell also plays an important role in perceiving everyday dangers such as the presence of gas and chemicals1.
The Internet is the most common means of acquiring information. Patients tend to use it as a cheaper and more easily accessible source of answers to health concerns and suspicions before seeing a doctor, and sometimes even before receiving a definitive diagnosis. Unfortunately, these sources are not always unbiased or of high quality2,3.
YouTube is a free video-sharing website to which approximately 65,000 videos are uploaded each day, with more than 100 million viewers per day and more than 1 billion per month4-6. The videos uploaded to YouTube are not reviewed for either content or quality. It is therefore important to emphasize that inappropriate and biased health-related information on YouTube can lead patients to make irrational decisions7. Studies that have evaluated informative videos on Ear, Nose, and Throat (ENT) topics uploaded to YouTube have found both the content and the quality of the videos to be insufficient8,9. To the best of our knowledge, no study in the literature has evaluated the reliability of the many videos uploaded on smell dysfunction. The aim of this study was therefore to assess, without bias, the quality and reliability of YouTube videos on smell dysfunction and its treatment, as rated by two ENT specialists using three different tools.
Materials and methods
On 20 August 2021, YouTube was searched using the terms "smell dysfunction" and "smell dysfunction treatment." The first 500 videos returned were listed. The video links were recorded because the order of results can change daily as new videos are uploaded. All the videos were evaluated by two ENT specialists, independently and blinded to each other's evaluations. Videos were excluded from the analysis if the content was irrelevant, if there was no sound or only music, if they were not in English, or if they were shorter than one minute.
For each video, the upload date, the upload source, the total numbers of likes, dislikes, comments, and views, and the daily numbers of views and comments were recorded. The video power index was calculated using the formula "[likes/(likes + dislikes)] × 100". The upload source was classified as physicians/universities/professional organizations (source 1), health-related websites (source 2), individual users/patients (source 3), non-physician healthcare personnel (source 4), or television programs (source 5). Taking previous studies as reference, the two ENT specialists separated the videos into two groups according to whether or not the video content was scientifically reliable, proven, correct, and useful10,11.
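The video power index and the per-day metrics described above can be illustrated with a minimal Python sketch. The function names, the handling of videos with no ratings, and the one-day floor on video age are assumptions made for illustration; the study does not state how such edge cases were treated.

```python
from datetime import date
from typing import Optional


def video_power_index(likes: int, dislikes: int) -> Optional[float]:
    """Return likes / (likes + dislikes) * 100, the like ratio as a percentage."""
    total = likes + dislikes
    if total == 0:
        # Assumption: the index is treated as undefined for unrated videos.
        return None
    return likes / total * 100


def per_day(count: int, uploaded: date, searched: date) -> float:
    """Return a cumulative count (views or comments) divided by days online."""
    days_online = (searched - uploaded).days
    # Assumption: count at least one day to avoid division by zero.
    return count / max(days_online, 1)


# Hypothetical video: 64 likes, 2 dislikes, 100 views in the 10 days before the search.
vpi = video_power_index(64, 2)
views_per_day = per_day(100, date(2021, 8, 10), date(2021, 8, 20))
```

A video with no likes and at least one dislike scores 0, and a video with no dislikes and at least one like scores 100, which matches the 0-100 range of the index.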
Group 1 included videos accepted as reliable, with scientifically correct content about smell dysfunction and its treatment (definition, symptoms, epidemiology, diagnosis, and treatment). Group 2 included videos defined as unreliable, with scientifically unproven and non-medical content. Any disagreement between the specialist raters was resolved by consensus, and the video was assigned to the appropriate group.
The DISCERN reliability tool, Global quality scale (GQS), and JAMA scoring system were used as video scoring tools in the evaluations. The original DISCERN is a scale of 15 items used for the evaluation of health information quality. In this study, a 5-point modified DISCERN tool was used to score video reliability, as in previous studies10,12.
The GQS is used to evaluate the general quality of videos. On the 5-point GQS, a score of 1-2 points indicates low quality, 3 points intermediate quality, and 4-5 points high quality13. JAMA is a well-known quality tool that is used to evaluate information obtained from health-related internet sites. It includes the four criteria of authorship, attribution, disclosure, and currency. Each criterion met is scored with 1 point, giving a maximum total of 4 points, which indicates high quality14.
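The GQS bands and the JAMA point count described above can be expressed as a short Python sketch. The function names and the dictionary-based input are illustrative assumptions, not part of either instrument.

```python
from typing import Dict

JAMA_CRITERIA = ("authorship", "attribution", "disclosure", "currency")


def gqs_band(score: int) -> str:
    """Map a GQS score (1-5) to the quality band described in the text."""
    if not 1 <= score <= 5:
        raise ValueError("GQS score must be between 1 and 5")
    if score <= 2:
        return "low"
    if score == 3:
        return "intermediate"
    return "high"


def jama_score(criteria_met: Dict[str, bool]) -> int:
    """Award 1 point per JAMA criterion met, for a total of 0-4."""
    return sum(1 for name in JAMA_CRITERIA if criteria_met.get(name, False))


# Hypothetical video that names its author and is up to date, but cites no
# sources and discloses no conflicts of interest.
example = jama_score({"authorship": True, "currency": True})
```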
Statistical analysis
The data analyses were performed using PASW 18 software (SPSS/IBM, Chicago, IL, USA). Visual (histogram and probability graphs) and analytical (Kolmogorov-Smirnov test) methods were used to determine the conformity of the variables to normal distribution. The results were reported as mean ± standard deviation, or when distributions were skewed, as median (minimum–maximum) values. Categorical variables were stated as number (n) and percentage (%). The significance of the differences between the groups in terms of median values was investigated with the Mann–Whitney U-test. Kruskal–Wallis variance analysis was used for intergroup comparisons of continuous variables (Post hoc: Bonferroni). The Student's t-test was used for the intergroup analysis of continuous variables. Nominal variables were assessed with the Pearson's Chi-square or Fisher's Exact test. Inter-rater agreement was determined using Cohen's kappa score. Interobserver reliability for the three tools was quantified by calculating the intraclass correlation coefficient. A value of p < 0.05 was considered statistically significant.
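As an illustration of the inter-rater agreement statistic, the following is a minimal pure-Python sketch of Cohen's kappa for two raters. The study's analyses were run in PASW 18; this code is not the authors' procedure, and the example labels are invented for demonstration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels for ten videos ("R" = reliable, "U" = unreliable).
rater_1 = ["R", "R", "R", "U", "U", "R", "R", "U", "R", "R"]
rater_2 = ["R", "R", "R", "U", "R", "R", "R", "U", "R", "R"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this invented example the raters agree on 9 of 10 videos (observed agreement 0.90) against a chance agreement of 0.62, so kappa is about 0.74.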
Results
Of the first 500 videos added to the playlist, 189 met the study criteria and were included for evaluation (Fig. 1). Group 1 (reliable information) included 173 videos, and Group 2 (unreliable information), 16 videos. The inter-rater agreement in the formation of the groups was excellent (Kappa coefficient = 0.93, p = 0.0001). The median (minimum–maximum) numbers of likes and dislikes were 64 (0-67,000) and 2 (0-9,300), respectively, in Group 1, and 79 (0-52,000) and 14 (0-2,200) in Group 2 (p = 0.907 and p = 0.08, respectively). The video power index was calculated as 96.9 (0-100) in Group 1, and 89.1 (0-100) in Group 2 (p = 0.005) (Table 1).
Video scoring tools and video characteristics | Reliable information (n = 173) | Unreliable information (n = 16) | p-value |
---|---|---|---|
DISCERN score& | 3 (2-5) | 2 (1-3) | 0.0001*‡ |
GQS score& | 3 (2-5) | 2 (2-3) | 0.0001*‡ |
JAMA score& | 2 (1-4) | 1 (1-2) | 0.0001*‡ |
DISCERN scoreΩ | 3 (2-5) | 2 (2-3) | 0.0001*‡ |
GQS scoreΩ | 3 (2-5) | 2 (1-3) | 0.0001*‡ |
JAMA scoreΩ | 2 (1-4) | 1 (1-2) | 0.0001*‡ |
Views | 5,532 (9-4,417,908) | 18,083 (180-5,710,551) | 0.234* |
Views per day | 11.9 (0.03-46,504) | 9.3 (0.33-2059.34) | 0.789* |
Comments | 13 (0-16,593) | 18 (0-3,705) | 0.521* |
Comments per day | 0.0208 (0-174.66) | 0.0143 (0-5.86) | 0.996* |
Duration of video (min) | 5.27 (1.03-62.2) | 3.33 (1.4-14.44) | 0.037*‡ |
Time on YouTube (days) | 387 (74-4,036) | 1,404 (128-2,860) | 0.011*‡ |
Likes | 64 (0-67,000) | 79 (0-52,000) | 0.907* |
Dislikes | 2 (0-9,300) | 14 (0-2,200) | 0.08* |
Video power index | 96.9 (0-100) | 89.1 (0-100) | 0.005*‡ |
Source of upload, n (%) | | | |
Source 1Π | 80 (42.4%) | 1 (0.5%) | 0.001†‡ |
Source 2Π | 34 (18%) | 3 (1.6%) | |
Source 3Π | 17 (9%) | 7 (3.7%) | |
Source 4Π | 6 (3.2%) | 1 (0.5%) | |
Source 5Π | 36 (19%) | 4 (2.1%) | |
*Mann-Whitney U test.
†Chi-square test.
‡Statistically significant.
ΠPhysicians/universities/professional organizations (source 1), health-related websites (source 2), individual users/patients (source 3), non-physician healthcare personnel (source 4), and television programs (source 5).
&Rater 1.
ΩRater 2.
When the groups were evaluated with GQS scoring, the scores of the videos containing reliable information were found to be higher by both raters. The GQS (First ENT specialist) points were 3 (2-5) in Group 1 and 2 (2-3) in Group 2 (p = 0.0001). The GQS (Second ENT specialist) points were determined to be 3 (2-5) in Group 1, and 2 (1-3) in Group 2 (p = 0.0001). The points given by both raters in the DISCERN and JAMA scoring systems were found to be higher in Group 1 than in Group 2 (p = 0.0001 in all comparisons). The differences were determined to be statistically significant (Table 1). The intraclass correlation coefficients for the GQS, DISCERN, and JAMA scoring systems were determined as 0.96, 0.98, and 1.0, respectively (p = 0.0001 in all comparisons).
The number of videos uploaded by physicians/universities/professional organizations was 80 (42.4%) in Group 1 and 1 (0.5%) in Group 2 (p = 0.001) (Table 1). The videos in Group 1 had richer content in terms of etiology, general information, symptoms, and treatment. The comparisons of the GQS, DISCERN, and JAMA scores according to the source of the videos are shown in Table 2.
Video scoring tools | Physicians/universities/professional organizations | Health-related websites | Individual users/patients | Non-physician healthcare personnel | Television programs | p-value |
---|---|---|---|---|---|---|
DISCERN score& | 4 (2-5) | 3 (1-5) | 3 (2-4) | 3 (2-4) | 3 (1-5) | 0.0001* |
GQS score& | 4 (2-5) | 4 (2-5) | 3 (2-4) | 4 (2-4) | 4 (2-4) | 0.0001* |
JAMA score& | 3 (2-4) | 3 (1-4) | 2 (1-3) | 3 (1-3) | 2 (1-4) | 0.0001* |
DISCERN scoreΩ | 4 (2-5) | 3 (1-5) | 3 (2-4) | 3 (2-4) | 3 (1-5) | 0.0001* |
GQS scoreΩ | 4 (2-5) | 4 (2-5) | 3 (2-4) | 3 (2-4) | 4 (2-4) | 0.0001* |
JAMA scoreΩ | 3 (2-4) | 3 (1-4) | 2 (1-3) | 3 (1-3) | 2 (1-4) | 0.0001* |
*Kruskal–Wallis test.
&Rater 1.
ΩRater 2.
Discussion
YouTube is a video-sharing platform established in California, USA, in 2005. After Google Search, YouTube is the second most visited website worldwide4. 95% of Internet users use YouTube, as it is easily accessed, is free of charge, and offers videos in different languages5. Although the information is easy to access, it can be genuinely difficult to find accurate material. The fundamental concern related to expanding resources such as YouTube is "which source can we trust and is the information sufficient?".
The current search returned a total of 500 videos when the search terms "smell dysfunction" and "smell dysfunction treatment" were used. After the exclusion criteria were applied, 71 (37.6%) of the 189 videos that met the study criteria were found to have been uploaded by individual users or patients, non-physician healthcare personnel, or television programs. The remaining 118 (62.4%) relevant videos were uploaded by physicians/universities/professional organizations or health-related websites. The videos containing unreliable information came mainly from individual users or patients, non-physician healthcare personnel, or television programs. This heterogeneous and uncontrolled information pollution on YouTube has been examined before by Keelan et al.15 and Roshan et al.16 in studies on vaccination and tonsillectomy, respectively. A similar heterogeneity and lack of control in the information available online was shown by Hassona et al.17 in a study of oral cavity cancer, and by Enver et al. in a study of larynx cancer18.
Smell dysfunction is defined as impaired perception of odors. This has a negative impact on quality of life, with effects on personal hygiene and social relationships, and the person may not be able to notice dangerous conditions such as gas or chemicals. If treatment of the disease is delayed, or time is lost with incorrect or invalid treatment, the individual may experience life-threatening situations because they cannot smell. The results of the current study showed that 71 of 189 videos related to smell dysfunction had been uploaded to YouTube by people unqualified in this subject. Of these, 12 (17%) were found to be unreliable, indicating that roughly one in six videos from these sources provides unreliable information.
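The counts quoted in this paragraph can be reproduced from the source-of-upload rows of Table 1. The short Python sketch below does the arithmetic; the dictionary layout and variable names are illustrative assumptions, while the counts are taken directly from Table 1.

```python
# (reliable, unreliable) video counts per upload source, from Table 1.
SOURCE_COUNTS = {
    "physicians/universities/professional organizations": (80, 1),
    "health-related websites": (34, 3),
    "individual users/patients": (17, 7),
    "non-physician healthcare personnel": (6, 1),
    "television programs": (36, 4),
}

# Sources 3-5, which the discussion groups together as non-specialist uploaders.
NON_SPECIALIST = (
    "individual users/patients",
    "non-physician healthcare personnel",
    "television programs",
)

total_videos = sum(r + u for r, u in SOURCE_COUNTS.values())
non_specialist_total = sum(sum(SOURCE_COUNTS[name]) for name in NON_SPECIALIST)
non_specialist_unreliable = sum(SOURCE_COUNTS[name][1] for name in NON_SPECIALIST)

# Percentage of non-specialist uploads rated unreliable.
unreliable_share = non_specialist_unreliable / non_specialist_total * 100

print(total_videos, non_specialist_total, non_specialist_unreliable)
print(round(unreliable_share, 1))
```

The totals come to 189 videos overall, 71 from sources 3-5, and 12 unreliable videos among those 71, or about 16.9%.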
The rates of reliable and unreliable videos in the current study were similar to those in most studies in the literature7-11. However, there are also studies that have reported rates of unreliable videos that were higher than or similar to those of reliable videos15-18. Those studies have also reported that, unfortunately, the total number of views of these videos is higher than that of videos containing reliable information. The most watched and most liked videos must not be assumed to be the most scientifically correct. As the YouTube search engine orders videos according to the number of views, likes, and dislikes, videos containing unreliable information are among the recommended videos. In the ordering of health-related videos, YouTube should prioritize videos prepared by professional healthcare specialists and physicians specialized in the field.
In the current study, the GQS, DISCERN, and JAMA scores given by the two ENT specialists were statistically significantly higher in Group 1. This objectively showed the quality and reliability of the videos containing reliable information. In a previous study that used these three tools to measure the health information quality of videos, reliable videos likewise obtained statistically significantly higher scores than unreliable videos19.
In the current study, the numbers of daily views and likes were similar in the Group 1 and Group 2 videos. However, there are also studies showing that videos containing reliable information receive statistically significantly fewer daily views and likes19. This demonstrates that viewers on the social platform cannot differentiate which videos are more correct and have content of a high scientific level.
This study had some limitations. First, the playlist of the first 500 videos was formed by the two raters at a single point in time. Only the videos available at that time were included for evaluation, but YouTube is a dynamic platform with new videos constantly emerging. Second, as other languages could not be understood, only videos in the international common language of English could be included. A further limitation may be that only the first 500 videos on the subject were evaluated, and these may not have been representative of all the videos. However, this study can be considered of value as the first to have evaluated the reliability of smell dysfunction videos on YouTube using together the three evaluation tools that have been used in previous studies of the quality and reliability of videos.
Conclusion
The results of the evaluations in this study showed that although the majority of videos on YouTube related to smell dysfunction are reliable, the number of unreliable videos is not negligible. Unreliable videos can cause negative outcomes for patients. Therefore, when videos related to smell dysfunction or containing other medical information are accepted onto YouTube, priority should be given to videos that include scientifically proven evidence and are uploaded by specialist professionals and institutions.