By Tim Smithies BSc (Hons) and Dr Ian C Dunican
I have watched the recent explosion of interest in chatbot artificial intelligence (AI), by far the most popular being ‘ChatGPT’, with a good deal of interest and a great deal more scepticism. While scrolling, I’ve come across the useful (e.g. debugging code) [1], the hilarious [2], and the downright scary [3] potential of these tools. With all the hype about ChatGPT specifically, I wanted to check its scientific utility. Could ChatGPT consistently give me scientifically accurate answers to any queries I have? With this in mind, I thought I would give ChatGPT a small test. I gave the AI three commands, each on a topic directly covered by one of the previous three Melius Consulting/Sleep4Performance blogs, and asked it to respond to each with a cited paragraph. The cited part was essential, as I wanted to see if ChatGPT could tell me where it was sourcing its information. Because I asked for a single paragraph, much of the nuance of the points made will naturally be lost (as is the case with most scientific communication), so I want to be able to check the source to verify each point and gain additional insight if I wish.
Command 1: Write a cited paragraph on sleep and tennis performance.
Answer: Sleep is a critical component of athletic performance, and tennis players are no exception. Research has shown that sleep deprivation can negatively affect tennis players’ reaction times, decision-making abilities, and overall performance on the court (Lastella et al., 2015). In a study of adolescent tennis players, those who reported higher levels of daytime sleepiness had lower on-court performance ratings (Lopez-Rodriguez et al., 2017). Additionally, a study of collegiate tennis players found that those who got more sleep had significantly better serve accuracy than those who got less sleep (Mah et al., 2014). These findings highlight the importance of getting adequate sleep for optimal tennis performance.
- Lastella, M., Roach, G. D., Halson, S. L., & Sargent, C. (2015). Sleep/wake behaviours of elite athletes from individual and team sports. European Journal of Sport Science, 15(2), 94-100. https://doi.org/10.1080/17461391.2014.936908
- Lopez-Rodriguez, M., Viciana, J., Fernandez-Santos, J. R., & Martínez-Amat, A. (2017). Relationship between sleep quality, quantity and on-court performance in adolescent tennis players. Journal of Human Kinetics, 60(1), 139-146. https://doi.org/10.1515/hukin-2017-0014
- Mah, C. D., Mah, K. E., Kezirian, E. J., & Dement, W. C. (2014). The effects of sleep extension on the athletic performance of collegiate basketball players. Sleep, 37(5), 947-956. https://doi.org/10.5665/sleep.3694
This is well written and reads like it makes sense, but there are some immediate red flags. I have read two of the three cited papers previously and know that they do not say what ChatGPT says they do. In fact, the word ‘tennis’ does not even appear in the first article, and it appears in the third only when discussing performance measurement, not sleep. I hadn’t come across the second paper, and I soon found that this was because the paper doesn’t exist. The point made regarding serve accuracy has been described previously [4, 5], just not by the article cited. Regarding the other points, it has been found that feeling more rested results in better tennis-specific reaction time [6] and that more fragmented sleep can lead to worse in-match tennis performance [7]; however, I am unaware of any research that explicitly describes the points as ChatGPT states them. Lastly, the DOIs for the first and third papers don’t exist, and the DOI for the second takes us to an article on badminton.
Note: see Sleep & Tennis Performance by Dr Mitchell Turner and Dr Ian C Dunican for a summary of scientific literature on sleep and tennis performance.
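For readers who want to repeat the DOI check above on their own ChatGPT output, here is a minimal Python sketch. It is not the process I followed (I checked by hand), and the function names `extract_dois` and `doi_is_registered` are hypothetical, chosen for illustration. The sketch assumes the doi.org proxy's handle lookup at `https://doi.org/api/handles/` answers with a JSON `responseCode` of 1 for a registered DOI and an HTTP 404 for an unknown one; the second function needs internet access.

```python
import json
import re
import urllib.error
import urllib.parse
import urllib.request

# DOI-like pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is a pragmatic pattern for modern DOIs, not a full specification.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(reference_text):
    """Return every DOI-like string found in a block of reference text."""
    # Trailing punctuation usually belongs to the reference, not the DOI.
    return [match.rstrip(".,;)") for match in DOI_PATTERN.findall(reference_text)]


def doi_is_registered(doi, timeout=10):
    """Return True if doi.org knows this DOI (assumed handle lookup endpoint)."""
    url = "https://doi.org/api/handles/" + urllib.parse.quote(doi)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False  # doi.org has no record of this DOI
        raise  # any other HTTP error is left for the caller to inspect
    return payload.get("responseCode") == 1
```

Pasting the reference list into `extract_dois` and passing each result to `doi_is_registered` would flag DOIs that do not exist. A registered DOI is not proof that the citation is sound, though: as with the second reference above, it can resolve to an entirely different article, so the title and authors still need checking by eye.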
Command 2: Write a cited paragraph on how a fatigue risk management system impacts environmental, social governance (ESG)
Answer: Fatigue risk management systems (FRMS) are increasingly being recognised as important tools for managing the risks associated with worker fatigue in high-risk industries, such as transportation and healthcare. These systems aim to prevent accidents and errors caused by fatigue through the implementation of policies, procedures, and tools designed to monitor and manage employee fatigue. Research has shown that effective FRMS can have a positive impact on both the financial and non-financial performance of organisations, including reducing accident rates, improving employee health and wellbeing, and increasing productivity. From environmental, social governance (ESG) perspective, the implementation of an FRMS can also have positive impacts on the “S” component, which relates to social issues such as labour practices, human rights, and health and safety. By ensuring that workers are not fatigued, organisations can demonstrate their commitment to protecting the health and safety of their employees, which is increasingly important to stakeholders such as customers, investors, and regulators. In addition, an FRMS can help organisations comply with regulatory requirements related to working hours and fatigue, reducing the risk of penalties and reputational damage. (Dawson, 2020).
Maybe recognising the transparency of its citation fabrication strategy, the AI opted for a different citation approach: don’t give the actual reference. Assuming this was Drew Dawson, who has published many articles on fatigue risk management, I looked for a paper that he sole-authored in 2020, but with no luck. On the bright side, the paragraph is actually pretty spot on. Besides suggesting that an FRMS can “ensure that workers are not fatigued”, which is not strictly the purpose of an FRMS, it is a very good response to the command given.
Note: see How Does a Fatigue Risk Management System impact an Environmental Social Governance (ESG)? by Dr Ian C Dunican and myself for our insight into this topic.
Command 3: Write a cited paragraph on how the time that we eat can impact our weight.
Answer: Research suggests that the timing of meals can impact weight. A systematic review and meta-analysis of randomised controlled trials examining the effect of meal timing on weight found that participants who ate later in the day tended to lose less weight compared to those who ate earlier in the day (Garaulet et al., 2013). This could be due to differences in circadian rhythms and the body’s ability to metabolise food at different times of the day. Additionally, a small study of overweight and obese women found that those who ate their main meal of the day earlier in the day lost more weight than those who ate their main meal later in the day (Jakubowicz et al., 2013). This may be because glucose tolerance is better earlier in the day and worse in the evening, leading to better blood sugar control and potentially greater weight loss. Overall, while more research is needed, these studies suggest that paying attention to the timing of meals may be a helpful strategy for weight management.
- Garaulet, M., Gómez-Abellán, P., Alburquerque-Béjar, J. J., Lee, Y. C., Ordovás, J. M., & Scheer, F. A. (2013). Timing of food intake predicts weight loss effectiveness. International Journal of Obesity, 37(4), 604-611.
- Jakubowicz, D., Barnea, M., Wainstein, J., & Froy, O. (2013). High caloric intake at breakfast vs. dinner differentially influences weight loss of overweight and obese women. Obesity, 21(12), 2504-2512.
The first thing that stands out is that the paragraph is referenced with actual articles! The summary of results provided for each cited paper is accurate; however, this time, ChatGPT has struggled a bit with correctly describing the articles it has cited. It incorrectly labelled Garaulet et al. (2013) as a meta-analysis, and it called Jakubowicz et al. (2013), a study with 93 participants, a ‘small’ study. Still, it is promising that, in this instance, the command is answered well and with a somewhat accurate reference to previous research.
Note: see A time to eat, a time to sleep. Maybe time to stop and think!!!! by Dr Mitchell Turner and Dr Ian C Dunican for insight into this topic.
The responses read fantastically and are definitely in the right ballpark. However, I would still be incredibly wary of blindly considering ChatGPT responses as scientifically accurate. It seems, at least, that on broader topics with more literature (e.g. time of meal consumption and weight), it may be more accurate than on topics with less current research (e.g. sleep and tennis performance). While I was not hoping for ground-breaking summaries, I wanted to see if the responses could give me a nice overview and point me in the right direction should I want to learn more. While the summaries were mostly there, the AI’s ability (at least in the way it was asked here) to provide its sources was very poor. Overall, I see lots of value in chatbot AIs offering fast and friendly summaries on broad topics; however, when topics become more niche or scientific accuracy is paramount, I wouldn’t rely on them.
Support from Melius Consulting
At Melius Consulting, all our Consultants and Associates hold undergraduate, postgraduate and PhD qualifications in addition to their business experience. They also have academic appointments at various universities in Australia, collaborate with many international academic institutions and hold scientific advisory positions. This makes our people uniquely qualified and positioned to develop Literature Reviews and White Papers for organisations.
A Literature Review (Meta-Analysis, Systematic, General, Narrative or Brief Review) has many benefits to a business or an organisation, including:
- An assessment of the current state of research.
- Identification of the experts and institutions.
- Identification of areas for further research.
- Determination of methodologies used in past studies.
- A direction for future research or projects.
Similarly, a White Paper may be developed as a report to the organisation on a particular problem statement or complex issue, providing an overview that includes the current condition, target condition, benefits and recommended direction. A White Paper can be used to support leaders in understanding an issue, solving a problem, and/or making a decision. Melius Consulting has experience in doing this for large rail companies such as Metro Trains Melbourne, mining companies such as AngloGold Ashanti Australia, and the military.
References
- [1] King, M., ChatGPT Can Find Errors and Fix Your Code! I ran Some Tests in Python to Verify. Medium, 2023.
- [2] Ptacek, T.H., I’m sorry, I simply cannot be cynical about a technology that can accomplish this. 2022 (02/12/2022).
- [3] Kung, T.H., et al., Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digital Health, 2023. 2(2): p. e0000198. DOI: 10.1371/journal.pdig.0000198.
- [4] Reyner, L.A. and J.A. Horne, Sleep restriction and serving accuracy in performance tennis players, and effects of caffeine. Physiology & Behavior, 2013. 120: p. 93-96. DOI: 10.1016/j.physbeh.2013.07.002.
- [5] Vitale, J.A., et al., Acute Sleep Restriction Affects Sport-Specific But Not Athletic Performance in Junior Tennis Players. International Journal of Sports Physiology and Performance, 2021. 16(8): p. 1154-1159. DOI: 10.1123/ijspp.2020-0390.
- [6] Turner, M., et al., The influence of self-reported total sleep time and sleep quality on physical performance in junior tennis players. International Journal of Racket Sports Science, 2022. 4(1): p. 32-40. DOI: 10.30827/Digibug.77269.
- [7] Turner, M., et al., The Impact of Sleep-Wake Behaviour on Tennis Match Performance in Junior State Grade Tennis Players. Journal of Science in Sport and Exercise, 2022. DOI: 10.1007/s42978-022-00177-x.