Combining GSR data and music to reduce stress

PREMISE

Exploring ways to create an effective stress management method using GSR data and music

SYNOPSIS

Stress is undoubtedly a pandemic by itself. With so many stressors beyond our control, the only thing we can do is focus on what we do control – our mind and body. There are many proven techniques to manage stress, such as exercise or guided mindfulness practice; however, most have limitations and are not easily accessible to everyone. I wanted to find an approach that would not only be highly accessible and fun to use but also evidence-based. I chose to explore how a GSR biosensor can be used to measure people’s emotional response to relaxing music and recommend songs that further enhance stress reduction.

PREFACE

Nobody is immune to experiencing stress, regardless of age, sex or race. However, a survey by the American Psychological Association revealed that Gen Z (15–21-year-olds) and Millennials (22–39-year-olds) in the US reported stress levels higher than average and higher than any other generation (APA, 2018). Similarly, a study in the Netherlands found young people to be the population group most at risk of burnout in the past year (82%), a sharp increase from the previous year (NCPSB, 2022). While the main reason identified for such a spike was Covid-19-related stressors such as job insecurity, financial problems, school-related issues and affected relationships (Graupensperger, Cadigan, Einberger, & Lee, 2021), these and similar issues had been weighing down young people for years before the pandemic. In the existing research, one of the most frequently mentioned causes of stress in young people is academic.

When stress is not managed, it can wreak havoc on a wide range of areas of an individual’s life. According to Hazen et al. (2011), as a response to stress, young people commonly experience anxiety, moodiness and irritability, as well as cognitive issues such as difficulty concentrating, which has a negative impact on students’ academic performance (OECD, 2017). Over time, unmanaged stress paves the way for more serious mental health issues such as depression (Kessler, 2012). Physical health is also at risk: research has shown that stressed people are less likely to exercise and more likely to overeat (Dallman et al., 1993; Stults-Kolehmainen & Sinha, 2013).

One of the most common ways students attempt to manage stress is by listening to music (54.8%); others watch internet videos, talk to friends or even indulge in substance abuse (Mahani & Panchal, 2019). Although there is no data on how popular stress-management apps are among young people, a study found that roughly half of them were not evidence-based and were likely not effective anyway (Coulon et al., 2016). Given the current state of stress and its highly detrimental consequences for young people, there must be a solution that is easily accessible to everyone, evidence-based and engaging. With currently available technological affordances such as wearable biosensors, and an abundance of stress management research from the past decades, there is an opportunity to create a data-driven solution to this problem.

PROTOTYPING

Building a song database

To begin building the prototype, I first needed a database of many different, potentially relaxing songs. To achieve that, I created a Spotify playlist and added 50 songs that I found in playlists curated by Spotify (e.g. “Calming Classical”, “Deep House Relax”, “lofi beats”, “Stress Relief”). Then, using Spotistats, I exported the playlist data with all the song characteristics, such as tempo, genre, valence, acousticness and instrumentalness. After some data cleaning with Python, this is what it looked like:
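The cleaning step can be sketched with pandas. This is a minimal version, assuming the Spotistats export is a CSV with a "Title" column and numeric audio-feature columns; the exact column names are my assumption, not confirmed by the export format:

```python
import pandas as pd

FEATURES = ["Energy", "Key", "Loudness", "Mode", "Speechiness",
            "Acousticness", "Instrumentalness", "Liveness", "Valence", "Tempo"]

def clean_features(df):
    """Drop duplicate tracks, keep the title plus audio features, coerce to numeric."""
    df = df.drop_duplicates(subset="Title")
    cols = ["Title"] + [c for c in FEATURES if c in df.columns]
    df = df[cols].copy()
    for c in cols[1:]:
        df[c] = pd.to_numeric(df[c], errors="coerce")  # non-numeric cells become NaN
    return df.dropna().reset_index(drop=True)

# songs = clean_features(pd.read_csv("playlist_export.csv"))
```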

Inducing stress

Before I could start testing songs on people, I needed to make sure they were in a stressed state. After a few unsuccessful attempts to induce stress, I found a method that worked relatively well: the mental arithmetic task. The participant is asked to subtract 7 from 500 and then keep subtracting as fast as possible (500−7, 493−7, 486−7, etc.). Below you can see how well it worked over 30 seconds – the GSR reading went from around 280 to above 300.

 

Iteration 1

Testing the songs and creating a recommender system

 

Arduino Uno board and GSR sensor by Grove

 

Now that I am able to put the participant in a stressed state, I can start measuring stress. To do that, I started by playing a random song while the participant was wearing the GSR electrodes on their fingers. This is an example of the GSR response to one of the songs:
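Capturing those readings can be sketched in Python. The parsing helper below is straightforward; the hardware part (pyserial, the port name, the baud rate, and the Arduino printing one GSR value per line) is an assumption and is left commented out:

```python
def parse_gsr_lines(raw_lines):
    """Turn raw serial-monitor lines into integer GSR readings, skipping noise."""
    readings = []
    for line in raw_lines:
        line = line.strip()
        if line.isdigit():
            readings.append(int(line))
    return readings

# With real hardware (assumptions: pyserial installed, Arduino on /dev/ttyACM0
# printing one GSR value per line at 9600 baud):
# import serial, json
# with serial.Serial("/dev/ttyACM0", 9600, timeout=1) as port:
#     raw = [port.readline().decode("ascii", "ignore") for _ in range(500)]
# with open("song_gsr.json", "w") as f:
#     json.dump(parse_gsr_lines(raw), f)
```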

 

As you can see, the GSR dropped in the beginning and then started slowly increasing.

 

I repeated the process, together with the mental arithmetic task, for seven different songs and stored the data in separate JSON files. At this point, I was still not sure of the best way to make sense of all the ups and downs in the GSR data I collected. Therefore, for the initial version of the recommender system, I simply used the difference between the first (t = 0 s) and last (t = 500 s) GSR values registered to determine how effective the song was in reducing stress. I added these values (“before”, “after”, “difference”) to the data frame for every song I tested on the participant.
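The before/after/difference scoring described above can be expressed as a small helper:

```python
def gsr_summary(values):
    """Score a song by the drop between the first and last GSR reading
    (a positive difference means stress went down)."""
    before, after = values[0], values[-1]
    return {"before": before, "after": after, "difference": before - after}
```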

Building a recommender system

I now have a data frame with song titles and their features, as well as GSR data indicating how well each tested song worked on the participant. Below you can see that the song “Hold On” performed best of the songs I tested: it decreased the GSR by 67 units.

Now I want to build a recommender system that recommends songs similar to the one that performed best. This approach is called content-based filtering. I will be using cosine similarity, which measures how similar two feature vectors are. The first step is to select the features the recommender system should base its recommendations on and scale them to the range between 0 and 1. I did this using MinMaxScaler from scikit-learn, with the following features: ‘Energy’, ‘Key’, ‘Loudness’, ‘Mode’, ‘Speechiness’, ‘Acousticness’, ‘Instrumentalness’, ‘Liveness’, ‘Valence’, ‘Tempo’. Then, using scikit-learn’s cosine_similarity function and code I found and adapted from machinelearninggeek.com, I generated a list of the top 10 songs from the database most similar to “Hold On” (the one that performed best so far).
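A minimal sketch of this content-based recommender, using MinMaxScaler and cosine_similarity as described. The feature names come from the list above; the data-frame layout (a "Title" column, default integer index) is an assumption:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics.pairwise import cosine_similarity

FEATURES = ["Energy", "Key", "Loudness", "Mode", "Speechiness",
            "Acousticness", "Instrumentalness", "Liveness", "Valence", "Tempo"]

def recommend(df, seed_title, n=10):
    """Top-n songs most similar to seed_title, by cosine similarity of
    min-max-scaled audio features (df is assumed to have a default index)."""
    feats = [c for c in FEATURES if c in df.columns]
    scaled = MinMaxScaler().fit_transform(df[feats])
    sims = cosine_similarity(scaled)              # pairwise song-to-song similarity
    idx = df.index[df["Title"] == seed_title][0]
    order = sims[idx].argsort()[::-1]             # most similar first
    order = [i for i in order if i != idx][:n]    # drop the seed song itself
    return df.iloc[order]["Title"].tolist()
```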

The recommended songs

 

Iteration 2

Testing the recommendations with an app interface

Then I tested the first couple of recommendations to see how they performed. In addition, I created an interface for the music player with the recommender system, which I also tested with the user. One feature of the interface is seeing your GSR numbers in real time. I personally like seeing them because it helps me concentrate on relaxing and on my breathing; however, I can imagine that for some people watching them might be stressful, especially if the numbers are increasing. Unfortunately, I was not able to implement this feature into the design due to my limited time and skills. Therefore, I tested it by asking the participant to watch the live numbers on the serial monitor of the Arduino IDE on a laptop.

 

The top two recommended songs were “Romeo” and “Mutual Feeling”. The first performed relatively well, with a decrease of 28 units, but the second showed a decrease of only 7. This is not surprising, since the dataset is very small. Regarding the interface, the participant preferred not to see the real-time data, for the reason I suspected: it was too distracting.

Iteration 3

 

Although there is evidence that specific genres and types of music have a universal relaxing impact on the body, a number of studies have also shown that, when exposed to one’s preferred genre of music, indicators of relaxation can significantly increase and anxiety may decrease compared with exposure to unfamiliar music (Bernardi, 2005; Salamon, 2003). Therefore, for the third iteration, I decided to let the user choose a relaxing song that they are familiar with. In addition, I changed the interface based on the user feedback and added an option to hide the real-time data. Also, to improve the user experience and the effectiveness of stress reduction, I added a breathing guide that tells the user when to inhale and exhale. Such diaphragmatic breathing can significantly reduce stress after just a single practice (Arsenio & Loria, 2014). Although I was not able to implement the guide due to time and skill limitations, I imagine it sending light vibrations via the phone or a wearable device to indicate when to inhale or exhale. Instead, I asked the participant to take deep breaths during the listening session.
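Although the breathing guide was not implemented, its core logic is simple to sketch. Below is a hypothetical cue scheduler; the 4-second inhale and 6-second exhale pacing is my assumption, not a value from the project:

```python
def breathing_schedule(duration_s, inhale_s=4, exhale_s=6):
    """Return (timestamp, cue) pairs telling the user when to inhale/exhale.

    Each pair could later be mapped to a vibration on a phone or wearable."""
    cues, t, phase = [], 0, "inhale"
    while t < duration_s:
        cues.append((t, phase))
        t += inhale_s if phase == "inhale" else exhale_s
        phase = "exhale" if phase == "inhale" else "inhale"
    return cues
```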

 

The participant chose a song that they personally find relaxing and listened to it while taking deep breaths. The song performed rather well: the GSR went from 280 to 230 in the first 40 seconds, likely helped by the deep breathing.

Listening to a personally preferred relaxing song while taking deep breaths.

 

REFLECTION AND CONCLUSION

Overall, the process of testing GSR in response to music turned out to be trickier than I thought. It is difficult to ensure the participant is in a stressed state over and over again, as they become habituated to the stressor. There are many other factors that should be accounted for to ensure accuracy in testing, such as room temperature and testing time. After the experiments, GSR seems to be quite an appropriate indicator of emotional arousal in response to music; however, it would be more powerful combined with other biosensors, such as a heart-rate (HR) monitor. The recommender system should be tested further with a larger and better-organised dataset, but having an extensive variety of song characteristics to train it with was very helpful. Lastly, incorporating breathing techniques proved really effective in reducing stress. There is potential to make the experience even more engaging and data-driven by combining the breathing guide with HR data.

REFERENCES

APA. (2018, October). Stress in America™: Generation Z. Retrieved from https://www.apa.org/news/press/releases/stress/2018/stress-gen-z.pdf

Arsenio, W. F., & Loria, S. (2014). Coping with Negative Emotions: Connections with Adolescents’ Academic Performance and Stress. The Journal of Genetic Psychology, 175(1), 76–90. https://doi.org/10.1080/00221325.2013.806293

Bernardi, L. (2005). Cardiovascular, cerebrovascular, and respiratory changes induced by different types of music in musicians and non-musicians: the importance of silence. Heart, 92(4), 445–452. https://doi.org/10.1136/hrt.2005.064600

Coulon, S. M., Monroe, C. M., & West, D. S. (2016). A Systematic, Multi-domain Review of Mobile Smartphone Apps for Evidence-Based Stress Management. American Journal of Preventive Medicine. https://doi.org/10.1016/j.amepre.2016.01.026

Dallman, M. F., Strack, A. M., Akana, S. F., Bradbury, M. J., Hanson, E. S., Scribner, K. A., & Smith, M. (1993). Feast and Famine: Critical Role of Glucocorticoids with Insulin in Daily Energy Flow. Frontiers in Neuroendocrinology, 14(4), 303–347. https://doi.org/10.1006/frne.1993.1010

Graupensperger, S., Cadigan, J. M., Einberger, C., & Lee, C. M. (2021). Multifaceted COVID-19-Related Stressors and Associations with Indices of Mental Health, Well-being, and Substance Use Among Young Adults. International Journal of Mental Health and Addiction. https://doi.org/10.1007/s11469-021-00604-0

Hazen, E. P., Goldstein, M. C., Goldstein, M. C., & Jellinek, M. S. (2011). Mental Health Disorders in Adolescents. Amsterdam, Netherlands: Amsterdam University Press.

Kessler, R. C. (2012). The Costs of Depression. Psychiatric Clinics of North America, 35(1), 1–14. https://doi.org/10.1016/j.psc.2011.11.005

Mahani, S., & Panchal, P. (2019). Evaluation of Knowledge, Attitude and Practice Regarding Stress Management among Undergraduate Medical Students at Tertiary Care Teaching Hospital. Journal of Clinical and Diagnostic Research. https://doi.org/10.7860/JCDR/2019/41517.13099

NCPSB. (2022). AD: Tachtig procent van jongeren zit door corona tegen burn-out aan. Retrieved from https://nationaalcentrumpreventiestressenburn-out.nl/ad-tachtig-procent-van-jongeren-zit-door-corona-tegen-burn-out-aan/

OECD. (2017). Most teenagers happy with their lives but schoolwork anxiety and bullying an issue. Retrieved from https://www.oecd.org/newsroom/most-teenagers-happy-with-their-lives-but-schoolwork-anxiety-and-bullying-an-issue.htm


Stults-Kolehmainen, M. A., & Sinha, R. (2013). The Effects of Stress on Physical Activity and Exercise. Sports Medicine, 44(1), 81–121. https://doi.org/10.1007/s40279-013-0090-5


Creating a responsive chatbot against loneliness among students.

Premise

 

Creating a responsive chatbot against loneliness among students.

 

Synopsis

 

The lockdown has continued to isolate students from their old social life. The mental health of many is deteriorating significantly compared with pre-corona times. When we are unable to meet people, what do we do?
With that as a leading question, a chatbot was created to fight loneliness among younger people. What started out as a narrow chatbot with clickable elements turned into a responsive chatbot able to interpret user utterances. Future recommendations include the use of sentiment analysis based on Azure and further training based on user utterances.
Chatbots without enough learning data are rather narrow. Therefore, it is essential to keep training the chatbot with the utterances that users generate. Additionally, if the chatbot is able to interpret the sentiment of the user, it can respond accordingly, drastically increasing its responsiveness.

 

The Problem

 

In the current pandemic, worries about the mental wellbeing of young adults are an almost daily topic. However, little seems to be done about their isolated state. Social anxiety is part of the problem for people during the pandemic (Panchal et al., 2021).
The consequences of a lack of social life are felt especially among the student demographic (Krendl, 2021; Lippke, Fischer & Ratz, 2021; Padmanabhanunni & Pretorius, 2021). As a student myself, I feel the consequences as well. With no social outlet other than my housemates, my social life has come to a stop.

 

The Approach

 

The initial idea for the prototype was a chatbot created with IBM Watson. The user would be able to have natural conversations with the chatbot in order to feel less lonely. It would be less of a chatbot and more of a listening ear for socially isolated individuals.
The first conversation flow was meant to divide the conversation into three distinct emotional states: bad, neutral and positive. The chatbot would then be able to discuss the reason for the emotional state with the individual. However, it quickly became apparent that IBM Watson did not fit the goal of the chatbot.

 

It was then decided to create the chatbot with the Bot Framework Composer. The Composer offers more options and comes with pre-trained models, which is convenient for the chatbot. Additionally, it allows for an open conversation design.
Later in the process, both Azure Cognitive Services and Tensorflow were researched to implement sentiment analysis in the chatbot, thus improving its ability to interpret user utterances.

 

1st Iteration User Testing

 

General feedback was that the chatbot did not feel like a “chatbot against loneliness”, but rather like a social hub. Additionally, the testers noted the static feeling of the conversation, since the options were clickable rather than typable. The primary goal of the second iteration is a less restrictive chatbot: one that can understand the user and lets them actually type messages instead of clicking static options.

 

The Design Process Second Iteration

The second iteration uses the Bot Framework Composer, a visual development environment developed by Microsoft that allows the developer to create intricate bots with various goals. It is important to note down exactly what we want to accomplish and what the bot should be able to do. Some rough guidelines for the design process:

 

  • What utterances are common in human conversations?
  • How do we fight loneliness? Should we create a bot that feels like it is listening to the person? Or should the bot talk in a more proactive manner, perhaps even giving the person suggestions to battle their own loneliness?

 

Two main concepts should be identified straight away: intent and utterance. An intent is the intention a user has when saying or typing something. An utterance is the piece of text or speech that the user writes or says. The bot extracts the intent from the utterance by analysing the text and searching for certain trigger words. The intent(s) drive the conversation forward, as they provide the bot with the knowledge of what the user is attempting to achieve.

The primary goal for the second iteration in Composer was to ask the user for their name and let the bot greet them by name. It was quite difficult to make this work, as some limitations are in place.


When the bot asked “What is your name?”, the expected utterances that acted as training data were along the lines of “My name is X” or “I am X”. The Composer works with LUIS, a language-understanding module. The convenient thing about LUIS is that it is trained at a larger scale and can therefore work with a small number of expected utterances. However, it was difficult to extract the name of the user: a response along the lines of “My name is X” would lead the bot to think the user’s name was “My Name Is X” instead of just “X”. Documentation on the possibilities of the Composer is rather limited, but after much trying, the bot is now able to extract the name from an utterance.
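In Composer this extraction is handled by LUIS entity recognition, but the pattern matching involved can be illustrated with a plain-Python equivalent. This is a hedged sketch: the regex and the fall-back behaviour are my own choices, not how LUIS actually works internally:

```python
import re

def extract_name(utterance):
    """Pull a name out of patterns like 'My name is X' or 'I am X';
    fall back to treating the whole utterance as the name."""
    m = re.search(r"(?:my name is|i am|i'm)\s+([A-Za-z]+)",
                  utterance, re.IGNORECASE)
    return m.group(1).capitalize() if m else utterance.strip().capitalize()
```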

Another hiccup concerned the age. Similar problems arose as with the name; however, the solution that worked for extracting the name did not work for extracting the user’s age.


2nd Iteration User Testing

 

The user testing of the second iteration was more positive than the first. The users enjoyed the way they communicated with the chatbot; it felt familiar to chat normally with it, in contrast with the clickable elements of the first iteration.
Increased adaptiveness and responsiveness of the chatbot was a feedback point in both user tests. As such, the final iteration is an attempt to integrate sentiment analysis into the chatbot.


 

Final Iteration

The final iteration of the prototype was aimed at improving the responsiveness of the chatbot. For this reason, research was done into both Tensorflow and Azure Cognitive Services, both of which provide sentiment analysis.
However, Azure Cognitive Services proved the most beneficial for the prototype. The reason is that the Tensorflow approach could only interpret the sentiment of a given paragraph and needs training data from the creator of the model. As this research was not broad in scale, it was more efficient to have access to a service with pre-trained models.
Azure Cognitive Services allows for better sentiment analysis through an API call. The call is rather sophisticated, as it also provides opinion mining, which means the model can identify the subject of an opinion. For example, from an utterance such as “The pizza was great.”, Azure can extract that the user’s opinion concerned the “pizza” and that the opinion was “great”, which returns a positive sentiment.
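Since the Azure call needs a Python environment anyway, the request can be sketched as follows. The payload builder splits the text at the “.” mark (mirroring how the Azure call splits paragraphs); the SDK usage at the bottom is commented out and assumes the azure-ai-textanalytics package, with placeholder endpoint and key:

```python
def build_sentiment_payload(chat_log, language="en"):
    """Split a chat log at '.' and format it as a Text Analytics
    'documents' payload, one document per sentence."""
    sentences = [s.strip() for s in chat_log.split(".") if s.strip()]
    return {"documents": [{"id": str(i + 1), "language": language, "text": s}
                          for i, s in enumerate(sentences)]}

# Assumed SDK usage (azure-ai-textanalytics; endpoint and key are placeholders):
# from azure.ai.textanalytics import TextAnalyticsClient
# from azure.core.credentials import AzureKeyCredential
# client = TextAnalyticsClient("https://<resource>.cognitiveservices.azure.com/",
#                              AzureKeyCredential("<key>"))
# result = client.analyze_sentiment(["The pizza was great."],
#                                   show_opinion_mining=True)
```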


Reflection

 

The project was fun to create. IBM Watson was easy to use and understand; however, it was rather restrictive in its free-version capabilities. Luckily, the Bot Framework Composer offered many more possibilities.
Additionally, it was difficult to find solutions to certain problems. Websites such as Stack Overflow are normally of high value to developers, but a similar resource for the Composer does not exist. This was frustrating, as the solution for extracting the user’s age was sadly never found.

Another problem was the connection between the Azure Cognitive Services sentiment analysis API and the Composer. The API call needs a Python environment to properly send the request and receive the response, but the Composer does not provide the adaptability for such a call, as only simple API calls are possible through it. Rasa would be an interesting avenue of research as well; however, Rasa needs a lot of training data in comparison with Azure.

Lastly, more research has to be done on the extraction of sentiment from the chat logs. The Azure API call splits paragraphs at the “.” mark. However, the question has to be asked: which part of the paragraph is most important for the bot to respond to? Is a positive sentiment more important than a negative one?

 

Recommendation/Conclusion

 

Future recommendations for improved iterations are to integrate sentiment analysis from Azure if possible. The chatbot will then be more adaptive and able to both ask and interpret open-ended questions. While the chatbot can acknowledge the user after asking for their name, it is not able to ask how the user’s day was. When fighting loneliness, it is important to create a chatbot that can ask “How was your day?”. Additionally, through sentiment analysis and perhaps even opinion mining, the chatbot can adapt to a user’s mood and ask questions accordingly.

Furthermore, the chat logs of future iterations should be used to extract utterances of users that are not yet recognized by the model. For example, if people often ask the chatbot “How was your day?”, the designer should provide such an intent in the conversation flow, thus improving the adaptiveness of the chatbot.

 

References

 

Hudson, J., Ungar, R., Albright, L., Tkatch, R., Schaeffer, J., & Wicker, E. R. (2020). Robotic Pet Use Among Community-Dwelling Older Adults. The Journals of Gerontology: Series B, 75(9), 2018-2028.

Krendl, A. C. (2021). Changes in stress predict worse mental health outcomes for college students than does loneliness; evidence from the COVID-19 pandemic. Journal of American College Health, 1-4.

Lippke, S., Fischer, M. A., & Ratz, T. (2021). Physical activity, loneliness, and meaning of friendship in young individuals – a mixed-methods investigation prior to and during the COVID-19 pandemic with three cross-sectional studies. Frontiers in Psychology, 12, 146.

Padmanabhanunni, A., & Pretorius, T. B. (2021). The unbearable loneliness of COVID-19: COVID-19-related correlates of loneliness in South Africa in young adults. Psychiatry Research, 296, 113658.

 

Panchal, N., Kamal, R., Orgera, K., Cox, C., Garfield, R., Hamel, L., & Chidambaram, P. (2020). The implications of COVID-19 for mental health and substance use. Kaiser family foundation.

Vasileiou, K., Barnett, J., Barreto, M., Vines, J., Atkinson, M., Long, K., … & Wilson, M. (2019). Coping with loneliness at University: A qualitative interview study with students in the UK. Mental Health & Prevention, 13, 21–30.

 

 


A deep dive into food recognition technology

Premise

Exploring techniques that can ease calorie tracking and optimize the quality of nutritional data.

Synopsis

During other courses, I already did some research into dietary applications and their pros and cons. The three most common ways to collect data are manual input, data collection by sensors and data collection by cameras. Manual input is a barrier for users to keep up their data collection; automated technologies such as smartwatches and barcode scanning can help users with this task. During this research, I want to discover which technologies are already in use and which have the potential to take dietary data collection to the next level.

Background

Obesity numbers have tripled since 1975: 1.9 billion adults and 340 million children and adolescents were overweight in 2016 (World Health Organization, 2020). Obesity is a serious problem; it is associated with mental health problems and reduced quality of life, as well as diabetes, heart disease, stroke and several types of cancer (Adult Obesity, 2020). Behaviour usually plays an important role in obesity (Jebb, 2004), so treatment is often focused on food intake and physical activity. Maintaining a behavioural change is difficult, and adherence to a diet is crucial for maintaining weight loss. New technologies that help users adhere to a diet can make a real change (Thomas et al., 2014). There are applications that help users with calorie tracking, but users have to provide the input themselves, which is a barrier to maintaining it. Therefore, I want to discover technologies that can help with calorie tracking.

 

Experiments

 

Arduino

This course started with a few Arduino workshops, which made me eager to work with Arduino. At first, I did some research into Arduino-compatible sensors. I based my research on the HEALBE GoBe3 device, which I had already researched during another course. This device uses a bioelectrical impedance sensor to measure calorie intake. It turns out that bioelectrical impedance sensors are quite expensive, so I had to think of another way to measure calorie intake with Arduino. I found a project, based on a game by Sures Kumar, in which foods are identified by their unique resistance values. I managed to make it work and measure resistances that I could link to different fruits. However, the results were not very promising. The resistance values of the fruits differed from time to time due to temperature differences and ripeness. Secondly, there was an overlap in values between different fruits, which meant that the system could not detect the right fruit. This method is therefore not suitable for precise food detection, so I switched to a method I had already experimented with.
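Resistance measurement like this typically relies on a voltage divider read by the Arduino's 10-bit ADC. A sketch of the conversion, assuming a known 10 kΩ reference resistor between 5 V and the analog pin, with the food completing the circuit to ground (both the wiring and the resistor value are assumptions, not details from the original project):

```python
def resistance_from_adc(adc_value, r_known=10_000, adc_max=1023):
    """Estimate the unknown (food) resistance from a 10-bit Arduino ADC reading.

    Assumed wiring: 5V -> r_known -> analog pin -> food -> GND, so
    adc = adc_max * R_food / (r_known + R_food)."""
    if adc_value <= 0 or adc_value >= adc_max:
        return None  # open or short circuit, no meaningful reading
    return r_known * adc_value / (adc_max - adc_value)
```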

 

 

 

 

 


 

IBM Watson

I had already experimented with IBM Watson in the past; therefore, I knew that image recognition is a technology that can recognize foods. During my experiment with IBM Watson, I managed to use the IBM food model. This model could detect images of fruit very well, with a confidence score of 90 percent. The disadvantage of using a pre-made model is that you do not have the possibility to change or complement the model with your own data. Therefore, I tried to make my own model that could detect different apple types. The model was trained with 20 images of each apple type and 20 negative images of fruits that were not apples. After training, the model showed different confidence scores per type of apple. I checked the training images with the lowest confidence scores and noticed why they did not achieve the same score as the rest: the lowest-scoring types were not represented in the same quantity as the high-scoring ones. This made me realize that it is important to be very careful in choosing which images are suitable for the training data. Another thing I realized was the limitation of a system such as IBM Watson: training customization and free implementation options were limited. The challenge is to dive into image recognition technologies that can be implemented in Python.

OpenCV

The first technology that caught my attention was OpenCV, which provides computer vision libraries and tools for real-time computer vision. I completed a tutorial in which the computer could recognize a stop sign in a picture, using a single image and a pre-trained Haar classifier. During my search for a tutorial on making a custom Haar classifier, I found one for real-time object detection. This would mean that users only have to open their camera and immediately see the detections. It challenged me to follow a new tutorial teaching me to create a real-time object detection model. This was pretty easy; the model used the COCO dataset, known as one of the best image datasets available, with around 200,000 labeled images and 80 different categories. I was surprised how easy it was to install this model and detect objects in real time. It could detect 80 different objects, including persons, cellphones, bananas and apples. Although this dataset worked really well, it was not suitable for my goal: it could not detect any fruits other than bananas and apples. A solution would be to create my own COCO-style dataset focused on fruits. Unfortunately, due to my limited programming knowledge and regular errors, I could not manage to make this work. Therefore, I tried to find a tool that could help me make my own custom dataset.
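Filtering the 80 COCO classes down to fruit is a small post-processing step on the detector's output. A sketch, where the (class_id, confidence, box) tuple shape is an assumption about how the detections arrive:

```python
FRUIT_CLASSES = {"banana", "apple", "orange"}  # the only fruits among COCO's 80 labels

def filter_fruit_detections(detections, class_names, threshold=0.5):
    """Keep only fruit detections above a confidence threshold.

    `detections` is assumed to be an iterable of (class_id, confidence, box)
    tuples, e.g. flattened output of an OpenCV DNN detection model."""
    kept = []
    for class_id, confidence, box in detections:
        label = class_names[class_id]
        if confidence >= threshold and label in FRUIT_CLASSES:
            kept.append((label, confidence, box))
    return kept
```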

[aesop_video src=”self” hosted=”https://cmdstudio.nl/mddd/wp-content/uploads/sites/58/2021/06/Schermopname-2021-06-13-om-12.23.56.mp4″ align=”center” disable_for_mobile=”on” loop=”on” controls=”on” mute=”off” autoplay=”off” viewstart=”off” viewend=”off” show_subtitles=”off” revealfx=”off” overlay_revealfx=”off”]


Roboflow and Yolov5

First iteration

After some research I found a tutorial that showed me how to build a custom object detection model for multiple objects. It used the YOLOv5 (You Only Look Once) environment to train the model and Roboflow to annotate the training images. Roboflow makes it possible to annotate every image and export the annotations in the file type you need; in this case I exported to the YOLOv5 PyTorch format. The annotation process was a lot of work because I had to draw bounding boxes on every picture. At first, I used a Kaggle dataset of 90,000 images covering 130 different fruits and vegetables, from which I selected 600 images of 6 different fruits. The images were already cut out but still had to be annotated. The results after training on these images were not very good: the model could not recognize the fruits well. The images below show that it produced wrong bounding boxes and predictions.


[aesop_image img=”https://cmdstudio.nl/mddd/wp-content/uploads/sites/58/2021/06/wrong-results.jpeg” panorama=”on” align=”center” lightbox=”on” captionsrc=”custom” captionposition=”left” revealfx=”off” overlay_revealfx=”off”]
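For context on what the Roboflow export actually contains: in the YOLOv5 label format, each image gets a text file with one line per object, holding the class index plus the box centre and size normalized to the image dimensions. A minimal sketch of that conversion from pixel coordinates (the example box is illustrative):

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into
    YOLO label values: centre and size, normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return x_c, y_c, w, h

# A 100x50 box with its top-left corner at (100, 100) in a 640x480 image:
x_c, y_c, w, h = to_yolo((100, 100, 200, 150), 640, 480)
```

Drawing a bounding box in Roboflow produces exactly these four numbers behind the scenes, which is why the annotation work, tedious as it is, maps one-to-one onto the training labels.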

Second iteration

Therefore, I created a new dataset that I collected myself, this time mixing images of fruit on a blank background with images of fruit on varied backgrounds. I used YOLOv5 because this model is easier to implement than YOLOv4 while the results are similar (Nelson, 2021). This time the training results were much better. The image below shows the scores of the different metrics; most of the curves flatten towards the end, which indicates there were enough training iterations. The mAP@0.5 (mean average precision) scored above 0.5, which is commonly seen as a success (Yohanandan, 2020). Although training the model was a success, I still could not manage to connect it to a camera, which remains a disappointment and a challenge for future projects.


[aesop_image img=”https://cmdstudio.nl/mddd/wp-content/uploads/sites/58/2021/06/Good-graph.png” panorama=”on” imgwidth=”content” align=”center” lightbox=”on” captionsrc=”custom” captionposition=”left” revealfx=”off” overlay_revealfx=”off”]
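To make the mAP@0.5 metric concrete: a predicted box counts as correct when its intersection-over-union (IoU) with the ground-truth box is at least 0.5, and mAP averages the resulting precision over classes. A minimal IoU sketch:

```python
def iou(a, b):
    """Intersection-over-union of two boxes, each given as
    (x_min, y_min, x_max, y_max) in pixel coordinates."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes offset by half their width overlap with IoU 1/3,
# so at the 0.5 threshold this prediction would count as a miss.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

This is why the flattening mAP@0.5 curve in the graph above is a meaningful success signal: it measures localization quality, not just classification.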

The images above show a wrong and a correct prediction. Although the prediction for the lemon is wrong, it is interesting that it is mistaken for a banana; this is understandable because some of the banana training images look similar.

Future Research

Image recognition is a technology with enormous potential for food recognition. There are many datasets and models available that are perfectly capable of recognizing fruits, which would be a very useful addition to dietary applications when linked to a database of caloric values. The next step would be a model that can recognize sizes or portions of food, which would further improve the quality of dietary intake data.
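Linking detections to caloric values could be sketched very simply. The values below are hypothetical per-100-gram figures and the fixed portion size is an assumption; a real dietary app would query a nutrition database such as USDA FoodData Central instead:

```python
# Hypothetical caloric values (kcal per 100 g) for illustration only.
CALORIES_PER_100G = {"apple": 52, "banana": 89, "lemon": 29}

def estimate_calories(detections, portion_g=100):
    """Sum caloric values for a list of detected fruit labels,
    assuming a fixed portion size per detection; unknown labels
    contribute nothing."""
    return sum(CALORIES_PER_100G.get(label, 0) * portion_g / 100
               for label in detections)

total = estimate_calories(["apple", "banana"])  # 141.0 kcal
```

The portion-size parameter is exactly where the suggested future work fits in: replacing the fixed `portion_g` with a size estimated from the image would close the gap between detection and dietary intake data.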

Conclusion

My search for technologies that could support dietary apps was successful. I enjoyed working with Arduino and came across many amazing projects, but I learned the most during my dive into the different image recognition techniques. I discovered how many different models there are and how they work. My lack of programming knowledge was sometimes frustrating because I could not finish some of the projects I started. Nevertheless, the search for solutions and alternative methods gave me insights I otherwise would not have acquired. It would be nice if I could implement some of the techniques I learned during my graduation project.


References

Adult Obesity. (2020, September 17). Centers for Disease Control and Prevention. https://www.cdc.gov/obesity/adult/causes.html

Jebb, S. (2004). Obesity: causes and consequences. Women’s Health Medicine, 1(1), 38–41. https://doi.org/10.1383/wohm.1.1.38.55418

Nelson, J. (2021, March 4). Responding to the Controversy about YOLOv5. Roboflow Blog. https://blog.roboflow.com/yolov4-versus-yolov5/

Thomas, D. M., Martin, C. K., Redman, L. M., Heymsfield, S. B., Lettieri, S., Levine, J. A., Bouchard, C., & Schoeller, D. A. (2014). Effect of dietary adherence on the body weight plateau: a mathematical model incorporating intermittent compliance with energy intake prescription. The American Journal of Clinical Nutrition, 100(3), 787–795. https://doi.org/10.3945/ajcn.113.079822

World Health Organization. (2020, April 1). Obesity and overweight. https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight

Yohanandan, S. (2020, June 24). mAP (mean Average Precision) might confuse you! Towards Data Science. Medium. https://towardsdatascience.com/map-mean-average-precision-might-confuse-you-5956f1bfa9e2
