Healthy datasets have lots of different examples!
If your data is healthy …
Healthy dataset → Finds correct patterns → Healthy prediction! → Correct actions or decisions!
If your data is not healthy …
Unhealthy dataset → Finds bad patterns → Bad prediction! → Incorrect actions or decisions!
But how do we make healthy datasets?
Healthy datasets
Lots of data
Different examples of data
The right kind of data (for example: not using pictures if sounds would be better)
How healthy was this dataset?
Finds the right patterns → Correct prediction for happy or sad → Correctly shows a happy or sad face
Is the Make Me Happy dataset healthy?
Does it have lots of data? Were there equal amounts in each class?
What different examples of data does it have? Could we have added more, different data?
Does it use the right kind of data? Why?
Let’s check if the dataset is healthy!
More than 10 pieces of data means we are off to a good start with lots of data!
We have different example sentences as our different types of data.
Our model uses sentences, so a text dataset (happy and sad sentences) is the right kind of data!
With a healthy dataset, the model is likely to make the right decision and display the correct happy or sad face!
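If you would like to try the same health check with code, here is a small Python sketch. It is only an illustration, not part of the Make Me Happy project: the function name `check_dataset_health`, the example sentences, and the labels are all made up for this example. The "more than 10" rule and the "equal amounts in each class" question come from the checklist above.

```python
from collections import Counter

def check_dataset_health(examples, min_total=10):
    """Run simple health checks on a list of (sentence, label) pairs.

    Hypothetical helper: the checks mirror the lesson's questions,
    not any official tool.
    """
    # How many examples are there in each class (for example "happy" or "sad")?
    counts = Counter(label for _, label in examples)
    # Treat sentences as the same if they only differ in case or outer spaces.
    unique_sentences = {sentence.strip().lower() for sentence, _ in examples}
    return {
        "total": len(examples),
        # Lots of data: more than 10 pieces is a good start.
        "lots_of_data": len(examples) > min_total,
        "class_counts": dict(counts),
        # Equal amounts in each class (needs at least two classes).
        "balanced": len(counts) > 1
        and max(counts.values()) == min(counts.values()),
        # Different examples: repeated sentences do not add variety.
        "duplicates": len(examples) - len(unique_sentences),
    }

# Made-up training sentences, for illustration only.
examples = [
    ("You are wonderful", "happy"),
    ("I love your smile", "happy"),
    ("You are so kind", "happy"),
    ("Great job today", "happy"),
    ("You make me laugh", "happy"),
    ("You are my best friend", "happy"),
    ("You are mean", "sad"),
    ("I do not like you", "sad"),
    ("Go away", "sad"),
    ("You are boring", "sad"),
    ("That was a bad idea", "sad"),
    ("Nobody likes you", "sad"),
]

report = check_dataset_health(examples)
print(report)
```

The sketch cannot tell you whether you picked the right kind of data (text, pictures, or sound). That question still needs a person to think about the problem.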
Can you make the dataset for Make Me Happy better?
Healthy datasets
Lots of data – are there more sentences we can add to the training data?
Different examples of data – are there sentences in another language we should add?
The right kind of data – instead of text should we use audio clips (sound)?
You can make an even better happy/sad recognizer!
Below are some activities to get you ready to work with datasets! First, you will judge some existing datasets to see if they are healthy, and then you’ll practice planning a dataset for a given problem.