subhan97ahmed/How-to-find-bad-labels


How to Find Bad Labels in a Dataset

Description

This repository contains a Google Colab notebook with tools and techniques for finding bad labels in datasets. Bad labels are incorrect, inconsistent, or misleading annotations assigned to data points. Finding and correcting them is crucial for improving the accuracy and reliability of machine learning models trained on the data.

Some Ways to Find Bad Labels

  • Model Uncertainty
  • Model Disagreement
  • Pruning
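
The three approaches listed above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the notebook's own code: the function names, the toy data, and the exact scoring rules are hypothetical. It assumes "model uncertainty" means scoring each sample by the probability a model assigns to its given label (plus the entropy of the prediction), "model disagreement" means the fraction of several models whose prediction differs from the given label, and "pruning" means dropping the most suspicious fraction of samples.

```python
import math

def self_confidence(probs, labels):
    """Probability the model assigns to each sample's given label (low = suspicious)."""
    return [row[y] for row, y in zip(probs, labels)]

def predictive_entropy(probs):
    """Shannon entropy (in nats) of each predicted distribution (high = uncertain)."""
    return [-sum(q * math.log(q) for q in row if q > 0.0) for row in probs]

def disagreement_rate(model_preds, labels):
    """Fraction of models whose predicted class differs from the given label.

    model_preds[m][i] is model m's predicted class for sample i.
    """
    n_models = len(model_preds)
    return [sum(preds[i] != y for preds in model_preds) / n_models
            for i, y in enumerate(labels)]

def prune_indices(suspicion, fraction):
    """Indices to keep after dropping the `fraction` most suspicious samples."""
    n_drop = int(len(suspicion) * fraction)
    ranked = sorted(range(len(suspicion)), key=lambda i: suspicion[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [i for i in range(len(suspicion)) if i not in dropped]

# Toy data (hypothetical): four samples, two classes, three models.
labels = [0, 1, 1, 0]
probs = [[0.90, 0.10], [0.20, 0.80], [0.85, 0.15], [0.55, 0.45]]
model_preds = [[0, 1, 0, 0], [0, 1, 0, 1], [0, 1, 0, 0]]

confidence = self_confidence(probs, labels)
entropies = predictive_entropy(probs)
disagree = disagreement_rate(model_preds, labels)

# Use (1 - self-confidence) as the suspicion score and prune the top 25%.
suspicion = [1.0 - c for c in confidence]
keep = prune_indices(suspicion, fraction=0.25)
print("kept sample indices:", keep)
```

In this toy data the third sample (index 2) has the lowest self-confidence and is contradicted by every model, so it is the one pruned. In practice the probabilities should come from a model that did not see the sample during training (for example, via cross-validation); otherwise the model may simply have memorized the bad label.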

Usage

To use the notebook, follow these steps:

  1. Open Google Colab (https://colab.research.google.com/).

  2. Click on "File" and select "Open Notebook".

  3. In the "GitHub" tab, enter the URL of this repository: https://github.com/subhan97ahmed/How-to-find-bad-labels.

  4. Click on the notebook file How-to-find-bad-labels.ipynb to open it.

  5. Run the cells in the notebook sequentially to execute the code and analyze your dataset.

  6. Modify the code and adapt it to your specific needs as required.

Requirements

The notebook requires a Google Colab environment with the following dependencies:

  • Python 3.x
  • Additional Python libraries specified in the notebook (if any)

Ensure that you have the necessary dependencies installed in your Colab environment before running the notebook.
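
One way to check the environment before running the notebook is a small cell like the following. It is only a sketch: the package list is a hypothetical placeholder, since the notebook's actual imports are not listed here, and the helper name `missing_packages` is made up for illustration.

```python
import importlib.util
import sys

def missing_packages(names):
    """Return the top-level module names in `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Hypothetical list: replace with the imports that appear in the notebook.
required = ["numpy", "sklearn"]

print("Python", sys.version.split()[0])
missing = missing_packages(required)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Anything reported as missing can be installed from a Colab cell with `%pip install <package>`. Note that the install name can differ from the import name (for example, `sklearn` is installed as `scikit-learn`).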

Contributing

If you would like to contribute to this project, please follow these steps:

  1. Fork the repository on GitHub.
  2. Create a new branch with a descriptive name for your feature/bug fix.
  3. Make the necessary changes and additions in your branch.
  4. Test your changes thoroughly.
  5. Commit your changes and push them to your forked repository.
  6. Create a pull request on the original repository, describing your changes and their purpose.
