Data-centric AI (DCAI): Difference between revisions



==Reasons for Data-centric AI==
*Data quality issues cost the U.S. alone an estimated $3 trillion annually.
In the US, data quality problems cost an estimated $3 trillion per year. It is difficult to guarantee data quality in large datasets without using algorithms. ChatGPT is one example of an ML system that relies on human feedback to correct shortcomings arising from low-quality training data. However, automated methods are required to ensure that ML models are trained using clean data. Recent research has highlighted the importance of data-centric AI, an approach in which simple methods that change the dataset can produce more accurate models. This course teaches how to improve any ML model by improving its data. The techniques can be used when training supervised ML models.
*Automated methods and systematic engineering principles are now needed to ensure ML models are trained with clean data.
*Recent research on image classification with noisily labeled data has shown that simple methods that adaptively change the dataset can lead to more accurate models than sophisticated modeling strategies.
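
As a rough illustration of what "adaptively changing the dataset" can look like, the sketch below flags examples whose given label looks unreliable and drops them before retraining. It is a simplified heuristic written for this page, not the method from any particular paper: the function name `flag_likely_label_issues`, the per-class thresholding rule, and the toy probabilities are all assumptions. It also assumes that the predicted probabilities are out-of-sample (for example, from cross-validation) and that every class appears at least once in the labels.

```python
def flag_likely_label_issues(pred_probs, labels):
    """Return one boolean per example: True if its given label looks suspect.

    pred_probs: list of per-example class-probability lists (assumed out-of-sample).
    labels: list of integer class labels as given in the dataset.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the average probability the model assigns to
    # class k across the examples that are labeled k.
    thresholds = []
    for k in range(n_classes):
        class_probs = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(sum(class_probs) / len(class_probs))

    flags = []
    for p, y in zip(pred_probs, labels):
        # The model is less confident in the given label than is typical
        # for that class...
        low_self_confidence = p[y] < thresholds[y]
        # ...and it confidently predicts some other class instead.
        confident_other = any(
            p[k] >= thresholds[k] for k in range(n_classes) if k != y
        )
        flags.append(low_self_confidence and confident_other)
    return flags


# Toy data (hypothetical): the last example is labeled 0,
# but the model strongly predicts class 1.
pred_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.05, 0.95]]
labels = [0, 0, 1, 1, 0]

flags = flag_likely_label_issues(pred_probs, labels)
clean_indices = [i for i, flagged in enumerate(flags) if not flagged]
print(flags)
print(clean_indices)
```

Retraining on `clean_indices` only is the simplest follow-up. Flagged examples could instead be sent for human review or relabeling, which keeps more data at the cost of extra effort.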