Data-centric AI (DCAI): Difference between revisions

Revision as of 08:58, 28 February 2023

1,593 bytes added, 28 February 2023

no edit summary

Elegant angel

@@ Line 18: / Line 18: @@
 However tempting it may be, don't skip Steps 2 through 4. You can repeat Steps 3-4 multiple times to deploy the most effective ML systems.
-==Examples of Data-centric AI==
+==Examples==
+This field covers the following methods:
+*Outlier detection and removal (handling unusual examples in the dataset).
+*Error detection and correction (handling incorrect labels or values in the dataset).
+*Establishing consensus (determining the truth from many crowdsourced annotations).
+*Data augmentation (adding examples to the data to encode prior information).
+*Feature engineering and selection (manipulating the way data are represented).
+*Active learning (selecting the most informative data to label next).
+*Curriculum learning (ordering the examples in a dataset from easiest to most difficult).
+Recent high-profile ML applications clearly show how the reliability of ML models deployed in the real world depends on their training data.
+OpenAI has stated openly that errors in data and labels were the main problem in training well-known ML models such as DALL-E, GPT-3 and ChatGPT. These are stills from the DALL-E 2 demo.
+Tesla was able to produce autonomous driving systems far more advanced than those of comparable competitors by using model-assisted data improvement (Step 3). The key to this success is the Data Engine. These slides are from Andrej Karpathy (Tesla Director of AI, 2021).
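+To make one of the methods listed earlier in this section concrete, establishing consensus among crowdsourced annotations can be sketched as a simple majority vote. This is only a minimal illustration: the function name <code>majority_vote</code>, the example IDs, the labels, and the two-thirds review threshold are all hypothetical, and real systems often also weight annotators by their estimated reliability.

```python
from collections import Counter


def majority_vote(annotations):
    """Return (consensus_label, agreement) for one example's annotations.

    agreement is the fraction of annotators who chose the consensus label.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    counts = Counter(annotations)
    top_count = max(counts.values())
    # Break ties deterministically: pick the alphabetically first label.
    winners = sorted(label for label, c in counts.items() if c == top_count)
    return winners[0], top_count / len(annotations)


# Hypothetical data: three annotators labeled each image.
crowd_labels = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
    "img_003": ["cat", "dog", "bird"],
}

consensus = {ex: majority_vote(labels) for ex, labels in crowd_labels.items()}

# Flag examples where fewer than two thirds of the annotators agree,
# so that a human can review them (assumed threshold).
needs_review = [ex for ex, (_, agreement) in consensus.items() if agreement < 2 / 3]
```

+Examples with low agreement, such as the hypothetical <code>img_003</code>, are natural candidates for relabeling before training.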
+==Reasons for Data-centric AI==
+*Data quality issues cost the U.S. alone an estimated $3 trillion annually.
+*Automated methods and systematic engineering principles are now needed to ensure ML models are trained with clean data.
+*Recent research on image classification with noisily labeled data revealed that simple methods that adaptively change the dataset can lead to more accurate models than sophisticated modeling strategies.
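+As a rough illustration of what "adaptively changing the dataset" can mean, the sketch below flags examples whose given label receives a low predicted probability from a model, so they can be reviewed or removed before retraining. It is a simple threshold heuristic, not the specific method from the research mentioned above. The function name <code>prune_low_confidence</code>, the 0.5 threshold, and the toy data are assumptions, and the probabilities are assumed to come from a model that did not train on the examples it scores (for instance via cross-validation).

```python
def prune_low_confidence(labels, pred_probs, threshold=0.5):
    """Split example indices into (keep, flagged).

    labels:     given class index for each example
    pred_probs: per-example list of predicted class probabilities
    An example is flagged when the model's probability for its given
    label falls below the threshold (assumed value, tune per dataset).
    """
    if len(labels) != len(pred_probs):
        raise ValueError("labels and pred_probs must have the same length")
    keep, flagged = [], []
    for i, (label, probs) in enumerate(zip(labels, pred_probs)):
        self_confidence = probs[label]  # probability of the given label
        if self_confidence >= threshold:
            keep.append(i)
        else:
            flagged.append(i)
    return keep, flagged


# Toy data: four examples, two classes.
labels = [0, 1, 1, 0]
pred_probs = [
    [0.90, 0.10],
    [0.20, 0.80],
    [0.95, 0.05],  # labeled 1, but the model strongly favors class 0
    [0.60, 0.40],
]

keep, flagged = prune_low_confidence(labels, pred_probs)
```

+A model retrained on only the kept examples, or on the dataset after the flagged labels are corrected, can then be compared against the original model on held-out data.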