ML Journals

- PMLR Proceedings of Machine Learning Research
- JMLR Journal of Machine Learning Research
- MLOSS Machine Learning Open Source Software

Object Detection

Using YOLOv4 in PrimeHub to Detect Crowding at Tourist Attractions on a Schedule (1/4)

Data Quality

Data Collection and Quality Challenges for Deep Learning

- video
- slides
- Papers from KAIST AI

Data Collection and Quality Challenges for Deep Learning (VLDB 2020 Tutorial) by Steven Euijong Whang (KAIST AI) and Jae-Gil Lee

Abstract: Software 2.0 refers to the fundamental shift in software engineering where using machine learning becomes the new norm in software with the availability of big data and computing infrastructure. As a result, many software engineering practices need to be rethought from scratch where data becomes a first-class citizen, on par with code. It is well known that 80-90% of the time for machine learning development is spent on data preparation. Also, even the best machine learning algorithms cannot perform well without good data or at least handling biased and dirty data during model training. In this tutorial, we focus on data collection and quality challenges that frequently occur in deep learning applications. Compared to traditional machine learning, there is less need for feature engineering, but more need for significant amounts of data. We thus go through state-of-the-art data collection techniques for machine learning. Then, we cover data validation and cleaning techniques for improving data quality. Even if the data is still problematic, hope is not lost, and we cover fair and robust training techniques for handling data bias and errors. We believe that the data management community is well poised to lead the research in these directions. The presenters have extensive experience in developing machine learning platforms and publishing papers in top-tier database, data mining, and machine learning venues.

Data for Good

Clues Hidden in the Data: Data Heroes Use "AI" to Reduce Social Risk
