Label, Segment, Featurize: A Cross Domain Framework for Prediction Engineering | IEEE Conference Publication | IEEE Xplore


Abstract:

In this paper, we introduce "prediction engineering" as a formal step in the predictive modeling process. We define a generalizable 3-part framework (Label, Segment, Featurize, or L-S-F) to address the growing demand for predictive models. The framework provides abstractions that let data scientists customize the process for unique prediction problems. We describe how to apply the L-S-F framework to characteristic problems in 2 domains and demonstrate an implementation over 5 unique prediction problems defined on a dataset of crowdfunding projects from DonorsChoose.org. The results demonstrate how the L-S-F framework complements existing tools, allowing us to rapidly build and evaluate 26 distinct predictive models. L-S-F enables the development of models that provide value to all parties involved (donors, teachers, and the people running the platform).
Date of Conference: 17-19 October 2016
Date Added to IEEE Xplore: 26 December 2016
Conference Location: Montreal, QC, Canada

I. Introduction

In recent years, the data science community has experienced a sharp increase in demand for predictive models from time-driven relational data. This type of data is collected during the regular day-to-day use of physical machines (such as turbines or cars), services (such as ride sharing or an airline), and digital platforms (such as online learning or retail websites). This data has specific properties that differentiate it from images or text. It is event-driven and collected across different time scales, and contains a multitude of data types, including categorical, numeric, and textual.

