A DATA TO TEXT FRAMEWORK FOR DESCRIBING REGRESSION MODELS: AN OPTIMIZATION APPROACH FOR CONTENT DETERMINATION
Majed Bokhari
James Curry
Ashraf El-Houbi
Hsing-Wei Chu
Alberto Marquez
Xinyu Liu
Lamar University
ABSTRACT
Linear regression models are a common approach to analyzing data. This study explores using
Natural Language Generation to construct a generic text document that describes a linear
regression model. A series of functions was developed to explain the model; each function
describes one fact about it, such as the model parameters, model fit, or outlier analysis.
A simulated annealing heuristic combines the outputs of these functions into a single document,
optimizing the information displayed to the user subject to a page length restriction. The
proof-of-concept software in this study was developed in R.
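
To make the content determination step concrete, the following is a minimal sketch of selecting messages by simulated annealing under a length limit. It is written in Python for illustration and is not the authors' R implementation. The message labels, importance scores, line counts, neighborhood move (toggling one message), and geometric cooling schedule are all assumptions introduced here; the abstract does not specify them.

```python
import math
import random

# Hypothetical candidate messages: (label, importance score, length in lines).
# Labels, scores, and lengths are illustrative placeholders, not data from the study.
MESSAGES = [
    ("model fit summary", 9, 3),
    ("coefficient estimates", 8, 4),
    ("significance of predictors", 7, 3),
    ("outlier analysis", 5, 4),
    ("residual diagnostics", 4, 3),
    ("data set description", 2, 2),
]


def select_content(messages, page_limit, steps=5000, t_start=5.0,
                   cooling=0.999, seed=0):
    """Choose a subset of messages that maximizes total importance while
    keeping total length within page_limit, using simulated annealing."""
    rng = random.Random(seed)
    n = len(messages)
    current = [False] * n          # start from the empty (always feasible) document
    cur_score, cur_len = 0, 0
    best, best_score = current[:], 0
    temp = t_start

    for _ in range(steps):
        i = rng.randrange(n)                   # neighbor: toggle one message
        sign = -1 if current[i] else 1
        new_len = cur_len + sign * messages[i][2]
        new_score = cur_score + sign * messages[i][1]

        if new_len <= page_limit:              # reject moves that overflow the page
            delta = new_score - cur_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / temp), which shrinks as temp cools.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current[i] = not current[i]
                cur_score, cur_len = new_score, new_len
                if cur_score > best_score:
                    best, best_score = current[:], cur_score

        temp *= cooling                        # geometric cooling (assumed schedule)

    return [m for m, keep in zip(messages, best) if keep]


if __name__ == "__main__":
    for label, score, length in select_content(MESSAGES, page_limit=10):
        print(f"{label} (importance {score}, {length} lines)")
```

In this sketch infeasible neighbors are simply rejected, so every visited document respects the length limit; a penalty term in the objective would be an alternative design.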