A DATA TO TEXT FRAMEWORK FOR DESCRIBING REGRESSION MODELS: AN OPTIMIZATION APPROACH FOR CONTENT DETERMINATION

Majed Bokhari

James Curry

Ashraf El-Houbi

Hsing-Wei Chu

Alberto Marquez

Xinyu Liu

Lamar University

ABSTRACT

Linear regression models are a common approach to analyzing data. This study explores constructing a generic text document that describes a linear regression model using Natural Language Generation. A series of functions was developed in this research to explain a regression model. Each function describes one fact about the linear regression model, covering model parameters, model fit, and outlier analysis. A simulated annealing heuristic combines the results of the functions into a single document. The heuristic optimizes the information displayed to users subject to a page length restriction. The proof-of-concept software in this study was developed in R.
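
The content determination step summarized above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R software developed in the study. It assumes each descriptive function yields a message with an importance score and a length in lines; the message labels, scores, lengths, cooling schedule, and the names `MESSAGES` and `anneal` are all hypothetical. A basic simulated annealing loop then chooses which messages to keep so that the total score is high and the total length stays within the page budget.

```python
import math
import random

# Hypothetical candidate messages: (label, importance score, length in lines).
# Labels, scores, and lengths are invented placeholders, not results from the study.
MESSAGES = [
    ("model parameters", 8.0, 3),
    ("model fit", 9.0, 2),
    ("outlier analysis", 5.0, 4),
    ("residual summary", 3.0, 2),
    ("confidence intervals", 6.0, 3),
]


def total_score(messages, selected):
    """Sum the importance scores of the selected messages."""
    return sum(m[1] for m, keep in zip(messages, selected) if keep)


def total_length(messages, selected):
    """Sum the line counts of the selected messages."""
    return sum(m[2] for m, keep in zip(messages, selected) if keep)


def anneal(messages, max_lines, steps=5000, start_temp=5.0, cooling=0.999, seed=0):
    """Select messages by simulated annealing under a line budget.

    Each move flips one message in or out of the document. Moves that
    exceed the budget are rejected; worse feasible moves are accepted
    with probability exp(delta / temperature).
    """
    rng = random.Random(seed)
    current = [False] * len(messages)  # the empty document is always feasible
    current_score = 0.0
    best, best_score = current[:], current_score
    temp = start_temp
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(len(messages))
        candidate[i] = not candidate[i]
        if total_length(messages, candidate) <= max_lines:
            cand_score = total_score(messages, candidate)
            delta = cand_score - current_score
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, current_score = candidate, cand_score
                if current_score > best_score:
                    best, best_score = current[:], current_score
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


if __name__ == "__main__":
    selected, score = anneal(MESSAGES, max_lines=6)
    for (label, _, _), keep in zip(MESSAGES, selected):
        if keep:
            print(label)
    print("total score:", score)
```

Rejecting over-budget moves outright keeps every visited document feasible; a penalty term in the objective would be an alternative design, and the abstract does not say which formulation the study uses.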