In my opinion, the following definition is very good:

Data
• Let x = (x_1, x_2, ..., x_D)^T denote a data point, and D = {x^(1), x^(2), ..., x^(N)} a
data set. D is sometimes associated with desired outputs y_1, y_2, ...

Predictions
• We are generally interested in predicting something based on the observed
data set.
• Given D, what can we say about x^(N+1)?

Model
• To make predictions, we need to make some assumptions. We can often
express these assumptions in the form of a model with some parameters θ.
• Given data D, we learn the model parameters θ, from which we can predict
new data points.
• The model can often be expressed as a probability distribution over data
points.
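
As a minimal sketch of the data / model / prediction steps above, assume (purely for illustration) that each data point is a single number and that the model is a Gaussian with parameters θ = (mu, sigma2). The function names fit_gaussian and predictive_density and the five data values are hypothetical, not part of the quoted definition.

```python
import math

def fit_gaussian(data):
    """Learn theta = (mu, sigma2) from 1-D data by maximum likelihood."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n  # ML estimate divides by n
    return mu, sigma2

def predictive_density(x_new, theta):
    """Density the fitted Gaussian model assigns to a new point x^(N+1)."""
    mu, sigma2 = theta
    return math.exp(-(x_new - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Hypothetical data set D with N = 5 scalar observations.
data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Learn the parameters from D, then ask what the model says about new points.
theta = fit_gaussian(data)
print("theta =", theta)
print("p(x = 3.0) =", predictive_density(3.0, theta))
print("p(x = 6.0) =", predictive_density(6.0, theta))
```

Under this assumed model, a new point near the learned mean receives a higher density than one far from it, which is the sense in which the model is "a probability distribution over data points". This sketch plugs in a single point estimate of θ; other ways of learning and using θ would fit the same definition.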