A salutary note at the end of Rutherford Aris’ Mathematical Modelling Techniques:
When a model is being used as a simulation an obvious comparison can be made between its predictions and the results of the experiment. We are favourably impressed with the model if the agreement is good and if it has not been purchased at the price of too many empirical constants adjusted to fit the data. If the parameters are determined independently and fed into the final model as fixed constants not to be further adjusted, then we can have a fair degree of confidence in the data and in the model. Both model and data have their own integrity: the former in the relevance and clarity of its hypotheses and the rigour and appropriateness of its development, the latter in the carefulness of the experimenter and the accuracy of the results. But these virtues do not only inhere in the possessors; they also gain validity from the other…Thus the attitude of never believing an experiment until it's confirmed by theory has as much to be said for it as that which never believes a theory before its confirmation by experiment. (emphasis mine)
In the comparison of theory with experiment an array of statistical tools is available and should be used. One danger that is easy to overlook is the existence of hidden constancies that will give spurious values…The classic correlation between the intelligence of the children and the drunkenness of the parents which so confounded temperance societies years ago–until it was discovered that all the data came from schools in the east end of London–is another illustration of a data base too narrow to test a model.
As someone who works in the earth sciences, I find the indiscriminate use of statistics and purely empirical relationships maddening, and it has spread to many other disciplines as well. The computer power we have at our disposal these days makes it too tempting to simply reduce “big data” and let it “tell us” what’s going on, but this can be a serious mistake without some kind of hypothesis–right or wrong–about what we are looking at.