This post is a continuation of a debate that has been going on for some time on TSZ – moved here because TSZ is having technical problems.
These are my most recent comments repeated.
More to the point – I think I have another example which would give you reason to refine your dFSCI process if you want to preserve 100% specificity. Before I do the work let me check the function is acceptable:
“The string identifies for each month over a period of 120 months whether the London monthly mean high temperature is above or below long-term average.”
As I have given you the function before working out the string you can see that I am prespecifying it!
With the other two functions, by contrast, which rely only on an explicit, non-contingent property, the computation of dFSI would not change between the prespecified and postspecified cases. The target space and the search space remain the same in both cases.
That’s false. If the other two examples were post-specified, the process would be something like taking the string, studying the papers it points to, and seeing what you can find that they have in common. As all papers have something in common (even if it is just a distinctive phrase somewhere in the text), the probability of success is 100%. That’s why I suggest you simply amend the process to exclude post-specified functions. Any function could potentially be post-specified.
But, to be complete, I would obviously ask what the period is (and in particular, if it is a future period or a past period whose values are already known), and what the long term average reference is.
I was thinking of the last 10 years (2002 to 2012). I could do a longer period, but it would be tedious. I was going to use http://www.holiday-weather.com/london/averages/ for the averages. Although the values for 2002 to 2012 are known, I was not going to use them to generate the string. That’s why I said “identify” rather than “predict”. I will not even look at the actual temperatures until after I have generated the string, although I won’t be able to resist checking that it has worked when I have finished. The string will simply be a string of 120 bits, with 1 for above average and 0 for below average. I realise you want 500 bits, but looking up all the data for that would be really tedious, so I hope 120 will be sufficient to prove the case.
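To make the encoding concrete, here is a minimal Python sketch of how a reference string could be built from the actual temperatures for checking afterwards. The function names, the data layout, and the numbers in the usage note are my own illustrative assumptions, not real London data. The description above does not say what happens when a month exactly equals the average, so the sketch assumes that case is encoded as 0.

```python
def encode_months(observed, long_term_avg):
    """Return a bit string: '1' if a month's mean high is above the
    long-term average for that calendar month, otherwise '0'.

    observed      -- list of (calendar_month, mean_high) pairs, in time order
    long_term_avg -- dict mapping calendar month (1-12) to its long-term average

    Assumption: a month exactly equal to the average is encoded as '0'.
    """
    return "".join(
        "1" if mean_high > long_term_avg[month] else "0"
        for month, mean_high in observed
    )


def search_space_size(n_bits):
    """Number of distinct bit strings of length n_bits."""
    return 2 ** n_bits


def count_matches(candidate, reference):
    """Count the positions at which two equal-length bit strings agree."""
    if len(candidate) != len(reference):
        raise ValueError("strings must have the same length")
    return sum(a == b for a, b in zip(candidate, reference))
```

With made-up placeholder values, `encode_months([(1, 8.5), (2, 8.0), (3, 11.0)], {1: 8.0, 2: 9.0, 3: 11.0})` gives `"100"`. For the full 120-month string the search space is `search_space_size(120)`, that is 2^120 possible strings, and `count_matches` gives a simple way to compare the string generated in advance against the one derived from the actual temperatures.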