Continuation of discussion of Gpuccio’s challenge from TSZ

This post is a continuation of a debate that has been going on for some time on TSZ – moved here because TSZ is having technical problems.

These are my most recent comments repeated.


Gpuccio

More to the point – I think I have another example which would give you reason to refine your dFSCI process if you want to preserve 100% specificity. Before I do the work let me check the function is acceptable:

“The string identifies for each month over a period of 120 months whether the London monthly mean high temperature is above or below long-term average.”  

As I have given you the function before working out the string you can see that I am prespecifying it!


Gpuccio

With the other two functions, instead, relying only on an explicit, non contingent property, the computation of dFSI would not change in the prespecified or postspecified case. The target space and the search space remain the same in both cases.

That’s false. For the other two examples if they were post-specified this would be something like taking the string, studying the papers it points to, and seeing what you can find that they had in common. As all papers have something in common (even if it is just a distinctive phrase somewhere in the text) then the probability of success is 100%. That’s why I suggest you simply amend the process to say no post-specified functions. Any function could potentially be post-specified.

But, to be complete, I would obviously ask what the period is (and in particular, if it is a future period or a past period whose values are already known), and what the long term average reference is.

I was thinking of the last 10 years – 2002 to 2012 – I could do a longer period but it would be tedious. I was going to use http://www.holiday-weather.com/london/averages/ for the averages. Although the values for 2002 to 2012 are known I was not going to use them to generate the string. That’s why I said “identify” rather than “predict”. I will not even look at the actual temperatures until after I have generated the string – although I won’t be able to resist checking that it has worked when I have finished. The string will simply be a string of 120 bits with 1 for above average and 0 for below average. I realise you want 500 bits, but it would be really tedious to look up all the data for that, so I hope 120 will be sufficient to prove the case.
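As a minimal sketch of the encoding only (1 for above average, 0 for below), something like the following could be used to check a string against the records afterwards. The function name, the data layout and all the temperature values are hypothetical, and treating a month exactly equal to the average as 0 is an assumption. It says nothing about how the string itself is generated.

```python
def above_average_bits(monthly_highs, long_term_averages):
    """Return a bit string: '1' where a month's mean high exceeds the
    long-term average for that calendar month, '0' otherwise.

    monthly_highs: list of (calendar_month, mean_high) pairs in
        chronological order (hypothetical layout).
    long_term_averages: dict mapping calendar month (1-12) to the
        long-term average high for that month.
    """
    bits = []
    for calendar_month, mean_high in monthly_highs:
        reference = long_term_averages[calendar_month]
        # Assumption: a month exactly at the average counts as "not above".
        bits.append("1" if mean_high > reference else "0")
    return "".join(bits)


# Made-up long-term averages (degrees C) for January to March only.
example_averages = {1: 8.0, 2: 8.5, 3: 11.0}

# Made-up observations as (calendar month, mean high) pairs.
example_highs = [(1, 9.1), (2, 7.9), (3, 11.0), (1, 7.5)]

example_string = above_average_bits(example_highs, example_averages)
print(example_string)
```

For the real case the input would hold 120 pairs, one per month, giving a 120-bit string.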


6 Responses to “Continuation of discussion of Gpuccio’s challenge from TSZ”


  1. 1 Petrushka November 2, 2012 at 1:59 pm

    I suspect one of my posts caused the problem at TSZ. Try deleting any of mine that are in moderation.

  2. 2 Mark Frank November 2, 2012 at 2:59 pm

    Gpuccio

    You are not thinking clearly enough about pre and post specification. All three cases define subsets of all possible papers – one through listing the five papers, the others through the presence of certain keywords. Any of them can be prespecified by either giving the titles of the papers or the relevant keywords. Post specification is an inherently vague process in that you have to guess what rule was being used to find the function from the string. It is always possible to find a rule that makes it 100% certain to find a function and I guess that is the one I am assuming you would use. There are generally other rules available that reduce your chances of finding the function. It would be perverse to reduce the rule to “find this function” which indeed would make the probability of finding a suitable function identical to prespecification.

    So in the case of the five papers the 100% rule is “any five papers”. Of course you might have had a rule which said “any five papers with numbers less than 1000” which would have been less likely to satisfy the string.

    In the case of a keyword it might be “find something they all have in common” which is the 100% rule. Or it might be “find a keyword they all have in common” which is somewhat less. You could reduce it to “find this specific keyword” which would indeed reduce the probability to the same as prespecification and make postspecification rather pointless. (I could do the same for the five papers). But you don’t have to. Postspecification is a flexible tool – in my Bayes calculation I assumed one would go for the 100% option as anything else seems rather daft.

  3. 3 Mark Frank November 3, 2012 at 7:11 am

    Gpuccio

    I am going to have to abandon my attempt to produce a binary string which identifies when London temperatures were above average. I can’t get the data I need consistently and accurately enough.

    It might be interesting to explain what I was trying to do.

    I was looking for two events A and B which satisfy these properties:

    • A happens if and only if B happens
    • A (and therefore B) happen on an unpredictable schedule
    • No living thing is involved with either A or B
    • The schedule for A and B is publicly available

    Under those conditions the string of when A happens (if long enough) would appear to have the function of identifying B, be complex, incompressible, digital and prespecified.

    I thought being above average temperature in London and being above average temperature in somewhere else very close would satisfy these conditions, but it is vital to have temperature records to a high degree of accuracy and averages taken over the same periods. I can’t seem to find that data.
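    How closely two such strings satisfy “A happens if and only if B happens” could be checked with a sketch like the one below. The function name and the two example strings are hypothetical, not real temperature data. A result of 1.0 would mean the condition holds for every month.

```python
def agreement(a_bits, b_bits):
    """Return the fraction of positions at which two bit strings agree.

    a_bits, b_bits: strings of '0' and '1' of equal length, one character
        per month (hypothetical inputs for events A and B).
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("the two strings must cover the same months")
    if not a_bits:
        raise ValueError("the strings must not be empty")
    matches = sum(1 for a, b in zip(a_bits, b_bits) if a == b)
    return matches / len(a_bits)


# Made-up four-month strings for two nearby locations.
london = "1101"
nearby = "1001"
print(agreement(london, nearby))
```

    Anything less than 1.0 would mean the records were not accurate or consistent enough for one string to identify the other exactly.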

    Nevertheless I wonder if you agree that the conditions I set out would be a case of dFSCI which is not designed?

  4. 4 Mark Frank November 3, 2012 at 8:08 am

    Mung 253

    “Do you still think your strings exhibit dFSCI?

    Did you ever think they exhibited dFSCI, lol? (Just thought I’d ask.)”

    I am not sure. I created the examples to try and better define what dFSCI is. As a result I believe that if those examples are to be ruled out then dFSCI must have an additional condition:

    The function must be prespecified

    (which means that things like DNA and proteins do not have dFSCI as they are postspecified).

    “You also say that each string represents a specific set of papers.

    I take that to mean that all the papers identified by each string have something in common, other than the fact that they are available through PubMed.”

    No. I meant what I said. The function is to represent exactly those papers. There was no implication they had anything in common – although I dare say it would be possible to find something they had in common. It is possible to do that for almost any set of things.

    “If they can be shown to specify any other function, then would you agree that your function is not objective?”

    No. Any string can be shown to specify more than one function – even a string of DNA. That doesn’t make the function subjective.

  5. 5 Mark Frank November 4, 2012 at 1:59 pm

    Gpuccio 258

    Thanks for your response. I must say I am surprised by what you wrote. You seem to be saying that the reason that the “above average temperature record” is not dFSCI is because you know its origin (natural variation + necessity mechanism). This leads us straight into the circularity argument again – because the whole point of dFSCI was to determine the origin. Imagine I was to present you the string without telling you the origin. That is the scenario we are talking about. You would then need to determine whether there is dFSCI and, if there is, conclude it was designed. If you cannot tell whether something has dFSCI without first knowing the origin, it is not much use for determining the origin!

    I don’t think the “above average temperature record” string has dFSCI for a completely different reason. It needs a prespecified function and I haven’t found one yet. As you say you can always find a postspecified function (that is why you need to rule them out). “B” – the second string of somewhere physically close was intended to provide that prespecified function – the one string could be used to predict the other. This is an empirical relationship based on our empirical knowledge that temperatures in locations that are physically close are very similar. However, as I say, I can’t get good enough temperature records.

  6. 6 Mark Frank November 5, 2012 at 6:41 am

    TSZ seems to be working again (thanks Lizzie) so I have reverted to replying there.

