You are viewing a read-only archive of the Blogs.Harvard network.

Rock Bottom, Already?

The last twenty-four hours have been a challenge. When I settled down to work after dinner last night, it was time to go from describing the general trends in Figure 1 to identifying the particular characters that contribute to the first three PCO axes. To my utter shock, I found that, again, the characters showing up were ridiculous—clearly marginally important characters that were hitting high correlations with the axes and high statistical significance because they are valid in only a few genera, or only have a few valid states.

This was a huge blow to my confidence, having just spent an entire week recovering from the initial setback from the whole culling choice fiasco. I almost lost it entirely—that deep aching feeling in my chest came flooding in and I could feel myself spiraling down into utter despair. All those cumulative emotions of disappointment and failure from the years of aborted projects came rushing back, and it was not pleasant. I managed to regain my composure enough to decide to let it go for the day, go through a mindfulness meditation, and fall asleep.

Then I lost a good couple of hours today when my mother called and we ended up in a rather explosive argument which, though thematically disconnected, was probably not causally unrelated to yesterday’s revelations and today’s worries. This put even more of a damper on the day.

I took a walk, got some fresh air, and was eventually able to pull myself together and start to tackle the problem. One of the characters was in the category I had previously determined as “uninformative”, i.e. with fewer than two character states having more than one valid entry. So I went back once more to write (yet more!) code to automatically remove such characters. I think it now works, or at least I hope so. Now to see if this improves the list of characters that crops up as “most associated” with the first few axes.
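A minimal sketch of that “uninformative” rule, for illustration only. It assumes the character matrix is held as a dictionary of columns and that missing or inapplicable entries are coded as `None`; the names `is_informative` and `drop_uninformative` and the toy data are hypothetical, not the code actually used for the analysis.

```python
from collections import Counter

def is_informative(column, min_per_state=2, min_states=2):
    """True if at least `min_states` states each have `min_per_state` or more valid entries."""
    # Assumption: None marks a missing or inapplicable entry.
    counts = Counter(v for v in column if v is not None)
    well_sampled = sum(1 for n in counts.values() if n >= min_per_state)
    return well_sampled >= min_states

def drop_uninformative(matrix):
    """matrix: dict mapping character name -> list of states, one per genus."""
    return {name: col for name, col in matrix.items() if is_informative(col)}

# Made-up toy data, not real diatom characters.
toy = {
    "A": [0, 0, 1, 1, None],     # two states with two entries each: kept
    "B": [0, 0, 0, 1, None],     # state 1 has a single entry: dropped
    "C": [None, None, 2, 2, 2],  # only one state present: dropped
}
kept = drop_uninformative(toy)
print(sorted(kept))
```

Note that a character valid in only four genera, split two and two, still passes this rule, so the filter removes the worst offenders without guaranteeing that what remains is broadly applicable.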

So, I ran the new filtering and distance matrix calculation, hoping to have finally rid myself of the “uninformatives”, and now depressingly (or actually hilariously, at this juncture) found my marginal Cramér sums in Figure 1 to be the greatest on PCO axis 10 or so, not the first one…

This is utterly idiotic, because it makes it look like that axis (I guess it’s PCO 11) captures the most variance, not PCO 1. Well, perhaps there are other problems, too. The key question is what the top few characters on axes 1-3 are. They were idiotic ones last night, which caused me so much grief, and it also brought up the problem of how to choose the characters that contribute most to the axes: is it the characters with the most significant association (the lowest p-values) or the characters with the strongest association, i.e. the largest Cramér values? It seemed to me that Foote looked at significance only, and indeed the results were even more idiotic when I used Cramér last night. But that raises the question: why even plot Cramér on this idiotic figure, if I’m not going to use it?! Oh, the despair.

The biggest Cramér values with p < 0.05 on PCO 1:

  • X123, the relative thickness of raphe sides (an utterly minor character that has just 31 valid entries, of which 28 are in the zero state)
  • X60, the shape of the structural pattern center (this is actually a really good one)
  • X34, the shape of the mantle in cross section (I guess this is a reasonable one)
  • X26, the angle between mantle and valve face (umm, OK, similar to the above, I suppose)
  • X102, the location of labiate processes (what?! this one is totally ridiculous… I think)
  • X12, the general topography of the valve face (reasonable—this is overall shape)
  • X90, marginal ribs (silly)

On PCO 2:

  • X24, the shape of the central elevation (this is just bizarre—why this character? It’s subsidiary to the presence/absence of a central elevation, I just don’t get it)
  • X21, the shape of the apical elevation summit (this also seems like an irrelevant detail in the overall morphology)
  • X34, X60, X102, X26 as above
  • X114, the extent of the raphe (hmm… but why not the raphe character itself?)
  • X51, presence of a distinct central area (OK)

And PCO 3:

  • X46, whether rays are asymmetric (this is an extremely arcane one, only valid in 4 genera, and thereby only barely passing the “uninformative” criterion; totally ridiculous)
  • X114, as above
  • X28, whether the margin acts as linkage (this seems reasonable)
  • X121, raphe keel (hmm, seems esoteric—very detailed, only applies to 32 taxa)
  • X72, pore openings at inside (OK, this seems fairly universal)
  • X68, pore size (fine, I can buy this one)
  • X67, uniformity of pore size (ditto)
  • X76, pseudoloculate pores (I should probably take this one out, since it’s not strictly a morphologically agnostic character)

Well. This is not terribly encouraging. There are a few characters in there that describe overall shape or are widely shared characters, but a lot of the characters with high associations are just esoteric things that apply to only a few taxa. That isn’t satisfying, because I’d expect the most important characters to be stuff like the outline shape, whether it has a raphe or not, whether it has apical elevations, etc. Major things. Presumably this is because it’s just statistically easier to have a strong association with the PCO axes post hoc if you only have a few valid entries. In theory, the p-value should account for that, though, so maybe I’m just looking at the wrong thing, and Foote had it right just looking at the p-values?
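To make the small-sample point concrete, here is a rough sketch of how strength and significance can diverge. It assumes the axis scores have been binned into categories (Cramér's V needs two categorical variables), and it swaps in a permutation test for the p-value so that the example needs only the standard library; both choices are assumptions for illustration, not the procedure behind the lists above. The function names and the four-genus data are hypothetical.

```python
import math
import random
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V between two equal-length lists of category labels."""
    n = len(xs)
    rows, cols = Counter(xs), Counter(ys)
    k = min(len(rows), len(cols))
    if n == 0 or k < 2:
        return 0.0  # no association can be measured with a single category
    cells = Counter(zip(xs, ys))
    chi2 = 0.0
    for r, nr in rows.items():
        for c, nc in cols.items():
            expected = nr * nc / n
            chi2 += (cells.get((r, c), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

def permutation_p(xs, ys, n_perm=2000, seed=0):
    """Share of label shuffles whose V is at least as large as the observed V."""
    rng = random.Random(seed)
    observed = cramers_v(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cramers_v(xs, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up character valid in only four genera, against a binned axis score.
states = [0, 0, 1, 1]
axis_bin = ["low", "low", "high", "high"]
print(cramers_v(states, axis_bin), permutation_p(states, axis_bin))
```

In this toy case the association is perfect (V = 1), yet 2 of the 6 distinct ways of arranging the axis labels give the same perfect score, so the exact permutation p-value is 1/3. A character with very few valid entries can therefore top a ranking by strength without being significant, which is one argument for screening by p-value first and only then comparing Cramér values.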

These characters have the lowest p-values in PCO 1:

  • X35, warts or plaques on mantle (this is utterly crazy, why should that matter?! It has 109:11 in 0:1 states, so not easy to exclude)
  • X43, external costae or ribs (also doesn’t seem reasonable)
  • X111, a fascia (esoteric, only applies to raphids)
  • X39, brim (hmph)
  • X85, apical pseudoseptum (also not what I’d expect)
  • X1, outline shape (HALLELUJAH! this one I would expect)
  • X5, dorsoventrality (hmm)
  • X123, relative thickness of raphe sides, again

PCO 2 p-value champs:

  • X114, raphe extent again
  • X76, pseudoloculate bullshit again
  • X21, apical elevation summit shape again
  • X32, mantle symmetry (bullshit)
  • X43, see above
  • X27, ornament at mantle edge (at last, ONE character I might expect to see)
  • X69, pore shape (meh)

And with PCO 3, no surprises:

  • X12, general topography again, OK, fine
  • X46, that idiotic ray symmetry character again
  • X121, raphe keel again
  • X22, presence/absence of central elevation (OK, seems reasonable!)
  • X28, margin as linkage again, OK
  • X114, raphe extent again

So, essentially, many of the characters I would think would be important, and which LOOK like they are important from just inspecting the PCO 1-2 plot, are not actually on the top 5 or top 10 lists of either significance or strength of association on those axes. Instead, there’s a bunch of esoteric weirdo characters.

So what this basically tells me is that this whole Figure 1 exercise was a bullshit waste of time—my VERY FIRST HUNCH about empirical morphospaces based on many categorical characters was ABSOLUTELY SPOT ON: they’re stupid things to do because you DON’T KNOW WHAT THE AXES MEAN, and this whole statistical rigmarole has just proven that.

So, perhaps what I can write up in my paper is exactly that. Don’t expect these axes to mean what you think they mean, because they’re going to correlate with weird things you didn’t expect. And probably, I’ll just have to show the IMPORTANT characters (chosen by me, of course, because bullshit science doesn’t work the way you think it works, no, it seems to always go ass backwards from investigator to data, at least the shitty ass science I have somehow found myself doing) and how they plot onto the PCO 1-2, 2-3 morphospace.

God. This is so, so demotivating and utterly sad.

I’m calling it for tonight, off for dinner to weep some salty, salty tears into my chicken pot pie.

Tomorrow: start tracking time with TimeSink.

1 Comment

  1. Beau

    February 2, 2012 @ 10:45 am

    Hey Ben

    So sorry to read of all the troubles you’ve been discovering. Can’t pretend I fully understand the problem, but I understand enough to see why you’re despairing about the whole thing. Maybe, as you suggest, a critical framing is indeed the right one for this paper? Either way, my suggestion is to echo Mark Williams: take it easy on yourself.

    This is most assuredly NOT bullshit science. This is as good as science gets, if we take Wikipedia at its word: you’ve been systematic, you’ve tested theories and assumptions every step of the way, and now – while the result isn’t to your liking – you’re going to report on the whole exercise. I know it seems crazy and maybe even impossible given your level of investment in a certain kind of outcome, but if there’s any way you can step back from all this and recognize the tremendous diligence, honesty and integrity you’ve displayed at every step of this work – I feel that you’d be proud. I certainly would be. I routinely hack together pieces of work which are terrible – not in terms of what they show, but in the process by which they were constructed. I’m ashamed of these efforts. You, on the other hand, need not be.

    In other words, far from demotivating, this whole exercise just proves that you’re an excellent scientist unfortunately saddled with a tremendously difficult subject area. If there’s any chance you can ignore that gnawing, doubting voice and just get what you’ve done out on paper, I wager you’ll get out of the slump and be able to look back on this and see it for what I see it as – and, I imagine, what your committee will see it as: evidence of a tremendously smart and inquiring mind doing his damndest to solve a problem to the best of his ability. Whether the solution is desirable or not, the getting there is something you can bank (and something which will be bankable, once you get out in the world and let yourself attack other kinds of problems).

    Anyhoo, just my 2 cents. Let me know if you need to chat at all – at your service.

    B