Thus spoke Andy yesterday, when I presented him with the fruits of two very short but really productive bursts of work: one on Sunday (having finally carved out a morning’s worth of work from family hangout time) and another on Monday morning itself, squeezing all I could out of the hours before our 3pm meeting. In addition to the plots from last week, here’s what I had come up with:

This was my first success in writing a function that plots a point as a rectangle whose aspect ratio reflects the state of character #2 (valve face aspect ratio) for the genus that point represents. I thought this was pretty cool, and a first step in the “more data-rich plot” direction I had been hoping to head in for so long. Next, I overplotted this with symbols indicating the state of character #60, the highest-order taxonomic character, which divides diatoms into those whose structural pattern center is a ring-shaped annulus (centrics) and those whose structural pattern center is a linear sternum (pennates). This maps quite congruently onto the division between forms with mostly equiproportional versus elongate aspect ratios in valve view.
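
To make the rectangle idea concrete, here is a minimal sketch in Python of how such a glyph could be computed. It is not my actual plotting function: the name `rectangle_corners`, the numeric `aspect_ratio` argument (height divided by width), and the choice to hold the area constant so that every genus carries the same visual weight are all assumptions made for illustration.

```python
import math


def rectangle_corners(x, y, aspect_ratio, area=1.0):
    """Return the four corners of a rectangle centred on (x, y).

    aspect_ratio is height / width (hypothetical encoding of the
    character state); area is held fixed so that elongate and
    equiproportional glyphs take up the same amount of ink.
    """
    if aspect_ratio <= 0 or area <= 0:
        raise ValueError("aspect_ratio and area must be positive")
    # width * height = area and height = aspect_ratio * width
    width = math.sqrt(area / aspect_ratio)
    height = aspect_ratio * width
    half_w, half_h = width / 2.0, height / 2.0
    # Corners listed counter-clockwise, starting at the lower left.
    return [
        (x - half_w, y - half_h),
        (x + half_w, y - half_h),
        (x + half_w, y + half_h),
        (x - half_w, y + half_h),
    ]
```

The returned corner list can be handed to whatever polygon-drawing call the plotting environment provides. Keeping the area fixed is a design choice, not a requirement; fixing the width or the longest side would work just as well, but would make elongate genera look bigger than equiproportional ones.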

Finally, I altered the plotting function I had written to also reflect another character state, namely shape (elliptical, triangular, rectangular, etc.), leading to an even more data-rich plot:

I did notice an elongate triangle near the center of the plot, which doesn’t make sense, and which I eventually tracked down to a typo in my original matrix. I fixed the entry in the matrix (both the Numbers file and the .txt file), but haven’t re-run the whole analysis; I’ll have to do that before I generate the final-final figures for publication. Also, the horizontally elongate forms obviously make no sense: they have the same aspect ratio as the vertical orientation, but they represent asymmetrical (ovate) forms, shapes I haven’t figured out how to plot.
Andy’s response, as the title of this post suggests, was very positive and encouraging. I didn’t get around to telling him just how little of the variance is actually captured by the PCO axes, but whatever. Who gives a crap. I’ll write about that, and the different ways of calculating it, when I write it up, end of story. Andy really seemed to like the pattern of gradual expansion of morphospace through time shown by the new-and-improved plots, as expected. I voiced my doubt that this expansion is necessarily a real phenomenon (after all, there are only one and two sites in the two Cretaceous time bins, respectively, versus many more in subsequent time bins), but I also realized that this issue could be addressed by subsampling. If random subsamples of the Plio-Pleistocene time bin, drawn down to the size of the Cretaceous data, still cover a larger area of morphospace (albeit less densely) than is occupied in the Cretaceous, as I suspect they will, that would be some actual support for the contention that this pattern is real.
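
The subsampling idea can be sketched roughly as follows. This is a minimal illustration in Python, not the analysis itself, and it rests on several assumptions: each genus is a 2D point (its scores on the first two PCO axes), "area occupied" is measured with a plain convex hull rather than Sofya’s “shrink wrap” algorithm, and the subsampling is done over genera rather than over sites. All function names here are hypothetical.

```python
import random


def convex_hull(points):
    """Convex hull of 2D points (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area enclosed by the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def subsampled_areas(points, sample_size, n_reps=1000, seed=None):
    """Hull areas of n_reps random subsamples (without replacement)."""
    if sample_size > len(points):
        raise ValueError("sample_size exceeds the number of points")
    rng = random.Random(seed)
    return [hull_area(rng.sample(points, sample_size)) for _ in range(n_reps)]
```

With the Plio-Pleistocene genera as a list of `(x, y)` tuples and `sample_size` set to the number of Cretaceous genera, the fraction of subsampled areas that exceed the Cretaceous hull area gives a rough sense of whether the expansion survives equal sampling. Subsampling sites instead of genera would be closer to the sampling problem described above, but needs the site-by-genus occurrence data.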
In any case, probably the most terrifying part of the conversation (and perhaps the most inspiring?) was that Andy also said he thought it wasn’t too early to start writing this up. He suggested Paleobiology (but was quick to add that it was up to me, as it was my paper), where he would envisage a 25-page manuscript, not counting about half a dozen figures, the references, and a (probably large) supplement with raw data and extra stuff. He suggested I start with the methods and introduction. While I’m more than a little nervous at the thought of starting to write, not having had any time to think about the big picture yet, nor about what the analysis is supposed to show, he is probably right. Thinking by writing might be the best way to go at this stage, to keep up momentum. But there is so much analysis left to be done! Here is a list of what I’ve scrawled down from my meeting with Andy, and briefly beforehand:
- Figure out the PCO loadings—which characters contribute to which directions on the PCO plot
- Compare pairwise distances between genera in morphospace to distances on a phylogeny
- How do some ‘important’ characters map onto the morphospace?
- Compare the PCO morphospace to a morphospace derived from NMDS
- Calculate morphospace area occupied using Sofya’s “shrink wrap” algorithm
- Figure out how to plot ovate shapes on the morphospace with my drawShape function
- Add raphe and sternum/annulus to drawShape function
- Figure out other metrics of morphospace occupancy/disparity (average pairwise distance?)
- Compare morphology to diversity
- Finish 3D stacked plot showing emergence of particular characters through time (linking,…?)
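
For the “average pairwise distance” item in the list above, a minimal sketch in Python might look like the following. It assumes each genus is represented by a tuple of coordinates (for example its PCO scores, in any number of dimensions) and that plain Euclidean distance is the measure wanted; the function name is hypothetical, and the same average could instead be taken directly over the original dissimilarity matrix.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need at least two points")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)
```

Unlike an area measure, this one is not driven by a few outlying genera, so computing it per time bin alongside the area occupied would show whether the two tell the same story.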
Andy talked about the intersection of morphospace, diversity and phylogeny as being a good selling point and focus for the paper. This is cool, of course, but part of why it makes me nervous is that while I’ve done some of the morphospace part, I haven’t done any of the diversity or the phylogeny yet. Yikes!
In any case, this was both a terrifying and exciting conversation to have, and I’ve definitely been grinding my teeth and feeling generally quite on edge since then. Still, I didn’t get much done today. For starters, I took almost two hours to help Jc in a conversation with Justin, who is coming up to his quals and freaking out; I felt it was my turn to give back for all the people who talked me off the ledge and helped me prepare when I was going through that unpleasant period. But I also wanted to take some time to record and reflect on what I’d done and what came out of the conversation with Andy, and to do a bit of prioritizing/organizing of the tasks left to do before I throw myself back in. In addition to the tasks listed above, it makes sense to start thinking big picture, look over some relevant literature, and put the whole thing in context by writing.
So I think I want to set aside some time each day to read/write/think, versus some time to press ahead on the analysis. That way, I hope I can achieve/maintain some momentum on both fronts, let the two inform each other, and not get derailed by spending excess time and effort on either. Since the analysis produces the most tangible and visible results, perhaps the best way to go ahead is to get as much analysis done as possible before lunch every day, and as much reading/writing/thinking as possible in the afternoon?