Identifying Characters
Carved out some time from the turkey day festivities to make some headway in preparation for meeting Andy tomorrow. My first goal was to get the plot to reflect the state of individual characters in the original matrix, i.e. showing by plot symbol which genera have raphes and which do not. I figured it would be important to somehow reflect the color scheme of the polygons at z=0, which are colored by time bin, since the plot symbol color currently follows that scheme. Spent an incredible two hours trying to accomplish the seemingly simple task of placing a square with each time bin’s color next to the time bin’s label. Seemingly. In the end, the best I could do was this, and it took an idiotic amount of time and effort. Zoiks.
Anywho, now that I’ve done the most elementary part, on to the actual challenge. First, to choose a good first character to mess around with. Decided that character 60, which describes the shape of the structural pattern center (annular versus linear), is probably a good one. Figured out that by subsetting the original matrix m I could easily obtain a list of genus names for each character state, as in:
rownames(m[m[, "X60"] == 0, ])
Where X60 is character #60, and the list returned contains the genus names of genera with state 0 for that character (i.e., centric diatoms). Now it should be relatively easy to identify those sets on the plot.
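For the record, here is a minimal sketch of that subsetting step on toy data. The genus names, codings, the helper genera_with_state, and the stand-in occurrence table are all invented for illustration; only the X60 column name and N$Genus come from the real data. It indexes rownames() directly rather than subsetting the matrix first, on the assumption that some codings may be missing (NA) or that a state may match only a single row.

```r
# Toy stand-in for the character matrix; genus names and codings are invented.
m <- data.frame(
  X60 = c(0, 1, 0, NA),
  X61 = c(1, 1, 0, 0),
  row.names = c("GenusA", "GenusB", "GenusC", "GenusD")
)

# Hypothetical helper: names of genera with a given state for a character.
# which() drops NA codings, and indexing rownames() directly avoids the
# dimension-dropping that occurs when only one row of a matrix matches.
genera_with_state <- function(mat, char, state) {
  rownames(mat)[which(mat[, char] == state)]
}

state0 <- genera_with_state(m, "X60", 0)
state1 <- genera_with_state(m, "X60", 1)

# Toy stand-in for the occurrence table, one row per occurrence.
N <- data.frame(Genus = c("GenusA", "GenusB", "GenusC", "GenusA"))

# Plot symbol per occurrence: filled circle (16) for state 0, open (1) otherwise.
pch <- ifelse(N$Genus %in% state0, 16, 1)
```

The resulting pch vector can be passed straight to whatever plotting call draws the occurrences, leaving the color argument free for the time bins.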
In the process of doing this, however, I noticed that the list returned by unique(N$Genus) is shorter than the number of rownames in m. This suggests that there are genera I laboriously coded in the matrix that didn’t make it into the amended Neptune occurrence database. This is a stinking shame and an omission I need to fix, if only to ensure my hours of work weren’t totally wasted. A quick check, setdiff(rownames(m), as.character(unique(N$Genus))), identified the culprits as Kreagra, Microorbis, and Praethalassiosiropsis. Presumably I skipped over them while going through those idiotic, illogically arranged and spelling-mistake-infused Cretaceous papers.
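A small sketch of that consistency check, run in both directions, again on invented data (the genus names and the contents of m and N here are placeholders, not the real matrix or Neptune file). Checking the reverse direction as well is my own addition; it would catch genera that have occurrences but were never coded.

```r
# Toy stand-ins: a coded character matrix and an occurrence table.
m <- matrix(
  c(0, 1, 0),
  ncol = 1,
  dimnames = list(c("GenusA", "GenusB", "GenusC"), "X60")
)
N <- data.frame(Genus = factor(c("GenusA", "GenusC", "GenusC", "GenusE")))

occurring <- as.character(unique(N$Genus))

# Genera coded in the matrix but absent from the occurrence data.
missing_from_N <- setdiff(rownames(m), occurring)

# The reverse: genera with occurrences but no row in the matrix.
missing_from_m <- setdiff(occurring, rownames(m))

print(missing_from_N)
print(missing_from_m)
```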
Indeed, they were not in the updated Neptune file. So, it’s back to the papers. Yech. Well, they all turned out to be from site 693. Goodness knows how I missed them. Best not to ponder how many such glaring mistakes happened along the way to this dataset. Jeepers. I went ahead and added them in, anyway, and that seemed to solve the problem. I managed to get the basic plot-by-character thing working:
Now, this is the sort of plot that Sofya used in her thesis: she just went through a whole ream of characters and presented plots like this to carve up the space and get a sense for what’s where. I’m thinking there might be a somewhat more elegant way of doing that, but I’m not sure how yet. Maybe by drawing convex hulls around these areas, and overlaying them, or something.
Partly, the challenge is to identify what the interesting characters are, and plot those, where interesting means useful for answering the overarching research questions. Which is why I’m close to the point where I need to step back and think about the bigger picture, and about which questions I actually want to address with this “fabulous” data set.
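One way the convex hull idea might look, as a rough sketch only. It assumes each genus has a position on two axes of the space; the coordinates, axis names, character states, and the helper hull_rows are all invented here. chull() from base R's grDevices returns the indices of the hull vertices, which polygon() can then outline.

```r
# Invented 2-D coordinates for six genera; axis names are placeholders.
xy <- cbind(
  axis1 = c(0, 1, 1, 0, 0.5, 2),
  axis2 = c(0, 0, 1, 1, 0.5, 2)
)
rownames(xy) <- paste0("Genus", LETTERS[1:6])

# Invented character states: the first five genera share state 0.
state <- c(0, 0, 0, 0, 0, 1)

# Hypothetical helper: row indices of xy that form the convex hull
# of the genera flagged in in_group.
hull_rows <- function(coords, in_group) {
  idx <- which(in_group)
  idx[chull(coords[idx, , drop = FALSE])]
}

h0 <- hull_rows(xy, state == 0)

# Points by character state, with the state-0 hull outlined.
plot(xy, pch = ifelse(state == 0, 16, 1))
polygon(xy[h0, , drop = FALSE], border = "grey40")
```

Repeating the polygon() call for each state, or for several characters with different border colors, would give the overlay.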



