Siliceous and Delicious

The (Morphospace) Plot Thickens

kotrc - August 24, 2011 @ 1:05 pm · Research Journal, Timekeeping

Here, at long last, my very first visualization of the morphospace I’ve labored so long and hard to put together…

After getting well and truly stuck trying to understand what Boyce meant by transforming the dissimilarity matrix so the centroid was at zero, and where that statement came from in the Gower paper, I decided that it was all just slowing me down too much. So I re-ran the dissimilarity matrix algorithm to produce a full matrix, which, as I had found out from reading the R function documentation, is the required input to the cmdscale() function I’m using. (Re-running was faster than writing up new code to reflect the existing half-matrix, and less liable to me making a mistake there!) Sent my dissimilarity matrix to the function, plotted up the resulting list of points, added the taxon names as labels, and hey presto! A product! Woot!
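For readers stuck at the same point: in classical scaling, the method cmdscale() is documented as implementing, "centroid at zero" is usually read as double-centering the squared dissimilarities, so that every row and column of the resulting matrix sums to zero. The following is a minimal sketch of that reading in plain Python, not the R code used for this post and not necessarily what Boyce intended. The function names and the three-taxon toy matrix are hypothetical, and it also shows the half-matrix mirroring mentioned above.

```python
def mirror_half_matrix(half):
    """Turn a lower-triangular dissimilarity matrix into a full symmetric one."""
    n = len(half)
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):  # strictly below the diagonal; the diagonal stays 0
            full[i][j] = full[j][i] = float(half[i][j])
    return full


def double_center(d):
    """Double-center the squared dissimilarities: B = -1/2 * J * D^2 * J."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three taxa sitting on a line at positions 0, 1 and 3,
# stored as a half-matrix (row i holds the distances to taxa 0..i).
half = [[0], [1, 0], [3, 2, 0]]
full = mirror_half_matrix(half)
b = double_center(full)

# Each row of b sums to (numerically) zero, which is the "centroid at the
# origin" condition; the eigenvectors of b then give the ordination axes.
for row in b:
    print([round(value, 3) for value in row], "row sum ~", round(abs(sum(row)), 9))
```

The eigendecomposition step is left out on purpose: the centering is the part that was causing confusion, and it can be checked by hand on a matrix this small.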

Now, figuring out what this means is going to be another task altogether. Right now this just looks like a random scattering of points with a bizarre outlier, Pseudorutilaria, and I’m not at all sure why it’s so far away from everything else. I’d like to label each taxon as a centric or a pennate, then further split these into radial and multipolar centrics and araphid and raphid pennates, to see if the major groups fall out separately in the space.

The weird outlier may have something to do with data quality along rows. It’s probably worth doing the analysis I showed in the last post, but for genera: rather than seeing how complete each character is in terms of valid coding, seeing how complete each genus is in terms of valid characters. That way perhaps I can rerun the morphospace analysis with a few “bad apples” left out, if that’s what Pseudorutilaria is. It may of course be an indication that, because of the garbagey quality of my character coding, the whole morphospace represents nothing. Garbage in, garbage out.

This is what the genus data quality looks like:

No genera have more than 80% valid characters. More than half have over 60% valid characters. So not fantastic. If I widen the net and also count “v” character states as invalid, it should look even worse.

And indeed, things are worse, but not substantially different. What this suggests, in any case, is that a) the quality is pretty bad, and b) for a few genera, it’s really bad: less than half the characters have valid states. These may be worth getting rid of. Or at least I should re-run the analysis without them to see if it affects the results substantially.
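Flagging those genera comes down to a per-row count of valid character states. Here is a minimal sketch in plain Python rather than the R used for the analysis. It assumes that missing states are coded as "?" (a hypothetical placeholder, since the actual code is not stated here) and lets "v" be counted as invalid on request. The function names and the toy matrix are made up for illustration.

```python
def valid_fraction(row, invalid=("?",)):
    """Fraction of character states in one genus row that count as valid."""
    valid = sum(1 for state in row if state not in invalid)
    return valid / len(row)


def flag_incomplete(matrix, threshold=0.5, invalid=("?",)):
    """Return 1-based row numbers whose valid fraction falls below threshold."""
    return [
        number
        for number, row in enumerate(matrix, start=1)
        if valid_fraction(row, invalid) < threshold
    ]


# Hypothetical toy matrix: one row per genus, one column per character.
coded = [
    ["0", "1", "?", "1"],  # 3 of 4 valid
    ["?", "?", "v", "1"],  # 2 of 4 valid, or 1 of 4 if "v" is invalid
    ["?", "?", "?", "0"],  # 1 of 4 valid
]

print(flag_incomplete(coded))                      # prints [3]
print(flag_incomplete(coded, invalid=("?", "v")))  # prints [2, 3]
```

Rows are numbered from 1 so that the result lines up with matrix row numbers like the ones listed next.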

The genera with fewer than half their characters validly coded are (row numbers, counted from 1):

36 97 28 29 39 100 33 42 38 96 102 123 25 75 17

Cussia, Pseudoeunotia, Cladogramma, Clavicula, Cymatogonia, Pseudostictodiscus, Cosmiodiscus, Cymatotheca, Cymatodiscus, Pseudodimerogramma, Pyrgupyxis, Stephanogonia, Cestodiscus, Lisitzinia, Baxteriopsis.

Interestingly, Pseudorutilaria is not on the list. How complete is its character list? It’s row 99 in the matrix. But it has 72% valid characters, so it must actually be a legit outlier.

Of the other outlier-ish taxa on the plot above, Pseudoeunotia is on the list, but that’s about it. Hmmmm.

