{"id":1800,"date":"2011-08-24T13:05:47","date_gmt":"2011-08-24T17:05:47","guid":{"rendered":"http:\/\/blogs.law.harvard.edu\/kotrc\/?p=1800"},"modified":"2011-08-24T16:21:26","modified_gmt":"2011-08-24T20:21:26","slug":"the-morphospace-plot-thickens","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/kotrc\/2011\/08\/24\/the-morphospace-plot-thickens\/","title":{"rendered":"The (Morphospace) Plot Thickens"},"content":{"rendered":"<p>Here, at long last, my very first visualization of the morphospace I&#8217;ve labored so long and hard to put together&#8230;<\/p>\n<p><a href=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/FullSpacePlot1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1801\" src=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/FullSpacePlot1.png\" alt=\"\" width=\"480\" height=\"480\" srcset=\"https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/FullSpacePlot1.png 480w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/FullSpacePlot1-150x150.png 150w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/FullSpacePlot1-300x300.png 300w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/a>After getting well and truly stuck trying to understand what Boyce meant by transforming the dissimilarity matrix so the centroid was at zero, and where that statement came from in the Gower paper, I decided that it was all just slowing me down too much. So I re-ran the dissimilarity matrix algorithm so it would produce a full matrix (faster than writing new code to reflect the existing half-matrix, and less likely to introduce a mistake!). From reading the R function documentation, I had found out that a full matrix is the input required by the <em>cmdscale()<\/em> function I&#8217;m using. I sent my dissimilarity matrix to the function, plotted the resulting list of points, added the taxon names as labels, and hey presto! A product! 
Woot!<\/p>\n<p>Now, figuring out what this means is going to be another task altogether. Right now this just looks like a random scattering of points with one bizarre outlier, <em>Pseudorutilaria<\/em>, and I&#8217;m not at all sure why it sits so far away from everything else. I&#8217;d like to label each of these taxa as a centric or a pennate, then further split the centrics into radial and multipolar and the pennates into araphids and raphids, to see if the major groups fall out separately in the space.<\/p>\n<p>The weird outlier may have something to do with data quality along rows. It&#8217;s probably worth repeating the analysis I showed in the last post, but for genera: rather than seeing how complete each character is in terms of valid coding, seeing how complete each genus is in terms of valid characters. That way perhaps I can rerun the morphospace analysis with a few &#8220;bad apples&#8221; left out, if that&#8217;s what <em>Pseudorutilaria<\/em> is. It may of course be an indication that, because of the garbagey quality of my character coding, the whole morphospace represents nothing. Garbage in, garbage out.<\/p>\n<p>This is what the genus data quality looks like:<\/p>\n<p><a href=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1805\" src=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality1.png\" alt=\"\" width=\"480\" height=\"480\" srcset=\"https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality1.png 480w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality1-150x150.png 150w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality1-300x300.png 300w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/a>No genera have more than 80% valid characters. More than half have over 60% valid characters. So not fantastic. 
If I widen the net and also count &#8220;v&#8221; character states as invalid, it should look even worse.\u00a0<a href=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1806\" src=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality2.png\" alt=\"\" width=\"480\" height=\"480\" srcset=\"https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality2.png 480w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality2-150x150.png 150w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2011\/08\/GenusQuality2-300x300.png 300w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/a><\/p>\n<p>And indeed, things are worse, but not substantially different. What this suggests, in any case, is that a) the quality is pretty bad, and b) for a few genera, it&#8217;s really bad\u2014less than half the characters have valid states. These may be worth getting rid of. Or, at least re-run the analysis without them to see if it affects the results substantially.<\/p>\n<p>The genera are (numbered from 1):<\/p>\n<p>36 \u00a097 \u00a028 \u00a029 \u00a039 100 \u00a033 \u00a042 \u00a038 \u00a096 102 123 \u00a025 \u00a075 \u00a017<\/p>\n<p><em>Cussia, Pseudoeunotia, Cladogramma, Clavicula, Cymatogonia, Pseudostictodiscus, Cosmiodiscus, Cymatotheca, Cymatodiscus, Pseudodimerogramma, Pyrgupyxis, Stephanogonia, Cestodiscus, Lisitzinia, Baxteriopsis.<\/em><\/p>\n<p>Interestingly, <em>Pseudorutilaria<\/em> is not on the list. How complete is its character list? It&#8217;s row 99 in the matrix. But it has 72% valid characters, so it must actually be a legit outlier.<\/p>\n<p>Of the other outlier-ish taxa on the plot above,\u00a0<em>Pseudoeunotia <\/em>is on the list, but that&#8217;s about it. 
Hmmmm.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here, at long last, my very first visualization of the morphospace I&#8217;ve labored so long and hard to put together&#8230; After having gotten well and truly stuck trying to understand what Boyce meant by transforming the dissimilarity matrix so the centroid was at zero, and where that statement came from in the Gower paper, I [&hellip;]<\/p>\n","protected":false},"author":2222,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14607,13584],"tags":[16233],"class_list":["post-1800","post","type-post","status-publish","format-standard","hentry","category-research-journal","category-timekeeping","tag-morphospace"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/1800","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/users\/2222"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/comments?post=1800"}],"version-history":[{"count":4,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/1800\/revisions"}],"predecessor-version":[{"id":1804,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/1800\/revisions\/1804"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/media?parent=1800"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/categories?post=1800"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu
\/kotrc\/wp-json\/wp\/v2\/tags?post=1800"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}