{"id":2051,"date":"2012-01-11T16:29:42","date_gmt":"2012-01-11T21:29:42","guid":{"rendered":"http:\/\/blogs.law.harvard.edu\/kotrc\/?p=2051"},"modified":"2012-01-11T16:30:55","modified_gmt":"2012-01-11T21:30:55","slug":"pickin-up-mo-mentum","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/kotrc\/2012\/01\/11\/pickin-up-mo-mentum\/","title":{"rendered":"Pickin&#8217; Up Mo&#8217; Mentum"},"content":{"rendered":"<p>Have missed a couple of days of note-taking\u2014mostly due to fierce productivity. Finally cracked the % variance explained problem using a two-pronged approach, first by the ratio of eigenvalues to the sum of eigenvalues (or the trace of the Gower-transformed matrix, which are supposed to be the same thing), second by the sum of the\u00a0<em>r<\/em>-squared values of the correlation between squared distances in the original data matrix against the squared (Euclidean) distances in the PCO-space of the first\u00a0<em>n <\/em>principal coordinates.<\/p>\n<p>This fantastic success under my belt, I spent a couple of fairly agonizing days trying to understand the description of the PCO analogue to PCA axis &#8220;loadings&#8221; described by Foote in both his &#8217;95 and &#8217;99 crinoid papers. Eventually tracked down the reference he cites for the statistics he uses to calculate coefficients of association between the categorical characters in the original matrix (on a &#8220;nominal scale&#8221;) and the PCO scores on each axis (a continuous character, or one on an &#8220;interval scale&#8221;, as Siegel &amp; Castellan call it). 
The PCO scores need to be discretized, which is <em>very\u00a0<\/em>easily done with R&#8217;s <em>cut() <\/em>function.<\/p>\n<p>Anyway&#8230; As of Wednesday morning I feel that I&#8217;m on the brink of figuring this shit out, so I popped into Andy&#8217;s office first thing and let him know he wasn&#8217;t going to be getting my draft outline yet, because I am on a roll and want to see if I can crack this beast.<\/p>\n<p>After an <em>incredibly <\/em>frustrating but focused day, I was able to implement the Cram\u00e9r coefficient, write code that calculates it and the p-value for each combination of PCO axes, and (most difficult of all) generate a plot that summarizes the results:<\/p>\n<p style=\"text-align: center\"><a href=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2012\/01\/Rplot.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-2053\" src=\"http:\/\/blogs.law.harvard.edu\/kotrc\/files\/2012\/01\/Rplot-300x244.png\" alt=\"\" width=\"300\" height=\"244\" srcset=\"https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2012\/01\/Rplot-300x244.png 300w, https:\/\/archive.blogs.harvard.edu\/kotrc\/files\/2012\/01\/Rplot.png 550w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>The color scale shows the significance of each combination\u2014black is a p-value of 0 (very significant), white is a p-value of 0.05 (not as significant), and everything above 0.05 has been thrown out. The size of the circles shows the degree of association, bigger circles implying a stronger association (larger Cram\u00e9r coefficient).<\/p>\n<p>I need to add a legend to this, and maybe fix the outline color of the circles, but I&#8217;m done for today. Tomorrow I will try to sort the association pairs by Cram\u00e9r coefficient and make a table of the, say, 20 largest associations and which PCO axes they&#8217;re on, to get a sense for what determines the axes most. 
But for today, this has been a pretty huge accomplishment, and today is the first anniversary of our official city hall wedding, so I&#8217;m off!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Have missed a couple of days of note-taking\u2014mostly due to fierce productivity. Finally cracked the % variance explained problem using a two-pronged approach, first by the ratio of eigenvalues to the sum of eigenvalues (or the trace of the Gower-transformed matrix, which are supposed to be the same thing), second by the sum of the\u00a0r-squared [&hellip;]<\/p>\n","protected":false},"author":2222,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14607,13584],"tags":[16233,6277],"class_list":["post-2051","post","type-post","status-publish","format-standard","hentry","category-research-journal","category-timekeeping","tag-morphospace","tag-motivation"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/2051","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/users\/2222"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/comments?post=2051"}],"version-history":[{"count":6,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/2051\/revisions"}],"predecessor-version":[{"id":2057,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/posts\/2051\/revisions\/2057"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/media?parent=2051"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.
blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/categories?post=2051"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/kotrc\/wp-json\/wp\/v2\/tags?post=2051"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}