{"id":187,"date":"2022-05-23T14:14:43","date_gmt":"2022-05-23T05:14:43","guid":{"rendered":"http:\/\/blogs.harvard.edu\/adamnoto\/?p=187"},"modified":"2022-05-23T14:14:43","modified_gmt":"2022-05-23T05:14:43","slug":"how-to-merge-a-subfolder-git-into-its-parents-git-repo","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/2022\/05\/23\/how-to-merge-a-subfolder-git-into-its-parents-git-repo\/","title":{"rendered":"How to merge a subfolder git into its parent&#8217;s git repo"},"content":{"rendered":"<p>For years I have kept a centralized repo for my notes, code snippets, diagrams, lab code, and basically everything I have learned that I think will be valuable. I do that because I know I will want to come back to it someday and regain that knowledge quickly. I find this organization to be extremely helpful.<\/p>\n<p>So, naturally, when I took <a href=\"https:\/\/cscie95.dce.harvard.edu\/spring2022\/index.html\">CSCI-E95 last semester<\/a>, which was awesome, I wanted whatever I learned there to live inside that centralized repo as well. And of course, the code is of the utmost importance.<\/p>\n<p><!--more--><\/p>\n<p>However, the class has its own working repo, to which we students submit our work. So, even if I pull that class repo into my primary repo, the class repo will always be a distinct repo in its own right. That is, even if my folder looks like this:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\r\nroot\r\n\u251c\u2500\u2500 labs\r\n\u251c\u2500\u2500 languages\r\n\u251c\u2500\u2500 ...\r\n\u2514\u2500\u2500 classes\r\n    \u2514\u2500\u2500 hes\r\n        \u2514\u2500\u2500 e95-spring-2022-adamnoto\r\n<\/pre>\n<p>The <code>e95-spring-2022-adamnoto<\/code>\u00a0folder lives in its own repo, detached from the <code>root<\/code>\u00a0folder&#8217;s repo.<\/p>\n<p>How should I merge it? 
&#8230;that&#8217;s what I wondered.<\/p>\n<p>The easiest thing I can do is to simply remove the <code>.git<\/code>\u00a0folder inside the E95 repo and commit the folder into my main repo.<\/p>\n<p>But then, I would lose my carefully crafted commit history:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-188\" src=\"http:\/\/blogs.harvard.edu\/adamnoto\/files\/2022\/05\/Screen-Shot-2022-05-23-at-11.25.33.png\" alt=\"Commits history of my E95 repo\" width=\"761\" height=\"154\" srcset=\"https:\/\/archive.blogs.harvard.edu\/adamnoto\/files\/2022\/05\/Screen-Shot-2022-05-23-at-11.25.33.png 761w, https:\/\/archive.blogs.harvard.edu\/adamnoto\/files\/2022\/05\/Screen-Shot-2022-05-23-at-11.25.33-300x61.png 300w\" sizes=\"auto, (max-width: 761px) 100vw, 761px\" \/><\/p>\n<p>I could also make the E95 repo a submodule of the parent repo. But that&#8217;s not desirable either, as the E95 repo is not in my &#8220;domain.&#8221; Although I believe I will always have access to the repo, or can ask to regain access if I lose it for some reason, the fact remains that I have no real control over the source repo. What if someone mistakenly force-pushes something and deletes everything, and I have no backup? My hard work may be gone forever. (Yeah, I do have a backup; but still! 
?)<\/p>\n<p>So naturally, I want to:<\/p>\n<ul>\n<li>Merge the E95 repo into a repo owned and controlled by me<\/li>\n<li>Retain the commit history<\/li>\n<\/ul>\n<p>I searched the internet for a way to achieve this and came across this\u00a0<a href=\"https:\/\/markvanlent.dev\/2013\/11\/02\/merge-a-separate-git-repository-into-an-existing-one\/\">awesome post<\/a>\u00a0that did the trick.<\/p>\n<p>Let&#8217;s do it step by step.<\/p>\n<h2>First, clone the source repo<\/h2>\n<p>Exactly as described in that post, I first cloned the source repo into the <code>\/tmp<\/code>\u00a0folder, so I ended up with\u00a0<code>\/tmp\/e95-spring-2022-adamnoto<\/code>. Then I <code>cd<\/code>\u00a0into it: <code>cd \/tmp\/e95-spring-2022-adamnoto<\/code>.<\/p>\n<p>Surely you will have a different repo with a different name. The point is to clone it into, let&#8217;s just say, a folder under <code>\/tmp<\/code>\u00a0and then <code>cd<\/code>\u00a0into that repo&#8217;s folder.<\/p>\n<p>Also, you may want to remove the <code>origin<\/code>\u00a0remote, so that we won&#8217;t push anything to the source repo by mistake.<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\r\ngit remote remove origin\r\n\r\n<\/pre>\n<h2>Rewrite the commit history of the source repo<\/h2>\n<p>Next, we use <code>git filter-branch<\/code>\u00a0to rewrite the commits of that source repo. I want my source repo, when merged into the primary, centralized repo, to be located under this folder: <code>classes\/hes\/<\/code>. 
So, I should be able to find my\u00a0<code>e95-spring-2022-adamnoto<\/code> in that subfolder inside my primary repo.<\/p>\n<p>To do that, I run <code>git filter-branch<\/code>\u00a0from within the E95 repo I cloned into the <code>\/tmp<\/code>\u00a0folder earlier.<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\ngit filter-branch --index-filter \\\r\n'git ls-files -s | sed &quot;s-\\ -&amp;classes\\\/hes\\\/e95\\-spring\\-2022\\-adamnoto\/-&quot; |\r\nGIT_INDEX_FILE=$GIT_INDEX_FILE.new \\\r\ngit update-index --index-info &amp;&amp;\r\nmv &quot;$GIT_INDEX_FILE.new&quot; &quot;$GIT_INDEX_FILE&quot;\r\n' HEAD\r\n<\/pre>\n<p>The command above is slightly different from the one in the original post. For example, instead of <code>\\t<\/code>\u00a0in the <code>sed<\/code>\u00a0expression, I typed a real tab by pressing <code>Ctrl+V<\/code>\u00a0and then the <code>tab<\/code>\u00a0key, which helps because some <code>sed<\/code> implementations (the BSD one on macOS, for instance) do not interpret <code>\\t<\/code> as a tab.<\/p>\n<h2>Clone the destination repo<\/h2>\n<p>Yes! You heard that right. Although you may already have the destination repo somewhere, please clone a fresh copy into the same <code>\/tmp<\/code>\u00a0folder to keep things simple. By now, we should have both the source repo and the destination (target) repo inside the <code>\/tmp<\/code>\u00a0folder.<\/p>\n<h2>Merging from the source to the target repo<\/h2>\n<p>It&#8217;s time to merge our source repo into the destination repo. First, we must\u00a0<code>cd<\/code>\u00a0into the folder where we cloned our destination repo earlier. 
Then, add the source repo as a remote:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\r\ngit remote add -f source \/tmp\/e95-spring-2022-adamnoto\r\n\r\n<\/pre>\n<p>Of course, you will want to adjust the path in the command above, as your source folder will likely be different.<\/p>\n<p>Then, still within the same directory (that is, the destination repo), we perform <code>git merge<\/code>:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\r\ngit merge --allow-unrelated-histories source\/master\r\n\r\n<\/pre>\n<p>The <code>--allow-unrelated-histories<\/code>\u00a0flag is needed on my version of <code>git<\/code>. Git 2.9 and later refuse to merge histories that share no common ancestor unless this flag is given; older versions do not know the flag and merge unrelated histories without it. Note that <code>source\/master<\/code> assumes the branch in the source repo is named <code>master<\/code>, so substitute your own branch name if it differs.<\/p>\n<p>That&#8217;s it! Now we can remove the <code>source<\/code>\u00a0remote using <code>git remote remove source<\/code>\u00a0and then push our repo.<\/p>\n<p>In my own experience, I did not even need to force push. I also had no conflicts to resolve. Your mileage may vary, but I think that should be the case for you too.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For years I have kept a centralized repo for my notes, code snippets, diagrams, lab code, and basically everything I have learned that I think will be valuable. I do that because I know I will want to come back to it someday and regain that knowledge quickly. 
I find this organization to be extremely helpful [&hellip;]<\/p>\n","protected":false},"author":10207,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6317],"tags":[3772],"class_list":["post-187","post","type-post","status-publish","format-standard","hentry","category-engineering","tag-git"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/posts\/187","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/users\/10207"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/comments?post=187"}],"version-history":[{"count":17,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/posts\/187\/revisions"}],"predecessor-version":[{"id":293,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/posts\/187\/revisions\/293"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/media?parent=187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/categories?post=187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/adamnoto\/wp-json\/wp\/v2\/tags?post=187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}