mulhern_at_yocto

So, I spent a while working on a set of Python utilities of about a thousand lines altogether, and not surprisingly, I had a lot of commits, around eighty or so, by the time I was done. Of course, during the development process, those commits were useful, but now that I'm done, nobody should be forced to go through them all. They should get just a few commits, representing the logical divisions of the work, which they can then apply to the master branch.

Achieving this automatically is astonishingly easy. I have two branches, a master branch and my development branch. I switch back to master, pull the current version, and make sure it's in a good state. Then I make a new branch from master, my development-final branch, and switch to that. At this point it is an exact copy of the master branch. Then I merge my changes from the development branch into the development-final branch, but I add the --squash flag.

git figures out an important fact, which is that my development branch branched from the master at a particular point in the past. Anything that has landed on master since then is already in the development-final branch, so git does not treat those updates as differences between the development and development-final branches. The changes I merge into the development-final branch are exactly my own development and nothing else. Also, the changes are staged for commit, i.e., they are not yet committed. This means that I can unstage, restage, and commit in whatever way I like. Handy!
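Here is a minimal sketch of that workflow in a scratch repository. The file names, commit messages, and the user identity are made up for illustration; only the branch names come from the description above.

```shell
#!/bin/sh
set -e

# Scratch repository, so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

# A first commit on master.
echo "base" > README
git add README
git commit -q -m "Initial commit"
git branch -M master

# The messy development branch: several small commits.
git checkout -q -b development
echo "def parse(): pass" > parser.py
git add parser.py
git commit -q -m "wip: parser"
echo "def report(): pass" > report.py
git add report.py
git commit -q -m "wip: report"
echo "def parse(): return []" > parser.py
git commit -q -am "fix parser"

# Meanwhile master moves on independently.
git checkout -q master
echo "more docs" >> README
git commit -q -am "Update README"

# Fresh branch from the current master, then squash-merge development into it.
git checkout -q -b development-final
git merge --squash development >/dev/null

# Everything from development is now staged but not committed.
git status --short

# Unstage it all, then restage and commit in logical pieces.
git reset -q
git add parser.py
git commit -q -m "Add parser utility"
git add report.py
git commit -q -m "Add report utility"

# The patch set that would be applied to master.
git log --oneline master..development-final
```

The three work-in-progress commits end up as two commits on top of the current master, and the README update that happened on master in the meantime is left alone.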

When this patch set is finally applied to the master, I will be free to delete both the development and development-final branches.

Of course, what may often happen is that as soon as I've made the development-final branch and begun to work with it to get my patch set ready, I realize that there is something not quite right about my development. In that case, I switch back to the development branch, and unless I've already unstaged or committed something, my development-final branch simply becomes clean again. I can do my fixes in the development branch, and when I'm done, switch to the development-final branch and do a merge --squash again. If I have unstaged or committed, I'll have to do a little cleanup before I switch to the development branch, either by adding or by doing a reset --hard. But that is all second nature by now.
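The redo loop might look something like the sketch below. It takes the cautious route and always does a reset --hard before switching back, which is one way to do the cleanup, not the only one. The file name and commit messages are invented for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
echo "base" > README
git add README
git commit -q -m "Initial commit"
git branch -M master

git checkout -q -b development
echo "version 1" > tool.py
git add tool.py
git commit -q -m "wip: tool"

# First attempt at the patch set.
git checkout -q -b development-final master
git merge --squash development >/dev/null

# Something is wrong: throw the staged squash away and fix it in development.
git reset -q --hard
git checkout -q development
echo "version 2" > tool.py
git commit -q -am "fix tool"

# Second attempt: squash again, this time picking up the fix.
git checkout -q development-final
git merge --squash development >/dev/null
git commit -q -m "Add tool utility"

git show --stat --oneline HEAD
```

If something had already been committed on development-final, the reset would need to name the starting point, e.g. git reset --hard master.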

It's curious that a branch becomes dirty if you unstage something. In that case, if you try to switch to another branch, git will complain, saying that your unstaged changes will be lost. But if all your changes are staged and you switch, git will not complain, and when you come back the staged changes are gone. (In my case the staged content is identical to what is already in the development branch, which may be why git lets the switch through.) I don't yet quite understand the rationale.

Two years ago I used git for a bit because GitHub uses git and GitHub was convenient for my needs at that time.

This allowed me to make one observation of some societal interest. People who use GitHub may not even know that "GitHub" is a portmanteau word, i.e., git + hub. In fact, they may not know about the existence of git at all. To some, GitHub is just the same as Dropbox, only, in some indefinable and mysterious way, cooler, and only for text. I don't know how that happened or how many people actually use it that way, but that some do is a fact.

Now, I knew that, in certain very important ways, git was not just like Subversion. It was, I was told, one of the new kind of distributed VCSs. Of course, in a perfect world I would have rapidly learned to exploit everything that was different, and perhaps better, about git. But I was very busy and git could be used just like Subversion and so...that's what I did.

Back when I was involved in a research project with multiple developers using Subversion, I noticed a kind of pattern. A colleague would commit a change that affected a whole bunch of files and changed a lot of lines. Then, they would commit a bunch of small changes, one after the other, that really should have gone in with the initial commit. Often the note attached to the commit was just something very basic like "Should have gone in with previous." or "fixes a small bug in previous commit." or something like that. I often thought how nice it would be if all those little emendations could somehow be attached to or combined with the original commit that got it all started. But in Subversion, you can't really do any sort of combining of commits without having superpowers and doing lots of fancy stuff. That kind of thing is just not part of the expected Subversion workflow.

Well, in git you can split, combine, and otherwise edit your commits freely using the git rebase command. This is because git divides your actions into two phases, whereas Subversion just has a single phase. In git, when you commit, you have only completed the first phase and you've only affected the copy of the code that you have. You haven't yet "pushed" your changes to some remote repository (which probably exists so that it can be shared with someone else). So you can rearrange all the commits in your private copy in many ways before you expose your work to the rest of the world by pushing it.
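As a rough sketch of the "should have gone in with previous" case: one way to fold a small follow-up into the commit it belongs to is a fixup commit plus an autosquash rebase. This is only one of several ways to use git rebase, and the file name, messages, and identity below are made up. Setting GIT_SEQUENCE_EDITOR=true just accepts the generated to-do list so the example runs without opening an editor.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

echo "base" > README
git add README
git commit -q -m "Initial commit"

# The big change.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# The small emendation, recorded as a fixup of the previous commit.
echo "feature, fixed" > feature.txt
git commit -q -a --fixup HEAD

# History is now: Initial commit, Add feature, fixup! Add feature.
# Fold the fixup into the commit it targets.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline
```

After the rebase the history has two commits, and "Add feature" already contains the fix. All of this happens in the private copy, before anything is pushed.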

On the other hand, Subversion's commit combines git's commit and git's push into a single action.
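To make the two phases concrete, here is a small sketch that uses a bare repository in a temporary directory as a stand-in for the shared remote. The directory names, file name, and identity are assumptions for the example.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)

# A bare repository plays the role of the shared remote.
git init -q --bare "$work/shared.git"
git clone -q "$work/shared.git" "$work/mine" 2>/dev/null
cd "$work/mine"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First"
git branch -M master
git push -q -u origin master

# Phase one: commit. Only my own copy changes.
echo "two" >> notes.txt
git commit -q -am "Second"
ahead_before=$(git rev-list --count origin/master..master)
echo "commits not yet pushed: $ahead_before"

# Phase two: push. Now the shared repository has it too.
git push -q origin master
ahead_after=$(git rev-list --count origin/master..master)
echo "commits not yet pushed: $ahead_after"
```

Between the commit and the push, the second commit exists only locally, which is the window in which it can still be rearranged.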

How, really, will this difference change how I develop code and inflict it on the outside world? Will it make things better or just more nerve-wracking? We shall see.


Entries tagged with git

Getting Better with Git

git rebase


September 2013
