Re: [Wikitech-l] SVN update on live sites to r43514, questions on branching

16 Nov 2008

Aryeh Gregor wrote:
...
  If you're going to correct someone's use of technical terminology like
 that, maybe you should explain what you think the difference is?  To
 me, "branching" includes the entire process of creating, maintaining,
 and re-merging branches, which seems to be precisely what Brion means. 
Branching is a more general term. An svn checkout, for example, is a
branch. Not the kind of named branch you create in the repository, but a
small, one-level branch, nevertheless.

Most branches created in Subversion are release branches and tags
(create-only, locked branches), which either don't require merge
tracking or share only single commits between them.

Also, Subversion supports the idea of inner branches, which no other
version control system (AFAIK) supports. For example:

   # $repo is assumed to hold the repository URL of this checkout
   cd skins/
   svn cp Monobook.php MyOwnSkin.php
   svn ci -m "branched Monobook skin"
     # make changes in MyOwnSkin.php
   svn ci -m "changes to my skin"
     # time passes
   svn update
     # changes to Monobook are received
   svn merge "$repo/skins/Monobook.php" MyOwnSkin.php
   svn ci -m "merged changes from Monobook to MyOwnSkin"

...
  From reviewing that, it seems to still handle merging as a diff/patch
 operation.  It doesn't copy over the commits, it only copies over the
 changes.  Therefore you inevitably get giant merge commits that
 clutter up commit histories and/or are impossible to actually review.
 Correct? 
Not really. To explain, I'd have to dig too much into the inner workings
of both Git and Svn. But in short, they both boil down to very similar
procedures. The difference is in how the information about the merge is
stored and interpreted. Also, since Svn was born centralized, the
information registered and immediately available in svn log doesn't
match what a Git user would expect. It is possible (but still
cumbersome) to check the diffs in svn:mergeinfo properties and extract
information very similar to what Git stores.

Also, every commit in Git has a unique SHA1 identifier, which is
visible under the same name in different branches. In Svn, every commit
is referenced by where it occurred first (the branch where it happens
with the lowest revision ID), then this information (branch path +
revision ID) is copied into other branches' svn:mergeinfo properties to
indicate that that commit was transplanted.
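
To make the "branch path + revision id" idea concrete, here is a rough sketch that splits an svn:mergeinfo value into one "source path, revision range" pair per line. The property value is made up for illustration (on a real working copy it would come from `svn propget svn:mergeinfo PATH`), and the helper name `list_merged` is my own, not part of Subversion. It also ignores the `*` suffix that marks non-inheritable ranges.

```shell
#!/bin/sh
# Sketch: split an svn:mergeinfo value into "source-path revision-range" pairs.
# The sample value below is hypothetical, not taken from MediaWiki's repository.
mergeinfo='/trunk/skins/Monobook.php:120-135,140
/branches/experimental/skins/Monobook.php:150'

# list_merged is a hypothetical helper name used only for this example.
list_merged() {
    # Each line has the form "<source path>:<comma-separated revisions or ranges>".
    printf '%s\n' "$1" | while IFS= read -r line; do
        path=${line%:*}      # everything before the last colon
        revs=${line##*:}     # everything after the last colon
        for r in $(printf '%s' "$revs" | tr ',' ' '); do
            printf '%s %s\n' "$path" "$r"
        done
    done
}

pairs=$(list_merged "$mergeinfo")
printf '%s\n' "$pairs"
```

Each output line names the path where the revisions were first committed and the revisions already merged from it, which is the bookkeeping described above.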

Note also that Svn merge tracking is more complex (from an algorithmic
point of view) than Git's and Mercurial's, since Svn supports detachable
directories and partial checkouts. This same extra complexity is what
enables the merge tracking of the inner branches I mentioned above.

...
  FWIW, here's a very detailed account of how to use git svn and
 arguments on why it's better than SVN alone or SVK, from someone who
 switched from SVK to git: "I used to push strongly for SVK, but got
 brow-beaten by people who were getting far more out of their version
 control system than I knew possible until I saw what they were talking
 about." http://utsl.gen.nz/talks/git-svn/intro.html 
I'll take a look at it later. I agree that using a DVCS with fewer
restrictions than SVK is better. But if you know how to use SVK
correctly, it plays nicer with the upstream Svn repository than the
git-svn bridge. That is from my own experience.

I'm suggesting SVK just because AFAIK MediaWiki is not yet planning to
move to distributed version control. When that day comes, then surely
there are much better tools than SVK.

...
  git not caring about directories is not a big deal (we have rather few
 empty directories in trunk, and most of those could readily be dropped
 or have an empty index.html or something dumped in them). 
Directory versioning isn't just about empty directories. Directory
versioning also plays a big role for move/rename history tracking, merge
tracking and version control metadata (Svn properties).

Simple example: you rename a directory with a thousand versioned files
inside. You don't make any changes to the files, just the directory name.

In Svn, it is recorded as a single directory move operation, all files
register *no* change, since they actually weren't touched.

In Git, it is recorded as a thousand files removed, plus a thousand
files added, which is a *wrong* description of what happened. Also,
since Git doesn't record rename history upon commit (that happens only
once), this information has to be figured out later (that may happen
many times) using an exponential-complexity algorithm. It doesn't make
any sense when you think about it.
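
A small sketch of what this looks like in practice, assuming git is installed. Three files stand in for the thousand, and the directory names and commit identity are made up for the example. The same commit is shown twice: once as stored (deletes plus adds) and once with rename detection turned on by -M.

```shell
#!/bin/sh
# Sketch: a directory rename as recorded by Git vs. as reported with -M.
set -e
work=$(mktemp -d)
cd "$work"
git init -q .

# Three small versioned files stand in for the "thousand files".
mkdir olddir
for n in 1 2 3; do
    echo "content of file $n" > "olddir/file$n.txt"
done
git add olddir
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "add olddir"

# Rename the directory only; the file contents are untouched.
git mv olddir newdir
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "rename olddir to newdir"

# Rename detection off: every file shows up as deleted (D) and added (A).
plain=$(git diff --no-renames --name-status HEAD~1 HEAD)
# Rename detection on: the pairs are matched up again after the fact (R).
detected=$(git diff -M --name-status HEAD~1 HEAD)

printf 'without -M:\n%s\n\nwith -M:\n%s\n' "$plain" "$detected"
```

The rename entries in the second listing are computed at diff time by comparing file contents; nothing in the commit itself says that a directory was moved.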

...
  Copy/move info might be a pain, although of course that would be
 solved by having more people use git instead of SVN.  :D 
But this problem of copy/move information being ignored/discarded has
already caused so much pain to Git that I still wonder why they insist
on calling it a feature. It demands that the user pass the -M, -C and
--find-copies-harder options to find and use the correct information,
causing an exponential-complexity algorithm to be performed many times
over the same information that could simply be stored in the commit
manifest. It sounds completely illogical.

...
  You evidently dislike git a lot. 
Sure. For many reasons (technical and ethical reasons). Kind of
off-topic in this thread and on this list.

...
  On the other hand, it has the clear disadvantage of being less widely
 used and supported, I suspect, by this point (maybe I'm wrong there?). 
No, you are right. SVK is almost always forgotten. That means a smaller
community, less documentation, less support, slower development, etc.

...
  And certainly I don't know of any current developers who use it, 
I use it, with MediaWiki's repository. With it, I can sync between MW's
repo, my local mirror, and my own public MW repo, something that would
be extremely difficult with git-svn.

...
  There are some things I like a lot about git that SVK evidently
 doesn't support, like easily rewriting history for private commits. 
Rewriting history is something that SVK doesn't support by design. To
rewrite history means to lose information, and that is considered wrong
(even if you think that you won't need the overwritten history anymore).

That doesn't mean you can't *recreate* a new history as you like. You
rebranch from the point you want to recreate, make whatever changes you
want up to the head of the former branch, then delete the former branch.
You get the same effect, but information is never lost forever. If you
find that you screwed something up while trying to recreate the history,
the old history is still there to aid you.

...
  Other notable criticisms gleaned from the page I mentioned above (SVK
 user who switched to git):

 "SVK claims on its home page to be distributed, but by everyone else's
 definition, it's not, because it's not decentralised - there's always
 an upstream. No, SVK merely offers disconnected operation. If I meet
 you in the middle of a cruise and we both have a mirror of a
 subversion repository, I just can't easily, natively share my local
 branch with you if we're both on SVK. " 
Not exactly true. It is true that SVK is optimized to work with upstream
repositories, but it is not true that you are unable to share local
branches without the *main* upstream.

You can have *multiple upstream repositories* in SVK. That may blow your
mind if you think that git-svn is the best bridge between your local
repository and the upstream Subversion repo.

SVK doesn't share branches directly with other SVK installations, but
nothing stops you from setting up a temporary repository between you and
your friend on this cruise, replicating your local branch to this
temporary repository, and giving him access to it. It is just a
different way of doing the developer-to-developer push that other DVCSes
support.

It may be even simpler if you give him access directly to your internal
SVK depot, but you usually don't want him to have direct access to your
depot.

You have to know the tool you use. The problem is that very few people
know how to use SVK, very few put into it the same effort they put into
learning Git, and many take the easy path of bad-mouthing it instead.

...
  "git normally stores its repository information under .git at the top
 level of your checkout. But everything's compressed and the filenames
 don't resemble the files in your checkout so grep -r and find etc
 don't hate you." 
SVK doesn't even create *any* bookkeeping directory, unless you order it
to do so. SVK checkouts are *completely* clean by default. So this is
even less of an issue for SVK.

...
  "I don't know about you but I was always running into situations where
 my ~/.svk/config didn't match reality, and there were no breadcrumbs
 left in the checkout to do anything with it. I much prefer these
 floating repositories and there was some talk of adding them to SVK. " 
That is because he didn't read the documentation. It was a design
decision to keep checkouts completely clean and have bookkeeping
information stored in a different private area. SVK provides commands to
move checkouts, and to relocate checkout paths if you move a working
copy without using SVK.

And now (since 2.0 I think) SVK already supports floating working
copies, if you want.

Best regards,
Juliano.

-- 
Juliano F. Ravasi ·· http://juliano.info/
5105 46CC B2B7 F0CD 5F47 E740 72CA 54F4 DF37 9E96

"A candle loses nothing by lighting another candle." -- Erin Majors

* NOTE: Don't try to reach me through this address, use "contact@" instead.
