Talk:Practical OSS Exploration - Getting the Code

From Teaching Open Source


IRC explanation of VCS to novices

An explanation of version control given over IRC to non-CS majors at Allegheny College who were completely new to the concept.

00:45 < morinoa> what is git?
00:45 < ke4qqq> it's a version control system - like CVS or subversion, except distributed and lightweight
00:46 -!- rbergeron [1002@terrapin.fischetti.com] has joined #allegheny
00:47 < ke4qqq> and fast
00:47  * rbergeron waves
00:47  * ke4qqq realizes that was a horrendous explanation
00:47  * ke4qqq waves back
00:47 < morinoa> a version control system? sorry, I'
00:48 < morinoa> I'm so unfamiliar
00:48 < ke4qqq> rbergeron: can you help here.... I suck at explaining dvcs
00:49 < morinoa> does it really matter? 
00:49 < morinoa> unless it is vital, please don't worry that I don't understand
00:49 < ke4qqq> morinoa: it will to you
00:49 < morinoa> ok, that's fair enough
00:50 < ke4qqq> and that's really what makes this soooooo ugly - not only do you have to grok html, but you have to grok git, and a few 
                other things
00:50 < rbergeron> ke4qqq: did you make sure everyone groks grok yet :)
00:50 < morinoa> grok? 
00:50 < ke4qqq> lol
00:51 < ke4qqq> it's a heinleinism
00:51 < ke4qqq> which is great reading incidentally
00:51 < ke4qqq> To grok is to share the same semiosphere or line of thinking with another physical or conceptual entity
00:51 < ke4qqq> but really you can just say that you understand something when you grok it
00:51 < rbergeron> or for those of us who speak english
00:51 < rbergeron> you understand
00:51 < rbergeron> :)
00:51 < morinoa> ok...
00:52 < morinoa> I grok
00:52 < ke4qqq> lol
00:52 < rbergeron> let me see if i can explain revision control
00:52 < morinoa> ok lol I understand haha
00:52 < rbergeron> before we talk about the "distributed" portion
00:52 < rbergeron> go to this page
00:52 < rbergeron> http://en.wikipedia.org/wiki/Revision_control
00:52 -!- mether [~Rahul@115.242.39.226] has joined #allegheny
00:53 < rbergeron> see how there are tabs at the top of the page - article, discussion, edit this page, history?
00:53 < rbergeron> click on the history tab
00:53 < rbergeron> this is probably the most basic form of version control
00:53 < rbergeron> every time someone makes an edit to a wikipedia page
00:53 < rbergeron> it's logged here in the history page
00:53  * ke4qqq never thought of using wiki as an example
00:54 < rbergeron> there are multiple versions - a new version, essentially, every time someone edits the page.
00:54 < rbergeron> they do this in wikis for multiple reasons - you might haphazardly enter / delete something incorrectly and save it
00:54 < rbergeron> or - more commonly - someone will come in and delete a page for malicious purposes, or to add viagra spam, etc.
00:55 < rbergeron> you can click on an older version and revert back to it.
00:55 < rbergeron> or - alternately - compare multiple versions side by side to see what is different about them.
00:55 < rbergeron> do you see what i'm talking about?
00:56 < rbergeron> this is no different from how source code works - except source code doesn't necessarily have a pretty interface.
00:56 < ke4qqq> better yet does it make sense
00:57 < ke4qqq> Robyn: the command line IS pretty :) 
00:57 < rbergeron> with source code - rather than looking at it, you check out pieces from a centralized place
00:58 < morinoa> this is good! i totally understand
00:58 < rbergeron> and THEN look at it - and you can add to it, etc.
00:58 < rbergeron> and then check it back in - marking / notating it with what you did, etc.
00:58 < rbergeron> every checkin creates a new version number
00:58 < morinoa> aha...handy
00:58  * ke4qqq knew he invited rbergeron for a reason
00:58 < rbergeron> which is great for if multiple people are working on things at the same time 
00:59 < rbergeron> because they may see - ohh, i checked out version 775 - but now i want to check in, and it's at version 778 - i should 
                   make sure nobody else has been doing stuff that i might have hacked on as well
00:59 < morinoa> definitely
01:00 < rbergeron> which is an improvement over wiki, i might add - you can definitely have collisions sometimes with multiple people in a 
                   wiki
01:00 < rbergeron> :)
01:01 < rbergeron> so for code, it's great - also, in a DVCS or VCS - you can set things up so that only certain people have permissions to 
                   check code in or out
01:01 < rbergeron> actually, i should just say in"
01:01 < ke4qqq> yeah - no blocking if it's 'out'
01:01 < ke4qqq> though I think some old vcss did that
01:01 < rbergeron> you want people to be able to check out code to look at it - you don't want everyone to necessarily have permissions to 
                   "fix it" if they want to
01:02 < rbergeron> usually in those situations - people who want to add more code might post a patch to a mailing list - showing what 
                   they've added / changed
01:02 < rbergeron> until they've gained a level of trust with whoever the maintainer/owner of the code is
01:02 < rbergeron> "we know your'e not going to screw things up - you keep doing good work"
01:02 < rbergeron> that kind of thing
01:02 < rbergeron> the other thingis - with code - you'll hear people refer to things like trees and branches
01:03 < morinoa> ok...
01:03 < morinoa> obviously these are like "grok." a different language 
01:03  * rbergeron wonders where her bf is since he could really explain this better than i can
01:03 < rbergeron> yeah
01:03 < rbergeron> so - let's say you're a developer
01:03 < rbergeron> a lot of times you'll be working on multiple versions of the same code
01:03 < rbergeron> ie:
01:04 < rbergeron> you have a version that is already out - sometimes you have urgent fixes that need to happen
01:04 < rbergeron> security issues
01:04 < rbergeron> that kind of thing
01:04 < rbergeron> you might also be working on "the next version" - more hardcore development
01:04 < rbergeron> things aren't as stable
01:04 < rbergeron> but in the end - they are all going to be more or less the same product, at some point
01:04 < morinoa> ok, so I understand this revision idea, now what do I do with it?
01:04 < rbergeron> so in a DVCS - you might see people refer to things as a development branch
01:05 < rbergeron> or a stable branch
01:05 < rbergeron> ke4qqq: yeah, what's up with me explaining this :)
01:05 < ke4qqq> you splain it well
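
For readers who do want to see the check-out / check-in cycle from the log in concrete form, including the "I checked out version 775, but now it's at version 778" situation, here is a minimal sketch in Python. It is a toy model for illustration only, not how git, CVS, or Subversion work internally. The class `ToyRepository`, its methods, and the contributors Alice and Bob are all hypothetical names invented for this example.

```python
class ToyRepository:
    """A toy, in-memory model of a centralized version control system."""

    def __init__(self, initial_text=""):
        # Each check-in appends (message, text); the revision number is the list index.
        self._history = [("initial version", initial_text)]

    @property
    def latest_revision(self):
        return len(self._history) - 1

    def check_out(self, revision=None):
        """Return (revision number, text) for the requested or latest revision."""
        if revision is None:
            revision = self.latest_revision
        _message, text = self._history[revision]
        return revision, text

    def check_in(self, base_revision, new_text, message):
        """Record new_text as a new revision, unless the base is out of date."""
        if base_revision != self.latest_revision:
            # Someone else checked in after we checked out: stop and look first.
            raise RuntimeError(
                f"checked out revision {base_revision}, but the repository is now "
                f"at revision {self.latest_revision}; review the newer changes first"
            )
        self._history.append((message, new_text))
        return self.latest_revision


repo = ToyRepository("Hello")

# Two (hypothetical) contributors check out the same revision.
alice_base, alice_text = repo.check_out()
bob_base, bob_text = repo.check_out()

# Alice checks in first, which creates a new revision number.
repo.check_in(alice_base, alice_text + ", world", "Alice: extend greeting")

# Bob's copy is now out of date, so his check-in is refused.
try:
    repo.check_in(bob_base, bob_text + "!", "Bob: add punctuation")
    bob_was_stopped = False
except RuntimeError:
    bob_was_stopped = True
```

Older revisions stay available, so `repo.check_out(0)` still returns the original text. That is the same "revert to an older version" idea as the wiki history tab. Real systems usually help merge the two sets of changes rather than simply refusing the second check-in.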

Analysis of VCS explanation

(Written by jadudm April 14, 2010)

This conversation is incredibly problematic on a number of levels. This is a critical analysis, but it is not intended to be personal. I suspect this conversation has been played out in many channels many times, but I claim it was (in this case) exactly the wrong conversation.

The context is that 40 students from Allegheny College -- first-year students enrolled in a "Freshman Seminar," primarily about writing, speaking, and communication -- were thrown into the Fedora project leading up to Release 13. Specifically, they engaged the Marketing and Design teams. Mel Chua was present for two weeks with the students, but we had not fully brought them up to speed in advance regarding IRC, mailing lists, and the like. All said, we had to *try* something, and I'm glad we did.

That said, reflection is where we learn. This conversation is a powerful reflection tool, and I think it nicely illustrates a problem with open source communities: the "firehose." FLOSS contributors are so deeply expert in their tools and technologies -- or, perhaps, expert at functioning with partial knowledge -- that they forget that many people do not operate this way on a daily basis. This conversation, many times over, illustrates how two community members completely failed to address the actual needs of the contributor, and instead patted themselves on the back for spilling large amounts of information regarding esoteric technologies that had nothing to do with what the contributor was trying to achieve.

When interfacing with students, it is important to remember that students live in highly constrained spaces: they have classes, and homework, and piano lessons, and social groups and student clubs, and always the pressure that their education costs money and they want to do well. When a member of the faculty brings students into the community, they are taking a big risk: the community is now part of the students' educational experience. If you treat them like you treat every other contributor ("well, if they stay, great, and if not, we can't do anything about it"), then *faculty will not work with you*. You must step back from the now, and ask more diagnostic questions: what is it you're trying to achieve? Why are you asking that question now? Is that really what you want to be working on and why?

If you can't ask these questions, then you should be using your faculty support. Push students back to the faculty when they seem to be in deeper than is productive. "Productively lost" is fine, but we only have 14 weeks in a semester. We must demonstrate learning outcomes. We are evaluated on what we do. When you have 6-month release cycles, and live in a world where things that don't make Release 13 can be in Release 14 (or 15, or ...), you can afford to be confused for three weeks. In my world, that represents a colossal waste of time and resources: three of 14 weeks, or roughly 20% of the semester. We must compress "productively lost" into something that lasts hours and days, not weeks, if students from outside the computer-geek culture are to contribute meaningfully.

What follows is a brief analysis of the IRC conversation above. My comments are in forestgreen. You may feel differently, and if so, feel free to respond in a different color (or, further below).

Ad-hoc conversation analysis

00:45 < morinoa> what is git?

This should be a big red flag. A student in their first year in college, taking part in a marketing team project, is asking about git. git is barely understood by third-year CS majors. This question should be an indicator that the student is engaged in something that is not productive. Certainly, diagnosing this question should be the first step.

"Why do you want to know about git?"

00:45 < ke4qqq> it's a version control system - like CVS or subversion, except distributed and lightweight

This answer presupposes knowledge of several other technologies that are not commonly taught in the first two years of a CS degree. Again, a non-major in their second semester will not be helped by this reply. Context matters.

00:46 -!- rbergeron [1002@terrapin.fischetti.com] has joined #allegheny
00:47 < ke4qqq> and fast
00:47  * rbergeron waves
00:47  * ke4qqq realizes that was a horrendous explanation
00:47  * ke4qqq waves back
00:47 < morinoa> a version control system? sorry, I'
00:48 < morinoa> I'm so unfamiliar

The community member (ke4qqq) realizes that their explanation will mean nothing to the student, which is good. But now the student is engaged, and follows up further. Not knowing which knowledge is on the critical path and which is not, they must pursue the only line of questioning they have. A novice does not have, and cannot be expected to have, the ability to filter what is useful from what is not. That is an expert ability.

00:48 < ke4qqq> rbergeron: can you help here.... I suck at explaining dvcs
00:49 < morinoa> does it really matter? 
00:49 < morinoa> unless it is vital, please don't worry that I don't understand
00:49 < ke4qqq> morinoa: it will to you
00:49 < morinoa> ok, that's fair enough

That said, the student still pushes back: "does it really matter?" The community member says "it will to you," which is false. The contributor is trying to write profile information for a spin. They need help writing copy, not committing to a repository. It might matter to them if they become permanent fixtures of the team, but at this point, they're a willing contributor who is capable of doing interviews and writing text. It is absolutely true that this student could learn these tools, and I am not challenging that notion, but the idea that they would need git in the two weeks leading up to release is, quite simply, silly.

It is up to the community members to stop this conversation now, but they don't. Instead, we proceed...

00:50 < ke4qqq> and that's really what makes this soooooo ugly - not only do you have to grok html, but you have to grok git, and a few 
                other things
00:50 < rbergeron> ke4qqq: did you make sure everyone groks grok yet :)
00:50 < morinoa> grok? 
00:50 < ke4qqq> lol
00:51 < ke4qqq> it's a heinleinism
00:51 < ke4qqq> which is great reading incidentally
00:51 < ke4qqq> To grok is to share the same semiosphere or line of thinking with another physical or conceptual entity
00:51 < ke4qqq> but really you can just say that you understand something when you grok it
00:51 < rbergeron> or for those of us who speak english
00:51 < rbergeron> you understand
00:51 < rbergeron> :)
00:51 < morinoa> ok...
00:52 < morinoa> I grok
00:52 < ke4qqq> lol

The student's comment around this point was "This is all alphabet soup!" Hearing it is a side effect of being in the same room.

"Geek culture" is deeply ingrained in open communities, and is a form of discrimination. If you don't know the l33t sp33k, you aren't hip, jive turkey. *cough* First, the conversation was too technical; then it was too cultural. This isn't "cool," it isn't "hip," it's a way to drive people away from a community.

00:52 < rbergeron> let me see if i can explain revision control
00:52 < morinoa> ok lol I understand haha
00:52 < rbergeron> before we talk about the "distributed" portion
00:52 < rbergeron> go to this page
00:52 < rbergeron> http://en.wikipedia.org/wiki/Revision_control
00:52 -!- mether [~Rahul@115.242.39.226] has joined #allegheny
00:53 < rbergeron> see how there are tabs at the top of the page - article, discussion, edit this page, history?
00:53 < rbergeron> click on the history tab
00:53 < rbergeron> this is probably the most basic form of version control
00:53 < rbergeron> every time someone makes an edit to a wikipedia page
00:53 < rbergeron> it's logged here in the history page
00:53  * ke4qqq never thought of using wiki as an example
00:54 < rbergeron> there are multiple versions - a new version, essentially, every time someone edits the page.
00:54 < rbergeron> they do this in wikis for multiple reasons - you might haphazardly enter / delete something incorrectly and save it
00:54 < rbergeron> or - more commonly - someone will come in and delete a page for malicious purposes, or to add viagra spam, etc.
00:55 < rbergeron> you can click on an older version and revert back to it.
00:55 < rbergeron> or - alternately - compare multiple versions side by side to see what is different about them.
00:55 < rbergeron> do you see what i'm talking about?
00:56 < rbergeron> this is no different from how source code works - except source code doesn't necessarily have a pretty interface.
00:56 < ke4qqq> better yet does it make sense
00:57 < ke4qqq> Robyn: the command line IS pretty :) 
00:57 < rbergeron> with source code - rather than looking at it, you check out pieces from a centralized place
00:58 < morinoa> this is good! i totally understand

This is a decent interaction. A concrete example was presented, and the newcomer was able to understand it. However, we're now going to jump to a domain that the nascent contributor has no context for: code. "This is no different from how source code works...", while possibly understandable in the abstract, is still not "concrete" in terms of the contributor's actual goal.

00:58 < rbergeron> and THEN look at it - and you can add to it, etc.
00:58 < rbergeron> and then check it back in - marking / notating it with what you did, etc.
00:58 < rbergeron> every checkin creates a new version number
00:58 < morinoa> aha...handy
00:58  * ke4qqq knew he invited rbergeron for a reason
00:58 < rbergeron> which is great for if multiple people are working on things at the same time 
00:59 < rbergeron> because they may see - ohh, i checked out version 775 - but now i want to check in, and it's at version 778 - i should 
                   make sure nobody else has been doing stuff that i might have hacked on as well
00:59 < morinoa> definitely
01:00 < rbergeron> which is an improvement over wiki, i might add - you can definitely have collisions sometimes with multiple people in a 
                   wiki
01:00 < rbergeron> :)
01:01 < rbergeron> so for code, it's great - also, in a DVCS or VCS - you can set things up so that only certain people have permissions to 
                   check code in or out
01:01 < rbergeron> actually, i should just say in"
01:01 < ke4qqq> yeah - no blocking if it's 'out'
01:01 < ke4qqq> though I think some old vcss did that
01:01 < rbergeron> you want people to be able to check out code to look at it - you don't want everyone to necessarily have permissions to 
                   "fix it" if they want to
01:02 < rbergeron> usually in those situations - people who want to add more code might post a patch to a mailing list - showing what 
                   they've added / changed
01:02 < rbergeron> until they've gained a level of trust with whoever the maintainer/owner of the code is
01:02 < rbergeron> "we know your'e not going to screw things up - you keep doing good work"
01:02 < rbergeron> that kind of thing
01:02 < rbergeron> the other thingis - with code - you'll hear people refer to things like trees and branches
01:03 < morinoa> ok...
01:03 < morinoa> obviously these are like "grok." a different language 

Here, the newcomer is once again signaling that they don't actually get the idea *that* deeply. The idea that you might have a "magic undo" is one thing -- but the transition to code (for which they have zero context) is more challenging.

01:03  * rbergeron wonders where her bf is since he could really explain this better than i can
01:03 < rbergeron> yeah
01:03 < rbergeron> so - let's say you're a developer

My thought? "No, let's not."

01:03 < rbergeron> a lot of times you'll be working on multiple versions of the same code
01:03 < rbergeron> ie:
01:04 < rbergeron> you have a version that is already out - sometimes you have urgent fixes that need to happen
01:04 < rbergeron> security issues
01:04 < rbergeron> that kind of thing
01:04 < rbergeron> you might also be working on "the next version" - more hardcore development
01:04 < rbergeron> things aren't as stable
01:04 < rbergeron> but in the end - they are all going to be more or less the same product, at some point
01:04 < morinoa> ok, so I understand this revision idea, now what do I do with it?

And again, the newcomer comes back and says "Sure, revision, I got it... but so what?"

This brings us all the way back to the original point of this critique, which is that the community needs to work with its newcomers at their level, not at the community's level, and be prepared to bring them in where they are -- without dumping them in the deep end of the pool.

01:04 < rbergeron> so in a DVCS - you might see people refer to things as a development branch
01:05 < rbergeron> or a stable branch
01:05 < rbergeron> ke4qqq: yeah, what's up with me explaining this :)
01:05 < ke4qqq> you splain it well

In conclusion, "productively lost" is a wonderful thing... but I suspect that more than 95% of potential contributors are lost simply because the community has no way of scaffolding their entry. While we might like Lave and Wenger's notion of legitimate peripheral participation, we need to work together to seriously improve the mechanisms by which newcomers (especially people without a background in a technical field involving computing) are brought into open projects.

This is a dialog that I am prepared to have, and I can contribute 20 students next fall. However, the conversation needs to take place during the summer, and I want one, two, or possibly (at the most!) three small communities within a larger project who want to improve their entry paths in meaningful ways. This is something we can task students with, and we can begin developing a model for how to do this repeatably, reliably, and without burdening the community (or the faculty involved) unnecessarily.