Archive for the 'code' category

Open Source for Business

Posted on June 8, 2009

I didn’t get this posted yesterday because the Internet crapped out in our area. Nothing but excuses, I know.

I’ve been working beyond the bleeding edge, using a version of the Nutch code that hasn’t even made it into the project’s Apache SVN yet. To celebrate the fact that my contributions will make it in, I figure it’s a good time to get into open source and business.

To put it briefly, and as you can probably guess, I’m pro open source. I use it extensively and I push back as much as I can. When it comes to most of the code I write there really isn’t much commercial benefit in keeping it hidden, so it just makes sense to give back.

There are two types of business on the web: those where the software service itself is the product, and those where you provide access to data. It’s pretty easy to tell which camp you’re in.

37 Signals, for example, are in the first camp, and their software probably isn’t something they want to let people download, unless they’re incredibly brave. Doing that would mean competing on margins for the cheapest hosting; users would flock to the cheaper services, have a poor experience and blame the software.

Sproozi, on the other hand, is the second type: our data is what users mostly care about and we’re not planning to be precious about our code. I’ve already been pushing some of the changes I’ve made to Nutch back into the project and we’re planning open source projects of our own in the coming months.

One of our plans is to build iPhone, Android and other phone-based applications for our service. We’re planning to write them (or have them written for us) and release ‘official’ versions, then release that code as open source projects to give developers a framework so that they can build great things from it and on our API.

If there are any experienced iPhone, Android, Blackberry, Symbian or Pre developers out there who want to get involved, drop me a line. We’re a ways off yet but would love to chat about it and get some very early feedback.


Javascript mapping abstraction

Posted on June 5, 2009

One thing that occurred to me while writing Sproozi is how tied I am to one mapping implementation. I’ve been thinking of starting a project to abstract the details away so that I could swap implementations whenever I liked.

There were a couple of things I wanted out of the API. The most important was to be able to switch from one implementation to another without losing my markers or having to reload them. I also really like that foot paths show up on the OSM maps.
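To make the "swap without losing markers" idea concrete, here is a rough sketch. It is in Python rather than JavaScript purely for illustration, and every name in it (Marker, MapProvider, RecordingProvider, AbstractMap) is made up for the example; none of it is mapstraction's API. The point is only that the abstraction layer owns the markers and replays them onto whichever backend is plugged in.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Marker:
    """A provider-independent marker: just a position and a label."""
    lat: float
    lon: float
    label: str


class MapProvider(Protocol):
    """What each (hypothetical) backend adapter has to implement."""
    name: str

    def draw_marker(self, marker: Marker) -> None: ...

    def clear(self) -> None: ...


class RecordingProvider:
    """Stand-in backend that only records what it was asked to draw."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.drawn: list[Marker] = []

    def draw_marker(self, marker: Marker) -> None:
        self.drawn.append(marker)

    def clear(self) -> None:
        self.drawn.clear()


class AbstractMap:
    """Owns the markers so they survive a change of provider."""

    def __init__(self, provider: MapProvider) -> None:
        self._provider = provider
        self._markers: list[Marker] = []

    def add_marker(self, marker: Marker) -> None:
        self._markers.append(marker)
        self._provider.draw_marker(marker)

    def swap_provider(self, provider: MapProvider) -> None:
        # Clear the old backend, then replay every stored marker on the new one.
        self._provider.clear()
        self._provider = provider
        for marker in self._markers:
            provider.draw_marker(marker)
```

A real adapter would wrap the provider's own calls inside draw_marker and clear; the calling code would only ever talk to AbstractMap.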

Then I came across mapstraction – which looks very good. I’m checking it out to see if it meets my needs. I’ll let you know.

Whoops, for some reason this didn’t get posted yesterday. Must try harder.

Welcome to the cloud

Posted on June 3, 2009

The great thing about the cloud is the extremely low barrier to entry. It’s very cheap to get up and running, it’s very cheap to scale and it’s very cheap to store data. I’m still not fully convinced that from a long term perspective with thousands of nodes it’s going to be cheaper than provisioning your own servers and hosting but I’m more than happy to be convinced – saving money one way or another can’t be a bad thing.

Long term though, one thing businesses really need to be wary of is being tied too tightly to a platform. There is of course the issue of being tied to an API and not being able to choose a new provider, but this isn’t really what concerns me – we use Debian AMIs on EC2 with a full open source stack top to bottom. I’m talking about the issue of economic lock-in. The massive scale and masses of data the cloud allows users to store could quickly lead a company to a very expensive decision when it chooses to move providers or is forced into it.

Moving data in or out costs about $0.10/GB (a nice easy number to work with). Let’s pretend it’s about the same for all other providers, so shifting your data from one to another is going to cost you $0.20/GB. That could quickly add up to a massive cost just to choose a new provider. 1TB will cost you over $200; 1 petabyte, which isn’t going to be an unheard of amount of data in the next few years, is going to cost over a whopping $200,000! Just to move to a new provider. That’s some kind of lock-in, probably a lot more than the cost of any new or changed APIs. Not to mention how long it’s going to take to transfer that amount of data and the fact that you’ve already paid $100,000 to get it there!
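For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The $0.10/GB rate is the round figure from this post, not a quoted price, and I’m assuming binary units (1TB = 1024GB), which is why the totals land slightly over the round numbers.

```python
GB_PER_TB = 1024
GB_PER_PB = 1024 * 1024


def transfer_cost(gigabytes, rate_per_gb=0.10):
    """Cost in dollars of moving data in one direction."""
    return gigabytes * rate_per_gb


def migration_cost(gigabytes, out_rate=0.10, in_rate=0.10):
    """Cost of pulling data out of one provider and pushing it into another."""
    return transfer_cost(gigabytes, out_rate) + transfer_cost(gigabytes, in_rate)


if __name__ == "__main__":
    for label, size in (("1TB", GB_PER_TB), ("1PB", GB_PER_PB)):
        print(f"{label}: ${migration_cost(size):,.2f} to migrate, "
              f"${transfer_cost(size):,.2f} already spent getting it in")
```

Under those assumptions a terabyte comes to $204.80 to migrate and a petabyte to $209,715.20, on top of the $104,857.60 it cost to upload the petabyte in the first place.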

Before anyone starts, yes I’m aware you can send Amazon a big storage device and they’ll put all your data on that and send it back to you. Then you could probably send that data to your new provider and they’d put it in the cloud for you. I won’t get into how good a solution to the problem this is, because I haven’t really thought it through, nor do I have any idea what that sort of storage would cost and what sort of redundancy you’d want it to have to safely truck the thing around.

What I’m trying to get across is that an open stack you control is probably something you really want to own. And it’s probably something you want to deploy across more than one provider for redundancy and your own peace of mind. Just imagine if your cloud provider launches a competitive service, shuts down or for whatever reason decides not to service your account anymore.

There are mitigation strategies you can apply to this. With sproozi for example we’re holding a lot of data on the nodes so that we can work with it. A lot of this can be rebuilt and reacquired, so we don’t need to truck all our data about. Saving the list of places submitted by users and that we’ve discovered is more than enough to re-crawl everything and rebuild all the indexes. This is just one example though and we’re not saving things like images and other data critical for users, so we’re likely the exception here not the rule.

We run only on EC2 at the moment, but when we actually start getting more data we’re going to spread it out across a few providers – just in case.

I see the point of git.

Posted on January 1, 2009

I’ve made some progress over the last few days towards getting the crawler online, and came across a scenario where git would have been useful. I had one of those epiphany moments when distributed development clicked and made sense to me.

The first run of our crawler is based on a pretty vanilla Nutch install with some plugins; it works already and saves us an awful lot of legwork. The first thing I wanted to do was quickly mavenise it so we could better integrate it into our development environment. Problem is, if I pull the trunk I can’t commit to and update from both our own internal repository and the Apache one.

I intend at some point, once it’s working and I’m happy with it, to submit a patch back to the ASF, but in the short term I just want to share my fork with other developers on my team. This is where git would come in. Next I was hoping to start moving the configuration over to the Spring Framework IoC container, to see if and how it works; again I’d submit the patches if the community was interested.

When I get a bit of time in the next few weeks I think I’ll install a central git repository for our dev environment and trial a transition there. Has anyone tried something similar? Is git what I’m looking for?

 


Testing for search result quality

Posted on November 3, 2008

I’ve been faced with a bit of a problem lately: how do you test the quality of results being produced from a changing dataset? It’s easy to write unit tests for components of an application, and to make sure they’re producing the expected results, but how do you test end to end?

Andraz from Zemanta asked the same question last year: how do you test a complex system that is trying to mimic being smart?

when you have new content in the system, you get completely new related stories and you have to go back and have a human judge them. There is expansion of the evaluation data – as you add new tests you generally can’t send them through previous versions of your algorithms, since that would be prohibitely expansive. And there is statistics that hardly gives you overview over what exactly your changes caused, just few final numbers. And then there is the problem of pipelining the processing. Even if you improve the first stage, end results might be worse, since you’ve already adapted the second stage to previous first one. So you need to actually evaluate each part of the system in isolation and then together.

At the end you actually find out that you spend disproportional amount of time evaluating even the smallest changes. So you are in danger to just skip that evaluation which naturally you shouldn’t.

The fundamental problem you run up against is that the index is constantly changing, and it’s meant to change. So it’s hard to automatically test the output without a clear idea of what is going in. It’s also difficult to get an accurate picture of how small changes in code affect the general results if you’re just using a testing index with a small dataset.

One way to go about it is to gauge result quality by measuring user interaction. Basically there are things users do when they get the results they’re expecting and things they do when they haven’t found what they were looking for. So if you can measure how they’re reacting, you can get an idea of quality.
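As a rough sketch of what "measuring how they’re reacting" might look like, here is a small Python example. The signals are my assumptions rather than anything we’ve built: a click counts as a good sign, an immediate re-search (a reformulation) as a bad one, and the SearchEvent record and quality_summary function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchEvent:
    """One search as seen in a (hypothetical) interaction log."""
    query: str
    clicked_rank: Optional[int]  # 1-based rank of the clicked result, None if no click
    reformulated: bool           # True if the user immediately searched again


def quality_summary(events):
    """Summarise a batch of searches into a few rough quality signals."""
    total = len(events)
    if total == 0:
        return {"click_through_rate": 0.0,
                "reformulation_rate": 0.0,
                "mean_reciprocal_rank": 0.0}
    clicks = [e.clicked_rank for e in events if e.clicked_rank is not None]
    return {
        # Share of searches where the user clicked anything at all.
        "click_through_rate": len(clicks) / total,
        # Share of searches the user gave up on and rephrased.
        "reformulation_rate": sum(e.reformulated for e in events) / total,
        # Rewards clicks near the top; searches with no click contribute zero.
        "mean_reciprocal_rank": sum(1 / rank for rank in clicks) / total,
    }
```

Tracked per release, numbers like these won’t say why results got better or worse, but they should show which direction things moved without anyone having to judge each result by hand.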

At the moment I’m a lone developer putting all the data in the index, and I have a good idea of what I should be seeing come out if it’s actually working. In the next while though we’re going to be rolling out to a few more internal beta users as we get a prototype system developed, and we’re not going to be in control of the inputs or outputs anymore. So soon enough we’re going to be faced with trying to measure the quality of the results we’re giving users in a dynamic system – expect to hear much more about this as we go on.


I have no idea what I’m doing

Posted on October 22, 2008

It’s very, very hard for us to admit that we’re wrong, let alone that we have no clue what we’re doing. The first step of course is being able to admit that you’ve got a problem and before you can admit you have a problem you need to be able to recognise that you’ve got one.

I came to the shocking realisation the other day that I was guilty of not really knowing what I was doing. I came across a subtle bug in some of my code; my unit tests were passing but in production things weren’t working as I expected. I dug deeper and got a vague idea of what was happening, then I engaged in shotgun programming.

I knew I was doing it too. I had no idea what I was doing, and I knew it. I just started hacking away at my code, solving one problem and introducing another. Basically I knew where the problem was and, as if it were some small animal rustling in the bushes nearby, I just blasted away in its general direction and hoped I hit it, whatever it was.

I eventually solved the problem, and eventually understood what was going on; but it was touch and go there for a while. The problem wasn’t so much my trial and error approach, which did eventually solve the problem. It was more that I had no idea why I was trying the solutions I was trying, or more specifically why I thought they were plausible. I was just pointing in the general direction of the problem and letting loose.


Interfaces first

Posted on September 30, 2008

Don’t write spec documents, throw away any you have near you, they’re next to useless anyway. It might seem like heresy to some but it’s true, and deep down even the most anal planner in all of us knows it. Two of the biggest reasons for this are:

  • size and scope: In order to address every aspect of your project a full specification document has to, well, address every aspect of your project, and it’s going to be huge as a result.
  • lack of vision: For most people relating what’s written on paper to a vision of a design in their heads is difficult if not impossible.

Loose documents don’t work well either, because leaving the obvious unsaid just leads to problems, chiefly not everything is obvious to everyone. The devil is in the details and avoiding them until the end of a project is just going to cause trouble.

“Why haven’t we implemented x, that should have been obvious!”

create the interfaces

Prototype the interfaces and use them as your spec. Working this way has so many benefits; try it once and I promise you’ll be converted. Before you write a single line of backend code, sit down and work out your interfaces. Build each screen and put all the buttons, fields and elements on the page.

I normally start with index cards and write a few simple requirements for each screen or page – just a few simple sentences listing what the page is expected to achieve.

My next step is to turn those requirements into interfaces.

in HTML, not photoshop

Photoshop is great, but let’s face it, it’s no better for trying to work out a functional interface prototype than some paper and a pencil. It’s very easy to focus on making things look pretty over making them functional and you risk missing something important out.

The best way to avoid both of these traps is to write your interfaces in HTML, mark them up semantically as they will be marked up in the release and use them as your working spec.

I know good design but I’m no designer

Another benefit of coding up your interfaces in HTML before you start writing code is you can hand them off to designers so they can make them look pretty as you’re working on making them work. I have a pretty basic idea of design principles and the tools, but I find it hard work. So with interface first development I don’t have to think about it.

let the testing begin

Another thing which is made infinitely easier with interface first development is functional testing, because your testers don’t have to wait for a release before they can start developing a test suite. You are using an automated test suite, aren’t you? Because the interfaces exist before developers add the functionality behind them, testers can already have been through them and written a whole series of failing tests.

updating later

The great thing about using HTML as your design and communication tool is that when it comes to making a change to a page you have the most up to date working copy there ready to modify in the form of your site. Just save the page, add the change to the HTML and you’ve got your interfaces ready for discussion and later implementation. What could be easier, faster or more clear for everyone?

it’s not a panacea

There have been times when I’ve been working with notoriously difficult individuals where nothing was going to satisfy. An example that springs to mind is a multi month project where we produced an interface design early in the project. The design was a set of PDF images not HTML (see above), but the delivered product was a pixel perfect implementation of it. Only once the project was delivered was there any input on the design or functioning of the project, and I’m not just talking about things that a PDF can’t show you, like what clicking a link looks like. I’m talking about basic things like colours, layout and the size of elements!

Would an HTML mock up have made any difference? I actually doubt it. I take blame for miscommunication and resulting mistakes when it’s my fault, but in this case even after the site was fully developed and deployed to a staging area for testing none of these issues were highlighted. It was only after the site was put into production that these “critical issues” were discovered.

Is it better?

Posted on September 29, 2008

After reading this post what Jamie said hit me as well:

This is when I realized how trained I was in the processes at my former workplaces. This email would have been delayed until it was perfect[...] After fixing this there would be another thing and then another thing. A 2-day project would drag on for a week of redesign, approval, and development[...] It’s one thing to read Getting Real [...] It’s another thing to actually practice the principles. [...] That part is trickier than you think.

I don’t think it’s just Jamie and I don’t think it’s just his former workplaces. We’re all trained to make excuses not to launch; it’s endemic in the culture of most organisations. We endlessly pay lip service to the principles of release early and release often, agreeing in principle with the principles, but put very little of them into practice.

A manager’s role is to facilitate an organisation’s march towards better – all too often it’s a weak manager that needs constant input on projects and at the root it’s fear of inadequacy on their part that builds a culture of ass covering. It’s obvious that at 37 Signals they don’t suffer from being crippled organisationally when executive decisions need to be made but executives are absent. Their staff are trusted to make decisions and they’re empowered to release better features.

When your last change was ready to deploy, when it was better than what was there, did you release it? If not, how long did it take you to get from that point to actually releasing it and how many people had to give final approval?

Why not empower your staff with a simple test – Is it better than what we have? No flow charts, no organisational hierarchy; just a simple question.

Integration testing in maven – With Maven, Cargo, httpunit and Selenium

Posted on September 22, 2008

I’ve been trying to figure this out for months, and thought it should have been simple. All I wanted to do was write a set of unit tests to test my Java code and have them run whenever I hit test. Next I wanted a set of HttpUnit, Selenium or similar tests to check that the actual application is working when it’s built and deployed to a container using the Cargo plugin.

I didn’t really want to have a separate project to do this because it seemed like a massive pain in the ass to maintain the whole thing. I didn’t see any reason why it shouldn’t be easy with Maven either – it does have an integration-test phase already.

My first try looked good to me:

 

...
<build>
    ...
    <plugins>
        ...
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <configuration>
                <excludes>
                    <exclude>**/integration/**</exclude>
                </excludes>
            </configuration>
            <executions>
                <execution>
                    <id>integrationtest</id>
                    <phase>integration-test</phase>
                    <goals>
                        <goal>test</goal>
                    </goals>
                    <configuration>
                        <excludes>
                            <exclude>none</exclude>
                        </excludes>
                        <includes>
                            <include>**/integration/**/*Test*.java</include>
                            <include>**/integration/**/*Test.java</include>
                            <include>**/integration/**/*TestCase.java</include>
                        </includes>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        ...
    </plugins>
    ...
</build>
...

 

I tried to run the tests, but unfortunately when it got to the second round of tests – which should run my integration tests – it ran the same set of tests as it ran the first time. I tried adding the combine.children="append" attribute to the includes and excludes, but that didn’t work either. Finally I came across a source file, Xpp3Dom, part of Plexus, which Maven uses. It allowed the combine.children attribute as well as another I haven’t seen mentioned anywhere else, childMergeOverride. I added that instead and it worked!

 

...
<build>
    ...
    <plugins>
        ...
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <configuration>
                <excludes>
                    <exclude>**/integration/**</exclude>
                </excludes>
            </configuration>
            <executions>
                <execution>
                    <id>integrationtest</id>
                    <phase>integration-test</phase>
                    <goals>
                        <goal>test</goal>
                    </goals>
                    <configuration>
                        <excludes childMergeOverride="true">
                            <exclude>none</exclude>
                        </excludes>
                        <includes childMergeOverride="true">
                            <include>**/integration/**/*Test*.java</include>
                            <include>**/integration/**/*Test.java</include>
                            <include>**/integration/**/*TestCase.java</include>
                        </includes>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        ...
    </plugins>
    ...
</build>
...

 

 

Setting it up for Cargo

I’ve done this on a few projects. The first was a project that built a series of taglibs which are used across a number of applications. The release for the project is a jar, so I have a separate test web-app structure in ${basedir}/src/test/webapp. The configuration for generating this war looks like this:

 

...
<build>
    ...
    <plugins>
        ...
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-war-plugin</artifactId>
            <executions>
                <execution>
                    <id>generate-test-war</id>
                    <phase>pre-integration-test</phase>
                    <goals>
                        <goal>war</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
                <warSourceDirectory>${basedir}/src/test/webapp</warSourceDirectory>
                <warName>${project.artifactId}-test</warName>
                <webappDirectory>${basedir}/target/${project.artifactId}-test</webappDirectory>
                <primaryArtifact>false</primaryArtifact>
            </configuration>
        </plugin>
        ...
    </plugins>
    ...
</build>
...

 

During the pre-integration-test phase the above simply generates a test war. If your project is a web-app and it already generates a war you can skip the above, as one will be generated for you already.

The next step was to get Cargo to deploy the application:

 

...
<build>
    ...
    <plugins>
        ...
        <plugin>
            <groupId>org.codehaus.cargo</groupId>
            <artifactId>cargo-maven2-plugin</artifactId>
            <configuration>
                <wait>false</wait>
                <container>
                    <containerId>tomcat5x</containerId>
                    <zipUrlInstaller>
                        <url>${integrationtests.tomcatURL}</url>
                        <installDir>${installDir}</installDir>
                    </zipUrlInstaller>
                    <output>
                        ${project.build.directory}/tomcat5x.log
                    </output>
                    <log>${project.build.directory}/cargo.log</log>
                </container>
                <configuration>
                    <home>
                        ${project.build.directory}/tomcat5x/container
                    </home>
                    <properties>
                        <cargo.logging>high</cargo.logging>
                        <cargo.servlet.port>8080</cargo.servlet.port>
                    </properties>
                </configuration>
            </configuration>
            <executions>
                <execution>
                    <id>start-container</id>
                    <phase>pre-integration-test</phase>
                    <goals>
                        <goal>start</goal>
                        <goal>deploy</goal>
                    </goals>
                    <configuration>
                        <wait>false</wait>
                        <deployer>
                            <deployables>
                                <deployable>
                                    <location>${basedir}/target/${project.artifactId}-test.war</location>
                                    <type>war</type>
                                    <pingURL>http://localhost:8080/${project.artifactId}-test/index.html</pingURL>
                                    <pingTimeout>300000</pingTimeout>
                                    <properties>
                                        <context>${project.artifactId}-test</context>
                                    </properties>
                                </deployable>
                            </deployables>
                        </deployer>
                    </configuration>
                </execution>
                <execution>
                    <id>stop-container</id>
                    <phase>post-integration-test</phase>
                    <goals>
                        <goal>stop</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        ...
    </plugins>
    ...
</build>
...


Most of this is pretty self-explanatory. I try to use parameters to configure as much as I can and maintain them in higher level POMs wherever appropriate. This is especially true for the Tomcat URL, which changes on a regular basis: they seem to post new releases and remove the older ones, and it can be a chore to keep up in multiple projects. You may want to look at the Cargo plugin documentation; you can ignore some of what I’m doing above if your project’s default artifact (the one package creates) is a web-app.
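One part of the Cargo configuration that may be less obvious is the pingURL and pingTimeout pair. As I understand it, Cargo keeps requesting the ping URL until the application answers or the timeout (in milliseconds, so 300000 is five minutes) runs out, and only then lets the build move on to the integration tests. The sketch below shows the same idea in Python; it illustrates the polling technique and is not Cargo’s actual code. The function name and its parameters are mine.

```python
import time
import urllib.error
import urllib.request


def wait_for_url(url, timeout_s=300.0, interval_s=1.0):
    """Poll url until it answers successfully or timeout_s seconds elapse.

    Returns True as soon as a request succeeds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s + 5) as response:
                if 200 <= response.status < 400:
                    return True
        except (urllib.error.URLError, OSError):
            # Not up yet: connection refused, socket timeout or an HTTP error.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Something like wait_for_url("http://localhost:8080/myapp-test/index.html") mirrors what the deployable block above is configured to wait for; the myapp-test context is just an example.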

So there you have it, eventually I got there after a lot of digging. I’ve been pretty brief in my explanations and have assumed a fairly good understanding of Maven. If you have any questions by all means leave a comment and I’ll do my best to make it clearer.


Setting subversion up to work with contractors

Posted on July 28, 2008

In my last post I promised I’d write another about subversion and how to set things up to work with contractors. I was supposed to post this the next day, but I got caught up with other things.

Anyway, let’s just jump right in…

We run all our projects out of one big subversion repository; some people create a repository per project or group projects into a number of repositories. In all honesty it really doesn’t matter how you group things. I think that one repository works better for us: I can move things around and I can allow people access to all or parts of the repository. You may feel differently, but for the time being I’m going to assume you’re working with a single repository like us.

getting started: installing apache, subversion and mod_dav_svn

I recently upgraded to Subversion 1.5, there wasn’t a package available for the version of Fedora we’re running on our server so I had to build it. It was a relatively painless process of simply making sure dependencies existed and following the instructions in the INSTALL file.

http://subversion.tigris.org/getting.html

Go do that now, get subversion and mod_dav_svn installed. There are more than enough resources out there on how to install subversion that I’m not going to go into too much detail about building from source or installing on any specific platform.

In fact since the steps and concepts I’m going over in this post don’t require a specific version I’ll just throw up some common defaults.

If you’re on a Red Hat based linux system try:

# yum install subversion
# yum install mod_dav_svn

If you’re on a Debian based system try:

# apt-get install subversion
# apt-get install libapache2-svn

In your apache configuration file, near the module declarations make sure you have the following lines.

LoadModule dav_module /usr/lib/apache2/modules/mod_dav.so
LoadModule dav_svn_module /usr/lib/apache2/modules/mod_dav_svn.so

setting up a repository

Again moving quite quickly here, there are tons of other better resources for getting started with Subversion on the web, better than I could hope to write. The basic command you need to run simply creates the directory structure that subversion needs.

$ svnadmin create /path/to/repo

It’s usually best to create this somewhere near, but not in the document tree for the webserver.

Open your Apache config file, or the config file that stores the details for the virtual host you’re using, and add the following lines:


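<Location /svn>
    # /svn is an example URL path; use whatever suits your server
    DAV svn
    SVNPath /path/to/your/repository
    AuthType Basic
    AuthName "Subversion repository"
    AuthUserFile /path/to/your/passwdfile
    # only let in users who authenticate against the password file
    Require valid-user
</Location>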

Now create the htpasswd file and add a user or two using the following commands

# htpasswd -cm /etc/svn-passwd andrew
New password:
Re-type new password:
Adding password for user andrew

# htpasswd -m /etc/svn-passwd simon
New password:
Re-type new password:
Adding password for user simon

Restart apache and that’s it all done. You should now have a running subversion repository with two users, andrew and simon, they should be able to view and commit anywhere. We’ll assume these are staff members who can view and commit on any project.

Setting up for an external contractor

Now you want to bring a contractor in on a project, so let’s create them a user:


# htpasswd -m /etc/svn-passwd john
New password:
Re-type new password:
Adding password for user john

The next thing you need to do is set up svn-authz to control access to the repository. It is provided by the mod_authz_svn module, which needs to be loaded alongside mod_dav_svn. Back in your Apache config file, add the following line to your svn config:

AuthzSVNAccessFile /home/goroam/dev.goroam.net/user/repos/svn-authz

So that it looks something like this:


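<Location /svn>
    # /svn is an example URL path; use whatever suits your server
    DAV svn
    SVNPath /path/to/your/repository
    AuthType Basic
    AuthName "Subversion repository"
    AuthUserFile /path/to/your/passwdfile
    # only let in users who authenticate against the password file
    Require valid-user
    AuthzSVNAccessFile /path/to/your/repository/svn-authz
</Location>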

I tend to keep my svn-authz file within my repository path, you may wish to place it elsewhere – whatever works best for you.

The next step is to create the svn-authz file. That’s as simple as this:

[groups]
staff = andrew, simon
contractors = john

[/]
@staff = rw
* = r

[/external-project]
@contractors = rw

The file is pretty simple and self-explanatory. The first block, starting with [groups], defines the groups. In this case we’ve got two: one for staff, with andrew and simon, and one for contractors, with john as the single member. The usernames to use here are the same as you set in your htpasswd file. Authentication is still handled by standard Basic authentication; Subversion is only controlling access.

The next two blocks are paths within the repository. The first:


[/]
@staff = rw
* = r

Tells subversion to give all members of the staff group read and write access to everything under the / path – basically the whole repository. The next line * = r tells subversion to give everyone else, read access to everything.

Again this may not work for you, but we tend to allow read access to everything to everyone with a password. If we trust them enough to give them a password, we trust them. Also it allows contractors to build dependent libraries from the head, which is required at times and saves us the trouble of working out dependencies in our svn-authz file.


[/external-project]
@contractors = rw

In the next block, above, we’re giving everyone in the contractors group read and write access to everything in the /external-project path of our repository. This of course is the project they’re currently working on, so your path will be different.
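If it helps to see how those rules combine, here is a small Python sketch that reads the svn-authz file above and answers "what can this user do at this path?". It is a simplified model for illustration, not Subversion’s implementation: I’m assuming the most specific path section that mentions the user wins, that matching lines within a section are combined, and that anything unmatched falls back to the parent path. The function names are made up.

```python
import configparser

AUTHZ = """\
[groups]
staff = andrew, simon
contractors = john

[/]
@staff = rw
* = r

[/external-project]
@contractors = rw
"""


def load_authz(text):
    """Parse authz text into (groups, rules) dictionaries."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep user and group names case-sensitive
    parser.read_string(text)
    groups = {name: {user.strip() for user in members.split(",")}
              for name, members in parser["groups"].items()}
    rules = {section: dict(parser[section])
             for section in parser.sections() if section != "groups"}
    return groups, rules


def _matches(principal, user, groups):
    """Does a rule's left-hand side (*, @group or username) cover this user?"""
    if principal == "*":
        return True
    if principal.startswith("@"):
        return user in groups.get(principal[1:], set())
    return principal == user


def access_for(user, path, groups, rules):
    """Return '', 'r' or 'rw' for user at path under the simplified model."""
    parts = [part for part in path.split("/") if part]
    # Walk from the most specific path section up to the root section.
    for depth in range(len(parts), -1, -1):
        section = "/" + "/".join(parts[:depth])
        granted = set()
        matched = False
        for principal, rights in rules.get(section, {}).items():
            if _matches(principal, user, groups):
                matched = True
                granted |= set(rights)
        if matched:
            return "".join(flag for flag in "rw" if flag in granted)
    return ""
```

With the file above, john gets rw anywhere under /external-project and r everywhere else, while andrew and simon get rw throughout because nothing in the /external-project block mentions them and the lookup falls back to the root rules.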

That’s it. Contractors hired, up and running in your subversion in minutes.

Slightly more complicated, but easy now that you know how

Let’s just complicate things a little bit by adding two external contractors from one company and a third from a second company, working on two different projects.

Let’s say that CompanyA Limited are working on project-x. They’ve asked us to create accounts for Dick and Jane. Then we hire Dave from CompanyB to add a feature called zing-bang to project-y. You could, of course just create two accounts, one for each company – but we tend to avoid that. We like to know WHO committed what code. Not just where they were from, we work closely with our contractors and it helps us when we’re communicating with them.

We know how to create the users, but just to recap here we go again:


# htpasswd -m /etc/svn-passwd dick
New password:
Re-type new password:
Adding password for user dick

# htpasswd -m /etc/svn-passwd jane
New password:
Re-type new password:
Adding password for user jane

# htpasswd -m /etc/svn-passwd dave
New password:
Re-type new password:
Adding password for user dave

Assuming, as above we have andrew and simon as staff users we should make our svn-authz file look something like this:


[groups]
staff = andrew, simon
companya = dick, jane
companyb = dave

[/]
@staff = rw
* = r

[/project-x]
@companya = rw

[/project-y/branches/zing-bang-feature]
@companyb = rw

We’re only building slightly on the previous file with the groups entries, and that should be clear. The entry here for project-x should also look familiar: both dick and jane, members of the companya group, have read and write access to the path.

For the next entry we’ve created a branch and we have company b working on the branch. I’m really just throwing that out there, mostly because I can, but also to show that you can go to any depth in the repository tree granting access as you see fit.

In this case it’s been determined that the work CompanyB is engaged to complete only needs to take place in this branch, so while they can read the whole repository they cannot write anywhere except here. This would allow you to continue internal development on project-y (or indeed external development, with some more entries), making point releases while the zing-bang feature is developed, and CompanyB could merge the trunk in at will, since they have read access to it.

a lot of words

It was a lot of words, but the concepts are pretty simple. It doesn’t take long to set subversion up and even with basic auth and htpasswd files it’s not complicated to administer. So go on, get your contractors under control. I promise the dividends paid through increased accountability, visibility and communication will more than offset the time spent administering the system.
