Systems and Data principles

For a year now the Systems and Data Working Group of Mozilla has been meeting, brainstorming about community building systems, designing and implementing them and pioneering new ways to measure contribution activity across Mozilla.

In the process of evaluating existing systems (like mozillians.org) and creating new ones (like Baloo) it became obvious that we needed a common set of principles to apply to all systems that are in service of community building within Mozilla. That would enable Mozillians to easily access tools and contribute in a way that maximizes impact. We as the Systems and Data Working Group recommend that these principles be adopted for all tools used by Mozilla.

The principles written in buckets are:

  • Unified Identity
    • Tools should have single source of truth for people data
      • Integration with HRIS
      • mozillians.org already has staff and volunteer information, so it is a good candidate at the single source of tr
    • Tools should de-duplicate people information by integrating with a single source of truth
    • e.g. Reps: Not integrated with Mozillians.org, lots of duplicate information on two profiles
  • Unified Authentication and Authorization
    • Tools should use a single identity platform that provides permissions-based access to tools (like Mozillians.org)
    • e.g. Reps: add people to the Reps group on mozillians.org to give them permission to use rep.mozilla.org as a Rep
  • Accessible Metrics
    • Tools should track each contribution a Mozillian makes and provide it in an accessible way to create a holistic view of contributions
  • Localization
    • Tools should be localized so they are accessible to our global community
  • Education
    • Tools should teach the user how to use the tool, answer common usage questions, and have general documentation
  • Recognition
    • Tools should recognize the contributions that they enable
  • Participation
    • Tools should enable anyone to improve the tool by filing bugs, suggesting ideas and bringing those ideas to life
  • Content de-duplication
    • Tools should de-duplicate the content that is created in those tools, making it accessible to other systems
  • Fun
    • Tools should be personal and written in the Mozilla voice

This has been a collaborative effort involving various stakeholders of tools within Mozilla, who have been reviewing the principles and providing feedback during our meetings. We are seeking more feedback moving forward, especially with regard to how the principles impact the roadmap of various tools and translate to actual features. Feel free to comment here or join our discussions in the community-building mailing list.

 

Connecting Baloo with areweamillionyet.org

It was only a matter of time to connect the dots. As we saw in a previous post, we have been working with Adam Lofting on publishing a public dashboard for contribution activity metrics. The data we had were based on one-off exports from Github for demo purposes. The intention was to feed the dashboard with data from Baloo, our single source of truth about contribution activity in Mozilla.

Thanks to the hard work of Sheeri Carbal, Anurag Phadke and community builders on various contribution areas across Mozilla, this connection is now live.

Navigating to areweamillionyet.org you can see the total counts of Active Contributors in Mozilla with drill-downs to specific teams and systems.

The data flow can be briefly described as follows: Databases for integrated systems (Github, Bugzilla, SuMo for now) are scraped for activity info based on our Schema, resulting in a formatted database full of raw contribution data. Then we apply aggregations per system and per area, as defined by Community Builders in our Conversion Points tables, to create active contributor counts while de-duplicating them across projects. Aggregations are then exported and captured by a Node.js app feeding info to our public dashboard.
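To make the aggregation step more concrete, here is a minimal sketch in Python. It is not Baloo's actual code: the row layout, the sample data and the names `RAW_ACTIVITIES`, `CONVERSION_POINTS` and `aggregate` are all assumptions made for illustration. It only shows the idea of filtering raw activity through per-system conversion points and counting each person once in the overall total.

```python
from collections import defaultdict

# Hypothetical raw activity rows: (system, contributor email, action).
RAW_ACTIVITIES = [
    ("github", "ada@example.org", "commit"),
    ("github", "ada@example.org", "commit"),
    ("bugzilla", "ada@example.org", "patch"),
    ("bugzilla", "bob@example.org", "comment"),
    ("sumo", "cy@example.org", "answer"),
]

# Hypothetical conversion points: the actions that count as a
# contribution in each system, as a community builder might define them.
CONVERSION_POINTS = {
    "github": {"commit"},
    "bugzilla": {"patch"},
    "sumo": {"answer"},
}


def aggregate(activities, conversion_points):
    """Return (active contributors per system, de-duplicated total)."""
    per_system = defaultdict(set)
    for system, email, action in activities:
        # Keep only the actions that qualify for this system.
        if action in conversion_points.get(system, set()):
            per_system[system].add(email.lower())
    counts = {system: len(people) for system, people in per_system.items()}
    # A person active in several systems is counted once overall.
    everyone = set().union(*per_system.values())
    return counts, len(everyone)


counts, total = aggregate(RAW_ACTIVITIES, CONVERSION_POINTS)
print(counts, total)
```

In this toy data set each system reports one active contributor, yet the overall total is two, because the same person is active in both Github and Bugzilla.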

More systems are in the pipeline to be integrated (Reps, MDN, Location Services and others) really soon. You can track the progress (and request integrations) through the Baloo wiki page.

Damned Lies and Contribution Metrics

The power of numbers is unquestionable. I never fully understood though, what it is. Possibly the urge of everyone to explain the world rationally. Or the need for reference to make any decision an “informed one”. Whatever it is, it drives people. Mozilla wouldn’t be an exception.

The ask was simple enough:

How many active contributors do we have in Mozilla?

No one knew last year that, a year later, we would have only scratched the surface of this question. But in the process of doing so we laid a solid foundation to move us forward.

Yesterday Adam Lofting announced the unified Mozilla Foundation and Mozilla Corporation contributors dashboard, which you can check out by visiting areweamillionyet.org. This has been a collaborative effort between both teams and an incredible journey so far, exploring and articulating notions of contribution metrics across the Mozilla Project.

Project Baloo (intro post here) is underway to supply all the data that will fuel the unified dashboard, starting with Bugzilla and Github data, thanks to Sheeri and Anurag from the BI team. Next in line are Reps, SuMo and MDN. All data will be gathered in a central database, in a common schema, updated almost instantly by the systems where the activities are happening. You can track the progress here.

Adam’s post has all the technical details about the current implementation (so I will not go into details here) but I would like to expand a bit around the importance of deduplication and cross-examination of metrics between different teams of Mozilla.

As a Community Builder inside Mozilla, you want metrics for your contribution area. You can see people come and go, but you have no idea whether those people are moving to other teams or leaving Mozilla completely. With cross-examination of contribution metrics we will be able to see trends and movements of people across different projects and teams for the first time.

Using deduplication of identities (based on emails) we will get a much more accurate count of people, which will improve even more once we integrate with Mozillians.org and Workday so we can deduplicate people using multiple emails. Anecdotally (and based on the initial real data we have) we know for sure that the actual count of active contributors will be considerably lower than the sum of active contributors on all teams.
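A small Python sketch can illustrate why the de-duplicated count ends up lower than the sum of team counts. Everything here is hypothetical: the team lists, the `ALIASES` mapping (standing in for what a profile source such as Mozillians.org could provide) and the `count_unique_people` helper are assumptions for illustration, not the real implementation.

```python
def count_unique_people(team_emails, aliases=None):
    """Count distinct people across teams, keyed by normalized email.

    `aliases` optionally maps a secondary email to a person's primary
    email, so that one person using several addresses is counted once.
    """
    aliases = aliases or {}
    people = set()
    for emails in team_emails.values():
        for email in emails:
            key = email.strip().lower()
            people.add(aliases.get(key, key))
    return len(people)


# Hypothetical active contributors per team, identified by email.
TEAM_EMAILS = {
    "l10n": ["ada@example.org", "bob@example.org"],
    "sumo": ["Ada@example.org", "ada@personal.example"],
}

# Hypothetical alias table: secondary email -> primary email.
ALIASES = {"ada@personal.example": "ada@example.org"}

sum_of_teams = sum(len(set(emails)) for emails in TEAM_EMAILS.values())
by_email = count_unique_people(TEAM_EMAILS)
with_aliases = count_unique_people(TEAM_EMAILS, ALIASES)
print(sum_of_teams, by_email, with_aliases)
```

With this sample data the per-team counts add up to four, email-based deduplication brings that down to three, and knowing that two addresses belong to the same person brings it down to two.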

Expect more updates to come as we roll in new integrations and new data sets become available.

Contribution Activity Metrics – Project Baloo

tl;dr version: You want contribution metrics? Project Baloo is here.

As we have seen in our previous posts, we decided to move ahead using learnings from the past. Here is how we tackled the issues identified:

  • we need to be a top line goal for people and teams
  • we need to examine really well what is out there (internally or externally to Mozilla) and investigate the possibility of re-using it.
    • We discovered that infrastructure that scales easily to handle huge amounts of data (in our case contribution activities) already exists.
  • we need a clear and common language to make communications as effective as possible
    • A new team (which I am proudly part of) was formed to develop and establish this common language among other things.
  • we need to be inclusive in all our procedures as a working group, with volunteers as well as all paid staff.
  • and in true Mozilla fashion: we need to start small, test and iterate with a focus on modularity.

Enter Project Baloo

What is project Baloo?

Project Baloo is a collaborative effort between the Business Intelligence and Data-Warehouse (BIDW) team and the Community Building team to create a contribution tracking system for Mozilla.

What does it look like?

Project Baloo is re-using already existing infrastructure of the BIDW team and adding some new entry-points and end-points for data import and export.

What can it do for me?

So, say you are part of a contribution area. You eagerly want to know more about your area's contributions, specifically metrics around them. Having your system integrated with Baloo will give you an easy way to visualize those contribution metrics (using Tableau) or even more advanced access to data, like cross-comparing and de-duplicating contribution metrics with other areas in Mozilla using an API!

Has it crossed your mind that people might be contributing to more than one area? Yes, they do! (We expect l10n-leave-my-sumo-contributors-alone type of reactions to our data)

What can I do for it?

Start integrating your systems with Baloo! More info can be found here and we are always here to help you along the process.

People love graphs so here is one to follow:

What is next?

We are polishing the data schema and publishing the first results from the test run on SuMo. You can follow the progress in our roadmap and participate in our System and Data Meetings if you want to help (or just follow updates!)

Contribution Activity Metrics – Early attempts and fails

As we examined in the intro post, the need for contribution activity metrics for different contribution areas in Mozilla has been high. It was only logical that many attempts were made to address this issue, mainly at the area level (and not at the Mozilla-wide level). Almost all of them had zero interaction with each other, and there was a general lack of vision for a holistic approach to the problem.

After one of our initial gatherings as the (then meta-) Community Building Team, a couple of people brainstormed together a possible solution to our problem. Together with Josh Matthews, Giorgos Logiotatidis, Ricky Rosario and Liz Henry a new approach was born. Enter project Blackhole!

Project Blackhole was a collaborative effort to develop and maintain an infrastructure of gathering and serving raw contribution data within Mozilla. We created a data architecture and flow together with a data Schema and specification to describe contribution activities for the first time in Mozilla. The project went far enough (thanks to Josh) to create a working prototype for back-end and front-end.

What went right:

Having a single project to drive multiple metrics efforts forward got people engaged. Everyone saw the value of de-duplicating efforts and tapping into that as a resource. Also during the process of designing and testing it we were able to self-identify as a group of people that share interest and commitment towards a common goal. Most of those people went on to become active members of the Systems and Data Working Group. Finally, we ended up with a common language and descriptions around contribution activities, a really valuable asset to have for the future of cross-project tracking.

What went wrong:

Building *anything* from scratch can be hard. Really hard. First, everyone (rightfully) questions the need to build something instead of re-using what is out there. Once you get everyone on board, development and deployment resources are hard to find, especially on short notice. On top of that, Blackhole’s architecture *seemed* logical enough in theory, but was never tested at scale, so no one involved was 100% sure that it would survive stress tests and the scale of Mozilla’s contribution ecosystem.

PRO TIP: Changing the project name does not help. We went from “Blackhole” to “Wormhole” (and back to “Blackhole”?), to better reflect the proposed data flow (data would not disappear forever!) and people got confused. Really confused. Which is obviously something that is not helpful during conversations. Pick a name, and stick to it!

The lack of a team dedicated to it and the inability to get the project listed as a personal goal of people (or teams) halted any progress, leading us to a fearsome dead-end.

What we learned:

As with most failures, this one was also really valuable. We learned that:

  • we need to be a top line goal for people and teams
  • we need to examine really well what is out there (internally or externally to Mozilla) and investigate the possibility of re-using it.
  • we need a clear and common language to make communications as effective as possible
  • we need to be inclusive in all our procedures as a working group, with volunteers as well as all paid staff.
  • and in true Mozilla fashion: we need to start small, test and iterate with a focus on modularity.

A way forward?

Having those lessons learned from the process, we sat down last December as a group and re-aligned. We addressed all 5 issues and now we are ready to move forward. And the name of it? Baloo. Stay tuned for more info on our next detailed post :)