My Outreachy internship with Mozilla

About me

I recently completed an Outreachy internship with the Socorro team at Mozilla. I wanted to gain experience working at a high-profile tech company and I chose this internship for a couple of reasons. I liked the sound of the Socorro project because it involved mainly front-end work, and I am sympathetic to the goal that Outreachy is trying to achieve: supporting the employment of women and other underrepresented groups in the tech industry.

I am about to finish a Master's in bioinformatics and will be starting a PhD in the field next term. I am most comfortable working with JavaScript, and my most recent university work has included writing statistics modules in Node.js and interactive graph tools using D3.

About Socorro

Socorro collects, processes and stores crash data about Firefox and other Mozilla products, so that problems can be identified and solved quickly. Crash-stats is a webapp built in the Django framework that provides users with a view into the crash data, and this is achieved partly through graphs. My job was to work on the Socorro team, guided by my mentor Adrian Gaudebert, to help maintain and improve crash-stats, focussing particularly on the graphs.

When I started the project, the graphs on crash-stats had been added at different points throughout the site’s history, often in response to a particular request from a user. Several different JavaScript graph libraries were being used:

graphs_old

Many of the graphs were in frequent use, but a few were disused or even broken. The plan was for me to redraw the useful graphs using a single graph library, and then to create an interactive tool for drawing customised graphs, as an early step towards a long-term goal of giving users more control over what data is visualised.

Redrawing the graphs

Once we had identified the useful graphs, my first task was to redraw them using Metrics Graphics, a Mozilla-made graph library specialising in time-series graphs. Metrics Graphics is built on top of D3, a powerful graph library that manipulates the DOM based on data. Drawing a graph directly in D3 entails specifying each element of an SVG, resulting in code that is difficult to maintain. Metrics Graphics instead provides a convenient API, makes many of the basic design decisions for you and adds interactivity.

We encountered a few bugs, which is to be expected from such a new library, but the great thing about using an open source library was that I was able to discuss issues with the team working on it and make upstream fixes myself where necessary. The team responded really quickly, advising me, merging my fixes and adding the issues I didn’t get round to fixing to their milestones. Here are some examples of the updated graphs:

graphs_new

The new graphs are more consistent in terms of colour, font, style and interactive behaviour. I prefer the Metrics Graphics philosophy of emphasising the data and giving the axes and labels a less overbearing presence, so that they serve as a reference rather than a major part of the visuals. This does result in Metrics Graphics having some slightly controversial policies, such as making the auto-generated axes shorter than the plots in some situations; however, they are starting to offer options to overrule such decisions. There is still some work to be done: some of the old graphs are still present and there are a few bugs concerning the axis labels and resizing.

Custom graph tool

Another aim of the project was to create a tool for drawing customised graphs on the new signature page. This page, a replacement for the old signature reports page, contains multiple tabs that allow the user to view the data in different ways. One of these tabs already offered customised data aggregations in table format, and we wanted to add a new tab for making graphs of this aggregated data, broken down per day.

As preparation for this new graphs tab, I reorganised the code for the tabs. The idea was to have a more object-orientated structure, centred around the tab and the panel within the tab, each panel containing a separate, customised visualisation of the data (e.g. a table or graph), according to the user’s parameters. Once the new middleware functionality to support this was implemented by the team, it was then relatively straightforward for me to add a tab for drawing graphs on multiple deletable panels, within this framework. Here is an example of the same data presented in table and graph format:

aggregations_graphs
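To give a flavour of the tab-and-panel structure, here is a minimal sketch. It is not the actual crash-stats code: the class names `Tab` and `Panel`, their methods and the example parameters are all hypothetical, and the render function just returns a string where the real one would draw a table or graph.

```javascript
// Hypothetical sketch of a tab that owns several deletable panels.
// Each panel keeps the user's parameters and knows how to render itself.
class Panel {
  constructor(id, params, render) {
    this.id = id;
    this.params = params; // e.g. which field to aggregate on
    this.render = render; // function (data, params) -> rendered output
  }

  draw(data) {
    return this.render(data, this.params);
  }
}

class Tab {
  constructor(name) {
    this.name = name;
    this.panels = new Map(); // panel id -> Panel, in insertion order
    this.nextId = 1;
  }

  addPanel(params, render) {
    const panel = new Panel(this.nextId++, params, render);
    this.panels.set(panel.id, panel);
    return panel;
  }

  // Returns true if a panel with this id existed and was removed.
  deletePanel(id) {
    return this.panels.delete(id);
  }

  drawAll(data) {
    return Array.from(this.panels.values(), (panel) => panel.draw(data));
  }
}

// Usage: a "Graphs" tab with two panels, one of which is then deleted.
const describeGraph = (data, params) =>
  `graph of ${data.length} crashes by ${params.field}`;

const graphsTab = new Tab("Graphs");
const productPanel = graphsTab.addPanel({ field: "product" }, describeGraph);
graphsTab.addPanel({ field: "platform" }, describeGraph);
graphsTab.deletePanel(productPanel.id);
console.log(graphsTab.drawAll([{}, {}, {}]));
```

Keeping the parameters on the panel rather than on the tab is what lets several panels show different views of the same data side by side.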

We faced a few dilemmas, for example the question of the maximum number of lines to display on a line graph. Aggregating crash data on certain fields can result in tables with many rows, and a graph depicting all of these rows could end up looking quite complicated:

multimultiline

We chose to display a maximum of four lines on each graph – depicting the four datasets with the highest number of crashes – and give a summary of the other rows as text below the graph, like this:

build_id

While higher numbers of crashes may be of greater interest, only showing the top four is potentially restrictive, so we have probably not quite found the ideal solution yet. One improvement could be to allow the user to click on and view any dataset.
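The top-four rule can be sketched roughly as follows. This is an illustration rather than the code that runs on crash-stats: the function names, the shape of the data (`term`, `points`, `count`) and the example values are all assumptions made for the example.

```javascript
// Hypothetical helper: rank the aggregated datasets by total crash count,
// keep the top `maxLines` for plotting and leave the rest for a text summary.
function splitTopSeries(series, maxLines = 4) {
  const withTotals = series.map((s) => ({
    term: s.term,
    points: s.points,
    total: s.points.reduce((sum, p) => sum + p.count, 0),
  }));
  // Sort a copy so the caller's array is left untouched; highest total first.
  const ranked = withTotals.slice().sort((a, b) => b.total - a.total);
  return {
    plotted: ranked.slice(0, maxLines),
    summarised: ranked.slice(maxLines),
  };
}

// Hypothetical helper: one line of text for the datasets that are not plotted.
function summariseRest(rest) {
  return rest.map((s) => `${s.term}: ${s.total} crashes`).join(", ");
}

// Made-up example data: five datasets, two days each.
const exampleSeries = [
  { term: "build-a", points: [{ date: "day 1", count: 5 }, { date: "day 2", count: 7 }] },
  { term: "build-b", points: [{ date: "day 1", count: 20 }, { date: "day 2", count: 30 }] },
  { term: "build-c", points: [{ date: "day 1", count: 1 }, { date: "day 2", count: 2 }] },
  { term: "build-d", points: [{ date: "day 1", count: 10 }, { date: "day 2", count: 9 }] },
  { term: "build-e", points: [{ date: "day 1", count: 40 }, { date: "day 2", count: 2 }] },
];

const split = splitTopSeries(exampleSeries);
console.log(split.plotted.map((s) => s.term)); // the four datasets to draw
console.log(summariseRest(split.summarised)); // text shown below the graph
```

Because the full ranking is computed before it is cut, letting the user pick any dataset to view would only need a different selection step, not a different aggregation.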

Profile page

The third and final strand of my internship was to create a profile page for signed-in users that would display user-specific information. This information, which includes a list of permissions and summaries of crash reports and API tokens, is currently displayed on three separate pages. The aim was to create a new page that would display all this information in one place, with a view to adding more personalised information in the future. The new page is visible here to a logged-in user, but is still being tested so it has not replaced the old pages yet.

There is lots to be done to the profile page in the future, and the team have mentioned plans to track what a user is most interested in and automatically show the most relevant graphs in a dashboard-like display.

Going forward

More generally there is plenty of work to be done on making crash-stats more personalisable, and this is the direction the team intend to take in the future, as summarised in this blog post by Adrian. I’ve heard talk of a move to reinforce the JavaScript, which I think would be a very good idea. The site is full of visual components such as panels and interactive components such as forms, that are mostly implemented separately, meaning that they come with different appearances, different behaviour and different bugs. To really sharpen up the user interface and code base in preparation for creating a more interactive site, it would be great if there could be a more library-like approach to the JavaScript, with one universal table, one universal datepicker, etc.

I think even the new Metrics Graphics graphs could be made more universal: at the moment, every time a line graph is drawn certain Metrics Graphics options are chosen to format the graphs according to preferences that work best for Socorro. It would make more sense to have a universal Socorro line graph that already had those options defined, and only pass in what is unique to each graph, such as the data.
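As a minimal sketch of what I mean, the shared options could live in one place and be merged with whatever is specific to each graph. The function name, the element id and the particular default values below are illustrative assumptions, not the options Socorro actually uses; the resulting object is the kind of thing that would be handed to Metrics Graphics' `MG.data_graphic` in the browser.

```javascript
// Illustrative defaults only: these are not Socorro's real settings.
const SOCORRO_LINE_DEFAULTS = {
  full_width: true,
  height: 300,
  area: false,
  x_accessor: "date",
  y_accessor: "count",
};

// Hypothetical wrapper: callers pass only what is unique to their graph
// and may override a default if they really need to.
function socorroLineGraphOptions(specific) {
  if (!specific || !specific.data || !specific.target) {
    throw new Error("socorroLineGraphOptions needs at least data and target");
  }
  // Later arguments win, so graph-specific values override the defaults.
  return Object.assign({}, SOCORRO_LINE_DEFAULTS, specific);
}

// Usage with made-up data and a made-up target element.
const crashesPerDayOptions = socorroLineGraphOptions({
  title: "Crashes per day",
  target: "#crashes-per-day",
  data: [
    { date: new Date(2015, 7, 1), count: 12 },
    { date: new Date(2015, 7, 2), count: 19 },
  ],
});
console.log(crashesPerDayOptions);

// In the browser, with Metrics Graphics loaded, this would then be:
// MG.data_graphic(crashesPerDayOptions);
```

A change to the house style would then be a one-line edit to the defaults rather than a hunt through every page that draws a graph.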

What I have learnt

Before this internship I would not have had the confidence to make a pull request to a team of people who didn’t know me, but during the internship I found myself contributing to two different Mozilla projects and I am looking forward to making further contributions in the future. Before I started working on Socorro I had never worked with Django, and had only worked with Python to a moderate level. It was a steep learning curve to understand how all the different components were interacting and to learn about concepts such as decorators and mocking, but by the end of the project I had managed to make a new Django app.

I’d like to thank the Socorro team for helping, teaching and encouraging me, and particularly Adrian, who was a very attentive and approachable mentor: he met with me every week, travelled to London and Paris to work alongside me and refused to let me make him a cup of tea on my first day, because “that’s not what interns are for”.

Final words about Outreachy

I hear that Mozilla are making some improvements to their Outreachy programme, which I welcome. I would advocate for Outreachy interns to be given all the same privileges as Mozilla interns, including a Mozilla account, which would have allowed me to participate in team meetings and integrate much better. While I appreciate that Outreachy has a lower barrier to entry than the Mozilla internship, I think it’s dangerous to have an outreach programme with fewer privileges and lower expectations, because it automatically categorises the people who are struggling to be employed in the industry as less employable. An outreach programme should get people from underrepresented groups through the door, and then give them the same opportunities as everyone else.

Mozilla could do more to raise awareness of the Outreachy programme among its employees. The Socorro team were wonderful and had had Outreachy interns before, but quite a few people I encountered had never heard of Outreachy. Having to explain to them that I was not a Mozilla intern reinforced the idea that I was a bit of an outsider: the very problem that Outreachy is trying to address.

These issues aside, I had a really great experience and learnt a huge amount, and from what I hear there are some very positive changes coming: I hope these will give future Outreachy interns the same experience as Mozilla interns.
