My Impossible Story

Keeping up my bi-yearly blogging cadence, I thought it might be fun to write about what I’ve been doing since I left Mozilla. It’s also a convenient time, as it coincides with our work being open-sourced and made public (and of course, developed in public, because otherwise what’s the point, right?). Somewhat ironically, I’ve been working on another machine-learning project, though I’m loath to call it that, as it uses no neural networks so far, and most people I’ve encountered consider the two to be synonymous. I did also go on a month’s holiday to the home of bluegrass music, but that’s a story for another post. I’m getting ahead of myself here.

Some time in March I met up with some old colleagues/friends and of course we all got to chatting about what we’re working on at the moment. As it happened, Rob had just started working at a company run by a friend of our shared former boss, Matthew Allum. What he was working on sounded like it would be a lot of fun, and I had to admit that I was a little jealous of the opportunity… But it so happened that they were looking to hire, and I was starting to get itchy feet, so I got to talk to Kwame Ferreira and one thing led to another.

I started working for Impossible Labs in July, on an R&D project called ‘glimpse’. The remit for this work hasn’t always been entirely clear, but the pitch was that we’d be working on augmented reality technology to aid social interaction. There was also this video:

How could I resist?

What this has meant in real terms is that we’ve been researching and implementing a skeletal tracking system (think motion capture without any special markers/suits/equipment). We’ve studied Microsoft’s freely-available research on the skeletal tracking system for the Kinect and, filling in some of the gaps, implemented something that is probably very similar. We’ve not had much time yet, but it does work and you can download it and try it out now if you’re an adventurous Linux user. You’ll have to wait a bit longer if you’re less adventurous or you want to see it running on a phone.

I’ve worked mainly on implementing the tools and code to train and use the model we use to interpret body images and infer joint positions. My prior experience on the DeepSpeech team at Mozilla was invaluable for this. It gave me the prerequisite knowledge and vocabulary to be able to understand the various papers around the topic, and to realistically implement them. Funnily, I initially tried using TensorFlow for training, with the idea that it’d help us to easily train on GPUs. It turns out re-implementing it in native C was literally 1000x faster and allowed us to realistically complete training on a single (powerful) machine, in just a couple of days.

My take-away from this is that TensorFlow isn’t necessarily the tool for all machine-learning tasks, and that you should analyse the graphs it produces thoroughly to make sure you don’t have any obvious bottlenecks. A lot of TensorFlow nodes do not have GPU implementations, for example, and it’s very easy to absolutely kill performance by requiring frequent data transfers to happen between CPU and GPU. It’s also worth noting that a large graph has a huge amount of overhead that will be unrelated to the actual operations you’re trying to run. I’m no TensorFlow expert, but it’s definitely a particular tool for a particular job and it’s worth being careful. Experts can feel free to look at our repository history and tell me all the stupid mistakes I was making before we rewrote it 🙂
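To make the point about CPU/GPU transfers a bit more concrete, here’s a minimal sketch of the kind of check I mean. It is not TensorFlow code and not what we actually ran: the graph, the op names and the `count_device_transfers` helper are all made up for illustration. The idea is just that every edge between an op placed on the CPU and one placed on the GPU implies a copy, so counting those edges gives a rough idea of where a graph might be bouncing data around.

```python
from collections import Counter


def count_device_transfers(placement, edges):
    """Count graph edges whose producer and consumer sit on different devices.

    placement: dict mapping op name -> device name (hypothetical labels)
    edges: list of (producer_op, consumer_op) pairs
    """
    transfers = Counter()
    for src, dst in edges:
        if placement[src] != placement[dst]:
            transfers[(placement[src], placement[dst])] += 1
    return transfers


# Hypothetical toy graph: names and placements are invented for illustration.
placement = {
    "load": "CPU",
    "matmul": "GPU",
    "unsupported_op": "CPU",  # stands in for a node with no GPU implementation
    "reduce": "GPU",
}
edges = [
    ("load", "matmul"),
    ("matmul", "unsupported_op"),
    ("unsupported_op", "reduce"),
]

transfers = count_device_transfers(placement, edges)
total = sum(transfers.values())
print(f"{total} cross-device edges: {dict(transfers)}")
```

In this toy graph, a single CPU-only node in the middle of an otherwise GPU-friendly chain forces two extra copies on every run, on top of the initial upload. In a real graph you’d get the placement information from the framework’s own tooling rather than writing it by hand.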

So what’s it like working at Impossible on a day-to-day basis? I think a picture says a thousand words, so here’s a picture of our studio:

Though I’ve taken this from the Impossible website, this is seriously what it looks like. There is actually a piano there, and it’s in tune and everything. There are guitars. We have a cat. There’s a tree. A kitchen. The roof is glass. As amazing as Mozilla (and many of the larger tech companies) offices are, this is really something else. I can’t overstate how refreshing an environment this is to be in, and how that impacts both your state of mind and your work. Corporations take note, I’ll take sunlight and life over snacks and a ball-pit any day of the week.

I miss my 3-day work-week sometimes. I do have less time for music than I had, and it’s a little harder to fit everything in. But what I’ve gained in exchange is a passion for my work again. This is code I’m pretty proud of, and that I think is interesting. I’m excited to see where it goes, and to get it into people’s hands. I’m hoping that other people will see what I see in it, if not now, sometime in the near future. Wish us luck!

Goodbye Mozilla

Today is effectively my last day at Mozilla, before I start at Impossible on Monday. I’ve been here for 6 years and a bit and it’s been quite an experience. I think it’s worth reflecting on, so here we go. Fair warning, if you have no interest in me or Mozilla, this is going to make pretty boring reading.

I started on June 6th 2011, several months before the (then new, since moved) London office opened. Although my skills lay (lie?) in user interface implementation, I was hired mainly for my graphics and systems knowledge. Mozilla was in the region of 500 or so employees then I think, and it was an interesting time. I’d been working on the code-base for several years prior at Intel, on a headless backend that we used to build a Clutter-based browser for Moblin netbooks. I wasn’t completely unfamiliar with the code-base, but it still took a long time to get to grips with. We’re talking several million lines of code with several years of legacy, in a language I still consider myself to be pretty novice at (C++).

I started on the mobile platform team, and I would consider this to be my most enjoyable time at the company. The mobile platform team was a multi-discipline team that did general low-level platform work for the mobile (Android and MeeGo) browser. When we started, the browser was based on XUL and was multi-process. Mobile was often the breeding ground for new technologies that would later go on to desktop. It wasn’t long before we started developing a new browser based on a native Android UI, removing XUL and relegating Gecko to page rendering. At the time this felt like a disappointing move. The reason the XUL-based browser wasn’t quite satisfactory was mainly due to performance issues, and as a platform guy, I wanted to see those issues fixed, rather than worked around. In retrospect, this was absolutely the right decision and led to what I’d still consider to be one of Android’s best browsers.

Despite performance issues being one of the major driving forces for making this move, we did a lot of platform work at the time too. As well as being multi-process, the XUL browser had a compositor system for rendering the page, but this wasn’t easily portable. We ended up rewriting this, first almost entirely in Java (which was interesting), then with the rendering part of the compositor in native code. The input handling remained in Java for several years (pretty much until FirefoxOS, where we rewrote that part in native code, then later, switched Android over).

Most of my work during this period was based around improving performance (both perceived and real) and fluidity of the browser. Benoit Girard had written an excellent tiled rendering framework that I polished and got working with mobile. On top of that, I worked on progressive rendering and low precision rendering, which combined are probably the largest body of original work I’ve contributed to the Mozilla code-base. Neither of them is really active in the code-base at the moment, which shows how good a job I didn’t do maintaining them, I suppose.

Although most of my work was graphics-focused on the platform team, I also got to do some layout work. I worked on some over-invalidation issues before Matt Woodrow’s DLBI work landed (which nullified that, but I think that work existed in at least one release). I also worked a lot on fixed position elements staying fixed to the correct positions during scrolling and zooming, another piece of work I was quite proud of (and probably my second-biggest contribution). There was also the opportunity for some UI work, when it intersected with platform. I implemented Firefox for Android’s dynamic toolbar, and made sure it interacted well with fixed position elements (some of this work has unfortunately been undone with the move from the partially Java-based input manager to the native one). During this period, I was also regularly attending and presenting at FOSDEM.

I would consider my time on the mobile platform team a pretty happy and productive time. Unfortunately for me, those of us with graphics specialities on the mobile platform team were taken off that team and put on the graphics team. I think this was the start of a steady decline in my engagement with the company. At the time this move was made, Mozilla was apparently trying to consolidate teams around products, and this was the exact opposite happening. The move was never really explained to me and I know I wasn’t the only one that wasn’t happy about it. The graphics team was very different to the mobile platform team and I don’t feel I fit in as well. It felt more boisterous and less democratic than the mobile platform team, and as someone that generally shies away from arguments and just wants to get work done, it was hard not to feel sidelined slightly. I was also quite disappointed that people didn’t seem particularly familiar with the graphics work I had already been doing and that I was tasked, at least initially, with working on some very different (and very boring) desktop Linux work, rather than my speciality of mobile.

I think my time on the graphics team was pretty unproductive, with the exception of the work I did on b2g, improving tiled rendering and getting graphics memory-mapped tiles working. This was particularly hard as the interface was basically undocumented, and its implementation details could vary wildly depending on the graphics driver. Though I made a huge contribution to this work, you won’t see me credited in the tree unfortunately. I’m still a little bit sore about that. It wasn’t long after this that I requested to move to the FirefoxOS systems front-end team. I’d been doing some work there already and I’d long wanted to go back to doing UI. It felt like I either needed a dramatic change or I needed to leave. I’m glad I didn’t leave at this point.

Working on FirefoxOS was a blast. We had lots of new, very talented people, a clear and worthwhile mission, and a new code-base to work with. I worked mainly on the home-screen, first with performance improvements, then with added features (app-grouping being the major one), then with a hugely controversial and probably mismanaged (on my part, not my manager – who was excellent) rewrite. The rewrite was good and fixed many of the performance problems of what it was replacing, but unfortunately also removed features, at least initially. Turns out people really liked the app-grouping feature.

I really enjoyed my time working on FirefoxOS, and getting a nice clean break from platform work, but it was always bitter-sweet. Everyone working on the project was very enthusiastic to see it through and do a good job, but it never felt like upper management’s focus was in the correct place. We spent far too much time kowtowing to the desires of phone carriers and trying to copy Android and not nearly enough time on basic features and polish. Up until around v2.0 and maybe even 2.2, the experience of using FirefoxOS was very rough. Unfortunately, as soon as it started to show some promise and as soon as we had freedom from carriers to actually do what we set out to do in the first place, the project was cancelled, in favour of the whole Connected Devices IoT debacle.

If there was anything that killed morale for me more than my unfortunate time on the graphics team, and more than having FirefoxOS prematurely cancelled, it would have to be the Connected Devices experience. I appreciate it as an opportunity to work on random semi-interesting things for a year or so, and to get some entrepreneurship training, but the mismanagement of that whole situation was pretty epic. To take a group of hundreds of UI-focused engineers and tell them that, with very little help, they should organise themselves into small teams and create IoT products still strikes me as an idea so crazy that it definitely won’t work. Certainly not the way we did it anyway. The idea, I think, was that we’d be running several internal start-ups and we’d hopefully get some marketable products out of it. What business a not-for-profit company, based primarily on doing open-source, web-based engineering has making physical, commercial products is questionable, but it failed long before that could be considered.

The process involved coming up with an idea, presenting it and getting approval to run with it. You would then repeat this approval process at various stages during development. It was, however, very hard to get approval for enough resources (both time and people) to finesse an idea long enough to make it obviously a good or bad idea. That aside, I found it very demoralising to not have the opportunity to write code that people could use. I did manage it a few times, in spite of what was happening, but I wouldn’t say I’m particularly proud of any of that work. Lots of very talented people left during this period, and then at the end of it, everyone else was laid off. Not a good time.

Luckily for me and the team I was on, we were moved under the umbrella of Emerging Technologies before the lay-offs happened, and this also allowed us to refocus away from trying to make an under-featured and pointless shopping-list assistant and back onto the underlying speech-recognition technology. This brings us almost to present day now.

The DeepSpeech speech recognition project is an extremely worthwhile project, with a clear mission, great promise and interesting underlying technology. So why would I leave? Well, I’ve practically ended up on this team by a series of accidents and random happenstance. It’s been very interesting so far, I’ve learnt a lot and I think I’ve made a reasonable contribution to the code-base. I also rewrote python_speech_features in C for a pretty large performance boost, which I’m pretty pleased with. But at the end of the day, it doesn’t feel like this team will miss me. I too often spend my time finding work to do, and to be honest, I’m just not interested enough in the subject matter to make that work long-term. Most of my time on this project has been spent pushing to open it up and make it more transparent to people outside of the company. I’ve added model exporting, better default behaviour, a client library, a native client, Python bindings (+ example client) and most recently, Node.js bindings (+ example client). We’re starting to get noticed and starting to get external contributions, but I worry that we still aren’t transparent enough and still aren’t truly treating this as the open-source project it is and should be. I hope the team can push further in this direction without me. I think it’ll be one to watch.

Next week, I start working at a new job doing a new thing. It’s odd to say goodbye to Mozilla after 6 years. It’s not easy, but many of my peers and colleagues have already made the jump, so it feels like the right time. One of the big reasons I’m moving, and moving to Impossible specifically, is that I want to get back to doing impressive work again. This is the largest regret I have about my time at Mozilla. I used to blog regularly when I worked at OpenedHand and Intel, because I was excited about the work we were doing and I thought it was impressive. This wasn’t just youthful exuberance (he says, realising how ridiculous that sounds at 32); I still consider much of the work we did to be impressive, even now. I want to be doing things like that again, and it feels like Impossible is a great opportunity to make that happen. Wish me luck!