In Brian Kardell’s post Where Browsers Come From, there’s an estimate of the cost of maintaining the three major browser engines:

What does the browser commons cost, per-user, per-year?

Based on what we know of team sizes, scales and budgets - several people seem to have arrived on a similar ballpark figure: Somewhere around 2 billion dollars per year.

This figure feels like an overestimate to me, so let’s dig a little further into this. I want to get my own very rough estimate to see if I reach a similar ballpark figure.

A reasonable place to start is the commit logs - while it won’t capture everyone who contributes to the browser commons, it’ll give us a lower bound from which we can estimate the number of additional roles (managers, developer relations, tech writers, etc).

We’ll remove some duplicates and free Gmail addresses (a likely signal that they, like me, are not being paid). Otherwise we’ll be very conservative and assume that all committers to the three major engines are employed full-time to advance the browser commons. That’s definitely not true, but I’m trying to find a more reasonable upper bound.

I’ll be looking at commits from the 2022 calendar year. While it was tempting to look at the trailing 12 months, having a repeatable time frame is useful if people disagree with my analysis. We’ll collate the email addresses from the three engines into one file for de-duplication. This will prevent double counting some people from Igalia who may contribute to multiple engines in the same calendar year.

Then, with the number of committers, we can determine a rough estimate of their total compensation. After that we’ll add 50% as a conservative upper bound on the cost of management, developer relations, tech writers, bug reporters, build infrastructure and sales (to occasionally negotiate the default search deal). This won’t include the cost of senior management, expenditure on hardware platforms, or expenditure on trying to diversify income streams. I’m just trying to get a rough estimate of the cost of maintaining and improving the browser engines themselves.

WebKit:

We’ll start with WebKit, because I’m more familiar with the project.

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | wc -l
269

The cut -d'@' -f1,2 removes duplicates due to trailing UUIDs in git authors:

[email protected]@
[email protected]@00000000-0000-0000-0000-000000000000
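
To apply the same filtering to each engine, the pipeline above can be wrapped in a small shell function. This is a sketch rather than the exact script used for this post: the function name normalize_emails is made up for illustration, and the output filename in the usage comment is assumed to match the contributors-*.txt files that appear in the summary.

```shell
# Hypothetical helper: reads one author e-mail per line on stdin and
# prints the sorted, de-duplicated list with gmail.com addresses removed.
normalize_emails() {
    grep -v 'gmail\.com' |   # drop free Gmail addresses
    cut -d'@' -f1,2 |        # keep user@domain, dropping any trailing @UUID
    sort -u                  # equivalent to sort | uniq
}

# Assumed usage, run from inside a checkout of the engine:
#   git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" \
#     | normalize_emails > contributors-webkit.txt
```

Piping the result through wc -l should give the same count as the one-liner, while the saved file can be reused later for de-duplication across engines.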

269 is a bit of an overestimate: only 207 of those are from Apple, Igalia and Sony. Let’s roll with it for now, we’re just looking to get an upper bound.

Firefox (Gecko):

I’m using the gecko-dev mirror.

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | wc -l
839

That’s a surprise to me! I had expected Gecko to be in the same ballpark as WebKit. Let’s look at the top 10 domains:

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | cut -d'@' -f2 | sort | uniq -c | sort -n --reverse | head -n 10
    244 mozilla.com
    169 chromium.org
     55 google.com
     42 users.noreply.github.com
     27 igalia.com
     22 apple.com
     14 microsoft.com
      9 outlook.com
      7 intel.com
      6 protonmail.com

Ah, okay. The Chromium, Google, Apple and (potentially) Igalia ones may end up being double-counted. There are also some free e-mail addresses and some GitHub noreply addresses in here, which likely belong to people who aren’t being paid. I’ll ignore those for now, we’re just looking to get an upper bound. I might revisit this later, since there’s a long tail of single-commit authors who are unlikely to be getting paid.
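
One way to size that long tail would be to count how many addresses show up on exactly one commit. The helper below is a sketch under that assumption; the name count_single_commit_authors is hypothetical, and it expects the raw git log output (one address per commit, before any sort | uniq).

```shell
# Hypothetical helper: reads one author e-mail per commit on stdin and
# prints how many distinct addresses appear on exactly one commit.
count_single_commit_authors() {
    cut -d'@' -f1,2 |   # strip trailing @UUID so variants of one address merge
    sort |
    uniq -c |           # prefix each address with its commit count
    awk '$1 == 1 { n++ } END { print n + 0 }'
}

# Assumed usage, run from inside a checkout:
#   git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" \
#     | count_single_commit_authors
```

A single commit doesn’t prove someone is unpaid, so this would only be a rough signal, not a filter to apply blindly.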

Chromium (Chrome):

Chrome uses git submodules, which makes this a little more difficult. Let’s start in src:

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | wc -l
2616

Woah! Another surprising result, that’s much higher than I expected. That’s a lotta contributors! Let’s look at the top 10 domains:

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | cut -d'@' -f2 | sort | uniq -c | sort -n --reverse | head -n 10
   1132 google.com
   1105 chromium.org
    101 microsoft.com
     41 intel.com
     33 igalia.com
     26 yandex-team.ru
     20 samsung.com
     11 opera.com
     11 navercorp.com
      9 chops-service-accounts.iam.gserviceaccount.com

Ok, maybe there’s some duplication here between google.com and chromium.org. Committers can gain a chromium.org e-mail address, and most use the same username as their google.com address. I’ll do some spot inspection first:

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -e chromium.org -e google.com | cut -d'@' -f1,2 | sort | uniq | less

Then I’ll figure out a way to count the duplication:

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -e chromium.org -e google.com | cut -d'@' -f1,2 | sort | uniq -c | wc -l
2241
$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -e chromium.org -e google.com | cut -d'@' -f1 | sort | uniq -c | wc -l
2038

Okay, only 203 duplicates. The Chrome team is clearly much larger than the WebKit and Firefox teams.
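
The subtraction can also be done more directly by asking uniq for repeated usernames. This is a sketch, not the command used above: the name count_shared_usernames is hypothetical, and it assumes its input is already a list of unique addresses. It should agree with the difference between the two counts as long as no username appears under three or more addresses.

```shell
# Hypothetical helper: reads unique e-mail addresses on stdin and prints
# the number of usernames that appear under more than one address.
count_shared_usernames() {
    cut -d'@' -f1 |          # keep only the username part
    sort |
    uniq -d |                # print each repeated username once
    awk 'END { print NR }'   # count lines without wc's padding
}

# Assumed usage, run from inside src:
#   git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" \
#     | grep -e chromium.org -e google.com | cut -d'@' -f1,2 | sort | uniq \
#     | count_shared_usernames
```

Matching on the username alone is a heuristic: two different people could share a username across the two domains, and one person could use different usernames on each.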

Let’s also grab src/v8 contributors. I don’t want to delve too deep into the codebase for a rough figure, but that feels like an important component to include.

$ git log --since "Jan 1 2022" --before "Jan 1 2023" --format="%ae" | grep -v gmail.com | cut -d'@' -f1,2 | sort | uniq | wc -l
191

Summary

Combining those e-mail lists together, we get:

$ # total contributors
$ cat contributors-chromium.txt contributors-gecko.txt contributors-v8.txt contributors-webkit.txt | sort | uniq | wc -l
3478
$ # all chromium.org + google.com emails
$ cat contributors-chromium.txt contributors-gecko.txt contributors-v8.txt contributors-webkit.txt | grep -e chromium.org -e google.com | cut -d'@' -f1,2 | sort | uniq -c | wc -l
2275
$ # removing duplicate usernames from chromium.org + google.com emails
$ cat contributors-chromium.txt contributors-gecko.txt contributors-v8.txt contributors-webkit.txt | grep -e chromium.org -e google.com | cut -d'@' -f1 | sort | uniq -c | wc -l
2063
$ echo "$(( 3478 - (2275 - 2063) ))"
3266

So eliminating 212 likely google / chromium duplicates, we’re at 3266 contributors, with roughly 63% of them (2063) being Chromium / Google. Assuming the average compensation is somewhere between a Google L4 and L5 at $320K, that’s $1,045,120,000. Adding 50% overhead we’re at $1,567,680,000 - roughly $1.6 billion.
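
For anyone who wants to plug in their own assumptions, the arithmetic can be sketched as a small shell function. The name estimate_cost and its argument order are made up for illustration; the inputs in the example call are the figures from this post.

```shell
# Hypothetical helper:
#   estimate_cost CONTRIBUTORS AVG_COMPENSATION OVERHEAD_PERCENT
# Prints the base compensation and the total including overhead,
# both in whole dollars.
estimate_cost() {
    base=$(( $1 * $2 ))
    total=$(( base * (100 + $3) / 100 ))
    echo "base=$base total=$total"
}

estimate_cost 3266 320000 50
# prints: base=1045120000 total=1567680000
```

Lowering the average compensation or the overhead percentage moves the total proportionally, which is why the final figure is so sensitive to those two guesses.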

The 50% overhead is likely an overestimate, the average compensation might be an overestimate, and there is a long tail of small contributions which probably don’t represent full-time employment on advancing the browser commons. There are probably better ways to de-duplicate contributors who use multiple e-mail addresses. I could also have spent more time in the Chromium project, enumerating all the submodules to get a more accurate picture of the number of contributors.

However, the $2 billion per year figure isn’t as outlandish as I thought! I’d been more familiar with the WebKit project and hadn’t accounted for just how wildly different the level of investment is between WebKit and Chromium. I can see how you might get to $2 billion with a different methodology and likely more information than I have.