Google Sheets Formula to Create a Right Outer Join

In my post The Ultimate Google Sheets Formula to Join Two Tables I offered a formula equivalent to a left join. Since publishing that post, a few people have commented asking for a Google Sheets formula equivalent to a right outer join. Here’s what I’ve come up with.

For this post, we’re going to use two tables.

user_idfirst_name
1Alice
2Bob
3Carol
4Dave
users
sale_iduser_idamount
1110.00
2320.00
3330.00
4140.00
sales

We want the end result to include every user from the “users” table and every sale linked to each user. For those users who have no sales linked to them, we want empty cells in the “sale_id” and “amount” columns. And if a user has more than one sale attributed to them, we want one row for each sale.

user_idfirst_namesale_idamount
1Alice110.00
1Alice440.00
2Bob
3Carol220.00
3Carol330.00
4Dave
join

This is the result we’d expect if we were to run the following SQL query.

SELECT users.user_id, first_name, sale_id, amount
FROM sales
RIGHT OUTER JOIN users
ON sales.user_id = users.user_id

Here’s what we’re going to do:

  1. On the “users” sheet, retrieve the first matching sales record from the “sales” sheet.
  2. On the “sales” sheet, retrieve the matching user from the “users” sheet.
  3. In a third sheet that we’ll name “join”, append the data from the “sales” sheet to the data from the “users” sheet.
  4. Return only the unique set of records from this concatenated range of data.
  5. Query our concatenated range of data, returning only those rows where there’s a user. (If for some reason there happened to be a row of sales data without a corresponding user, we wouldn’t show that row.)

And here’s the spreadsheet where you can see all the stuff we’re doing.

Retrieve the first sales record for each user

We’ll use a VLOOKUP to find the first sales record linked to each user.

=IFERROR(FILTER(VLOOKUP(A:A,{sales!B:B,sales!A:A,sales!C:C},{2,3},FALSE),A:A<>""))
user_idfirst_namesale_idamount
1Alice110.00
2Bob
3Carol220.00
4Dave
users

Retrieve the user matching each sales record

Again, we’ll use a VLOOKUP to find the user corresponding to each sale.

=IFERROR(FILTER(VLOOKUP(B:B,users!A:B,2,FALSE),A:A<>""))
sale_iduser_idamountfirst_name
1110.00Alice
2320.00Carol
3330.00Carol
4140.00Alice
sales

Concatenate the data from the “users” and “sales” tables

Create a new sheet named “join”.

In this new sheet, in cell A1, we’ll start with a formula to get all rows from the “users” table that have data.

=FILTER(users!A:D,users!A:A<>"")

Next, we want to get all rows from the “sales” table that have data, excluding the first row which contains the headers. (We’ve already gotten the headers from our “users” table.) The trick with the sales data, though, is that the columns in our “sales” table are not in the same order as the columns in our “users” table. So, we need to re-order the columns from the “sales” table to match the order in the “users” table. We use the FILTER function to retrieve only those rows that have data.

user_idfirst_namesale_idamount
1Alice110.00
2Bob
3Carol220.00
4Dave
users
sale_iduser_idamountfirst_name
1110.00Alice
2320.00Carol
3330.00Carol
4140.00Alice
sales

To re-order the columns in the “sales” table, we’ll create an array.

=FILTER({sales!B2:B,sales!D2:D,sales!A2:A,sales!C2:C},sales!A2:A<>"")

Again, we use the FILTER function to retrieve only those rows that have data. This formula will return a data set that looks like this:

user_idfirst_namesale_idamount
1Alice110.00
3Carol220.00
3Carol330.00
1Alice440.00
sales

Now, we’ll append the sales data to the user data by creating an array that combines these two formulas.

={FILTER(users!A:D,users!A:A<>"");FILTER({sales!B2:B,sales!D2:D,sales!A2:A,sales!C2:C},sales!A2:A<>"")}

This will produce a range that looks like this:

user_idfirst_namesale_idamount
1Alice110.00
2Bob
3Carol220.00
4Dave
1Alice110.00
3Carol220.00
3Carol330.00
1Alice440.00

Return only unique records

Notice that we’ve got some duplicate rows. For example, the first row (after the header) is the same as fifth row. We’ll use the UNIQUE function to return only unique records.

=UNIQUE({FILTER(users!A:D,users!A:A<>"");FILTER({sales!B2:B,sales!D2:D,sales!A2:A,sales!C2:C},sales!A2:A<>"")})

This gives us the following data set:

user_idfirst_namesale_idamount
1Alice110.00
2Bob
3Carol220.00
4Dave
3Carol330.00
1Alice440.00

Return only rows with a “user” record

Finally, we’ll use QUERY to return only those rows where there’s a user. We’ll also use QUERY to sort the data by “first_name” in ascending order.

=QUERY(UNIQUE({FILTER(users!A:D,users!A:A<>"");FILTER({sales!B2:B,sales!D2:D,sales!A2:A,sales!C2:C},sales!A2:A<>"")}),"SELECT * WHERE Col2 != '' ORDER BY Col2")

And there you have it. Effectively, the equivalent of a right outer join in Google Sheets.

user_idfirst_namesale_idamount
1Alice110.00
1Alice440.00
2Bob
3Carol220.00
3Carol330.00
4Dave

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.