Ranked Choice voting with Google Forms and Python

Ricardo Rosas
5 min read · Nov 25, 2020
Sankey Diagram with vote by ranking

Imagine that your team is voting on what to do for a team-building event. The three options are cooking together, doing arts & crafts, or playing football. You would really love to do arts & crafts, but you would HATE to play a sport with your work colleagues. If you feel that more people prefer cooking to arts & crafts, you may decide to vote for cooking as a ‘safe choice’, even though it isn’t your first preference.

When the stakes are higher, as in politics, this is a particular problem. You may like a 3rd party candidate, but people will very often tell you to vote for another candidate because “Voting for a 3rd party candidate is a vote for [insert party you like least]”.

This is where ranked choice voting comes into play. With ranked choice, rather than picking a single option (e.g. cooking), you list your preferences (e.g. #1 arts & crafts, #2 cooking, #3 football). If too few people voted for your #1, your vote goes to your #2 instead.

Here’s a video that explains this

In this article, I will walk you through how to run your own ranked-choice vote using Google Forms and Python (note: in this case, I do not stop when a choice reaches a majority, i.e. more than 50% of the votes, but only when two options are left).

First, let’s start with a Google Form. It’s quite simple: you just need to create a multiple choice grid that allows one vote per row and per column, like so:

Google Forms, editing
Final output

Afterwards, you’ll have all the results in a Google Spreadsheet:

Google Spreadsheet with results

Now we’re ready for some Python!

Basically, the steps below are as follows:

  • Import the data from Google Sheets into Google Colab
  • Clean the data
  • Calculate who has the most votes
  • Remove the choice with the least votes
  • Give their votes to their next preference
  • Repeat until you only have 2 left
  • Count the votes!
  • Graph the results using a Sankey diagram

Code for the geeks

You can download the notebook I used from my Github :)

Importing and cleaning the data

First we need to import the spreadsheet into the Google Colab notebook (very similar to a Jupyter notebook, but it makes pulling data from a Google spreadsheet easier). To import the data I used the following code (original source here).
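As a rough sketch of what such an import can look like, the helper below reads the form's response sheet with `gspread` from inside Colab. This is not the notebook's original code: the name `fetch_rows`, the question title in `SAMPLE_ROWS`, and the sample answers are made up for illustration, and the function only works inside a Colab session that can sign in to the Google account owning the sheet.

```python
def fetch_rows(sheet_name):
    """Return every row of the first worksheet as a list of lists of strings.

    Only works inside Google Colab; sheet_name is the spreadsheet's title.
    """
    # Imported here so that the rest of the file also loads outside Colab.
    from google.colab import auth
    from google.auth import default
    import gspread

    auth.authenticate_user()            # opens the Google sign-in prompt
    creds, _ = default()
    client = gspread.authorize(creds)
    return client.open(sheet_name).sheet1.get_all_values()


# Hypothetical example of what the returned rows look like for a
# "multiple choice grid" question: one column per candidate, ranks as text.
SAMPLE_ROWS = [
    ["Timestamp",
     "What should we do? [Cooking]",
     "What should we do? [Arts & crafts]",
     "What should we do? [Football]"],
    ["11/20/2020 10:01:12", "2", "1", "3"],
    ["11/20/2020 10:05:40", "1", "2", "3"],
]
```

The header arrives as the first data row and every rank is a string, which is why the cleaning step that follows is needed.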

This gets you a nice DataFrame that you still need to clean with love and affection, but let’s see what it looks like:

Fair enough, but there are five small problems with this data that you need to fix: getting rid of the timestamp column, changing the votes from strings to integers, resetting the index, using the first row as the header, and using the name of the candidate as the column name. I’ve got you covered:
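Here is a minimal plain-Python sketch of those cleaning steps. It is not the notebook's pandas code: it works on the raw rows (a list of lists of strings) and returns one dictionary per voter. The function name `clean_rows` and the sample question title are assumptions, and it assumes every cell is filled in and that the timestamp column is literally called "Timestamp".

```python
import re


def clean_rows(rows):
    """Turn raw sheet rows into a list of {candidate: rank} ballots."""
    header, *answers = rows                       # first row is the header

    # Drop the timestamp column, keep the position of everything else.
    keep = [i for i, title in enumerate(header)
            if title.strip().lower() != "timestamp"]

    # "What should we do? [Cooking]" -> "Cooking"
    names = {}
    for i in keep:
        match = re.search(r"\[(.+)\]\s*$", header[i])
        names[i] = match.group(1).strip() if match else header[i].strip()

    # Convert the ranks from strings to integers. The position of a ballot
    # in the returned list plays the role of the freshly reset index.
    return [{names[i]: int(row[i]) for i in keep} for row in answers]


raw = [
    ["Timestamp", "What should we do? [Cooking]",
     "What should we do? [Arts & crafts]", "What should we do? [Football]"],
    ["11/20/2020 10:01:12", "2", "1", "3"],
    ["11/20/2020 10:05:40", "1", "2", "3"],
]
ballots = clean_rows(raw)
# ballots[0] == {"Cooking": 2, "Arts & crafts": 1, "Football": 3}
```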

After cleaning the data, we can finally start to dig into those votes!

Beautiful votes
Beautifully displayed votes

Determining who won

Now that our data is clean and proper, we need a place to start. With a couple of simple operations, you can already see how many first-choice votes each candidate gets in the initial round.
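The first-round count can be sketched in a few lines of plain Python, assuming each ballot is a dictionary mapping a candidate to the rank that voter gave them (1 = favourite). The function name `first_round` and the five sample ballots are made up for illustration.

```python
def first_round(ballots):
    """Count how many ballots rank each candidate #1."""
    counts = {candidate: 0 for candidate in ballots[0]}
    for ballot in ballots:
        favourite = min(ballot, key=ballot.get)   # rank 1 is the smallest number
        counts[favourite] += 1
    return counts


sample_ballots = [
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 2, "Cooking": 1, "Football": 3},
    {"Arts & crafts": 3, "Cooking": 2, "Football": 1},
    {"Arts & crafts": 3, "Cooking": 1, "Football": 2},
]
print(first_round(sample_ballots))
# {'Arts & crafts': 2, 'Cooking': 2, 'Football': 1}
```

Candidates nobody ranked first still show up with a count of 0, which matters later when looking for the option with the fewest votes.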

First round of votes, one person, one vote

This new DataFrame, which I’m calling vote_rounds, will be at the center of the voting process. In this table, I will save the results of each election round.

OK, now for the fun part: the actual process of assigning votes to people. It goes as follows:

  • Find the person (or people) with the fewest votes (i.e. the fewest voters ranking them as their first choice).
  • If there’s more than one, find out who has the ‘worst’ ranking. In this case, I just sum up all the numbers (i.e. one person voted as #1, three voted as #3, two as #5, and so on). Whoever has the highest number loses :(
  • Determine who their votes would go to
  • Assign the votes to their next preferred candidate
  • Repeat the process until you have only two people left
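The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the notebook's code: ballots are dictionaries of candidate to rank, every voter ranks every candidate with a distinct number, and the name `run_rounds` is made up. The returned list plays the role of the vote_rounds table, with one tally per round.

```python
def run_rounds(ballots):
    """Eliminate the weakest candidate until two remain; return each round's tally."""
    remaining = sorted(ballots[0])
    # Tie-breaker: sum of all ranks a candidate received (higher is worse).
    rank_sum = {c: sum(b[c] for b in ballots) for c in remaining}
    rounds = []
    while True:
        # Every ballot counts for its best-ranked candidate still in the race.
        tally = {c: 0 for c in remaining}
        for ballot in ballots:
            top = min(remaining, key=lambda c: ballot[c])
            tally[top] += 1
        rounds.append(tally)
        if len(remaining) <= 2:
            return rounds
        # Fewest votes loses; ties go to whoever has the highest rank sum.
        fewest = min(tally.values())
        tied = [c for c in remaining if tally[c] == fewest]
        loser = max(tied, key=lambda c: rank_sum[c])
        remaining.remove(loser)


sample_ballots = [
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 2, "Cooking": 1, "Football": 3},
    {"Arts & crafts": 3, "Cooking": 2, "Football": 1},
    {"Arts & crafts": 3, "Cooking": 1, "Football": 2},
]
for number, tally in enumerate(run_rounds(sample_ballots), start=1):
    print(number, tally)
# 1 {'Arts & crafts': 2, 'Cooking': 2, 'Football': 1}
# 2 {'Arts & crafts': 2, 'Cooking': 3}
```

Reassigning votes does not need a separate step here: because each ballot is re-read every round, a vote for an eliminated option automatically moves to that voter's next preference.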

If you want to avoid all this code and the nice graph, you could also just sum up all the rankings at the very beginning and select the winner based on who has the lowest total (but where’s the fun in that?).
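For comparison, here is a small sketch of that shortcut, again assuming ballots are dictionaries of candidate to rank. The name `rank_sum_winner` is made up. Keep in mind that adding up ranks is a different counting rule from round-by-round elimination, so the two are not guaranteed to agree on every set of ballots.

```python
def rank_sum_winner(ballots):
    """Add up every rank a candidate received; the lowest total wins."""
    totals = {c: sum(b[c] for b in ballots) for c in ballots[0]}
    # sorted() makes the result deterministic if two totals are equal.
    winner = min(sorted(totals), key=totals.get)
    return winner, totals


sample_ballots = [
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 2, "Cooking": 1, "Football": 3},
    {"Arts & crafts": 3, "Cooking": 2, "Football": 1},
    {"Arts & crafts": 3, "Cooking": 1, "Football": 2},
]
print(rank_sum_winner(sample_ballots))
# ('Cooking', {'Arts & crafts': 10, 'Cooking': 8, 'Football': 12})
```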

Final Output: How the voting went

Now to the Sankey diagram

Full credit on how to use the Sankey diagram goes to Ken333135. To draw the Sankey diagram, we first need to do some data cleaning. One of the drawbacks of this Sankey diagram is that the node names have to be different each time. So I simply added a number at the end of each name and added up all identical voting patterns.
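The data preparation can be sketched like this in plain Python. It only builds the (source, target, count) triples that Sankey plotting tools typically expect and leaves the drawing itself out. The function names, the "name plus round number" label format, and the sample ballots are assumptions for illustration, not the notebook's code.

```python
from collections import Counter


def picks_per_round(ballots):
    """For each ballot, list the candidate it counts for in every round."""
    remaining = sorted(ballots[0])
    rank_sum = {c: sum(b[c] for b in ballots) for c in remaining}
    paths = [[] for _ in ballots]
    while True:
        tally = {c: 0 for c in remaining}
        for path, ballot in zip(paths, ballots):
            top = min(remaining, key=lambda c: ballot[c])
            path.append(top)
            tally[top] += 1
        if len(remaining) <= 2:
            return paths
        fewest = min(tally.values())
        tied = [c for c in remaining if tally[c] == fewest]
        remaining.remove(max(tied, key=lambda c: rank_sum[c]))


def sankey_links(paths):
    """Count identical round-to-round moves, with the round number in each label."""
    links = Counter()
    for path in paths:
        for r in range(len(path) - 1):
            source = f"{path[r]} {r + 1}"        # e.g. "Football 1"
            target = f"{path[r + 1]} {r + 2}"    # e.g. "Cooking 2"
            links[(source, target)] += 1
    return [(s, t, n) for (s, t), n in sorted(links.items())]


sample_ballots = [
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 1, "Cooking": 2, "Football": 3},
    {"Arts & crafts": 2, "Cooking": 1, "Football": 3},
    {"Arts & crafts": 3, "Cooking": 2, "Football": 1},
    {"Arts & crafts": 3, "Cooking": 1, "Football": 2},
]
for link in sankey_links(picks_per_round(sample_ballots)):
    print(link)
# ('Arts & crafts 1', 'Arts & crafts 2', 2)
# ('Cooking 1', 'Cooking 2', 2)
# ('Football 1', 'Cooking 2', 1)
```

The round number in each label keeps "Cooking" in round 1 and "Cooking" in round 2 apart as separate nodes, and the counter merges all voters who followed the same path into a single flow.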

and… voila!

Sankey Diagram with flow of votes

P.S: If you ever use this, let me know! Would love to hear in which context you used it :)
