JavaScript Use Binary Search over Linear Search

Problem

When working with large arrays, checking to see if it contains a string can be costly on performance.

Story (TL;DR)

Whilst learning Java I learned about binary search forcollection types and that it's algorithm is a lot more performant than your regular linear search. Curiously I wanted to see if JavaScript also has binary search natively built within the language and to my surprise it isn't. The algorithm itself isn't complex and I recommend myself and others who read this to add it to your project. Ryan Day has built an NPM module that implements binary search with a bunch of useful functions (github repo). This will solve the problem with working on large arrays and should be favoured, however there is an exception if you know your array is going to always be small - you can ignore doing this.

Linear Search is faster for small arrays but slow for large ones.

Linear Search

Def: Linear; progressing from one stage to another in a single series of steps; sequential.

Linear Search is probably something you've done quite a lot in JS. To recap by example:


  const animals = ["Dog", "Cat", "Bird", "Rabbit", "Tiger", "Whale", "Frog"];

  // Linear example 1
  for (index in animals) {
    if (let animals[index] === "Tiger") {
      console.log("The Tiger says growl");
    }
  }

  // Linear example 2
  animals.forEach((animal) => {
    if (animal === "Whale") {
      console.log("The Whale makes a large splash");
    }
  });

  // Linear example 3
  if (animals.indexOf("Bird") {
    console.log("The Bird flys high in the sky");
  }

  // Linear example 4
  for (let i = 0; i < animals.length; i++) {
    if (animals[i] === "Rabbit") {
      console.log("Rabbit did a jump of the great wall");
    }
  }

For all the examples above, each will iterate through the array checking each item and if there is a match it will print out a log. For large arrays where there are thousands of records it would have to iterate through each one. What's interesting is that I've included indexOf. This array function does in fact use linear search. Reading the polyfill on MDN Website proves this claim.

Here is a snippet of it in action:


  do {
    if (that[index] === member) {
      return index;
    }
  } while (++index < length);

Binary Search

Referencing Wikipedia, here is an explanation on the binary search algorithm:

"In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful. If the search ends with the remaining half being empty, the target is not in the array."

Let's take a look how it's been implemented in Java:

java.util.Arrays version 8u40-b25


  2429    private static int binarySearch0(Object[] a, int fromIndex, int toIndex,
  2430                                     Object key) {
  2431        int low = fromIndex;
  2432        int high = toIndex - 1;
  2433
  2434        while (low <= high) {
  2435            int mid = (low + high) >>> 1;
  2436            @SuppressWarnings("rawtypes")
  2437            Comparable midVal = (Comparable)a[mid];
  2438            @SuppressWarnings("unchecked")
  2439            int cmp = midVal.compareTo(key);
  2440
  2441            if (cmp < 0)
  2442                low = mid + 1;
  2443            else if (cmp > 0)
  2444                high = mid - 1;
  2445            else
  2446                return mid; // key found
  2447        }
  2448        return -(low + 1);  // key not found.
  2449    }

Thankfully, doing a binary search isn't a lot of code, but there is an operator that I am unfamiliar with - bitwise logical operators. On line 2435 you can see one of those operators getting used, that one in particular is a right bit-shift operator. I remember learning these in C and got told that you'll most likely never use it unless you're working with robotics and other low level programming stuff. So, I threw it out of my brain in space for something else, but after seeing this in action I think I will have to do some research and write about these operators in anther post. For now line 2435 is doing something like this:


  int mid = (low + high) / 2;

By looking at the implementation in Java and referencing it against Wikipedia's explanation makes it easier to read. We can see that it's getting the mid point of the array and then comparing to see if the mid value matches the key and determining whether or not to go up or down the chain. It will loop through again and again getting new mid points, making it's steps quicker than iterating through one by one. And if the cmp is 0 the key is found. Without knowledge of the compareTo() method it is a little hard to understand. The compareTo() method returns the positioning difference between the comparisons from an ordered list. It will be easier to explain by example.


  String word1 = "hello";
  String word2 = "beatle";
  String word3 = "soup";

  System.out.println(word1.compareTo(word2)); // 6 because "b" is 6 times greater than "h" in the alphabet
  System.out.println(word1.compareTo(word3); // -11 because "s" is 11 times lower than "h"

Now that we better understand the binary search method, we can notice an issue. Since we are comparing via an organised data structure, where everything is already sorted in order, running this method on an array which is not sorted will cause issues and not work. Thus, an array must be sorted before use with binary search.

Ensure your array is sorted in order before using binary search

JavaScript implementation

Understanding Java's implementation we can translate it across to JavaScript.


  function binarySearch(list, key) {
    let low = 0;
    let high = list.length - 1;

    while (low <= high) {
      const mid = (low + high) >>> 1;
      const midVal = list[mid];
      const cmp = midVal.localeCompare(key);

      if (cmp < 0) {
        low = mid + 1;
      } else if (cmp > 0) {
        high = mid - 1;
      } else {
        return mid; // key found
      }
    }
    return -(low + 1); // key not found.
  }

A successful return will return the index of the key and if it's unsuccessful it will return -1. Now we can write more efficient code in JS.

NOTE: My JS version is not a performant as the Java method as localeCompare() only returns -1,0,1 which means it's travel is much shorter than Java's. Since JS doesn't have a compareTo Java-like-method we would have to create our own to match the same performance.