I have a task I am asked to do fairly regularly that involves searching through a large spreadsheet (around 4,000 rows with about a dozen columns) looking for very specific text. There are, of course, various ways I could accomplish this. But I finally found a solution that speeds this up and want to document it so I don’t have to figure it out again the next time I have to perform this task.
Here’s the scenario: I have a list of several thousand students at my university. I’m trying to find students who meet specific criteria: major or minor in an area, rank has to be a junior or senior, and they have to have a specific GPA. Sorting by GPA and removing those below the cutoff is easy enough, so I’m going to ignore that. It’s also fairly easy to remove sophomores and freshmen since there are limited options and this is just a single column (again, sort and remove those I don’t need).
Why this task is a little difficult is that there are three columns I have to search for the specific majors: Major 1, Major 2, and Minor. I can, of course, sort by major and minor and look for the specific one, but there are hundreds of majors and minors and the majors can be in either of the two major columns. I made up fake data to illustrate. Look at where “Psychology” appears in all of these columns. This makes it rather challenging to find all of the students majoring or minoring in Psychology.
To search just a single column is easy enough. I created the following function to do this:
=ISNUMBER(SEARCH("Psychology",B2))
Let me break this down for you. The first part of the formula, “ISNUMBER,” returns “TRUE” if a number is observed. It returns “FALSE” if not. The second component, “SEARCH,” looks for a passage of text inside a cell. The passage of text is in quotes (“Psychology”) and the cell follows that (“, B2”). If it finds the target text in the target cell, it returns a “1,” otherwise it returns “#VALUE!”. Combined, these two functions will generate a TRUE or FALSE because, if the SEARCH function finds the text, it generates a “1” and that triggers the ISNUMBER function to return TRUE. If the text is not found, it generates “#VALUE!”, which is not a number, and that returns FALSE.
But I need to search across multiple columns. The SEARCH function doesn’t natively search across multiple columns or cells. If you try to tell it to do so, it won’t work (e.g., B2:D2). But there is a cheat or workaround here: CONCATENATE the contents of the cells. Here’s the function I used to search across the three cells at the same time:
=ISNUMBER(SEARCH("Psychology",CONCATENATE(B2,C2,D2))
This function is very similar to the one above but combines all of the text in the three cells and then searches inside the combined text. If the target is in the combined text it will generate a “1”, telling the “ISNUMBER” function that it is TRUE. Otherwise, it will generate the “#VALUE!”, which tells the “ISNUMBER” function that it is FALSE. I can then sort the column where the function is and quickly find all of the individuals who meet the criteria.
I wish I could take credit for this workaround but I can’t. I found it on Stack Exchange.
Leave a Reply