Database functions are designed to perform simple operations on data that is in a “database-like structure”. Database-like structure means that the data must be in a table structure that has organized records with labels and appropriate separation. In this model, each row represents an individual record, and each column is a different type of information. For data that is in this structure, you can use Excel database functions to manipulate and manage your tables much more efficiently.
Excel Database Functions Overview
All Excel database functions, except for DGET, can be used just like the xIFS functions (SUMIFS, AVERAGEIFS, etc.). xIFS formulas perform the specific aggregation or operation on a certain column, with specified filters. On the other hand, the DGET function can grab a cell value as is, without any aggregation, if there is only a single result after filters are applied. Below is a list of Excel database functions.
Function | Description |
DAVERAGE | Returns the average of selected database entries |
DCOUNT | Counts the cells that contain numbers in a database |
DCOUNTA | Counts nonblank cells in a database |
DGET | Extracts from a database a single record that matches the specified criteria |
DMAX | Returns the maximum value from selected database entries |
DMIN | Returns the minimum value from selected database entries |
DPRODUCT | Multiplies the values in a particular field of records that match the criteria in a database |
DSTDEV | Estimates the standard deviation based on a sample of selected database entries |
DSTDEVP | Calculates the standard deviation based on the entire population of selected database entries |
DSUM | Adds the numbers in the field column of records in the database that match the criteria |
DVAR | Estimates variance based on a sample from selected database entries |
DVARP | Calculates variance based on the entire population of selected database entries |
The formulas look and work very much like their IFS counterparts, the only exception being the DGET function.
Syntax
All Excel database functions use the same syntax which has 3 arguments for data, field, and filters. All arguments are required.
Database | The range of cells containing the data itself. The top row of the range contains labels for each column. |
Field | The column that is to be used in calculations. A label (name) or position of a column can be entered. You can enter a name inside quotation marks, such as "Base Salary", or a position index, such as 7 for the 7th column. |
Criteria | The range of cells that contain the conditions that will determine which records are to be included in the calculations. Criteria has label of column(s) to be filtered and condition(s) under the label. |
Building a criteria
Building the criteria logic might seem intimidating at first. You can create conditions and join them using AND and OR logical operators. The idea is that every condition inside a row is connected with an AND logic, whereas every row is connected with an OR. Let's see these logical statements in examples. Feel free to download our sample workbook below.
Basics
Enter the field names (columns) in a single row, where you would like to add to the criteria. For example, if we want to filter Atlanta values in a column named Location, our criteria should be like below.
If we need to add a second value for Location, the second value should be placed under the first one. Below criteria refers to the records that has Atlanta or Valdosta values in the corresponding Location. This is an OR connection.
Alternatively, if we want to get records that have Atlanta for Location AND Year smaller than 2017, we need to use the same row.
Note that you can use <, >, <=, >= and <> operators as well. Although, Excel suggests using = with equal conditions, this is not mandatory. The database formulas support * and ? wildcards for unknown characters. To learn more about wildcard characters in Excel, please see How to use a wildcard in Excel formula.
Complex Examples
Multiple AND logic
Boolean logic: (Location = Atlanta AND Year > 2011 AND Year < 2018)
Records in Atlanta between 2011 and 2018.
Multiple OR statements
Boolean logic: (Location = Atlanta OR Location = Dothan OR Location = Valdosta)
Records in Atlanta or Dothan or Valdosta.
OR logic between different fields
Boolean logic: (Location = Atlanta OR Department <> R&D)
Records in Atlanta or not in R&D department.
You can leave fields empty when you do not need them in the criteria.
Combination of AND or OR logics
Boolean logic: ( (Location = Atlanta AND Year > 2011 AND Year < 2018) OR (Location = Dothan AND Department <> R&D) )
Records in Atlanta between 2011 and 2018, or in Dothan but not in R&D department.
Wildcards
Boolean logic: (Full Name = Dana* AND Location = Atlanta AND Year = 2018)
Records with names starting with Dana in Atlanta and in 2018. The department is irrelevant in this case.
Advantages & Disadvantages
Once you understand the structural requirements, managing data in a database-like structure becomes much easier. Finally, let’s take a quick look at some of the advantages and disadvantages of using Excel database formulas, instead of traditional aggregation methods.
Advantages
- Easy to create and modify filters and target column without updating formulas.
- Easy to manage complex filters.
- A single named range is enough in most cases.
- Supports AND and OR logic checks (xIFS functions only supports AND).
Disadvantages
- Data must have headers
- Slight learning curve