Excel's extracting data capability is not a new thing. However, it has become more flexible with Power Query. In this guide, we're going to show you how to extract data from a website in Excel.
Extracting data from a website in Excel by using Power Query
- Click From Web command under the Data tab, Get & Transform Data Section name might be different on your Excel because Microsoft has changed the name a few times.
- In the From Web dialog, you can stay on the Basic option to extract data from a public website. The advanced option is for websites that are using parameters to generate data. If the URL of your website is like this website.com?country=US?state=NY, you can update "country" and "state" parameters dynamically by using Advanced mode.
In our example, we stayed in Basic and use Wikipedia's FIFA World Cup 2022 address: https://en.wikipedia.org/wiki/2022_FIFA_World_Cup
- Clicking OK opens the Navigator window where you can see the preview of existing data in tables and select which ones to extract.
Our selection Group G.
- You have two options:
- Load: Loads the data you see in the Navigator
- Transform Data: Opens the Power Query window where you can alter the data before populating in the worksheet.
Because the first row of the data includes the titles, we are continuing with Transform Data button to remove that row.
- Power Query has a built-in command that can convert the first row to the header. Clicking Use First Row as Headers button promotes the rows to headers.
- You can see that there are no more duplicate headers. Also, Power Query automatically changed data types of columns like Pld, W and D from text to number because text values (header duplicates) are removed.
Use the Close & Load button to populate the table in the worksheet.
- Once imported you can use the values in your calculations as regularly. Do not make changes directly on the table, because Excel overwrites the table with up-to-date values every time the table (or data connections) is refreshed.
- You have a couple of options to refresh the data:
- Ctrl + F5: refresh worksheet
- Ctrl + Alt + F5: refresh workbook
- Right-click menu
- Data tab
- Query tab (Visible only if a cell in the table is selected)
- Of course, VBA
- ActiveWorkbook.Connections("test").Refresh 'Refresh specific connection named "test"
- ActiveWorkbook.RefreshAll 'Refresh all connections
- See that the how the table is updated after Cameroon - Serbia (3-3) match.