Using urllib.request.urlretrieve()

Vishal Shrestha

Downloading Files From URLs With Python (03:22)

Download link mentioned in this lesson: https://api.worldbank.org/v2/en/indicator/NY.GDP.MKTP.CD?downloadformat=csv

00:00 Now let’s get practical and see how you can actually download a file using the urlretrieve() function. The link to download the file that we’ve used in this example can be found in the description below the video.

00:13 Now let’s go ahead and download this file.

00:18 Notice that I’ve changed my directory to the Real-Python folder on the D drive. This is important because the directory you start your REPL from is where your file will be saved by default. Later on, I’ll show you how you can save your downloaded file to any directory or path. The first step to download a file using the urlretrieve() function is to import the urllib.request module.

00:44 Now, you just define the URL from which you want to download the file.

00:51 Again, you can find this link in the description below the video.

00:56 Now, set a filename that your downloaded file will be saved as. I’ll just save it as gdp_data and since it’s a zip file, I’ll add the extension .zip.

01:11 And finally, you can just download the file using the urllib.request.urlretrieve() function.

01:19 And now you pass the arguments. The first argument is the URL from which you want to download the file, and the second is the filename to save it as.

01:29 Now when we run this, the file will be downloaded.

01:33 Now you can exit from the REPL and check the file list in your current directory and you’ll see that gdp_data.zip is downloaded.

01:45 Let’s break down what you just did. First, you import urllib.request, then you define the URL string pointing to the file you want to download and the filename string, which is what you want to call the file on your local computer.

02:00 In this string, you can also have the full path of the downloaded file, which is what you did in the coding session. The download happens in the third step.

02:09 You call the urlretrieve() function and pass in the source and the destination. Python goes out to the web, fetches the file, and saves it to your current directory or to the path you provided.

02:20 In this code snippet, you also print a confirmation just so you know the script has finished successfully. You now know how to save the downloaded file in the REPL’s current directory.
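The three steps described above can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the video: the download() wrapper function and the wording of the confirmation message are assumptions, while the URL and the gdp_data.zip filename come from the lesson.

```python
import urllib.request

# URL from the lesson description; filename as chosen in the coding session.
URL = "https://api.worldbank.org/v2/en/indicator/NY.GDP.MKTP.CD?downloadformat=csv"
FILENAME = "gdp_data.zip"


def download(url=URL, filename=FILENAME):
    """Fetch url and save it as filename (hypothetical helper for illustration)."""
    # urlretrieve() returns a (local_path, headers) tuple.
    # A bare filename is saved relative to the current working directory.
    saved_path, _headers = urllib.request.urlretrieve(url, filename)
    print(f"Download complete: {saved_path}")
    return saved_path
```

Calling download() with no arguments needs network access and writes gdp_data.zip to the directory the REPL or script was started from.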

02:33 Here, you use pathlib to save the file in more practical locations. First, you use Path.home() to get the user’s home directory and save the file in the Downloads folder, which is a common and expected location for downloaded files.

02:47 In the second example, you refer to the folder where your script is located and create a project-specific downloads folder. Using pathlib like this makes your downloads clean, predictable, and cross-platform.
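The two pathlib approaches could look something like the sketch below. The helper names (home_downloads_path, project_downloads_path, download_to) are hypothetical and not taken from the video, and the first helper assumes a Downloads folder already exists in the home directory.

```python
import urllib.request
from pathlib import Path


def home_downloads_path(filename):
    """Build a path inside the user's Downloads folder (assumed to exist)."""
    return Path.home() / "Downloads" / filename


def project_downloads_path(script_file, filename):
    """Build a path in a downloads/ folder next to the given script file."""
    folder = Path(script_file).resolve().parent / "downloads"
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return folder / filename


def download_to(url, destination):
    """Download url to destination (a Path) and return the destination."""
    urllib.request.urlretrieve(url, destination)
    return destination
```

Inside a script, you would typically pass \_\_file\_\_ as script_file, for example download_to(url, project_downloads_path(\_\_file\_\_, "gdp_data.zip")). Because the paths are built with the / operator, the same code works on Windows, macOS, and Linux.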

03:00 Do go through the pathlib tutorial linked in the additional resources section to learn more about pathlib. Sometimes, after downloading the file, you might need to know how large the file is or what type of file it is before you process it.

03:14 This is where inspecting response headers comes in. Let’s see how you can inspect the file that you just downloaded.
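As a rough preview, urlretrieve() returns the response headers alongside the saved path, so the file type and size can be read from them. The download_with_info() helper below is a hypothetical name used only for illustration, and it assumes the server may leave out the Content-Length header.

```python
import urllib.request


def download_with_info(url, filename):
    """Download url and return (path, content_type, size_in_bytes)."""
    # The second item of the returned tuple is a headers object
    # that supports case-insensitive lookups with .get().
    path, headers = urllib.request.urlretrieve(url, filename)
    content_type = headers.get("Content-Type")
    length = headers.get("Content-Length")
    # Not every server reports a length, so fall back to None.
    size = int(length) if length is not None else None
    return path, content_type, size
```

The reported size is whatever the server claims, so a value of None only means the header was missing, not that the download failed.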
