Thursday, August 11, 2022

Using Kaggle in Machine Learning Projects


You’ve probably heard of Kaggle data science competitions, but did you know that Kaggle has many other features that can help you with your next machine learning project? For people looking for datasets for their next machine learning project, Kaggle allows you to access public datasets by others and share your own datasets. For those looking to build and train their own machine learning models, Kaggle also offers an in-browser notebook environment and some free GPU hours. You can look at other people’s public notebooks as well!

Apart from the website, Kaggle also has a command-line interface (CLI) that you can use to access and download datasets.

Let’s dive right in and explore what Kaggle has to offer!

After completing this tutorial, you will know:

  • What Kaggle is
  • How you can use Kaggle as part of your machine learning pipeline
  • How to use the Kaggle API’s command-line interface (CLI)

Let’s get started!

Using Kaggle in Machine Learning Projects
Photo by Stefan Widua. Some rights reserved.


This tutorial is split into five parts; they are:

  • What is Kaggle?
  • Setting up Kaggle Notebooks
  • Using Kaggle Notebooks with GPUs/TPUs
  • Using Kaggle Datasets with Kaggle Notebooks
  • Using Kaggle Datasets with Kaggle CLI tool

What Is Kaggle?

Kaggle is probably best known for the data science competitions that it hosts, with some of them offering five-figure prize pools and seeing hundreds of teams participating. Besides these competitions, Kaggle also allows users to publish and search for datasets, which they can use for their machine learning projects. To use these datasets, you can use Kaggle notebooks within your browser or Kaggle’s public API to download the datasets, which you can then use for your machine learning projects.

Kaggle Competitions

In addition to that, Kaggle also offers some courses and a discussions page for you to learn more about machine learning and talk with other machine learning practitioners!

For the rest of this article, we’ll focus on how we can use Kaggle’s datasets and notebooks to help us when working on our own machine learning projects or finding new projects to work on.

Setting Up Kaggle Notebooks

To get started with Kaggle Notebooks, you’ll need to create a Kaggle account, either using an existing Google account or creating one using your email.

Then, go to the “Code” page.

Left Sidebar of Kaggle Home Page, Code Tab

You’ll then be able to see your own notebooks as well as public notebooks by others. To create your own notebook, click on New Notebook.

Kaggle Code Page

This will create your new notebook, which looks like a Jupyter notebook, with many similar commands and shortcuts.

Kaggle Notebook

You can also toggle between a notebook editor and a script editor by going to File -> Editor Type.

Changing Editor Type in Kaggle Notebook

Changing the editor type to script shows this instead:

Kaggle Notebook Script Editor Type

Using Kaggle with GPUs/TPUs

Who doesn’t love free GPU time for machine learning projects? GPUs can help to massively speed up the training and inference of machine learning models, especially deep learning models.

Kaggle comes with a free allocation of GPUs and TPUs, which you can use for your projects. At the time of this writing, the availability is 30 hours per week for GPUs and 20 hours per week for TPUs after verifying your account with a phone number.

To attach an accelerator to your notebook, go to Settings ▷ Environment ▷ Preferences.

Changing Kaggle Notebook Environment Preferences

You’ll be asked to verify your account with a phone number.

Verify phone number

You’ll then be presented with this page, which lists the amount of availability you have left and mentions that turning on GPUs will reduce the number of CPUs available, so it’s probably only a good idea when doing training/inference with neural networks.

Adding GPU Accelerator to Kaggle Notebook
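Once an accelerator is attached, it can be useful to confirm from inside the notebook that a GPU is actually visible. The following is a minimal sketch using only the standard library. It assumes the GPU session exposes NVIDIA’s nvidia-smi tool on the PATH, which is an assumption about the environment rather than something stated above, and it does not detect TPUs. The function name gpu_names is made up for illustration.

```python
import shutil
import subprocess


def gpu_names() -> list[str]:
    """Return GPU names reported by nvidia-smi, or an empty list if none are visible."""
    # Assumption: an attached NVIDIA GPU comes with the nvidia-smi tool on PATH.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = gpu_names()
print(names if names else "No NVIDIA GPU detected")
```

If the list comes back empty in a session where you expected a GPU, the accelerator setting is the first thing to re-check.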

Using Kaggle Datasets with Kaggle Notebooks

Machine learning projects are data-hungry monsters, and finding datasets for our current projects or looking for datasets to start new projects is always a chore. Luckily, Kaggle has a rich collection of datasets contributed by users and from competitions. These datasets can be a treasure trove for people looking for data for their current machine learning project or people looking for new ideas for projects.

Let’s explore how we can add these datasets to our Kaggle notebook.

First, click on Add data on the right sidebar.

Adding Datasets to Kaggle Notebook Environment

A window should appear that shows you some of the publicly available datasets and gives you the option to upload your own dataset for use with your Kaggle notebook.

Searching Through Kaggle Datasets

I’ll be using the classic titanic dataset as my example for this tutorial, which you can find by keying your search terms into the search bar on the top right of the window.

Kaggle Datasets Filtered with “Titanic” Keyword

After that, the dataset is available to be used by the notebook. To access the files, take a look at the path for the file and prepend ../input/ to it, giving ../input/{path}. For example, the file path for the titanic dataset is:

In the notebook, we can read the data using:

This gets us the data from the file:

Using Titanic Dataset in Kaggle Notebook
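As a minimal sketch of the path rule described above, the snippet below builds a ../input/{path} location and previews the first rows of a CSV file. The file name titanic/train.csv and the helper names input_path and preview_csv are hypothetical, chosen only for illustration; substitute the path shown for your own dataset. It uses the standard library csv module so that it runs anywhere, but the same path could be passed to pandas.read_csv if you prefer a DataFrame.

```python
import csv
from pathlib import Path

# Attached datasets are reached by prepending ../input/ to the file's path.
INPUT_ROOT = Path("../input")


def input_path(relative_path: str, root: Path = INPUT_ROOT) -> Path:
    """Prepend the input root to a dataset-relative file path."""
    return root / relative_path


def preview_csv(path: Path, n_rows: int = 5) -> list[dict]:
    """Return the first n_rows of a CSV file as dicts keyed by column name."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        return [row for _, row in zip(range(n_rows), reader)]


# Hypothetical file name, for illustration only.
csv_file = input_path("titanic/train.csv")
if csv_file.exists():
    for row in preview_csv(csv_file):
        print(row)
else:
    print(f"{csv_file} not found; attach the dataset or adjust the path")
```

Keeping the root in one constant makes it easy to point the same code at a local folder when you run it outside Kaggle.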

Using Kaggle Datasets with Kaggle CLI Tool

Kaggle also has a public API with a CLI tool which we can use to download datasets, interact with competitions, and much more. We’ll be looking at how to set up the CLI tool and use it to download Kaggle datasets.

To get started, install the CLI tool using:

For Mac/Linux users, you might need:
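One way to express the install step from Python is sketched below. It assumes the CLI is distributed on PyPI under the package name kaggle and that a per-user install (pip’s --user flag) is the variant some Mac/Linux setups need; both are assumptions on my part. The helper name install_command is hypothetical, and by default it only builds and prints the command rather than running it.

```python
import subprocess
import sys


def install_command(package: str = "kaggle", user: bool = False) -> list[str]:
    """Build a pip install command for the interpreter running this code."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        cmd.append("--user")  # install into the per-user site-packages
    cmd.append(package)
    return cmd


def install(package: str = "kaggle", user: bool = False, dry_run: bool = True) -> list[str]:
    """Print the pip command and, unless dry_run is set, execute it."""
    cmd = install_command(package, user)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: shows the command without installing anything.
install()
install(user=True)
```

Calling install(dry_run=False) performs the installation. Going through sys.executable ensures the package lands in the same environment as the running interpreter.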

Then, you’ll need to create an API token for authentication. Go to Kaggle’s webpage, click on your profile icon in the top right corner, and go to Account.

Going to Kaggle Account Settings

From there, scroll down to Create New API Token:

Generating New API Token for Kaggle Public API

This will download a kaggle.json file that you’ll use to authenticate yourself with the Kaggle CLI tool. You’ll need to place it in the correct location for it to work. For Linux/Mac/Unix-based operating systems, this should be placed at ~/.kaggle/kaggle.json, and for Windows users, it should be placed at C:\Users\<Windows-username>\.kaggle\kaggle.json. Placing it in the wrong location and calling kaggle in the command line will give an error.
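To check the token location before calling the CLI, a small script along these lines can help. It is only a sketch: the function names expected_token_path and check_token are hypothetical, and the permission check on POSIX systems is an extra precaution that goes beyond what is described above.

```python
import os
import stat
from pathlib import Path


def expected_token_path() -> Path:
    """Return <home>/.kaggle/kaggle.json for the current user."""
    return Path.home() / ".kaggle" / "kaggle.json"


def check_token(path: Path) -> str:
    """Describe whether the token exists and, on POSIX, whether others can read it."""
    if not path.is_file():
        return f"missing: place kaggle.json at {path}"
    if os.name == "posix":
        mode = stat.S_IMODE(path.stat().st_mode)
        # Any group/other permission bit means the key is not private.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            return f"found, but accessible to other users (mode {mode:o}); consider chmod 600"
    return "found"


print(check_token(expected_token_path()))
```

Path.home() resolves to the user’s home directory on each platform, so the same code covers both the Unix-style and the Windows location.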

Now, let’s get started on downloading those datasets!

To search for datasets using a search term, e.g., titanic, we can use:

Searching for titanic, we get:

To download the first dataset in that list, we can use:
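The search and download steps can be sketched from Python by building the CLI invocations as argument lists. The subcommands and flags used here (datasets list -s, datasets download -d, --unzip) are the ones I understand the Kaggle CLI to provide; confirm them with kaggle datasets --help. The identifier owner/dataset-name is a placeholder, not the actual first search result, and the helper names are hypothetical.

```python
import shutil
import subprocess


def search_command(term: str) -> list[str]:
    """Build the CLI call that lists datasets matching a search term."""
    return ["kaggle", "datasets", "list", "-s", term]


def download_command(dataset: str, unzip: bool = True) -> list[str]:
    """Build the CLI call that downloads a dataset given as 'owner/name'."""
    cmd = ["kaggle", "datasets", "download", "-d", dataset]
    if unzip:
        cmd.append("--unzip")  # extract the archive after downloading
    return cmd


def run(cmd: list[str]) -> str:
    """Run a command and return its standard output; requires the CLI and a valid token."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not on PATH; install the CLI first")
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Only print the commands here; call run(...) to execute them.
print(" ".join(search_command("titanic")))
print(" ".join(download_command("owner/dataset-name")))  # placeholder identifier
```

Passing either list to run executes it and returns the CLI’s output as text, which is convenient when you want to script dataset downloads.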

Using a Jupyter notebook to read the file, similar to the Kaggle notebook example, gives us:

Using Titanic Dataset in Jupyter Notebook

Of course, some datasets are so large that you may not want to keep them on your own disk. Nonetheless, this is one of the free resources provided by Kaggle for your machine learning projects!

Further Reading

This section provides more resources if you’re interested in going deeper into the topic.


In this tutorial, you learned what Kaggle is, how we can use Kaggle to get datasets, and even how to get some free GPU/TPU instances within Kaggle Notebooks. You’ve also seen how we can use the Kaggle API’s CLI tool to download datasets for us to use in our local environments.

Specifically, you learned:

  • What Kaggle is
  • How to use Kaggle notebooks along with their GPU/TPU accelerators
  • How to use Kaggle datasets in Kaggle notebooks or download them using Kaggle’s CLI tool


