Datasets
Datasets allow users to upload and work with large files. This article describes how to create, interact with, and manage datasets.
Datasets allow users to upload and store large files (over 100MB, up to a max of 45GB) without directly loading them into the kernel. This allows users to selectively load large files into the kernel as needed to manage memory usage.
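As a rough illustration of the size range described above, the following sketch checks whether a file size falls between the two limits. The function name and constants are hypothetical and not part of Noteable; the use of decimal units (1 MB = 10**6 bytes) is also an assumption, since the text does not say which convention applies.

```python
# Size limits quoted in the text. Decimal units are an assumption;
# the platform may count megabytes and gigabytes differently.
MIN_DATASET_FILE_BYTES = 100 * 10**6  # "over 100MB"
MAX_DATASET_FILE_BYTES = 45 * 10**9   # "up to a max of 45GB"


def is_dataset_sized(size_bytes: int) -> bool:
    """Return True if size_bytes is over the lower limit and within the upper limit."""
    return MIN_DATASET_FILE_BYTES < size_bytes <= MAX_DATASET_FILE_BYTES


# Example use with a local file (hypothetical path):
# import os
# is_dataset_sized(os.path.getsize("big_file.csv"))
```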
To create a dataset:
- 1. Go to a notebook.
- 2. Open the Data Sources panel using the icon in the left sidebar.
- 3. Click the icon at the top.
- 4. Select Dataset.
- 5. Provide a Name.
- 6. Click Create.

To upload files to a dataset:
- 1. Hover over the dataset to access the menu.
- 2. Click Add files...
- 3. Drag and drop the file into the upload window, or click the Upload from computer link to search for and select the file.
To write files to a dataset programmatically:
- 1. When you create a dataset in the UI, Noteable automatically creates a directory for that dataset at ../datasets/<dataset_name>/
- 2. You can then programmatically write files larger than 100MB to that directory.
- 3. To sync the files in that directory back to the dataset so that they are persisted, see Writing changes to dataset.
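The programmatic route can be sketched as follows. This is a minimal example, not Noteable's own tooling: `write_in_chunks` is a hypothetical helper, the placeholder bytes stand in for real data, and the dataset name in the usage comment is an assumption. Writing one chunk at a time keeps memory usage flat even when the target file is large.

```python
from pathlib import Path


def write_in_chunks(target: Path, total_bytes: int, chunk_size: int = 1024 * 1024) -> int:
    """Write total_bytes of placeholder data to target, one chunk at a time.

    Returns the number of bytes written.
    """
    # The dataset directory should already exist; creating it is a harmless safeguard.
    target.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with target.open("wb") as fh:
        while written < total_bytes:
            size = min(chunk_size, total_bytes - written)
            fh.write(b"\0" * size)  # replace with real data in practice
            written += size
    return written


# Example use inside a notebook (hypothetical dataset and file names):
# write_in_chunks(Path("../datasets") / "my_dataset" / "big_file.bin", 150 * 10**6)
```

Files written this way exist only in the kernel's file system until they are synced back to the dataset.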
A dataset and its contents are accessible to everybody in the space.
To access files that are in a dataset from your notebook code, you need to read them into the kernel environment.
UI
- 1. Hover over the dataset to access the menu.
- 2. Select Read dataset into environment (a cell will be created with code).
- 3. Run the code cell.
Python
- 1. Create a new Python cell.
- 2. Run the following code, replacing <name> with your dataset's name:
# will pull in all files within a given dataset
%ntbl pull datasets <name>
UI
- 1. Expand the dataset contents using the chevron icon.
- 2. Hover over the file to access the menu.
- 3. Select Read file into environment (a cell will be created with code).
- 4. Run the code cell.
Python
- 1. Create a new Python cell.
- 2. Run the following code, replacing <name> with your dataset's name and <file_path> with the file name or path to the file:
# will pull a specific file from a dataset
%ntbl pull datasets <name>/<file_path>
Once the files have been loaded into the kernel, they're accessible from the ../datasets directory. The easiest way to get the full path to a file is to copy it from the dataset file menu:
- 1. Hover over the dataset to access the menu.
- 2. Select Copy Path.
- 3. Paste the path as needed in code.
import pandas as pd

# Reads a file called file_name.csv into a pandas dataframe
df = pd.read_csv('../datasets/dataset_name/file_name.csv')
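To confirm which files are available in the kernel before reading one, the dataset directory can be listed with the standard library. The sketch below is illustrative only; `list_dataset_files` is a hypothetical helper and the dataset name in the usage comment is an assumption.

```python
from pathlib import Path


def list_dataset_files(dataset_dir: Path) -> list[str]:
    """Return the sorted relative paths of all files under dataset_dir."""
    return sorted(
        p.relative_to(dataset_dir).as_posix()
        for p in dataset_dir.rglob("*")
        if p.is_file()
    )


# Example use inside a notebook (hypothetical dataset name):
# list_dataset_files(Path("../datasets") / "dataset_name")
```

If a file that exists in the dataset is missing from the result, it has probably not been read into the environment yet.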

Once loaded, a dataset's files are local to your notebook's kernel. To persist any changes you make to them, you need to push the changes back to the dataset.
UI
- 1. Hover over the dataset to access the menu.
- 2. Select Write changes to dataset (a cell will be created with code).
- 3. Run the code cell.
Python
- 1. Create a new Python cell.
- 2. Run the following code, replacing <name> with your dataset's name:
# will push all changes back to a dataset
%ntbl push datasets <name>
If your changes are in a pandas dataframe, write them out to a file in the dataset directory before pushing:
df.to_csv("../datasets/<file_path>")
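If the data is not in a pandas dataframe, the same kind of file can be produced with Python's built-in csv module. This is a sketch under stated assumptions: `write_rows_to_dataset` is a hypothetical helper, and the dataset directory and file name in the usage comment are placeholders.

```python
import csv
from pathlib import Path


def write_rows_to_dataset(dataset_dir: Path, file_name: str,
                          header: list[str], rows: list[list]) -> Path:
    """Write a header and rows as CSV to dataset_dir/file_name and return the path."""
    target = dataset_dir / file_name
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return target


# Example use inside a notebook (hypothetical dataset and file names):
# write_rows_to_dataset(Path("../datasets") / "dataset_name", "results.csv",
#                       ["id", "value"], [[1, "a"], [2, "b"]])
```

As with the pandas version, the file only reaches the dataset once the changes are pushed.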
To delete a dataset or dataset file:
- 1. Go to a notebook in the space.
- 2. Open the project panel using the icon in the left sidebar.
- 3. Hover over the dataset or dataset file to access the menu.
- 4. Select Delete and confirm your choice.