Datasets are the foundation of how DataLinks organizes and understands your information. Each dataset represents a structured table of data that can later be cleaned, connected, and queried by AI. If you’re new to DataLinks, you may want to start with the explanatory article Datasets and Namespaces to learn how datasets fit within namespaces and the broader DataLinks structure.

Create a dataset using the API

This section shows how to create a new dataset in DataLinks by calling the REST API directly from Python. You’ll create a dataset named employee_records inside a namespace called hr_demo, using the DataLinks Create new dataset API endpoint.

Prerequisites

Before you begin, make sure you have:
  • A DataLinks account
  • A valid bearer token for the DataLinks API
  • A Python environment set up for running scripts
  • The ability to install Python packages
You can use any project structure or environment setup that fits your workflow.

Step-by-step instructions

1. Install dependencies

This guide uses direct HTTP calls with Python. Install the requests library if you do not already have it available:
pip install requests
or
pip3 install requests
2. Configure environment variables

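Set the following environment variables so that sensitive values are not hardcoded in your script. The commands below use Windows PowerShell syntax; in another shell, use its equivalent commands.
For the current session: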
$env:DATALINKS_TOKEN="YOUR_BEARER_TOKEN"
$env:DATALINKS_NAMESPACE="hr_demo"
$env:DATALINKS_DATASET="employee_records"
To persist across future sessions:
setx DATALINKS_TOKEN "YOUR_BEARER_TOKEN"
setx DATALINKS_NAMESPACE "hr_demo"
setx DATALINKS_DATASET "employee_records"
setx does not update the current session, so after using it, open a new PowerShell window before running your script.
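Because a missing variable only surfaces as a KeyError when the script runs, it can help to check the environment first. The following is a minimal sketch, not part of DataLinks; the helper name missing_variables is hypothetical, and the variable names are the three set in this step.

```python
import os

# The three variable names used in this guide.
REQUIRED_VARS = ("DATALINKS_TOKEN", "DATALINKS_NAMESPACE", "DATALINKS_DATASET")


def missing_variables(env=None):
    """Return the required variable names that are unset or blank in env.

    Hypothetical helper: env defaults to the real process environment,
    but any mapping can be passed in for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_variables()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

If a variable set with setx is reported as missing, the most likely cause is that the script is running in a window opened before setx was called.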
3. Create the dataset (Python)

Create a Python file, for example create_dataset.py, and add the following code:
import os
import json
import requests

BASE_URL = "https://api.datalinks.com/api/v1"

token = os.environ["DATALINKS_TOKEN"]
namespace = os.environ["DATALINKS_NAMESPACE"]
dataset_name = os.environ["DATALINKS_DATASET"]

url = f"{BASE_URL}/ingest/new/{namespace}/{dataset_name}"

headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}

payload = {
    # Required. Options: "Public" or "Private"
    "visibility": "Private",

    # Optional but recommended
    "inferDefinition": {
        "dataDescription": "Employee dataset containing demographics and compensation.",
        "fieldDefinition": (
            "id=unique employee identifier\n"
            "name=full employee name\n"
            "age=employee age in years\n"
            "department=work department\n"
            "salary=annual salary in USD\n"
        ),
    },
}

response = requests.post(
    url,
    headers=headers,
    data=json.dumps(payload),
    timeout=30,
)

print("Status:", response.status_code)

if response.headers.get("content-type", "").startswith("application/json"):
    print(json.dumps(response.json(), indent=2))
else:
    print(response.text)

response.raise_for_status()
print("\nDataset created successfully.")
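The fieldDefinition value in the script above is a single string with one name=description pair per line. When there are many fields, it may be easier to generate that string from a dictionary. The sketch below assumes the format is exactly what the example shows (this guide does not give a formal specification), and build_field_definition is a hypothetical helper name.

```python
def build_field_definition(fields):
    """Join a {field_name: description} mapping into 'name=description' lines.

    Assumption: one 'name=description' pair per line, each terminated by a
    newline, mirroring the example payload in this guide.
    """
    return "".join(f"{name}={description}\n" for name, description in fields.items())


fields = {
    "id": "unique employee identifier",
    "name": "full employee name",
    "age": "employee age in years",
    "department": "work department",
    "salary": "annual salary in USD",
}

field_definition = build_field_definition(fields)
print(field_definition)
```

The resulting string can be assigned to the fieldDefinition key inside inferDefinition in place of the hand-written literal.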
4. Run the script

Run the script using your normal Python workflow:
python create_dataset.py

How this works

This request:
  • Calls POST /ingest/new/{namespace}/{datasetName} to create a dataset
  • Authenticates using a bearer token in the Authorization header
  • Sets dataset visibility to Private
  • Supplies optional metadata to help DataLinks understand the dataset structure
The inferDefinition section is optional, though it is strongly recommended when creating datasets programmatically.
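To see what the request described above looks like before sending anything, the pieces can be assembled and printed as a dry run. This is a minimal sketch using only the standard library; describe_request is a hypothetical helper, and percent-encoding the path segments with quote is a defensive assumption rather than documented DataLinks behavior.

```python
import json
from urllib.parse import quote

BASE_URL = "https://api.datalinks.com/api/v1"  # same base URL as the script in this guide


def describe_request(namespace, dataset_name, payload):
    """Return the method, URL, and JSON body that would be sent. Nothing is sent."""
    # quote(..., safe="") percent-encodes characters that are unsafe in a path segment.
    path = f"/ingest/new/{quote(namespace, safe='')}/{quote(dataset_name, safe='')}"
    return {
        "method": "POST",
        "url": BASE_URL + path,
        "body": json.dumps(payload),
    }


preview = describe_request("hr_demo", "employee_records", {"visibility": "Private"})
print(preview["method"], preview["url"])
print(preview["body"])
```

The token is deliberately left out of the preview so that it does not end up in terminal history or logs.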

Common issues

Authentication errors
If the request fails with a 401 error, confirm that:
  • The token environment variable is set correctly
  • The Authorization header uses the Bearer format
Dataset naming errors
If you see a 400 or 404 error, check that:
  • The namespace and dataset name appear in the request URL
  • The dataset name follows standard naming conventions, such as lowercase letters and underscores
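The checks above can be run locally before calling the API. The sketch below is illustrative only: the naming pattern (a lowercase letter followed by lowercase letters, digits, or underscores) is an assumption based on the examples in this guide, not a documented DataLinks rule, and both helper names are hypothetical.

```python
import re

# Assumption: a lowercase letter, then lowercase letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")

# Hints paraphrased from the troubleshooting notes in this guide.
STATUS_HINTS = {
    401: "Check that the token variable is set and the header uses the Bearer format.",
    400: "Check the namespace and dataset name in the request URL.",
    404: "Check the namespace and dataset name in the request URL.",
}


def is_conventional_name(name):
    """Return True if name matches the assumed naming convention."""
    return NAME_PATTERN.fullmatch(name) is not None


def hint_for_status(status_code):
    """Return a troubleshooting hint for a status code, or a generic fallback."""
    return STATUS_HINTS.get(status_code, "No specific hint; inspect the response body.")


for candidate in ("employee_records", "Employee Records"):
    print(candidate, "->", is_conventional_name(candidate))
print(hint_for_status(401))
```

A name that fails this local check may still be accepted by the API, and a name that passes may still be rejected; the response body remains the authoritative source for the error.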

Create a dataset in the web platform

If you prefer a guided setup, you can create a dataset directly in the DataLinks web platform by following these steps:
  1. Log in to your DataLinks account.
  2. In the left navigation menu, click Create a New Dataset.
  3. Enter a name for your dataset. Choose something descriptive, such as customer_transactions_2025. Use underscores, not spaces, in names.
  4. Select or enter a namespace to assign the dataset to. The namespace helps organize related datasets and controls access and context. For example, accounting_2025.
  5. Use the toggle to set visibility to Private or Public.
  6. Click Create dataset and upload data to finish.
Your dataset will appear in your list of available datasets and will be ready for ingestion and cleaning.

Next steps

Once your dataset exists, you can ingest data into it using the data ingestion endpoints. From there, you can explore schema inference, enrichment, and querying workflows in DataLinks.