Database Seeding: The Complete Developer's Guide

What is Database Seeding?

Database seeding is the process of populating a database with initial data. This data can serve multiple purposes: development testing, automated testing, demo environments, or providing default application data. Proper seeding strategies ensure consistent, reproducible database states across different environments.

Unlike production data, seed data is typically generated or curated specifically for testing and development purposes, making it safe to share, version control, and regenerate as needed.

Why Database Seeding Matters

1. Consistent Development Environments

Every developer on your team can work with the same baseline data, reducing “works on my machine” issues and ensuring consistent behavior across development environments.

2. Automated Testing

Integration and end-to-end tests need predictable database states. Seeding ensures tests start with known data, making them reliable and reproducible.

3. Demo and Staging Environments

Populate demo environments with realistic data that showcases your application's features without exposing real user information.

4. Onboarding New Developers

New team members can quickly get a working local environment with data that demonstrates all application features.

Seeding MongoDB

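MongoDB makes seeding straightforward: its flexible document structure accepts JSON seed data directly, and the mongoimport command-line tool (part of the MongoDB Database Tools) loads JSON files without any code.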

Method 1: Using mongoimport

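# Generate JSON data with our tool, save as users.json

# Import a file that contains one JSON array of documents
# (add --drop to replace the collection instead of appending to it)
mongoimport --db myapp --collection users --file users.json --jsonArray

# Or, if the file holds one JSON document per line, omit --jsonArray
mongoimport --db myapp --collection users --file users.json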
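
The two commands expect different file layouts: --jsonArray reads a single top-level array, while the default mode reads one document per line. If your data is only available as an array, a small conversion along these lines may help. This is a sketch; toNdjson is a hypothetical helper name, not part of any MongoDB tool.

```javascript
// Convert an array of documents into newline-delimited JSON,
// the layout mongoimport reads when --jsonArray is omitted.
// toNdjson is a hypothetical helper name used for illustration.
function toNdjson(records) {
  if (!Array.isArray(records)) {
    throw new TypeError('Expected an array of documents');
  }
  // One compact JSON document per line
  const lines = records.map((record) => JSON.stringify(record));
  return lines.length > 0 ? lines.join('\n') + '\n' : '';
}

const sample = [
  { name: 'John Doe', email: 'john@example.com' },
  { name: 'Jane Smith', email: 'jane@example.com' },
];
console.log(toNdjson(sample));
```

To produce a file, you could pass the result to fs.writeFileSync with a file name of your choice and then import that file without --jsonArray.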

Method 2: Using Node.js Script

// seed.js
const { MongoClient } = require('mongodb');
const mockData = require('./mockData/users.json');

async function seedDatabase() {
  const client = new MongoClient('mongodb://localhost:27017');

  try {
    await client.connect();
    const db = client.db('myapp');

    // Clear existing data
    await db.collection('users').deleteMany({});

    // Insert seed data
    const result = await db.collection('users').insertMany(mockData);
    console.log(`${result.insertedCount} users inserted`);

  } finally {
    await client.close();
  }
}

seedDatabase().catch((err) => {
  console.error(err);
  process.exitCode = 1; // report failure to the shell or CI runner
});

Method 3: Using Mongoose

// seed.js
const mongoose = require('mongoose');
const User = require('./models/User');
const mockData = require('./mockData/users.json');

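async function seedDatabase() {
  await mongoose.connect('mongodb://localhost:27017/myapp');

  try {
    // Clear existing data
    await User.deleteMany({});

    // Insert seed data
    await User.insertMany(mockData);

    console.log('Database seeded successfully');
  } finally {
    // Always close the connection, even if seeding fails
    await mongoose.disconnect();
  }
}

seedDatabase().catch((err) => {
  console.error(err);
  process.exitCode = 1; // report failure to the shell or CI runner
});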

Seeding PostgreSQL & MySQL

Method 1: SQL INSERT Statements

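-- seed.sql
-- PostgreSQL: CASCADE also empties tables with foreign keys referencing users.
TRUNCATE TABLE users CASCADE;
-- MySQL does not support TRUNCATE ... CASCADE. Replace the line above with:
--   SET FOREIGN_KEY_CHECKS = 0;
--   TRUNCATE TABLE users;
--   SET FOREIGN_KEY_CHECKS = 1;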

INSERT INTO users (id, name, email, created_at) VALUES
  ('550e8400-e29b-41d4-a716-446655440000', 'John Doe', 'john@example.com', NOW()),
  ('6ba7b810-9dad-11d1-80b4-00c04fd430c8', 'Jane Smith', 'jane@example.com', NOW()),
  ('6ba7b811-9dad-11d1-80b4-00c04fd430c8', 'Bob Johnson', 'bob@example.com', NOW());

-- Run with:
-- psql -U username -d database -f seed.sql
-- mysql -u username -p database < seed.sql

Method 2: Using Node.js with Knex

// seeds/001_users.js
const mockUsers = require('../mockData/users.json');

exports.seed = async function(knex) {
  // Clear existing data
  await knex('users').del();

  // Insert seed data
  await knex('users').insert(mockUsers);
};

// Run with: npx knex seed:run

Method 3: Using Prisma

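// prisma/seed.ts
import { PrismaClient } from '@prisma/client';
// JSON imports need "resolveJsonModule" (and "esModuleInterop" for the
// default import) enabled in tsconfig.json
import mockUsers from './mockData/users.json';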

const prisma = new PrismaClient();

async function main() {
  // Clear existing data
  await prisma.user.deleteMany({});

  // Insert seed data
  for (const user of mockUsers) {
    await prisma.user.create({
      data: user
    });
  }

  console.log('Database seeded');
}

main()
  .catch((err) => {
    console.error(err);
    process.exitCode = 1; // report failure to the shell or CI runner
  })
  .finally(() => prisma.$disconnect());

// package.json
// "prisma": {
//   "seed": "ts-node prisma/seed.ts"
// }

Generating Seed Data

Pro Tip: Use Our Mock Data Generator

Instead of manually creating seed data, use our free tool to generate realistic datasets in seconds:

1. Define your schema with the data types you need.
2. Generate hundreds or thousands of records.
3. Download as JSON.
4. Use directly with your database seeding scripts.

Example schema for a complete user dataset:

{
  "id": "uuid",
  "firstName": "firstName",
  "lastName": "lastName",
  "email": "email",
  "phone": "phone",
  "avatar": "avatar",
  "address": "address",
  "city": "city",
  "country": "country",
  "zipCode": "zipCode",
  "company": "company",
  "jobTitle": "jobTitle",
  "status": "status",
  "createdAt": "past",
  "lastLogin": "recent"
}
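
If you need generated data to be identical on every run (for example, in tests that assert on specific rows), one option is a small generator driven by a seeded pseudo-random number generator. The following is a minimal sketch and does not describe how the tool above works; createRng, generateUsers, and the sample name lists are assumptions made for illustration.

```javascript
// Minimal sketch of deterministic seed-data generation.
// createRng, generateUsers, and the name lists are illustrative
// assumptions, not part of any library.

// Linear congruential generator: the same seed always yields the same sequence.
function createRng(seed) {
  let state = seed >>> 0; // coerce to an unsigned 32-bit integer
  return function next() {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // float in [0, 1)
  };
}

const FIRST_NAMES = ['John', 'Jane', 'Bob', 'Alice'];
const LAST_NAMES = ['Doe', 'Smith', 'Johnson', 'Lee'];

function pick(rng, items) {
  return items[Math.floor(rng() * items.length)];
}

function generateUsers(count, seed = 1) {
  const rng = createRng(seed);
  const users = [];
  for (let i = 0; i < count; i++) {
    const firstName = pick(rng, FIRST_NAMES);
    const lastName = pick(rng, LAST_NAMES);
    users.push({
      id: i + 1,
      firstName,
      lastName,
      // The index suffix keeps emails unique even when names repeat
      email: `${firstName}.${lastName}.${i + 1}@example.com`.toLowerCase(),
    });
  }
  return users;
}

console.log(generateUsers(3, 42));
```

Calling generateUsers with the same count and seed returns the same records each time, so the output can be regenerated instead of committed.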

Best Practices

1. Make Seeding Idempotent

Running your seed script multiple times should produce the same result. Clear existing data before seeding, or use upsert operations to avoid duplicates.
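
As one way to apply the upsert approach with MongoDB, the sketch below builds operations in the shape accepted by the Node.js driver's bulkWrite method. The helper name buildUpsertOps and the choice of email as the unique key are assumptions for this example.

```javascript
// Build bulkWrite operations that insert a record if it is missing and
// update it otherwise, so re-running the seed does not create duplicates.
// buildUpsertOps and the default key field are assumptions for this sketch.
function buildUpsertOps(records, keyField = 'email') {
  return records.map((record) => {
    if (record[keyField] === undefined) {
      throw new Error(`Record is missing key field "${keyField}"`);
    }
    return {
      updateOne: {
        filter: { [keyField]: record[keyField] },
        update: { $set: record },
        upsert: true,
      },
    };
  });
}

const ops = buildUpsertOps([
  { email: 'john@example.com', name: 'John Doe' },
  { email: 'jane@example.com', name: 'Jane Smith' },
]);
console.log(JSON.stringify(ops[0], null, 2));
```

Inside a seed script, the result could be passed to db.collection('users').bulkWrite(ops). In SQL databases, the comparable statements are INSERT ... ON CONFLICT ... DO UPDATE in PostgreSQL and INSERT ... ON DUPLICATE KEY UPDATE in MySQL.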

2. Handle Foreign Key Relationships

Seed related tables in the correct order. Parent records must exist before child records that reference them. Use consistent IDs across related seed files.
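
A rough way to work out that order is to sort tables by their dependencies. The sketch below assumes a hand-written map from each table to the tables it references; seedOrder and the table names are hypothetical.

```javascript
// Order tables so that every table comes after the tables it references.
// The dependency map is hand-written; seedOrder and the table names are
// hypothetical examples.
function seedOrder(dependencies) {
  const ordered = [];
  const state = new Map(); // table -> 'visiting' | 'done'

  function visit(table) {
    if (state.get(table) === 'done') return;
    if (state.get(table) === 'visiting') {
      throw new Error(`Circular foreign key dependency involving "${table}"`);
    }
    state.set(table, 'visiting');
    for (const parent of dependencies[table] || []) {
      visit(parent); // parents are placed first
    }
    state.set(table, 'done');
    ordered.push(table);
  }

  Object.keys(dependencies).forEach((table) => visit(table));
  return ordered;
}

const order = seedOrder({
  comments: ['posts', 'users'],
  posts: ['users'],
  users: [],
});
console.log(order); // prints [ 'users', 'posts', 'comments' ]
```

Insert in the returned order, and clear tables in the reverse order so child rows are removed before the parents they reference.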

3. Use Environment Variables

Never hardcode database credentials. Use environment variables or configuration files that are not committed to version control.
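
For instance, a seed script might read its connection string like this. DATABASE_URL is an assumed variable name and getDatabaseUrl a hypothetical helper; substitute whatever your project already uses.

```javascript
// Read the connection string from the environment instead of hardcoding it.
// DATABASE_URL is an assumed variable name; getDatabaseUrl is a
// hypothetical helper.
function getDatabaseUrl(env = process.env) {
  const url = env.DATABASE_URL;
  if (!url) {
    // Fail loudly rather than falling back to a guessed default
    throw new Error('DATABASE_URL is not set');
  }
  return url;
}

// Example with an explicit object standing in for process.env
console.log(getDatabaseUrl({ DATABASE_URL: 'mongodb://localhost:27017/myapp' }));
```

In the earlier MongoDB script, the hardcoded string could then become new MongoClient(getDatabaseUrl()).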

4. Separate Seeds by Purpose

Create different seed files for different scenarios: minimal data for testing, comprehensive data for development, demo data for presentations.
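
One simple way to organize this is a lookup from purpose to seed files. The profile names and file paths below are hypothetical examples, as is the seedFilesFor helper.

```javascript
// Map each seeding purpose to the seed files it should load.
// Profile names, file paths, and seedFilesFor are hypothetical examples.
const SEED_PROFILES = {
  test: ['seeds/minimal/users.json'],
  development: ['seeds/full/users.json', 'seeds/full/orders.json'],
  demo: ['seeds/demo/users.json', 'seeds/demo/orders.json'],
};

function seedFilesFor(profile) {
  // Own-property check so inherited keys like "toString" are rejected
  if (!Object.prototype.hasOwnProperty.call(SEED_PROFILES, profile)) {
    const known = Object.keys(SEED_PROFILES).join(', ');
    throw new Error(`Unknown seed profile "${profile}". Expected one of: ${known}`);
  }
  return SEED_PROFILES[profile];
}

console.log(seedFilesFor('test'));
```

The profile could come from a command-line argument or an environment variable, so the same seed script serves tests, local development, and demos.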

5. Never Seed Production

Add safeguards to prevent accidentally running seed scripts against production databases. Check environment variables and require explicit confirmation.
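
A safeguard of this kind might look like the sketch below, called at the top of a seed script before any connection is opened. NODE_ENV is a common convention; DATABASE_URL and SEED_CONFIRM are assumed variable names, and assertSafeToSeed is a hypothetical helper.

```javascript
// Guard to call before a seed script touches the database.
// NODE_ENV is a common convention; DATABASE_URL and SEED_CONFIRM are
// assumed variable names, and assertSafeToSeed is a hypothetical helper.
const LOCAL_HOSTS = ['localhost', '127.0.0.1'];

function assertSafeToSeed(env = process.env) {
  if (env.NODE_ENV === 'production') {
    throw new Error('Refusing to seed: NODE_ENV is "production"');
  }
  if (!env.DATABASE_URL) {
    throw new Error('Refusing to seed: DATABASE_URL is not set');
  }
  const { hostname } = new URL(env.DATABASE_URL);
  // Remote hosts need an explicit opt-in, since they may be shared or live
  if (!LOCAL_HOSTS.includes(hostname) && env.SEED_CONFIRM !== 'yes') {
    throw new Error(`Refusing to seed "${hostname}" without SEED_CONFIRM=yes`);
  }
}

// Passes silently for a local development database
assertSafeToSeed({
  NODE_ENV: 'development',
  DATABASE_URL: 'postgres://postgres:postgres@localhost:5432/myapp',
});
console.log('Safe to seed');
```

The guard throws rather than prompting, so it also works in non-interactive environments such as CI.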

Automating Seeding in CI/CD

Integrate database seeding into your CI/CD pipeline for automated testing:

# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

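    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
        # Publish the port so steps running on the runner can reach the service
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    # DATABASE_URL is an assumed variable name; match whatever your
    # migrate and seed scripts read
    env:
      DATABASE_URL: postgres://postgres:postgres@localhost:5432/postgres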

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3

      - name: Install dependencies
        run: npm install

      - name: Run migrations
        run: npm run migrate

      - name: Seed database
        run: npm run seed

      - name: Run tests
        run: npm test

Conclusion

Proper database seeding is essential for modern software development. It ensures consistent environments, enables automated testing, and accelerates team productivity.

Whether you're using MongoDB, PostgreSQL, MySQL, or any other database, the principles remain the same: generate realistic data, maintain relationships, make it reproducible, and integrate it into your development workflow.