laravel

The Laravel migrations that break production — and the safe patterns we use instead

Rename column, drop column, change type, add NOT NULL — every one of these has a 'works on staging, breaks at midnight on production' failure mode. Here are the safe two-step patterns and the `--pretend` discipline that catches the rest.

May 27, 2026 · 11 min · by Sudhanshu K.

php artisan migrate is one of the most dangerous commands in the Laravel toolbox, and the danger is almost entirely invisible until the day it isn't. Most migrations are harmless. A handful — rename column, drop column, change type, add a NOT NULL constraint to an existing table — have failure modes that don't appear in development, don't appear in staging, and only manifest in production when the table is large, the application is live, and the deploy is in progress.

This is the migration playbook we use for managed Laravel customers running against MySQL and Postgres. It's the result of a decade of inheriting other teams' production databases and learning, the hard way, what breaks them.

The taxonomy of dangerous migrations

A safe migration has these properties:

  • It runs in seconds, regardless of table size
  • It takes no exclusive locks for more than a few milliseconds
  • It is forward-compatible with code that hasn't been deployed yet
  • It is backward-compatible with code that's already running on other instances
  • It can be rolled back without data loss

Migrations that violate these properties:

| Migration | Locking | Time on 100M-row table | Compatibility risk |
|---|---|---|---|
| ADD COLUMN nullable | brief metadata lock | seconds | safe |
| ADD COLUMN NOT NULL DEFAULT x | depends on engine | minutes-to-hours | safe if engine supports instant DDL |
| ADD INDEX (online) | low | minutes | safe |
| ADD INDEX (offline) | full table lock | minutes | unsafe |
| RENAME COLUMN | brief, instant on modern MySQL/PG | seconds | highly unsafe (old code) |
| DROP COLUMN | brief | seconds | highly unsafe (old code) |
| CHANGE TYPE (compatible) | varies | varies | varies |
| CHANGE TYPE (incompatible) | full table rewrite | hours | highly unsafe |
| ADD UNIQUE on existing column | full table scan | minutes-to-hours | unsafe if duplicates exist |
| ADD FOREIGN KEY | full table scan + lock | minutes | unsafe at scale |

The "compatibility risk" column is the one most teams ignore. A migration that takes 200ms can still take your application down if the application code at the moment of running the migration can't tolerate the schema change.

The two-deploy pattern

The pattern that solves almost every dangerous migration: split it into two deploys.

Deploy 1: Schema change that's compatible with both old code and new code. Old code keeps working. New code is written to tolerate both states.

Deploy 2: Application code change that depends on the new schema. By this point, the schema is settled and all running instances agree on it.

This sounds bureaucratic. It's the difference between a deploy that works and a deploy that pages you at 11pm. Let's walk through each dangerous case.

Case 1: Renaming a column

The naive migration:

public function up(): void
{
    Schema::table('orders', function (Blueprint $table) {
        $table->renameColumn('customer_id', 'buyer_id');
    });
}

Problem: the moment this runs, every running instance of your application that still has customer_id in its compiled query expects a column that no longer exists. SQL errors. Application down until the new code lands on every instance.

The safe pattern:

Deploy 1 (additive):

// 2026_05_20_100000_add_buyer_id_to_orders.php
public function up(): void
{
    Schema::table('orders', function (Blueprint $table) {
        $table->unsignedBigInteger('buyer_id')->nullable()->after('customer_id');
    });
 
    // Backfill in chunks (don't do this in the migration for huge tables —
    // run as a separate batch job)
    DB::table('orders')->whereNull('buyer_id')
        ->orderBy('id')
        ->chunkById(5000, function ($rows) {
            foreach ($rows as $row) {
                DB::table('orders')->where('id', $row->id)
                    ->update(['buyer_id' => $row->customer_id]);
            }
        });
}

Add the new column, backfill from the old one. Application code in this deploy starts writing to both columns:

$order->customer_id = $userId;
$order->buyer_id = $userId;
$order->save();
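
For huge tables, the backfill noted in the migration comment is better run as a separate job. The following is a minimal sketch of how such a job might slice the work, assuming an auto-increment integer id; `batchRanges` and `BACKFILL_SQL` are hypothetical names, not part of Laravel. It issues one set-based UPDATE per id range instead of one UPDATE per row.

```php
<?php
declare(strict_types=1);

/**
 * Split an inclusive id range into fixed-size batches.
 * Yields [from, to] pairs that cover minId..maxId with no gaps or overlap.
 */
function batchRanges(int $minId, int $maxId, int $batchSize): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('batchSize must be at least 1');
    }
    for ($from = $minId; $from <= $maxId; $from += $batchSize) {
        yield [$from, min($from + $batchSize - 1, $maxId)];
    }
}

// One UPDATE per batch. The IS NULL guard makes the job safe to re-run
// and leaves rows already written by the dual-write code untouched.
const BACKFILL_SQL =
    'UPDATE orders SET buyer_id = customer_id '
    . 'WHERE id BETWEEN ? AND ? AND buyer_id IS NULL';

foreach (batchRanges(1, 12000, 5000) as [$from, $to]) {
    // Inside a Laravel command this would be: DB::update(BACKFILL_SQL, [$from, $to]);
    echo "backfill ids {$from}..{$to}\n";
}
```

In a real job the bounds would come from MIN(id) and MAX(id) on the table, and a short pause between batches keeps replication lag and lock pressure down.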

Deploy 2 (cutover):

Application code reads from buyer_id instead of customer_id. Old column still exists; nothing has dropped yet.

Deploy 3 (cleanup, optional):

public function up(): void
{
    Schema::table('orders', function (Blueprint $table) {
        $table->dropColumn('customer_id');
    });
}

Only after you're confident nothing reads customer_id anywhere. We typically wait 1-2 weeks before this step to allow for rollback.

Yes, this is three deploys for what looked like a one-line change. It's the difference between a routine rename and a 3am call.

Case 2: Dropping a column

Same pattern in reverse:

Deploy 1: Application code stops reading and writing the column. The column still exists.

Wait: At least one full release cycle — typically a week. This is when you find the obscure cron job nobody mentioned that still reads the column.

Deploy 2: Drop the column.

The migration itself is fast and safe. The danger is the application code change that should precede it.

A trick we use during the wait period: rename the column to _deprecated_customer_id and see if anything explodes. If a week passes without errors, the column is genuinely unused and we can drop it.

Case 3: Changing a column type

The hardest case. Some type changes are cheap on modern MySQL/Postgres (e.g., widening a VARCHAR or converting VARCHAR to TEXT on Postgres, which only touches the catalog; widening a VARCHAR on MySQL 8 with ALGORITHM=INPLACE, provided the length prefix stays the same size). Most are not.

INT to BIGINT, VARCHAR(100) to TEXT on MySQL, TIMESTAMP to DATETIME: all involve rewriting the entire table.

For managed MySQL and Postgres customers, we use the same dual-column pattern:

Deploy 1: Add the new column with the new type. Application code writes to both. Backfill in batches.

Deploy 2: Application code reads from the new column.

Deploy 3: Drop the old column.

For very large tables (>100GB), we use a more elaborate variant with pt-online-schema-change (MySQL) or pg_repack (Postgres) to do the change without a long lock, then swap. Out of scope for this article — but it's the right tool when the dual-column approach is too storage-expensive.

Case 4: Adding a NOT NULL column with no default

$table->string('email')->nullable(false);

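On Postgres this fails, because every existing row would hold NULL in the new column. The error is loud and the migration aborts, so it's not silent, but it's not the kind of thing you want to discover during a production deploy. MySQL usually does the opposite: it accepts the change and fills existing rows with the implicit default (an empty string here), which is quieter and arguably worse.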

The safe pattern:

Deploy 1: Add the column as nullable with a sensible default.

Schema::table('users', function (Blueprint $table) {
    $table->string('email')->nullable()->default('unknown@example.com');
});
 
// Backfill any rows that need a real value

Application changes: Writing code now sets the column on every insert. Reading code tolerates the default.

Deploy 2: Make it NOT NULL once you've verified no NULL or default values remain.

DB::statement('ALTER TABLE users MODIFY email VARCHAR(255) NOT NULL');
// or for Postgres:
// DB::statement('ALTER TABLE users ALTER COLUMN email SET NOT NULL');

Postgres 12+ can skip the full table scan for SET NOT NULL if a validated CHECK (email IS NOT NULL) constraint already exists: add that constraint as NOT VALID first, then validate it separately. MySQL 8 requires a careful approach depending on the table size and engine.
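
The Postgres sequence can be sketched as a small helper that returns the statements in order. This is an illustration under assumptions: `postgresSetNotNull` and the constraint name are hypothetical, and identifiers are taken to be trusted developer input.

```php
<?php
declare(strict_types=1);

/**
 * Build the Postgres 12+ statements that make a column NOT NULL
 * without a long blocking scan. Identifiers are assumed to be trusted,
 * developer-supplied names (they are not escaped here).
 *
 * @return string[] statements to run in order
 */
function postgresSetNotNull(string $table, string $column): array
{
    $check = "{$table}_{$column}_not_null"; // hypothetical constraint name

    return [
        // 1. Register the rule without checking existing rows (brief lock only).
        "ALTER TABLE {$table} ADD CONSTRAINT {$check} CHECK ({$column} IS NOT NULL) NOT VALID",
        // 2. Scan existing rows while reads and writes continue.
        "ALTER TABLE {$table} VALIDATE CONSTRAINT {$check}",
        // 3. With the validated CHECK in place, SET NOT NULL skips its own scan.
        "ALTER TABLE {$table} ALTER COLUMN {$column} SET NOT NULL",
        // 4. The CHECK is now redundant.
        "ALTER TABLE {$table} DROP CONSTRAINT {$check}",
    ];
}

foreach (postgresSetNotNull('users', 'email') as $sql) {
    echo $sql, ";\n";
}
```

In a migration each statement would go through DB::statement(). It is worth running them outside a single wrapping transaction (Laravel's `$withinTransaction = false` on the migration class), so the brief lock taken in step 1 is released before the validation scan starts.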

Case 5: Adding a UNIQUE constraint

The migration looks innocent:

$table->unique('email');

It fails — loudly — if duplicates already exist in the column. Worse, on a large table, it can take minutes-to-hours during which the table is locked.

Safe pattern:

Step 1: Audit for duplicates before writing the migration.

SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1;

Step 2: Resolve duplicates with a data migration (or a manual cleanup with the customer involved).

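Step 3: Build the unique index online if the table is large. On Postgres, create it with CREATE UNIQUE INDEX CONCURRENTLY and then attach it as the constraint with ADD CONSTRAINT ... UNIQUE USING INDEX. On MySQL 8 with InnoDB, ADD UNIQUE INDEX can run with ALGORITHM=INPLACE, LOCK=NONE. A plain non-unique index does not help here, because a unique constraint cannot reuse it.

Step 4: Application code enforces uniqueness at the application layer in the meantime, to prevent new duplicates appearing between Step 2 and Step 3.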
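
As a rough sketch of building the unique index with minimal blocking, the helper below returns engine-specific statements. `onlineUniqueIndex` is a hypothetical name; the driver strings follow Laravel's connection driver names, and the index name follows Laravel's default `table_column_unique` convention.

```php
<?php
declare(strict_types=1);

/**
 * Statements for adding a unique index with minimal blocking.
 * $driver uses Laravel's connection driver names ('pgsql' or 'mysql').
 * Identifiers are assumed to be trusted and are not escaped.
 *
 * @return string[] statements to run in order
 */
function onlineUniqueIndex(string $driver, string $table, string $column): array
{
    $index = "{$table}_{$column}_unique";

    return match ($driver) {
        'pgsql' => [
            // Builds the index without blocking writes; cannot run inside a transaction.
            "CREATE UNIQUE INDEX CONCURRENTLY {$index} ON {$table} ({$column})",
            // Promotes the finished index to a table constraint.
            "ALTER TABLE {$table} ADD CONSTRAINT {$index} UNIQUE USING INDEX {$index}",
        ],
        'mysql' => [
            // InnoDB online DDL: fails up front if it cannot run without blocking writes.
            "ALTER TABLE {$table} ADD UNIQUE INDEX {$index} ({$column}), ALGORITHM=INPLACE, LOCK=NONE",
        ],
        default => throw new InvalidArgumentException("Unsupported driver: {$driver}"),
    };
}

foreach (onlineUniqueIndex('pgsql', 'users', 'email') as $sql) {
    echo $sql, ";\n";
}
```

On Postgres the migration class needs `public $withinTransaction = false;` for the CONCURRENTLY statement to be accepted. If the concurrent build fails (for example because a duplicate slipped in), it leaves an invalid index behind that has to be dropped before retrying.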

--pretend and the migration review

The single best tool for catching dangerous migrations is php artisan migrate --pretend. It prints the SQL that would be executed without executing it. We run it on every deploy candidate, in a pipeline step:

php artisan migrate --pretend --no-interaction > /tmp/migration.sql

Then we review /tmp/migration.sql against a checklist:

  • Is there a DROP COLUMN? Has the corresponding code been live for a release?
  • Is there a RENAME COLUMN? Is this dual-column migration deploy 1 or deploy 2?
  • Is there a MODIFY COLUMN or ALTER COLUMN TYPE? Is the change instant or full-rewrite?
  • Is there an ADD UNIQUE? Have we verified no duplicates?
  • Is there an ADD FOREIGN KEY? Are the columns on both sides of the relationship indexed?
  • Are there any statements that mix DDL and DML? On MySQL every DDL statement commits implicitly, so the combination won't be transactional.

For managed Laravel customers, we automate this review — a script flags anything matching the patterns above and requires a human signoff. It catches roughly one risky migration per month per customer.
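
A flagging script of this kind can be quite small. The sketch below is an illustration, not the script described above: the rule set and the `flagRiskyStatements` name are hypothetical, and it assumes each SQL statement in the `--pretend` output sits on its own line (the exact output format varies between Laravel versions).

```php
<?php
declare(strict_types=1);

// Hypothetical rule set: regex => reviewer-facing question. Tune per project.
const RISKY_PATTERNS = [
    // Matches "drop column x" and Laravel's MySQL form "drop `x`", but not "drop index".
    '/\bdrop\s+(column\s+)?[`"]/i' => 'DROP COLUMN: has the code stopped using it for a release?',
    '/\brename\s+column\b/i' => 'RENAME COLUMN: which deploy of the dual-column pattern is this?',
    '/\bmodify\b|\balter\s+column\b.+\btype\b/i' => 'TYPE CHANGE: instant or full rewrite?',
    '/\badd\s+(constraint\s+\S+\s+)?unique\b|\bcreate\s+unique\s+index\b/i' => 'ADD UNIQUE: duplicates audited?',
    '/\badd\s+(constraint\s+\S+\s+)?foreign\s+key\b/i' => 'ADD FOREIGN KEY: both sides indexed?',
];

/**
 * Scan migrate --pretend output line by line and collect risky statements.
 *
 * @return array<int, array{line: int, reason: string, sql: string}>
 */
function flagRiskyStatements(string $pretendOutput): array
{
    $findings = [];
    foreach (preg_split('/\R/', $pretendOutput) as $index => $line) {
        foreach (RISKY_PATTERNS as $pattern => $reason) {
            if (preg_match($pattern, $line) === 1) {
                $findings[] = ['line' => $index + 1, 'reason' => $reason, 'sql' => trim($line)];
            }
        }
    }
    return $findings;
}

$sample = <<<'SQL'
alter table `orders` add `buyer_id` bigint unsigned null after `customer_id`
alter table `orders` drop `customer_id`
alter table `users` add unique `users_email_unique`(`email`)
SQL;

foreach (flagRiskyStatements($sample) as $finding) {
    echo "line {$finding['line']}: {$finding['reason']}\n";
}
```

In a pipeline, a non-empty result would fail the step until a reviewer signs off. Regex matching is deliberately crude here; it is meant to prompt a human, not to prove a migration safe.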

Migration locking and the deploy itself

Laravel migrations don't take database-level locks by themselves, but each ALTER acquires MySQL's exclusive metadata lock (or Postgres's ACCESS EXCLUSIVE lock) on the table being altered. While that lock is held, or while the ALTER is queued waiting for it behind a long-running query:

  • New queries against the table block
  • The blocked queries hold their own locks, building a queue
  • Connections pile up
  • Connection pool exhausts
  • Application starts returning 5xx

For a 200ms migration, this is invisible. For a 30-minute table rewrite, it's an outage even though the deploy is "working."

What we do:

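  • Set lock and statement timeouts before running migrations. For MySQL: SET SESSION lock_wait_timeout=30; (30s). Note that max_execution_time only applies to SELECT statements, so it does not bound DDL. For Postgres: SET lock_timeout = '30s'; alongside SET statement_timeout = '30s';. If a migration has to wait on a lock (or, on Postgres, runs long), it fails fast rather than building a stampede.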
  • Run heavy migrations as out-of-band operations, not during deploy. Backfills, type changes, large index builds — these happen separately, often in maintenance windows we agree with the customer, often via pt-online-schema-change or pg_repack.
  • Keep the deploy migration step under 5 seconds. If it can't be, it shouldn't be in the deploy.
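
The timeout step can be sketched as a helper that returns the session settings for each engine. This is an illustration under assumptions: `migrationSessionGuards` is a hypothetical name, and for MySQL it uses `lock_wait_timeout`, which bounds how long DDL waits for a metadata lock (`max_execution_time` only covers SELECT statements).

```php
<?php
declare(strict_types=1);

/**
 * Session settings to apply on the migration connection before running DDL.
 * $driver uses Laravel's driver names; $seconds is the fail-fast budget.
 *
 * @return string[] SET statements to run in order
 */
function migrationSessionGuards(string $driver, int $seconds): array
{
    if ($seconds < 1) {
        throw new InvalidArgumentException('seconds must be at least 1');
    }

    return match ($driver) {
        // Bounds the wait for a metadata lock; the value is in seconds.
        'mysql' => ["SET SESSION lock_wait_timeout = {$seconds}"],
        // lock_timeout bounds lock waits; statement_timeout bounds total run time.
        'pgsql' => [
            "SET lock_timeout = '{$seconds}s'",
            "SET statement_timeout = '{$seconds}s'",
        ],
        default => throw new InvalidArgumentException("Unsupported driver: {$driver}"),
    };
}

foreach (migrationSessionGuards('pgsql', 30) as $sql) {
    echo $sql, ";\n";
}
```

These are session-level settings, so they only help if they run via DB::statement() on the same connection that then executes the migrations.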

Rollback semantics

php artisan migrate:rollback is a tool for the dev environment. In production, it should be treated with extreme caution:

  • Most data migrations are not reversible (you can't "un-backfill" data without saving the previous state)
  • The down() method is rarely tested
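  • Rolling back a migration that dropped a NOT NULL constraint means re-adding it, which either fails or forces you to overwrite any NULLs written in the meantime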

Our production rollback strategy is forward-only. If a migration causes a problem, we:

  1. Roll back the application code to the previous release (cheap — atomic symlink swap as covered in our deployment article)
  2. Leave the schema change in place
  3. Write a new forward migration to undo the schema change, if needed

This is part of the broader change-safety discipline we apply — production database state is never quietly reverted by automation. Every change moves forward.

The pre-flight checklist

For any migration touching a table with >1M rows, we run through:

  1. --pretend reviewed by a second engineer
  2. Estimated lock duration on production-sized data (from staging, with realistic row counts)
  3. Statement timeout configured
  4. Heavy data work scheduled out-of-band, not in deploy
  5. Roll-forward plan, not roll-back plan
  6. Customer notification if the operation has any chance of impacting users
  7. Monitoring dashboard open, lock-wait query ready
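
For the last item, one possible lock-wait query per engine is sketched below. `lockWaitQuery` is a hypothetical helper; the Postgres query assumes version 9.6 or later (for `pg_blocking_pids`), and the MySQL one assumes the sys schema with metadata lock instrumentation enabled, which is the default on MySQL 8.

```php
<?php
declare(strict_types=1);

/**
 * A lock-wait query to keep ready during a migration, per Laravel driver name.
 */
function lockWaitQuery(string $driver): string
{
    return match ($driver) {
        // Sessions currently waiting on a lock, and the pids blocking them.
        'pgsql' => <<<'SQL'
            SELECT pid, pg_blocking_pids(pid) AS blocked_by,
                   now() - query_start AS waiting_for, query
            FROM pg_stat_activity
            WHERE wait_event_type = 'Lock'
            ORDER BY query_start
            SQL,
        // sys schema view of sessions blocked on table metadata locks.
        'mysql' => 'SELECT * FROM sys.schema_table_lock_waits',
        default => throw new InvalidArgumentException("Unsupported driver: {$driver}"),
    };
}

echo lockWaitQuery('pgsql'), "\n";
```

Run it from a separate session (for example with DB::select() in a tinker shell or directly in a SQL client), since the migration's own connection is busy while the ALTER runs.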

It looks heavy. It's the discipline that means we've not had a customer outage from a routine migration in years.

What we provide by default

For every managed Laravel customer, we configure:

  • A pre-deploy CI step that runs migrate --pretend and flags dangerous patterns
  • Statement timeouts on the database session used for migrations
  • A standard runbook covering the dual-column patterns above, customised per app
  • A scheduled monthly review of any pending dangerous migrations (rename/drop columns that haven't been cleaned up)
  • A separate workflow for large data operations that doesn't piggyback on application deploys

Migration safety isn't a tool, it's a habit. The tools — --pretend, statement timeouts, pt-online-schema-change — are easy. The habit is the part that's worth getting right.

Reach out if you'd like us to audit your last 100 migrations. We typically find 2-5 that were quietly unsafe — they happened to work, but only just.

Sudhanshu K. is a Senior Database engineer at EdgeServers (RemotIQ Pty Ltd, ABN 91 682 628 128). She has personally executed schema migrations against multi-terabyte Laravel production databases and has a strong opinion about the word "just" in the phrase "just rename the column."