Skip to content

Conversation

@jsai20
Copy link
Contributor

@jsai20 jsai20 commented Dec 29, 2025

Fix zfs_open() to skip zil_async_to_sync() for the snapshot, as it won't have any transactions. zfsvfs->z_log is NULL for the snapshot.

Motivation and Context

zfs_open()->zil_async_to_sync() gets kernel panic due to NULL pointer reference for the snapshot while referring zilog->zl_spa;

[1393579.112941] BUG: kernel NULL pointer dereference, address: 0000000000000038
[1393579.114200] #PF: supervisor read access in kernel mode
[1393579.115075] #PF: error_code(0x0000) - not-present page
[1393579.115968] PGD 1bf632067 P4D 1bf632067 PUD 1fba60067 PMD 0
[1393579.116932] Oops: 0000 [#1] SMP NOPTI
….
...
[1393579.120976] RIP: 0010:zil_async_to_sync+0x1d/0x310 [zfs]
[1393579.121887] Code: fc ff ff e8 55 90 1a d9 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56 49 c7 c6 fc ff ff ff 41 55 41 54 55 48 89 fd 53 48 83 ec 78 <48> 8b 7f 38 48 89 74 24 10 65 48 8b 04 25 28 00 00 00 48 89 44 24
[1393579.124995] RSP: 0018:ffffaede0c5dbbe8 EFLAGS: 00010296
[1393579.125887] RAX: 0000000000000000 RBX: ffff90f71a4a92d0 RCX: 0000000000000000
[1393579.127081] RDX: ffff90f514c41e80 RSI: 0000000000000002 RDI: 0000000000000000
[1393579.128312] RBP: 0000000000000000 R08: ffff90f514c41e80 R09: ffff90f514c41e80
[1393579.129509] R10: ffffaede0c5dbd00 R11: 0000000000000008 R12: 0000000000101000
[1393579.130703] R13: 0000000000000000 R14: fffffffffffffffc R15: ffff90f514c41e80
[1393579.131906] FS:  00007ff8851f46c0(0000) GS:ffff90f730280000(0000) knlGS:0000000000000000
[1393579.133256] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1393579.134224] CR2: 0000000000000038 CR3: 00000002933fa003 CR4: 00000000003706e0
[1393579.135421] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1393579.136628] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1393579.137827] Call Trace:
[1393579.147380]  zfs_open+0xbb/0x170 [zfs]
[1393579.148144]  zpl_open+0x6d/0xf0 [zfs]
[1393579.149616]  do_dentry_open+0x1c4/0x350
[1393579.150280]  do_open+0x180/0x3d0
[1393579.150851]  path_openat+0x12c/0x2c0
[1393579.151472]  do_filp_open+0xa4/0x150
[1393579.153663]  do_sys_openat2+0x95/0x140
[1393579.154312]  __x64_sys_openat+0x6a/0xa0
[1393579.154977]  do_syscall_64+0x30/0x40
[1393579.155602]  entry_SYSCALL_64_after_hwframe+0x62/0xc7
[1393579.156486] RIP: 0033:0x7ff884d1d332

Description

How Has This Been Tested?

Opening a file on snapshot "<dataset mountpoint>/.zfs/snapshot/<file>" wiht O_SYNC flag hits a panic.
Validated a fix with same steps.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@jsai20
Copy link
Contributor Author

jsai20 commented Dec 30, 2025

@behlendorf Actually, we need to change zfs_close() as well to prevent decrement z_sync_cnt for snapshot.
I would update a PR.

Fix zfs_open() to skip zil_async_to_sync() for the snapshot, as it won't
have any transactions. zfsvfs->z_log is NULL for the snapshot.

Signed-off-by: Jitendra Patidar <[email protected]>
@amotin
Copy link
Member

amotin commented Jan 5, 2026

While I agree that this code should not make much sense for snapshots, I don't see here an explanation of why zfsvfs->z_log or something else appear NULL there. Without full explanation it makes me wonder if we may have any other scenarios, like read-only mounts, having the same problem?

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jan 5, 2026
@jsai20
Copy link
Contributor Author

jsai20 commented Jan 6, 2026

While I agree that this code should not make much sense for snapshots, I don't see here an explanation of why zfsvfs->z_log or something else appear NULL there. Without full explanation it makes me wonder if we may have any other scenarios, like read-only mounts, having the same problem?

Following code in zfs_domount() doesn’t initialise zfsvfs->z_log for the snapshot.
It does initialise z_log always for the live dataset via zfs_domount()->zfsvfs_setup(), even if its readonly mount. So, we shouldn't hit the null pointer dereference for readonly mount.

int
zfs_domount(struct super_block *sb, zfs_mnt_t *zm, int silent)
{
…

        if (dmu_objset_is_snapshot(zfsvfs->z_os)) {
                uint64_t pval;

                atime_changed_cb(zfsvfs, B_FALSE);
                readonly_changed_cb(zfsvfs, B_TRUE);
                if ((error = dsl_prop_get_integer(osname,
                    "xattr", &pval, NULL)))
                        goto out;
                xattr_changed_cb(zfsvfs, pval);
                if ((error = dsl_prop_get_integer(osname,
                    "acltype", &pval, NULL)))
                        goto out;
                acltype_changed_cb(zfsvfs, pval);
                zfsvfs->z_issnap = B_TRUE;
                zfsvfs->z_os->os_sync = ZFS_SYNC_DISABLED;
                zfsvfs->z_snap_defer_time = jiffies;

                mutex_enter(&zfsvfs->z_os->os_user_ptr_lock);
                dmu_objset_set_user(zfsvfs->z_os, zfsvfs);
                mutex_exit(&zfsvfs->z_os->os_user_ptr_lock);
        } else {
                if ((error = zfsvfs_setup(zfsvfs, B_TRUE)))
                        goto out;
        }
….
}

static int
zfsvfs_setup(zfsvfs_t *zfsvfs, boolean_t mounting)
{
        int error;
        boolean_t readonly = zfs_is_readonly(zfsvfs);

        error = zfs_register_callbacks(zfsvfs->z_vfs);
        if (error)
                return (error);

        /*
         * If we are not mounting (ie: online recv), then we don't
         * have to worry about replaying the log as we blocked all
         * operations out since we closed the ZIL.
         */
        if (mounting) {
                ASSERT0P(zfsvfs->z_kstat.dk_kstats);
                error = dataset_kstats_create(&zfsvfs->z_kstat, zfsvfs->z_os);
                if (error)
                        return (error);
                zfsvfs->z_log = zil_open(zfsvfs->z_os, zfs_get_data,
                    &zfsvfs->z_kstat.dk_zil_sums);
….
}

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. OK. Though I wonder whether snapshots really need to be that special...

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jan 6, 2026
@behlendorf behlendorf merged commit 2301755 into openzfs:master Jan 6, 2026
24 of 26 checks passed
@jsai20 jsai20 deleted the us-zfs-open-fix branch January 7, 2026 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Accepted Ready to integrate (reviewed, tested)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants