Add backing_device_info per-filesystem
authorBrian Behlendorf <behlendorf1@llnl.gov>
Tue, 2 Aug 2011 01:24:40 +0000 (18:24 -0700)
committerBrian Behlendorf <behlendorf1@llnl.gov>
Thu, 4 Aug 2011 20:37:38 +0000 (13:37 -0700)
commit76659dc110ef2ada13bcb8e4e2ec60d8216c6836
tree19de8ecbb20ae764ae16b0f05399fff16424c440
parent3c0e5c0f455576d045fa443cbab74834d70ded55
Add backing_device_info per-filesystem

For a long time now the kernel has been moving away from using the
pdflush daemon to write 'old' dirty pages to disk.  The primary reason
for this is because the pdflush daemon is single threaded and can be
a limiting factor for performance.  Since pdflush sequentially walks
the dirty inode list for each super block any delay in processing can
slow down dirty page writeback for all filesystems.

The replacement for pdflush is called bdi (backing device info).  The
bdi system involves creating a per-filesystem control structure each
with its own private sets of queues to manage writeback.  The advantage
is greater parallelism which improves performance and prevents a single
filesystem from slowing writeback to the others.

For a long time both systems co-existed in the kernel so it wasn't
strictly required to implement the bdi scheme.  However, as of
Linux 2.6.36 kernels the pdflush functionality has been retired.

Since ZFS already bypasses the page cache for most I/O this is only
an issue for mmap(2) writes which must go through the page cache.
Even then adding this missing support for newer kernels was overlooked
because there are other mechanisms which can trigger writeback.

However, there is one critical case where not implementing the bdi
functionality can cause problems.  If an application handles a page
fault it can enter the balance_dirty_pages() callpath.  This will
result in the application hanging until the number of dirty pages in
the system drops below the dirty ratio.

Without a registered backing_device_info for the filesystem the
dirty pages will not get written out.  Thus the application will hang.
As mentioned above this was less of an issue with older kernels because
pdflush would eventually write out the dirty pages.

This change adds a backing_device_info structure to the zfs_sb_t
which is already allocated per-super block.  It is then registered
when the filesystem mounted and unregistered on unmount.  It will
not be registered for mounted snapshots which are read-only.  This
change will result in flush-<pool> thread being dynamically created
and destroyed per-mounted filesystem for writeback.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #174
61 files changed:
Makefile.in
cmd/Makefile.in
cmd/mount_zfs/Makefile.in
cmd/sas_switch_id/Makefile.in
cmd/zdb/Makefile.in
cmd/zfs/Makefile.in
cmd/zinject/Makefile.in
cmd/zpios/Makefile.in
cmd/zpool/Makefile.in
cmd/zpool_id/Makefile.in
cmd/zpool_layout/Makefile.in
cmd/ztest/Makefile.in
cmd/zvol_id/Makefile.in
config/kernel-bdi.m4 [new file with mode: 0644]
config/kernel.m4
configure
dracut/90zfs/Makefile.in
dracut/Makefile.in
etc/Makefile.in
etc/init.d/Makefile.in
etc/udev/Makefile.in
etc/udev/rules.d/Makefile.in
etc/zfs/Makefile.in
include/Makefile.in
include/linux/Makefile.in
include/linux/vfs_compat.h
include/sys/Makefile.in
include/sys/fm/Makefile.in
include/sys/fm/fs/Makefile.in
include/sys/fs/Makefile.in
include/sys/zfs_vfsops.h
lib/Makefile.in
lib/libavl/Makefile.in
lib/libefi/Makefile.in
lib/libnvpair/Makefile.in
lib/libshare/Makefile.in
lib/libspl/Makefile.in
lib/libspl/asm-generic/Makefile.in
lib/libspl/asm-i386/Makefile.in
lib/libspl/asm-x86_64/Makefile.in
lib/libspl/include/Makefile.in
lib/libspl/include/ia32/Makefile.in
lib/libspl/include/ia32/sys/Makefile.in
lib/libspl/include/rpc/Makefile.in
lib/libspl/include/sys/Makefile.in
lib/libspl/include/sys/dktp/Makefile.in
lib/libspl/include/sys/sysevent/Makefile.in
lib/libspl/include/util/Makefile.in
lib/libunicode/Makefile.in
lib/libuutil/Makefile.in
lib/libzfs/Makefile.in
lib/libzpool/Makefile.in
man/Makefile.in
man/man8/Makefile.in
module/zfs/zfs_vfsops.c
scripts/Makefile.in
scripts/zpios-profile/Makefile.in
scripts/zpios-test/Makefile.in
scripts/zpool-config/Makefile.in
scripts/zpool-layout/Makefile.in
zfs_config.h.in