C H A P T E R  7

Advanced Topics

This chapter discusses advanced topics that are beyond the scope of basic system administration and usage. This chapter contains the following sections:

- Using Daemons, Processes, and Tracing
- Using the setfa(1) Command to Set File Attributes
- Configuring WORM-FS File Systems
- Accommodating Large Files
- Configuring a Multireader File System
- Using the SAN-QFS File System in a Heterogeneous Computing Environment
- Understanding I/O Types
- Increasing File Transfer Performance for Large Files


Using Daemons, Processes, and Tracing

It is useful to have an understanding of system daemons and processes when you are debugging. This section describes the Sun StorageTek QFS daemons and processes. It also provides information about daemon tracing.

Daemons and Processes

All Sun StorageTek QFS daemons are named in the form sam-daemon_named; that is, sam- followed by the daemon name and a trailing lowercase letter d. Processes are named in a similar manner; the difference is that their names do not end in the lowercase letter d.

TABLE 7-1 shows some of the daemons and processes that can run on your system. Others, such as sam-genericd and sam-catserverd, might also be running, depending on system activities.


TABLE 7-1 Daemons and Processes

Process         Description
sam-fsd         Master daemon.
sam-sharefsd    Invokes the Sun StorageTek QFS shared file system daemon.
sam-rpcd        Controls the remote procedure call (RPC) application
                programming interface (API) server process.


When you run Sun StorageTek QFS software, init starts the sam-fsd daemon as part of /etc/inittab processing. The daemon is started at init levels 0, 2, 3, 4, 5, and 6. It should restart automatically in case of failure.

In a Sun StorageTek QFS shared file system, a sam-fsd daemon is always active. In addition, one sam-sharefsd daemon is active for each mounted shared file system.

When a Sun StorageTek QFS shared file system is mounted, the software starts a shared file system daemon (sam-sharefsd). TCP sockets are used to communicate between the server and client hosts. All clients that connect to the metadata server are validated against the hosts file.



Note - See the hosts.fs(4) man page for more information about the hosts file.



The sam-sharefsd daemon on the metadata server opens a listener socket on the port named sam-qfs. During the Sun StorageTek QFS installation process, the sam-qfs entry is automatically added to the /etc/services file. Do not remove this entry. In addition, the shared file system port is defined in the /etc/inet/services file as port number 7105. Verify that this port does not conflict with another service.



Note - Before the Sun StorageTek QFS 4U2 release, one port per file system was required. You can remove these entries from your file.



All metadata operations, block allocation and deallocation, and record locking are performed on the metadata server. The sam-sharefsd daemon does not keep any state information. Hence, it can be stopped and restarted without causing any consistency problems for the file system.

Trace Files

Several Sun StorageTek QFS processes can write messages to trace files. These messages contain information about the state and progress of the work performed by the daemons. The messages are primarily used by Sun Microsystems staff to improve performance and diagnose problems. The message content and format are subject to change from release to release.

Trace files can be used in debugging. By default, trace files are not enabled. You can enable trace files by editing the defaults.conf file. You can enable tracing for all processes, or you can enable tracing for individual processes. For information about the processes that you can trace, see the defaults.conf(4) man page.
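For illustration, a defaults.conf fragment like the following turns on tracing for all processes (a minimal sketch; confirm the directive names against the defaults.conf(4) man page before use):


trace
all = on
endtrace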

By default, trace files are written to the /var/opt/SUNWsamfs/trace directory. In that directory, the trace files are named for the processes (archiver, catserver, fsd, ftpd, recycler, sharefsd, and stager). You can change the names of the trace files by specifying directives in the defaults.conf configuration file. You can also set a limit on the size of a trace file and rotate your tracing logs. For information about controlling tracing, see the defaults.conf(4) man page.

Trace File Content

Trace file messages contain the time and source of the message. The messages are produced by events in the processes. You can select the events by using directives in the defaults.conf file.

The default events are as follows:

- Customer notification syslog or notify file messages
- Nonfatal program errors
- Fatal syslog messages
- Process initiation and completion
- Other miscellaneous events

You can also trace the following events:

- Memory allocations
- Interprocess communication
- File actions
- Operator messages
- Queue contents when changed
- Other events

The default message elements (program name, process ID (PID), and time) are always included and cannot be excluded. Optionally, the messages can also contain the following elements:

- The date (the time is always included)
- The source file name and line number
- The event type

Trace File Rotation

To prevent trace files from growing indefinitely, the sam-fsd daemon monitors the size of the trace files and periodically executes the following command:


/opt/SUNWsamfs/sbin/trace_rotate

This script moves the trace files to sequentially numbered copies. You can modify this script to suit your operation. Alternatively, you can provide this function using cron(1) or some other facility.
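For example, a crontab(1) entry similar to the following sketch (with a hypothetical hourly schedule) could run the rotation script instead of relying on sam-fsd:


0 * * * * /opt/SUNWsamfs/sbin/trace_rotate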

Determining Which Processes Are Being Traced

To determine which processes are being traced currently, enter the sam-fsd(1M) command at the command line. CODE EXAMPLE 7-1 shows the output from this command.


CODE EXAMPLE 7-1 sam-fsd(1M) Command Output
# sam-fsd
Trace file controls:
sam-amld      /var/opt/SUNWsamfs/trace/sam-amld
              cust err fatal misc proc date
              size    0    age 0
sam-archiverd /var/opt/SUNWsamfs/trace/sam-archiverd
              cust err fatal misc proc date
              size    0    age 0
sam-catserverd /var/opt/SUNWsamfs/trace/sam-catserverd
              cust err fatal misc proc date
              size    0    age 0
sam-fsd       /var/opt/SUNWsamfs/trace/sam-fsd
              cust err fatal misc proc date
              size    0    age 0
sam-rftd      /var/opt/SUNWsamfs/trace/sam-rftd
              cust err fatal misc proc date
              size    0    age 0
sam-recycler  /var/opt/SUNWsamfs/trace/sam-recycler
              cust err fatal misc proc date
              size    0    age 0
sam-sharefsd  /var/opt/SUNWsamfs/trace/sam-sharefsd
              cust err fatal misc proc date
              size    0    age 0
sam-stagerd   /var/opt/SUNWsamfs/trace/sam-stagerd
              cust err fatal misc proc date
              size    0    age 0
sam-serverd   /var/opt/SUNWsamfs/trace/sam-serverd
              cust err fatal misc proc date
              size    0    age 0
sam-clientd   /var/opt/SUNWsamfs/trace/sam-clientd
              cust err fatal misc proc date
              size    0    age 0
sam-mgmt      /var/opt/SUNWsamfs/trace/sam-mgmt
              cust err fatal misc proc date
              size    0    age 0

For more information about enabling trace files, see the defaults.conf(4) man page and the sam-fsd(1M) man page.


Using the setfa(1) Command to Set File Attributes

Sun StorageTek QFS file systems enable end users to set performance attributes for files and directories. Applications can enable these performance features on a per-file or per-directory basis. The following sections describe how the application programmer can use these features to select file attributes for files and directories, to preallocate file space, to specify the allocation method for the file, and to specify the disk stripe width.

For more information about implementing the features described in the following subsections, see the setfa(1) man page.

Selecting File Attributes for Files and Directories

The setfa(1) command sets attributes on a new or existing file. The file is created if it does not already exist.

You can set attributes on a directory as well as a file. When using setfa(1) with a directory, files and directories created within that directory inherit the attributes set in the original directory. To reset attributes on a file or directory to the default, use the -d (default) option. When the -d option is used, attributes are first reset to the default and then other attributes are processed.
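For example, the following command resets the attributes of a hypothetical file named /qfs/file1 to the default:


# setfa -d /qfs/file1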

Preallocating File Space

An end user can preallocate space for a file. This space is associated with a file so that no other files in the file system can use the disk addresses allocated to this file. Preallocation ensures that space is available for a given file, which avoids a file-system-full condition. Preallocation is assigned at the time of the request rather than when the data is actually written to disk.

Note that preallocation can waste space. If the file size is smaller than the allocation amount, the kernel allocates space to the file from the current file size up to the allocation amount. When the file is closed, the space below the allocation amount is not freed.

You can preallocate space for a file by using the setfa(1) command with either the -L or the -l (lowercase letter L) option. Both options accept a file length as their argument. Use the -L option for an existing file, which can be empty or contain data. Use the -l option for a file that has no data yet. If you use the -l option, the file cannot grow beyond its preallocated limit.

For example, to preallocate a 1-gigabyte file named /qfs/file_alloc, type the following:


# setfa -l 1g /qfs/file_alloc

After space for a file has been preallocated, truncating the file to 0 length or removing the file returns all of its allocated space. There is no way to return only part of a file's preallocated space to the file system. In addition, if a file is preallocated with the -l option, there is no way to extend the file beyond its preallocated size in future operations.

Selecting a File Allocation Method and Stripe Width

By default, a file uses the allocation method and stripe width specified at mount time (see the mount_samfs(1M) man page). However, an end user might want to use a different allocation scheme for a file or directory. The user can do this by using the setfa(1) command with the -s (stripe) option.

The allocation method can be either round-robin or striped. The -s option specifies the allocation method and the stripe width, as shown in TABLE 7-2.


TABLE 7-2 File Allocations and Stripe Widths

-s Option   Allocation Method   Stripe Width     Explanation
0           Round-robin         Not applicable   The file is allocated on one device
                                                 until that device has no space.
1-255       Striped             1-255 DAUs       The file is striped across all disk
                                                 devices with this number of DAUs
                                                 per disk.


The following example shows how to create a file explicitly by specifying a round-robin allocation method:


# setfa -s 0 /qfs/100MB.rrobin

The following example shows how to create a file explicitly by specifying a striped allocation method with a stripe width of 64 DAUs (preallocation is not used):


# setfa -s 64 /qfs/file.stripe

Selecting a Striped Group Device

Striped group devices are supported for Sun StorageTek QFS file systems only.

A user can specify that a file begin allocation on a particular striped group. If the file allocation method is round-robin, the file is allocated on the designated striped group.

CODE EXAMPLE 7-2 shows setfa(1) commands specifying that file1 and file2 be independently spread across two different striped groups.


CODE EXAMPLE 7-2 setfa(1) Commands to Spread Files Across Striped Groups
# setfa -g0 -s0 file1
# setfa -g1 -s0 file2

This capability is particularly important for applications that must achieve levels of performance that approach raw device speeds. For more information, see the setfa(1) man page.


Configuring WORM-FS File Systems

Write once read many (WORM) technology is used in many applications for data integrity reasons and because of the accepted legal admissibility of stored files that use the technology. Beginning with release 4U3 of the Sun StorageTek QFS software, a WORM-FS feature became available as an add-on package called SUNWsamfswm. In the 4U4 software release the WORM-FS interface was modified to be compatible with the new Sun StorageTek 5310 network attached storage (NAS) appliance. The previous WORM-FS interface using ssum is no longer supported.



Note - The WORM-FS package (SUNWsamfswm) is included with the Sun StorageTek QFS software packages, but must be installed separately by using the pkgadd command.



The WORM-FS feature offers default and customizable file-retention periods, data and path immutability, and subdirectory inheritance of the WORM setting.

WORM-FS can operate in one of two modes: Sun standard compliance mode (referred to herein simply as standard mode), which is the default, and Sun emulation compliance mode (referred to herein as emulation mode). Emulation mode is designed to provide compatibility with the emulation mode of the Sun StorageTek 5320 network attached storage (NAS) appliance and is similar to an interface defined by Network Appliance.

One difference between standard and emulation mode is a restriction on the nature of files that can be retained. Specifically, in standard mode, files with any UNIX executable permissions cannot be retained. There is no such restriction in emulation mode. The restriction in standard mode exists because of the nature of the retention trigger defined for NFS and FTP. For these protocols, retention is requested by specifying that the setuid mode be set on the file. Once a file is retained, a client will see the setuid mode bit set, but the restriction on executable files will prevent the possible security hole of allowing an executable file owned by the root user to be made WORM and therefore impossible to remove. A benefit of this approach is that the user or application can more easily determine which files on the system are indeed WORM-protected files.

Enabling the WORM-FS Feature

There are four mount options that can be used to enable the WORM-FS feature:

- worm_capable - Enables the standard WORM mode.
- worm_lite - Enables the standard WORM mode with relaxed restrictions for the system administrator. See WORM "Lite" Options.
- worm_emul - Enables the WORM emulation mode.
- emul_lite - Enables the WORM emulation mode with relaxed restrictions for the system administrator. See WORM "Lite" Options.

These four mount options are somewhat exclusive. You can upgrade from "lite" to standard WORM mode, but you cannot change from standard WORM mode to emulation mode, or from emulation to standard mode. These options can be provided on the command line when the file system is mounted, listed in /etc/vfstab, or provided in /opt/SUNWsamfs/samfs.cmd. The normal rules of precedence for mount options apply.

The WORM attribute is stored in the mount table and enables WORM files to be created in directories anywhere in the file system.



Note - You must have system administration privileges to set a WORM mount option in /etc/vfstab.



CODE EXAMPLE 7-3 shows an example of WORM-FS mount options. The file system samfs1 mounted at /samfs1 is WORM-capable and has the default retention period for files set to 60 minutes.


CODE EXAMPLE 7-3 Using WORM-FS Mount Options
# cat /etc/vfstab
#device            device   mount    FS     fsck   mount    mount
#to mount          to fsck  point    type   pass   at boot  options
#
fd                 -        /dev/fd  fd     -      no       -
/proc              -        /proc    proc   -      no       -
/dev/dsk/c0t0d0s1  -        -        swap   -      no       -
samfs1             -        /samfs1  samfs  -      yes      worm_capable,def_retention=60
swap               -        /tmp     tmpfs  -      yes      -

After the WORM-FS feature has been enabled and at least one WORM file is resident in the file system, the file system's superblock is updated to reflect the WORM capability. Any subsequent attempt to rebuild the file system through sammkfs will fail, unless you are using the worm_lite or emul_lite mount option.

WORM "Lite" Options

The worm_lite and emul_lite mount options create a modified WORM environment that eases the restrictions on actions that can be taken on WORM-enabled volumes and retained files. The WORM lite options can be a solution for companies with document management and retention policies requiring data retention guarantees but not the strict constraints that WORM places on systems. Mechanisms exist to alter and even reverse some data retention decisions.

The WORM lite options can also be used for testing and configuring WORM systems and applications before upgrading to the more strict standard WORM policies.

The WORM lite environment behaves similarly to the standard WORM mode. File data and path remain immutable, but the system administrator is allowed to carry out the following special actions:

- Remove retained files before the retention period has expired
- Shorten retention periods on retained files
- Delete WORM lite-enabled volumes or rebuild them by using sammkfs

Creating WORM Files

A WORM mount option enables a file system to contain WORM files, but it does not automatically create WORM files. To create a WORM file, you must first make the directory WORM-capable. To do this, create an ordinary directory and then use a WORM trigger command to set the WORM bit on the directory. Depending on the mount option being used, the following WORM trigger commands are available:

- chmod 4000 file-or-directory-name - Applies the WORM trigger in standard WORM mode (worm_capable) and standard WORM lite mode (worm_lite).
- chmod -w file-or-directory-name - Applies the WORM trigger in WORM emulation mode (worm_emul) and WORM emulation lite mode (emul_lite).

After setting the WORM bit on a directory, you can create files in that directory and then use the appropriate WORM trigger to set the WORM bit on files that you want retained. The WORM trigger is the same for both files and directories.

The following are examples of using the WORM trigger for each of the four mount options using the system-wide default retention value:

Example 1. WORM trigger is chmod 4000

Simple application of the WORM trigger using standard WORM functionality:


[root@ns-east-44]# grep -i worm /etc/vfstab
samfs1 -       /samfs1  samfs   -       no     bg,worm_capable
 
[root@ns-east-44]# cd /samfs1
[root@ns-east-44]# mkdir WORM
[root@ns-east-44]# chmod 4000 WORM
[root@ns-east-44]# sls -D
 
WORM:
  mode: drwxr-xr-x  links:   2  owner: root      group: root
  length:      4096  admin id:      0  inode:     1025.1
  access:      Jan 30 15:50  modification: Jan 30 15:50
  changed:     Jan 30 15:50  attributes:   Jan  1  1970
  creation:    Jan 30 15:50  residence:    Jan 30 15:50
  worm-capable        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# cd WORM
[root@ns-east-44]# touch test
[root@ns-east-44]# chmod 4000 test
[root@ns-east-44]# sls -D
 
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root
  length:         0  admin id:      0  inode:     1026.3
  access:      Jan 30 15:51  modification: Jan 30 15:51
  changed:     Jan 30 15:51  retention-end: Mar  1 15:51 2007
  creation:    Jan 30 15:51  residence:    Jan 30 15:51
  retention:   active        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# rm test
rm: test: override protection 444 (yes/no)? yes
rm: test not removed: Read-only file system
[root@ns-east-44]# ls
test

Example 2. WORM trigger is chmod 4000

Simple application of the WORM trigger using standard WORM lite functionality:


[root@ns-east-44]# grep -i worm /etc/vfstab
samfs1 -       /samfs1  samfs   -       no     bg,worm_lite
 
[root@ns-east-44]# mount samfs1
[root@ns-east-44]# cd /samfs1
[root@ns-east-44]# mkdir WORM
[root@ns-east-44]# chmod 4000 WORM
[root@ns-east-44]# sls -D
 
WORM:
  mode: drwxr-xr-x  links:   2  owner: root      group: root
  length:      4096  admin id:      0  inode:     1025.1
  access:      Jan 30 16:12  modification: Jan 30 16:12
  changed:     Jan 30 16:12  attributes:   Jan  1  1970
  creation:    Jan 30 16:12  residence:    Jan 30 16:12
  worm-capable        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# cd WORM
[root@ns-east-44]# touch test
[root@ns-east-44]# chmod 4000 test
[root@ns-east-44]# sls -D
 
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root
  length:         0  admin id:      0  inode:     1026.1
  access:      Jan 30 16:13  modification: Jan 30 16:13
  changed:     Jan 30 16:13  retention-end: Mar  1 16:13 2007
  creation:    Jan 30 16:13  residence:    Jan 30 16:13
  retention:   active        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# rm test
rm: test: override protection 444 (yes/no)? yes
[root@ns-east-44]# ls
[root@ns-east-44]#

Example 3. WORM trigger is chmod -w

Simple application of the WORM trigger using WORM emulation mode:


[root@ns-east-44]# grep -i worm /etc/vfstab
samfs1 -       /samfs1  samfs   -       no     bg,worm_emul
 
[root@ns-east-44]# mount samfs1
[root@ns-east-44]# cd /samfs1
[root@ns-east-44]# mkdir WORM
[root@ns-east-44]# chmod -w WORM
[root@ns-east-44]# sls -D
 
WORM:
  mode: drwxr-xr-x  links:   2  owner: root      group: root
  length:      4096  admin id:      0  inode:     1025.1
  access:      Jan 30 16:26  modification: Jan 30 16:26
  changed:     Jan 30 16:26  attributes:   Jan  1  1970
  creation:    Jan 30 16:26  residence:    Jan 30 16:26
  worm-capable        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# cd WORM
[root@ns-east-44]# touch test
[root@ns-east-44]# chmod -w test
[root@ns-east-44]# sls -D
 
test:
  mode: -r--r--r--  links:   1  owner: root      group: root
  length:         0  admin id:      0  inode:     1026.1
  access:      Jan 30 16:26  modification: Jan 30 16:26
  changed:     Jan 30 16:26  retention-end: Mar  1 16:26 2007
  creation:    Jan 30 16:26  residence:    Jan 30 16:26
  retention:   active        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# rm test
rm: test: override protection 444 (yes/no)? yes
rm: test not removed: Read-only file system
[root@ns-east-44]# ls
test

Example 4. WORM trigger is chmod -w

Simple application of the WORM trigger using WORM emulation lite mode:


[root@ns-east-44]# grep -i worm /etc/vfstab
samfs1 -       /samfs1  samfs   -       no     bg,emul_lite
 
[root@ns-east-44]# mount samfs1
[root@ns-east-44]# cd /samfs1
[root@ns-east-44]# mkdir WORM
[root@ns-east-44]# chmod -w WORM
[root@ns-east-44]# sls -D
 
WORM:
  mode: drwxr-xr-x  links:   2  owner: root      group: root
  length:      4096  admin id:      0  inode:     1025.1
  access:      Jan 30 16:36  modification: Jan 30 16:36
  changed:     Jan 30 16:36  attributes:   Jan  1  1970
  creation:    Jan 30 16:36  residence:    Jan 30 16:36
  worm-capable        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# cd WORM
[root@ns-east-44]# touch test
[root@ns-east-44]# chmod -w test
[root@ns-east-44]# sls -D
 
test:
  mode: -r--r--r--  links:   1  owner: root      group: root
  length:         0  admin id:      0  inode:     1026.1
  access:      Jan 30 16:36  modification: Jan 30 16:36
  changed:     Jan 30 16:36  retention-end: Mar  1 16:36 2007
  creation:    Jan 30 16:36  residence:    Jan 30 16:36
  retention:   active        retention-period: 0y, 30d, 0h, 0m
 
[root@ns-east-44]# rm test
rm: test: override protection 444 (yes/no)? yes
[root@ns-east-44]# ls
[root@ns-east-44]#



Note - Use care when applying the WORM trigger. The file data and path cannot be changed after the file has the WORM feature applied. Once this feature is applied to a file, it is irrevocable. Further, once the WORM trigger is applied to a file, its volume also becomes a WORM volume and remains that way. The volume can only be destroyed by using a volume management or RAID interface. If one of the WORM "lite" options was used to create it, the volume can also be rebuilt by using sammkfs.



Retention Periods

The WORM-FS feature also includes file-retention periods that can be customized. Assigning a retention period to a file maintains the WORM features in that file for the specified period of time.



Note - When you initially assign or extend a retention period by using Solaris/UNIX utilities, the period cannot extend beyond 01/18/2038. This is because these utilities use signed 32-bit numbers to represent time in seconds, measured from the epoch, January 1, 1970; 2^31 seconds from the epoch extends to 01/18/2038 at around 10:14 PM. You can, however, exceed this date by using a default retention period. See Setting the Default Retention Period.



Do one of the following to set a retention period for a file:

- Advance the file's access time by using the touch utility, and then apply the WORM trigger. The retention period is the difference between the access time and the time at which the WORM trigger is applied. See Setting the Retention Period Using touch.
- Apply the WORM trigger directly, in which case the file system's default retention period is used.

CODE EXAMPLE 7-4 shows the creation of a file in a WORM-capable directory, using the WORM trigger on the file (with the chmod 4000 command), and using the sls command to display the file's WORM features. This example uses the default retention period of the file system (60 minutes, as set in CODE EXAMPLE 7-3).


CODE EXAMPLE 7-4 Creation of a WORM-Capable Directory and WORM File
# cd WORM
# echo "This is a test file" >> test
# sls -D
test:
  mode: -rw-r--r--  links:   1  owner: root      group: other
  length:        20  admin id:      0  inode:     1027.1
  access:      Oct 30 02:50  modification: Oct 30 02:50
  changed:     Oct 30 02:50  attributes:   Oct 30 02:50
  creation:    Oct 30 02:50  residence:    Oct 30 02:50
  checksum: gen  no_use  not_val  algo: 0
 
# chmod 4000 test
# sls -D
test:
  mode: -r--r--r--  links:   1  owner: root      group: other
  length:        20  admin id:      0  inode:     1027.1
  access:      Oct 30 02:50  modification: Oct 30 02:50
  changed:     Oct 30 02:50  retention-end: Oct 30 2005 03:50
  creation:    Oct 30 02:50  residence:    Oct 30 02:50
  retention:   active        retention-period: 0y, 0d, 1h, 0m
  checksum: gen  no_use  not_val  algo: 0

With the addition of the WORM-FS feature, three states are possible for a file in a Sun StorageTek QFS file system:

- Normal
- Retained (active)
- Expired (over)

The normal state represents the state of an ordinary file in a Sun StorageTek QFS file system. A transition to the retained, or active, state occurs when the WORM bit is set on a file. The expired, or over, state occurs when the file's retention period is exceeded.

When a retention period is assigned to a file and the WORM trigger is applied to it, the file's path and data are immutable. When the retention period expires, the state is changed to "expired" but the path and data remain immutable.

When a file is in an expired state, only two operations are available:

- Extending the retention period
- Deleting the file

If the retention period is extended, the file's state returns to "active" and the new end date and duration are set accordingly.

Both hard and soft links to files can be used with the WORM-FS feature. Hard links can be established only with files that reside in a WORM-capable directory. After a hard link is created, it has the same WORM characteristics as the original file. Soft links can also be established, but a soft link cannot use the WORM features. Soft links to WORM files can be created in any directory in a Sun StorageTek QFS file system.

Another attribute of the WORM-FS feature is directory inheritance. New directories created under a directory that includes a WORM attribute inherit this attribute from their parent. If a directory has a default retention period set, this retention period is also inherited by any new subdirectories. The WORM bit can be set on any file whose parent directory is WORM-capable. Ordinary users can set the WORM feature on directories and files that they own or have access to by using normal UNIX permissions.



Note - A WORM-capable directory can only be deleted if it contains no WORM files.



Setting the Default Retention Period

The default retention period for a file system can be set as a mount option in the /etc/vfstab file. For example:

samfs1 - /samfs1 samfs - no bg,worm_emul,def_retention=1y60d

The format for setting the default retention period is MyNdOhPm, in which M, N, O, and P are non-negative integers and y, d, h, and m stand for years, days, hours, and minutes, respectively. Any combination of these units can be used. For example, 1y5d4h3m indicates 1 year, 5 days, 4 hours, and 3 minutes; 30d8h indicates 30 days and 8 hours; and 300m indicates 300 minutes. This format is backward compatible with software versions prior to 4U5, in which the retention period was specified in minutes. Note that although the granularity of the period is in minutes, the accuracy of the period is based on one day. Also, the function that handles days, hours, and minutes does not account for leap years when determining retention periods. Consider this when using any of these units to set the default retention period.

You can also use the default retention period to set a file or directory's retention period beyond the year 2038. To do this, set the default retention period to a value that extends beyond 2038 and mount the file system. Then use the appropriate WORM trigger to apply the default retention period. The following example shows how to use the default retention period to set a retention period on a directory and file that extends beyond the year 2038.


CODE EXAMPLE 7-5 Extending the Retention Period Beyond 2038
[root@ns-east-44]# grep samfs1 /etc/vfstab
samfs1 -       /samfs1  samfs   -       no    bg,worm_capable,def_retention=34y
[root@ns-east-44]# mount samfs1
[root@ns-east-44]# cd /samfs1
[root@ns-east-44]# mkdir WORM
[root@ns-east-44]# chmod 4000 WORM
[root@ns-east-44]# sls -D
WORM:
 mode: drwxr-xr-x  links:   2  owner: root      group: root
 length:      4096  admin id:      0  inode:     1026.1
 access:      Feb 20 14:24  modification: Feb 20 14:24
 changed:     Feb 20 14:24  attributes:   Jul 26  1970
 creation:    Feb 20 14:24  residence:    Feb 20 14:24
 worm-capable        retention-period: 34y, 0d, 0h, 0m
 
[root@ns-east-44]# cd WORM
[root@ns-east-44]# touch test
[root@ns-east-44]# chmod 4000 test
[root@ns-east-44]# sls -D
test:
 mode: -r-Sr--r--  links:   1  owner: root      group: root
 length:         0  admin id:      0  inode:     1027.1
 access:      Feb 20 14:24  modification: Feb 20 14:25
 changed:     Feb 20 14:25  retention-end: Feb 20 14:25 2041
 creation:    Feb 20 14:24  residence:    Feb 20 14:24
 retention:   active        retention-period: 34y, 0d, 0h, 0m

You can also set a default retention period for a directory using the touch utility, as described in the following section, Setting the Retention Period Using touch. This retention period overrides the default retention period for the file system and is inherited by any subdirectories.

Setting the Retention Period Using touch

You can use the touch utility to set or extend a file's or directory's retention period. You can also use touch to shorten the default retention period for a directory (but not for a file).

To set the retention period, you must first advance the file's or directory's access time using touch, and then apply the WORM trigger by using the chmod command or removing write permissions (depending on the WORM mode in place at the time).

CODE EXAMPLE 7-6 shows the use of the touch utility to set a file's retention period, followed by the application of the WORM trigger.


CODE EXAMPLE 7-6 Using touch and chmod to Set the Retention Period
# touch -a -t200508181125 test
# sls -D
test:
  mode: -rw-r--r--  links:   1  owner: root      group: root    
  length:         0  admin id:      0  inode:     1027.1
  access:      Aug 18  2005  modification: Aug 18 11:19
  changed:     Aug 18 11:19  attributes:   Aug 18 11:19
  creation:    Aug 18 11:19  residence:    Aug 18 11:19
 
# chmod 4000 test
# sls -D
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root    
  length:         0  admin id:      0  inode:     1027.1
  access:      Aug 18  2005  modification: Aug 18 11:19
  changed:     Aug 18 11:19  retention-end: Aug 18 2005 11:25
  creation:    Aug 18 11:19  residence:    Aug 18 11:19
  retention:   active        retention-period: 0y, 0d, 0h, 6m
 

The -a option for touch is used to change the access time of the file or directory. The -t option specifies what time is to be used for the access time field. The format for the time argument is [[CC]YY]MMDDhhmm[.SS], as follows:

- CC - The first two digits of the year
- YY - The last two digits of the year
- MM - The month of the year (01-12)
- DD - The day of the month (01-31)
- hh - The hour of the day (00-23)
- mm - The minute of the hour (00-59)
- SS - The second of the minute (00-61)

The CC, YY, and SS fields are optional. If CC and YY are not given, the default is the current year. See the touch(1) man page for more information on these options.

To set the retention period to permanent retention, set the access time to its largest possible value: 203801182214.07.
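For example, the following sketch applies permanent retention to a hypothetical file named test by combining this access time with the standard WORM trigger:


# touch -a -t203801182214.07 test
# chmod 4000 test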

Extending a File's Retention Period

CODE EXAMPLE 7-7 shows an example of using touch to extend a file's retention period.


CODE EXAMPLE 7-7 Using touch to Extend a File's Retention Period
# sls -D test
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root    
  length:         0  admin id:      0  inode:     1029.1
  access:      Aug 18 11:35  modification: Aug 18 11:33
  changed:     Aug 18 11:33  retention-end: Aug 18 2005 11:35
  creation:    Aug 18 11:33  residence:    Aug 18 11:33
  retention:   over          retention-period: 0y, 0d, 0h, 2m
# touch -a -t200508181159 test
# sls -D
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root    
  length:         0  admin id:      0  inode:     1029.1
  access:      Aug 18 11:35  modification: Aug 18 11:33
  changed:     Aug 18 11:33  retention-end: Aug 18 2005 11:59
  creation:    Aug 18 11:33  residence:    Aug 18 11:33
  retention:   active        retention-period: 0y, 0d, 0h, 26m

In this example, the retention period was extended to 11:59 a.m. on August 18, 2005, which is 26 minutes from the time the WORM trigger was initially applied.



Note - Using touch to extend the retention period is independent of the active WORM mode.



Using sls to View WORM-FS Files

Use the sls command to view WORM file attributes. The -D option shows whether a directory is WORM-capable. Use this option on a file to display when the retention period began, when it will end, the current retention state, and the duration as specified on the command line.

The start of the retention period is stored in the file's modified time field. The end of the retention period is stored in the file's attribute time field. This time is displayed as a calendar date. An additional line in the sls output shows the retention period state and duration.

CODE EXAMPLE 7-8 shows an example of how sls -D displays a file's retention status.


CODE EXAMPLE 7-8 Using sls to Find a File's Retention Status
# sls -D test
test:
  mode: -r-Sr--r--  links:   1  owner: root      group: root
  length:         5  admin id:      0  inode:     1027.1
  access:      Aug 18  2005  modification: Aug 18 11:19
  changed:     Aug 18 11:19  retention-end: Aug 18 2005 11:25
  creation:    Aug 18 11:19  residence:    Aug 18 11:19
  retention:   active        retention-period: 0y, 0d, 0h, 6m

In this example, the retention state is active, as shown by the retention: active designation, meaning that the file has the WORM bit set. The retention period started on August 18, 2005, at 11:19 and will end on August 18, 2005, at 11:25. The retention period was specified to be 0 years, 0 days, 0 hours, and 6 minutes.

Using sfind to Find WORM-FS Files

Use the sfind utility to search for files that have certain retention periods. See the sfind(1) man page for more information on the options. The following options are available:

- -ractive - Finds files whose retention period is active.
- -rover - Finds files whose retention period has expired.
- -rafter date - Finds files whose retention period ends after the specified date.
- -rremain time - Finds files whose retention period has at least the specified amount of time remaining.
- -rlonger time - Finds files whose retention period is longer than the specified time.
- -rpermanent - Finds files whose retention period is permanent.

For example, CODE EXAMPLE 7-9 shows the command to find files whose retention period expires after 12/24/2004 at 15:00.


CODE EXAMPLE 7-9 Using sfind to Find All WORM Files That Expire After a Certain Date
# sfind -rafter 200412241500

CODE EXAMPLE 7-10 shows the command to find files for which more than 1 year, 10 days, 5 hours, and 10 minutes remain before expiration.


CODE EXAMPLE 7-10 Using sfind to Find All WORM Files With More Than a Specified Time Remaining
# sfind -rremain 1y10d5h10m

CODE EXAMPLE 7-11 shows the command to find files that have retention periods longer than 10 days.


CODE EXAMPLE 7-11 Using sfind to Find All WORM Files With Longer Than a Specified Retention Period
# sfind -rlonger 10d


Accommodating Large Files

When manipulating very large files, pay careful attention to the size of disk cache that is available on the system. If you try to write a file that is larger than your disk cache, behavior differs depending on the type of file system that you are using:

- If you are using a Sun StorageTek QFS file system, the system returns an ENOSPC error when the available disk cache is exhausted.
- If you are operating in a SAM-QFS environment, portions of the file can be archived and released from the disk cache, so a file larger than the disk cache can still be written, as described below.

If you are operating within a SAM-QFS environment and your application must write a file that is larger than the disk cache, you can segment the file with the segment(1) command. For more information about the segment(1) command, see the segment(1) man page or see the Sun StorageTek Storage Archive Manager Archive Configuration and Administration Guide.


Configuring a Multireader File System

The multireader file system consists of a single writer host and multiple reader hosts. The writer and reader mount options that enable the multireader file system are compatible with Sun StorageTek QFS file systems only. The mount options are described in this section and on the mount_samfs(1M) man page.

You can mount the multireader file system on the single writer host by specifying the -o writer option with the mount(1M) command. The host system with the writer mount option is the only host system that is allowed to write to the file system. The writer host system updates the file system. You must ensure that only one host in a multireader file system has the file system mounted with the writer mount option enabled. If -o writer is specified, directories are written through to disk at each change and files are written through to disk at close.



caution icon

Caution - The multireader file system can become corrupted if more than one writer host has the file system mounted at one time. It is the site administrator's responsibility to ensure that this situation does not occur.



You can mount a multireader file system on one or more reader hosts by specifying the -o reader option with the mount(1M) command. There is no limit to the number of host systems that can have the multireader file system mounted as a reader.
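For illustration, assuming a hypothetical file system named samfs1 mounted at /samfs1, the writer and reader mounts would take the following form:


writer-host# mount -F samfs -o writer samfs1 /samfs1
reader-host# mount -F samfs -o reader samfs1 /samfs1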

A major difference between the multireader file system and the Sun StorageTek QFS shared file system is that the multireader hosts read metadata from the disk, whereas the client hosts of a Sun StorageTek QFS shared file system read metadata over the network. The Sun StorageTek QFS shared file system supports multireader hosts. In this configuration, multiple shared hosts can be adding content while multiple reader hosts are distributing content.



Note - You cannot specify the writer option on any host if you are mounting the file system as a Sun StorageTek QFS shared file system. You can, however, specify the reader option.

If you want a Sun StorageTek QFS shared file system client host to be a read-only host, mount the Sun StorageTek QFS shared file system on that host with the reader mount option. In addition, set the sync_meta mount option to 1 if you use the reader option in a Sun StorageTek QFS shared file system. For more information about the Sun StorageTek QFS shared file system, see Configuring a Sun StorageTek QFS Shared File System. For more information about mount options, see the mount_samfs(1M) man page.



You must ensure that all readers in a multireader file system have access to the device definitions that describe the ma device. Copy the lines from the mcf file that resides on the primary metadata server host to the mcf files on the alternate metadata servers. After copying the lines, you might need to update the information about the disk controllers because, depending on your configuration, disk partitions might not show up the same way across all hosts.

In a multireader file system environment, the Sun StorageTek QFS software ensures that all servers accessing the same file system can always access the current environment. When the writer closes a file, the Sun StorageTek QFS file system immediately writes all information for that file to disk. A reader host can access a file after the file is closed by the writer. You can specify the refresh_at_eof mount option to help ensure that no host system in a multireader file system gets out of sync with the file system.

By default, the metadata information for a file on a reader host is invalidated and refreshed every time a file is accessed. If the data has changed, it is invalidated. This includes any type of access, whether through cat(1), ls(1), touch(1), open(2), or other methods. This immediate refresh rate ensures that the data is correct at the time the refresh is done, but it can affect performance. Depending on your site preferences, you can use the mount(1M) command's -o invalid=n option to specify a refresh rate between 0 seconds and 60 seconds. If the refresh rate is set to a small value, the Sun StorageTek QFS file system reads the directory and other metadata information n seconds after the last refresh. More frequent refreshes result in more overhead for the system, but stale information can exist if n is nonzero.
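For example, a reader host could be mounted with a 30-second refresh delay as follows (a sketch; samfs1 and /samfs1 are hypothetical names):


# mount -F samfs -o reader,invalid=30 samfs1 /samfs1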



caution icon

Caution - If a file is open for a read on a reader host, there is no protection against that file's being removed or truncated by the writer. You must use another mechanism, such as application locking, to protect the reader from inadvertent writer actions.




Using the SAN-QFS File System in a Heterogeneous Computing Environment

The SAN-QFS file system enables multiple hosts to access the data stored in a Sun StorageTek QFS system at full disk speeds. This capability can be especially useful for database, data streaming, web page services, or any application that demands high-performance, shared-disk access in a heterogeneous environment.

You can use the SAN-QFS file system in conjunction with fibre-attached devices in a storage area network (SAN). The SAN-QFS file system enables high-speed access to data through Sun StorageTek QFS software and software such as Tivoli SANergy file-sharing software. To use the SAN-QFS file system, you must have both the SANergy (2.2.4 or later) and the Sun StorageTek QFS software. For information about the levels of Sun StorageTek QFS and SANergy software that are supported, contact your Sun sales representative.



Note - In environments that include the Solaris OS and supported Linux operating systems, use the Sun StorageTek QFS shared file system, not the SAN-QFS file system, on the Solaris hosts.

For information about the Sun StorageTek QFS shared file system, see Configuring a Sun StorageTek QFS Shared File System. For a comparison of the Sun StorageTek QFS shared file system and the SAN-QFS file system, see SAN-QFS Shared File System and Sun StorageTek QFS Shared File System Comparison.



FIGURE 7-1 depicts a SAN-QFS file system that uses both the Sun StorageTek QFS software and the SANergy software and shows that the clients and the metadata controller (MDC) system manage metadata across the local area network (LAN). The clients perform I/O directly to and from the storage devices.

Note that all clients running only the Solaris OS are hosting the Sun StorageTek QFS software, and that all heterogeneous clients running an OS other than Solaris are hosting the SANergy software and the NFS software. The SAN-QFS file system's metadata server hosts both the Sun StorageTek QFS and the SANergy software. This server acts not only as the metadata server for the file system but also as the SANergy MDC.



Note - The SANergy software is not supported on x64 hardware platforms.




FIGURE 7-1 SAN-QFS File System Using Sun StorageTek QFS Software and SANergy Software

The rest of this section describes other aspects of the SAN-QFS file system:

- Before You Begin
- Enabling the SAN-QFS File System
- Unmounting the SAN-QFS File System
- Troubleshooting: Unmounting a SAN-QFS File System With SANergy File Holds
- Block Quotas in a SAN-QFS File System
- File Data and File Attributes in a SAN-QFS File System
- Using samgrowfs(1M) to Expand SAN-QFS File Systems
- SAN-QFS Shared File System and Sun StorageTek QFS Shared File System Comparison

Before You Begin

Before you enable the SAN-QFS file system, keep your site's configuration considerations in mind and plan accordingly.



Note - This documentation assumes that your non-Solaris clients are hosting SANergy software and NFS software for file system sharing. The text and examples in this document reflect this configuration. If your non-Solaris clients host the Samba software instead of the NFS software, see your Samba documentation.



Enabling the SAN-QFS File System

The following procedures describe how to enable the SAN-QFS file system. Perform these procedures in the order in which they are presented:

- To Enable the SAN-QFS File System on the Metadata Controller
- To Enable the SAN-QFS File System on the Clients
- To Install the SANergy Software on the Clients


procedure icon  To Enable the SAN-QFS File System on the Metadata Controller

When you use the SAN-QFS file system, one host system in your environment acts as the SANergy metadata controller (MDC). This is the host system upon which the Sun StorageTek QFS file system resides.

1. Log in to the host upon which the Sun StorageTek QFS file system resides and become superuser.

2. Verify that the Sun StorageTek QFS file system is tested and fully operational.

3. Install and configure the SANergy software.

For instructions, see your SANergy documentation.

4. Use the pkginfo(1) command to verify the SANergy software release level:


# pkginfo -l SANergy

5. Ensure that the file system is mounted.

Use the mount(1M) command either to verify the mount or to mount the file system.

6. Use the share(1M) command in the following format to enable NFS access to client hosts:


MDC# share -F nfs -d qfs-file-system-name /mount-point

For qfs-file-system-name, specify the name of your Sun StorageTek QFS file system, such as qfs1. For more information about the share(1M) command, see the share(1M) or share_nfs(1M) man page.

For mount-point, specify the mount point of qfs-file-system-name.

7. If you are connecting to Microsoft Windows clients, configure Samba, rather than NFS, to provide security and namespace features.

To do this, add the SANERGY_SMBPATH environment variable in the /etc/init.d/sanergy file and point it to the location of the Samba configuration file. For example, if your Samba configuration file is named /etc/sfw/smb.conf, you must add the following lines to the beginning of your /etc/init.d/sanergy file:

SANERGY_SMBPATH=/etc/sfw/smb.conf
export SANERGY_SMBPATH

8. (Optional) Edit the file system table (/etc/dfs/dfstab) on the MDC to enable access at boot time.

Perform this step if you want to automatically enable this access at boot time.


procedure icon  To Enable the SAN-QFS File System on the Clients

After you have enabled the file system on the MDC, you are ready to enable it on the client hosts. The SAN-QFS file system supports several types of client hosts, including IRIX, Microsoft Windows, AIX, and Linux hosts. For information about the specific clients supported, contact your Sun sales representative.

Every client has different operational characteristics. This procedure uses general terms to describe the actions you must take to enable the SAN-QFS file system on the clients. For information specific to your clients, see the documentation provided with your client hosts.

1. Log in to each of the client hosts.

2. Edit the file system defaults table on each client and add the file system.

For example, on a Solaris OS, edit the /etc/vfstab file on each client and add the name of your Sun StorageTek QFS file system, as follows:


server:/qfs1  -  /qfs1  nfs  -  yes  noac,hard,intr,timeo=1000

On other operating system platforms, the file system defaults table might reside in a file other than /etc/vfstab. For example, on Linux systems, this file is /etc/fstab.

For more information about editing the /etc/vfstab file, see Sun StorageTek QFS Installation and Upgrade Guide. For information about required or suggested NFS client mount options, see your SANergy documentation.


procedure icon  To Install the SANergy Software on the Clients

After enabling the file system on the client hosts, you are ready to install the SANergy software on the clients. The following procedure describes the SANergy installation process in general terms.

1. Install and configure the SANergy software.

For instructions, see your SANergy documentation.

2. Use the mount command to NFS mount the file system.

For example:


# mount host:/mount-point local-mount-point

For host, specify the MDC.

For mount-point, specify the mount point of the Sun StorageTek QFS file system on the MDC.

For local-mount-point, specify the mount point on the SANergy client.

3. Use the SANergy fuse command to fuse the software:


# fuse mount-point

For mount-point, specify the mount point on the SANergy client.

Unmounting the SAN-QFS File System

The following procedures describe how to unmount a SAN-QFS file system that is using the SANergy software. Perform these procedures in the order in which they are presented:


procedure icon  To Unmount the SAN-QFS File System on the SANergy Clients

Follow these steps for each client host on which you want to unmount the SAN-QFS file system.

1. Log in to the client host and become superuser.

2. Use the SANergy unfuse command to unfuse the file system from the software:


# unfuse mount-point

For mount-point, specify the mount point on the SANergy client.

3. Use the umount(1M) command to unmount the file system from NFS:


# umount host:/mount-point local-mount-point

For host, specify the MDC.

For mount-point, specify the mount point of the Sun StorageTek QFS file system on the MDC.

For local-mount-point, specify the mount point on the SANergy client.


procedure icon  To Unmount the SAN-QFS File System on the Metadata Controller

1. Log in to the MDC system and become superuser.

2. Use the unshare(1M) command to disable NFS access to client hosts:


MDC# unshare /mount-point

For mount-point, specify the mount point of your Sun StorageTek QFS file system, such as /qfs1. For more information about the unshare(1M) command, see the unshare(1M) man page.


procedure icon  To Unmount the SAN-QFS File System on the Sun StorageTek QFS Clients

Follow these steps on each participating client host.

1. Log in to a Sun StorageTek QFS client host and become superuser.

2. Use the umount(1M) command to unmount the file system.

For example:


# umount /qfs1


procedure icon  To Unmount the SAN-QFS File System on the Sun StorageTek QFS Server

1. Log in to the host system upon which the Sun StorageTek QFS file system resides and become superuser.

2. Use the umount(1M) command to unmount the file system.

Troubleshooting: Unmounting a SAN-QFS File System With SANergy File Holds

SANergy software issues holds on Sun StorageTek QFS files to reserve them temporarily for accelerated access. If SANergy crashes when holds are in effect, you will not be able to unmount the file system. If you are unable to unmount a SAN-QFS file system, examine the /var/adm/messages file and look for console messages that describe outstanding SANergy holds.

Whenever possible, allow the SANergy file-sharing function to clean up its holds, but in an emergency, or in case of a SANergy file-sharing system failure, use the following procedure to avoid a reboot.


procedure icon  To Unmount a File System in the Presence of SANergy File Holds

1. Use the unshare(1M) command to disable NFS access.

2. Use the samunhold(1M) command to release the SANergy file system holds.

For more information about this command, see the samunhold(1M) man page.

3. Use the umount(1M) command to unmount the file system.

Block Quotas in a SAN-QFS File System

The SANergy software does not enforce block quotas. Therefore, it is possible for you to exceed a block quota when writing a file with the SANergy software. For more information on quotas, see Enabling Quotas.

File Data and File Attributes in a SAN-QFS File System

The SANergy software uses the NFS software for metadata operations, which means that the NFS close-to-open consistency model is used for file data and attributes. File data and attributes among SANergy clients do not support the POSIX coherency model for open files.

Using samgrowfs(1M) to Expand SAN-QFS File Systems

You can use the samgrowfs(1M) command to increase the size of a SAN-QFS file system. To perform this task, follow the procedures described in Adding Disk Cache to a File System.
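For example, after the new devices have been appended to the mcf file, the file system is grown with a command of the following form (a sketch; samfs1 is a hypothetical family set name):


# samgrowfs samfs1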



caution icon

Caution - When using this procedure, be aware that the line-by-line device order in the mcf file must match the order of the devices listed in the file system's superblock.



When the samgrowfs(1M) command is issued, the devices that were already in the mcf file keep their positions in the superblock. New devices are written to subsequent entries in the order in which they are encountered.

If this new order does not match the order in the superblock, the SAN-QFS file system cannot be fused.

SAN-QFS Shared File System and Sun StorageTek QFS Shared File System Comparison

The SAN-QFS shared file system and the Sun StorageTek QFS shared file system have a number of similarities.

TABLE 7-3 describes differences between the file systems.


TABLE 7-3 SAN-QFS Shared File System Versus Sun StorageTek QFS Shared File System

SAN-QFS File System                                 Sun StorageTek QFS Shared File System
Uses NFS protocol for metadata.                     Uses natural metadata.
Preferred in heterogeneous computing environments   Preferred in homogeneous Solaris OS
(that is, when not all hosts are Sun systems).      environments.
Useful in environments where multiple,              Preferred when multiple hosts must write
heterogeneous hosts must be able to write data.     to the same file at the same time.



Understanding I/O Types

The Sun StorageTek QFS file systems support paged I/O, direct I/O, and switching between the I/O types. The following sections describe these I/O types.

Paged I/O

When paged I/O is used, user data is cached in virtual memory pages, and the kernel writes the data to disk. The standard Solaris OS interfaces manage paged I/O. Paged I/O (also called buffered or cached I/O) is selected by default.

Direct I/O

Direct I/O is a process by which data is transferred directly between the user's buffer and the disk. This means that much less time is spent in the system. For performance purposes, specify direct I/O only for large, block-aligned, sequential I/O.

The setfa(1) command and the sam_setfa(3) library routine both have a -D option that sets the direct I/O attribute for a file or directory. If applied to a directory, files and directories created in that directory inherit the direct I/O attribute. After the -D option is set, the file uses direct I/O.
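For example, the following command sets the direct I/O attribute on a hypothetical file named /qfs/dbfile:


# setfa -D /qfs/dbfile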

You can also select direct I/O for a file by using the Solaris OS directio(3C) function call. If you use the function call to enable direct I/O, the setting lasts only while the file is active.

To enable direct I/O on a file-system basis, do one of the following:

- Specify the -o forcedirectio option with the mount(1M) command.
- Place the forcedirectio directive in the mount option column of the /etc/vfstab file, or use it as a directive in the samfs.cmd file.
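As a sketch, assuming a hypothetical file system named samfs1, the mount command form would be:


# mount -F samfs -o forcedirectio samfs1 /samfs1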

For more information, see the setfa(1), sam_setfa(3), directio(3C), samfs.cmd(4), and mount_samfs(1M) man pages.

I/O Switching

By default, paged I/O is performed, and I/O switching is disabled. However, the Sun StorageTek QFS file systems support automatic I/O switching, a process by which a site-defined amount of paged I/O occurs before the system switches automatically to direct I/O.

I/O switching should reduce page cache usage on large I/O operations. To enable I/O switching, use samu(1M), or use the dio_wr_consec and dio_rd_consec parameters as directives in the samfs.cmd file or as options with the mount(1M) command.
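For example, a mount of the following form (a sketch with hypothetical threshold values) switches a file to direct I/O after three consecutive eligible reads or writes:


# mount -F samfs -o dio_rd_consec=3,dio_wr_consec=3 samfs1 /samfs1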

For more information about these options, see the mount_samfs(1M) or samfs.cmd(4) man pages.


Increasing File Transfer Performance for Large Files

Sun StorageTek QFS file systems are tuned to work with a mix of file sizes. You can increase the performance of disk file transfers for large files by enabling file system settings.



Note - Sun recommends that you experiment with performance tuning outside of a production environment. Tuning these variables incorrectly can have unexpected effects on the overall system.

If your site has a Sun Enterprise Services (SES) support contract, please inform SES if you change performance tuning parameters.




procedure icon  To Increase File Transfer Performance

1. Set the maximum device read/write directive.

The maxphys parameter in the Solaris /etc/system file controls the maximum number of bytes that a device driver reads or writes at any one time. The default value for the maxphys parameter can differ, depending on the level of your Sun Solaris OS, but it is typically around 128 kilobytes.

Add the following line to /etc/system to set maxphys to 8 megabytes:


set maxphys = 0x800000



Note - The maxphys value must be set to a power of two.



2. Set the SCSI disk maximum transfer parameter.

The sd driver enables large transfers for a specific file by looking for the sd_max_xfer_size definition in the /kernel/drv/sd.conf file. If this definition does not exist, the driver uses the value defined in the sd device driver definition, sd_max_xfer_size, which is 1024 x 1024 bytes.

To enable and encourage large transfers, add the following line at the end of the /kernel/drv/sd.conf file:


sd_max_xfer_size=0x800000;

3. Set the fibre disk maximum transfer parameter.

The ssd driver enables large transfers for a specific file by looking for the ssd_max_xfer_size definition in the /kernel/drv/ssd.conf file. If this definition does not exist, the driver uses the value defined in the ssd device driver definition, ssd_max_xfer_size, which is 1024 x 1024 bytes.

Add the following line at the end of the /kernel/drv/ssd.conf file:


ssd_max_xfer_size=0x800000;



Note - On Solaris 10 x86 platforms, this change is made in the /kernel/drv/sd.conf file. For a maximum transfer size of 8 MBytes, the following line is added.
sd_max_xfer_size=0x800000



4. Reboot the system.

5. Set the writebehind parameter.

This step affects paged I/O only.

The writebehind parameter specifies the number of bytes that are written behind by the file system when paged I/O is being performed on a Sun StorageTek QFS file system. Matching the writebehind value to a multiple of the RAID's read-modify-write value can increase performance.

This parameter is specified in units of kilobytes and is truncated to an 8-kilobyte multiple. If set, this parameter is ignored when direct I/O is performed. The default writebehind value is 512 kilobytes. This value favors large-block, sequential I/O.

Set the writebehind size to a multiple of the RAID-5 stripe size for both hardware and software RAID-5. The RAID-5 stripe size is the number of data disks multiplied by the configured stripe width.

For example, assume that you configure a RAID-5 device with three data disks plus one parity disk (3+1) with a stripe width of 16 kilobytes. The writebehind value should be 48 kilobytes, 96 kilobytes, or some other multiple, to avoid the overhead of the read-modify-write RAID-5 parity generation.

For Sun StorageTek QFS file systems, the DAU (sammkfs(1M) -a command) should also be a multiple of the RAID-5 stripe size. This allocation ensures that the blocks are contiguous.

You should test the system performance after resetting the writebehind size. The following example shows how to time disk writes:


# timex dd if=/dev/zero of=/sam/myfile bs=256k count=2048

You can set the writebehind parameter from a mount option, from within the samfs.cmd file, from within the /etc/vfstab file, or from a command within the samu(1M) utility. For information about enabling this from a mount option, see the -o writebehind=n option on the mount_samfs(1M) man page. For information about enabling this from the samfs.cmd file, see the samfs.cmd(4) man page. For information about enabling this from within samu(1M), see the samu(1M) man page.
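For example, either of the following sketches sets writebehind to 96 kilobytes, matching the 3+1, 16-kilobyte-stripe RAID-5 example above (the mount point /sam is hypothetical). As a mount option:


# mount -F samfs -o writebehind=96 /sam


Or as a directive in the samfs.cmd file:


writebehind = 96
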

6. Set the readahead parameter.

This step affects paged I/O only.

The readahead parameter specifies the number of bytes that are read ahead by the file system when paged I/O is being performed on a Sun StorageTek QFS file system. This parameter is specified in units of kilobytes and is truncated to an 8-kilobyte multiple. If set, this parameter is ignored when direct I/O is performed.

Increasing the size of the readahead parameter increases the performance of large file transfers, but only to a point. You should test the performance of the system after resetting the readahead size, until you see no further improvement in transfer rates. The following example shows how to time disk reads:


# timex dd if=/sam/myfile of=/dev/null bs=256k

You should test various readahead sizes for your environment. The readahead parameter should be set to a size that increases the I/O performance for paged I/O, but is not so large as to hurt performance. It is also important to consider the amount of memory and number of concurrent streams when you set the readahead value. Setting the readahead value multiplied by the number of streams to a value that is greater than memory can cause page thrashing.

The default readahead value is 1024 kilobytes. This value favors large-block, sequential I/O. For short-block, random I/O applications, set readahead to the typical request size. Database applications do their own read-ahead, so for these applications, set readahead to 0.

The readahead setting can be enabled from a mount option, from within the samfs.cmd file, from within the /etc/vfstab file, or from a command within the samu(1M) utility. For information about enabling this setting from a mount option, see the -o readahead=n option on the mount_samfs(1M) man page. For information about enabling this setting from the samfs.cmd file, see the samfs.cmd(4) man page. For information about enabling this setting from within samu(1M), see the samu(1M) man page.
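For example, the following samfs.cmd directive doubles the default readahead to 2048 kilobytes (an illustrative starting point for testing, not a recommendation):


readahead = 2048
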

7. Set the stripe width.

The -o stripe=n option with the mount(1M) command specifies the stripe width for the file system. The stripe width is based on the disk allocation unit (DAU) size. The n argument specifies that n x DAU bytes are written to one device before writing switches to the next device. The DAU size is set when the file system is initialized by the sammkfs(1M) -a command.

If -o stripe=0 is set, files are allocated to file system devices using the round-robin allocation method. With this method, each file is completely allocated on one device until that device is full. Round-robin is the preferred setting for a multistream environment. If -o stripe=n is set to an integer greater than 0, files are allocated to file system devices using the stripe method. To determine the appropriate -o stripe=n setting, try varying the setting and taking performance readings. Striping is the preferred setting for turnkey applications with a required bandwidth.

You can also set the stripe width from the /etc/vfstab file or from the samfs.cmd file.
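For example, a hypothetical /etc/vfstab entry (the family set name samfs1 and the mount point /sam are assumptions) that mounts the file system with a stripe width of two DAUs:


# hypothetical entry: family set samfs1 mounted on /sam
samfs1   -   /sam   samfs   -   yes   stripe=2
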

For more information about the mount(1M) command, see the mount_samfs(1M) man page. For more information about the samfs.cmd file, see the samfs.cmd(4) man page.


Enabling Qwrite Capability

By default, Sun StorageTek QFS file systems disable simultaneous reads and writes to the same file. This is the mode defined by the UNIX vnode interface standard, which gives exclusive access to one writer at a time while other writers and readers must wait. Qwrite enables simultaneous reads and writes to the same file from different threads.

The Qwrite feature can be used in database applications to enable multiple simultaneous transactions to the same file. Database applications typically manage large files and issue simultaneous reads and writes to the same file. Unfortunately, each system call to a file acquires and releases a read/write lock inside the kernel. This lock prevents overlapped (or simultaneous) operations to the same file. If the application itself implements file-locking mechanisms, the kernel-locking mechanism impedes performance by unnecessarily serializing I/O.

Qwrite can be enabled in the /etc/vfstab file, in the samfs.cmd file, and as a mount option. The -o qwrite option with the mount(1M) command bypasses the file system locking mechanisms (except for applications accessing the file system through the network file system [NFS]) and lets the application control data access. If qwrite is specified, the file system enables simultaneous reads and writes to the same file from different threads. This option improves I/O performance by queuing multiple requests at the drive level.

The following example uses the mount(1M) command to enable Qwrite on a database file system:


# mount -F samfs -o qwrite /db
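
The same setting can be sketched in the samfs.cmd file for a hypothetical file system named samfs1:


fs = samfs1
qwrite
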

For more information about this feature, see the qwrite directive on the samfs.cmd(4) man page or the -o qwrite option on the mount_samfs(1M) man page.


Setting the Write Throttle

The -o wr_throttle=n option limits the number of outstanding write kilobytes for one file to n. By default, Sun StorageTek QFS file systems set the wr_throttle to 16 megabytes.

If a file has n write kilobytes outstanding, the system suspends an application that attempts to write to that file until enough bytes have completed the I/O to allow the application to be resumed.

If your site has thousands of streams, such as thousands of NFS-shared workstations accessing the file system, you can tune the -o wr_throttle=n option to avoid flushing excessive amounts of memory to disk at once. Generally, the number of streams multiplied by n x 1024 bytes (the per-file limit) should be less than the total size of the host system's memory minus the memory needs of the Solaris OS, as shown in this formula:


number-of-streams x n x 1024 < total-memory - Solaris-OS-memory-needs
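For example (illustrative numbers only): 500 streams writing files with the default n of 16,384 kilobytes can have up to 500 x 16384 x 1024 bytes, or roughly 8 gigabytes, of outstanding writes. A host with 16 gigabytes of memory and about 2 gigabytes of Solaris OS overhead satisfies the formula; a host with 8 gigabytes does not, and would call for a smaller n.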

For turnkey applications, you might want to use a size larger than the default 16,384 kilobytes, because this keeps more pages in memory.


Setting the Flush-Behind Rate

Two mount parameters control the flush-behind rate for pages written sequentially and for stage pages. The flush_behind and stage_flush_behind mount parameters are read from the samfs.cmd file, the /etc/vfstab file, or the mount(1M) command.

The flush_behind=n mount parameter sets the maximum flush-behind value. Modified pages that are being written sequentially are written to disk asynchronously to help the Solaris Volume Manager (SVM) layer keep pages clean. To enable this feature, set n to be an integer from 16 through 8192. By default, n is set to 0, which disables this feature. The n argument is specified in kilobyte units.

The stage_flush_behind=n mount parameter sets the maximum stage flush-behind value. Stage pages that are being staged are written to disk asynchronously to help the SVM layer keep pages clean. To enable this feature, set n to be an integer from 16 through 8192. By default, n is set to 0, which disables this feature. The n argument is specified in kilobyte units.
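For example, the following samfs.cmd directives enable both features with an illustrative 256-kilobyte maximum:


flush_behind = 256
stage_flush_behind = 256
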

For more information about these mount parameters, see the mount_samfs(1M) man page or the samfs.cmd(4) man page.


Tuning the Number of Inodes and the Inode Hash Table

The Sun StorageTek QFS file system enables you to set two tunable parameters, ninodes and nhino, in the /etc/system file:

To enable nondefault settings for these parameters, edit the /etc/system file, and then reboot your system.

The following subsections describe these parameters in more detail.

The ninodes Parameter

The ninodes parameter specifies the maximum number of default inodes. The value for ninodes determines the number of in-core inodes that Sun StorageTek QFS software keeps allocated to itself, even when applications are not using many inodes.

The format for this parameter in the /etc/system file is as follows:


set samfs:ninodes = value

The range for value is from 16 through 2000000. The default value for ninodes is one of the following:

The nhino Parameter

The nhino parameter specifies the size of the in-core inode hash table.

The format for this parameter in the /etc/system file is as follows:


set samfs:nhino = value

The range for value is 1 through 1048576, and value must be a nonzero power of 2. If nhino is not set, the default value is derived from ninodes: the ninodes value divided by 8 and then, if necessary, rounded up to the nearest power of 2.

For example, if ninodes is set to 8000 and nhino is not set, the system assumes 1024, which is 8000 divided by 8 and then rounded up to the nearest power of 2 (2^10).

When to Set the ninodes and nhino Parameters

When searching for an inode by number (after obtaining an inode number from a directory or after extracting an inode number from an NFS file handle), a Sun StorageTek QFS file system searches its cache of in-core inodes. To speed this process, the file system maintains a hash table to decrease the number of inodes it must check.

A larger hash table reduces the number of comparisons and searches, at a modest cost in memory usage. If the nhino value is too large, the system is slower when undertaking operations that sweep through the entire inode list (inode syncs and unmounts). For sites that manipulate large numbers of files and sites that do extensive amounts of NFS I/O, it can be advantageous to set these parameter values to larger than the defaults.
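For example, an /etc/system sketch for a hypothetical host doing heavy NFS I/O (the values are illustrative; note that 16384 is 120000 divided by 8 and then rounded up to the nearest power of 2):


* illustrative values for a large NFS-serving host
set samfs:ninodes = 120000
set samfs:nhino = 16384
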

If your site has file systems that contain only a small number of files, it might be advantageous to make these numbers smaller than the defaults. This could be the case, for example, if you have a file system into which you write large single-file tar(1) files to back up other file systems.