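smartd is a daemon that monitors the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into many ATA-3 and later ATA, IDE, and SCSI-3 hard drives. The purpose of SMART is to monitor the reliability of the hard drive, predict drive failures, and carry out different types of drive self-tests. This version of smartd is compatible with ATA/ATAPI-7 and earlier standards.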
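smartd attempts to enable SMART monitoring on ATA devices (equivalent to smartctl -s on), polls these devices and SCSI devices every 30 minutes (configurable), and logs SMART errors and changes in SMART attributes through the SYSLOG interface. By default these SYSLOG notifications and warnings are written to /var/log/messages. To change the default location, see the '-l' command-line option in the smartd man page.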
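To review what smartd has logged, the syslog file can be filtered for its entries. The following is a minimal sketch, assuming the default /var/log/messages location mentioned above and the usual syslog tag format "smartd[PID]:". The function name smartd_log_lines is made up for illustration.

```shell
#!/bin/sh
# Print only the lines that smartd wrote to a syslog-style file.
# Assumption: entries are tagged "smartd[PID]:", the usual syslog format.
smartd_log_lines() {
    # $1: log file to search (defaults to /var/log/messages)
    grep 'smartd\[[0-9]*]:' "${1:-/var/log/messages}"
}

# Example (reading the log typically needs root):
#   smartd_log_lines /var/log/messages | tail -n 20
```

If syslog has been configured to send smartd messages elsewhere, pass that file as the argument instead.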
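In addition to logging to a file, smartd can be configured to send email warnings when problems are detected. Depending on the type of problem, you may want to run self-tests on the disk, back up the disk, replace the disk, or use a manufacturer's utility to force reallocation of bad or unreadable disk sectors. If disk problems are detected, see the smartctl manual page and the smartmontools web page/FAQ for further guidance.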
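Email warnings are requested per device in smartd.conf with the -m directive. As a rough sketch, the helper below (a hypothetical name, mail_entry) only prints a candidate configuration line; it does not edit any file, and the device and address are placeholders to adapt.

```shell
#!/bin/sh
# Emit a smartd.conf entry that monitors a device with the default
# checks (-a) and mails warnings to an address (-m).
mail_entry() {
    # $1: device path, $2: mail recipient
    printf '%s -a -m %s\n' "$1" "$2"
}

# Example:
#   mail_entry /dev/sda root
```

According to the smartd.conf man page, appending "-M test" to such a line makes smartd send a single test message at startup, which is a convenient way to confirm that mail delivery works.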
Service Management
Init.d script location:
/etc/init.d/smartd
Example of "chkconfig --list smartd"
# chkconfig --list smartd
smartd          0:off   1:off   2:on    3:on    4:on    5:on    6:off
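When this check is scripted, the runlevel columns can be reduced to the list of runlevels in which the service starts. The sketch below assumes the "N:on" / "N:off" column format shown above; the function name enabled_runlevels is illustrative.

```shell
#!/bin/sh
# Read `chkconfig --list SERVICE` output on stdin and print the
# runlevels in which the service is switched on, space-separated.
enabled_runlevels() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-6]:on$/) {
                split($i, part, ":")
                out = out (out == "" ? "" : " ") part[1]
            }
        }
        print out
    }'
}

# Example on a system that has chkconfig:
#   chkconfig --list smartd | enabled_runlevels
```

For the sample output above, this prints "2 3 4 5".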
Available service usage options
# service smartd
Usage: /etc/init.d/smartd {start|stop|reload|report|restart|status}

# service smartd start
Starting smartd: [ OK ]

# service smartd stop
Shutting down smartd: [ OK ]

# service smartd status
smartd (pid 4061 2857) is running...

# service smartd restart
Shutting down smartd: [ OK ]
Starting smartd: [ OK ]

# service smartd reload
Reloading smartd daemon configuration: [ OK ]

# service smartd report
Checking SMART devices now: [ OK ]
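A script that needs the daemon's process IDs can pull them out of the status line. This is a small sketch that assumes the exact "smartd (pid ...) is running..." wording shown above, which may differ between init script versions; smartd_pids is a made-up helper name.

```shell
#!/bin/sh
# Read `service smartd status` output on stdin and print the PIDs
# listed in a line of the form "smartd (pid N [N...]) is running...".
smartd_pids() {
    sed -n 's/^smartd (pid \([0-9 ]*\)) is running.*$/\1/p'
}

# Example:
#   service smartd status | smartd_pids
```

Nothing is printed when the service is stopped, so an empty result can be treated as "not running".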
Daemon executable:
/usr/sbin/smartd
Configuration
RPM package:
smartmontools-[version]-[release]
Configuration file
/etc/smartd.conf                 ### For CentOS/RHEL 5,6
/etc/smartmontools/smartd.conf   ### For CentOS/RHEL 7
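Because the path depends on the release, a script can probe both locations and then show only the lines that are actually in effect. The following is a sketch under stated assumptions: the function names and the optional root-prefix argument are hypothetical, and the filter does not handle trailing comments or '\' line continuations.

```shell
#!/bin/sh
# Print the first smartd configuration file that exists.
# $1 is an optional root prefix (hypothetical, handy for trying this
# against a scratch directory); leave it empty on a live system.
find_smartd_conf() {
    for f in "$1/etc/smartmontools/smartd.conf" "$1/etc/smartd.conf"; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Print only the active lines of a config file: skip blank lines and
# lines whose first non-blank character is '#'.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example:
#   conf=$(find_smartd_conf) && active_directives "$conf"
```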
Example configuration file /etc/smartmontools/smartd.conf
# cat /etc/smartmontools/smartd.conf
# Sample configuration file for smartd. See man smartd.conf.

# Home page is: http://smartmontools.sourceforge.net

# $Id: smartd.conf 3651 2012-10-18 15:11:36Z samm2 $

# smartd will re-read the configuration file if it receives a HUP
# signal

# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.

# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.

# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices. DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found. Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q

# Alternative setting to ignore temperature and power-on hours reports
# in syslog.
#DEVICESCAN -I 194 -I 231 -I 9

# Alternative setting to report more useful raw temperature in syslog.
#DEVICESCAN -R 194 -R 231 -I 9

# Alternative setting to report raw temperature changes >= 5 Celsius
# and min/max temperatures.
#DEVICESCAN -I 194 -I 231 -I 9 -W 5

# First (primary) ATA/IDE hard disk. Monitor all attributes, enable
# automatic online data collection, automatic Attribute autosave, and
# start a short self-test every day between 2-3am, and a long self test
# Saturdays between 3-4am.
#/dev/hda -a -o on -S on -s (S/../.././02|L/../../6/03)

# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
#/dev/hdb -H -l error -l selftest -t -I 194

# Monitor all attributes except normalized Temperature (usually 194),
# but track Temperature changes >= 4 Celsius, report Temperatures
# >= 45 Celsius and changes in Raw value of Reallocated_Sector_Ct (5).
# Send mail on SMART failures or when Temperature is >= 55 Celsius.
#/dev/hdc -a -I 194 -W 4,45,55 -R 5 -m [email protected]

# An ATA disk may appear as a SCSI device to the OS. If a SCSI to
# ATA Translation (SAT) layer is between the OS and the device then
# this can be flagged with the '-d sat' option. This situation may
# become common with SATA disks in SAS and FC environments.
# /dev/sda -a -d sat

# A very silent check. Only report SMART health status if it fails
# But send an email in this case
#/dev/hdc -H -C 0 -U 0 -m [email protected]

# First two SCSI disks. This will monitor everything that smartd can
# monitor. Start extended self-tests Wednesdays between 6-7pm and
# Sundays between 1-2 am
#/dev/sda -d scsi -s L/../../3/18
#/dev/sdb -d scsi -s L/../../7/01

# Monitor 4 ATA disks connected to a 3ware 6/7/8000 controller which uses
# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4,
# and 4-5 am.
# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
# is DEPRECATED. Use the /dev/tweN character device interface instead.
# For example /dev/twe0, /dev/twe1, and so on.
#/dev/sdc -d 3ware,0 -a -s L/../../7/01
#/dev/sdc -d 3ware,1 -a -s L/../../7/02
#/dev/sdc -d 3ware,2 -a -s L/../../7/03
#/dev/sdc -d 3ware,3 -a -s L/../../7/04

# Monitor 2 ATA disks connected to a 3ware 9000 controller which
# uses the 3w-9xxx driver (Linux, FreeBSD). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
#/dev/twa0 -d 3ware,0 -a -s L/../../2/01
#/dev/twa0 -d 3ware,1 -a -s L/../../2/03

# Monitor 2 SATA (not SAS) disks connected to a 3ware 9000 controller which
# uses the 3w-sas driver (Linux). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
# On FreeBSD /dev/tws0 should be used instead
#/dev/twl0 -d 3ware,0 -a -s L/../../2/01
#/dev/twl0 -d 3ware,1 -a -s L/../../2/03

# Same as above for Windows. Option '-d 3ware,N' is not necessary,
# disk (port) number is specified in device name.
# NOTE: On Windows, DEVICESCAN works also for 3ware controllers.
#/dev/hdc,0 -a -s L/../../2/01
#/dev/hdc,1 -a -s L/../../2/03

# Monitor 3 ATA disks directly connected to a HighPoint RocketRAID. Start long
# self-tests Sundays between 1-2, 2-3, and 3-4 am.
#/dev/sdd -d hpt,1/1 -a -s L/../../7/01
#/dev/sdd -d hpt,1/2 -a -s L/../../7/02
#/dev/sdd -d hpt,1/3 -a -s L/../../7/03

# Monitor 2 ATA disks connected to the same PMPort which connected to the
# HighPoint RocketRAID. Start long self-tests Tuesdays between 1-2 and 3-4 am
#/dev/sdd -d hpt,1/4/1 -a -s L/../../2/01
#/dev/sdd -d hpt,1/4/2 -a -s L/../../2/03

# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
#   -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
#   -T TYPE set the tolerance to one of: normal, permissive
#   -o VAL  Enable/disable automatic offline tests (on/off)
#   -S VAL  Enable/disable attribute autosave (on/off)
#   -n MODE No check. MODE is one of: never, sleep, standby, idle
#   -H      Monitor SMART Health Status, report if failed
#   -l TYPE Monitor SMART log. Type is one of: error, selftest
#   -f      Monitor for failure of any 'Usage' Attributes
#   -m ADD  Send warning email to ADD for -H, -l error, -l selftest, and -f
#   -M TYPE Modify email warning behavior (see man page)
#   -s REGE Start self-test when type/date matches regular expression (see man page)
#   -p      Report changes in 'Prefailure' Normalized Attributes
#   -u      Report changes in 'Usage' Normalized Attributes
#   -t      Equivalent to -p and -u Directives
#   -r ID   Also report Raw values of Attribute ID with -p, -u or -t
#   -R ID   Track changes in Attribute ID Raw value with -p, -u or -t
#   -i ID   Ignore Attribute ID for -f Directive
#   -I ID   Ignore Attribute ID for -p, -u or -t Directive
#   -C ID   Report if Current Pending Sector count non-zero
#   -U ID   Report if Offline Uncorrectable count non-zero
#   -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
#   -v N,ST Modifies labeling of Attribute N (see man page)
#   -a      Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
#   -F TYPE Use firmware bug workaround. Type is one of: none, samsung
#   -P TYPE Drive-specific presets: use, ignore, show, showall
#    #      Comment: text after a hash sign is ignored
#    \      Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
# DEVICESCAN may be followed by any desired Directives.

How to monitor disk health using smartd (S.M.A.R.T.)
How to check for bad blocks or disk errors in CentOS / RHEL
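The -s directive in the sample configuration file takes a regular expression. According to the smartd.conf man page, smartd compares it with a string of the form T/MM/DD/d/HH, where T is the test type (for example L for long or S for short), MM/DD is the date, d is the day of the week (1 = Monday through 7 = Sunday), and HH is the hour. The sketch below tries a schedule expression against such strings. It uses grep -E as an approximation of smartd's matching and anchors the expression at both ends, both of which are assumptions; schedule_matches is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if a self-test schedule regex matches a T/MM/DD/d/HH string.
schedule_matches() {
    # $1: regex from a -s directive, $2: candidate such as S/10/18/4/02
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

# Build the candidate string for a short test at the current time:
# %m = month, %d = day of month, %u = weekday (1 = Monday), %H = hour.
now_short="S/$(date '+%m/%d/%u/%H')"

# Schedule taken from the /dev/hda example in the sample file.
regex='(S/../.././02|L/../../6/03)'
if schedule_matches "$regex" "$now_short"; then
    echo "a short self-test would be due in this hour"
fi
```

With this expression, S/10/18/4/02 (any day, 2 am) and L/10/20/6/03 (a Saturday, 3 am) match, while L/10/18/4/03 (a Thursday) does not.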