Disk/cable/controller defective?


kenshin
10.04.12, 20:47
Hi, syslog shows the following problem:
Code:
Apr 10 18:12:53 <$HOSTNAME> kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 10 18:12:53 <$HOSTNAME> kernel: ata4.00: failed command: FLUSH CACHE EXT
Apr 10 18:12:53 <$HOSTNAME> kernel: ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Apr 10 18:12:53 <$HOSTNAME> kernel:         res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr 10 18:12:53 <$HOSTNAME> kernel: ata4.00: status: { DRDY }
Apr 10 18:12:53 <$HOSTNAME> kernel: ata4: hard resetting link
Apr 10 18:12:58 <$HOSTNAME> kernel: ata4: link is slow to respond, please be patient (ready=0)
Apr 10 18:13:01 <$HOSTNAME> /USR/SBIN/CRON[19569]: (root) CMD (/usr/local/rtm/bin/rtm 48 > /dev/null 2> /dev/null)
Apr 10 18:13:03 <$HOSTNAME> kernel: ata4: COMRESET failed (errno=-16)
Apr 10 18:13:03 <$HOSTNAME> kernel: ata4: hard resetting link
Apr 10 18:13:08 <$HOSTNAME> kernel: ata4: link is slow to respond, please be patient (ready=0)
Apr 10 18:13:13 <$HOSTNAME> kernel: ata4: COMRESET failed (errno=-16)
Apr 10 18:13:13 <$HOSTNAME> kernel: ata4: hard resetting link
Apr 10 18:13:18 <$HOSTNAME> kernel: ata4: link is slow to respond, please be patient (ready=0)
Apr 10 18:13:48 <$HOSTNAME> kernel: ata4: COMRESET failed (errno=-16)
Apr 10 18:13:48 <$HOSTNAME> kernel: ata4: limiting SATA link speed to 1.5 Gbps
Apr 10 18:13:48 <$HOSTNAME> kernel: ata4: hard resetting link
Apr 10 18:13:53 <$HOSTNAME> kernel: ata4: COMRESET failed (errno=-16)
Apr 10 18:13:53 <$HOSTNAME> kernel: ata4: reset failed, giving up
Apr 10 18:13:53 <$HOSTNAME> kernel: ata4.00: disabled
Apr 10 18:13:53 <$HOSTNAME> kernel: ata4: EH complete
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd] Unhandled error code
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 e8 d0 67 81 00 00 08 00
Apr 10 18:13:53 <$HOSTNAME> kernel: end_request: I/O error, dev sdd, sector 3905972097
Apr 10 18:13:53 <$HOSTNAME> kernel: end_request: I/O error, dev sdd, sector 3905972097
Apr 10 18:13:53 <$HOSTNAME> kernel: md: super_written gets error=-5, uptodate=0
Apr 10 18:13:53 <$HOSTNAME> kernel: md/raid:md2: Disk failure on sdd2, disabling device.
Apr 10 18:13:53 <$HOSTNAME> kernel: md/raid:md2: Operation continuing on 2 devices.
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd] Unhandled error code
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Apr 10 18:13:53 <$HOSTNAME> kernel: sd 3:0:0:0: [sdd] CDB: Write(10): 2a 00 01 40 0f 80 00 00 08 00
Apr 10 18:13:53 <$HOSTNAME> kernel: end_request: I/O error, dev sdd, sector 20975488
Apr 10 18:13:53 <$HOSTNAME> kernel: end_request: I/O error, dev sdd, sector 20975488
Apr 10 18:13:53 <$HOSTNAME> kernel: md: super_written gets error=-5, uptodate=0
Apr 10 18:13:53 <$HOSTNAME> kernel: md/raid1:md1: Disk failure on sdd1, disabling device.
Apr 10 18:13:53 <$HOSTNAME> kernel: md/raid1:md1: Operation continuing on 2 devices.
Apr 10 18:13:53 <$HOSTNAME> kernel: RAID conf printout:
Apr 10 18:13:53 <$HOSTNAME> kernel: --- level:5 rd:4 wd:2
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 0, o:1, dev:sda2
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 2, o:1, dev:sdc2
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 3, o:0, dev:sdd2
Apr 10 18:13:53 <$HOSTNAME> kernel: RAID conf printout:
Apr 10 18:13:53 <$HOSTNAME> kernel: --- level:5 rd:4 wd:2
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 0, o:1, dev:sda2
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 2, o:1, dev:sdc2
Apr 10 18:13:53 <$HOSTNAME> kernel: Aborting journal on device dm-0-8.
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0) in ext4_reserve_inode_write:5620: Journal has aborted
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0) in ext4_reserve_inode_write:5620: Journal has aborted
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 728268800
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: JBD2: I/O error detected when updating journal superblock for dm-0-8.
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0) in ext4_dirty_inode:5747: Journal has aborted
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0) in ext4_da_write_end:3329: Journal has aborted
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0): ext4_journal_start_sb:260:
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs error (device dm-0) in ext4_dirty_inode:5747: Journal has aborted
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: Detected aborted journal
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): Remounting filesystem read-only
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:53 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 0
Apr 10 18:13:53 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:53 <$HOSTNAME> kernel: EXT4-fs (dm-0): ext4_da_writepages: jbd2_start: 12208 pages, ino 336201972; err -30
Apr 10 18:13:53 <$HOSTNAME> kernel: RAID1 conf printout:
Apr 10 18:13:53 <$HOSTNAME> kernel: --- wd:2 rd:4
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 0, wo:0, o:1, dev:sda1
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 2, wo:0, o:1, dev:sdc1
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 3, wo:1, o:0, dev:sdd1
Apr 10 18:13:53 <$HOSTNAME> mdadm[2489]: Fail event detected on md device /dev/md2, component device /dev/sdd2
Apr 10 18:13:53 <$HOSTNAME> mdadm[2489]: Fail event detected on md device /dev/md1, component device /dev/sdd1
Apr 10 18:13:53 <$HOSTNAME> kernel: RAID1 conf printout:
Apr 10 18:13:53 <$HOSTNAME> kernel: --- wd:2 rd:4
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 0, wo:0, o:1, dev:sda1
Apr 10 18:13:53 <$HOSTNAME> kernel: disk 2, wo:0, o:1, dev:sdc1
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1343750226
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1343750403
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1343750442
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1343758379
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798721
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798736
Apr 10 18:13:58 <$HOSTNAME> kernel: EXT4-fs (dm-0): previous I/O error to superblock detected
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798890
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798891
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798892
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Apr 10 18:13:58 <$HOSTNAME> kernel: Buffer I/O error on device dm-0, logical block 1344798894
Apr 10 18:13:58 <$HOSTNAME> kernel: lost page write due to I/O error on dm-0
Googling the problem (more precisely, the error message in the first line) turns up, depending on the source, "reseat or replace the SATA cable", "buggy controller driver in the kernel", or "HDD dead".
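
Whatever the cause on ata4, the "RAID conf printout" summaries in the log show how many members each array has left. Below is a minimal sketch for reading them, assuming that rd is the number of configured members and wd the number still working (as the md driver prints them), and that a summary without "level:" belongs to the RAID1 printout, as it does in this log. The function names are made up for illustration.

```python
import re

def tolerated_failures(level, rd):
    """How many missing members an md array of this level can survive."""
    if level == 1:
        # RAID1 keeps working as long as a single mirror is left.
        return rd - 1
    return {4: 1, 5: 1, 6: 2}[level]

def parse_conf_summary(line):
    """Return (level, rd, wd) from a 'conf printout' summary line, else None."""
    if "---" not in line:
        return None
    rd = re.search(r"\brd:(\d+)", line)
    wd = re.search(r"\bwd:(\d+)", line)
    if not (rd and wd):
        return None
    level = re.search(r"\blevel:(\d+)", line)
    # Assumption: no "level:" field means the line came from the RAID1 printout.
    return (int(level.group(1)) if level else 1,
            int(rd.group(1)), int(wd.group(1)))

def array_survives(level, rd, wd):
    """True if the number of missing members is within what the level tolerates."""
    return (rd - wd) <= tolerated_failures(level, rd)

log = [
    "Apr 10 18:13:53 <$HOSTNAME> kernel: --- level:5 rd:4 wd:2",
    "Apr 10 18:13:53 <$HOSTNAME> kernel: --- wd:2 rd:4",
]
for line in log:
    parsed = parse_conf_summary(line)
    if parsed:
        print(parsed, "still usable" if array_survives(*parsed) else "failed")
```

For the two summaries above this reports the RAID5 set (md2) as failed and the RAID1 set (md1) as still usable. That would fit the ext4 errors on dm-0 if dm-0 sits on md2, which is an assumption. Disk 1 does not appear in any of the printouts, so it may already have dropped out before sdd did.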

Code:
smartctl --attributes --log=selftest --quietmode=errorsonly 
however, only prints information on Airflow_Temperature_Cel (i.e. everything was still OK at the last test, sometime before 18:00?)
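
Besides its text output, smartctl reports findings in its exit status as a bitmask (described in the smartctl(8) man page). Which bits get set depends on the options passed, so the following is only a sketch for decoding a status value by hand. It does not run smartctl itself, and the function name is hypothetical.

```python
# Bit meanings as listed in the exit status section of smartctl(8).
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device did not identify itself",
    2: "a SMART/ATA command failed or a SMART data checksum was wrong",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}

def decode_smartctl_exit(status):
    """Return the descriptions of all bits set in a smartctl exit status."""
    return [msg for bit, msg in sorted(SMARTCTL_EXIT_BITS.items())
            if status & (1 << bit)]

# Example: a status of 64 has only bit 6 set.
for message in decode_smartctl_exit(64):
    print(message)
```

In a shell the status of the last command is available as $?, so the value can be read right after the smartctl call and passed to the function.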

In any case, I've opened an incident ticket, since this looks very much like a problem outside my responsibility/capabilities. I'd still be grateful for any hints/tips.