sata controller for non-x86

PostPosted: Fri May 03, 2019 4:31 pm
by ivelegacy
Architecture | Machine              | CPU              | PCI                           | kernel
PA-RISC2     | C3600                | 1xPA8600@550Mhz  | several PCI32 and PCI64 slots | 4.16 .. 5.1
PowerPC      | Apple PowerMacG4 MDD | 2xPPC7450@1.2Ghz | four PCI64 slots              | 4.16, 4.20
MIPS4/BE     | SGI Octane2/IP30     | 2xR14000@600Mhz  | three PCI64 slots via XIO-PCI | _?_


Note:
PCI-X is not the same as PCI64. PCI-X differs from PCI64 in interrupt handling.


Our team is using these machines. We haven't yet found a stable and decent SATA controller.
brand/model                      | chip/driver           | specs          | tested                 | result
HighPoint RocketRAID 1640        | HPT374                | 4ch, PCI32     | C3600/v4.16            | DMA problems, files >500MB get corrupted
HighPoint RocketRAID 2224        | Marvell MV88SX6081    | 4ch+e*, PCI-X  | C3600/v4.16,5.1        | under testing
HighPoint RocketRAID 2224        | Marvell MV88SX6081    | 4ch+e*, PCI-X  | PowerMacG4/v4.16       | working!!!
HighPoint RocketRAID 3220        | Marvell xxx           | 4ch, PCI-X     | C3600/v4.16, x86/v4.20 | 3.3V-only, x86-only
Adaptec 1210SA                   | Silicon Image Sil3112 | 2ch, PCI32     | C3600/v4.16            | working!!!
Adaptec 2410SA                   | i960                  | 4ch, PCI64     | C3600/v4.16, x86/v4.20 | x86-only
(wanted) Adaptec AAR-1420SA      | __                    | 4ch, PCI-X     | __                     | __
(wanted) SYBA-SY-PCI40010        | Silicon Image Sil??   | 4ch, PCI32     | __                     | __
SYBA-SY-PCX40009                 | Silicon Image Sil3124 | 4ch, PCI-X     | C3600/4.20             | __
SYBA-SY-PCX40009                 | Silicon Image Sil3124 | 4ch, PCI-X     | PowerMacG4/4.16        | working!!!
VIA generic card                 | VIA6421               | 2ch, PCI32     | __                     | __
(wanted) LSI Fuel/Tezro          | LSI SAS3041X          | 4ch SAS, PCI-X | __                     | __
(wanted) LINDY SATA-II Multilane | _                     | e*, PCI-X      | __                     | __
(wanted) Sonnet Tempo™ SATA X4i  | sil?!?                | 4ch, PCI-X     | __                     | __


Note:
The SY-PCX40009 PCI-X card supports two modes { 32-bit at 66 MHz, 64-bit at 133 MHz }.
Backward compatible to PCI 2.3. Known to be working with C8000

e*: Infiniband Multilane (SFF-8470)

Houston, we have a serious problem

PostPosted: Fri May 03, 2019 4:33 pm
by ivelegacy
Adaptec AAR-2410SA was tested on x86, C3600, and Apple PowerMac G4 MDD. It worked only on x86.

Code: Select all
0001:10:15.0 RAID bus controller: Adaptec AAC-RAID (rev 01)
        Subsystem: Adaptec AAR-2410SA PCI SATA 4ch (Jaguar II)
        Flags: 66MHz, slow devsel, IRQ 58
        Memory at 84000000 (32-bit, prefetchable) [size=64M]
        Expansion ROM at 80088000 [disabled] [size=32K]
        Capabilities: [80] Power Management version 2


Alan Cox explained some of the reasons for this. His email is long and not public, so I'll summarize what we understood.

As soon as the computer bootstraps, the firmware in its BIOS scans every PCI peripheral for a BIOS extension. It finds the BIOS-extension ROM on the SATA card, then loads and executes it: the flash chip on the card contains x86 opcodes! The ROM initializes some features on the SATA card, then loads and bootstraps a firmware there (the firmware is contained in the flash, but it somehow needs to be launched by the PC; I don't know how). The PC then goes ahead and bootstraps the OS loader (GRUB, LILO, and the like), which loads and boots the Linux kernel. When the running kernel probes for the SATA controller, the driver finds it already configured and running its own firmware, so when the kernel issues commands, the card responds properly!

So, if you put the Adaptec AAR-2410SA into a non-x86 computer, the BIOS extension is never executed, and the Linux kernel does not find the card correctly configured. In fact, the kernel complains that the card is not even running its own firmware. This can't be fixed without fully reverse-engineering the flash code in order to write a new kernel driver that initializes the card directly instead of relying on the job done by the PC BIOS.

Hardware RAID cards are usually problematic for the same reason.
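
Whether a card's expansion ROM is tagged as x86 code can be checked from the ROM image itself. What follows is a minimal sketch, assuming the ROM has already been dumped to a file (on Linux the `rom` attribute under /sys/bus/pci/devices/<address>/ can usually be read after writing 1 to it). `hex_at` and `rom_code_type` are hypothetical helper names, and only the first image in the ROM is inspected. The offsets and code-type values are the ones from the PCI expansion ROM format.

```shell
# Print LEN bytes of FILE starting at OFFSET as one lowercase hex string.
hex_at() {
    od -An -tx1 -j "$2" -N "$3" "$1" | tr -d ' \n'
}

# Print the code type of the first image in a PCI expansion ROM dump.
rom_code_type() {
    rom=$1
    # every expansion ROM image starts with the signature 0x55 0xAA
    [ "$(hex_at "$rom" 0 2)" = "55aa" ] || { echo "no ROM signature"; return 1; }
    # little-endian 16-bit pointer to the PCI Data Structure at offset 0x18
    lo=$(hex_at "$rom" 24 1)
    hi=$(hex_at "$rom" 25 1)
    pcir=$(( 0x$hi$lo ))
    # the PCI Data Structure starts with the ASCII string "PCIR"
    [ "$(hex_at "$rom" "$pcir" 4)" = "50434952" ] || { echo "no PCIR structure"; return 1; }
    # the code type byte sits at offset 0x14 inside that structure
    case "$(hex_at "$rom" $(( pcir + 20 )) 1)" in
        00) echo "x86 PC-AT" ;;
        01) echo "Open Firmware" ;;
        02) echo "HP PA-RISC" ;;
        03) echo "EFI" ;;
        *)  echo "unknown" ;;
    esac
}
```

Usage would be something like `rom_code_type card.rom`. An "x86 PC-AT" result only says the option ROM is x86 code; it does not prove by itself that the card depends on that ROM the way the AAR-2410SA does.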

Serious disk corruption with HPT374 on C3600

PostPosted: Fri May 03, 2019 8:15 pm
by LordCrimson
What's happening is similar to Bug 2271, which appeared in 2004.

Ivelegacy, Madame and I tried HighPoint HPT374 on a C3600 workstation running Kernel v4.16 in 64bit mode. It didn't panic but, during a file-copy operation, the DMA caused corruption to the file. The filesystem was not corrupted.

Code: Select all
# lssize data1.bin
400 Mbyte
# cp data1.bin data2.bin
# md5sum data1.bin data2.bin
6004eb9dd9189770655f8b49a1d688a8  data1.bin
6004eb9dd9189770655f8b49a1d688a8  data2.bin


Code: Select all
# lssize bigone1
5 Gbyte
# cp bigone1 bigone2
# md5sum bigone1 bigone2
f60a9f7ff4bcec465ea47e0f009354fd  bigone1
5e1fdedc560cfe82a5d59b740a980091  bigone2 <---- corrupted


Digging deeper, it only happens with big files.
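
To narrow this down, it may help to know where the copy first diverges from the source, for example whether the damage starts at some size boundary. A minimal sketch, assuming a POSIX `cmp`; `first_diff_offset` is a hypothetical helper name.

```shell
# Print the 1-based offset of the first differing byte, or nothing if no
# byte differs. cmp -l lists every difference, so stop after the first one.
first_diff_offset() {
    cmp -l "$1" "$2" 2>/dev/null | head -n 1 | awk '{ print $1 }'
}
```

For the files above this would be `first_diff_offset bigone1 bigone2`. If one file is simply shorter than the other and the common part matches, nothing is printed.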

copy test

PostPosted: Sat May 04, 2019 12:22 pm
by ivelegacy
/usr/bin/safecp
Code: Select all
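#!/bin/sh
# safecp SRC DST: copy a file, then compare the md5sum of source and copy

# both arguments are required
if [ -z "$1" ]
   then
       exit 1
   fi
if [ -z "$2" ]
   then
       exit 1
   fi

cp "$1" "$2"
ans="$?"

if [ "$ans" != "0" ]
   then
       echo "copy error, l1"
   else
       check1=`md5sum "$1"`
       check2=`md5sum "$2"`
       # myparam1 is a local helper (not shown); it is expected to keep only
       # the first word, i.e. the hash, so file names are not compared
       check1=`myparam1 $check1`
       check2=`myparam1 $check2`
       if [ "$check1" != "$check2" ]
          then
              echo "copy error, l2"
          fi
   fi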


Code: Select all
dd if=/dev/urandom of=data_01GB.bin count=1024 bs=1M
dd if=/dev/urandom of=data_02GB.bin count=1024 bs=2M
dd if=/dev/urandom of=data_04GB.bin count=1024 bs=4M
dd if=/dev/urandom of=data_08GB.bin count=1024 bs=8M
dd if=/dev/urandom of=data_16GB.bin count=1024 bs=16M
dd if=/dev/urandom of=data_32GB.bin count=1024 bs=32M


Code: Select all
safecp data_512MB.bin copy.bin
safecp data_01GB.bin copy.bin
safecp data_02GB.bin copy.bin
safecp data_04GB.bin copy.bin
safecp data_08GB.bin copy.bin
safecp data_16GB.bin copy.bin


The last two stress both the PCI bus and the DMA, so they are a good test for the worst case: all of our machines fail the test.

Highpoint RR3220 is 3.3V-only and x86-only

PostPosted: Sat May 04, 2019 12:32 pm
by ivelegacy
The HighPoint RocketRAID RR3220 comes with two mini-SAS connectors, and it is not keyed as "3.3V-only". In a 3.3V-only PCI-X slot it has no problems and works as expected. We tried it on a SuperMicro Xeon motherboard that offers both 3.3V-only and 5V-tolerant PCI-X slots: in the 5V-tolerant slot, it triggers an over-current alarm.

The card must be keyed as "3.3V-only" by removing the 5V notch on the connector. Anyway, its firmware is x86-only, so it cannot be used on non-x86 machines.

Drivers with BLOB are a red flag

PostPosted: Thu May 09, 2019 3:16 pm
by ivelegacy
Basically, *anything* that uses binary "BLOB" firmware loaded by the driver is usually a big red flag for a whole bunch of reasons. This applies not only to SATA cards but also to Ethernet NICs that support features like layer 2/3 offloading, intelligent serial cards, and so on.

HPPA, C3600 panics with Marvell MV88SX6081

PostPosted: Fri May 10, 2019 9:16 am
by ivelegacy
Tested on HPPA C3600, with git/deller/parisc-linux.git, parisc-5.2. I've been using kgcc-v7.3 and, because of the kernel size issue, the kernel has been compiled with CONFIG_MLONGCALLS. This issue is long-standing: the Binutils linker lacks support for long branch stubs when linking 64-bit code, so I cannot compile the kernel without MLONGCALLS, and the final size is about 21 Mbyte.

Code: Select all
# mycp data_08GB.bin copy.bin


Performance is about 15 Mbyte/sec, which is not good for a device that should reach at least 50 Mbyte/sec, and it's not stable.
When copying files bigger than 4 Gbyte, I experience issues with the DMA. Usually the machine halts and reboots.

Hard Fail vs. Soft Fail on PCI Master Abort
Master Abort means the MMIO transaction timed out, usually because the device did not respond to an MMIO read. We would like Hard Fail (HF) to be enabled to find driver problems, though it means the system will crash with an HPMC. In Soft Fail mode, "~0L" is returned as the result of a timeout on the PCI bus. This is how PCI buses on x86 and most other architectures behave. In order to increase compatibility with existing (x86) PCI hardware and existing Linux drivers, we now enable Soft Fail mode on PA-RISC too.
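
For scripted test runs it can be handy to record which mode the kernel is in next to each result. A minimal sketch, assuming the kernel reports a "PCI failmode" line in /proc/cpuinfo in the same format as the output shown in this post; `pci_failmode` is a hypothetical helper name, and the optional argument exists only so the parser can be tried on a saved copy of the file.

```shell
# Print the PCI failmode from a cpuinfo-style file (default: /proc/cpuinfo).
pci_failmode() {
    awk -F: '/^PCI failmode/ { gsub(/[ \t]/, "", $2); print $2 }' "${1:-/proc/cpuinfo}"
}
```

On a machine without that line, the function prints nothing.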




Code: Select all
c3600 ~ # uname -r
5.1.0-deller-5.2-c3600-64bit
c3600 ~ # cat /proc/cpuinfo | grep failmode
PCI failmode    : soft


mode     | command                    | result  | notes
softfail | mycp data_08GB.bin copy.bin | success |
softfail | mycp data_16GB.bin copy.bin | failure | Kernel panic - not syncing High Priority Machine Check (HPMC)
hardfail | mycp data_08GB.bin copy.bin | success | 13m44.431s, 3m24.382s, 7m39.433s ~10Mbyte/sec
hardfail | mycp data_16GB.bin copy.bin | failure | Kernel panic - not syncing High Priority Machine Check (HPMC)
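
The ~10 Mbyte/sec figure can be reproduced from the timing. A minimal sketch, assuming that the first of the three times is the wall-clock ("real") time and that the 8 GB file is 8192 MiB (1024 blocks of 8M, as generated with dd earlier in the thread); `mib_per_sec` is a hypothetical helper name.

```shell
# usage: mib_per_sec SIZE_MIB REAL_TIME
# REAL_TIME is in the format printed by time, e.g. 13m44.431s
mib_per_sec() {
    awk -v size="$1" -v t="$2" 'BEGIN {
        split(t, p, "m")        # p[1] = minutes, p[2] = seconds plus "s"
        sub(/s$/, "", p[2])     # drop the trailing "s"
        printf "%.1f\n", size / (p[1] * 60 + p[2])
    }'
}

mib_per_sec 8192 13m44.431s     # prints 9.9
```

The same function can be pointed at any "real" line in the burn-in logs to compare controllers and slots.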


Code: Select all
# gcc-config -l
[1] hppa2.0-unknown-linux-gnu-5.4.0
[2] hppa2.0-unknown-linux-gnu-6.4.0
[3] hppa2.0-unknown-linux-gnu-7.3.0
[4] hppa2.0-unknown-linux-gnu-8.2.0 *
[5] hppa64-unknown-linux-gnu-7.3.0 * <-------- using this to compile the kernel
[6] hppa64-unknown-linux-gnu-8.2.0

Marvell 88SX60xx

PostPosted: Fri May 10, 2019 7:35 pm
by TheHalloween
The driver is drivers/ata/sata_mv.c, which provides per-device queues and full SATA control, including hotplug.
  • The 88SX50xx "GEN_I" series supports TCQ, but not NCQ or PM.
  • The 88SX6xxx "GEN_II" series (6040, 6041, 6080, and 6081) supports TCQ, NCQ, and PM.
  • The 88SX7xxx "GEN_IIE" series (6042, 7042, and various system-on-chip hosts) supports TCQ, NCQ, FBS, and PM.
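
For quick reference when reading lspci output, the list above can be folded into a small lookup. A minimal sketch; `mv_generation` is a hypothetical helper name, and it encodes only the models named in the list, so anything else falls through to "unknown".

```shell
# Map a Marvell chip model number to the sata_mv generation and the
# features listed for it.
mv_generation() {
    case "$1" in
        50??)                echo "GEN_I: TCQ" ;;
        6040|6041|6080|6081) echo "GEN_II: TCQ NCQ PM" ;;
        6042|7042)           echo "GEN_IIE: TCQ NCQ FBS PM" ;;
        *)                   echo "unknown" ;;
    esac
}

mv_generation 6081              # prints GEN_II: TCQ NCQ PM
```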

The 88SExxxx series of chips presents an AHCI interface. Some of the recent HighPoint cards are based on the Marvell 88SX50xx and 88SX60xx chips; these are supported by the Marvell libata driver. However, although Marvell controllers driven by sata_mv are well supported, various Marvell AHCI controllers suffer from incomplete and/or buggy support. Marvell doesn't seem to allocate any resources to upstream Linux support, and communication between Marvell and the libata developers is weak.


Marvell 88SX60xx on HPPA, HPMC log

PostPosted: Sat May 11, 2019 11:39 am
by ivelegacy
Code: Select all
Main Menu: Enter command > ser pim

PROCESSOR PIM INFORMATION

-----------------  Processor 0 HPMC Information ------------------

Timestamp =
  Sat May  11 11:37:53 GMT 2019    (20:19:05:11:11:37:53)

HPMC Chassis Codes = 2cbf0  2500b  27825  2cbfb

General Registers 0 - 31
00-03   0000000000000000  0000000040cc4360  00000000406dce7c  0000000049159530
04-07   0000000040c0bb60  000000012ec37020  00000000000a4000  0000000000000001
08-11   0000000000000001  000000012ec37020  0000000049159258  000000012ec33920
12-15   000000012ec38920  0000000040c36360  0000000000002218  0000000000000002
16-19   0000000000000001  000000012f1f17b0  0000000040c36360  00000000491677a0
20-23   0000000008ff0000  0000000049165d70  0000000000000007  0000000000000001
24-27   0000000000000000  0000000000000000  0000000000000010  0000000040c0bb60
28-31   000008d1a8d10800  0000000049159b30  00000000491595c0  00000008d1080008

<Press any key to continue (q to quit)>

Control Registers 0 - 31
00-03   0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07   0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11   00000000000017c6  0000000000000000  00000000000000c0  000000000000003f
12-15   0000000000000000  0000000000000000  0000000000183000  fe00000000000000
16-19   000002f2cd0df74f  0000000000000000  00000000406dcea8  0000000048dc0048
20-23   00000000a627ffdc  00000000090a4024  000000000804000e  8800000000000000
24-27   00000000010f9000  00000000dafe6000  00000000ffffffff  00000000f8f00480
28-31   00000000ffffffff  00000000ffffffff  0000000040fc3000  00000000ffffffff
Space Registers 0 - 7

00-03   005f1800          00000000          00000000          005f1800
04-07   00000000          00000000          00000000          00000000

<Press any key to continue (q to quit)>

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x00000000406dceac
Check Type                   = 0x20000000
CPU State                    = 0x9e000004
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x0030103b
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x000000fff7024024
System Requestor Address     = 0xfffffffffffa0000

Floating-Point Registers 0 - 31
00-03   0000001f00000000  0000000000000000  0000000000000000  0000000000000000
04-07   41d735a7c10ed08d  0000000000000000  0000000000000000  0000000000000000
08-11   0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15   0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19   0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23   0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27   0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31   0000000000000000  0000000000000000  0000000000000000  0000000000000000

<Press any key to continue (q to quit)>


'9000/785 B,C,J Workstation Unarchitected (per-CPU)', rev 1, 140 bytes:

Check Summary                = 0xcb81045028000000
Available Memory             = 0x0000000200000000
CPU Diagnose Register 2      = 0x0301000000802004
CPU Status Register 0        = 0x2420c20000000000
CPU Status Register 1        = 0x8002000000000000
SADD LOG                     = 0x00c0000400000000
Read Short LOG               = 0xc1a0f0fff7024024
ERROR_STATUS                 = 0x0000000000500050
MEM_ADDR                     = 0x000001ff3fffffff
MEM_SYND                     = 0x0000000000000000
MEM_ADDR_CORR                = 0x0000018a00db1dad
MEM_SYND_CORR                = 0x0000000000000094
RUN_DATA_HIGH                = 0xc1bff0fffed08040
RUN_DATA_LOW                 = 0xc1bff0fffed08040
RUN_CTRL                     = 0x0000021c00001418
RUN_ADDR                     = 0xc1bff0fffed08040
System Responder Path        = 0x00ffffff0a060200


HPMC PIM Analysis Information:

Timestamp =
  Sat May  11 11:37:53 GMT 2019    (20:19:05:11:11:37:53)


'9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes:

A Data I/O Fetch Timeout occurred while CPU 0 was
requesting information from a device at the path 10/6/2/0 (PCI slot 2).


Memory/IO Controller Error Analysis Information:

There were multiple correctable memory errors.  See 'Memory Error Log Info'.

<Press any key to continue (q to quit)>

-----------------  Processor 0 LPMC Information ------------------

Check Type                   = 0x00000000
I/D Cache Parity Info        = 0x00000000
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x00000000
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x0000000000000000
System Requestor Address     = 0x0000000000000000


-----------------  Processor 0 TOC Information -------------------

General Registers 0 - 31
00-03   0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07   0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11   0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15   0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19   0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23   0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27   0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31   0000000000000000  0000000000000000  0000000000000000  0000000000000000

<Press any key to continue (q to quit)>

Control Registers 0 - 31
00-03   0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07   0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11   0000000000000000  0000000000000000  0000000000000000  0000000000000000
12-15   0000000000000000  0000000000000000  0000000000000000  0000000000000000
16-19   0000000000000000  0000000000000000  0000000000000000  0000000000000000
20-23   0000000000000000  0000000000000000  0000000000000000  0000000000000000
24-27   0000000000000000  0000000000000000  0000000000000000  0000000000000000
28-31   0000000000000000  0000000000000000  0000000000000000  0000000000000000
Space Registers 0 - 7

00-03   00000000          00000000          00000000          00000000
04-07   00000000          00000000          00000000          00000000

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x0000000000000000
CPU State                    = 0x00000000


<Press any key to continue (q to quit)>

Memory Error Log Information:

Timestamp =
  Sat May  11 11:37:53 GMT 2019    (20:19:05:11:11:37:53)


'9000/785 B,C,J Workstation Memory Error Log', rev 0, 64 bytes:

This log displays the contents of memory specific registers when the
HPMC occurred.  If there are multiple memory errors, the order they are
listed is not indicative of the order they occurred.

                                   Trans  Addr
   Memory Error Type(s)  OV  MID    ID    par  CP   DIMM       Runway Address
   --------------------  --  ---  -----  ----  --  -------  -------------------
1) Correctable Mem       1   0x6  0x a   na    na  05       0x       0036c76b40

                                                Syndrome
                                           ------------------
                                        1) 0x94
<Press any key to continue (q to quit)>

I/O Module Error Log Information:

Timestamp =
  Sat May  11 11:37:53 GMT 2019    (20:19:05:11:11:37:53)


'9000/785 B,C,J Workstation IO Error Log', rev 0, 228 bytes:

Rope     Word1        Word2            Word3
------ ------------ ------------
   0    0x00000000   0x0e0cc009   0x00000000fed30048
   1    0x00000000   0x1e0cc009   0x00000000fed32048
   2    ----------   0x2e0cc009   ------------------
   3    ----------   0x3e0cc009   ------------------
   4    0x00000000   0x4e0cc009   0x00000000fed38048
   5    ----------   0x5e0cc009   ------------------
   6    0x00000000   0x6e0cc2a9   0x00000000fed3c048
   7    ----------   0x7e0cc009   ------------------


A Data I/O Fetch Timeout occurred while CPU 0 was requesting information from a device at the path 10/6/2/0 (PCI slot 2).


Code: Select all
c3600 / # lspci -nn
00:0c.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 41)
00:0d.0 Multimedia audio controller [0401]: Analog Devices Device [11d4:1889]
00:0e.0 IDE interface [0101]: National Semiconductor Corporation 87415/87560 IDE [100b:0002] (rev 03)
00:0e.1 Bridge [0680]: National Semiconductor Corporation 87560 Legacy I/O [100b:000e] (rev 01)
00:0e.2 USB controller [0c03]: National Semiconductor Corporation USB Controller [100b:0012] (rev 02)
00:0f.0 SCSI storage controller [0100]: LSI Logic / Symbios Logic 53C896/897 [1000:000b] (rev 07)
00:0f.1 SCSI storage controller [0100]: LSI Logic / Symbios Logic 53C896/897 [1000:000b] (rev 07)
01:04.0 RAID bus controller [0104]: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller [1095:3124] (rev 01)
03:02.0 SCSI storage controller [0100]: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller [11ab:60)


Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller <--- not in use
Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller <--- in use

C3600 PCI

PostPosted: Sun May 12, 2019 6:37 pm
by madame
  • One PCI 64-bit/66 MHz, 3.3 V slot (1)
  • Three PCI 64-bit/33 MHz, 5 V slots
  • Two PCI 32-bit/33 MHz, 5 V slots

Code: Select all
S1: PCI-64/33, pci0, 5 V <--------- testing 64bit cards here
S2: PCI-64/66, pci1, 3.3 V
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V <--------- testing 32bit cards here



(1) clocked at 100 MHz on C3750
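
As a rough yardstick for the tests, the theoretical peak of each slot type is bus width times clock. A minimal sketch using integer math and nominal clocks (the usually quoted figures of 133, 266 and 533 MB/s come from the exact 33.33/66.66 MHz clocks); `pci_peak_mb_s` is a hypothetical helper name. The peak is shared by all devices on the same bus, and real transfers stay well below it.

```shell
# usage: pci_peak_mb_s WIDTH_BITS CLOCK_MHZ
# One transfer of WIDTH_BITS per clock: bytes per transfer times MHz = MB/s.
pci_peak_mb_s() {
    echo $(( $1 / 8 * $2 ))
}

echo "PCI-32/33: $(pci_peak_mb_s 32 33) MB/s"    # prints PCI-32/33: 132 MB/s
echo "PCI-64/33: $(pci_peak_mb_s 64 33) MB/s"    # prints PCI-64/33: 264 MB/s
echo "PCI-64/66: $(pci_peak_mb_s 64 66) MB/s"    # prints PCI-64/66: 528 MB/s
```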

C3600, Adaptec 1210SA

PostPosted: Mon May 13, 2019 7:03 pm
by ivelegacy
burnin test, 10 hours

script for the test
Code: Select all
# loop forever over all test files, copying each one with mycp
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        mycp "$item" copy.out
        echo "done"
    done
done


test1
Code: Select all
S1: PCI-64/33, pci0, 5 V
S2: PCI-64/66, pci1, 3.3 V
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V <--------- tested here

kernel 4.16-softfail: working

test2
Code: Select all
S1: PCI-64/33, pci0, 5 V
S2: PCI-64/66, pci1, 3.3 V
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V <--------- tested here
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V

kernel 4.16-softfail: working

C3600, SYBA-SY-PCX40009

PostPosted: Wed May 15, 2019 11:26 am
by ivelegacy
burnin test, 10 hours

test1

script for the test
Code: Select all
# loop forever over all test files, copying each one with mycp
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        mycp "$item" copy.out
        echo "done"
    done
done


Code: Select all
S1: PCI-64/33, pci0, 5 V
S2: PCI-64/66, pci1, 3.3 V
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V <--------- tested here PCI-X card forced to 32bit

kernel 4.16-softfail: panics, HPMC PCI timeout when the controller is stressed out
The PCI-X variant (3124) on 32 bit PCI @ 33MHz can reach the PCI bus limit.

log
Code: Select all
./do_test_adv
data_01GB.bin ...
real    0m28.304s
user    0m0.055s
sys     0m20.541s
done
data_02GB.bin ...
real    0m54.184s
user    0m0.048s
sys     0m40.351s
done
data_04GB.bin ...
real    1m34.203s
user    0m0.100s
sys     1m16.753s
done
data_08GB.bin ...
real    4m19.789s
user    0m0.368s
sys     2m43.647s
done
data_16GB.bin ...
real    8m43.602s
user    0m0.666s
sys     5m28.698s
done
data_32GB.bin ...
real    17m47.451s
user    0m1.027s
sys     11m3.431s
done
data_01GB.bin ...
real    0m26.390s
user    0m0.064s
sys     0m20.635s
done
data_02GB.bin ...
real    0m54.257s
user    0m0.063s
sys     0m40.589s
done
data_04GB.bin ...
real    1m33.727s
user    0m0.110s
sys     1m17.770s
done
data_08GB.bin ...
real    3m57.131s
user    0m0.266s
sys     2m29.096s

data_16GB.bin ...
panics, HPMC PCI timeout



test2

script for the test
Code: Select all
# loop forever over all test files, copying each one with mycp
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        mycp "$item" copy.out
        echo "done"
    done
done


Code: Select all
S1: PCI-64/33, pci0, 5 V
S2: PCI-64/66, pci1, 3.3 V  <--------- tested here
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V

kernel 4.16-softfail: panics, HPMC PCI timeout when the controller is stressed out

test3

script for the test (without md5sum, and with a delay)
Code: Select all
# loop forever over all test files: plain cp, then flush and pause
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        cp "$item" copy.out
        echo "done"
        sync
        sleep 10
    done
done



Code: Select all
S1: PCI-64/33, pci0, 5 V
S2: PCI-64/66, pci1, 3.3 V  <--------- tested here
S3: PCI-64/33, pci0, 5 V
S4: PCI-64/33, pci2, 5 V
S5: PCI-32/33, pci3, 5 V
S6: PCI-32/33, pci3, 5 V

kernel 4.16-softfail: panics, HPMC PCI timeout when the controller is stressed out

log
data_01GB.bin ...
real 0m28.224s
user 0m0.048s
sys 0m19.895s
done
data_02GB.bin ...
real 0m58.382s
user 0m0.064s
sys 0m40.622s
done
data_04GB.bin ...
real 2m2.109s
user 0m0.167s
sys 1m20.512s
done
data_08GB.bin ...
real 4m13.158s
user 0m0.227s
sys 2m41.224s
done
data_16GB.bin ...
real 8m43.251s
user 0m0.494s
sys 5m22.696s
done
data_32GB.bin ...
real 17m28.304s
user 0m1.031s
sys 10m50.234s
done
data_01GB.bin ...
real 0m29.201s
user 0m0.032s
sys 0m20.273s
done
data_02GB.bin ...
real 0m55.108s
user 0m0.032s
sys 0m39.741s
done
data_04GB.bin ...
real 1m46.139s
user 0m0.115s
sys 1m18.823s
done
data_08GB.bin ...
real 4m9.451s
user 0m0.270s
sys 2m41.259s
done
data_16GB.bin ...
real 8m36.120s
user 0m0.549s
sys 5m24.918s
done
data_32GB.bin ...
real 17m25.319s
user 0m1.146s
sys 10m54.813s
done
data_01GB.bin ...
real 0m25.629s
user 0m0.016s
sys 0m20.244s
done
data_02GB.bin ...
real 0m55.077s
user 0m0.059s
sys 0m39.600s
done
data_04GB.bin ...
panics, HPMC PCI timeout


PowerMacG4, SYBA-SY-PCX40009

PostPosted: Fri May 17, 2019 8:38 pm
by madame
burnin test, 10 hours

test1

script for the test
Code: Select all
# loop forever over all test files, copying each one with mycp
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        mycp "$item" copy.out
        echo "done"
    done
done


Code: Select all
S4: PCI-64/33, pci3, 5 V <--------- tested here
S3: PCI-64/33, pci2, 5 V
S2: PCI-64/33, pci1, 5 V
S1: PCI-64/33, pci0, 5 V
S0: AGP 4x

kernel 4.16: no issue, no problem

log, 1 cycle
...
data_01GB.bin ...
real 0m34.809s
user 0m9.983s
sys 0m14.099s
done
data_02GB.bin ...
real 1m13.730s
user 0m20.158s
sys 0m29.842s
done
data_04GB.bin ...
real 2m31.833s
user 0m39.816s
sys 1m0.220s
done
data_08GB.bin ...
real 4m57.093s
user 1m20.328s
sys 1m59.137s
done
data_16GB.bin ...
real 9m57.283s
user 2m41.205s
sys 3m59.818s
done
data_32GB.bin ...
real 20m2.895s
user 5m21.326s
sys 8m4.992s
done
...


SYBA-SY-PCX40009

PostPosted: Sat May 18, 2019 10:53 am
by LordCrimson
PowerMacG4 MDD | PCI64@33Mhz 5V   | burn-in test | success
HPPA C3600     | PCI64@33Mhz 5V   | burn-in test | failed!!, HPMC PCI timeout
HPPA C3600     | PCI64@66Mhz 3.3V | burn-in test | failed!!, HPMC PCI timeout
HPPA C3600     | PCI32@33Mhz 5V   | burn-in test | failed!!, HPMC PCI timeout


According to Apple's specs, in the default configuration, the PowerMacG4 MDD has four open 33 MHz 64-bit PCI slots, and a 4X AGP slot occupied by the graphics card.

HighPoint RocketRAID 2224, PowerMacG4

PostPosted: Sun May 19, 2019 9:14 am
by madame
burnin test, 10 hours

test1

script for the test
Code: Select all
# loop forever over all test files, copying each one with mycp
while true
do
for item in *.bin
    do
        rm -f copy.out
        echo -n "$item ... "
        mycp "$item" copy.out
        echo "done"
    done
done


Code: Select all
S4: PCI-64/33, pci3, 5 V <--------- tested here
S3: PCI-64/33, pci2, 5 V
S2: PCI-64/33, pci1, 5 V
S1: PCI-64/33, pci0, 5 V
S0: AGP 4x

kernel 4.16: no issue, no problem

log, 1 cycle
...
data_01GB.bin ...
real 0m32.805s
user 0m9.627s
sys 0m14.282s
done
data_02GB.bin ...
real 1m9.836s
user 0m19.408s
sys 0m30.725s
done
data_04GB.bin ...
real 2m21.577s
user 0m38.571s
sys 1m0.949s
done
data_08GB.bin ...
real 4m45.360s
user 1m18.418s
sys 2m1.852s
done
data_16GB.bin ...
real 9m26.800s
user 2m34.980s
sys 4m7.087s
done
data_32GB.bin ...

...



Note:
It's not clear whether the card needs to be forced into conventional PCI mode rather than PCI-X mode.

sata_sil24, sata_mv, a few notes

PostPosted: Sun May 19, 2019 9:24 am
by madame
sata_sil24
It seems likely to me that the sil24 driver depends on the card operating in PCI-X mode. PCI-X differs from PCI in interrupt handling and the driver has this flag SIL24_FLAG_PCIX_IRQ_WOC.

a comment in sata_sil24.c
If PCIX_IRQ_WOC, there's an inherent race window between clearing IRQ pending status and reading PORT_SLOT_STAT which may cause spurious interrupts afterwards. This is unavoidable and much better than losing interrupts, which happens if IRQ pending is cleared after reading PORT_SLOT_STAT.


sata_mv
There is a register on the chip that software could use to override the normal auto-detected PCI mode and bus speed for the chip. This could be used to, say, select 100Mhz or 66Mhz, or even 33Mhz operation. However, the register is auto-detected from the bus at power-on, so if software wants to override it by rewriting the register, it will also need to reset the PCI bus afterward, which requires knowing how to reset a PCI bridge.
Code: Select all
sata_mv 0000:03:06.0: PCI ERROR; PCI IRQ cause=0x30000040

This error message might be reported for a similar reason, but it's not clear to me.
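
A first step towards reading that message is to see which bits of the cause value are set. A minimal sketch; `set_bits` is a hypothetical helper name. It only lists bit positions and does not interpret them; what each bit means has to be looked up in the register definitions in sata_mv.c.

```shell
# Print the positions of the bits set in a 32-bit register value.
set_bits() {
    v=$(( $1 ))
    i=0
    out=""
    while [ "$i" -lt 32 ]; do
        if [ $(( (v >> i) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$i"
        fi
        i=$(( i + 1 ))
    done
    echo "$out"
}

set_bits 0x30000040             # prints 6 28 29
```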