1. Introduction

1.1 SCSI's flexibility

One of SCSI's great features is its flexibility. One can connect up to 7 (Narrow) or even 15 (Wide) devices to a single adapter, each identified by its unique SCSI-id. If one uses several LUNs (the maximum is 8) per SCSI-id, even more devices can be connected to a single adapter, up to a maximum of 15 x 8 = 120 devices. Furthermore, one can easily connect and disconnect external devices.

1.2 The Linux kernel and SCSI

SCSI's flexibility poses a problem for the Linux kernel. The kernel has a fixed major device number for all SCSI devices, so it has "only" 256 possible minor numbers to assign to all possible SCSI devices. However, it needs these minor numbers to uniquely identify the partitions on these devices as well. Assuming a maximum of 16 partitions per device, the kernel would need 120 x 16 = 1920 minor numbers to identify all possible devices and partitions on a single adapter, which far exceeds the maximum of 256.
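The arithmetic above can be checked with a few lines of Python. This is only a worked version of the numbers in the text; the constant names are illustrative.

```python
# Worked version of the arithmetic in 1.1 and 1.2; all names are illustrative.
MAX_DEVICE_IDS = 15         # device SCSI-ids on a Wide adapter
MAX_LUNS = 8                # LUNs per SCSI-id
PARTITIONS_PER_DEVICE = 16  # assumption made in the text
AVAILABLE_MINORS = 256      # minors available under one major number

devices = MAX_DEVICE_IDS * MAX_LUNS
needed_minors = devices * PARTITIONS_PER_DEVICE
statically_addressable = AVAILABLE_MINORS // PARTITIONS_PER_DEVICE

print(devices, needed_minors, statically_addressable)
```

With these figures, a fixed mapping could cover only 16 of the 120 possible devices.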

To solve this problem, the kernel assigns minor SCSI device numbers dynamically: it assigns them only to devices that are actually connected, in order of their SCSI-ids (mostly ascending).

2. An impractical Linux SCSI problem

2.1 General

A result of the dynamic allocation of minor device numbers is that Linux cannot uniquely identify a SCSI device with a given SCSI-id by its minor device number, because the minor depends on the number of connected SCSI devices with a lower SCSI-id. This means that connecting an external SCSI device changes the minors of all internal devices with a higher SCSI-id, forcing one to change /etc/fstab when connecting the external device and to change it back when disconnecting it.
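As a minimal sketch of this effect, the Python snippet below models the allocation for disks only. It assumes 16 minors per disk (the figure used in 1.2) and a strictly ascending scan; the function name and the SCSI-ids are illustrative and not taken from the kernel.

```python
MINORS_PER_DISK = 16  # assumption: one block of 16 minors per disk

def first_minors(connected_ids):
    """Map each connected SCSI-id to the first minor of its disk (simplified model)."""
    ordered = sorted(connected_ids)  # assumption: strictly ascending scan
    return {scsi_id: index * MINORS_PER_DISK
            for index, scsi_id in enumerate(ordered)}

internal_only = first_minors([0, 8])     # two internal disks
with_external = first_minors([0, 5, 8])  # external disk with SCSI-id 5 added

print(internal_only)
print(with_external)
```

In this model the disk with SCSI-id 8 moves from minor 16 to minor 32 once the external disk is connected, and minor 16 now belongs to the external disk.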

2.2 Wide SCSI

Normally, "Narrow" SCSI allows up to 8 SCSI-ids (0 to 7); Wide SCSI, however, allows up to 16 SCSI-ids (0 to 15). Hence one can connect up to 15 devices to a Wide SCSI adapter, since the adapter itself takes one SCSI-id. Wide SCSI is "backward" compatible, so one can connect both Wide and Narrow devices to a Wide SCSI adapter, which is convenient given that most CD-ROMs, tape devices and "older" disks are still Narrow.

When both Narrow and Wide devices are connected to a Wide adapter, one might prefer to assign the high SCSI-ids (8 to 15) to the Wide devices, leaving the lower SCSI-ids available for Narrow devices. Unfortunately, this results in exactly the problem described in 2.1: an external Narrow disk inevitably has a lower SCSI-id than the internal Wide devices, and hence changes their minor device numbers.

3. A solution

3.1 The cause of the problem

As described, the problem in 2.1 is caused by the fact that the kernel assigns minor SCSI device numbers in order of the devices' SCSI-ids. This implies that one could solve the problem if one could make the kernel scan the devices in a different order.

3.2 The scanorder fix

The fix is a small patch which allows one to specify a boot parameter "scsi_scanorder" or a scsi_mod module parameter "scanorder" which will make the kernel scan the SCSI devices in the specified order.

The current implementation, however, has some "rough edges" which may need to be removed. One of them is the fact that the scanorder is tied to specific host adapters by their io address or their base address. I don't consider this to be ideal, because inserting another PCI board may change these addresses. A PCI slot id would be more useful in those cases, but it is not a field in "struct Scsi_Host". This may not be a real problem, in which case the actual problem is that I'm lazy :-)

Another "rough edge" is the fact that I'm not sure if it runs on non-Intel hardware. This was my main reason to introduce the "base=.." qualifier. I'm not sure, though, whether the scanorder fix is portable now...

For example, "scsi_scanorder=io=0x330:1,6;base=0xfdffa000:2,3,10;1,8" means that the devices of the adapter at io=0x330 (e.g. an AHA1542) are scanned in the following order: 1,6,0,2,3,4,5,7. The adapter with base=0xfdffa000 (e.g. an AHA2940UW) is scanned in the order: 2,3,10,0,1,4,5,6,7,8,9,11,12,13,14,15. Any other Wide adapter is scanned in the order 1,8,0,2,3,4,5,6,7,9,10,11,12,13,14,15. Any other Narrow adapter is scanned in the order 1,0,2,3,4,5,6,7. In each case the listed SCSI-ids come first, followed by the remaining SCSI-ids in ascending order; a listed SCSI-id that does not exist on a Narrow adapter (such as 8) is skipped.
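The expansion in this example can be sketched in Python. This is only a reading of the example above, not the patch's actual parser: the function names parse_scanorder and full_scan_order are hypothetical, and the syntax rules (entries separated by ";", an optional "io=" or "base=" qualifier ending in ":", ids separated by ",") are inferred from that single example.

```python
def parse_scanorder(value):
    """Split a scanorder string into {adapter selector: [preferred ids]}.

    The selector is ("io", address), ("base", address), or None for the
    entry without a qualifier, which applies to all other adapters.
    """
    orders = {}
    for entry in value.split(";"):
        selector = None
        if ":" in entry:
            qualifier, entry = entry.split(":", 1)
            kind, address = qualifier.split("=", 1)
            selector = (kind, int(address, 16))
        orders[selector] = [int(scsi_id) for scsi_id in entry.split(",")]
    return orders


def full_scan_order(preferred, num_ids):
    """Preferred ids first, then the remaining ids in ascending order.

    num_ids is 8 for a Narrow adapter and 16 for a Wide one; preferred
    ids that do not exist on the adapter are dropped.
    """
    head = [scsi_id for scsi_id in preferred if scsi_id < num_ids]
    rest = [scsi_id for scsi_id in range(num_ids) if scsi_id not in head]
    return head + rest


orders = parse_scanorder("io=0x330:1,6;base=0xfdffa000:2,3,10;1,8")
print(full_scan_order(orders[("io", 0x330)], 8))
print(full_scan_order(orders[None], 8))
```

The two calls at the end reproduce the orders given above for the adapter at io=0x330 and for other Narrow adapters.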

4. Alternatives

4.1 Choosing low SCSI-ids for internal devices

One alternative to this fix might be to choose low SCSI-ids for the internal devices. As described in 2.2, this limits flexibility when the internal devices are Wide and the external devices are Narrow.

4.2 Devfs

There is a patch that implements devfs. It results in a kernel-generated pseudo /dev directory in which the SCSI devices are given names reflecting their actual SCSI-ids. Devfs isn't included in the kernel yet, so the fix described here might be of help until it is.

4.3 SCSIdev

This is a program that is run during boot. Like devfs, it creates /dev entries which reflect the actual SCSI-ids etc.

4.4 Why the scanorder fix?

Alternative 4.1 restricts one's flexibility. Alternative 4.2 is still under development. Alternative 4.3 fixes a low-level problem at a high level.

The scanorder fix fixes the low-level problem at a low level: it's fixed where it originates. It's simple, it's optional, it doesn't hurt.

5. Where to get the scanorder fix

The "latest and the greatest":

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.2.14.patch.gz

Previous ones:

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.2.10-990730.patch.gz

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.2.10-990705.patch.gz

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.2.11-990816.patch.gz

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.3.13-990818.patch.gz

http://flits102-126.flits.rug.nl/~rolf/scanorder/scanorder-2.3.16-990904.patch.gz

6. Note

I posted a similar fix on linux-kernel about a year ago. This resulted in a lot of feedback, from which I learned about the alternatives in 4. The feedback also gave me reasons to think that combining Wide devices with high SCSI-ids and Narrow devices on one bus was a bad idea. It was pointed out that all 16 bits of the Wide SCSI bus are used for arbitration when devices want to "have the bus". Because Narrow SCSI only sees the lower 8 bits, arbitration would fail when Wide devices use the high SCSI-ids (and hence the upper 8 bits). It was also pointed out, however, that there is a fixed ranking when arbitrating between several devices that want the bus, based on their SCSI-ids: 7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8. This means that the highest SCSI-id gets the bus, but Narrow SCSI-ids get it before Wide SCSI-ids. To me this seems to solve the problem: when both a Wide device (high SCSI-id) and a Narrow device want the bus, the Narrow device gets it, so it doesn't need to know about the Wide devices. I'm not sure however....

...but there's practice as well. I run a 2.2.14 kernel on a dual PII with an AHA2940U/UW. I have two Wide internal SCSI disks (SCSI-id 0 and 8) and a Narrow CD-ROM (SCSI-id 6). I sometimes connect an external Narrow disk (SCSI-id 5), and I _NEVER_ have any problem at all.
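The ranking quoted above can be written down as a small Python sketch. It only encodes the priority list from the feedback I received; the function name is hypothetical and nothing here models the actual bus signalling.

```python
# Arbitration ranking as quoted above, highest priority first.
ARBITRATION_PRIORITY = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]

def arbitration_winner(requesting_ids):
    """Return the SCSI-id that wins the bus among those requesting it."""
    return min(requesting_ids, key=ARBITRATION_PRIORITY.index)

# SCSI-ids from my own setup: Wide disk at 8, external Narrow disk at 5.
print(arbitration_winner([8, 5]))
```

Under this ranking any Narrow SCSI-id (0 to 7) beats any SCSI-id from 8 to 15, which is the property my argument relies on.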

7. Feedback

Any feedback can be mailed to me: rolf@flits102-126.flits.rug.nl. Feedback that helps me is especially useful. For me the minor SCSI device allocation _IS_ a problem, and I think I'm not the only one. So anything that helps me solve this problem (it needn't be my scanorder fix) is welcome. Don't just tell me that it's wrong, tell me how to do it right. Thanks.
