What is the prefetchable memory range of a PCI bridge? What does prefetchable mean? I am talking about the PCI-to-PCI bridge spec...
Posted on 2010-07-17 04:59:08 by logicman112
logicman112,

Some PCI devices indicate that their memory address range is prefetchable (i.e. reads have no side effects and the host bridge can merge writes; effectively it behaves like normal memory but doesn't participate in the PCI caching protocol). These ranges are mapped into the prefetchable memory address range of the corresponding PCI bridge.
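
A device makes this indication in its Base Address Registers: in a memory BAR, bit 0 is 0, bits 2:1 give the address width, and bit 3 is the prefetchable flag. A minimal sketch of testing that flag follows; the helper name bar_is_prefetchable and the macro names are made up for illustration, and reading the BAR out of configuration space is assumed to happen elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of a BAR as laid out in the PCI Local Bus Specification. */
#define BAR_IO_SPACE      0x1u  /* bit 0: 1 = I/O BAR, 0 = memory BAR     */
#define BAR_MEM_TYPE_MASK 0x6u  /* bits 2:1: 00 = 32-bit, 10 = 64-bit     */
#define BAR_MEM_PREFETCH  0x8u  /* bit 3: 1 = prefetchable memory         */

/* Hypothetical helper: true if the raw BAR value describes
 * prefetchable memory. I/O BARs never qualify, because bit 3 is
 * part of the address in an I/O BAR. */
static bool bar_is_prefetchable(uint32_t bar)
{
    if (bar & BAR_IO_SPACE)
        return false;
    return (bar & BAR_MEM_PREFETCH) != 0;
}

/* Hypothetical helper: true if a memory BAR is 64 bits wide and
 * therefore also consumes the next BAR slot. */
static bool bar_is_64bit(uint32_t bar)
{
    return !(bar & BAR_IO_SPACE) && (bar & BAR_MEM_TYPE_MASK) == 0x4u;
}
```

With made-up sample values, bar_is_prefetchable(0xD0000008) is true (a frame-buffer style BAR), while 0xFEB00000 (a register BAR) and 0x0000E001 (an I/O BAR) both give false.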

Essentially, the host can read or write memory in a prefetchable range in bigger chunks than the program requests without breaking device functionality. A video controller's frame buffer is a good example.

The PCI Local Bus Specification contains more detailed information on the prefetch feature.
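
On the bridge side, the prefetchable range is described by the Prefetchable Memory Base and Limit registers in the bridge's type 1 configuration header (offsets 0x24 and 0x26, with optional upper 32 bits at 0x28 and 0x2C). The window is 1 MB aligned: the top 12 bits of each 16-bit register supply address bits 31:20, and the low 4 bits say whether 64-bit addressing is supported. The sketch below is only a software model of the comparison the bridge makes; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded prefetchable window of a PCI-to-PCI bridge (hypothetical type). */
struct pf_window {
    uint64_t base;   /* first address inside the window */
    uint64_t limit;  /* last address inside the window  */
};

/* Build the window from the raw configuration register values. */
static struct pf_window decode_pf_window(uint16_t base_reg, uint16_t limit_reg,
                                         uint32_t base_upper, uint32_t limit_upper)
{
    struct pf_window w;

    /* Register bits 15:4 are address bits 31:20. */
    w.base  = (uint64_t)(base_reg  & 0xFFF0u) << 16;
    /* The low 20 bits of the limit are implied to be all ones. */
    w.limit = ((uint64_t)(limit_reg & 0xFFF0u) << 16) | 0xFFFFFu;

    /* Low nibble 0x1 means the upper 32-bit registers are implemented. */
    if ((base_reg & 0xFu) == 0x1u) {
        w.base  |= (uint64_t)base_upper  << 32;
        w.limit |= (uint64_t)limit_upper << 32;
    }
    return w;
}

/* True if a memory address falls inside the window. A base above the
 * limit means the window is disabled and nothing matches. */
static bool pf_window_claims(const struct pf_window *w, uint64_t addr)
{
    return w->base <= w->limit && addr >= w->base && addr <= w->limit;
}
```

For example, base_reg = 0xD000 and limit_reg = 0xDFF0 decode to the window 0xD0000000 through 0xDFFFFFFF.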
Posted on 2010-07-17 08:12:43 by baldr
Yeah, a framebuffer is a good example of prefetchable memory. It doesn't matter whether you read it byte by byte or in big chunks, or whether writes to it get merged. It's just data. An example of non-prefetchable memory would be an area where a device's registers are mapped. Every single read or write might affect the device's behavior, so you have to perform reads and writes to those regions in very specific ways at very specific moments.
Posted on 2010-07-17 10:29:54 by ti_mo_n
Thanks for the replies.

Can you please clarify that with an assembly snippet? Does prefetchable mean that I write one byte into memory (this memory is in fact some device's registers) but the CPU writes 10 bytes, for example?

Posted on 2010-07-18 00:09:59 by logicman112
It doesn't have anything to do with assembly. It's a hardware thing.

It means that another device (any device, not just the CPU) can PREFETCH chunks of memory without any negative impact on the device's operation.

Let's see an example from the CPU's point of view: it executes a 1-byte read from some location. Assuming that the data is not in the L1 cache, the L1 cache asks the L2 cache to fetch the data. The L2 cache, in turn, performs a burst transaction of 64 bytes (1 cache line). The requested byte is somewhere within this 64-byte region. Finally, the CPU's register is filled with this 1 byte and execution continues, but from the memory's point of view the CPU has read 64 bytes, not just one. In most cases that's not a problem (and actually that's what caching is for), but in some cases it may be.

Let's say that an external device's registers are mapped into some address range. This device has two 1-byte registers (2 bytes in total), which are the device's timers. Reading the first byte returns the first timer's current value, while reading the second byte returns the other timer's value _AND_ resets the first timer (this is very common behavior in many devices). Now imagine what happens if you try to read the first byte with full caching enabled. The CPU will burst in 64 bytes, which includes the second byte, so the first timer gets reset in the process. And that's not what you wanted when you requested only the first byte.
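
The timer scenario can be mimicked in plain C. This is only a simulation under stated assumptions: struct timer_dev, dev_read and the two read helpers are made-up names, the "device" is an ordinary struct, and the 64-byte loop stands in for a cache-line burst.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated device with two 1-byte timer registers (hypothetical). */
struct timer_dev {
    uint8_t timer0;  /* register at offset 0 */
    uint8_t timer1;  /* register at offset 1 */
};

/* Device-side handling of a 1-byte read. Reading offset 1 has a
 * side effect: it resets timer0. */
static uint8_t dev_read(struct timer_dev *d, size_t offset)
{
    if (offset == 0)
        return d->timer0;
    if (offset == 1) {
        uint8_t value = d->timer1;
        d->timer0 = 0;
        return value;
    }
    return 0xFF;  /* bytes with no register behind them, in this model */
}

/* Exact read: only the requested byte is touched. */
static uint8_t read_exact(struct timer_dev *d, size_t offset)
{
    return dev_read(d, offset);
}

/* Prefetching read: fetch a whole 64-byte line, then hand back the
 * one byte that was asked for (offset must be below 64). */
static uint8_t read_with_prefetch(struct timer_dev *d, size_t offset)
{
    uint8_t line[64];
    for (size_t i = 0; i < sizeof line; i++)
        line[i] = dev_read(d, i);
    return line[offset];
}
```

Starting from timer0 = 42, read_exact(&dev, 0) returns 42 and leaves timer0 alone. read_with_prefetch(&dev, 0) also returns 42, but afterwards timer0 is 0, because the burst touched offset 1 along the way.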

The above applies to write operations as well.

To prevent such unwanted behavior, some devices explicitly state that a portion of their memory is NOT prefetchable. By stating that, they mean that you should read exactly those bytes that you want to read and no more (in other words: you should not prefetch the data), because, for example, 2 registers may be mapped right next to each other, and reading them both at once in 1 burst may give you different results than reading the first one and then the second one.

Another scenario is that you may need to read a device's registers in a precisely specified order. Performing a burst prevents you from controlling the order in which you actually read data from an external device.

The above also applies to write operations.

To sum things up:
- Prefetching does not preserve the order in which you request data
- Prefetching does not preserve the amount of data requested

I hope it's clear now ^^
Posted on 2010-07-18 01:16:31 by ti_mo_n
Moreover, a memory read in a prefetchable range returns all bytes regardless of the PCI BE[3:0]# signals (byte enables); the PCI commands Memory Read Line and Memory Read Multiple are used more often than Memory Read to decrease the latency of sequential reads.
Posted on 2010-07-18 01:47:52 by baldr
Thanks a lot ti_mo_n.

Baldr,

How does BE work? How can I make the BE# signal active for some bytes and inactive for the others? Besides, if the second and fourth bytes are enabled, for example, how are these two bytes transferred into a 32-bit register (or a 16-bit register)?
Posted on 2010-07-18 06:12:05 by logicman112

How does BE work? How can I make the BE# signal active for some bytes and inactive for the others? Besides, if the second and fourth bytes are enabled, for example, how are these two bytes transferred into a 32-bit register (or a 16-bit register)?


You can't, mostly. Two sequential writes to the bytes at addresses 4*N+1 and 4*N+3 in a prefetchable range will probably cause a single Memory Write bus command with BE[1]# and BE[3]# asserted during the data phase, due to byte merging.
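
The merging described above can be worked through in a few lines of C. This is a sketch of the arithmetic only, not of any real bridge: all function names are hypothetical. It relies on two facts of 32-bit PCI: byte lane i of a dword travels on AD[8*i+7 : 8*i], and the byte enables are active low, so an asserted BE# pin reads as 0.

```c
#include <stdint.h>

/* Lane mask for one access of `size` bytes (1..4) at `addr`:
 * bit i set means byte lane i carries valid data. The access is
 * assumed not to cross a dword boundary. */
static uint8_t lane_mask(uint32_t addr, unsigned size)
{
    unsigned first = addr & 3u;
    return (uint8_t)((((1u << size) - 1u) << first) & 0xFu);
}

/* Merging two writes to the same dword simply ORs their lane masks. */
static uint8_t merge_masks(uint8_t a, uint8_t b)
{
    return (uint8_t)(a | b);
}

/* Level seen on the BE[3:0]# pins during the data phase. The pins
 * are active low, so enabled lanes show up as 0 bits. */
static uint8_t be_pins(uint8_t mask)
{
    return (uint8_t)(~mask & 0xFu);
}

/* Position a byte on its lane of the 32-bit data bus. */
static uint32_t place_on_lane(uint32_t addr, uint8_t value)
{
    return (uint32_t)value << (8u * (addr & 3u));
}
```

For byte writes to 0x1001 and 0x1003, the lane masks are 0x2 and 0x8, the merged mask is 0xA (lanes 1 and 3), and the pins carry 0x5. Writing 0xAA and 0xBB to those two addresses puts 0xBB00AA00 on the data bus, with lanes 0 and 2 ignored by the target.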

Haven't you read the PCI specs yet?
Posted on 2010-07-18 09:55:24 by baldr
How does a bridge know to compare an incoming address against its prefetchable address range? Please explain it in terms of PCI signaling.
Posted on 2010-07-24 01:18:36 by logicman112