Intel 253668-032US User Manual

MULTIPLE-PROCESSOR MANAGEMENT
The act of one processor writing data into the currently executing code segment of a 
second processor with the intent of having the second processor execute that data as 
code is called cross-modifying code. As with self-modifying code, IA-32 processors 
exhibit model-specific behavior when executing cross-modifying code, depending 
upon how far ahead of the executing processor's current execution pointer the code 
has been modified. 
To write cross-modifying code and ensure that it is compliant with current and future 
versions of the IA-32 architecture, the following processor synchronization algorithm 
must be implemented:
(* Action of Modifying Processor *)
Memory_Flag ← 0; (* Set Memory_Flag to value other than 1 *)
Store modified code (as data) into code segment;
Memory_Flag ← 1;

(* Action of Executing Processor *)
WHILE (Memory_Flag ≠ 1)
    Wait for code to update;
ELIHW;
Execute serializing instruction; (* For example, CPUID instruction *)
Begin executing modified code;
(The use of this option is not required for programs intended to run on the Intel486 
processor, but is recommended to ensure compatibility with the Pentium 4, Intel 
Xeon, P6 family, and Pentium processors.)
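The flag handshake in the algorithm above can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the names patch_area, memory_flag, modifying_processor, executing_processor, and serialize are hypothetical, the buffer is ordinary data standing in for the bytes of the code segment, and the CPUID step assumes GCC or Clang inline assembly on an x86 target (other targets fall back to a plain fence, which is not a serializing instruction). Actually executing the patched bytes would also require an executable mapping, which is outside the scope of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the bytes of the code segment being modified. */
static unsigned char patch_area[16];

/* Memory_Flag from the algorithm; any value other than 1 means "not ready". */
static atomic_int memory_flag;

/* Action of the modifying processor. */
static void modifying_processor(const unsigned char *new_code, size_t len)
{
    assert(len <= sizeof patch_area);
    atomic_store(&memory_flag, 0);      /* Memory_Flag <- 0 */
    memcpy(patch_area, new_code, len);  /* store modified code as data */
    atomic_store(&memory_flag, 1);      /* Memory_Flag <- 1 */
}

/* Execute a serializing instruction (CPUID) where the toolchain allows it. */
static void serialize(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax = 0, ebx, ecx = 0, edx;
    /* volatile plus the "memory" clobber keep the compiler from dropping
       or reordering the instruction. */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
#else
    /* Assumption: no CPUID available; this fence is only a placeholder. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Action of the executing processor: wait, serialize, then hand back the
   location of the updated bytes. */
static const unsigned char *executing_processor(void)
{
    while (atomic_load(&memory_flag) != 1) {
        /* wait for code to update */
    }
    serialize();
    return patch_area;
}
```

In a real system the two functions would run on different processors; calling modifying_processor first and executing_processor second on a single thread exercises the same sequence without the wait loop ever spinning.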
Like self-modifying code, cross-modifying code will execute at a lower level of 
performance than non-cross-modifying (normal) code, depending upon the frequency of 
modification and specific characteristics of the code.
The restrictions on self-modifying code and cross-modifying code also apply to the 
Intel 64 architecture.
8.1.4 Effects of a LOCK Operation on Internal Processor Caches
For the Intel486 and Pentium processors, the LOCK# signal is always asserted on the 
bus during a LOCK operation, even if the area of memory being locked is cached in 
the processor.
For the P6 and more recent processor families, if the area of memory being locked 
during a LOCK operation is cached in the processor that is performing the LOCK 
operation as write-back memory and is completely contained in a cache line, the 
processor may not assert the LOCK# signal on the bus. Instead, it will modify the 
memory location internally and allow its cache coherency mechanism to ensure that 
the operation is carried out atomically. This operation is called “cache locking.” The 
cache coherency mechanism automatically prevents two or more processors that