Make HAXM run on system with more than 64 host CPUs by coxuintel · Pull Request #255 · intel/haxm
Although this patch touches many files, it does only one thing: make HAXM run on systems with more than 64 logical CPUs.
Previously, HAX_MAX_CPUS was defined as 64, and HAXM stored the CPU online bitmap in a single 64-bit variable. On a system with more than 64 logical CPUs, an IPI call is actually executed on all CPUs, even though HAXM internally maintains only a 64-bit bitmap. So some per-CPU routines actually run on different CPUs, but the loop over 64 CPUs checks the same VMXON/VMXOFF pair for VMX operations, which leads to the error. Simply widening the 64-bit bitmap might resolve the issue, but it would be neither efficient nor clean.
The previous implementation also has another issue: it calls KeQueryActiveProcessors() to get the total number of logical CPUs and KeGetCurrentProcessorNumber() to get the current logical CPU ID. However, neither API is designed for Windows systems with more than one CPU group. On such systems, both APIs only return values for group 0, which do not reflect the actual logical CPU information. KeQueryActiveProcessorCountEx() and KeGetCurrentProcessorNumberEx() should be used instead.
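The collision described above can be illustrated with a small user-space model. This is only a sketch: `proc_id_t`, `legacy_cpu_id()` and `legacy_ids_collide()` are hypothetical names, and the assumption that a group-unaware query reports nothing but a group-relative number is a simplification of the behaviour described in this PR, not the Windows API itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical (group, number) identity of a logical CPU. Not a HAXM or
 * Windows type; it only models what a group-aware query would report. */
typedef struct {
    uint16_t group;   /* processor group index */
    uint8_t  number;  /* index within the group, 0..63 */
} proc_id_t;

/* Assumption: group-unaware code only ever sees the group-relative number. */
static uint32_t legacy_cpu_id(proc_id_t p)
{
    return p.number;
}

/* Returns 1 when two distinct CPUs look identical to group-unaware code. */
static int legacy_ids_collide(proc_id_t a, proc_id_t b)
{
    int distinct;

    assert(a.number < 64 && b.number < 64);
    distinct = (a.group != b.group) || (a.number != b.number);
    return distinct && legacy_cpu_id(a) == legacy_cpu_id(b);
}
```

Under this model, CPU 6 of group 0 and CPU 6 of group 1 map to the same legacy ID, so per-CPU state indexed by that ID would be shared by two different CPUs.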
This patch defines the CPU bitmap in a two-dimensional way, in units of groups: each group holds a bitmap of up to 64 CPUs, the same as the old implementation. It also introduces another array that stores the group/bit position of each CPU so that indexing is fast. In addition, the patch unifies the CPU init routines and moves more common implementation, such as cpu_info_init(), smp_cfunction() and smp_call_parameter{}, into the same header/source instead of OS-specific files. Since these are very fundamental functions, several files are modified.
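A minimal sketch of such a two-dimensional layout, filled consecutively as on a non-Windows host, might look like the following. All `sketch_`/`SKETCH_` names, the field names and the fixed array sizes are assumptions made for illustration; they are not the actual hax_cpumap_t definition from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_GROUP_BITS 64u  /* CPUs per group, as in the description */
#define SKETCH_MAX_GROUPS 4u   /* small fixed cap, for this sketch only */
#define SKETCH_MAX_CPUS   (SKETCH_MAX_GROUPS * SKETCH_GROUP_BITS)

typedef struct {
    uint64_t map;  /* online bitmap within this group */
    uint32_t num;  /* number of CPUs in this group */
} sketch_cpu_group_t;

typedef struct {
    uint16_t group;  /* group that holds this CPU */
    uint16_t bit;    /* bit position inside the group bitmap */
} sketch_cpu_pos_t;

typedef struct {
    sketch_cpu_group_t groups[SKETCH_MAX_GROUPS];  /* per-group bitmaps */
    sketch_cpu_pos_t   pos[SKETCH_MAX_CPUS];       /* per-CPU position */
    uint16_t           group_num;                  /* total groups */
    uint32_t           cpu_num;                    /* total logical CPUs */
} sketch_cpumap_t;

/* Fill the groups consecutively: CPU ids 0..63 go to group 0, 64..127 to
 * group 1, and so on. Returns 0 on success, -1 if cpu_num is out of range. */
static int sketch_cpumap_fill_consecutive(sketch_cpumap_t *m, uint32_t cpu_num)
{
    uint32_t id;

    assert(m != NULL);
    if (cpu_num == 0 || cpu_num > SKETCH_MAX_CPUS)
        return -1;

    memset(m, 0, sizeof(*m));
    m->cpu_num = cpu_num;
    m->group_num =
        (uint16_t)((cpu_num + SKETCH_GROUP_BITS - 1) / SKETCH_GROUP_BITS);

    for (id = 0; id < cpu_num; id++) {
        uint16_t g = (uint16_t)(id / SKETCH_GROUP_BITS);
        uint16_t b = (uint16_t)(id % SKETCH_GROUP_BITS);

        m->groups[g].map |= (uint64_t)1 << b;
        m->groups[g].num++;
        m->pos[id].group = g;
        m->pos[id].bit = b;
    }
    return 0;
}
```

With 72 logical CPUs, this sketch produces two groups: a full group 0 and a group 1 with its low 8 bits set, and CPU 70 resolves to group 1, bit 6 through a single array lookup.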
Change summary:
- Define a two-dimensional structure, hax_cpumap_t, to store CPU bitmap info, including the total group number, the total logical CPU number, the bitmap within each group, and the group/bit position for each CPU ID.
- For Windows/Linux/Darwin/BSD, use a similar routine, cpu_info_init(), to initialize host CPUs, and implement it in an OS-specific way.
- On Windows, use KeQueryActiveProcessorCountEx(), KeGetCurrentProcessorNumberEx() and KeQueryActiveGroupCount() to get the correct number of logical CPUs and the group information, and fill hax_cpumap_t.
- For Linux/Darwin/BSD, use an OS-specific routine to get the total number of logical CPUs, and fill them into groups consecutively. This differs from Windows, because the logical processor group is a Windows concept, and there the CPU bitmap is not guaranteed to fit consecutively into a 64-bit bitmap: 64 logical CPUs could be spread across two groups.
- Implement the new cpu_is_online() function with the new hax_cpumap_t.
- Implement the new cpu2cpumap() function with the new hax_cpumap_t.
- Unify the OS-specific smp_cfunction() implementations into one. Since the function is executed by IPI on each CPU, check whether the CPU is online with the new cpu_is_online().
- For all functions that refer to the CPU bitmap, use the new implementation.
- For all per-CPU IPI functions, add the current CPU ID to the log.
After this patch, the HAXM design no longer blocks running on systems with a large number of CPUs, and it is easy to extend if it becomes limited by the data type range: the upper bound is now 65536*64 logical CPUs, regardless of other resource limitations.
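The stated bound can be checked with a line of arithmetic. The sketch below assumes that the factor 65536 comes from a 16-bit unsigned group index, which the description implies but does not state; `max_addressable_cpus()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on addressable logical CPUs, assuming a 16-bit group index
 * and one 64-bit bitmap per group. */
static uint32_t max_addressable_cpus(void)
{
    uint32_t max_groups = (uint32_t)UINT16_MAX + 1u;  /* 65536 groups */
    uint32_t bits_per_group = 64u;                    /* one bit per CPU */

    assert(max_groups == 65536u);
    return max_groups * bits_per_group;
}
```

Under that assumption the bound works out to 4,194,304 logical CPUs, which still fits comfortably in a 32-bit CPU count.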
Signed-off-by: Colin Xu <colin.xu@intel.com>