Intel Xeon Phi for "cheap" - Matthew Francis-Landau

(This work and post are originally from early 2015. Some aspects may still be useful, e.g. the kernel patch for lower-end motherboards.)

Recently Intel has been selling a version of their Xeon Phi coprocessor under a promotional deal at 90% off. This means that one can get a device with 8GB of RAM (on the coprocessor) and 228 hardware threads (57 physical cores, each with 4 hyper-threads) at a reasonable price of ~$200.

When I first purchased the Phi, I was planning to put it into a somewhat old desktop system that I had lying around. However, that motherboard did not support the major requirement of “Above 4G decoding” on the PCI bus. Above 4G decoding controls whether the system can map the memory resources of PCI devices at addresses above the 4GB boundary. Unlike consumer-level GPUs, the Intel Phi presents all 8GB of its memory as a memory-mapped region to the host computer. (more about 4G decoding)

Based on some research into this obscure feature, it appeared that most “modern” motherboards have some support for it. I decided to get an Asus H97M-PLUS, which is fairly cheap and fit the computer tower that I already had on hand. While this motherboard does list Above 4G decoding in its BIOS and manual, I am not sure that the feature has been properly tested: unlike Asus's higher-end motherboards, there was no mention of this motherboard specifically working with Above 4G decoding.

From examining the early boot sequence, it appeared that the Linux kernel either attempts to place each PCI device at an address aligned to the size of the requested memory region (8GB in this case) or depends on the BIOS to perform the PCI allocation before booting. On the higher-end motherboards that the Intel Phi was known to work with, it appears that the “more powerful BIOSes” were allocating memory for the Phi. On this lower-end motherboard, the BIOS was unable to deal with a request to allocate 8GB of memory and thus fell back on the kernel to perform the allocation. Following this observation, I made a small kernel patch (here) which changes requests for an alignment larger than the maximal supported size to simply be aligned at the maximal supported size. With the components in this computer, it appears that even with this change the Intel Phi gets aligned to a 4GB boundary and is still able to function correctly.
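The effect of capping the alignment can be illustrated with a little address arithmetic. The following is only a sketch of the idea, not the kernel patch itself: the function names, the 4GB cap passed in as `max_align`, and the address window used in the example are all assumptions made up for illustration.

```python
GIB = 1 << 30


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def place_region(window_start: int, window_end: int, size: int, max_align: int):
    """Pick a start address for a region of `size` bytes inside the
    half-open window [window_start, window_end).

    The region would normally be aligned to its own size. Here the
    alignment is capped at `max_align` (hypothetical parameter), which
    mirrors the idea of the patch described above. Returns None when the
    region does not fit.
    """
    alignment = min(size, max_align)
    start = align_up(window_start, alignment)
    if start + size > window_end:
        return None
    return start


# Hypothetical window of free address space from 4GB to 14GB.
window = (4 * GIB, 14 * GIB)

# Aligning the 8GB region to 8GB pushes it to start at 8GB, so it no longer fits.
natural = place_region(*window, size=8 * GIB, max_align=8 * GIB)

# Capping the alignment at 4GB lets it start at 4GB and end at 12GB.
capped = place_region(*window, size=8 * GIB, max_align=4 * GIB)

print(natural, capped)
```

In this made-up window the size-aligned request fails while the capped one succeeds, which is the kind of situation where relaxing the alignment makes a difference. The actual windows and limits on a given motherboard will differ.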

The next challenge, once the Phi was communicating with the computer, was to prevent the chip from overheating. The discounted versions of the Phi did not include any fans, as they were designed for use in server environments. Additionally, being a 300+W accelerator, the card is capable of generating a lot of heat. As such, many “typical” fan solutions that I tried failed to keep the chip cool for longer than a few minutes. I eventually landed on the high-powered tornado fan, which can move over 80 cubic feet of air a minute. I ended up having to zip tie this over one end of the card to ensure that there was enough directed airflow to keep it functional. (Warning to future users: this fan actually does sound like a tornado, constantly.)

Having had the entire system functional for over a year now, I have managed to use the Phi for a handful of computations. While there is a decent opportunity for improved performance, this chip really requires that you design customized software for it specifically. This is especially true given that the Intel Phi is less popular than graphics cards with CUDA, where many mathematical tools and frameworks already have a customized backend targeting CUDA that requires limited effort on the user’s part. While this chip has the nice promise of being able to execute normal x86 instructions, this seems to be of fairly limited use, since the only compiler that will target the chip and use its specialized vector instructions is Intel’s own compiler (similar in nature to CUDA). This makes it fairly difficult to natively run any non-trivial program on this chip, as any external libraries require their own porting effort. (As an accelerator that runs embedded offloaded methods, similar to CUDA, this chip works fine, just not if you are trying to run a program without the host’s involvement.)

Photos of the setup:

Fan zip tied onto the back of the computer

5 thoughts on “Intel Xeon Phi for “cheap””

Miguel

5 July, 2017 at 12:50 pm

So could I get this hardware setup working with the Phi on Windows 10? Thanks in advance.
- Matthewfl
  
  6 July, 2017 at 7:17 pm
  
  This post is all about the modifications required for getting the Intel Phi to work on standard consumer-grade hardware. Some aspects, such as mounting the high-power fans, will clearly carry over to Windows; other aspects, like the Linux kernel patch, will not. I believe that Intel does provide instructions for getting this working on Windows, so you should check there.
Ernest

22 October, 2017 at 7:32 am

So are any shops still selling the same model for 200 USD or lower? I am unable to locate any online shops that still do.
NoDox John

2 January, 2018 at 10:32 am

Hey, do you happen to know of any motherboards that could hold 2 or 4 of these Phis?
As long as there are x16 slots and the BIOS supports 4G decoding plus Xeon chips, it should be OK, right?
- Matthewfl
  
  12 January, 2018 at 8:23 pm
  
  I have no idea where you are going to find a large motherboard for cheap. I am assuming that to get something that large you are going to be looking at something in a server class.
