What Would Lazyweb Do?

In relation to my recent hardware issues I now have a bit of a dilemma, although it’s not a bad kind of dilemma to have.

Yesterday afternoon memtest86+ locked up after 10h28m running. It didn’t report any memory errors but clearly there are hardware problems there if even memtest can lock it up. So I got myself into the mindset of returning the whole server and getting a refund, going with a different vendor.

I asked for recommendations of who else to try, and someone suggested a company I will call Vendor C (I don’t know why I am hiding the identities of the vendors involved but it feels like the right thing to do until this is all sorted out).

I sent off a mail to Vendor C, showing them my quote for the system I’ve bought from Vendor S and asking them if they can match it. Here is the spec of the hardware in question:

  • 1U generic Supermicro chassis, 4xSATA hotswap
  • Pentium D 940 3.2GHz dual core
  • 4x1GB ECC DDR2-667
  • 4xWestern Digital 250GB SATA RE 16MB cache 7.2KRPM
  • IPMI 2.0

Vendor C mailed back within 30 minutes to confirm that they could match Vendor S’s quote, and also suggested that I might like a Core 2 Duo-based system instead, as it is a more modern CPU that uses less power. Their suggested spec for a system based on a Core 2 Duo Conroe 2.13GHz comes in slightly cheaper than the Pentium D 3.2GHz dual core.

I was a little concerned that a 2.13GHz Core 2 Duo would not compete performance-wise with a Pentium D 3.2GHz dual core (Intel claims that Core 2 Duo is 40% faster than Pentium D at the same clock speed and uses 40% less energy). So I had a look, and a system from Vendor C based on the 2.4GHz Core 2 Duo (which also has 4MiB of cache as opposed to 2MiB) is only slightly more expensive than Vendor S’s original quote. I’d be willing to pay a little more for that.

So at this point my mindset was “return server under RMA, get refund by yelling as loud as I need to, then buy 2.4GHz Core 2 Duo-based system from Vendor C”. I composed the email to Vendor S informing them of the bad news that the server’s hardware is flaky and that I am returning it under warranty. I did not say whether I wanted a refund, as I wanted to see how they dealt with the matter first.

I was expecting Vendor S to be unhappy and give me a little bit of a hard time in returning the server. I was mentally preparing myself for a battle. Their reply then came as a bit of a shock, but a good one. They have been very reasonable about the whole thing and have offered me three choices:

  • Return the server and get an immediate refund.
  • Return the server and have them try to identify and replace the broken parts.
  • Return the server and have the updated model from their range shipped to me instead at no extra charge. The updated model is based on a Core 2 Duo 1.8GHz.

The second one doesn’t sound too appealing to me, but to have the other two options immediately offered to me is really good customer service. So much so that I would feel a little bad just asking for my money back.

I have a slight issue with the third option in that I don’t think a 1.8GHz Core 2 Duo is going to be comparable in performance to a Pentium D 3.2GHz dual core. It may well compete in performance/Watt but if it hasn’t got the overall performance I need then it’s useless to me. Ideally I’d want the 2.4GHz Core 2 Duo here.
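
As a rough back-of-envelope check, here is a small Python sketch that takes Intel's "40% faster at the same clock speed" claim at face value and converts each Core 2 Duo clock mentioned above into a "Pentium D equivalent" clock. Treating the claim as a flat 1.4x multiplier is my own simplifying assumption, not a benchmark, and it ignores the cache size difference entirely.

```python
# Crude comparison only: assumes Intel's marketing claim scales linearly with clock.
PENTIUM_D_GHZ = 3.2       # the CPU in the original quote
CLAIMED_SPEEDUP = 1.4     # "40% faster at the same clock speed", taken at face value


def pentium_d_equivalent(core2_ghz):
    """Return the Pentium D clock that would match a Core 2 Duo at core2_ghz."""
    return core2_ghz * CLAIMED_SPEEDUP


for ghz in (1.8, 2.13, 2.4):
    equivalent = pentium_d_equivalent(ghz)
    verdict = "ahead of" if equivalent > PENTIUM_D_GHZ else "behind"
    print(f"Core 2 Duo {ghz:.2f}GHz ~ Pentium D {equivalent:.2f}GHz "
          f"({verdict} the {PENTIUM_D_GHZ}GHz part)")
```

On these numbers only the 2.4GHz part comes out ahead of the 3.2GHz Pentium D, which is consistent with my worry about the 1.8GHz option, but real workloads may behave quite differently.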

It appears that my options now pan out to:

  • Get refund, spend it with Vendor C to get 2.4GHz Core 2 Duo-based system.
  • Have Vendor S send me the 1.8GHz Core 2 Duo-based system and hope it performs well enough.
  • Be a ballbreaker and ask for a 2.4GHz Core 2 Duo-based system from Vendor S.
  • Be a bit cheeky and ask Vendor S how much extra for an upgrade from 1.8GHz to 2.4GHz.

I should probably point out that although Vendor S came highly recommended and have always been polite and helpful, they had already annoyed me. They took my order saying they could supply within one week, then took almost four weeks to deliver what has turned out to be broken hardware, all the while giving me excuse after excuse. Do they deserve a second chance? They are clearly eager to maintain a good relationship if possible.

Lazyweb, what would you do?

Dear Lazyweb, am I using memtest86+ correctly?

I’ve got a Supermicro-based server that I’m in the process of setting up for Xen hosting purposes. After 3 or 4 days of uptime under light load (because it’s not in production yet), sitting in its rack in a datacentre, weird things start to happen.

I get random kernel panics and oopses, and it locks up or spontaneously reboots. When I power cycle it, the serial console gets a bit garbled and slow once it gets past the BIOS screen, and it rarely manages to boot a kernel. If I turn it off for several hours and try again then I can usually get it booted. The BIOS event log contains multi-bit ECC errors.

Some of this sounds like overheating (the fact that nothing will boot yet this “gets better” after a few hours without power), but the ECC errors suggest bad RAM. The server has 4x1GB DDR2-533.

Earlier this evening I booted the server into memtest86+ and left it running with default settings for over 5 hours. In that time it completed 4 test passes without error. I know it can take a long time for memtest86+ to find errors so I may let it run for a few days.

I did get curious though and poked around in memtest86+’s configuration. When I change the memory map from “BIOS-std” to “BIOS-all” I get this:

      Memtest86+ v1.70      | Pass  1%
Pentium D (65nm) 3192 MHz   | Test 50% ###################
L1 Cache:   16K 22322MB/s   | Test #0  [Address test, walking ones]
L2 Cache: 2048K 17348MB/s   | Testing:  120K - 4096M 4112M
Memory  : 4112M  3117MB/s   | Pattern:   00000000
Chipset :

 WallTime   Cached  RsvdMem   MemMap   Cache  ECC  Test  Pass  Errors ECC Errs
 ---------  ------  -------  --------  -----  ---  ----  ----  ------ --------
   0:00:01   4112M       0K  e820-All    on   off   Std     0      22        0
Tst  Pass   Failing Address          Good       Bad     Err-Bits  Count Chan
---  ----  -----------------------  --------  --------  --------  ----- ----
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f04000      1
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f08000      1
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f10000      1
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f20000      1
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f40000      1
  0     0  000e7f00000 -  3711.0MB  ffffffff  00000000  e7f80000      1
  0     0  000ff000000 -  4080.0MB  ffffffff  00000000  ff000004      1
  0     0  000ff000000 -  4080.0MB  ffffffff  00000000  ff000008      1
  0     0  000ff000000 -  4080.0MB  ffffffff  00000000  ff000010      1
  0     0  000ff000000 -  4080.0MB  ffffffff  00000000  ff000020      1
(ESC)Reboot  (c)configuration  (SP)scroll_lock  (CR)scroll_unlock         LOCKED

Instant errors, and all in the top ~400M of RAM.
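
To sanity-check the "top ~400M" figure, here is a minimal Python sketch that converts the two failing base addresses from the screen above into MiB and shows how far each sits below the 4GiB mark. The addresses are copied from the memtest86+ output; the reading that this region just under 4GiB may be reserved for memory-mapped devices rather than ordinary RAM is only an assumption on my part, not a diagnosis.

```python
# Failing base addresses copied from the memtest86+ screen above.
FAILING_ADDRESSES = [0xE7F00000, 0xFF000000]

MIB = 1024 * 1024
FOUR_GIB = 4 * 1024 * MIB


def describe(addr):
    """Return (offset from zero in MiB, distance below 4GiB in MiB)."""
    return addr / MIB, (FOUR_GIB - addr) / MIB


for addr in FAILING_ADDRESSES:
    offset, below = describe(addr)
    print(f"{addr:#x}: {offset:.1f}MiB from zero, {below:.1f}MiB below 4GiB")
```

The offsets agree with the 3711.0MB and 4080.0MB that memtest86+ prints, and the lower address is 385MiB short of 4GiB, so every error shown falls in that top slice of the 32-bit address space.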

But is this just a misconfiguration on my part of memtest86+? Is it expected that this should fail? Should I be taking out the last stick of RAM and seeing if life gets better? Is there some PAE issue here where memtest86+ can’t address higher than that amount of RAM?

My situation is made more difficult by the fact that this server is in a datacentre in San Francisco and I am in the UK; my only means of interaction with it is by serial console, plus a remote PDU to power cycle it if necessary. Graham’s going there for me in a couple of days and Paul may go there in a few weeks, so I’d like to be able to suggest some things they could try when they get there.

Any ideas?

Dear Lazyweb, help me set up my audio

In times past, when I had a desktop computer, I’d always have to be in that one place to use it, so attaching speakers to it was a reasonable way to play music while I worked. These days I only have a laptop for my personal computing needs, and all my data is on a fileserver here at home.

Unfortunately, playing music through a laptop speaker is less than acceptable. It got so bad that I just went to attach my old desktop’s speakers to my laptop, and found that one of them was dead. Clearly I need to revamp my audio setup.

So how should I do this? What I want is to be able to direct the audio from my laptop to come out of speakers in the room I am in, whether I am playing music/video from the fileserver’s samba share (common), or watching a dvd (not so common). Initially this needs to work in my bedroom, but should be easily expandable later to other places I use the laptop, such as the lounge.

I currently have no audio hardware (no hifi, no amp, no speakers) so am really starting from scratch. I’m not an audio buff and was previously happy with the quality of audio produced by decent powered computer speakers. It’d also be nice not to have to attach more wires to my laptop. There is a 100M switched ethernet network linking upstairs and downstairs so anything that isn’t going to move about can be plugged into that.

My budget is, let’s say, 500 UKP. Possibly more for a solution that is really good, will last ages and can be easily transported to any future accommodation.

Any suggestions? Thanks!

Dear Lazyweb, Can you suggest an organiser/calendaring solution?


What I’m looking for is a simple organiser or calendaring solution whereby everything is stored on a server. I want to be able to put in details of TODO items, events and meetings via a web interface (so that I can do it from mostly anywhere).

It should be able to email me about upcoming events, with a URL to the web interface to get more details.

I also want to have access to it not only via the web but also via some kind of API that would allow me to e.g. get a plain text list of events/TODOs for the current week in my shell login script.
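
To make the login-script idea concrete, here is a minimal Python sketch of the kind of output I have in mind. It is not tied to any real calendaring product: the `SAMPLE` items, the `(date, kind, title)` tuple layout and the function names are all made up for illustration, and a real solution would fetch the items from the server's API instead.

```python
from datetime import date, timedelta


def week_bounds(today):
    """Return the Monday and Sunday of the week containing `today`."""
    monday = today - timedelta(days=today.weekday())
    return monday, monday + timedelta(days=6)


def this_week(items, today):
    """Return one plain-text line per item dated within the current week."""
    start, end = week_bounds(today)
    lines = []
    for when, kind, title in sorted(items):
        if start <= when <= end:
            lines.append(f"{when:%a %d %b}  {kind:<5}  {title}")
    return lines


# Hypothetical sample data standing in for whatever the server would return.
SAMPLE = [
    (date(2007, 3, 14), "EVENT", "Example meeting"),
    (date(2007, 3, 16), "TODO", "Example task"),
    (date(2007, 3, 26), "EVENT", "Falls in the following week"),
]

for line in this_week(SAMPLE, today=date(2007, 3, 13)):
    print(line)
```

In a login script the `today` argument would simply be `date.today()`, and only the first two sample items would be printed for the week shown here.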

Can anyone suggest anything?